Talk title: Addressing vision tasks with large foundation models: how far can we go without training?
Speaker: Dr. Yiming Wang (王逸鸣)
Affiliation: Fondazione Bruno Kessler (FBK), Italy
Time: Thursday, 26 December 2024, 15:00
Venue: Room 1104, Block A, Feicui Science and Education Building (翡翠科教楼), Hefei University of Technology
Abstract:
Recent advancements in Vision and Language Models (VLMs) have significantly impacted computer vision research, particularly thanks to their ability to interpret multimodal information within a unified representation space. Notably, the generalisation capability of VLMs, honed through extensive pre-training on web-scale data, has led to remarkable performance in zero-shot recognition.
As direct competition in developing such large models is not a viable option for most public institutes due to their limited resources, we have explored new research opportunities following a training-free methodology by leveraging pre-trained models and existing databases that contain rich world knowledge.
In this talk, I will first present how we exploit VLMs to approach image classification without training, particularly in low-resource domains where images or their annotations are scarce, achieving very competitive performance. Then, I will present how VLMs and Large Language Models (LLMs) can be synergised in a training-free manner to advance video understanding, in particular the recognition of anomalous patterns in video content.
Speaker bio:
Yiming Wang is a Researcher in the Deep Visual Learning (DVL) Unit at Fondazione Bruno Kessler (FBK), Italy. She has expertise in vision-based scene understanding that facilitates automation and social good, covering diverse topics including static scene modeling, semantic understanding, and video analysis.
She obtained her PhD in 2018 from Queen Mary University of London (QMUL) under the supervision of Prof. Andrea Cavallaro. Previously, she was a postdoctoral researcher in the Pattern Analysis and Computer Vision (PAVIS) research line at Istituto Italiano di Tecnologia (IIT), working mostly on active 3D vision. Her recent research focuses on training-free methods that leverage foundation models to address vision tasks.
She has served as a reviewer for many top-tier vision and robotics conferences and journals (Outstanding Reviewer, BMVC 2021), and as Area Chair for ICRA'24, ECCV'24 and CVPR'25. She is an Associate Editor of the International Journal of Social Robotics (SoRo). She is currently responsible for an innovative project on low-carbon learning algorithms funded by CariVerona. She is an ELLIS member.