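Talk title: On the Connection between Vision and Language
Speaker: Dr. Hanwang Zhang (张含望)
Affiliation: Columbia University, USA
Time: Friday, December 16, 2016, 2:00-3:00 p.m.
Venue: Conference Room 508, Yifu Science and Education Building (逸夫科教楼)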
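Speaker bio: Dr. Hanwang Zhang received his bachelor's degree from Zhejiang University in 2009 and his Ph.D. from the National University of Singapore in 2014. He is currently a researcher at the DVMM Lab in the Department of Computer Science at Columbia University. Dr. Zhang has long worked on multimedia and machine vision, and has published more than forty papers (including ten oral papers) in top international conferences and journals such as CVPR, ICCV, ACM Multimedia, SIGIR, AAAI, TKDE, TIST, TIP, and TOMMCAP. His honors include a Best Demo Award nomination at ACM Multimedia, a top international conference in the multimedia field (2012), the ACM Multimedia Best Student Paper Award (2013), the Best Ph.D. Thesis Award of the School of Computing, National University of Singapore (2014), and a SIGIR Best Paper Award nomination (2016). Dr. Zhang has served as an associate editor of several international journals, including Multimedia Tools and Applications and Neurocomputing, and is a reviewer for leading journals such as TIP, TMM, TCSVT, and TOMMCAP.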
Abstract: We are experiencing an unprecedented evolution of deep learning techniques. As a holy grail of Artificial Intelligence, the task of connecting vision and language has perhaps benefited the most in recent years. For example, today's machines are able to outperform humans in large-scale visual recognition and to describe or answer questions about an image or video in natural language; none of this was thinkable just a decade ago. In this talk, I will first give a brief retrospective on the progress of this connection in the pre-deep-learning and current deep learning eras. Then, I will introduce our recent work on addressing what is missing in the state-of-the-art paradigm, including the grounding of open vocabulary, social networks, and deeper scene understanding. Finally, I will discuss several interesting directions for future research.
School of Computer and Information