Lecture by Professor Ping Chen

Date: 2014-01-03 Editor: Zhang Pingyang

Title: Study of Two Critical Problems in Lexical Semantics

Speaker: Professor Ping Chen, Department of Computer Science, University of Massachusetts Boston

Time: Wednesday, July 10, 2013, 9:00 AM

Location: First Conference Room (Room 344), West Building 1

All faculty and students are welcome to attend!

Biography of Professor Ping Chen:

Dr. Ping Chen is an Associate Professor of Computer Science and Director of the Artificial Intelligence Lab at the University of Massachusetts Boston. His research interests include Data Mining and Computational Semantics. Dr. Chen has received seven National Science Foundation grants, one grant from the Department of Homeland Security, and one grant from Veterans Affairs, and has published over 50 papers in major Data Mining, Artificial Intelligence, and Computational Linguistics conferences and journals. He received his BS degree in Information Science from Xi'an Jiao Tong University, his MS degree in Computer Science from the Chinese Academy of Sciences, and his Ph.D. degree in Information Technology from George Mason University.

Lecture abstract:

Lexical semantics, an important topic in Computational Linguistics, studies what the words of a language denote and how they do so. In this presentation we will discuss two tasks that are critical for understanding semantics at the lexical level: word sense disambiguation and co-reference resolution. We have built state-of-the-art techniques for both tasks and performed rigorous evaluations in real-world settings.

(1) Word sense disambiguation is the process of determining which sense of a word is used in a given context. Because of its importance for understanding semantics and for many real-world applications, word sense disambiguation has been studied extensively in Natural Language Processing and Computational Linguistics. However, existing methods either focus narrowly on a few specific words, because they rely on expensive manually annotated training text, or give only mediocre performance in real-world settings. Broad coverage and disambiguation quality are both critical for real-world natural language processing applications. We have developed a fully automatic disambiguation method that uses two readily available knowledge sources: a dictionary and knowledge extracted from unannotated text. Such an automatic approach overcomes the knowledge acquisition bottleneck and makes broad-coverage word sense disambiguation feasible in practice. We evaluated our system on the SemEval 2007 Task 07 corpus; it significantly outperforms the best unsupervised system and achieves performance similar to that of the top-performing supervised systems.
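To make the task concrete, here is a minimal sketch of dictionary-based disambiguation using simple gloss overlap (a simplified Lesk-style heuristic). It is not the speaker's method, which also draws on knowledge extracted from unannotated text; the sense labels, glosses, and stopword list below are hypothetical and chosen only for illustration.

```python
# Toy sense inventory: hypothetical glosses, not taken from any real dictionary.
SENSES = {
    "bank": {
        "bank_finance": "an institution that accepts deposits and lends money",
        "bank_river": "sloping land beside a river or lake",
    }
}

# Small hypothetical stopword list so function words do not count as evidence.
STOPWORDS = {"a", "an", "the", "of", "and", "or", "that", "to", "in", "on", "at", "by"}


def tokens(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def disambiguate(word, context):
    """Return the sense whose gloss shares the most words with the context."""
    context_words = tokens(context)
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(tokens(gloss) & context_words)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


print(disambiguate("bank", "She sat on the bank of the river"))
print(disambiguate("bank", "He went to the bank to deposit money"))
```

Exact word overlap is brittle (for example, "deposit" does not match "deposits"), which is one reason practical systems add further knowledge sources beyond raw dictionary glosses.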

(2) Co-reference resolution is the process of linking together concepts that refer to the same entity. The ability of computers to find this type of relation automatically in text documents is of interest to the fields of Artificial Intelligence and Computational Linguistics. We developed a knowledge-based co-reference resolution system using UMLS and WordNet. To evaluate our system in a real-world setting, we participated in the 2011 i2b2 Natural Language Processing Challenge, which targets co-reference resolution in medical documents. Concept mentions have been annotated in clinical texts, and the mentions that co-refer in each document are to be linked by co-reference chains. Our system achieved 89.6% overall performance and was ranked in the top tier along with three other resolution systems.
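As an illustration of the final step only, the sketch below groups mentions into co-reference chains once pairwise co-reference links have been decided. It assumes the links are already given; how a knowledge-based system derives them from UMLS and WordNet is not shown. The function name `build_chains` and the example mentions are hypothetical.

```python
def build_chains(mentions, links):
    """Group mentions into chains using union-find over pairwise links."""
    parent = {m: m for m in mentions}

    def find(m):
        # Follow parent pointers to the root, shortening the path on the way.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    # Merge the two groups for every pair judged to co-refer.
    for a, b in links:
        parent[find(a)] = find(b)

    # Collect mentions by their root; keep document order within each chain.
    chains = {}
    for m in mentions:
        chains.setdefault(find(m), []).append(m)

    # A mention with no co-referring partner does not form a chain.
    return [chain for chain in chains.values() if len(chain) > 1]


# Hypothetical mentions from a clinical note, in document order.
mentions = ["the patient", "he", "Mr. Smith", "chest pain", "the pain", "aspirin"]
links = [("the patient", "he"), ("he", "Mr. Smith"), ("chest pain", "the pain")]
print(build_chains(mentions, links))
```

Union-find makes the chains transitive: linking "the patient" to "he" and "he" to "Mr. Smith" places all three in one chain without an explicit link between the first and the last.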

Chen Yan

Department of Computer Science and Technology, Xi'an Jiaotong University

& Shaanxi Province Key Laboratory of Satellite and Terrestrial Network Technology

Tel: +86-029-82668642-8017

E-mail: chenyan@mail.xjtu.edu.cn

----------------------------------------------

2013-07-03