Personal Information
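Professor, Doctoral Supervisor, Master's Supervisor
Gender: Male
Employment status: Currently employed
Affiliation: School of Computer Science and Technology
Education: Postgraduate (doctoral)
Degree: Doctor of Engineering
Alma mater: Zhejiang University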
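Discipline: Computer Architecture
Honors:
2015 Outstanding Master's Thesis Advisor, Hubei Province
2013 Outstanding Master's Thesis Advisor, Hubei Province
2009 Outstanding Bachelor's Thesis Advisor, Hubei Province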
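Paper type: Conference proceedings
First author: Wei Lu
Co-authors: Zhaobo Zhang, Pingpeng Yuan, Hai Jin, Qiangsheng Hua
Published in: CIKM'22
Indexed by: EI
Affiliation: School of Computer Science and Technology
Discipline category: Engineering
First-level discipline: Computer Science and Technology
Document type: C
Publication date: 2022-08-15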
Abstract: Learning Chinese word embeddings is important in many tasks of Chinese language information processing, such as entity linking, entity extraction, and knowledge graphs. A Chinese word consists of Chinese characters, which can be decomposed into sub-characters (radicals, components, strokes, etc.). Like roots in English words, sub-characters indicate the origins and basic semantics of Chinese characters, so many studies adapt approaches designed for learning English word embeddings to improve Chinese word embeddings. However, some Chinese characters that share the same sub-characters have different meanings. Furthermore, with growing cultural interaction and the popularization of the Internet and the web, many neologisms are emerging, such as transliterated loanwords and network terms, which are close only to the pronunciation of their characters and far from their semantics. Here, a tripartite weighted graph is proposed to model the semantic relationships among words, characters, and sub-characters, in which each semantic relationship is evaluated according to Chinese linguistic information. The semantic relevance hidden in lower components (sub-characters, characters) can therefore be used to further distinguish the semantics of the corresponding higher components (characters, words). The tripartite weighted graph is then fed into our Chinese word embedding model, insideCC, to reveal the semantic relationships among the different language components and to learn the embeddings of words. Extensive experimental results on multiple corpora and datasets verify that the proposed methods outperform state-of-the-art counterparts by a significant margin.