Title: Study on Supervised Learning of Vietnamese Word Sense Disambiguation Classifiers
Authors: Nguyen, Minh Hai
Shirai, Kiyoaki
Keywords: Word Sense Disambiguation
Supervised Machine Learning
Issue date: 2012-03
Publisher: The Association for Natural Language Processing
Journal: Journal of Natural Language Processing
Volume: 19
Number: 1
Start page: 25
End page: 50
Abstract: Vietnamese is said to be a language with highly ambiguous words. However, there has been no published Word Sense Disambiguation (WSD hereafter) research on this language. The present study is the first attempt to study Vietnamese WSD. In particular, we explore which features are effective for training WSD classifiers, and we verify the applicability of the `pseudoword' technique both to investigating the effectiveness of features and to training WSD classifiers. Three tasks were conducted using two corpora: one built manually from the Vietnamese Treebank, and one built automatically by applying the pseudoword technique. Experimental results showed that the Bag-of-Words feature performs well for all three categories of words (verbs, nouns, and adjectives). However, combining it with POS, Collocation, or Syntactic features does not significantly improve the performance of WSD classifiers. Moreover, the experimental results confirmed that the pseudoword technique is suitable for exploring the effectiveness of features in the disambiguation of Vietnamese verbs and adjectives. Furthermore, we empirically evaluated the applicability of the pseudoword technique as an unsupervised learning method for real Vietnamese WSD.
Rights: Copyright (C) 2012 The Association for Natural Language Processing. Minh Hai Nguyen, Kiyoaki Shirai, Journal of Natural Language Processing, 19(1), 2012, 25-50.
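
Note on the pseudoword technique named in the abstract: the abstract does not describe the authors' actual procedure, so the following is only a minimal sketch of the general idea, under stated assumptions. Two distinct words are merged into one artificial ambiguous token, and the original word is kept as the "sense" label, which yields labeled data without manual annotation. The function names `make_pseudoword_instances` and `bag_of_words`, the toy English corpus, and the choice of the whole sentence as context are all hypothetical and are not taken from the paper. The sketch also assumes the text is already segmented into words.

```python
from collections import Counter


def make_pseudoword_instances(sentences, word_a, word_b):
    """Replace word_a / word_b with one artificial ambiguous token.

    Returns a list of (tokens, target_index, label) triples, where label
    is the original word that the pseudoword replaced.
    """
    pseudoword = f"{word_a}_{word_b}"
    instances = []
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok in (word_a, word_b):
                replaced = tokens[:i] + [pseudoword] + tokens[i + 1:]
                instances.append((replaced, i, tok))
    return instances


def bag_of_words(tokens, target_index):
    """Count the context words in the sentence, ignoring the target itself."""
    return Counter(t for j, t in enumerate(tokens) if j != target_index)


# Toy pre-tokenized corpus (illustrative only; not data from the paper).
corpus = [
    ["she", "ate", "a", "banana", "today"],
    ["he", "opened", "the", "door", "slowly"],
]

instances = make_pseudoword_instances(corpus, "banana", "door")
features = [(bag_of_words(toks, idx), label) for toks, idx, label in instances]
```

Each entry of `features` pairs a bag-of-words count with its label, which is the form a supervised classifier would typically be trained on. Other feature types mentioned in the abstract (POS, collocation, syntactic) are not sketched here.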
