Please use this identifier to cite or link to this item: http://hdl.handle.net/10119/9212

Title: Improving effectiveness of mutual information for substantival multiword expression extraction
Authors: Zhang, Wen
Yoshida, Taketoshi
Tang, Xijin
Ho, Tu Bao
Keywords: Substantival multiword expression
Mutual information
Enhanced mutual information
Collocation optimization
Issue Date: 2009-02-20
Publisher: Elsevier
Journal name: Expert Systems with Applications
Volume: 36
Number: 8
Start page: 10919
End page: 10930
DOI: 10.1016/j.eswa.2009.02.026
Abstract: One deficiency of mutual information is its poor capacity to measure the association of words with unsymmetrical co-occurrence, which is common among multiword expressions in text. Moreover, threshold setting, which is decisive for the practical success of mutual information in multiword extraction, requires many parameters to be predefined manually when extracting multiword expressions with different numbers of individual words. In this paper, we propose a new method, EMICO (Enhanced Mutual Information and Collocation Optimization), to extract substantival multiword expressions from text. Specifically, enhanced mutual information is proposed to measure the association of words, and collocation optimization is proposed to automatically determine the number of individual words contained in a multiword expression when the multiword expression occurs in a candidate set. Our experiments showed that EMICO significantly improves the performance of substantival multiword expression extraction in comparison with a classic extraction method based on mutual information.
Rights: NOTICE: This is the author's version of a work accepted for publication by Elsevier. Wen Zhang, Taketoshi Yoshida, Xijin Tang, and Tu-Bao Ho, Expert Systems with Applications, 36(8), 2009, 10919-10930, http://dx.doi.org/10.1016/j.eswa.2009.02.026
URI: http://hdl.handle.net/10119/9212
Material Type: author
Appears in Collections: a10-1. 雑誌掲載論文 (Journal Articles)

Files in This Item:

File: 13876.pdf
Size: 545Kb
Format: Adobe PDF

All items in DSpace are protected by copyright, with all rights reserved.

