Please use the following identifier to cite this item:
http://hdl.handle.net/10119/4124
Title: | A Comparative Study on the Distribution of Chinese and English Multi-Words |
Authors: | Zhang, Wen; Yoshida, Taketoshi; Tang, Xijin |
Keywords: | multi-word; term-distribution; Poisson distribution; G-distribution |
Issue Date: | Nov-2007 |
Publisher: | JAIST Press |
Abstract: | A study of the distribution of multi-words in both Chinese and English text was carried out to explore a theoretical basis for a probabilistic term-weighting scheme. The Poisson distribution and the G-distribution are compared in describing the relationship between words' frequency and number of occurrences, for both technical and non-technical multi-words. In addition, a rule-based multi-word extraction algorithm is proposed that extracts multi-words from texts based on their occurring structures and syntactical patterns. The experimental results demonstrate that the G-distribution describes the relationship between multi-words' frequency and number of occurrences better than the Poisson distribution, for both technical and non-technical multi-words. |
Description: | The original publication is available at JAIST Press (http://www.jaist.ac.jp/library/jaist-press/index.html). Proceedings of KSS'2007: The Eighth International Symposium on Knowledge and Systems Sciences, November 5-7, 2007, Ishikawa High-Tech Conference Center, Nomi, Ishikawa, Japan. Organized by: Japan Advanced Institute of Science and Technology |
Language: | ENG |
URI: | http://hdl.handle.net/10119/4124 |
ISBN: | 9784903092072 |
Appears in Collections: | KSS'2007 |
Files in This Item:
File | Description | Size | Format |
10.pdf | | 110 kB | Adobe PDF |
All items stored in this system are protected by copyright.