Please use this identifier to cite or link to this item: http://hdl.handle.net/10119/7832

Title: Automatic Extraction of the Fine Category of Person Named Entities from Text Corpora
Author: NGUYEN, Tri-Thanh
Keywords: fine person categories extraction; named entities; pattern extraction
Issue date: 2007-10-01
Publisher: The Institute of Electronics, Information and Communication Engineers (IEICE)
Journal: IEICE TRANSACTIONS on Information and Systems
Volume: E90-D
Issue: 10
Start page: 1542
End page: 1549
DOI: 10.1093/ietisy/e90-d.10.1542
Abstract: Named entities play an important role in many Natural Language Processing applications. Currently, most named entity recognition systems rely on a small set of general named entity (NE) types. Although several efforts have been made to expand the hierarchy of NE types, the number of NE types remains fixed. In real applications, such as question answering or semantic search systems, users may be interested in more diverse and more specific NE types. This paper proposes a method for extracting categories of person named entities from text documents. Based on the Dual Iterative Pattern Relation Extraction method, we develop a model better suited to this problem and explore the generation of different pattern types. A method for validating extracted categories is proposed to improve performance, and experiments on the Wall Street Journal corpus give promising results.
Rights: Copyright (C)2007 IEICE. Tri-Thanh Nguyen, Akira Shimazu, IEICE TRANSACTIONS on Information and Systems, E90-D(10), 2007, 1542-1549. http://www.ieice.org/jpn/trans_online/
URI: http://hdl.handle.net/10119/7832
Material type: publisher
Appears in collections: b10-1. Journal Articles


Files in this item:
File: A11970.pdf  Size: 430Kb  Format: Adobe PDF



Contact: Japan Advanced Institute of Science and Technology, Research Promotion Section, Library Information Unit (ir-sys[at]ml.jaist.ac.jp)