Title: A Primary Study on Summarization of Documents in Vietnamese
Authors: Thanh, Le Ha
Quyet, Thang Huynh
Chi, Mai Luong
Keywords: Text Summarization
Sentence Extraction
Linear Combination
Statistical Methods
Issue Date: Nov-2005
Publisher: JAIST Press
Abstract: Several statistical sentence extraction methods have been applied to English documents to generate summaries automatically. In this paper, we present a Vietnamese text summarization case study based on evaluating and extracting highly informative sentences to summarize documents, helping users reduce the time required to study and grasp information in Vietnamese; it is particularly suited to news from Vietnamese sites. Our case study combines various statistical sentence extraction methods, which require no additional linguistic resources and remain fast. From a set of sentences, we choose the most important ones according to an input compression rate and generate a summarized document. In particular, we rely mostly on Vietnamese linguistic characteristics to preprocess the source and improve the result. Evaluated with several content evaluation methods and compared with other current approaches, our investigation shows interesting and satisfactory results: a precision of approximately 0.73 under the traditional evaluation method, and an average content similarity of approximately 0.67.
Description: The original publication is available at JAIST Press http://www.jaist.ac.jp/library/jaist-press/index.html
IFSR 2005 : Proceedings of the First World Congress of the International Federation for Systems Research : The New Roles of Systems Sciences For a Knowledge-based Society : Nov. 14-17, 2005, Kobe, Japan
Symposium 5, Session 2 : Data/Text Mining from Large Databases Text Mining
Language: ENG
URI: http://hdl.handle.net/10119/3908
ISBN: 4-903092-02-X
Appears in Collections: IFSR 2005


