Keyword extraction and headline generation using novel word features

Xu, S; Yang, S; Lau, FCM

Scopus: eid_2-s2.0-77958586107

Appears in Collections:
- Computer Science: Conference papers

Conference Paper: Keyword extraction and headline generation using novel word features

Title	Keyword extraction and headline generation using novel word features
Authors	Xu, S; Yang, S; Lau, FCM
Issue Date	2010
Citation	Proceedings Of The National Conference On Artificial Intelligence, 2010, v. 3, p. 1461-1466
Abstract	We introduce several novel word features for keyword extraction and headline generation. These new word features are derived according to the background knowledge of a document as supplied by Wikipedia. Given a document, to acquire its background knowledge from Wikipedia, we first generate a query for searching the Wikipedia corpus based on the key facts present in the document. We then use the query to find articles in the Wikipedia corpus that are closely related to the contents of the document. With the Wikipedia search result article set, we extract the inlink, outlink, category and infobox information in each article to derive a set of novel word features which reflect the document's background knowledge. These newly introduced word features offer valuable indications on individual words' importance in the input document. They serve as nice complements to the traditional word features derivable from explicit information of a document. In addition, we also introduce a word-document fitness feature to characterize the influence of a document's genre on the keyword extraction and headline generation process. We study the effectiveness of these novel word features for keyword extraction and headline generation by experiments and have obtained very encouraging results. Copyright © 2010, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Persistent Identifier	http://hdl.handle.net/10722/151980

DC Field	Value	Language
dc.contributor.author	Xu, S	en_US
dc.contributor.author	Yang, S	en_US
dc.contributor.author	Lau, FCM	en_US
dc.date.accessioned	2012-06-26T06:31:51Z	-
dc.date.available	2012-06-26T06:31:51Z	-
dc.date.issued	2010	en_US
dc.identifier.citation	Proceedings Of The National Conference On Artificial Intelligence, 2010, v. 3, p. 1461-1466	en_US
dc.identifier.uri	http://hdl.handle.net/10722/151980	-
dc.description.abstract	We introduce several novel word features for keyword extraction and headline generation. These new word features are derived according to the background knowledge of a document as supplied by Wikipedia. Given a document, to acquire its background knowledge from Wikipedia, we first generate a query for searching the Wikipedia corpus based on the key facts present in the document. We then use the query to find articles in the Wikipedia corpus that are closely related to the contents of the document. With the Wikipedia search result article set, we extract the inlink, outlink, category and infobox information in each article to derive a set of novel word features which reflect the document's background knowledge. These newly introduced word features offer valuable indications on individual words' importance in the input document. They serve as nice complements to the traditional word features derivable from explicit information of a document. In addition, we also introduce a word-document fitness feature to characterize the influence of a document's genre on the keyword extraction and headline generation process. We study the effectiveness of these novel word features for keyword extraction and headline generation by experiments and have obtained very encouraging results. Copyright © 2010, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.	en_US
dc.language	eng	en_US
dc.relation.ispartof	Proceedings of the National Conference on Artificial Intelligence	en_US
dc.title	Keyword extraction and headline generation using novel word features	en_US
dc.type	Conference_Paper	en_US
dc.identifier.email	Lau, FCM:fcmlau@cs.hku.hk	en_US
dc.identifier.authority	Lau, FCM=rp00221	en_US
dc.description.nature	link_to_subscribed_fulltext	en_US
dc.identifier.scopus	eid_2-s2.0-77958586107	en_US
dc.relation.references	http://www.scopus.com/mlt/select.url?eid=2-s2.0-77958586107&selection=ref&src=s&origin=recordpage	en_US
dc.identifier.volume	3	en_US
dc.identifier.spage	1461	en_US
dc.identifier.epage	1466	en_US
dc.identifier.scopusauthorid	Xu, S=7404439278	en_US
dc.identifier.scopusauthorid	Yang, S=36620658700	en_US
dc.identifier.scopusauthorid	Lau, FCM=7102749723	en_US
