Quality-biased ranking of short texts in microblogging services
Date
2011
Publisher
Asian Federation of Natural Language Processing (AFNLP)
Abstract
The abundance of user-generated content comes at a price: the quality of content may range from very high to very low. We propose a regression approach that incorporates various features to recommend short-text documents from Twitter, with a bias toward quality. The approach is built on top of a linear regression model that includes a regularization factor inspired by the content conformity hypothesis: documents similar in content may have similar quality. We test the system on the Edinburgh Twitter corpus. Experimental results show that the regularization factor inspired by the hypothesis can improve ranking performance, and that using unlabeled data improves it further. Comparative results show that our method outperforms several baseline systems. We also conduct a systematic feature analysis and find that content quality features are dominant in short-text ranking.
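To make the idea in the abstract concrete, the following is a minimal sketch of a linear regression with a similarity-based smoothness penalty, in which unlabeled documents contribute only through the penalty term. It is not the authors' formulation: the objective, the gradient-descent solver, the function names `predict` and `fit`, the parameters `lam`, `lr` and `epochs`, and the toy data are all assumptions made for illustration.

```python
def predict(w, x):
    """Linear quality score w . x for one feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))


def fit(X, y, S, lam=1.0, lr=0.05, epochs=2000):
    """Gradient descent on an assumed objective:

        sum over labeled i of (w.x_i - y_i)^2
        + lam * sum over pairs i<j of S[i][j] * (w.x_i - w.x_j)^2

    X holds all documents, labeled ones first; y holds labels for the
    first len(y) rows; S is a symmetric content-similarity matrix over X.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        f = [predict(w, x) for x in X]
        grad = [0.0] * d
        # Squared-error term: labeled documents only.
        for i, target in enumerate(y):
            for k in range(d):
                grad[k] += 2.0 * (f[i] - target) * X[i][k]
        # Conformity term: all similar pairs, labeled or unlabeled.
        for i in range(n):
            for j in range(i + 1, n):
                if S[i][j]:
                    diff = f[i] - f[j]
                    for k in range(d):
                        grad[k] += 2.0 * lam * S[i][j] * diff * (X[i][k] - X[j][k])
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w


# Hypothetical toy data: a bias term plus one feature per document.
# The first two documents are labeled; the third is unlabeled.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [0.0, 1.0]
# Documents 1 and 2 are assumed to be similar in content.
S = [[0, 0, 0],
     [0, 0, 1],
     [0, 1, 0]]

w_plain = fit(X, y, S, lam=0.0)  # ordinary least squares on labeled data
w_reg = fit(X, y, S, lam=1.0)    # with the similarity penalty

# Rank all documents, including the unlabeled one, by predicted quality.
ranked = sorted(range(len(X)), key=lambda i: -predict(w_reg, X[i]))
```

In this sketch the penalty pulls the predicted scores of content-similar documents toward each other, which is one way to read the content conformity hypothesis; setting `lam=0.0` recovers an unregularized least-squares fit on the labeled documents alone.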
Description
Meeting: 5th International Joint Conference on Natural Language Processing, Chiang Mai, Thailand, November 8 - 13, 2011
Type
Conference Paper
Format
Text
Keywords
USER GENERATED CONTENT, BLOGGING, DATA QUALITY
Citation
Minlie Huang, Yi Yang, & Xiaoyan Zhu (2011). Quality-biased Ranking of Short Texts in Microblogging Services. Proceedings of the 5th International Joint Conference on Natural Language Processing, 373-382.