Learning to Identify Review Spam

Li, Fangtao; Huang, Minlie; Yang, Yi; Zhu, Xiaoyan

dc.contributor.author	Li, Fangtao
dc.contributor.author	Huang, Minlie
dc.contributor.author	Yang, Yi
dc.contributor.author	Zhu, Xiaoyan
dc.contributor.editor	Walsh, Toby
dc.date.accessioned	2012-05-22T19:56:08Z
dc.date.available	2012-05-22T19:56:08Z
dc.date.copyright	2011
dc.date.issued	2011-07
dc.description	Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence	en
dc.description.abstract	In the past few years, sentiment analysis and opinion mining have become popular and important tasks. These studies all assume that their opinion resources are real and trustworthy. However, they may encounter the fake opinion, or opinion spam, problem. In this paper, we study this issue in the context of our product review mining system. On product review sites, people may write fake reviews, called review spam, to promote their products or defame their competitors’ products. It is important to identify and filter out the review spam. Previous work focuses only on heuristic rules, such as helpfulness voting or rating deviation, which limits the performance of this task. In this paper, we exploit machine learning methods to identify review spam. Toward this end, we manually build a spam collection from our crawled reviews. We first analyze the effect of various features in spam identification. We also observe that a review spammer consistently writes spam. This provides us another view for identifying review spam: we can identify whether the author of the review is a spammer. Based on this observation, we provide a two-view semi-supervised method, co-training, to exploit the large amount of unlabeled data. The experimental results show that our proposed method is effective. Our designed machine learning methods achieve significant improvements in comparison to the heuristic baselines.	en
dc.format	Text	en
dc.format.extent	1 digital file (p. 2488-2493 : ill.)	en
dc.identifier.citation	Li, F., Huang, M., Yang, Y., & Zhu, X. (2011). Learning to Identify Review Spam. Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, Barcelona, ES. (p. 2488-2493). AAAI / International Joint Conferences on Artificial Intelligence Press. doi:10.5591/978-1-57735-516-8/IJCAI11-414	en
dc.identifier.isbn	978-1-57735-513-7
dc.identifier.uri	http://hdl.handle.net/10625/49046
dc.language.iso	en
dc.publisher	AAAI Press / International Joint Conferences on Artificial Intelligence, Menlo Park, California	en
dc.subject	SENTIMENT ANALYSIS	en
dc.subject	SPAM	en
dc.subject	PRODUCT REVIEWS	en
dc.subject	MACHINE LEARNING	en
dc.subject	MARKETING	en
dc.title	Learning to Identify Review Spam	en
dc.type	Conference Paper	en
idrc.copyright.holder	International Joint Conferences on Artificial Intelligence
idrc.dspace.access	IDRC Only	en
idrc.noaccess	Due to copyright restrictions the full text of this research output is not available in the IDRC Digital Library or by request from the IDRC Library. / Compte tenu des restrictions relatives au droit d'auteur, le texte intégral de cet extrant de recherche n'est pas accessible dans la Bibliothèque numérique du CRDI, et il n'est pas possible d'en faire la demande à la Bibliothèque du CRDI.	en
idrc.project.componentnumber	104519006
idrc.project.number	104519
idrc.project.title	International Research Chairs Initiative (IRCI)	en
idrc.rims.adhocgroup	IDRC SUPPORTED	en

Collections

IDRC Research Results / Résultats de recherches du CRDI
2010-2019 / Années 2010-2019
Breaking the barriers to Internet access / Faire tomber les obstacles entravant l’accès à Internet
