Learning to Identify Review Spam

dc.contributor.author: Li, Fangtao
dc.contributor.author: Huang, Minlie
dc.contributor.author: Yang, Yi
dc.contributor.author: Zhu, Xiaoyan
dc.contributor.editor: Walsh, Toby
dc.date.accessioned: 2012-05-22T19:56:08Z
dc.date.available: 2012-05-22T19:56:08Z
dc.date.copyright: 2011
dc.date.issued: 2011-07
dc.description: Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence [en]
dc.description.abstract: In the past few years, sentiment analysis and opinion mining have become popular and important tasks. These studies all assume that their opinion resources are real and trustworthy. However, they may encounter the problem of faked opinions, or opinion spam. In this paper, we study this issue in the context of our product review mining system. On product review sites, people may write faked reviews, called review spam, to promote their own products or defame their competitors’ products. It is important to identify and filter out review spam. Previous work focuses only on heuristic rules, such as helpfulness voting or rating deviation, which limits performance on this task. In this paper, we exploit machine learning methods to identify review spam. To this end, we manually build a spam collection from our crawled reviews. We first analyze the effect of various features on spam identification. We also observe that a review spammer consistently writes spam. This gives us another view for identifying review spam: we can determine whether the author of a review is a spammer. Based on this observation, we provide a two-view semi-supervised method, co-training, to exploit the large amount of unlabeled data. The experimental results show that our proposed method is effective. Our machine learning methods achieve significant improvements over the heuristic baselines. [en]
dc.format: Text [en]
dc.format.extent: 1 digital file (p. 2488-2493 : ill.) [en]
dc.identifier.citation: Li, F., Huang, M., Yang, Y., & Zhu, X. (2011). Learning to Identify Review Spam. Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, Barcelona, ES. (p. 2488-2493). AAAI / International Joint Conferences on Artificial Intelligence Press. doi:10.5591/978-1-57735-516-8/IJCAI11-414 [en]
dc.identifier.isbn: 978-1-57735-513-7
dc.identifier.uri: http://hdl.handle.net/10625/49046
dc.language.iso: en
dc.publisher: AAAI Press / International Joint Conferences on Artificial Intelligence, Menlo Park, California [en]
dc.subject: SENTIMENT ANALYSIS [en]
dc.subject: SPAM [en]
dc.subject: PRODUCT REVIEWS [en]
dc.subject: MACHINE LEARNING [en]
dc.subject: MARKETING [en]
dc.title: Learning to Identify Review Spam [en]
dc.type: Conference Paper [en]
idrc.copyright.holder: International Joint Conferences on Artificial Intelligence
idrc.dspace.access: IDRC Only [en]
idrc.noaccess: Due to copyright restrictions the full text of this research output is not available in the IDRC Digital Library or by request from the IDRC Library. / Compte tenu des restrictions relatives au droit d'auteur, le texte intégral de cet extrant de recherche n'est pas accessible dans la Bibliothèque numérique du CRDI, et il n'est pas possible d'en faire la demande à la Bibliothèque du CRDI. [en]
idrc.project.componentnumber: 104519006
idrc.project.number: 104519
idrc.project.title: International Research Chairs Initiative (IRCI) [en]
idrc.rims.adhocgroup: IDRC SUPPORTED [en]
