Methodology White Papers

Combining Labeled and Unlabeled Data for MultiClass Text Categorization

Overview
Supervised learning techniques for text classification often require a large number of labeled examples to learn accurately. Current text learning techniques for combining labeled and unlabeled data, such as EM and Co-Training, are mostly applicable to classification tasks with a small number of classes and do not scale up well to large multiclass problems. In this paper, Accenture develops a framework to incorporate unlabeled data in the Error-Correcting Output Coding (ECOC) setup by first decomposing multiclass problems into multiple binary problems and then using Co-Training to learn the individual binary classification problems.
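
The decomposition step described in the overview can be illustrated with a minimal Python sketch. It is not taken from the paper: the class names, the 7-bit code matrix, and the function names are all hypothetical, and the Co-Training step that would learn each binary problem from labeled and unlabeled data is left out. The sketch only shows how a code matrix turns one multiclass problem into several binary problems, and how the binary predictions are decoded back to a class by nearest codeword.

```python
from typing import Dict, List, Sequence

# Hypothetical code matrix for four classes: one 7-bit codeword per class.
# Each column defines one binary problem. Any two codewords differ in
# four positions, so a single wrong binary prediction can be corrected.
CODE_MATRIX: Dict[str, List[int]] = {
    "A": [1, 1, 1, 1, 1, 1, 1],
    "B": [0, 0, 0, 0, 1, 1, 1],
    "C": [0, 0, 1, 1, 0, 0, 1],
    "D": [0, 1, 0, 1, 0, 1, 0],
}


def binary_labels(labels: Sequence[str], bit: int) -> List[int]:
    """Relabel multiclass labels as 0/1 targets for binary problem `bit`."""
    return [CODE_MATRIX[label][bit] for label in labels]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(predicted_bits: Sequence[int]) -> str:
    """Return the class whose codeword is closest to the predicted bits."""
    return min(CODE_MATRIX, key=lambda c: hamming(CODE_MATRIX[c], predicted_bits))


if __name__ == "__main__":
    # Targets for the binary problem defined by column 2.
    print(binary_labels(["A", "B", "C", "D"], 2))
    # Codeword for "B" with its first bit flipped still decodes to "B".
    print(decode([1, 0, 0, 0, 1, 1, 1]))
```

In a full pipeline under these assumptions, each column's binary problem would be handed to a separate learner (Co-Training in the paper's setup), and the learners' outputs for a new document would be concatenated and passed to `decode`.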

Further White Paper Details
Publisher: Accenture
File Format: PDF, requires Acrobat Reader 5
Date Published: April 2002
Downloads: 94
Format: White Papers