Network Security White Papers
Linked Latent Dirichlet Allocation in Web Spam Filtering
Overview
Latent Dirichlet Allocation (LDA) is a fully generative statistical language model of the content and topics of a corpus of documents. This paper applies an extension of LDA, linked LDA, to web spam classification. As with latent semantic indexing, the inferred LDA model can serve as a dimensionality reduction step for classification. The authors test linked LDA on the WEBSPAM-UK2007 corpus. Using a BayesNet classifier, they achieve a 3% improvement in AUC over plain LDA with BayesNet and an 8% improvement over the public link features with C4.5. Adding this method to a log-odds based combination of strong link and content baseline classifiers yields a 3% improvement in AUC. Their method even slightly improves on the best Web Spam Challenge 2008 result.
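The overview mentions a log-odds based combination of classifiers evaluated by AUC but does not spell out the procedure. The sketch below is an illustration only, not the authors' method: it assumes the combination is a plain average of per-document log-odds, and the function names (`log_odds`, `combine_log_odds`, `auc`) and all scores are hypothetical toy values rather than results from the paper.

```python
import math


def log_odds(p, eps=1e-6):
    """Map a probability to log-odds, clipping to avoid infinities."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def combine_log_odds(score_lists):
    """Average per-document log-odds across classifiers.

    Assumption: an unweighted average; the paper may weight or
    combine the classifiers differently.
    """
    n_docs = len(score_lists[0])
    return [
        sum(log_odds(scores[i]) for scores in score_lists) / len(score_lists)
        for i in range(n_docs)
    ]


def auc(labels, scores):
    """AUC as the fraction of (spam, non-spam) pairs ranked correctly.

    Ties between a spam and a non-spam score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up): 1 = spam, 0 = non-spam.
labels = [1, 1, 1, 0, 0, 0]
topic_scores = [0.9, 0.4, 0.7, 0.5, 0.2, 0.1]  # hypothetical topic-model classifier
link_scores = [0.6, 0.8, 0.3, 0.2, 0.4, 0.1]   # hypothetical link-feature classifier

combined = combine_log_odds([topic_scores, link_scores])

print("topic AUC:   ", auc(labels, topic_scores))
print("link AUC:    ", auc(labels, link_scores))
print("combined AUC:", auc(labels, combined))
```

On this toy data each classifier alone misorders one spam/non-spam pair, while the averaged log-odds rank every spam page above every non-spam page. That only illustrates why combining complementary classifiers can raise AUC; it says nothing about the size of the gains reported in the paper.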
| Field | Value |
|---|---|
| Publisher | Association for Computing Machinery |
| Date Published | April 2009 |
| Format | White Papers |
Balancing Security Against Productivity
What makes for great security? Is it about keeping the bad guys out or letting the good guys in? About defending against attacks or preventing them? When IDG Research Services queried...
Security: New strides in preventing intrusions.
Need help eliminating risk in your IT environment? This ForwardView webshow describes how security appliances, which incorporate an array of security functions, can help you ward off security breaches without...
MessageLabs Intelligence: 2009 Security Predictions
Having analyzed the global threat landscape for almost a decade, MessageLabs Team Skeptic™ is comprised of many world-renowned malware and spam experts who have a global view of threats across...
IDC Vendor Spotlight
Organised ubiquity is a must for organisations to successfully "project" their users in any given landscape, at any given time, with security policy. This White Paper covers issues surrounding secure...
Trend Micro Enterprise Security white paper
This white paper reviews the content security threat landscape and how it has evolved into a more dangerous and high-risk environment. The paper discusses how conventional content security approaches...



