Herself’s Artificial Intelligence

Humans, meet your replacements.


Use artificial intelligence to sort link spam from legitimate links

Anyone running a website has been plagued by link spam. It shows up in your access logs, in fake comments on a blog, even in bogus user registrations. An incredible amount of resources is being put into both sides of this battle. Do a search on any search engine and you’ll find many sites in the top ten results that are just pages of links, or content partially scraped from several other sites. These sites hold little value for users and hurt legitimate sites.

Identifying and preventing spam was cited as one of the top challenges in web search engines in a 2002 paper. Amit Singhal, principal scientist of Google Inc. estimated that the search engine spam industry had a revenue potential of $4.5 billion in year 2004 if they had been able to completely fool all search engines on all commercially viable queries. Due to the large and ever increasing financial gains resulting from high search engine ratings, it is no wonder that a significant amount of human and machine resources are devoted to artificially inflating the rankings of certain web pages. . . . [ read more Spam Rank - Full automatic link spam detection work in progress ( pdf )]

Link spam differs from regular links in several ways: it shows up in access logs; the links are often found only in blog comments; the links may be hidden (text the same color as the background) or cloaked (users are shown something different from what search engine bots see); and the links often all appear within a very short period of time. Other tells include lots of links from low-PageRank sites or lots of links from sites with the same PageRank. PageRank follows a power law, so the PageRank values of the sites linking to a legitimate page should follow roughly the same distribution.
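Two of those tells, links arriving in a burst and links coming mostly from low-PageRank sites, are easy to check by hand. Here is a rough Python sketch of how that might look. It is not the method from any of the papers linked below: the function names, the 24-hour window, the PageRank threshold of 2, and the 0.8 cutoffs are all made-up values for illustration, and it assumes you already have a timestamp and a source PageRank for each incoming link.

```python
from datetime import datetime, timedelta


def burst_fraction(timestamps, window=timedelta(hours=24)):
    """Largest share of links that fall inside any single time window."""
    if not timestamps:
        return 0.0
    ts = sorted(timestamps)
    best = 0
    start = 0
    # Sliding window over the sorted timestamps.
    for end in range(len(ts)):
        while ts[end] - ts[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best / len(ts)


def low_rank_fraction(ranks, threshold=2):
    """Share of links whose source PageRank is at or below the threshold."""
    if not ranks:
        return 0.0
    return sum(1 for r in ranks if r <= threshold) / len(ranks)


def spam_tells(links, burst_cutoff=0.8, low_rank_cutoff=0.8):
    """Return the names of the tells triggered by a list of (timestamp, rank) pairs.

    The cutoffs are hypothetical; tune them against links you have labeled by hand.
    """
    tells = []
    if burst_fraction([t for t, _ in links]) >= burst_cutoff:
        tells.append("burst")
    if low_rank_fraction([r for _, r in links]) >= low_rank_cutoff:
        tells.append("low_rank")
    return tells


def looks_like_link_spam(links):
    """Flag a page only when both tells fire (an arbitrary, conservative choice)."""
    return len(spam_tells(links)) == 2
```

Requiring both tells keeps false positives down, since a legitimate page that gets popular overnight will also show a burst of links. A real classifier would weigh many more signals than these two.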

More information:
Transductive link spam detection (pdf)
Spam Rank - Full automatic link spam detection work in progress (pdf)
Detecting link spam using temporal information (pdf)

Tags: artificial intelligence in the news · bots
