Google

Wednesday, August 30, 2006

Content spam


These techniques alter the logical view that a search engine has of a page's contents. They all target variants of the vector space model for information retrieval on text collections.
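As a rough illustration of why term-based ranking is open to this kind of manipulation, here is a minimal Python sketch of a vector space comparison using raw term counts and cosine similarity. The function names, the query, and the sample page texts are all made up for the example, and real search engines use far more elaborate weighting than this.

```python
import math
import re
from collections import Counter


def tf_vector(text):
    """Build a raw term-frequency vector (word -> count) from text."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical query and page texts, invented for illustration only.
query = tf_vector("rock band fan site")
honest = tf_vector("Guaranteed returns on your investment, join today")
spammed = tf_vector(
    "Guaranteed returns on your investment, join today "
    "rock band fan site rock band"
)

print(cosine(query, honest))   # no shared terms with the query
print(cosine(query, spammed))  # unrelated keywords raise the score
```

The page's visible message is the same in both cases; only the extra terms change how similar it looks to the query under this simple model.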

  • Hidden or invisible text
    • Disguising keywords and phrases by making them the same (or almost the same) color as the background, using a tiny font size, or hiding them within the HTML code, for example in "noframes" sections, ALT attributes and "noscript" sections. This makes a page appear relevant to a web crawler, and so more likely to be found, without showing the text to visitors. Example: A promoter of a Ponzi scheme wants to attract web surfers to a site where he advertises his scam. He places hidden text appropriate for a fan page of a popular music group on his page, hoping that the page will be listed as a fan site and receive many visits from music lovers. However, hidden text is not always spamdexing: it can also be used to enhance accessibility.
  • Keyword stuffing
    • This involves inserting hidden, random text on a webpage to raise the keyword density, that is, the ratio of keywords to other words on the page. Older indexing programs simply counted how often a keyword appeared and used that count to determine relevance. Most modern search engines can analyze a page for keyword stuffing and determine whether the frequency is above a "normal" level.
  • Meta tag stuffing
    • Repeating keywords in the meta tags, and using keywords that are unrelated to the site's content.
  • Gateway or doorway pages
    • Creating low-quality web pages that contain very little content and are instead stuffed with very similar keywords and phrases. They are designed to rank highly within the search results. A doorway page will generally have "click here to enter" in the middle of it.
  • Scraper sites
    • Scraper sites, also known as Made for AdSense sites, are created using various programs designed to 'scrape' search engine results pages or other sources of content and create 'content' for a website. These types of websites are generally full of advertising, or redirect the user to other sites.
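
The keyword density idea described under keyword stuffing can be sketched in a few lines of Python. This is only an illustration: the function names and sample texts are invented here, and the 0.15 threshold is an arbitrary cutoff chosen for the example, not a value any search engine is known to use.

```python
import re
from collections import Counter


def keyword_density(text, keyword):
    """Return the fraction of words in text that equal keyword (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return Counter(words)[keyword.lower()] / len(words)


def looks_stuffed(text, keyword, threshold=0.15):
    """Flag text whose keyword density exceeds an assumed 'normal' level."""
    # The threshold is a made-up illustrative value, not a documented one.
    return keyword_density(text, keyword) > threshold


# Hypothetical sample texts, invented for illustration only.
normal = ("Our band fan page lists tour dates, album reviews "
          "and photos from the latest concert.")
stuffed = "band band tickets band fan band music band band concert band"

print(looks_stuffed(normal, "band"))
print(looks_stuffed(stuffed, "band"))
```

A simple counter like the older indexing programs would reward the second text for mentioning the keyword more often; comparing the density against an expected level is one way to treat the same signal as a warning sign instead.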
