Search engines, notably Google, Yahoo! and Bing, are used every day by internet users all over the globe to find what they want and need. Search engines have made searching the internet a lot easier with the help of crawlers that traverse and index websites across the World Wide Web.
Knowing that most website traffic comes from search engines, many website owners use every possible means to push their sites up the Search Engine Results Page (SERP). One of the most popular methods of improving a website's rank and traffic is spamdexing.
Spamdexing, a word derived from “spam” and “indexing,” refers to the practice of search engine spamming. It is a form of SEO spamming. SEO is an abbreviation for Search Engine Optimization, the art of making a website optimized, or attractive, to the major search engines for optimal indexing. Spamdexing is the practice of creating websites that will be illegitimately indexed with a high position in the search engines. Sometimes, spamdexing is used to try to manipulate a search engine’s understanding of a category.
Spamdexing, also known as search spam, web spam, search engine spam or search engine poisoning, manipulates search engines to ensure that people land on the site that utilizes it when they search for specific things. Spamdexing is typically offensive SEO: it is aimed at the search engine's crawler rather than the human reader, so the material is less relevant to the search than the searcher might desire. Because many people become frustrated by constantly finding spam sites when they look for legitimate content, most popular search engines have tools in place to defeat spamdexing.
Search engines use algorithms to figure out where to rank a website and which keywords and categories the website is relevant to. Some search engines base their rankings on the meta-tag sections of a website, whereas others simply look at keywords within the URL or content of the page, font weight, available links, etc. There are many techniques used to spamdex a site. They include creating superfluous backlinks (links to higher-ranked pages) and creating spam content: articles or comments that read as nonsense to a human reader but present relevant keywords to a search engine.
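As a rough illustration of the kind of keyword signal these algorithms weigh, the sketch below (a hypothetical `keyword_density` helper, not any engine's actual formula) computes how densely a single keyword appears in a page's text. This is exactly the number that keyword stuffing tries to inflate:

```python
import re

def keyword_density(text: str, keyword: str) -> float:
    """Return the fraction of words in `text` that equal `keyword`."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

# A stuffed page: 4 of its 10 words are the target keyword.
page = "cheap flights cheap hotels cheap deals book cheap flights now"
print(round(keyword_density(page, "cheap"), 2))  # 0.4
```

A real ranking algorithm combines many such signals, which is why a suspiciously high density on one keyword is easy for an engine to flag.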
Spamdexing can do a lot of damage to the ways we find information on the internet, so measures have been taken to curb it, with some success. Spamdexing was a big problem in the 1990s, when the term first appeared in print, and search engines were fairly useless because they were compromised by it. Once the search engines improved their algorithms, that all changed: the three most popular search engines (Google, Yahoo! and Bing) developed page-ranking systems that fought spamdexing quite well, discounting spam sites and rewarding genuine, relevant websites with high page rankings.
These popular search engines improve their algorithms periodically to keep websites that use black-hat SEO and/or violate their Terms of Service (TOS) and policies from ranking high. They may also blacklist sites that do not meet their TOS and policies. Many search engines check for instances of spamdexing and will remove suspect pages from their indexes. In addition, people working for search-engine companies can quickly block the results listings of entire websites that use spamdexing.
Common spamdexing techniques can be classified into two broad classes: content spam and link spam.
Content Spam
These techniques involve altering the logical view that a search engine has of a page's contents.
- Meta-tag stuffing: This involves repeating keywords in the meta tags and using meta keywords that are unrelated to the site's content.
- Keyword stuffing: Involves the calculated placement of keywords within a page to raise the keyword count, variety, and density of the page.
- Invisible text: Unrelated hidden text is disguised by making it the same color as the background, using a tiny font size, or hiding it within HTML code.
- Article spinning: Involves rewriting existing articles, as opposed to merely scraping content from other sites.
- Cloaking: Refers to any of several means of serving a page to the search-engine crawlers that is different from the one seen by human users.
- Gateway/doorway pages: Low-quality web pages created with very little content, stuffed instead with very similar keywords and phrases.
- Scraper sites: Created by using various programs designed to "scrape" search-engine results pages or other sources of content and create "content" for a website.
- Mirror websites: Hosting of multiple websites with conceptually similar content but using different URLs.
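The invisible-text trick above is mechanical enough to detect with a simple sketch. The code below (a hypothetical `find_hidden_text` helper; real crawlers render CSS properly rather than regex-matching inline styles) flags inline-styled spans whose text color matches the page background or whose font size is tiny:

```python
import re

def find_hidden_text(html: str, page_bg: str = "#ffffff") -> list:
    """Flag <span> text hidden by background-matching color or a tiny font,
    two classic invisible-text tricks. Inline styles only -- a sketch."""
    hits = []
    for style, text in re.findall(r'<span style="([^"]*)">([^<]*)</span>', html):
        color = re.search(r'color:\s*(#[0-9a-fA-F]{6})', style)
        size = re.search(r'font-size:\s*(\d+)px', style)
        if (color and color.group(1).lower() == page_bg) or \
           (size and int(size.group(1)) <= 2):
            hits.append(text)
    return hits

html = '<p>Welcome!</p><span style="color: #ffffff">cheap pills casino loans</span>'
print(find_hidden_text(html))  # ['cheap pills casino loans']
```

Production detectors work on the rendered page, since spammers also hide text with external stylesheets, off-screen positioning, or zero-opacity layers.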
Link and URL Spam
Link spam is defined as links between website pages that are present for reasons other than merit. Link spam takes advantage of link-based ranking algorithms, which give a website a higher ranking the more highly ranked websites link to it.
- Link-building software: Using software to automate the SEO process.
- Link farms: Web sites that all hyperlink to every other site in the group.
- Hidden links: Putting hyperlinks where visitors cannot see them.
- URL redirect: A method of taking users to another page without their knowledge.
- Buying expired domains: Expired domains bought by spammers so that their pages can be replaced with links to the spammers' own pages.
- Spam blogs: Blogs created solely for commercial promotion and the passage of link authority to target sites.
- Blog spam/comment spam: Placing links randomly on other sites. The targets are guest books, forums, blogs, and any site that accepts visitors' comments.
- Page hijacking: Creating a rogue copy of a popular website that redirects web surfers to unrelated or malicious websites.
- Cookie stuffing: Involves placing an affiliate tracking cookie on a website visitor's computer without their knowledge.
- World-writable pages: Websites that can be edited by users can be used by spamdexers to insert links.
- Wiki spam: A form of link spam placed on wiki pages.
- Referrer log spam: Involves making repeated website requests using a fake referrer URL that points to the site the spammer wishes to promote.
- Sybil attack: Creating multiple websites at different domain names that all link to each other.
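To see why link farms and Sybil attacks pay off, here is a minimal power-iteration PageRank sketch (a simplified teleport model with hypothetical page names, not Google's actual implementation). Three mutually linking farm pages funnel rank into a spam "target" page, lifting it above the honest pages:

```python
def pagerank(links, iters=50, d=0.85):
    """Minimal power-iteration PageRank over {page: [outlinks]}.
    d is the damping factor; (1-d)/n is the uniform teleport share."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1 - d) / n for p in pages}
        for p, outs in links.items():
            share = rank[p] / len(outs) if outs else 0.0
            for q in outs:
                new[q] += d * share  # each page splits its rank among outlinks
        rank = new
    return rank

# Hypothetical tiny web: "home"/"about" link honestly; "farm1".."farm3"
# form a link farm that funnels rank into the spam "target" page.
web = {
    "home": ["about"],
    "about": ["home"],
    "target": ["farm1"],
    "farm1": ["farm2", "target"],
    "farm2": ["farm3", "target"],
    "farm3": ["farm1", "target"],
}
scores = pagerank(web)
print(scores["target"] > scores["home"])  # True: the farm boosts the target
```

This is exactly the effect modern engines counter by discounting link clusters whose only purpose is to pass rank, as described earlier.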
While search engines have offered a lot of hope in the fight against spamdexing, the practice apparently continues to grow, if only marginally. The fight against spam continues as more and more people connect to the internet. Spamdexing was the almost inevitable result of the rise of the internet: as people began to realize the immense monetizing potential of websites, spam began to proliferate not only in inboxes but on the web. The number of spam sites on the internet is not fully known, but it is estimated to be an extremely high percentage of all websites.