Google Search is one of the most used web services on the internet, and it holds that title for good reason.
By letting people enter queries about anything they want to know, Search returns 'answers' in fractions of a second. How does Search actually work? How can Google manage all the information on the web and have it ready whenever someone asks for it?
While Search is the most common way for people to use Google, it also comes with specialized extensions that narrow down search results. These extensions work in addition to the regular search engine. With them, users can search specifically for images related to their keywords, news articles or footage, products and services to buy, blogs, book contents, videos, papers and more.
Many users take Search for granted. They assume that what Google shows is valid, and they rarely wonder what is going on behind the curtain. But for those who are curious about how it works, Google is not magic.
The Growing Web Vs. Google
The web is constantly changing, and it keeps growing bigger. Like all search engines, Google relies on special algorithms to work. And to know the web, Google has to start somewhere.
While Google shares general facts about how it works, most of the details are kept secret under the hood.
Google Search is made of three distinct parts: its web crawler, its indexer, and its query processor. As for how it works, Search can be described as having three main functions: crawling and indexing, algorithms and providing answers, and fighting spam.
Crawling and Indexing
Before Google can process any query, it needs something to show. So to know the web, Google must first learn the web. It does this by crawling, which means following links from page to page.
Links connect the pages on the web, and they are what give the internet its 'network of networks' character. Google's crawlers use links to jump from one page to another, reaching more than 60 trillion web pages.
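To make the idea of jumping from link to link concrete, here is a minimal sketch of a crawler. It is not Google's crawler: the toy "web" is a hypothetical in-memory dictionary, the page names are made up, and breadth-first order is just one simple way to follow links.

```python
from collections import deque

# Hypothetical in-memory "web": each page maps to the pages it links to.
TOY_WEB = {
    "a.example/home": ["a.example/about", "b.example/blog"],
    "a.example/about": ["a.example/home"],
    "b.example/blog": ["c.example/paper", "a.example/home"],
    "c.example/paper": [],
    "d.example/orphan": ["a.example/home"],  # no page links to this one
}

def crawl(seed, web):
    """Follow links breadth-first from a seed page; return pages in visit order."""
    seen = {seed}
    queue = deque([seed])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        for link in web.get(page, []):
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return order

visited = crawl("a.example/home", TOY_WEB)
print(visited)
```

In this toy example the orphan page is never visited, because no other page links to it. A crawler that relies on links alone can only discover pages that something else points to.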
By following links, Google collects fragments of each page and sorts them according to their content. To keep track of all that, Google stores everything in its index.
That index is more than 100 million gigabytes in size.
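One common way to organize such an index is an inverted index, which maps each word to the pages that contain it. The sketch below assumes that structure for illustration only. The pages, their text, and the function names are hypothetical, and Google's real index is far more elaborate.

```python
from collections import defaultdict

# Hypothetical crawled pages and their text content.
PAGES = {
    "a.example/home": "google search engine basics",
    "b.example/blog": "how a search engine crawls the web",
    "c.example/paper": "ranking web pages by links",
}

def build_index(pages):
    """Map every word to the set of pages it appears on."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def lookup(index, query):
    """Return the pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())  # keep only pages matching all words
    return results

index = build_index(PAGES)
print(sorted(lookup(index, "search engine")))
```

Looking up a word this way is a dictionary access rather than a scan of every page, which is the main reason an index is built ahead of time.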
Algorithms and Providing Answers
Users' expectations are high. Whenever they use Search, they want relevant answers, not just a list of websites. People want an answer, and they want it fast.
To deliver what users expect, Google runs on a large distributed network made up of many computers, which lets it carry out fast parallel processing. Parallel processing performs many calculations simultaneously, so data can be processed quickly. This is especially useful given that Google serves billions of searches each day.
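The general pattern can be sketched as splitting the data into shards, searching every shard at the same time, and merging the partial results. In the sketch below, threads in a single process stand in for separate machines, and the shards and documents are hypothetical. It illustrates the idea, not Google's infrastructure.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards: each one holds a slice of the documents.
SHARDS = [
    {"doc1": "fast parallel processing", "doc2": "distributed network of computers"},
    {"doc3": "parallel search over many machines"},
    {"doc4": "cooking recipes", "doc5": "parallel lines in geometry"},
]

def search_shard(shard, term):
    """Return the ids of documents in one shard that contain the term."""
    return [doc_id for doc_id, text in shard.items() if term in text.split()]

def parallel_search(shards, term):
    """Search all shards concurrently and merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(lambda shard: search_shard(shard, term), shards)
        return sorted(doc for partial in partials for doc in partial)

print(parallel_search(SHARDS, "parallel"))
```

Because each shard is searched independently, adding more shards and more workers keeps the time per query roughly flat as the data grows, at least in principle.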
When a user enters a query in the Search field, Google's algorithms quickly look for clues to better understand what the user means.
Like other search engines, Google has a large database of keywords and the places where those words can be found. What sets Google apart is the way it ranks search results, which determines the order in which they are displayed.
To do this, Google uses its trademarked PageRank algorithm, which assigns every web page a score. PageRank measures the importance of a page based on the incoming links from other pages. In short, each link to a page from another site counts as a vote and adds to that page's PageRank.
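The voting idea can be sketched with the textbook form of PageRank, computed by repeated iteration. This is a simplified illustration, not Google's production code: the four-page link graph is hypothetical, the damping factor of 0.85 is the commonly cited value, and the handling of pages without outgoing links is one common convention among several.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by repeated iteration over a small link graph.

    `links` maps each page to the list of pages it links to. Every link
    target must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its "vote" evenly among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical link graph: A, B and D all link to C, and C links back to A.
LINKS = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}

ranks = pagerank(LINKS)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Page C collects the most votes and ends up with the highest score. Page A comes second even though it has only one incoming link, because that link comes from the highly ranked C. A vote from an important page counts for more.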
So when its algorithms have the clues about the user's query, Google pulls the matching information from its index and ranks it using over 200 factors, one of which is the PageRank of a given page. Google then shows the results the user is most likely to expect.
Google manages to do all of this in about 1/8th of a second.
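How many factors might be folded into a single ordering can be sketched as a weighted sum. Google does not publish its factors or how it combines them, so everything in this sketch is an assumption: the three signal names, the weights, and the candidate pages are all made up for illustration.

```python
# Hypothetical signals and weights, for illustration only.
WEIGHTS = {"text_match": 0.5, "pagerank": 0.3, "freshness": 0.2}

# Hypothetical candidate pages pulled from the index, with signals scaled 0..1.
CANDIDATES = {
    "a.example/guide": {"text_match": 0.9, "pagerank": 0.4, "freshness": 0.2},
    "b.example/news": {"text_match": 0.6, "pagerank": 0.3, "freshness": 0.9},
    "c.example/wiki": {"text_match": 0.7, "pagerank": 0.9, "freshness": 0.5},
}

def score(signals, weights):
    """Combine a page's signals into one number using a weighted sum."""
    return sum(weights[name] * signals.get(name, 0.0) for name in weights)

def rank_results(candidates, weights):
    """Return candidate URLs ordered from highest to lowest combined score."""
    return sorted(
        candidates,
        key=lambda url: score(candidates[url], weights),
        reverse=True,
    )

print(rank_results(CANDIDATES, WEIGHTS))
```

Here the wiki page ranks first even though the guide matches the query text best, because its higher PageRank and freshness outweigh the difference. No single factor decides the order.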
Google is always updating its algorithms to keep up with current trends and demand, so that Search can give the most relevant answers quickly and effectively. While some updates are minor, others are large enough to noticeably change how the search results page looks.
- Google's Panda Algorithm to Lower the Rank of Low Quality Websites
- Introducing the Google Penguin
- Google and the Hummingbird Algorithm
- Google's Pigeon Algorithm: Pushing Local Search Forward
Fighting Spam
As the web grows, it keeps bringing new content and information for Google to crawl and index. Since not all of that information is published for good reasons, Google doesn't want to fill its database with content that isn't worth its time or its users' time.
Google constantly fights spam to keep its search results relevant. While most spam removal is done automatically, Google also examines other questionable documents manually. If Google finds something it considers spam, it can notify the site owner so they can fix it.
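The split between automatic handling and manual review can be sketched as a simple triage step. Google's actual spam signals are not public, so the sketch swaps in a toy keyword-stuffing check, and both thresholds are arbitrary assumptions chosen only to show the three outcomes.

```python
from collections import Counter

def stuffing_ratio(text):
    """Share of the text taken up by its single most repeated word."""
    words = text.lower().split()
    if not words:
        return 0.0
    most_common_count = Counter(words).most_common(1)[0][1]
    return most_common_count / len(words)

def triage(text, auto_threshold=0.5, review_threshold=0.3):
    """Sort a document into one of three buckets (thresholds are hypothetical)."""
    ratio = stuffing_ratio(text)
    if ratio >= auto_threshold:
        return "flag_automatically"
    if ratio >= review_threshold:
        return "send_to_manual_review"
    return "keep"

print(triage("cheap pills cheap pills cheap cheap cheap"))
print(triage("best pizza best pasta in town"))
print(triage("how search engines crawl and index the web"))
```

Obvious cases are handled without a human, while borderline ones are queued for a person to look at. That matches the mix of automatic and manual checks described above.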
Millions of websites have been marked as spam by Google, and the number is still rising.
Fighting spam isn't just Google's job. Every business on the web is trying to keep spam from obscuring real information and taking up unnecessary space.
Related article: How Search Engines Process Your Queries Determines Your Satisfaction