If you have used Google, Bing or Yahoo to perform a search, then you have already used one of the most popular Search Engines in the U.S. and might have an idea of what they are. Search Engines are special sites on the Web that are designed to help people find information stored on other sites. What you may not know is how Search Engines differ from one another and how they work.
In general they all perform three basic tasks:
1. They search the Internet, or select pieces of the Internet, based on important words,
2. They keep an index of the words they find, and where they find them, and
3. They allow users to look for words or combinations of words found in that index.
Early Search Engines held an index of a few hundred thousand pages and documents, and received maybe one or two thousand inquiries each day. Today, a top Search Engine will index hundreds of millions of pages, and respond to tens of millions of queries per day.
So let's examine search engines and how they work to perform a search.
When you look for a file or document through a Search Engine, the engine must already have found it. To find information on the hundreds of millions of Web pages that exist, a Search Engine uses special software robots, called spiders, to build lists of the words found on Web sites.
When a spider is building its lists, the process is called web crawling.
Think of all the information a spider must go through in order to build and maintain a useful list of words! How does any spider start its travels over the Web?
First, it goes through lists of heavily used servers and very popular pages. Starting from a popular site, it indexes the words on that site's pages and follows every link found within the site. The “spidering” system quickly begins to travel, spreading out across the most widely used portions of the Web.
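To make the crawling idea concrete, here is a minimal sketch in Python. It does not touch the real Internet: the `TOY_WEB` dictionary, its URLs and the `crawl` function are all made up for illustration, and a real spider would also need to fetch pages, respect site rules and handle errors.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "http://popular.example/": (
        "search engines help people",
        ["http://popular.example/a", "http://other.example/"],
    ),
    "http://popular.example/a": (
        "spiders crawl the web",
        ["http://popular.example/"],
    ),
    "http://other.example/": ("web pages contain words", []),
}


def crawl(seeds, web):
    """Breadth-first crawl: start at the seeds, list each page's words,
    and follow every link exactly once."""
    words_by_url = {}
    queue = deque(seeds)
    seen = set(seeds)
    while queue:
        url = queue.popleft()
        page = web.get(url)
        if page is None:  # dead link: nothing to read
            continue
        text, links = page
        words_by_url[url] = text.lower().split()
        for link in links:
            if link not in seen:  # avoid visiting the same page twice
                seen.add(link)
                queue.append(link)
    return words_by_url


pages = crawl(["http://popular.example/"], TOY_WEB)
print(list(pages))
```

Starting from a single popular page, the sketch reaches all three toy pages by following links, which is the "spreading out" behavior described above.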
Once the spiders have completed the task of finding information on Web pages, the Search Engine must store the information in a way that makes it useful.
There are two key components involved in making the gathered data accessible to users:
1. The information stored with the data, and
2. The method by which the information is indexed.
A Search Engine could just store the word and the URL where it was found, but that would make it a tool of limited use. There would be no way to tell if the word was used in an important or a trivial way, if the word was used once or many times, or whether the page contained links to other pages containing the word. There would be no way of building the ranking list that tries to present the most useful pages at the top of the list of search results.
Why do you get different search results from different Search Engines?
Search Engines index information in different ways. Most store the number of times that a word appears on a page, but each has its own formula for assigning weight to the words in its index.
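Here is a small Python sketch of why the weighting formula matters. The two formulas, `raw_count` and `density`, are simple stand-ins invented for this example; real Search Engines use far more elaborate and mostly unpublished formulas.

```python
def raw_count(words, word):
    """Weight = how many times the word appears on the page."""
    return words.count(word)


def density(words, word):
    """Weight = share of the page's words that match the search word."""
    return words.count(word) / len(words) if words else 0.0


def rank(pages, word, score):
    """Return the URLs that contain the word, highest score first."""
    scored = [(score(text.lower().split(), word), url)
              for url, text in pages.items()]
    scored.sort(key=lambda pair: -pair[0])
    return [url for value, url in scored if value > 0]


# Made-up pages: a long page with 3 matches in 20 words,
# and a short page with 1 match in 2 words.
pages = {
    "http://long.example/": "web " * 3 + "filler " * 17,
    "http://short.example/": "web design",
}

print(rank(pages, "web", raw_count))  # long page first
print(rank(pages, "web", density))    # short page first
```

The same two pages come back in opposite order depending on the formula, which is one reason two Search Engines can give different results for the same query.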
An index has a single purpose: it allows information to be found as quickly as possible. There are quite a few ways for an index to be built, but one of the most effective ways is to build a hash table. In hashing, a formula is applied to attach a numerical value to each word.
The formula is designed to evenly distribute the entries across a predetermined number of divisions. This numerical distribution is different from the distribution of words across the alphabet, and that is the key to a hash table’s effectiveness.
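A minimal Python sketch of the hashing idea follows. It assumes CRC32 (from Python's standard `zlib` module) as the formula and 8 as the number of divisions; both choices, along with the `HashIndex` class, are illustrative only.

```python
import zlib

NUM_BUCKETS = 8  # the predetermined number of divisions (arbitrary here)


def bucket_for(word, num_buckets=NUM_BUCKETS):
    """Apply a formula to the word to get a bucket number.
    CRC32 is only a stand-in for whatever formula an engine uses."""
    return zlib.crc32(word.encode("utf-8")) % num_buckets


class HashIndex:
    """Toy hash table mapping words to the set of URLs containing them."""

    def __init__(self, num_buckets=NUM_BUCKETS):
        self.buckets = [[] for _ in range(num_buckets)]

    def add(self, word, url):
        bucket = self.buckets[bucket_for(word, len(self.buckets))]
        for entry_word, urls in bucket:
            if entry_word == word:
                urls.add(url)
                return
        bucket.append((word, {url}))

    def lookup(self, word):
        # Only one bucket is scanned, not the whole index.
        bucket = self.buckets[bucket_for(word, len(self.buckets))]
        for entry_word, urls in bucket:
            if entry_word == word:
                return urls
        return set()


idx = HashIndex()
idx.add("spider", "http://a.example/")
idx.add("spider", "http://b.example/")
idx.add("web", "http://a.example/")
print(sorted(idx.lookup("spider")))
```

Because the bucket number comes from the formula rather than from the first letter, common letters such as "s" do not end up with overcrowded sections, and a lookup only has to scan one small bucket.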
When a person requests a search on a keyword or phrase, the Search Engine software searches the index for relevant information. The software then provides a report back to the searcher with the most relevant web pages listed first.
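The lookup step can be sketched in Python as well. This example assumes a ready-made index that maps each word to per-page counts, and it ranks by total occurrences, which is a deliberately simple stand-in for a real relevance formula. The `search` function and the sample data are hypothetical.

```python
def search(index, query):
    """Return the URLs that contain every word in the query,
    ordered by total occurrences (highest first, ties by URL)."""
    words = query.lower().split()
    if not words:
        return []
    postings = [index.get(word, {}) for word in words]
    matches = set(postings[0])
    for posting in postings[1:]:
        matches &= set(posting)  # keep only pages that have all the words
    return sorted(
        matches,
        key=lambda url: (-sum(posting[url] for posting in postings), url),
    )


# Made-up index: word -> {url: number of occurrences}.
index = {
    "search": {"http://a.example/": 3, "http://b.example/": 1},
    "engine": {
        "http://a.example/": 1,
        "http://b.example/": 1,
        "http://c.example/": 5,
    },
}

print(search(index, "search engine"))
# ['http://a.example/', 'http://b.example/']
```

The page at `http://c.example/` mentions "engine" most often, but it is left out of the two-word search because it never mentions "search".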
I hope you got some key information about search engines and how they work and I look forward to getting you more great information like this soon.