Back in 1996, two Stanford graduate students, Larry Page and Sergey Brin, had an idea for a new kind of search engine. Instead of just counting how many times a keyword showed up on a website, they thought it would be better to look at how pages linked to each other. They called their idea “BackRub” because it looked at the links pointing back to a page to determine its importance.
This was a big departure from how search engines worked at the time. Today, Google, the search engine that grew out of Page and Brin’s project, handles around 5.5 billion searches every single day. It draws on an index of over 130 trillion web pages to pick the best results for each search, and it does all of this in less than a second.
Behind the scenes, there’s a lot of work that goes into making sure Google and other search engines can find the right results for you. Even though companies like Google don’t share exactly how their search engines work, it’s still important for marketers and website owners to understand the basics. By knowing how search engines find and pick results, people can make changes to their websites that help them show up higher in search results.
How Search Engines Work: The Basics
When you type something into a search bar on the internet, a search engine goes to work. It’s a collection of tools that work together to find things like images, videos, and web pages that match your search words. If you own a website, you might use Search Engine Optimization (SEO) to make sure your content shows up in those search results.
There are three main parts of a search engine:
- Web crawlers: These are little bots that explore the internet all the time to find new web pages. They gather information about each page and use links to find even more pages to explore and catalog.
- Search index: Think of this as a big book that lists every web page out there. It’s organized in a way that makes it easy for the search engine to find pages that are relevant to what you’re searching for.
- Search algorithms: These are like secret formulas that search engines use to figure out which web pages are the best match for your search. They look at things like the quality and popularity of the pages to decide how to rank them in your search results.
Search engines want to show you the best results possible so you’ll keep using them; that’s how they make money through advertising. Google’s ad business alone brought in over $116 billion in 2018.
How Search Engines Crawl, Index, and Rank Content
Have you ever noticed how fast a search happens? You type in a few words, and a search engine hands you a list of websites that might have what you’re looking for in a fraction of a second.

Behind the scenes, though, search engines are doing a huge amount of work to make that happen, and it goes on all the time, even when you’re not searching for anything.

First, web crawlers explore the internet, visiting web pages and collecting information about them. Then, the search engine organizes all that information into a huge database called an index. Finally, ranking algorithms evaluate how relevant and popular each indexed page is to decide which ones to show you first.

Let’s look at each of those three stages in turn.
Crawling
To gather information from the web, search engines use crawlers which are automated scripts. The crawlers begin with a list of websites, and algorithms, which are sets of computational rules, determine which sites to crawl. These algorithms also determine the number of pages to crawl and the frequency of crawling.
The crawlers visit each website on the list in a systematic manner, following links in attributes such as href and src to navigate to internal and external pages. As the crawlers move through the pages, they build a map of interconnected pages that continually expands over time.
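To make the link-following step concrete, here is a minimal sketch of how a crawler might extract href and src links from a page using Python’s standard library. The `LinkExtractor` class and the sample HTML are hypothetical; real crawlers add queueing, politeness rules, and deduplication on top of this basic idea.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collects URLs from href and src attributes, as a crawler would."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative URLs against the page's own URL
                self.links.append(urljoin(self.base_url, value))

html = '<a href="/about">About</a> <img src="logo.png">'
parser = LinkExtractor("https://example.com/index.html")
parser.feed(html)
print(parser.links)  # ['https://example.com/about', 'https://example.com/logo.png']
```

Each discovered URL would then be added to the crawl queue, which is how the crawler’s map of interconnected pages keeps growing.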
Takeaway for Marketers
To make sure your website appears in search results, you need to ensure it’s easily accessible to crawlers. Here are some tips to help:
- Logical site hierarchy: Organize your website in a logical manner, from domain to category to subcategory. This helps crawlers navigate your site more efficiently, so your site stays within its crawl budget.
- Links: Include internal links on every page. Crawlers need links to move between pages. Pages without links can’t be crawled or indexed.
- XML sitemap: Create a list of all your website’s pages, including blog posts. This helps crawlers know which pages to crawl. You can use plugins like Yoast and Google XML Sitemaps to generate and update your sitemap when you publish new content.
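If you’d rather see what a sitemap looks like under the hood than rely on a plugin, here is a sketch that builds a minimal XML sitemap (following the sitemaps.org protocol) with Python’s standard library. The URLs are placeholders; real sitemaps often also include optional fields like lastmod.

```python
import xml.etree.ElementTree as ET

def build_sitemap(urls):
    """Build a minimal XML sitemap from a list of page URLs."""
    ns = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=ns)
    for page in urls:
        url = ET.SubElement(urlset, "url")
        # <loc> holds the absolute URL of one page on the site
        ET.SubElement(url, "loc").text = page
    return ET.tostring(urlset, encoding="unicode")

sitemap = build_sitemap([
    "https://example.com/",
    "https://example.com/blog/first-post",
])
print(sitemap)
```

The resulting file is uploaded to your site (typically at /sitemap.xml) so crawlers can find every page you want indexed, even ones with few internal links.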
If you’re not sure whether your site is accessible to crawlers, try using our Site Audit tool. It identifies accessibility issues and provides advice on how to fix them. It also sends you a technical SEO report for your site every two weeks, so you can monitor your site’s visibility for crawlers.
Indexing
When a bot finds a page, it fetches or renders it just like a browser does, including all the visual content like images and videos. The bot then categorizes this content into different types, such as text and keywords, CSS and HTML, and images. This helps the crawler understand what’s on the page, which is essential for determining its relevance to particular keyword searches.
The information is then stored in an index, a massive database with an entry for every word seen on every webpage indexed. Google’s Caffeine Index, for instance, takes up about 100 million gigabytes and fills server farms worldwide with thousands of computers that operate continuously.
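The core data structure behind such an index is an “inverted index”: a mapping from each word to the pages it appears on. Here is a toy sketch with hypothetical page IDs; production indexes also store positions, frequencies, and much more.

```python
from collections import defaultdict

def build_index(pages):
    """Map each word to the set of page IDs it appears on (an inverted index)."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index

pages = {
    "page1": "best sushi restaurants",
    "page2": "sushi rolling guide",
    "page3": "best pizza in town",
}
index = build_index(pages)
print(sorted(index["sushi"]))  # ['page1', 'page2']
print(sorted(index["best"]))   # ['page1', 'page3']
```

Looking up a word in this structure is instant, no matter how many pages are indexed, which is why search engines can answer queries in well under a second.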
Takeaway for Marketers
To ensure that search engine crawlers index your site the way you want, you can control which parts of your site they are allowed to access. Here are two tools you can use to do this:
- URL Inspection Tool: This tool shows you what the crawlers see when they land on your site, and you can use it to find out why certain pages are not being indexed or to request that Google crawl a specific page.
- Robots.txt: You may not want crawlers to show every page of your site in search results. For example, you may want to exclude author pages or pagination pages. You can use a robots.txt file to tell the crawlers which pages they can and cannot access.
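You can check how a robots.txt file will be interpreted using Python’s standard urllib.robotparser module. The rules below are a hypothetical example matching the author-page and pagination scenario above:

```python
import urllib.robotparser

# Hypothetical robots.txt blocking author and pagination pages
rules = """
User-agent: *
Disallow: /author/
Disallow: /page/
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Regular content is allowed; blocked sections are not
print(rp.can_fetch("*", "https://example.com/blog/my-post"))  # True
print(rp.can_fetch("*", "https://example.com/author/jane"))   # False
```

Well-behaved crawlers fetch robots.txt before crawling and apply exactly this kind of check to every URL they consider visiting.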
Blocking crawlers from certain areas of your site won’t harm your search rankings. Instead, it can help crawlers focus on the most important pages, which can improve your overall search performance.
Ranking
In the final stage, search engines utilize search algorithms to sift through indexed information and provide relevant results for each query. These algorithms are sets of rules that evaluate what the user is searching for and determine which results best match their query.
Algorithms weigh various factors to determine the quality of pages in the index. Google employs several algorithms to rank relevant results, and many of its ranking factors assess the popularity and user experience of a page. These factors include:
- Quality of backlinks
- Mobile responsiveness
- Freshness of content
- User engagement
- Page load speed
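As a toy illustration, signals like these could be combined into a single weighted score per page. The weights and signal values below are entirely hypothetical; real ranking formulas are secret and use hundreds of signals.

```python
# Hypothetical weights -- real search algorithms are far more complex
WEIGHTS = {
    "backlink_quality": 0.30,
    "mobile_responsive": 0.15,
    "freshness": 0.20,
    "engagement": 0.20,
    "load_speed": 0.15,
}

def rank_score(signals):
    """Combine normalized (0-1) signals into one weighted relevance score."""
    return sum(WEIGHTS[name] * signals.get(name, 0.0) for name in WEIGHTS)

page_a = {"backlink_quality": 0.9, "mobile_responsive": 1.0,
          "freshness": 0.4, "engagement": 0.7, "load_speed": 0.8}
page_b = {"backlink_quality": 0.3, "mobile_responsive": 1.0,
          "freshness": 0.9, "engagement": 0.5, "load_speed": 0.6}

print(rank_score(page_a) > rank_score(page_b))  # True
```

Pages would then be sorted by this score to produce the final results order, which is why improving any individual signal can nudge a page upward.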
To ensure that the algorithms function effectively, Google utilizes human Search Quality Raters to test and refine them. This is one of the rare instances where humans, rather than programs, are involved in the operation of search engines.
Takeaway for Marketers
Search engines aim to provide the most relevant and useful results to their users to maintain their satisfaction and generate ad revenue. Consequently, many of the ranking factors used by search engines are the same ones that humans use to evaluate content, such as page speed, freshness, and links to other helpful content.
To enhance website rankings, it is advisable to optimize page speed, readability, and keyword density during the design and updating process. Improving engagement metrics such as time-on-page and bounce rate can also have a positive impact on rankings.
What Happens When a Search Is Performed?
We have learned about the process of how search engines return relevant results, which involves crawling, indexing, and ranking. But how does this process help search engines answer your search query? Let’s take a step-by-step look at how search engines answer queries, starting from the moment you enter a term in the search bar.
Step 1: Search Engines Parse Intent
Search engines use sophisticated language models to understand the search intent behind a term so they can return relevant results. The query is broken down into chunks of keywords and parsed for meaning.
For instance, Google’s synonym system can recognize when groups of words mean the same thing. When you enter “dark colored dresses,” search engines will return results for black dresses and dark tones since the engine understands that dark is often synonymous with black.
Search engines also use keywords to understand broad “categories” of search intent. In a query like “buy dark colored dresses,” the term “buy” signals to search engines that they should display product pages to match a shopping searcher’s intent.
To understand searcher intent, search engines also use “freshness” algorithms that identify trending keywords and return newer pages. This is evident for terms like “election results,” which produce vastly different results pages during election season than at other times.
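The parsing described in this step can be sketched in a few lines. The synonym table and intent keywords below are hypothetical stand-ins; real engines learn these relationships from enormous amounts of data.

```python
# Hypothetical synonym and intent tables -- real engines learn these from data
SYNONYMS = {"dark": ["black"], "dresses": ["gowns"]}
INTENT_KEYWORDS = {"buy": "shopping", "how": "informational", "near": "local"}

def parse_query(query):
    """Split a query into terms, expand synonyms, and guess a broad intent."""
    terms = query.lower().split()
    expanded = set(terms)
    for term in terms:
        expanded.update(SYNONYMS.get(term, []))
    # First intent keyword found wins; otherwise fall back to "general"
    intent = next((INTENT_KEYWORDS[t] for t in terms if t in INTENT_KEYWORDS),
                  "general")
    return expanded, intent

terms, intent = parse_query("buy dark dresses")
print(sorted(terms))  # ['black', 'buy', 'dark', 'dresses', 'gowns']
print(intent)         # shopping
```

The expanded term set is what gets matched against the index in the next step, which is how a search for “dark colored dresses” can surface pages about black dresses.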
Step 2: Search Engines Match Pages to Query Intent
After understanding what you’re looking for, search engines search for matching pages. They use various factors to determine the best pages, including relevance of the title and content, the types and quality of content, the quality and freshness of the website, page popularity, and the language of your search query.
For example, if you search for “best places to eat sushi,” search engines will match pages with the word “sushi” or related terms (such as “Japanese food”) in the title and content. They’ll then sort the results by popularity, freshness, and quality.
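Putting matching and sorting together, here is a toy sketch of that sushi example. The page texts and popularity scores are invented for illustration; a real engine would consult its inverted index and blend many quality signals instead of a single popularity number.

```python
# Hypothetical mini-corpus with a single popularity signal per page
PAGES = {
    "sushi-guide": {"text": "best sushi restaurants and japanese food", "popularity": 0.8},
    "pizza-list":  {"text": "best pizza places", "popularity": 0.9},
    "sushi-home":  {"text": "rolling sushi at home", "popularity": 0.5},
}

def match_and_rank(query_terms):
    """Return IDs of pages containing any query term, most popular first."""
    hits = [(pid, page["popularity"]) for pid, page in PAGES.items()
            if any(t in page["text"].split() for t in query_terms)]
    return [pid for pid, _ in sorted(hits, key=lambda h: h[1], reverse=True)]

print(match_and_rank(["sushi"]))  # ['sushi-guide', 'sushi-home']
```

Note that the popular pizza page never appears: matching comes first, and only matched pages compete on popularity and quality.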
Additionally, search engines may provide enhanced search results such as a knowledge graph or image carousel depending on your search intent.
Step 3: Search Engines Apply ‘Localized’ Factors
Search engines use a variety of factors to determine which results to show you, and these factors can be specific to your individual search. For instance, if you search for “best frozen cheese pizza,” your results may differ from those of a friend in another state due to factors like:
- Location: Even for non-location-specific searches, search engines may prioritize results that are relevant to your location. So if you search for “football,” you may see pages about the Steelers if you’re in Pittsburgh and pages about the 49ers if you’re in San Francisco.
- Search settings: Your search settings can also impact which results you see. For example, if you set a preferred language or turned on SafeSearch (which filters out explicit results), the results you see may be tailored to those preferences.
- Search history: Your search history can influence the results you see as well. If you search for “Hemingway,” for example, and click on several results about the writer, future searches for “Hemingway” may prioritize results related to the writer rather than the Hemingway editing app.
Takeaway for Marketers
Search results can vary greatly depending on the individual searcher and the specific search query. It’s difficult to anticipate exactly how and when your website will appear to each person. The most effective strategy is to send clear signals of relevance to search engines by conducting thorough keyword research, implementing strong technical SEO, and executing a well-planned content strategy. By doing so, you increase the likelihood of appearing in search engine results pages that are genuinely pertinent to your content.
Use This Knowledge To Boost Results
Understanding how search engines function is essential for creating websites that are easily crawled and indexed. By providing the appropriate signals to search engines, you can ensure that your pages appear in relevant results pages, which is crucial for any online business’s success.
If you want to ensure your content is in good shape for crawlers, try the Site Audit tools available in Semrush’s Advanced Plan. With this plan, you’ll receive comprehensive reports that identify technical and on-page optimization opportunities you may have overlooked.