Optimize your Knowledge Base for better visibility by allowing search engine crawling and indexing

All you need to know about the no-follow and no-index directives in your Knowledge Base.


What are article crawlability and indexability?  

Crawlability and indexability are vital for making an article visible and accessible to search engines. When a search engine crawls an article, it reviews the site's content to assess its topic and relevance to specific search queries. If a site lacks crawlability, meaning search engines struggle to scan it effectively, the content will not be indexed and will consequently not appear in search results.
 
Additionally, a site’s content must be appropriately optimized for search engine crawlers, allowing it to be indexed accurately and appear in relevant search outcomes. Ensuring every article is crawlable and indexable can significantly enhance its visibility and improve its search engine rankings.

How do you block search indexing?

There are two methods to apply the no-index directive: using a <meta> tag or via an HTTP response header. Both methods achieve the same result, so select the one that best suits your website and content type. You can also pair the no-index directive with other indexing rules. For instance, you can allow indexing but disallow link following like this: <meta name="robots" content="index, nofollow" />.
<meta> tag

To prevent all search engines that support the no-index rule from indexing a page on your site, place the following <meta> tag in the <head> section of your page:

<meta name="robots" content="noindex">
Notes
Certain search engines may interpret the no-index directive in various ways. Therefore, your page could still appear in other search engines' results.
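The HTTP response header method mentioned above works through the X-Robots-Tag header. As a sketch, here is how it might look in an Apache .htaccess file (the Apache setup and the PDF file pattern are assumptions for illustration); this approach is useful for non-HTML resources such as PDFs, where a <meta> tag cannot be added:

```apache
# Assumed Apache setup: send a noindex directive for all PDF files,
# since PDFs have no <head> section to hold a <meta> tag
<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>
```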

Usage of the no-index tag 

The no-index rule can be implemented using a <meta> tag or an HTTP response header to restrict search engines that recognize the no-index directive, like Google, from indexing certain content. When Googlebot comes across a page with this tag or header, it will remove that page entirely from Google Search results, regardless of whether there are links from other websites.

Notes
To ensure the no-index rule works, the page or resource must be accessible and not blocked by a robots.txt file. If it is blocked, the crawler will never see the no-index rule, and the page might still show in search results if other pages link to it. A no-index tag is also beneficial if you lack root access to the server, as it lets you manage indexing on a page-by-page basis.
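To make the pitfall above concrete: a robots.txt rule like the following (the URL path is a hypothetical example) stops crawlers from fetching the page at all, so any no-index tag on that page would never be seen:

```
# robots.txt — caution: disallowing a URL here prevents crawlers from
# fetching the page, so a noindex tag on it will never be discovered.
# Do not combine Disallow with noindex on the same page.
User-agent: *
Disallow: /campaign-landing-page.html
```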

What are no-follow links? 

Typically, it's best to allow robots to follow all links on a webpage. Being overly strict in indicating which links should be followed or marked as no-follow may give the impression that the site is trying to influence how a robot views it.

 

Page sculpting is the practice of using no-follow directives to influence how ranking signals flow between pages. At best, these attempts to manipulate bots are ineffective. At worst, trying to control bots with no-follow may result in penalties.

When should you use meta no-follow?

Consequently, there are very few scenarios in which using meta robots no-follow on a website is appropriate. In fact, encountering meta robots no-follow during an SEO audit often indicates that the website has been over-optimized.

What are dofollow links?

Dofollow links, often simply called "follow" links, are ordinary hyperlinks. Unlike links carrying attributes such as "nofollow" or "sponsored," they have no special identifiers. As a result, follow links pass ranking signals to the pages they connect to.
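The difference is easiest to see side by side. A sketch in plain HTML (the URLs are placeholder examples):

```html
<!-- A standard "dofollow" link: no rel attribute is needed -->
<a href="https://www.zoho.com/desk/">Zoho Desk</a>

<!-- A nofollow link: asks crawlers not to pass ranking signals -->
<a href="https://example.com/ad" rel="nofollow">Sponsored offer</a>
```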

Experimenting with crawlability and indexability

Before a website's pages can be displayed in search results, search engine bots must first discover all the pages on the site and then analyze them to determine their rankings. The initial phase of finding pages is called crawling, while the subsequent analysis is called indexing.

 

Crawling starts with bots identifying all the URLs of a website's pages. They primarily find these URLs through internal links within the site or through backlinks from external sites. Once a bot identifies a page’s URL, it retrieves content, including the title, text, images, and additional data, such as the date it was last updated. Restrictions can dictate which files and pages a bot can crawl.
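The link-discovery step described above can be sketched in a few lines of Python using only the standard library. This is a simplified illustration of what a crawler does when it scans a fetched page for URLs (the sample HTML is invented for the example), not an implementation of any real search engine bot:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects href targets from <a> tags, noting whether each is nofollow."""
    def __init__(self):
        super().__init__()
        self.links = []  # list of (url, is_nofollow) tuples

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            a = dict(attrs)
            href = a.get("href")
            if href:
                # A crawler records the URL and checks the rel attribute
                self.links.append((href, "nofollow" in (a.get("rel") or "")))

parser = LinkExtractor()
parser.feed('<a href="/pricing">Pricing</a> '
            '<a rel="nofollow" href="https://example.com/ad">Ad</a>')
print(parser.links)  # → [('/pricing', False), ('https://example.com/ad', True)]
```

A real bot would then queue the followable URLs for crawling and skip signal transfer for the nofollow ones.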

 

Indexing occurs after the crawling process. During this stage, bots assess all the information collected from the crawl. They evaluate whether the content is valuable and authoritative and identify the topics associated with the page and how it compares to other relevant pages. Additionally, search engine bots will determine which search results a page might appear in, if any, and its position within those results.

No-follow vs no-index

Consider a typical example: an ad campaign landing page designed solely for traffic from paid ads. Since these pages are not meant to attract organic search traffic, implementing a meta robots no-index tag will keep search engines from ranking them.
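For a landing page like this, a sensible sketch is "noindex, follow": keep the page itself out of search results while still letting bots follow its links (the combination shown is one common choice, not the only valid one):

```html
<head>
  <!-- Keep this ad-only landing page out of organic search results,
       but still allow crawlers to follow the links on it -->
  <meta name="robots" content="noindex, follow">
</head>
```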

Summing up

  • No-index tells a robot not to index the page. It should be used to keep pages out of search results or to help with low-quality content issues.
  • Meta nofollow tells a robot not to follow a specific link or all of the links on a page. This should typically not be used.
  • Robots should be allowed to determine what gets indexed and shown in search results. If a page is not intended to be visible in search results, it often shouldn't be part of the website.

 

Please watch this space for more detailed use cases of SEO tags that you can apply to your organisation's knowledge base.

 

Cheers,

 

Kavya Rao,

The Zoho Desk Team