Indexing is how Google records your pages, keywords and other relevant details so that your website can be displayed when matching searches are made. Google uses automated 'spiders' that crawl the web regularly, looking for new sites and pages to index. By default, Google indexes every website it can reach, except those held back by technical glitches.
Do not block Google from your site entirely. If you want Google to leave a section of your website out of its index while keeping that section accessible to visitors, you can adopt the following simple measures –
- You can use the Disallow directive in the robots.txt file to block pages, files and directories from being crawled. To exclude an individual file, add the following directives to the robots.txt file:
User-agent: *
Disallow: /directory/name-of-file.html
- If you wish to exclude whole directories, add the following to the robots.txt file:
User-agent: *
Disallow: /first-directory/
Disallow: /second-directory/
Be careful! Before you proceed, double-check the robots.txt file to make sure you have not listed any directories that should remain indexed. Remember, blocking a page through robots.txt only keeps crawlers away; the page remains visible to people visiting your website, and its URL can still turn up in search results if other sites link to it.
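If you use Python, the standard library ships a robots.txt parser that makes this double check easy. The sketch below is only illustrative – the domain and paths are placeholders for your own site – and it simply reports which URLs Googlebot would be allowed to fetch under the live robots.txt rules.
from urllib.robotparser import RobotFileParser

# Placeholder URL – point this at your own site's robots.txt
parser = RobotFileParser()
parser.set_url("https://www.example.com/robots.txt")
parser.read()  # fetch and parse the live robots.txt

# URLs you expect to stay indexable should print "allowed";
# URLs you meant to block should print "blocked".
for url in [
    "https://www.example.com/",                      # should remain crawlable
    "https://www.example.com/first-directory/page",  # blocked by the rules above
]:
    allowed = parser.can_fetch("Googlebot", url)
    print(url, "->", "allowed" if allowed else "blocked")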
- Use the meta robots noindex tag to prevent individual pages from being indexed. When this tag is placed in the <head> section of a page, Google will not index that page.
The code to be placed in the page's <head> section is as follows –
<meta name="robots" content="noindex, nofollow">
With this code in place, Google will neither index the page nor follow the links on it.
If you want Google to follow the links on the page but still not index it, place the following code on the page instead –
<meta name="robots" content="noindex, follow">
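If you want to verify that a page really serves the noindex tag, a short standard-library Python check can fetch the page and report the content of its meta robots tag. The URL below is a placeholder; this is just a rough sketch, not a full crawler.
from html.parser import HTMLParser
from urllib.request import urlopen

class MetaRobotsFinder(HTMLParser):
    """Collects the content of any meta robots tag found on the page."""
    def __init__(self):
        super().__init__()
        self.robots_content = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name", "").lower() == "robots":
            self.robots_content = attrs.get("content")

# Placeholder page – replace with the page you have tagged
page_url = "https://www.example.com/private-page.html"
html = urlopen(page_url).read().decode("utf-8", errors="replace")

finder = MetaRobotsFinder()
finder.feed(html)
print(finder.robots_content)  # e.g. "noindex, nofollow" when the tag is present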
- Use server header status codes – A page that loads normally returns a "200 OK" status code. The server can return other status codes instead, which send visitors and robots to different places within the site or refuse the request altogether.
For instance –
Use of "301 Moved Permanently" – for redirecting the visitor (and the robot) to a new URL
Use of "403 Forbidden" – for refusing the request and denying access to the page
It is advisable to use a 301 redirect where a page has moved, as it sends visitors to the new page instead of the old one within your website. This is good SEO practice for directing traffic to the relevant pages.
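The sketch below (plain Python, standard library only) shows one way to confirm which status code a page actually returns; http.client does not follow redirects, so a 301 shows up as-is along with its Location header. The host and path are placeholders for your own pages.
from http.client import HTTPSConnection

# Placeholder host and path – substitute an old URL you have redirected
conn = HTTPSConnection("www.example.com")
conn.request("HEAD", "/old-page.html")
response = conn.getresponse()

print(response.status, response.reason)  # e.g. 301 Moved Permanently
print(response.getheader("Location"))    # where the redirect points
conn.close()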
- Use a password – You can limit your website's content to selected visitors by giving them the password. Yes, it is feasible to keep general traffic out of your site by password-protecting it. This is advisable during the initial testing stages, and the protection should be removed later for best results. Search engine robots cannot access password-protected pages, so no crawling or indexing will take place.
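If you want to confirm the protection is working, a quick anonymous request should come back with an error status rather than the page content. The sketch below assumes a hypothetical staging URL behind HTTP Basic Auth and only checks the response code.
from urllib.error import HTTPError
from urllib.request import urlopen

protected_url = "https://staging.example.com/"  # placeholder protected page

try:
    urlopen(protected_url)
    print("Page is publicly accessible; crawlers can reach it.")
except HTTPError as err:
    # A 401 (or 403) response means robots cannot read the content,
    # so the page will not be crawled or indexed.
    print("Access blocked with status", err.code, err.reason)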
- Use of cookies and JavaScript – Content that is only reachable after a cookie is set, or that is generated by complex JavaScript, may not be crawled or indexed reliably. If you want certain content kept out of the index, hiding it behind cookies or JavaScript can serve that purpose, though it is less dependable than the methods above.
LogicSpeak:
Instead of blocking Google from your entire website, it pays to pick out the specific pages or directories you wish to block and use the tried-and-tested options above, which serve the purpose while keeping the rest of your website visible in search.
This exercise is worth carrying out when your website is undergoing an upgrade, such as the addition or removal of products and services, or when fresh content has yet to be uploaded. At the same time, it ensures that your website welcomes relevant traffic without misleading visitors while they are on your site.