Top reasons why websites don’t get indexed and how to fix them

Question: My website still hasn’t been indexed after a month. What can I do to speed up the process of getting it indexed by Google?

Answer: Before an answer can be given, you must ask yourself a few questions about the launch of your website. Did you add enough links pointing to your new website so that Googlebot can find it? Does your website have a proper internal linking structure and a good navigation system so that Googlebot can effectively crawl it? Have you added a sitemap to tell Google where your URLs are?
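On that last point, a sitemap is just a small XML file listing your URLs. A minimal sketch (the example.com URLs are placeholders) looks something like this; save it as sitemap.xml at your site root and submit it to Google:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One <url> entry per page you want Googlebot to know about -->
  <url>
    <loc>http://www.example.com/</loc>
  </url>
  <url>
    <loc>http://www.example.com/about.html</loc>
  </url>
</urlset>
```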

I could ask plenty more questions to figure out why a website has not been indexed yet, but to make life easier, I have compiled a list of the top 5 reasons why websites don’t get indexed, along with possible solutions.

Top 5 reasons why websites don’t get indexed

1) Website has no or insufficient inbound links

This is the most common cause of slow indexing, mainly because Googlebot simply can’t find your website to crawl it. It is only logical: if you don’t get found, you can’t get indexed. The remedy is equally logical: add more links from related websites, as well as a link from your existing website to establish a relationship. If the link is on a frequently crawled website, you can bet your website will be indexed within a few days. If the link is on a related website, Google will establish relevancy and give you credit on your own topic.

2) The website has bad internal structure

This could mean that your website has a bad internal linking structure, a poor navigational structure, or both. A Flash website is a good example: search engines are generally unable to parse compiled Flash files, so they cannot follow any of the links embedded in them. Another example is JavaScript links; crawlers do not execute JavaScript, so links hidden behind JavaScript cannot be crawled (yet, at least). There are plenty of other examples, but the logic behind them is simple: whatever fraction of your website the crawler cannot reach will not be indexed.

Fixing this problem is simple if your website is not built entirely in Flash. If it is, you will need your team of developers to rebuild it from scratch. If not, give yourself a pat on the back, because your job is half done. Remove those JavaScript links; being crawlable by Google is worth far more than whatever fancy effect you are trying to achieve. Next, add an HTML sitemap to your website and link to it from every page. This helps users quickly find what they are looking for, and helps Google find all your pages in one place. Finally, build a proper navigation system for both usability and search engine optimisation. I will explain how this helps in a later post, as it deserves a post of its own.
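To make the JavaScript-link problem concrete, here is a sketch (the /products.html URL is made up) of the same link done both ways:

```html
<!-- Crawlers cannot follow this: the destination is buried in script -->
<a href="#" onclick="window.location='/products.html'; return false;">Products</a>

<!-- Crawlers can follow this: a plain HTML anchor -->
<a href="/products.html">Products</a>
```

The second version works identically for users, but the destination URL sits in plain HTML where any crawler can read it.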

3) Robots.txt blocking Googlebot

After a month of no Google love (the website I was SEO-ing had not been indexed), I was concerned and worried that I had done something wrong. After a thorough investigation, it turned out that ALL crawlers had been blocked by the robots.txt, hence the website was not indexed. The robots.txt in question looked like this:

User-agent: *
Disallow: *

If this can happen to me, it can damn well happen to you or anyone else, so MAKE SURE that your robots.txt looks like this:

User-agent: *
Disallow:

You can fill in the Disallow statement with paths you actually want to block, but if you don’t want to block anything, the version above works fine and will definitely NOT stop your website from being indexed (though it won’t have any effect on speeding indexing up, either).
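If you want to sanity-check a robots.txt before launch, Python’s standard library can parse the rules offline. This is just a sketch (the URLs are placeholders, and it uses the standard `Disallow: /` form for a full block):

```python
from urllib.robotparser import RobotFileParser

def googlebot_allowed(robots_txt, url):
    """Return True if the given robots.txt rules let Googlebot fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)

# "Disallow: /" blocks every crawler from every page
blocked = "User-agent: *\nDisallow: /"
# An empty Disallow blocks nothing
open_rules = "User-agent: *\nDisallow:"

print(googlebot_allowed(blocked, "http://example.com/page.html"))     # False
print(googlebot_allowed(open_rules, "http://example.com/page.html"))  # True
```

Running a check like this against your own robots.txt takes seconds and would have saved me a month of waiting.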

4) Duplicate content filter

This is a debatable issue, and I believe many SEOs would think I’m nuts for even suggesting it affects indexing. However, there is no denying that the sandbox effect has almost identical traits to the duplicate content filter. In other words, new websites get filtered by Google so that they don’t rank well in the SERPs (the sandbox effect). As for the duplicate content filter, Google filters identical (or nearly identical) content so that only the most trustworthy and oldest website’s copy gets displayed; the rest is sent to the supplemental index. So if you have duplicate content, your site may not get indexed, and even if it does, it will almost never rank.

The solution is simple: remove the duplicate content, rebuild your sitemap, and get some new links in to attract Google’s attention again. You will soon be forgiven and your mistakes forgotten, which means you can start embracing the organic traffic sent your way courtesy of Google Inc.
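As a rough illustration of how you might hunt down exact duplicates on your own site (a hypothetical helper, not how Google’s filter works internally), you can fingerprint each page’s text and group pages whose fingerprints collide:

```python
import hashlib
from collections import defaultdict

def find_duplicates(pages):
    """Group page URLs whose (whitespace- and case-normalised) text is identical.

    pages: dict mapping URL -> page text.
    Returns a list of URL groups that share the same content.
    """
    by_fingerprint = defaultdict(list)
    for url, text in pages.items():
        # Collapse whitespace and lowercase so trivial differences don't hide dupes
        normalised = " ".join(text.split()).lower()
        fingerprint = hashlib.md5(normalised.encode("utf-8")).hexdigest()
        by_fingerprint[fingerprint].append(url)
    return [urls for urls in by_fingerprint.values() if len(urls) > 1]

pages = {
    "/a.html": "Widgets for sale.  Best widgets!",
    "/b.html": "widgets for sale. best widgets!",
    "/c.html": "A totally different page.",
}
print(find_duplicates(pages))  # [['/a.html', '/b.html']]
```

Any group this returns is a set of pages where all but one copy should be removed or consolidated.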

5) Keyword spamming, cloaking and other nasty things

This will, without a doubt, get you banned from Google altogether, not just de-indexed. As your website is new, the chances of recovery after a ban are slim, and even if you get off their blacklist, you will still need to wait at least 3 to 6 months before being re-included. For those who don’t know what these techniques are, I will quickly explain them below.

Keyword spamming

The act of stuffing as many related keywords as possible into the homepage, or into every page of the website. These keywords are usually hidden at the bottom of the page, set in the same colour as the page background. Keyword spamming can also happen in the meta keywords tag.

Cloaking

This is when a webmaster builds a single page that displays different content to the crawler than it does to the user. The method is used to trick search engines into seeing a page of proper content filled with keywords about a certain topic, while a visiting user is shown a completely different page; it may even redirect the user to a different website.
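A crude way to check your own site for accidental cloaking is to fetch the same URL twice, once with a Googlebot User-Agent and once with a browser User-Agent, and compare the two bodies (the fetching step is omitted here; this sketch just compares two already-fetched bodies, and the 0.7 threshold is an arbitrary choice):

```python
import difflib

def looks_cloaked(body_for_bot, body_for_user, threshold=0.7):
    """Return True if the page served to the crawler differs substantially
    from the page served to a normal visitor."""
    similarity = difflib.SequenceMatcher(None, body_for_bot, body_for_user).ratio()
    return similarity < threshold

honest = "<h1>Blue widgets</h1><p>We sell blue widgets.</p>"
print(looks_cloaked(honest, honest))  # False: identical pages

cloaked = "CHEAP PILLS CHEAP PILLS CHEAP PILLS buy now buy now!!!"
print(looks_cloaked(honest, cloaked))  # True: crawler sees different content
```

Minor differences (ads rotating, timestamps) are normal; it is wholesale content swaps like the second case that get sites banned.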

Other nasty things

This can include, but is not limited to, link spamming, trackback spamming, linking to bad neighbourhoods, and more.

If you have performed one or more of the above, I suggest you stop your evil ways and come back to the light side of white hat SEO. I say this because Google’s search algorithm changes all the time; you may get away with it now, but in six months, when the algorithm changes, you could lose everything you have worked for (if you call that work). This is not a sustainable long-term strategy for your business, and it would be a complete waste of time once the algorithm changes.

I will post methods on how to get indexed within 48 hours of launching your website sometime this week. It’s simple, and if you follow the steps you should be able to pull it off (as I have for the past couple of years).
