Q and A: Why doesn’t Google index my entire site?

Question

Dear Kalena…

I have been on the internet since 2006. I re-designed my site, and for the past year Google has still only indexed 16 of its 132 pages.

Why doesn’t Google index the entire site? I use an XML sitemap. I also wanted to know if leaving my old product pages up will harm my rankings. I have the sitemap set up to include only the new pages and leave the old ones alone, and I’ve got the robots.txt file doing the same. What should I do?

Jason

Hi Jason

I’ve taken a look at your site and I see a number of red flags:

  • Google hasn’t stored a cache of your home page. That’s odd. But maybe not so odd if you’ve stopped Google from indexing your *old* pages.
  • I can’t find your robots.txt file. The location where it should be returns a 404 page that contains WAY too many links to your product pages. The sheer number of links on that page, combined with the excessive keyword repetition, may have tripped a Googlebot filter. Google will look for your robots.txt file in the same location I did.
  • Your XML sitemap doesn’t seem to contain links to all your pages. It should.
  • Your HTML code contains duplicate title tags. Not necessarily a problem for Google, but it’s still extraneous code (see the quick example below).
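
For reference, the fix is simply to keep one title element per page. A minimal illustration, with invented title text:

    <head>
      <!-- keep exactly one <title> per page; delete any duplicates -->
      <title>Your Store Name - What This Page Sells</title>
    </head>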

Apart from those issues, your comments above worry me. What do you mean by “old product pages”? Is the content still relevant? Do you still sell those products? If the answer to both is no, then remove them or 301 redirect them to replacement pages.
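
If your site runs on Apache (an assumption on my part, since I don’t know your server), a 301 redirect can be a one-liner in your .htaccess file. The paths here are made up for illustration:

    # .htaccess: permanently redirect a retired product page to its replacement
    Redirect 301 /old-products/widget.html /products/new-widget.html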

Why have you set up your sitemap and robots.txt to index only your new pages? No wonder Google hasn’t indexed your whole site. Googlebot was probably following links from your older pages, and now it can’t. Your old pages contain links to your new ones, right? So why would you deliberately sabotage Google’s ability to index your new pages? Assuming I’m understanding your actions correctly, any rankings and traffic you built up with your old pages have likely disappeared as well.
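
To illustrate what I suspect has happened: a rule like the first sketch below tells Googlebot to skip your old pages entirely, and with them every link they contain. The second sketch is what an unrestricted robots.txt looks like. The directory name is my guess:

    # Harmful: Googlebot never crawls the old pages or follows their links
    User-agent: *
    Disallow: /old-products/

    # Better: let every crawler fetch everything
    User-agent: *
    Disallow: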

Some general advice to fix the issues:

  • Run your site through the Spider Test to see how search engines index it.
  • Remove the indexing restrictions in your robots.txt file and move it to the standard location, the root of your domain (the second sketch above shows what an unrestricted file looks like).
  • Add all your pages to your XML sitemap and stop setting every priority tag to 1 (sheesh!). Vary the priorities to reflect each page’s relative importance; there’s a sketch after this list.
  • Open a Google Webmaster Tools account and verify your site. You’ll be able to see exactly how many pages of your site Google has indexed and when Googlebot last visited. If Google is having trouble indexing the site, you’ll learn about it and be given advice for how to fix it.
  • You’ve got a serious case of code bloat on your home page. The more code you have, the more potential indexing problems you risk. Shift all that excess layout code to a CSS file, for Pete’s sake (there’s a before-and-after sketch after this list).
  • The number of outgoing links on your home page is extraordinary. Even Google says don’t put more than 100 links on a single page. You might want to heed that advice.
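
On the sitemap point above: every page belongs in the file, and the priorities should actually vary. A minimal sketch using placeholder URLs:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://www.example.com/</loc>
        <priority>1.0</priority>
      </url>
      <url>
        <loc>http://www.example.com/products/new-widget.html</loc>
        <priority>0.8</priority>
      </url>
      <url>
        <loc>http://www.example.com/old-products/widget.html</loc>
        <priority>0.5</priority>
      </url>
    </urlset>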
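
And on the code bloat point: the idea is to pull presentation out of the HTML and into a stylesheet the browser downloads once and caches. A before-and-after sketch with invented class and file names:

    <!-- Before: styling repeated on every element, on every page -->
    <p style="font-family: Arial; font-size: 12px; color: #333333;">Welcome to our store!</p>

    <!-- After: one short line in the HTML... -->
    <p class="intro">Welcome to our store!</p>

    /* ...and the presentation lives once in styles.css */
    .intro { font-family: Arial; font-size: 12px; color: #333; }
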
Spread the joy!