Q and A: Why doesn’t Google index my entire site?

Question

Dear Kalena…

I have been on the internet since 2006. I re-designed my site, and for the past year Google has only indexed 16 of its 132 pages.

Why doesn’t Google index the entire site? I use an XML sitemap. I also wanted to know if leaving my old product pages up will harm my rankings. I have the sitemap set up to index only the new pages and leave the old ones alone, and my robots.txt file does the same. What should I do?

Jason

Hi Jason

I’ve taken a look at your site and I see a number of red flags:

  • Google hasn’t stored a cache of your home page. That’s weird. But maybe not so weird if you’ve stopped Google indexing your *old* pages.
  • I can’t find your robots.txt file. The location where it should live (the root of your domain) returns a 404 page containing WAY too many links to your product pages. The sheer number of links on that page and the excessive keyword repetition may have tripped a Googlebot filter. Google will be looking for your robots.txt file in the same location I did.
  • Your XML sitemap doesn’t seem to contain links to all your pages. It should.
  • Your HTML code contains duplicate title tags. Not necessarily a problem for Google, but it’s still extraneous code.

Apart from those things, your comments above worry me. What do you mean by “old product pages”? Is the content still relevant? Do you still sell those products? If the answer is no to both, then remove them or 301 redirect them to replacement pages.
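
If your site runs on Apache (an assumption on my part – check with your host), a 301 redirect takes just one line per page in your .htaccess file. The URLs below are made-up placeholders, so substitute your own:

    # .htaccess – permanently (301) redirect retired product pages
    # to their current replacements (example URLs only)
    Redirect 301 /products/old-widget.html http://www.example.com/products/new-widget.html
    Redirect 301 /products/discontinued-range.html http://www.example.com/products/

A 301 tells Google the move is permanent, so any link value the old page earned is passed along to its replacement.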

Why have you only set up your sitemap and robots.txt to index your new pages? No wonder Google hasn’t indexed your whole site. Googlebot was probably following links from your older pages and now it can’t. Your old pages contain links to your new ones, right? So why would you deliberately sabotage Googlebot’s ability to index your new pages? Assuming I’m understanding your actions correctly, any rankings and traffic you built up with your old pages have likely gone too.

Some general advice to fix the issues:

  • Run your site through the Spider Test to see how search engines index it.
  • Remove the indexing restrictions from your robots.txt file and move it to the root of your domain, where search engines expect to find it (see the sample file after this list).
  • Add all your pages to your XML sitemap and vary the priority tags instead of setting every one to 1 (sheesh!).
  • Open a Google Webmaster Tools account and verify your site. You’ll be able to see exactly how many pages of your site Google has indexed and when Googlebot last visited. If Google is having trouble indexing the site, you’ll learn about it and be given advice for how to fix it.
  • You’ve got a serious case of code bloat on your home page. The more code you have, the more potential indexing problems you risk. Shift all that excess layout code to a CSS file for Pete’s sake.
  • The number of outgoing links on your home page is extraordinary. Even Google says don’t put more than 100 links on a single page. You might want to heed that advice.
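
To give you an idea, here’s a minimal sketch of what an unrestricted robots.txt might look like once the blocking rules are stripped out – the domain is a placeholder, and the Sitemap line is the standard way to point crawlers at your XML sitemap:

    # robots.txt – must live at http://www.example.com/robots.txt
    User-agent: *
    Disallow:

    Sitemap: http://www.example.com/sitemap.xml

An empty Disallow line means nothing is off limits, so every crawler can reach every page.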

Q and A: Does Google automatically search sub-directories?

Question

Dear Kalena…

Does Google automatically search sub-directories? Or do I have to have a ‘Links’ page to force Google to index the sub-directories?

Also, I was reading about ‘redundant’ content. I have a business directory which will eventually have thousands of pages, with the only real difference in content being: {Company} {City} {ST} and {Subject1}. Will Google view this as redundant content?

Best Regards,

Steve

Dear Steve,

For Google to index your sub-directories, you will need some links pointing to them. These can simply be internal navigation links, and if you have a large website, it’s also advisable to include a sitemap page that links to all the pages and sub-directories within your site.
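
To illustrate, a basic sitemap page is nothing more than a plain list of links – the directory paths below are invented, so substitute your own:

    <!-- sitemap.html: plain links give Googlebot a crawl path
         into every sub-directory -->
    <ul>
      <li><a href="/directory/chicago/">Chicago listings</a></li>
      <li><a href="/directory/boston/">Boston listings</a></li>
      <li><a href="/about.html">About us</a></li>
    </ul>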

Regarding your redundant content query – it’s best SEO practice to have at least 250 words of unique content per page. So if all the pages are identical apart from the contact details, then yes, Google would consider them redundant content.

My advice would be to offer a one-page listing for each company and, on that page, include a small blurb about the company, their contact details and a feature that allows users to add feedback/comments/reviews. This should provide enough information for Google to index without causing redundant or duplicate content issues.

Hope this helps!

Peter Newsome
SiteMost


Q and A: Do sitemap crawl errors hurt me in Google?

Question

Dear Kalena

I have a new site, built in late September 2008. I have submitted it to Google and verified it. Every week when it is crawled, the same errors come up.

I’ve been back to my designer multiple times and have done everything he suggested, but the errors still exist. These pages are not mine; they belong to a friend who had his site designed at the same place over a year ago.

My question is: Do these recurring errors hurt me with Google? If so, what can I do about it?

Thanks

Doug


Dear Doug

No and nothing. Hope this helps!


Q and A: Do I need to submit alternative descriptions for each search engine?

Question

Dear Kalena…

I have recently optimized a friend’s website. The site was already listed with Google, Yahoo, etc. I have noticed that since uploading the site a few weeks ago, the new title and description for the home page are now listed, along with a few of the new pages.

In the SEO 201 course, you recommended submitting different listing descriptions for each search engine/directory. However, all the search engines are just using the title and description from each page they have listed.

1) Should I submit the pages not yet listed on the popular search engines, or wait until they find them?

2) Should I only submit alternative descriptions where the site is not currently listed, and do I only need to submit the home page?

With thanks

Peta

Dear Peta

You generally don’t need to submit sites to search engines, as they will be discovered provided at least one site links to them. But what you should make sure of is that each page on your site is being indexed. You can do this by creating an XML sitemap of your site and submitting it to Google via Webmaster Tools (also via Yahoo). More info is available at www.sitemaps.org.

Regarding different descriptions and titles – search engines will use whatever they think is the most relevant snippet from a page in relation to the search query. This could be taken from the title tag, the description or from the text on the page itself. You can control this to some extent by making sure each page on your site is optimized for a small range of target keywords/phrases, so that each page has the opportunity to rank on its own merit.
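
For example, each page’s head section could carry its own keyword-focused title and description along these lines (the business name and wording are invented for illustration):

    <!-- Unique title and meta description for ONE page,
         targeting that page's own keyword phrase -->
    <head>
      <title>Blue Widgets | Acme Widget Co</title>
      <meta name="description"
            content="Buy blue widgets online from Acme Widget Co. Fast
                     shipping and a 12-month warranty on every widget.">
    </head>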

When I talk about submitting different descriptions, I generally mean submissions to niche directories and search engines that don’t automatically crawl sites to discover new pages. If you use different descriptions for these submissions, you can easily track keyword referrals in your log files and recognize which sites bring you the most traffic. I hope this answers your question.


Q and A: What is an XML Sitemap and why do I need one?

Question

Hi Kalena

I am not sure what an XML sitemap is. I have visited websites that automatically generate a sitemap, but the code they create is not understandable to me, and they only include the first 500 pages.

Some pages on my site are important to have indexed and others don’t matter. I have no idea how to create an XML sitemap that only lists the pages I want indexed. How can I do this? Can you clarify what an XML sitemap is and whether I can include only my important pages in it?

Beverly

Hi Beverly

Thanks for the caffeine donation, I’ll be sure to use it tomorrow when I visit Starbucks.

A sitemap is simply a way for search engines and visitors to find all the pages on your site more easily; XML is just a popular format for delivering it. To quote Sitemaps.org:

Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.
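
In practice, a minimal sitemap file looks something like this – example.com, the date and the priority values are placeholders:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://www.example.com/</loc>
        <lastmod>2008-11-01</lastmod>
        <changefreq>weekly</changefreq>
        <priority>1.0</priority>
      </url>
      <url>
        <loc>http://www.example.com/products.html</loc>
        <priority>0.6</priority>
      </url>
    </urlset>

Only the loc tag is required for each URL; lastmod, changefreq and priority are optional hints for the crawler.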

I personally use XML Sitemaps to build all the sitemaps for my own sites and my clients’ sites. I paid for the standalone version so I can create sitemaps for sites with over 500 pages. At under USD 20, the price is pretty reasonable and their support is good, so it might be worth the investment for you. Apart from that, the instructions for using their web version are quite clear – perhaps you need to take a closer look? These sitemap FAQs should also help.

You can either create a full sitemap of your entire site and edit out any pages you don’t want indexed later, or instruct the generator to avoid certain files or sub-directories before running. Once you’ve created and downloaded the XML sitemap file for your site, simply upload it to your web server and follow the instructions to ensure it is indexed by search engines. If you’ve created a Google Webmaster Tools account, you can login and enter your sitemap URL directly into the control panel.
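
If you’d rather not use the control panel, the Sitemaps protocol also lets you announce the file by adding one line to your robots.txt, or by fetching Google’s ping URL once – swap in your own sitemap address for the placeholder:

    # In robots.txt:
    Sitemap: http://www.example.com/sitemap.xml

    # Or visit this URL in your browser to ping Google:
    # http://www.google.com/ping?sitemap=http%3A%2F%2Fwww.example.com%2Fsitemap.xml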
