Q and A: Why is Google having trouble indexing our site?

Hello Kalena

One of the sites we manage has a problem.

The homepage at [URL removed] is not getting indexed anymore by Google. The site was made using Sitefinity 3.7 and the hosting is provided by Rackspace. Something similar has already happened twice. We resolved it the first time by using the “index this page” option on the page generated by Sitefinity, and the second time by re-creating the XML sitemap and linking it directly to Google Webmaster Tools.

This time we can’t seem to find the reason. We checked whether the end user who works in the back-end had made any changes, and whether there were any notifications in the Google Webmaster Tools reports, but nothing came up. Here are some more technical details:

1) The site homepage is [URL removed]. But the site root is [URL removed], which is an empty page that sends visitors to the home page using a 301 redirect.

2) In Google Webmaster Tools we set up 2 Sitemaps:

  • The first at [URL removed] is indexing the Top pages of the Home page (static)
  • The second [URL removed] gets populated with the pages content generated by Sitefinity (dynamic)

3) Also, from the back-end options, a metatag ROBOTS was set at page level for the top pages, as Google suggests.

4) Google reports 5 blocked URLs when crawling our robots.txt with the message: “Google tried to crawl these URLs in the last 90 days, but was blocked by robots.txt”. This seems suspicious because I can’t understand what could be blocking them. The robots.txt file is pretty simple and not restrictive.

Could you give us a hand? I’ve left a generous donation for your coffee fund.

Thanks!
Jim

————————————-

Hi Jim

First up, thanks for the caffeine donation :-)

As for your problem, oh boy. You’ve got a few different issues going on, so let me address each of them separately:

1) Your XML sitemaps are missing contextual data specified by the Sitemaps protocol. In particular, your <loc> child entries per URL are messed up. I’m surprised this hasn’t generated an error in Webmaster Tools, but I’m pretty sure it is confusing Googlebot. Go check your sitemaps against the protocol and re-generate them if necessary. Maybe use one of the XML generator tools recommended by Google. Personally, I like XML Sitemaps (yes that’s my affiliate link).

Also, why 2 separate sitemaps for HTML pages? I can understand having separate ones for RSS feeds or structured data stuff, but your standard site pages should all be listed in the one file so you can better manage the content and keep track of indexing history in Webmaster Tools.

2) Your robots.txt file is blocking a number of pages that you have listed in your XML sitemap. So on the one hand you’re telling Google to index pages within a certain directory, but on the other, you’re telling Google they are not allowed to access that directory. This is what the error message is about. You’ve also got conflicting instructions on some of your pages in terms of robots meta tags vs. robots.txt.

3) The 301 redirect on your root directory is your major problem. In fact, that empty landing page is your major problem. Why do you need it? You don’t use Flash and it doesn’t appear to have an IP sniffer for geo-location purposes so I can’t understand why you wouldn’t just put your home page content at the root level and let search engines index it as expected.

The way you have it set up right now is essentially telling Google that you have moved all your content to a new location, when you really haven’t. It’s adding another step to the indexing process and you are also shooting yourself in the foot, as every 301 contributes to some lost PageRank. Google clearly doesn’t like the setup or isn’t processing it for some reason. There also appear to be several hundred 301s in place for other pages, so I’m not sure what that’s about. I don’t have access to your .htaccess file, but I can imagine it reads like a book!

4) Unless you specifically need a robots meta tag for a particular page scenario, I would avoid using them on every page. You can achieve the same results with your robots.txt file and it’s easier to manage robot instructions in one location rather than having to edit page by page – avoiding conflicting issues as you have now.
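If you want to check points 1 and 2 yourself, here is a rough Python sketch of the idea, using only the standard library. It is an illustration under assumptions rather than a finished audit tool: the sample sitemap, the sample robots.txt, the example.com URLs and the function names are all made up for this example, and Python's built-in robots.txt parser only approximates how Googlebot interprets the file. The sketch pulls every <loc> out of a sitemap, flags entries that are missing or are not absolute URLs, and then lists any sitemap URLs that the robots.txt rules would block.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

# Namespace defined by the Sitemaps protocol for <urlset>, <url> and <loc>.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical sample data, standing in for your real files.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/home</loc></url>
  <url><loc>https://www.example.com/private/report</loc></url>
  <url><loc>/relative/path</loc></url>
</urlset>"""

SAMPLE_ROBOTS = "User-agent: *\nDisallow: /private/\n"


def sitemap_locs(sitemap_xml):
    """Return (urls, problems) for a Sitemaps-protocol <urlset> document."""
    root = ET.fromstring(sitemap_xml)
    problems = []
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        problems.append(f"unexpected root element: {root.tag}")
    urls = []
    for entry in root.findall(f"{{{SITEMAP_NS}}}url"):
        loc = entry.find(f"{{{SITEMAP_NS}}}loc")
        text = (loc.text or "").strip() if loc is not None else ""
        parsed = urlparse(text)
        # The protocol expects each <loc> to hold a full, absolute URL.
        if parsed.scheme in ("http", "https") and parsed.netloc:
            urls.append(text)
        else:
            problems.append(f"missing or malformed <loc>: {text!r}")
    return urls, problems


def blocked_by_robots(robots_txt, urls, agent="Googlebot"):
    """Return the URLs that the given robots.txt rules disallow for agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    urls, problems = sitemap_locs(SAMPLE_SITEMAP)
    print("Sitemap problems:", problems)
    print("Listed in sitemap but blocked:", blocked_by_robots(SAMPLE_ROBOTS, urls))
```

Any URL that shows up in the second list is one you are asking Google to index and blocking at the same time, which is the conflict behind the Webmaster Tools message.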

Apart from the obvious issues mentioned above, have you considered switching away from Sitefinity and over to WordPress? I’ve struggled with optimizing Sitefinity sites for years. It’s a powerful CMS, but it was never built with search engines in mind and always requires clunky hacks to get content optimized. Plus that’s a really outdated version of Sitefinity.

Given the other issues, it might be time for a total site rebuild?

Best of luck

——————————————————————-

Like to learn SEO with a view to starting your own business? Access your Free SEO Lessons. No catch!

 


Q and A: Should I 301 redirect my penalized domain to a new site?

Hi Kalena

If my site example.com gets penalized and de-indexed from Google (some competitor spammed my site hard), can I 301 that site to my new site with the exact same content? Would my new site get penalized too?

And what happens if my new site gets penalized for spam again? Can I 301 it to another domain using the same content? I wonder if I can 301 the past two domains to my new site, passing on the link juice.

What do you think?

Sam

————————————-

Hi Sam

GREAT question and one that I thought I knew the answer to, but it prompted me to do a little more research to make sure.

My instincts told me that if you could simply recover from a penalized domain by implementing 301 redirects to a new domain, then there would be more incentive for spammers to create and burn keyword-stuffed sites as a tactic to gain short term traffic and long term links. This is not a situation I could imagine Google being comfortable with.

But at the same time, if penalized domains pass their penalties on via 301 redirects, what is stopping a competitor from 301 redirecting their penalized site to your non-penalized site as a nasty negative SEO tactic?

So, after digging into the topic, here’s what I found out:

1) We know that 301 redirects are Google’s preferred method of directing traffic between pages and sites, and for transferring link juice from an old domain to a new one. However, any page redirected from one domain to another via 301 is going to lose some PageRank. So it follows that implementing a 301 redirect on a penalized site WILL pass on some of the link and PageRank value of the redirected site to the new site. Therefore, you should NOT implement a 301 redirect on a penalized site, because any link or PageRank-related penalties will be passed on to the new site as well.

2) If you 301 redirect more than one penalized domain to a new domain, you are probably going to pass on double the negative PageRank and link juice to your clean domain, so don’t do that either, unless you want double the drama.

3) If you are thinking of simply scraping the entire content of your penalized domain and republishing it on a new domain, think again. There is new evidence that Google can track the content that earned you the penalty in the first place and penalize it in the new location, even if you don’t use 301s or tell Google about the move via the site migration tool in Webmaster Tools.

4) If you’re concerned that a competitor might have used negative SEO tactics against you by 301 redirecting their penalized site to your non-penalized site, don’t be. Google is apparently quite good at ferreting out this particular negative SEO technique. If you’re still worried, you can use the Disavow Links tool in Webmaster Tools to instruct Google to ignore any links from the penalized site.

Hope this helps!

——————————————————————-

Need to learn more about legitimate SEO tactics but not sure where to start? Access your Free SEO Lessons. No catch!

 


Latest Google algorithm penalizes web spam

Google has released a new update to their ranking algorithm this week, aimed at isolating and penalizing websites that use particular spam techniques. From the official blog post:

“In the next few days, we’re launching an important algorithm change targeted at webspam. The change will decrease rankings for sites that we believe are violating Google’s existing quality guidelines.”

So what constitutes a violation of Google guidelines? While deliberately avoiding being specific, Google has highlighted these tactics as problematic and likely to be targeted:

  • Duplicate Content
  • Keyword Stuffing
  • Link Schemes
  • Cloaking
  • Deliberate Redirects
  • Doorway / Gateway Pages

Unlike Panda, this algorithmic update has no cutesy name, simply the *webspam algorithm update* according to Search Engine Land.

As much as this update is a slap on the wrist for aggressive search engine optimizers, Google were very careful to endorse the methodology of so-called *white hat* search engine optimizers in their announcement and isolate those “acceptable” tactics from the tactics they will be punishing with this update:

“We want people doing white hat search engine optimization (or even no search engine optimization at all) to be free to focus on creating amazing, compelling web sites.”

It’s interesting to see them so eager to support the SEO industry, but it’s probably a sign that they expect webmasters to be confused by the changes and worried that they might accidentally over-optimize their sites.

The algorithm change has already started to roll out and Google claims it will affect approximately 3 percent of search queries.

UPDATE 27 April 2012: You know how I said above that the new algorithm revision doesn’t have a cutesy name? Scrap that. Google has now decided to call it *Penguin*.


Q and A: Why have my Google rankings dropped for my key phrases?

Hi Kalena

I’ve contacted you because I feel frustrated. Until last night my site was listed within the second page of results in Google for the keywords “learn Spanish free”. Thanks to the SEO course at SEC and my work, I was proud to see results like this. However, for some strange reason, this morning (UK time) I am nowhere to be seen for these keywords. I have checked other search engines (Yahoo, Bing) and I am listed there (3rd and 2nd pages). Would you be able to tell if I have done anything to upset Google? And if I have, what?

Furthermore, I have checked all the key phrases that have brought me visits before and I am nowhere to be seen within the Google index. (I used to appear on the first page for these key phrases.) The only time I am in the Google listings is when I search for the words “Spanish aid” (the name of my domain) or the full URL of my web pages. I also appear in the Google listings for the words “Spanish colours” under Images. I find this extremely weird, as it seems that Google has penalised me for something I don’t know of. As I said in my previous email, I was progressing and I was happy that I was learning and seeing positive results. Now it seems that I have taken a big step back. I hope you can give me an explanation, as at the moment I am banging my head against the wall.
Thanks a lot for your help

Thank you
Luis

Hi Luis

Fluctuations in your Google rankings are completely normal. Sometimes, they’ll make a slight tweak to their ranking algorithm which can result in other sites ranking above yours and/or lowering your previous ranking for certain keywords. But this doesn’t mean you’re suffering a penalty.

See some previous blog posts about this, particularly:

Why is my CMS based website only ranking for the home page

Why does my website not rank high on search engines?

Your site is still in the Google index and you rank #1 for your brand name so you haven’t been penalized. I could see nothing wrong with your content that would trigger any alarm bells with Google.

However, the big problem with your site is the low Google Toolbar PageRank score (1/10) reflecting the very low number of incoming links pointing to your site. Has your PR score always been 1/10? If it has recently dropped from a 2 or something, that might partially explain the ranking drop. While a higher PageRank score is not a pre-requisite to high rankings, it can be a key indicator of your site’s link popularity, which in turn has a strong influence on your ultimate keyword positions in Google.

The more links you have pointing to your site from related sites and using relevant keywords in the anchor text of the link the better you should rank for those keywords. The best thing you can do for your site right now is to build links pointing to it and to add new content. That will gradually improve your PageRank score and your link popularity – then the rankings will follow.

If you’re still worried, you can take the steps outlined in these posts:

How do I fix ranking penalties?

Why doesn’t Google index my entire site?

Kalena

———————————————-

Finding that optimizing your own site is a challenge? Download our Free SEO Lesson. No catch!


Q and A: Is this a legitimate form of link building?

Dear Kalena

So, I’m a freelance writer, cruising through Elance.com, looking for projects to bid on. I see a project for a site called buildmyrank.com. They say they are looking for 150-word blog posts that will be website summaries. I can do this work, but…are these sites legitimate SEO tools or just ways to get around link building that is considered acceptable?

Thanks for your thoughts!

Denise

Hi Denise

I think your spidey-senses are accurate! This site looks and smells fishy. They’re also hiding their domain registration details, which, while not necessarily suspicious, is a common practice amongst sites employing less than legitimate SEO methods.

There is a very easy way to determine if they are *white hat*: have a look at their Google Toolbar PageRank. Oh look! A zero PageRank score. If Google doesn’t think they’re trustworthy, that’s a big red flag right there.

I would avoid them like the plague.
