Q and A: Do regional domains constitute a duplicate content problem?

Dear Kalena…

First of all, I find the info on your site extremely useful - I always look forward to the newsletter! I have been meaning to do the SEO course, but finding the time is always a problem! However, it’s still on my to-do list.

I am trying to sort out a problem regarding duplicate content on my sites. We run local sites for each language/country we trade in (e.g. .fr for France and .co.uk for England). Unfortunately, whilst growing the business I never had time to research SEO practices, so I ended up with a lot of sites with the same content in them, including title tags, descriptions etc. Of course, I had no idea how bad this was for organic ranking!

I have now created unique title tags and descriptions for ALL the pages on ALL the sites. I have also changed the content into unique content for the home page and the paternity testing page (our main pages) for each site in English. The only sites with completely unique content are .com and parts of .co.uk. On the rest of the pages that still have duplicate content, I have put a NO INDEX, FOLLOW code so that the spiders will not index them. I used FOLLOW as opposed to NO FOLLOW as I still want the internal links in the pages to be picked up - does this make sense?

Also, having made these changes, how long does it normally take for Google to refresh its filters and start ranking the site? The changes are now about a month old; however, the site is still not ranking.

Also, should this not work, do you have any experience with submitting a reconsideration request through Webmaster Tools? What are the upsides and downsides of this?

Any advice would be greatly appreciated.

Regards
Kevin

Dear Kevin

Thanks for your coffee donation and I’m glad you like the newsletter. Now, about your tricky problem:

1) First up, take a chill pill. There’s no need to lodge a reinclusion request to Google. According to Google’s Site Status Tool, your main site is being indexed and hasn’t been removed from their datacenter results. A standard indexed page lookup shows 32 pages from your .com site have been indexed by Google, while a backward link lookup reveals at least 77 other sites are linking to yours. If you’ve put NoIndex tags on any dupe pages, you’ve covered yourself.

2) Next, pour yourself a drink and put your feet up. Your .fr site is also being indexed by Google, but there isn’t a dupe content issue because the site is in French, meaning that Googlebot sees the content as being completely different. Your .co.uk site is also being indexed by Google and again, there isn’t a dupe content issue because it looks like you have changed the content enough to ensure it doesn’t trip any duplicate content filters.

3) Now you’re relaxed, log in to Google Webmaster Tools and make sure each of your domains is set to its appropriate regional search market. To do this, click on each domain in turn and choose “Set Geographic Target” from the Tools menu. Your regional domains should already be associated with their geographic locations, i.e. .co.uk should already be associated with the UK, meaning that Google will automatically be giving preference to your site in the SERPs shown to searchers in the UK. For your .com site, you can choose whether to associate it with the United States only (recommended as it is your main market), or not to use a regional association at all.

4) Now it’s time to do a little SEO clean up job on your HTML code. Fire or unfriend whoever told you to include all these unnecessary META tags in your code:

  • Abstract
  • Rating
  • Author
  • Country
  • Distribution
  • Revisit-after

None of these tags are supported by the major search engines and I really don’t know why programmers still insist on using them! All they do is bloat your code.

5) Finally, you need to start building up your site’s link popularity and boost your Google PageRank beyond the current 2 out of 10. And by link building, I mean the good old-fashioned type - seeking out quality sites in your industry and submitting your link request manually, NOT participating in free-for-all link schemes or buying text links on low quality link farms.
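As a rough way to check the clean-up in point 4 and the NoIndex tags mentioned in point 1, here is a minimal Python sketch that scans a page’s HTML for the unsupported META names listed above and reports any robots directives it finds. The sample HTML, the `MetaAudit` class and the `audit` function are made up for illustration; nothing here is taken from Kevin’s actual pages.

```python
from html.parser import HTMLParser

# META names from the list in point 4, compared case-insensitively.
UNSUPPORTED_META = {
    "abstract", "rating", "author", "country", "distribution", "revisit-after",
}


class MetaAudit(HTMLParser):
    """Collects unsupported META names and any robots directives on a page."""

    def __init__(self):
        super().__init__()
        self.unsupported = []
        self.robots = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").strip().lower()
        content = (attr.get("content") or "").strip().lower()
        if name in UNSUPPORTED_META:
            self.unsupported.append(name)
        elif name == "robots":
            # Split "NOINDEX, FOLLOW" into ["noindex", "follow"].
            self.robots.extend(p.strip() for p in content.split(",") if p.strip())


def audit(html):
    """Return (unsupported META names, robots directives) for one HTML page."""
    parser = MetaAudit()
    parser.feed(html)
    parser.close()
    return parser.unsupported, parser.robots


# Hypothetical page used only to demonstrate the audit.
SAMPLE = """<html><head>
<title>Example</title>
<meta name="Author" content="Someone">
<meta name="revisit-after" content="7 days">
<meta name="robots" content="NOINDEX, FOLLOW">
<meta name="description" content="A unique description">
</head><body></body></html>"""

unsupported, robots = audit(SAMPLE)
print("Tags to remove:", unsupported)
print("Robots directives:", robots)
```

On a duplicate page you would expect to see both `noindex` and `follow` in the robots directives, and an empty list of tags to remove once the clean-up is done.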

Good luck!


Google Now Helps You Clean Up 404 Links

Google has just announced the easiest way to obtain inbound links to your site in a short space of time.

Webmaster Tools’ new Crawl Error Sources feature allows you to identify the sources of 404 Not Found errors on your site. Listed next to “Crawl Errors” in the Webmaster Tools control panel, you’ll now find a “Linked From” column that lists the number of pages that link to a specific “Not found” URL on your site. Clicking on an item in the “Linked From” column opens a separate dialog box which lists each page that links to this URL (both internal and external) along with the date it was discovered. You can even download all your crawl error sources to an Excel file.

If your web server doesn’t handle 404s or serve error pages very well, Google has also introduced a widget for Apache or IIS: 14 lines of JavaScript that you can paste into your custom 404 page template to help your visitors find what they’re looking for. It provides suggestions based on the incorrect URL.

You can use the “Linked From” source information to fix the broken links in your site, place redirects to a more appropriate URL on your site and/or contact the webmasters linking to missing pages or using malformed links and ask them to fix the links.
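To show how the “Linked From” information might be put to work, here is a minimal Python sketch that groups linking pages by broken URL and separates internal sources (links you can fix yourself) from external ones (webmasters you would need to contact). The rows, host names and the `group_sources` function are hypothetical; the actual layout of Google’s download isn’t described here, so treat the input format as an assumption.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_sources(rows, site_host):
    """Map each not-found URL to its internal and external linking pages.

    rows is an iterable of (broken_url, linked_from) pairs; this shape is
    an assumption, not the real export format.
    """
    grouped = defaultdict(lambda: {"internal": [], "external": []})
    for broken_url, linked_from in rows:
        host = urlparse(linked_from).netloc.lower()
        kind = "internal" if host == site_host.lower() else "external"
        grouped[broken_url][kind].append(linked_from)
    return dict(grouped)


# Made-up example data.
ROWS = [
    ("http://www.example.com/old-page.html", "http://www.example.com/index.html"),
    ("http://www.example.com/old-page.html", "http://blog.example.org/review"),
    ("http://www.example.com/prodcts.html", "http://forum.example.net/thread/12"),
]

report = group_sources(ROWS, "www.example.com")
for broken, sources in report.items():
    print(broken)
    print("  fix on your own site:", sources["internal"])
    print("  ask these webmasters:", sources["external"])
```

A broken URL with many external sources is usually the best candidate for a redirect, since you can’t count on every webmaster updating their link.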

Webmasters have been asking for something like this for a long time, so it’s a relief to see it live at last. The official post about the feature is on Google’s Webmaster Central Blog and Matt Cutts goes into more detail on his blog.


Google’s Cross-Product Webinar

Google have announced a free cross-product webinar for webmasters to learn more about three of their most used products, Google Webmaster Tools, Google Analytics and Google Website Optimizer, and how they can work together to enhance your website.

The webinar will be held on 8th July 2008 at 9:00am PT (Pacific Time). To attend, you need to register. Those who can’t make it will be able to access an archived version of the presentation via the same registration URL. This is the first time Google have offered a joint webinar for these products.


Q and A: What is an XML Sitemap and why do I need one?

Hi Kalena

I am not sure what an XML sitemap is. I have gone to websites that will automatically generate a sitemap, but the code they create is not understandable to me and they can only index the first 500 pages.

There are pages on my site that are important to be indexed and others that don’t matter. I have no idea how to create an XML sitemap that only lists the pages I want indexed. How can I do this? Can you clarify what an XML sitemap is and whether I can list only my important pages in it?

Beverly

Hi Beverly

Thanks for the caffeine donation, I’ll be sure to use it tomorrow when I visit Starbucks.

A sitemap is a way for search engines and visitors to find all the pages on your site more easily. XML is simply a popular format for delivering the sitemap. To quote Sitemaps.org:

Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.

I personally use XML Sitemaps to build all sitemaps for my own sites and my clients’ sites. I paid for the standalone version so I can create sitemaps for sites with over 500 pages. At under USD 20, I believe the price is pretty reasonable and their support is pretty good, so it might be worth the investment for you. Apart from that, the instructions for using their web version are quite clear - perhaps you need to have a closer look? These sitemap FAQs should also help.

You can either create a full sitemap of your entire site and edit out any pages you don’t want indexed later, or instruct the generator to avoid certain files or sub-directories before running. Once you’ve created and downloaded the XML sitemap file for your site, simply upload it to your web server and follow the instructions to ensure it is indexed by search engines. If you’ve created a Google Webmaster Tools account, you can log in and enter your sitemap URL directly into the control panel.
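If the generated code looks like gibberish, it may help to see how little is actually required. Below is a minimal Python sketch that builds a sitemap containing only the pages you choose, skipping anything under excluded sub-directories. The URLs, the `build_sitemap` function and the `excluded_prefixes` parameter are made up for illustration; the only parts taken from the Sitemaps.org protocol are the `urlset`, `url` and `loc` elements and the namespace.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, excluded_prefixes=()):
    """Return sitemap XML listing only the URLs that are not excluded."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        # Skip pages you don't want listed, e.g. a whole sub-directory.
        if any(url.startswith(prefix) for prefix in excluded_prefixes):
            continue
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL for us.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages: two important ones and one that doesn't matter.
PAGES = [
    "http://www.example.com/",
    "http://www.example.com/services.html",
    "http://www.example.com/private/notes.html",
]

sitemap = build_sitemap(
    PAGES, excluded_prefixes=("http://www.example.com/private/",)
)
print(sitemap)
```

Save the output as a UTF-8 file (commonly named sitemap.xml) and upload it as described above. Leaving a page out of the sitemap doesn’t stop search engines from finding it through ordinary links; it simply means you aren’t pointing them to it.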


Q and A: Why is my client’s site no longer ranking in Google?

Hi Kalena

I’ve been reading your articles and find your answers to many people very helpful. So, here is my issue.

I am helping a friend with his website, which I built. I felt like we did a pretty decent job with SEO and we had some fairly high rankings for some key terms, like 6th for “lasik in chicago” and 2nd for “lasik in Oakbrook”.

Recently I changed the index page to put up a larger Flash video. I also added some additional text that looks similar to some of the higher ranking sites that are competitors of my friend Dr. Sloane. Since then I have noticed he has been moved down to page three for the same terms. When I went into Google Webmaster Tools, I noticed that it shows that Googlebot hasn’t accessed the homepage since 2007. Also, I see all my pages rank very low on PageRank.

I’m just a little bit confused and was hoping that you could give me a little advice on getting his site on the right track. He has been around on the net since the mid-90s, so the domain has some age.

Shannon

Hi Shannon

First of all, thank you for the caffeine donation, that helps a lot when I’m answering these questions in the wee hours. As for your issue, I’ve taken a look and wow, where do I start? How about here:

1) The first major content on your client’s home page HTML is a huge Flash file. Quite apart from the fact that it’s visually distracting and goes against every web site usability rule possible, you’ve stuck it right after the header tags, meaning it’s the first thing search engines are going to try and index. The file isn’t optimized so it doesn’t tell Googlebot and others anything about your page; it simply pushes the meatier content further down the code.

2) You seem to have some weird link to the iFrance site embedded in an iframe. What’s that about? It looks dodgy and search engines don’t like iframes so it’s probably triggered a red flag or two.

3) Your current home page looks and smells like a doorway page. There’s no obvious formatting, no navigation menu, the design is not consistent with the rest of the site and it doesn’t load properly in Firefox. I was half expecting to see user-agent sniffer code in the HTML, but perhaps it’s just really poor design.

4) We’re up to number 4 already, and this is probably your main problem: there seems to be some type of delayed meta refresh that kicks in after 5 seconds and redirects people to a different URL on the same domain. This is retro spam at its finest and is like waving a huge red flag at Google saying “HEY, I’M DOING SOMETHING DODGY OVER HERE! PENALIZE ME QUICK”.

Spammers like to use meta refreshes in order to bait and switch, i.e. show Googlebot a family-safe DVD page like Driving Miss Daisy and then redirect human searchers to a porn site of the… ahem… same name. Ditch the redirect pronto. Decide which home page you want to show both users and search engines and stick with it.
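If you’re not sure whether a page still carries a meta refresh, here is a minimal Python sketch that scans HTML for one and reports the delay and target. The sample markup and the `RefreshFinder` class are made up for illustration and are not copied from the site in question.

```python
from html.parser import HTMLParser


class RefreshFinder(HTMLParser):
    """Records every <meta http-equiv="refresh"> tag as (delay, target)."""

    def __init__(self):
        super().__init__()
        self.refreshes = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("http-equiv") or "").strip().lower() != "refresh":
            return
        # Typical content looks like "5; url=/other-page.html".
        content = attr.get("content") or ""
        delay, _, rest = content.partition(";")
        target = rest.strip()
        if target.lower().startswith("url="):
            target = target[4:].strip()
        self.refreshes.append((delay.strip(), target))


def find_refreshes(html):
    """Return a list of (delay, target) pairs found in the HTML."""
    finder = RefreshFinder()
    finder.feed(html)
    finder.close()
    return finder.refreshes


# Hypothetical home page with a 5-second delayed redirect.
SAMPLE = """<html><head>
<title>Welcome</title>
<meta http-equiv="Refresh" content="5; URL=/home2.html">
</head><body>Loading...</body></html>"""

for delay, target in find_refreshes(SAMPLE):
    print(f"Meta refresh after {delay} seconds to {target}")
```

An empty result means the page has no meta refresh, which is what you want to see once the redirect has been removed.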

Surprisingly, your Title and META tags check out OK, although there’s a bit of excessive keyword repetition in your META Keywords tag. Googlebot last cached your home page on 13 April, so check your Webmaster Tools account again.

That’s it for now, I hate to say it but my coffee’s run out.
