Hello Kalena,
Something weird is going on with my website: Google has started indexing two different pages as one. www.mysite.com/index.htm (actual site provided) and www.mysite.com are now indexed as the same page, though they are different pages. When you view the cache of www.mysite.com/index.htm, it gives you the cache of www.mysite.com. Can you please advise?
Regards, Mais
Dear Mais,
I’ve looked at the URLs (provided), and yes indeed, the pages you’ve referenced are in fact completely different. It appears that you have an index.php (the default home page) and a different page at index.htm.
Historically there have been a number of “standard” or “default” filenames used for home pages (see list below), and the order of precedence for these is determined by your server configuration.
I suspect that the problem you’ve experienced has been caused by confusion (by the search engines) over which page is in fact your default one. I recommend that you rename your index.htm to something else (and adjust the links to it accordingly).
Below is a (possibly incomplete) list of the filenames that could be set up as a default home page.
To avoid confusion, I suggest that you have ONLY ONE of these files in any one directory. This list is roughly in order of precedence (but can vary depending on server configuration):
1. default.html
2. default.htm
3. index.php
4. index.shtml
5. index.html
6. index.htm
7. home.html
8. home.htm
9. index.php5
10. welcome.html
11. welcome.htm
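To make the precedence idea concrete, here is a minimal Python sketch of how a server might choose a default page when several candidates sit in the same directory. The function names `pick_default` and `conflicting_defaults` are hypothetical, and the order is simply copied from the list above; a real server follows whatever its own configuration says.

```python
from typing import Iterable, List, Optional

# Assumed precedence, copied from the list above. This is illustrative only:
# the real order depends on the server configuration.
DEFAULT_NAMES = [
    "default.html", "default.htm", "index.php", "index.shtml",
    "index.html", "index.htm", "home.html", "home.htm",
    "index.php5", "welcome.html", "welcome.htm",
]


def pick_default(files_in_dir: Iterable[str],
                 precedence: List[str] = DEFAULT_NAMES) -> Optional[str]:
    """Return the first filename in the precedence list that exists in the directory."""
    present = set(files_in_dir)
    for name in precedence:
        if name in present:
            return name
    return None  # no recognised default page in this directory


def conflicting_defaults(files_in_dir: Iterable[str],
                         precedence: List[str] = DEFAULT_NAMES) -> List[str]:
    """List every recognised default filename present, in precedence order."""
    present = set(files_in_dir)
    return [name for name in precedence if name in present]
```

With a directory containing both index.php and index.htm, `pick_default` returns index.php under this assumed order, and `conflicting_defaults` reports both files, which is the situation worth cleaning up.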
Regards,
Andy Henderson
Ireckon Web Marketing
Thanks Andy – very useful information here. I was struggling for a while with my website when the index.php took precedence over the index.html. Now I know why. I’m thinking that nowadays either index.php or index.html (only one of them) is the right one for the job.
index.php or index.html?
Which filename is used inside your server is irrelevant; the canonical URL for the root of your site should be http://www.example.com/ with no filename mentioned at all.
In fact, any request for any index or default file at any folder depth should be served with a 301 redirect to strip that name off, the new URL ending with a slash.
On Apache, that’s just a couple of lines of code.
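For anyone who wants to see the rule spelled out, here is a rough sketch of the redirect logic in Python rather than Apache configuration. It is not the Apache code referred to above: the function name `canonical_redirect` and the set of filenames are assumptions for illustration, and a real site would list only the names its own server treats as defaults.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Assumed set of index/default filenames (illustrative, adjust per server).
INDEX_NAMES = {
    "default.html", "default.htm", "index.php", "index.shtml",
    "index.html", "index.htm", "home.html", "home.htm",
    "index.php5", "welcome.html", "welcome.htm",
}


def canonical_redirect(url: str) -> Optional[str]:
    """Return the URL to 301-redirect to, or None if no redirect is needed.

    If the last path segment is an index/default filename, it is stripped so
    that the new URL ends with a slash, at any folder depth.
    """
    parts = urlsplit(url)
    head, _sep, last = parts.path.rpartition("/")
    if last not in INDEX_NAMES:
        return None  # already canonical, or an ordinary page
    new_path = head + "/"
    return urlunsplit((parts.scheme, parts.netloc, new_path,
                       parts.query, parts.fragment))
```

For example, `canonical_redirect("http://www.example.com/index.htm")` gives `http://www.example.com/`, while a URL that already ends in a slash returns `None`. Note this only makes sense when each directory has a single default page; if index.htm holds different content, as in the original question, it needs renaming first.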
I agree, g1smd – the canonical form should not include a filename – however, a problem still arises when site owners (or developers) create different pages, with different content, for more than one of the widely recognised “home pages” – and then link to them internally.
It shouldn’t be an issue if internal links are consistent, as the server should also default to the same “home” page – but it’s just not very clever to create different pages for more than one of the pages listed above.