It’s Tim here. I’m the developer for a website – [URL removed for privacy reasons] – and as of Thursday or Friday last week, Google has crawled my whole site. It shouldn’t have been able to do this, but it has.
Part of the site is written in PHP and Google has cached all the pages, several of which contain information that shouldn’t really be in the public domain.
I’ve submitted the FQDN to Google asking them to remove the URL, which will hopefully prevent any results being shown in a Google search. The request is currently in a ‘pending’ state and I’m wondering how long the purge will actually take.
I’ve not personally lodged a takedown request with Google, so I’m afraid I’m not speaking from experience. However, colleagues have told me it can take up to 3 months if a site has already been crawled.
Your email doesn’t make it clear what happened, but it may also depend on how sensitive the content is and why it was indexed in the first place.
A couple of things you can do while you’re waiting:
1) If Google managed to crawl your whole site, you might have conflicting instructions in your robots.txt file or in the robots meta tags on those pages, or your sitemap.xml file might list content you don’t want public, which Google then indexes. Check all of those areas so the problem doesn’t recur.
2) Ask Google to remove the content through Google Search Console (formerly Webmaster Tools). This is often faster than the formal takedown request you submitted via email. You will need to verify that you own or administer the site in Search Console first.
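To make the robots.txt check in point 1 concrete, here is a minimal sketch using Python's standard-library robots.txt parser. The sample rules and the paths are made up for illustration and are not taken from Tim's site. Python's parser does not handle every pattern the way Googlebot does (wildcards, for instance), so treat the result as a first sanity check only.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt, paths, user_agent="Googlebot"):
    """Return the paths that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


# Hypothetical robots.txt content and site paths, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /reports/
"""

SAMPLE_PATHS = ["/", "/admin/users.php", "/reports/q3.php", "/blog/post.php"]

if __name__ == "__main__":
    for path in blocked_paths(SAMPLE_ROBOTS_TXT, SAMPLE_PATHS):
        print("blocked for Googlebot:", path)
```

Any private path that does not show up as blocked is one a crawler is allowed to fetch. Bear in mind that robots.txt only asks crawlers to stay away; it does not protect the pages themselves.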
Keep in mind that even after you’ve blocked the pages from being indexed, they can take a while to drop out of Google’s search results, depending on how many data centers have cached them and which ones are serving the results.
Best of luck!
There are several options to look at, depending on your situation.
Google’s help page on removing content should cover the different steps for each case.
In the meantime, to protect yourself, you could try setting the server to pick up traffic arriving from Google SERPs and redirect it. That way anyone clicking a result will end up somewhere else.
(That won’t help with the SERP snippets or the cached pages! But it’s a partial measure until the URL removal requests are processed.)
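As a rough sketch of that redirect idea: the check below looks at the Referer header and, for requests that arrive from a Google search page and target a sensitive path, returns a safe URL to redirect to. The function names, the path prefixes and the hostname pattern are all assumptions made for illustration; the site in question runs PHP, so this Python version only shows the logic. Browsers may omit or trim the Referer header, so this will not catch every visitor.

```python
import re
from urllib.parse import urlparse

# Heuristic pattern for Google search hostnames such as www.google.com or
# google.co.uk. This is an assumption, not an official list of Google domains.
GOOGLE_HOST = re.compile(r"(www\.)?google\.[a-z]{2,3}(\.[a-z]{2})?")


def is_google_referrer(referer):
    """Return True if the Referer header value points at a Google search host."""
    if not referer:
        return False
    host = (urlparse(referer).hostname or "").lower()
    return GOOGLE_HOST.fullmatch(host) is not None


def redirect_target(referer, path, sensitive_prefixes, safe_url="/"):
    """Return safe_url if a Google visitor requests a sensitive path, else None."""
    if is_google_referrer(referer) and path.startswith(tuple(sensitive_prefixes)):
        return safe_url
    return None


if __name__ == "__main__":
    # Hypothetical request: a click from a Google results page to a private page.
    print(redirect_target("https://www.google.com/", "/reports/q3.php", ["/reports/"]))
```

The server would issue the redirect whenever `redirect_target` returns a URL and serve the page as normal when it returns `None`.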
Agree with R. Rogerson. Best way to go about it is to redirect.
I think the first thing is to block the website in robots.txt, and then put the redirect in place once development is done.
What’s the current thinking on having your blog posted and hosted on a separate website with many links going to the main site? We’ve had some advice to bring the separate blog inside the main site which would give us two blogs on the website listed below. Would that be better for overall SEO?