I’m interested in knowing whether search engines index deep content such as PDFs. We have several PDFs available for download across a few of our sites. Roughly speaking, how much weight or importance is given to deep content compared with on-site surface content? And is it worthwhile revisiting all of our PDFs and optimising the text content they contain?
Yes, most search engines index PDFs. In fact, after HTML files, PDFs are the most popular file format on the web.
In terms of deep content, most search engines only dig down a certain number of folders when indexing a site for the first time. They generally come back to dig deeper, but may not index all the files on your site. For this reason, you should place your most important content in the top 2 or 3 levels (i.e. not buried too many sub-folders deep).
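If you want a quick way to spot files that sit too deep, here is a rough Python sketch. It assumes that "level" simply means the number of folders between the domain root and the file, and that any last path segment containing a dot is a file. The function names and the example.com URLs are made up for illustration.

```python
from urllib.parse import urlparse


def folder_depth(url: str) -> int:
    """Count the folders between the domain root and the file (assumed definition of 'level')."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    # Assumption: a last segment containing a dot is a file, not a folder.
    if segments and "." in segments[-1]:
        segments = segments[:-1]
    return len(segments)


def too_deep(urls, max_depth=3):
    """Return the URLs that sit more than max_depth folders down."""
    return [u for u in urls if folder_depth(u) > max_depth]


if __name__ == "__main__":
    sample = [
        "https://example.com/guides/seo.pdf",
        "https://example.com/a/b/c/d/brochure.pdf",
    ]
    for url in too_deep(sample):
        print("Consider moving:", url)
```

The cut-off of 3 is only the rule of thumb from above, not a limit published by any search engine, so adjust `max_depth` to suit your own site.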
Provided you have included all your pages and files in your XML sitemap and submitted that to Google and other engines, your most important content should get indexed regularly.
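For anyone who hasn't built a sitemap before, here is a minimal Python sketch of one that lists PDFs alongside regular pages. It uses only the standard library and the standard sitemaps.org namespace. The function name and the example.com URLs are placeholders, and a real sitemap would usually be produced by your CMS or a dedicated tool.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for u in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = u
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical pages and files; PDFs are listed just like HTML pages.
    print(build_sitemap([
        "https://example.com/",
        "https://example.com/downloads/whitepaper.pdf",
    ]))
```

Save the output as sitemap.xml in your site root and submit it through each search engine's webmaster tools.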
Regarding optimising PDFs, yes, I would recommend you do this. Here’s a terrific article from SEOmoz that should give you some tips for doing this.
Great to see some new blog posts on here. Really like the point about “top 2 or 3 levels”. As a rule I never go beyond this when building a site, even on large eCommerce websites. Also, it’s good to see other SEOs mentioning sitemaps. IMHO, these are one of the most important parts of a website.
Thanks Dave, it’s great to be back!