Q and A: How can I get my .EML (email) files crawled and indexed?

Question

Hi Kalena,

The website I maintain is informational and features largely political news. Much material reaches me in the form of e-mails which I wish to upload and make available to visitors. Can you point me to a website search engine which will index the site’s contents, including the email (.eml) files? The Windows Search facility on my computer (Windows XP) does this quite competently but I have been unable to trace a similar web search engine with the appropriate filter which will index the eml files (some of which have attachments, mainly Word or PDF). I should be grateful for any guidance.

With thanks Ezra

Hi Ezra,

As you are probably aware (but for the sake of other readers), the .EML file extension is used for mail messages saved from Outlook Express. The main purpose of an EML file is to store an e-mail message (and, as you have highlighted, it may include attachment data as well). EML files can be used with most e-mail clients, but web browsers will not display them as they are. However, since EML files are plain text and formatted much like MHT (MIME HTML) files, they can be opened in most popular browsers (Internet Explorer, Mozilla Firefox and Opera) by changing the file extension from .eml to .mht.

Although search engines do crawl and index a wide variety of filetypes (see the filetypes that Google can index), as far as I am aware, no search engines crawl or index EML files.

EML files typically include the e-mail addresses of the sender and the recipient so from a privacy/security perspective I would expect that you wouldn’t want these types of files to be indexed anyway (and if I were one of your information sources I’d probably be pretty annoyed if you published my email address).

I suggest that if you wish to publish (and have indexed) information that you receive by email, you extract the relevant content and publish it in a format that is recognised by web browsers and search crawlers (e.g. HTML, PDF, DOC, or even TXT).
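For readers who want to automate that extraction, here is a minimal sketch using Python's standard `email` module. It is only an illustration under some assumptions: the function name `eml_to_html` is made up for this example, the message is assumed to carry a plain-text body, and attachments are ignored. It keeps the subject and body and leaves out the sender and recipient headers, in line with the privacy point above.

```python
from email import message_from_string, policy
from html import escape


def eml_to_html(eml_text: str) -> str:
    """Turn the raw text of an .eml file into a simple HTML page.

    Hypothetical helper for illustration: only the subject and the
    plain-text body are published; From/To headers are never copied.
    """
    # policy.default gives us the modern EmailMessage API (get_body etc.)
    msg = message_from_string(eml_text, policy=policy.default)
    subject = str(msg["Subject"] or "(no subject)")

    # Assumption: we only want the plain-text part of the message
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""

    # Treat blank lines as paragraph breaks and escape HTML special characters
    paragraphs = [p.strip() for p in body.split("\n\n") if p.strip()]
    html_body = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)

    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{escape(subject)}</title></head>\n"
        f"<body><h1>{escape(subject)}</h1>\n{html_body}\n</body></html>"
    )
```

You would read each .eml file as text, pass it to the function and save the result as an .html page that crawlers can index. Word or PDF attachments would need separate handling (the same module offers `iter_attachments()` for that), and any addresses quoted inside the body text itself would still need a manual check.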

Andy Henderson
WebConsulting SEO (Brisbane)

Spread the joy!

Take a Search Stand on Bing Fridays

Today is Bing Friday. The one day a week when everyone should use Bing instead of Google to prove that Google doesn’t rule the Internet. Confused? Let me backtrack a little…

At the SMX Melbourne Conference last month, a certain speaker made the valid point that Google has become such a money-making monolith that they seem to have lost sight of their original philosophy of Don’t Be Evil.

I’m protecting the identity of the speaker, in case he attracts unwarranted attention from Google, but his words really rang in my ears.

Let’s take a look at an extract from Google’s “Don’t Be Evil” philosophy:

“Don’t be evil… is about providing our users unbiased access to information, focusing on their needs and giving them the best products and services that we can... The Google Code of Conduct is one of the ways we put ‘Don’t be evil’ into practice. It’s built around the recognition that everything we do in connection with our work at Google will be, and should be, measured against the highest possible standards of ethical business conduct.”

Apparently, “Don’t be Evil” was originally suggested by Google employees Paul Buchheit and Amit Patel at a meeting. Buchheit, the creator of Gmail, said he “wanted something that, once you put it in there, would be hard to take out,” adding that the slogan was “also a bit of a jab at a lot of the other companies, especially our competitors, who at the time, in our opinion, were kind of exploiting the users to some extent.”

It got me thinking – are Google still living up to this slogan? Or have they become so powerful that they are doing the very thing they were accusing their competitors of doing and exploiting users in their bid to keep market dominance? Has Google been placing the needs of their shareholders above the needs of their users? Have they lost sight of their own motto?

Some of Google’s recent product releases and acquisitions do seem to be dollar-driven as opposed to user-driven. Some of their business decisions lately have also seemed questionable. Their move into China, for example, required them to self-censor data for Chinese users, a seeming hypocrisy which attracted skepticism worldwide. Then there was their collection of personal WiFi data during Streetview routes in Europe, triggering concerns over personal privacy. It’s hard to see how decisions like these are beneficial to users.

The SMX speaker suggested that Google has such massive market share that they AND their users have become blasé about search quality. The tendency is for everyone to reach for Google whenever we need to search for something online and only use other engines for comparison shopping. His point was that the more blasé we become about Google’s dominance, the more blasé Google will become about users.

The only way to take a stand against Google’s market dominance is to use other search engines regularly. That’s why he suggested that one day a week, instead of automatically reaching for Google, we should make the effort to use a different search engine, with Bing Friday being a good starting point. If enough people do it, Google might just sit up and take notice, but even if they don’t, at least we will shake ourselves out of our Google stupor and stop taking everything they do as gospel.

Now if you read my blog regularly, you’ll know that I am a big fan of Google. But I have been worried about the direction they’ve taken lately, particularly some of their recent acquisitions. I also believe that more competition is good for the industry and keeps all players on their toes for the benefit of everyone.

There are aspects of Bing Search I prefer over Google and I’m keen for Bing’s partnership with Yahoo to work out so it will help them leverage some market share away from Google. But I admit to being a lazy searcher and using Google as my automatic default engine. If I’m to make a difference, I need to take a stand and I feel this is a great start.

Will you join me and participate in Bing Fridays? To show your support, please comment on this post and/or tweet about it using the hashtag #BingFriday. Let’s see if we can get some traction!

POST SCRIPT : The speaker who came up with the concept of Bing Friday has given me permission to publish his name now. It was none other than Greg Boser of BlueGlass Interactive, Inc. He tells me that Bing Friday seems to be gaining momentum and to keep an eye out for a new project a friend is working on in relation to it. Sounds intriguing!


Free Online Training Initiative from Search Engine College

Some of you might remember that last year, Search Engine College launched a free search engine marketing training initiative for charities and not-for-profit organizations worldwide.

The initiative provides 25 charities or not-for-profit organizations per year the opportunity to learn search marketing skills at no cost, to help them make the most of their limited marketing budgets.

Well the idea launched with more of a whimper than a bang, so today we’ve distributed an official Press Release to try and drum up more publicity.

If you know of a charity or not-for-profit that might benefit from free online marketing training, please direct them to our release, encourage them to get in touch, and/or spread the word by linking to this post.

Thanks!


May Search Light Newsletter: the *blame Google it’s late* edition

The second issue of the Search Light newsletter for 2010 was published today.

Yes, it’s our second issue even though it’s nearly June. Shut up. The delay has nothing at all to do with my procrastination skills. It’s all Google’s fault and you’ll find out why when you read it.

This month’s newsletter includes an article about Social Search – the biggest thing to hit the SERPs this year since, well, Personalized Search a week before. It also contains some of the more interesting FAQs answered in this blog and a recap of the Search Marketing Expo (SMX) Conference in Sydney.

If you’re not yet a newsletter subscriber catch it here and then quickly go and subscribe before you change from a geek into a nerd.


SMX Sydney 2010: Keynote – Future Directions in Search

Welcome to Day 1 of the SMX Sydney Conference!

Barry Smyth opening the conference

I’ll be live blogging as many sessions as I can and writing up the others later. Today we kick off with the keynote from Chris Sherman, Executive Editor from Search Engine Land.

Chris starts his presentation with a YouTube video by Raymond Crowe using shadow puppets to mime “What a Wonderful World”. Like shadow puppets, search marketing looks impossible to do but we’re just out there doing it.

Chris says that the last few weeks have brought incredible change. Seismic change in the industry! We’ve come out of the global economic crisis. Online ad spend is picking up. We’ll see $54B in global ad spend. Search is 47% of that spend, which is promising for our industry. B2B lead generation spend is still lower than before the economic downturn, but that may be a lagging indicator.

Chris then showed a series of videos to describe online search marketing.

1) The Renaissance Site:

Video from 1969, taking a look at the *future* of electronic technology. Chris says that what they’re describing in that video is really the Google of today. Although they started laser-focused on search, now they offer “something of everything to everybody”.

Google Fast Flip

New product – combines Google News with an easy-view layout.

Google News Timeline

Takes news and shows how stories are developing over a specific timeline.

Google Dashboard

Allows you to see what info Google is tracking about you. How you are represented as an entity on Google. Showed his own dashboard. The amount of data is extraordinary. The dashboard now gives you control over how much of this information is public and available. Demonstrates Google’s commitment to privacy.

Google Australia new products:

– Google Insights for Search – compare one search with another etc.

– Google map icons – allows you to claim a local business using Google Places on Google Maps.

– Google Sponsored Listings – within maps now.  An alternative to Google AdWords.

– YouTube promoted videos – You can now sponsor videos via YouTube so your videos come up at the top – not officially available in Australia yet, except via AdWords AU.

– Google AdWords Webinars – new to GG Australia

– Google Speaks ‘Strayan! New feature to help Aussies find local info, in local jargon. Cute!

This week, GG has hired a team of photographers to use cameras within the Streetview cars to focus in on more interesting visual data they come across (similar to what Bing are doing with virtual reality?).

Google Woes

– legal woes

– privacy issues

– China and censorship

Google: The new evil empire?

Chris has heard rumors of GG replacing Microsoft as the new evil empire.

– Photographers have sued over book deal.

– EU looking at antitrust

– Execs convicted of privacy violations in Italy

– Xerox and Quintura sue over patents

– Streetview lawsuits in multiple countries

– Launch of Google Buzz

– and the beating goes on…

Chris doesn’t think they will get into major legal difficulties. In terms of privacy though, they might have trouble. When they launched Buzz, they did it without permission and that was a major concern – especially if you use public shared computers.

In response, Matt Cutts went into great detail on the European Public Policy Blog about privacy and transparency. Mind you, he link dropped in a nuclear fashion in that post which amused Chris greatly.

Keep in mind, Facebook’s privacy is MUCH more relaxed and dangerous, in his opinion.

Dealing with the Great Firewall

– Google moved Google.cn servers to Hong Kong last month

– But China is blocking access to the site from mainland computers

– Excellent analysis at http://bit.ly/93pmnY

– Not just China: Google is censoring in other countries as well

– You can use proxies from within China to get past the censorship

– Chris has never personally experienced censorship when running SEM conferences in China

New as of Yesterday

The Google Govt Requests and Removal Tool – a new tool which is a maps overlay to allow people to request information to be removed or request more data. You can even see in real time what requests have been complied with or not. When you mouse over China, it says “Chinese government considers this information a State secret so data is not available”. Article at: “Google Responds to Privacy Concerns With…”

2) Emperor’s New Clothes

Showed video of a plane experiencing a very very dodgy landing. Chris says this represents Yahoo.

– Yahoo is the proud owner of the Emperor’s New Clothes

– Microhoo competition: Salvation or sellout? Microsoft does the heavy search lifting while Yahoo sells ads.

I was discussing this with Chris last night at the Tweetup. Chris thinks this is a clever move by Yahoo, but it really gives all the power to MS/Bing.

– Yahoo has divided the labor – “we’re more interested in what happens before and after search than search itself”. In other words, we’re going back to our *browse* / portal mode.

3) Assimilator as Innovator

Showed a Star Trek clip featuring the Borg. This represents Microsoft / Bing. “You will be assimilated”. In other words, Chris says, MS is very clever at making people do what they want. Acquisition after acquisition.

Now part of the collective:

medstory

tellme

aquantive

jellyfish

multimap

farecast

Fast Search & Transfer – AllTheWeb

Powerset

Bada-Bing!

– Bing is arguably a better name than Live Search, but what does it mean? In Chinese it means “Very certain to answer”.

– Fun image licensing for their home page means rotation of photos – always different. Chris uses Bing as his home page because he loves their home page photos so much.

– Powerset does a semantic search rather than algorithmic. Different to other SE’s. Uses Freebase to gather data. If you drill down, it will give you options like Wikipedia on steroids – will go and semantic search ALL Wikipedia articles on a topic you search for – very powerful.

– Bing Maps – geolocation can be an issue (e.g. it thinks Chris is in Melbourne right now). But they do some things very well. Map Apps are very cool. What’s Nearby also good. Signs and Billboards etc.

– Bing has captured billions of data points from travel sources and put them into searchable form. Based on historical data, Bing.com/travel can tell you things like when is the best time to buy a ticket to New York – when it’s cheaper etc. This is powerful stuff! Shows graphs and charts and heat maps to tell you costs of flights, accommodation etc. Unique to Bing.

Chris sees this as the way going forward. This type of travel data may get rolled out to retail, e.g. volume of sales etc.

Bing SearchRank – another new feature not yet available in Australia. Get an idea of what searches are popular right now, similar to Google Trends.

4) The Shiny New Disruptor

Wolfram Alpha. It’s a new computational knowledge engine. Wolfram Alpha’s founder believes the complex world can be reduced to simple rules and those rules are computable.

– WA contains 10+ trillion pieces of data, 50K types of algorithms and models and linguistic data for 1000 languages.

– Put a mathematical problem into Google and it will shoot out an answer. WA, however, will also give you the ellipse, a visual definition of the calculation. Put in a chord search and it will come up with the scale visually, plus allow you to play it.

– Ask questions and WA will give you all the data you could ever wish for. People can type in things like *When will I die?* Scary answers. *10 peanut M&Ms* WA will respond with the dietary calories. *Who’s the fairest of them all?* Snow White. They are obviously paying attention, as the answer has changed since Chris first started asking it some months ago. *Am I drunk?* Will give you the alcohol percentages of common drinks.

Social Media

– love it or hate it, SM is huge.

– How big?

– Globally, 1 billion+ users wasting (er, spending) 2 billion minutes/month

– Share of global online time:

Facebook 16%

YouTube 9%

Google 5%

This is HUGE. If you’re SEOing for Google, you might want to rethink your priorities and start advertising on Facebook and YouTube.

Email’s not going away anytime soon, but stats show that Social Media is more popular with people. Email has flatlined in terms of time spent, while SM has gone to vertical curve.

Twitter

– 75% of Twitter’s traffic comes from APIs

– Twitter has become a *real* search engine

– Twitter has just announced monetization – “promoted Tweets”. Chris finds this disingenuous, although he concedes it will probably be successful.

– Based on KW bids, ads will be displayed at top of search results

– Resonance required (think Quality Score), based on retweets, replies, hashtags, clicked links etc. Searchers need to engage with the ads for them to maintain position – this is a Google approach. Makes sense given key staff are ex FeedBurner / Google staff.

– Third party distribution

– Twitter plans to expand the program to its partners and then it will become massive.

– Other options – TweetUp – contextual sponsored tweets displayed on publisher sites, using a CPM model now with cost-per-click and cost-per-new-follower.

– TweetUp is a new tool, the brainchild of GoTo.com’s founder. It’s a network of the world’s best tweeters. Response has been sceptical, but hey, GoTo’s idea of PPC caused the same reaction in people.

Facebook

Chris says Facebook is here to stay. So many ways to reach people and the size of the audience is astonishing, he says. You can’t ignore Facebook. There are definitely ways to measure the impact of a Social Media campaign.

– If Facebook was a country, it would be the 3rd largest country in the world

– lots of fertile options for marketers

– pages, apps, ads, polls,

– And analytics via Facebook Insights

If you’re looking for ways to leverage Facebook, try:

InsideFacebook.com
AllFacebook.com

One third of the people ON Facebook are interested in marketing on Facebook – encouraging. Chris says that if you’ve avoided SM until now, STOP and reconsider.

Real Time Search & SEO

– As real time search becomes more commonplace, it is displacing *traditional* search results

– Fundamental SEO is still important, but there are new opportunities to gain exposure thanks to real-time algos

– At its heart, Caffeine is an attempt to capture real time crawl. Larry Page is very impatient about this concept.

– Real time search impacts SEO in a huge, huge way. The algorithm has basically been re-written. Can’t do much right now except continue to use best-practice.

Personalization

– This will amplify things

– Personalization affects search results

– For text results, can’t do much

– However, opportunities to gain real estate via universal search are still good

– Think *digital asset optimization*

Chris says, don’t despair, these changes offer opportunities for you to use them to your advantage. Because most people won’t be – now’s your chance.

Mobile Search

Chris says “Are we there yet?” YES we are. Tipping point has hit this year – mobile advertising has become popular with the advent of smart phones.

– Mobile advertising is the new Point of Sale

– 5.8 billion mobile subscribers worldwide by 2013; 30% will be smartphone users (Portio Research)

– Mobile ad spend 2015

New Data from Morgan Stanley:

– Sometime between 2013 and 2014, there will be more mobile Internet users than desktop PC users

– Growth of the iPhone happened at 11x the growth of Desktop Internet!

– These stats will impact Facebook users too obviously

– If you’re not already doing so – GO MOBILE NOW

– Little competition right now

– Go multi-mobile – see Cindy Krum’s article on Search Engine Land

– Consider optimizing your site for mobile search

– Use GPS based mobile apps to leverage your business e.g. 4Square, Gowalla, Placecast (GPS based advertising)

Video Marketing

– YouTube is second largest search engine right now

– Syndicate your videos widely

– Use video because your competitors probably aren’t

– Embed meta data, relevant titles & filenames

– Use appropriate on-page SEO

– Descriptive text

– Include your URLs in the video to encourage viral linking and viewing

A Huge Trend: Targeting

– Types of targeting include device / geographic / demographic / behavioral

Device Targeting

– GG, MS, Yahoo

– Device platform targeting allows you to target your ads to PCs and iPhones and others

Geographic Targeting

– GG, MS, Yahoo

– Language targeting occurs at country level or radius

– Also benign, used to reach specific groups and exclude others

Demographic Targeting

– MS (full), YH (partial), Google (exploring options)

– Targeting ads based on age, gender, income etc.

– similar to direct mail, but uses data from volunteered information

– Can be problematic, especially on shared computers

Behavioural Targeting

– MS, YH, Google content only. Not search or Gmail

– Ads targeted based on online behavior (visits, purchases, queries etc).

– Benefit = ads match interest closely

– Concern = privacy? What privacy?

Merging Online and Offline Data

– Exelate & Aperture pull data from Nielsen and other co’s to combine data with search behavior

Opting out of Targeted Ads

– Google makes it kind of fun but not clear and hard to find

– MS uses legalese and it’s boring

– YH also makes it unclear and difficult to find

– You can opt out of 200 Ad networks by going to NAI (inc big ones)

Conclusion

– Web search has consolidated to few major players

– Good news – competition among majors has also increased, good for SEM and searchers

– Counterintuitive: Advertising may DECREASE as search engines continue to refine targeting options
