
Google And Datamining



November 12th, 2008 17:53 pm | by Ed |

It's interesting to note that while a lot of people have made a lot of noise about government and corporate datamining operations, not many seem to have given much thought to how much data Google (and other search engines) have access to.

The first place they get data is, of course, search queries. It's pretty easy to tell from the keywords you use in your search what it is you're looking for. It takes no rocket science to work out whether you're looking for auto insurance quotes, a new health plan, information about some disease or plans to build something. After all, that's the whole basis of how search engines operate.

You give them keywords and they try to provide you with results that match what you're looking for. The next piece of information they have is which links you actually click on out of the ones they provide in response to your search. This helps refine what they can guess about you.
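To make that concrete, here is a minimal sketch of how little it takes to turn a search log into a rough interest profile. Everything in it is made up for illustration: the `TOPIC_KEYWORDS` table, the `guess_topics` and `build_profile` helpers, the example.com URLs and the rule that a clicked result counts double. It is not a description of how Google or any other search engine actually does it.

```python
from collections import Counter

# Hypothetical keyword-to-topic table, invented for this illustration.
TOPIC_KEYWORDS = {
    "auto insurance": {"auto", "car", "insurance", "quote", "quotes"},
    "health": {"health", "plan", "disease", "symptoms", "treatment"},
    "diy projects": {"build", "plans", "blueprint", "woodworking"},
}


def guess_topics(query):
    """Return the topics whose keyword sets overlap the words in a query."""
    words = set(query.lower().split())
    return [topic for topic, keywords in TOPIC_KEYWORDS.items() if words & keywords]


def build_profile(search_log):
    """Tally topics across (query, clicked_url) pairs.

    Assumption for this sketch: a query followed by a click counts double,
    since the click suggests the results matched what the person wanted.
    """
    profile = Counter()
    for query, clicked_url in search_log:
        weight = 2 if clicked_url else 1
        for topic in guess_topics(query):
            profile[topic] += weight
    return profile


# A made-up search history: (query, link clicked or None).
log = [
    ("cheap auto insurance quotes", "https://example.com/quotes"),
    ("diabetes symptoms treatment", None),
    ("car insurance comparison", "https://example.com/compare"),
]
profile = build_profile(log)
print(profile.most_common())
```

Three searches are already enough for this toy version to rank "auto insurance" ahead of "health". A real system has far more data and far better tools to work with.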

Thing is, it doesn't end there. Take Gmail. In accordance with Google's known policy of archiving everything, Gmail has a monstrous storage capacity, and of course they like to provide targeted advertising that's relevant to the contents of your email. In order to do this, their software has to read and analyze your email.

This gives them boatloads more information, depending on what you talk about in your emails. The only way to prevent this is to encrypt the text of your emails on your own computer before entering it in Gmail. Of course, that only applies to outgoing emails.

Inbound emails are wide open to them unless you have your correspondents use encryption. Most people are either too busy or too lazy to even attempt learning how to do this. Not only that, but likely as not they'll get your email, decrypt it, then quote the whole thing in a reply that they won't encrypt.
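For anyone wondering what "encrypt it on your computer first" looks like, here is a toy sketch using a one-time pad built from nothing but the Python standard library. The `encrypt` and `decrypt` functions are hypothetical names made up for this example. In practice you would more likely use a dedicated tool such as GnuPG. The point is only that the webmail provider ends up storing scrambled text, and that the key has to reach your correspondent some other way, must never be reused and has to stay off the mail server.

```python
import base64
import secrets


def encrypt(plaintext):
    """Encrypt text with a fresh random key as long as the message.

    Returns (ciphertext as pasteable base64 text, key bytes).
    """
    data = plaintext.encode("utf-8")
    key = secrets.token_bytes(len(data))  # one-time pad: use once, keep secret
    cipher = bytes(b ^ k for b, k in zip(data, key))
    return base64.b64encode(cipher).decode("ascii"), key


def decrypt(ciphertext_b64, key):
    """Reverse encrypt(): decode the base64 text and XOR with the same key."""
    cipher = base64.b64decode(ciphertext_b64)
    return bytes(b ^ k for b, k in zip(cipher, key)).decode("utf-8")


message = "Meet me at noon"
pasteable, key = encrypt(message)
print(pasteable)                # this is all the mail provider would see
print(decrypt(pasteable, key))  # what the recipient recovers with the key
```

It also shows why the quoting problem matters: once the recipient pastes the decrypted text into an unencrypted reply, all of this effort is undone.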

Then go on to Google's other services, which basically leave copies of your documents and spreadsheets on their servers. Yeah, it's handy that way: you can get to your stuff from any computer with a web connection. The problem, again, is that unless you take the time to encrypt stuff before giving it to them, they can read it. And if you do encrypt it, the spreadsheet and word processing apps can't deal with the encrypted data.

Bottom line? It's a good idea to think about who is going to have access to your information before you hand it over to Google or any other website. Also bear in mind that hitting a delete button or link someplace does not mean they haven't archived a copy of it somewhere anyway.

Technorati Tags: datamining, privacy, google, information security
