Please don’t guesstimate me

It so happens that I load Google’s webpage from a clean user account. Now this account has a US locale, and is set to English throughout the system. Then why in hell does Google think I want the page displayed in French?

For some reason, based on the fact that my IP matches a French ISP, a lot of websites assume I am French (which I am, but then again, I could be there only for a few days), and display their contents accordingly. So, what I’ll get is French text on the page, French advertisements, and (if enabled) a French keyboard layout for any virtual keyboard that may appear on the page (having a virtual keyboard on a webpage is something of a puzzle to me, but that’s another story)…

Now, some of you may think that it’s kind of an easy guess, and a reasonable one at that. Well, I think reverse-mapping an IP address to a location is incomplete at best, and random in most cases. However, my identification string is not (for instance, with Safari):

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_2; en-us) AppleWebKit/525.18 (KHTML, like Gecko) Version/3.1.1 Safari/525.18

Dude, that’s something the browser sends along with every request for a web page. It doesn’t require any DNS-mapping tool or clever interpretation… It says right there “en-us”. That means English with a US locale. I know it can be forged, but someone forging an agent string might as well find a way to go through a proxy in a different country…
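To show how little work the “basic way” takes, here is a rough sketch in Python. The function names (`language_from_user_agent`, `pick_language`) are made up for this post, and the parsing is deliberately naive: it only looks for a token shaped like “en-us” inside the first parenthesised group of the agent string. I’ve also assumed the site can read the standard `Accept-Language` request header, which isn’t something I mentioned above but is the header browsers use to state language preferences; the sketch only takes its first entry and ignores the q-values.

```python
import re

# The Safari string quoted above, reused as sample input.
SAFARI_UA = (
    "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_2; en-us) "
    "AppleWebKit/525.18 (KHTML, like Gecko) Version/3.1.1 Safari/525.18"
)

# Matches tokens shaped like "en" or "en-us" (an assumption about the format).
_LANG_TAG = re.compile(r"^[a-z]{2}(?:-[a-z]{2})?$", re.IGNORECASE)


def language_from_user_agent(user_agent):
    """Return a language tag found in the first (...) group, or None."""
    match = re.search(r"\(([^)]*)\)", user_agent)
    if not match:
        return None
    for token in match.group(1).split(";"):
        token = token.strip()
        if _LANG_TAG.match(token):
            return token.lower()
    return None


def pick_language(user_agent, accept_language=None, default="en"):
    """Hypothetical policy: trust what the browser says before guessing."""
    if accept_language:
        # Simplification: take the first listed entry and drop any ";q=..." part.
        first = accept_language.split(",")[0].split(";")[0].strip()
        if first and first != "*":
            return first.lower()
    return language_from_user_agent(user_agent) or default


print(pick_language(SAFARI_UA))
```

Only when both of those come up empty would it make sense to fall back to guessing from the IP address.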

I generally find that “clever” programs trying to guess what I want to see or how I want it presented to me have a very strange way of finding out things about me. And most of the time, they get it wrong, precisely because they want to guess in a “clever” fashion.

Don’t misunderstand me, I don’t think they make a bad guess 100% of the time… I just think that most of the time, they go about it in a roundabout fashion that’s way too clever for the intended purpose. Sometimes they guess right. But most of the time, a more basic approach would yield better results. I don’t have any statistics, but I’m more than half convinced that the clever way yields the same thing as the basic way most of the time, while using more resources. And in “extreme” cases, such as my web browser, the basic way gets you better results.

That’s Occam’s Razor in reverse: if there’s a simple and a complex way to achieve something, it’s probably best to go the simple way. Unless you are a Shadok, of course…

