When you visit websites in your browser, what does the other end actually see? What can they find out about you?

What do you look like to the internet?
Nov 9, 2013 12:14am UTC
We all use the internet. We visit thousands of pages, sending countless bytes of information through switches and servers all around the globe. If we can retrieve so much information from these sites around the world, what can they get from us?

Let's take this blog post for example. Here's some of what I see on my end when a user visits this page. Let's assume it's you:

[REQUEST_URI] => /blog/37
[REMOTE_PORT] => 53008
[HTTP_COOKIE] => hus=fu37clpe0p3oh1qqu2pv8i6dd3
[HTTP_ACCEPT_LANGUAGE] => en-US,en;q=0.8
[HTTP_USER_AGENT] => Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36
[REQUEST_TIME] => 1383934514

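Let's break this down a little bit. REQUEST_TIME is a Unix timestamp (seconds since January 1, 1970 UTC) recording exactly when you requested the page; the value above works out to November 8, 2013 at 18:15:14 UTC. Taken slightly further, and combined with a guess at your time zone, this is a time of day when you likely enjoy browsing the web.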
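
To make that concrete, here's a minimal Python sketch (not the code this site runs) that turns the REQUEST_TIME value from the listing into a readable date. The function name describe_request_time is made up for illustration, and the result is in UTC, so the visitor's local hour would still need a time zone estimate.

```python
from datetime import datetime, timezone


def describe_request_time(epoch_seconds):
    """Convert a Unix timestamp into a few human-friendly fields (UTC)."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return {
        "iso": moment.isoformat(),
        "weekday_index": moment.weekday(),  # Monday is 0, Sunday is 6
        "hour_utc": moment.hour,
    }


# REQUEST_TIME value taken from the listing above.
info = describe_request_time(1383934514)
print(info["iso"])  # 2013-11-08T18:15:14+00:00
```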

REQUEST_URI is again pretty obvious, indicating the page to which you browsed.

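REMOTE_PORT is the TCP port on your end of the connection, the one your computer (or the router in front of it) opened to receive data from my server. It's a temporary port picked just for this connection, so by itself it tells me very little. It isn't a way in, either: to download whatever I want to your computer, I would still need a browser exploit or other vulnerability, delivered over this same connection.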
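
If you want to see where that number comes from, the sketch below opens a throwaway TCP connection on the local machine and compares the port the client was assigned with the port the server side observes. Everything here (the function name observe_client_port, the use of 127.0.0.1) is an assumption for the demo; a real visit would also pass through routers that may substitute their own port.

```python
import socket
import threading


def observe_client_port():
    """Return (port the client was given, port the server saw for that client)."""
    seen = {}
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(1)

        def accept_once():
            conn, (_addr, port) = server.accept()
            seen["port"] = port  # what a web server would report as REMOTE_PORT
            conn.close()

        worker = threading.Thread(target=accept_once)
        worker.start()
        with socket.create_connection(server.getsockname()) as client:
            client_port = client.getsockname()[1]  # ephemeral port chosen by the OS
            worker.join()
    return client_port, seen["port"]


client_port, seen_port = observe_client_port()
print(client_port == seen_port)  # True: the server sees the client's ephemeral port
```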

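The next few get a bit more personal. HTTP_ACCEPT_LANGUAGE tells me the written language in which you prefer to browse the web. In this case, en-US means US English, and the q=0.8 on the second entry is a weight marking plain en as a slightly less preferred fallback. Based on this, I can guess at your nationality.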
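
Here's a rough sketch of how a server could read that header. The helper name parse_accept_language is hypothetical, and the parsing is deliberately simplified: it only handles the comma-separated tags and the optional q weight, where a missing weight counts as 1.0.

```python
def parse_accept_language(header):
    """Return (language_tag, weight) pairs, most preferred first."""
    preferences = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        params = params.strip()
        weight = 1.0  # a tag without q= is fully preferred
        if params.startswith("q="):
            try:
                weight = float(params[2:])
            except ValueError:
                weight = 0.0  # treat an unreadable weight as "not wanted"
        preferences.append((tag.strip(), weight))
    # sorted() is stable, so equally weighted tags keep their header order
    return sorted(preferences, key=lambda pair: pair[1], reverse=True)


# Header value taken from the listing above.
languages = parse_accept_language("en-US,en;q=0.8")
print(languages)  # [('en-US', 1.0), ('en', 0.8)]
```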

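HTTP_USER_AGENT has some familiar words in it, like Mozilla, Apple, Chrome, etc. Most of those are leftovers kept for compatibility; the parts that matter give me the exact browser and version that you're using to view my page (Chrome 30) and your operating system (Windows NT 6.2 is Windows 8). Also, the presence of WOW64 indicates a 32-bit browser running on 64-bit Windows. If it were absent, with no Win64 or x64 token in its place, I could guess at a 32-bit system, likely an older machine. An outdated browser version is the stronger hint that you may be open to known vulnerabilities.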
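
A small sketch of pulling those details out of the string follows. The function summarize_user_agent and the lookup table are illustrative assumptions that cover only Chrome on Windows; real user agent parsing has far more cases, and the header can be faked by the visitor.

```python
import re

# Windows NT version numbers and their marketing names.
NT_VERSIONS = {
    "6.1": "Windows 7",
    "6.2": "Windows 8",
    "6.3": "Windows 8.1",
    "10.0": "Windows 10 or later",
}


def summarize_user_agent(user_agent):
    """Extract the Chrome version, Windows release, and architecture hint."""
    chrome = re.search(r"Chrome/([\d.]+)", user_agent)
    windows = re.search(r"Windows NT ([\d.]+)", user_agent)
    if "WOW64" in user_agent:
        architecture = "32-bit browser on 64-bit Windows"
    elif "Win64" in user_agent or "x64" in user_agent:
        architecture = "64-bit browser on 64-bit Windows"
    else:
        architecture = "unknown"
    os_name = None
    if windows:
        version = windows.group(1)
        os_name = NT_VERSIONS.get(version, "Windows NT " + version)
    return {
        "chrome_version": chrome.group(1) if chrome else None,
        "os": os_name,
        "architecture": architecture,
    }


# User agent string taken from the listing above.
summary = summarize_user_agent(
    "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36"
)
print(summary)
```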

HTTP_COOKIE. We've all heard of cookies in the context of the internet. This value is an identifier my site stored in your browser, and your browser sends it back with every request, which lets me track each time you return to my site. I can figure out how frequently you visit and, in conjunction with REQUEST_TIME, build a more detailed profile of your web browsing habits.
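
As a sketch of how that tracking could work, the snippet below groups request times by the cookie's value. It assumes the hus cookie from the listing acts as a per-visitor identifier, which the listing itself doesn't confirm; record_visit, the in-memory visits store, and the second timestamp are invented for the example.

```python
from collections import defaultdict
from http.cookies import SimpleCookie

# Maps a cookie value to every request time seen with it (in-memory, demo only).
visits = defaultdict(list)


def record_visit(cookie_header, request_time, cookie_name="hus"):
    """Log one request under the visitor's cookie value; return that value."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if cookie_name not in cookie:
        return None  # first-time visitor or cookies disabled
    visitor_id = cookie[cookie_name].value
    visits[visitor_id].append(request_time)
    return visitor_id


# First call uses the values from the listing; the second is a hypothetical
# return visit exactly one day later.
visitor = record_visit("hus=fu37clpe0p3oh1qqu2pv8i6dd3", 1383934514)
record_visit("hus=fu37clpe0p3oh1qqu2pv8i6dd3", 1383934514 + 86400)
print(visitor, len(visits[visitor]))  # fu37clpe0p3oh1qqu2pv8i6dd3 2
```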

Lastly we have REMOTE_ADDR (left out of the listing above), aka your public IP address. If run through a geolocation service, this info will usually give me a reasonable indication of where you are currently located, typically down to the city or region. Obviously it's not perfect, but if you've ever seen a website with an ad reading "find local singles in [fill in your city's name here]", this is how they're deriving that city. Coupled with HTTP_ACCEPT_LANGUAGE, I get a better estimate of your nationality as well. The list keeps going with REMOTE_ADDR. Based on your location, I can guess at your household income and potential employer.
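
Under the hood, such a service boils down to matching the address against a big table of network ranges. The sketch below shows that idea with a tiny stand-in table: the ranges are reserved documentation networks and the city labels are placeholders, not real geolocation data, and lookup_city is a hypothetical helper rather than any real service's API.

```python
import ipaddress

# Placeholder table: documentation-only networks mapped to made-up labels.
HYPOTHETICAL_GEO_TABLE = {
    ipaddress.ip_network("203.0.113.0/24"): "Example City A",
    ipaddress.ip_network("198.51.100.0/24"): "Example City B",
}


def lookup_city(remote_addr):
    """Return the label of the most specific matching network, or None."""
    address = ipaddress.ip_address(remote_addr)
    matches = [net for net in HYPOTHETICAL_GEO_TABLE if address in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)  # longest prefix wins
    return HYPOTHETICAL_GEO_TABLE[best]


print(lookup_city("203.0.113.7"))  # Example City A
print(lookup_city("192.0.2.1"))    # None
```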

So let's summarize:

From a single visit to this page, I now have (with reasonable accuracy) the type of computer you're using, a good time of day to find you browsing, the web browser you prefer, how often you come to my site, the language you speak, your potential nationality, a list of places at which you may work, how much money you might make, and your approximate location, often down to the city.

Do you feel violated? Guess what. You've given away this info thousands of times.
