Google is breaking the Web

Tue, Oct 18, 2011 05:09 PM
So, Google announced today that they would start doing a couple of new things. First, they are going to start sending all logged-in users to their SSL-enabled search page. Second, they claimed they are going to stop sending search terms to sites in the referring URL. They claim all of this is for your security. The first one I buy. SSL is better for those people on public wifi, no doubt. The second, I don't think so.

Let's back up a step. For those who don't know how the web works, here is a quick lesson. When you click on a link on a site, your browser connects to the new site to get the page. Part of that communication tells the new site which page the user was on when they clicked the link. This is a good thing. There is never any information passed between Google and your site. It's all between the browser on your computer and the site you are asking your computer to load. It helps site owners know who is linking to them. In the case of search engines, the referring URLs often contain the search terms someone typed in to find their site. This is also helpful for lots of reasons. None of them involve a user's security.
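To make that concrete, here is a minimal Python sketch of how a site owner might pull search terms out of a referring URL. The function name is made up for illustration, the sample URL is a shortened stand-in for a real Google referrer, and the code assumes the search engine puts the query in a q parameter, which not every search engine does.

```python
from urllib.parse import urlsplit, parse_qs


def search_terms_from_referrer(referrer):
    """Return the search terms carried in a referring URL, or None.

    Hypothetical helper: assumes the terms live in a "q" query parameter.
    """
    if not referrer:
        # The browser sent no referring URL at all.
        return None
    query = urlsplit(referrer).query
    values = parse_qs(query).get("q")
    return values[0] if values else None


# Shortened, illustrative referrer; real ones carry many more parameters.
referrer = "http://www.google.com/url?sa=t&source=web&cd=1&q=dealnews"
print(search_terms_from_referrer(referrer))  # prints: dealnews
```

If the browser sends no referring URL, or the URL has no q parameter, the function returns None and the site has nothing to work with.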

Ok, so, Google claims they are going to remove your search terms. But, my tests show they are removing the whole referring URL. Yes, you will not know what users are coming from Google. Let me show you. This is what I did.
  1. I typed http://www.google.com/ into my browser
  2. I searched for dealnews
  3. I clicked on the first link, which is the dealnews.com front page.
Using a tool called HTTPFox I am able to see what information is being passed between my computer and the web sites. This is what I see:
  1. http://www.google.com/ with no referring URL
  2. http://www.google.com/#hl=en&sugexp=kjrmc&cp=5&gs_id=j&xhr=t&q=dealnews&qe=ZGVhbG4&qesig=YtB_HodN2qCOIiqwx_wetA&pkc=AFgZ2tlle01GJ99f38Ol-HvrY0sbiq4vzJfAPDSXGQ2js5QqyHGJ9-5HIgoFXbUujrU81pfyhEVO8jpmFouC09MG1fRbqd0GVA&pf=p&sclient=psy-ab&site=&source=hp&pbx=1&oq=dealn&aq=0&aqi=g4&aql=f&gs_sm=&gs_upl=&bav=on.2,or.r_gc.r_pw.,cf.osb&fp=7b65204da701ddb7&biw=1295&bih=1406 with no referring URL because Google uses JavaScript to load the search results.
  3. http://www.google.com/url?sa=t&source=web&cd=1&sqi=2&ved=0CCwQFjAA&url=http%3A%2F%2Fdealnews.com%2F&rct=j&q=dealnews&ei=EPOdTtaUN4XOiAKZlIntCQ&usg=AFQjCNEN2YJ8XgSAJm6FOUqK2PuBUOkfxA&sig2=N2jBSsJb8sgPsrTkGgFCfw&cad=rja with a referring URL of http://www.google.com/
  4. http://dealnews.com/ with a referring URL of http://www.google.com/url?sa=t&source=web&cd=1&sqi=2&ved=0CCwQFjAA&url=http%3A%2F%2Fdealnews.com%2F&rct=j&q=dealnews&ei=EPOdTtaUN4XOiAKZlIntCQ&usg=AFQjCNEN2YJ8XgSAJm6FOUqK2PuBUOkfxA&sig2=N2jBSsJb8sgPsrTkGgFCfw
As you can see, with the request to http://dealnews.com/ my browser sent a URL telling that site that Google was linking to dealnews. In that URL you will see q=dealnews. That is the search term I typed into Google. Now, let's see what happens when I do the same thing on SSL.
  1. https://www.google.com/ with no referring URL
  2. Redirected to https://encrypted.google.com/ with no referring URL
  3. https://encrypted.google.com/#hl=en&sugexp=kjrmc&cp=8&gs_id=f&xhr=t&q=dealnews&tok=wzChADhZTTjwPuXR1iOwSA&pf=p&sclient=psy-ab&site=&source=hp&pbx=1&oq=dealnews&aq=0&aqi=g4&aql=f&gs_sm=&gs_upl=&bav=on.2,or.r_gc.r_pw.,cf.osb&fp=47f2f62d0e6da959&biw=1295&bih=1406 with no referring URL because Google uses JavaScript to load the search results.
  4. https://encrypted.google.com/url?sa=t&source=web&cd=1&sqi=2&ved=0CCsQFjAA&url=http%3A%2F%2Fdealnews.com%2F&rct=j&q=dealnews&ei=x_edTvjlGeKviQKzmdHqCQ&usg=AFQjCNEN2YJ8XgSAJm6FOUqK2PuBUOkfxA&sig2=OEhW8Z_BhHcCboIzu_Z2zQ with a referring URL of https://encrypted.google.com/
  5. http://dealnews.com/ with no referring URL.
So, if dealnews.com does not get a referring URL, how do we know this came from Google? This is the quote from the Google blog post:
When you search from https://www.google.com, websites you visit from our organic search listings will still know that you came from Google, but won't receive information about each individual query.
I ask you: how will the site know that if there is no referring URL? Referring URLs are a fundamental part of the web. If Google wants to strip data off the URL, that is one thing. It is not great IMO, but whatever. But not sending referrers at all is just wrong and should be changed.

If you care, please share this post. Tweet it, +1 it, whatever. This is just bad news for the web.

Edit: I wanted to make sure everyone knew that I observed the same behavior in both Firefox 7 and the latest Google Chrome.

Edit 2: I have also confirmed with the Apache access logs that no referring URL was sent.
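
For anyone who wants to run the same check on their own server, here is a rough Python sketch that pulls the referrer field out of an access log line. It assumes Apache's "combined" log format; the function name and the sample lines are made up for illustration and are not taken from my actual logs.

```python
import re

# Assumes the Apache "combined" format:
# %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-agent}i"
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def referrer_from_log_line(line):
    """Return the referring URL logged for a request, or None if absent."""
    match = COMBINED.match(line)
    if match is None:
        # The line is not in the assumed format.
        return None
    referrer = match.group("referrer")
    # Apache logs "-" when the browser sent no Referer header.
    return None if referrer in ("", "-") else referrer


# Hypothetical sample lines, one without and one with a referring URL.
no_ref = '192.0.2.1 - - [18/Oct/2011:17:09:00 -0500] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"'
with_ref = '192.0.2.1 - - [17/Oct/2011:12:00:00 -0500] "GET / HTTP/1.1" 200 5120 "https://encrypted.google.com/" "Mozilla/5.0"'
print(referrer_from_log_line(no_ref))
print(referrer_from_log_line(with_ref))
```

A request that arrives with no referring URL shows up as a "-" in that field, which the function reports as None.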
11 comments

Forum Software Reviews Says:

Hi Brian,

From what I remember, it's part of the HTTPS rules: you are not supposed to transmit the refer(r)er when using SSL, I would say for security reasons (even if everything is actually encrypted...)

Brian Moon Says:

As of yesterday, I had https://encrypted.google.com/ referrers in my Apache access logs.

Craig Says:

I'm pretty sure this is the expected behavior. Perhaps the better question would be why/how you were getting it before. I'm pretty sure the http referrer is stripped at the client/browser level when coming over SSL.

http://stackoverflow.com/questions/1361705/is-http-header-referer-sent-when-going-to-a-http-page-from-a-https-page

Gerd Riesselmann Says:

It has been said above: It's part of the HTTP spec to not send referer when going from https to http as in this case. Sending referers is the job of the browser, and there's nothing Google could do about it.

So, actually no: Google is not breaking the web. Unless you think that https itself is breaking the web.

Brian Moon Says:

Ok, I learned something. I like learning something. But how do we interpret Google's claim that you will still know the link came from Google?

Simon King Says:

It looks like Google is providing some of this missing data via the Webmaster tools. Not exactly the same, but it's something. More info here: http://googlewebmastercentral.blogspot.com/2011/10/accessing-search-query-data-for-your.html

Forum Software Reviews Says:

After some searching, I finally found where I read about this absence of the Referrer when using HTTPS: http://tools.ietf.org/html/rfc2616#section-15.1.3

Gennady Lager Says:

That Google Webmaster Tools data is very much inaccurate and useless to the point of needing to be ignored:

http://www.distilled.net/blog/seo/new-google-webmaster-tools-keyphrase-data-is-70-useless/

http://www.seroundtable.com/google-webmaster-tools-accuracy-12768.html

Suter Says:

Hmm, I can see the Referrer header in the request from the search results page even though it's served from an https page.

The header is missing in requests from encrypted.google.com.

Destrey Says:

Knowledge wants to be free, just like these articles!

Nick the 2nd grade math tutor Says:

Google is making it more complicated for ordinary users to use its own search. Many tutors at our learning center still don't understand HTML but do care about their online privacy. I think the general public isn't even aware of when these changes take place or when they get replaced by new ones.
