2016-02-25

Back in 2014, following Tim Bray’s lead, I made the Perpetual βeta website accessible exclusively over TLS. This means that your web-browser encrypts the information it sends to this website and vice versa. This results in a highly secure connection. There are visual cues of this secure state. You might notice a padlock icon in your browser, or that URLs to this website have an https:// prefix rather than the more common http://.

The general consensus in the web developer community is that any and every website should be HTTPS by default. Why? HTTP by itself isn’t encrypted, leaving it open to eavesdropping, message tampering, and man-in-the-middle attacks. HTTPS, if you use it consistently, prevents these issues.
Stephen Merity

So that’s all good then. But there’s a fly in the ointment, as Merity goes on to explain:

HTTPS is confusing one of the core metadata tools of the Internet: HTTP Referrers. HTTP Referrers disappear when going from HTTPS to HTTP, but, more worryingly, sensitive HTTPS Referrers still get carried when going from HTTPS to HTTPS. Most secure applications aren’t aware of where their HTTP Referrers do or don’t go.

The Referrer Header

A referrer header is an important element of a successful Web. Referrer headers give a website’s author information about the sites that link to hers, namely the full URL of the referring web-page.

Yet therein lies the rub, since there might be instances where you don’t want to advertise the URL of your referring page to a link target. Fortunately, it is easy to omit or spoof the referrer information should such a need arise. The fact that a referrer header is so easy to alter means that all such headers are inherently untrustworthy.

However, referrer headers are still useful to the web-developer or marketeer. In the world of blogging, to which this article most directly relates, the referrer header is a courtesy to the target website. The so-called refback alerts the target website’s author to the existence of a link to their material, perhaps with a response to one of their own posts, or some other form of critique. With the referrer link they can then visit the referring page, review the context of the link and respond accordingly.

So, within the context of blogging at least, sending a valid and correct referrer header is simply good Web etiquette. Yet, as increasing numbers of websites adopt TLS and link to their regular HTTP cousins, the absence of referrer information will most certainly have an impact.

The table (fig. 1) illustrates which protocol pairs share referrer headers. We can see that in all but one case, HTTPS to HTTP, the default behaviour is to send a referrer header. For HTTPS to HTTP it usually makes sense to withhold referrer information, as we almost certainly want no leakage of possibly sensitive information from a secure resource to an insecure one.

The sharing of referrer data between two HTTPS resources on the same domain is not a concern but, like Stephen Merity, I find it bizarre that this is also true with HTTPS servers on different domains. Having said that, if your URL‍s contain sensitive information of any kind, then you have way bigger problems to worry about.

Referrer Transfer (fig. 1)

Protocols Used    Referrer State
HTTP → HTTP       ✓ Passed
HTTP → HTTPS      ✓ Passed
HTTPS → HTTP      ✗ Not Passed
HTTPS → HTTPS     ✓ Passed
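The default behaviour in fig. 1 boils down to a single rule, which can be sketched in a few lines of Python. This is only an illustration of the table, not browser code; the function name default_sends_referrer and the example hosts are my own inventions.

```python
from urllib.parse import urlsplit


def default_sends_referrer(source_url: str, destination_url: str) -> bool:
    """Return True if the default behaviour in fig. 1 passes a referrer."""
    source_scheme = urlsplit(source_url).scheme.lower()
    destination_scheme = urlsplit(destination_url).scheme.lower()
    # The only pair that withholds the referrer is a downgrade: HTTPS -> HTTP.
    is_downgrade = source_scheme == "https" and destination_scheme == "http"
    return not is_downgrade


if __name__ == "__main__":
    # Hypothetical hosts, one pair per row of fig. 1.
    pairs = [
        ("http://a.example/post", "http://b.example/"),
        ("http://a.example/post", "https://b.example/"),
        ("https://a.example/post", "http://b.example/"),
        ("https://a.example/post", "https://b.example/"),
    ]
    for source, destination in pairs:
        state = "Passed" if default_sends_referrer(source, destination) else "Not Passed"
        print(f"{source} -> {destination}: {state}")
```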

Enter the Referrer Policy

Fortunately, those wise owls at the W3C anticipated the problem and conceived a solution, the Referrer Policy.[1] This gives webmasters explicit control over the referrer headers of their web properties, with a granularity they did not have before.

A Referrer Policy has five distinct states and four delivery methods. Let’s take a look at the states first:

Referrer Policy States

State: Behaviour

no-referrer: The client will send no referrer information with the request.

no-referrer-when-downgrade: If an HTTPS resource links to an HTTP resource (a downgrade) then the client will send no referrer information with the request. In all other cases, the client will send referrer information.

origin: Sends only the URL “origin” with the request. For example: if the source URL is https://example.com/contact then the client will send only https://example.com/ with the request, regardless of whether or not the destination is a TLS resource and regardless of whether or not the request is cross-origin (https://example1.com/ → https://example2.com/).

origin-when-cross-origin: Sends only the URL “origin” with the request if the client requests a downgrade resource (HTTPS → HTTP), upgrade resource (HTTP → HTTPS) or cross-origin resource (https://example1.com/ → https://example2.com/). If the source and destination share the same origin and protocol then the client sends the complete referrer URL with the request.

unsafe-url: The client will send complete referrer information with the request, regardless of protocol or origin.
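To make the five states concrete, here is a minimal Python sketch of what a client would send under each one. It is a simplification under stated assumptions, not a reading of the specification: the function names are mine, the policy strings are the tokens listed under Implementation (so origin for the state that sends only the origin), and the sketch ignores details such as default ports, URL fragments and embedded credentials.

```python
from typing import Optional
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Reduce a URL to scheme://host[:port]/, e.g. https://example.com/."""
    parts = urlsplit(url)
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}/"


def referrer_for(policy: str, source_url: str, destination_url: str) -> Optional[str]:
    """Return the referrer a client would send, or None if it sends nothing."""
    source = urlsplit(source_url)
    destination = urlsplit(destination_url)
    is_downgrade = source.scheme.lower() == "https" and destination.scheme.lower() == "http"
    # Same origin here means same scheme and same host[:port] (simplified).
    same_origin = origin_of(source_url) == origin_of(destination_url)

    if policy == "no-referrer":
        return None
    if policy == "no-referrer-when-downgrade":
        return None if is_downgrade else source_url
    if policy == "origin":
        return origin_of(source_url)
    if policy == "origin-when-cross-origin":
        return source_url if same_origin else origin_of(source_url)
    if policy == "unsafe-url":
        return source_url
    raise ValueError(f"unknown Referrer Policy: {policy!r}")


if __name__ == "__main__":
    source = "https://example.com/contact"
    for policy in ("no-referrer", "no-referrer-when-downgrade", "origin",
                   "origin-when-cross-origin", "unsafe-url"):
        print(policy, "->", referrer_for(policy, source, "http://example2.com/"))
```

Running it prints what each state would reveal about https://example.com/contact to an insecure site on another domain, which is a quick way to compare the states side by side.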

Choosing a Referrer Policy is something that each webmaster must do for herself. There is no one-size-fits-all recommendation as each website has its own unique characteristics. I would advise this: if in doubt, choose no-referrer.

The Perpetual βeta is a static-file-based website. There are no databases or server-side applications. Furthermore, there are no forms on the site nor are there any query-strings. As a result, there are no instances where an internal URL can contain anything other than a link to a resource that physically exists on the server and, if it did, it would be of no consequence. In my reasonably uncommon case then, I can safely use the unsafe-url Referrer Policy and pass the complete URL with my links.[2]

Implementation

As I wrote earlier, a Referrer Policy has four methods of delivery:

№ 1: Via the Referrer-Policy HTTP Header

We configure this at the server level in either the httpd.conf or .htaccess files.[3] With Apache’s mod_headers module enabled, the directive looks like this:

Header set Referrer-Policy "no-referrer"

Replace the string no-referrer with the one applicable to your chosen Referrer Policy (unsafe-url | no-referrer-when-downgrade | origin | origin-when-cross-origin).
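If your site is not served by Apache, the same header can be added wherever responses are built. As a rough illustration only, here is a Python WSGI wrapper that appends a Referrer-Policy header to every response. The names with_referrer_policy and hello_app are hypothetical, and the no-referrer value is just the example string used above.

```python
REFERRER_POLICY = "no-referrer"  # swap in your chosen Referrer Policy


def with_referrer_policy(app, policy=REFERRER_POLICY):
    """Wrap a WSGI app so that every response carries a Referrer-Policy header."""
    def wrapped(environ, start_response):
        def start_with_policy(status, headers, exc_info=None):
            # Drop any Referrer-Policy header the app set itself, then add ours.
            headers = [(name, value) for name, value in headers
                       if name.lower() != "referrer-policy"]
            headers.append(("Referrer-Policy", policy))
            return start_response(status, headers, exc_info)
        return app(environ, start_with_policy)
    return wrapped


def hello_app(environ, start_response):
    """A stand-in application, purely for demonstration."""
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"hello\n"]


app = with_referrer_policy(hello_app)
```

To try it locally, the standard library’s wsgiref.simple_server.make_server("", 8000, app).serve_forever() will serve the wrapped application, and the header should then show up in your browser’s network inspector.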

NOTE: At the time of writing, I have found no browsers that honour this header directive.

№ 2: Via a meta Tag with a Value of referrer

The second option allows the webmaster to send a meta tag with her HTML:

<meta name="referrer" content="origin">

That’s nice, clean and simple. Furthermore, it allows the webmaster to control the transmission of referrer information on a page-by-page level. Replace origin with your preferred Referrer Policy.
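Because the policy can now vary from page to page, it is handy to be able to check which policy a given page declares. The following Python sketch does that with the standard library’s HTML parser. It assumes the tag takes the form <meta name="referrer" content="origin">; the class and function names are my own.

```python
from html.parser import HTMLParser
from typing import Optional


class MetaReferrerParser(HTMLParser):
    """Remember the content of the last <meta name="referrer"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.policy: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)  # attribute names arrive lower-cased
        if (attributes.get("name") or "").lower() == "referrer":
            content = attributes.get("content")
            if content:
                self.policy = content.strip().lower()


def page_referrer_policy(html_text: str) -> Optional[str]:
    """Return the Referrer Policy declared in the page, or None if there is none."""
    parser = MetaReferrerParser()
    parser.feed(html_text)
    parser.close()
    return parser.policy
```

Run over the generated files of a static site, a helper like this makes it easy to spot pages that declare an unexpected policy or none at all.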

№ 3: Via the referrerpolicy Content Attribute

Still not granular enough for you? How about setting your Referrer Policy at a link level? Yes, you can achieve that too:

<a href="https://example.com/contact" referrerpolicy="origin">Contact</a>

Replace origin with your preferred Referrer Policy.
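On a static site, link-level policies are easiest to manage if the links are generated rather than typed by hand. Here is a small, hypothetical Python helper (link_with_policy is my own name) that renders an anchor with a referrerpolicy attribute and rejects any value that is not one of the five states described earlier.

```python
from html import escape

# The five policy tokens discussed in this article.
REFERRER_POLICIES = {
    "no-referrer",
    "no-referrer-when-downgrade",
    "origin",
    "origin-when-cross-origin",
    "unsafe-url",
}


def link_with_policy(href: str, text: str, policy: str) -> str:
    """Render an <a> element that carries its own Referrer Policy."""
    if policy not in REFERRER_POLICIES:
        raise ValueError(f"unknown Referrer Policy: {policy!r}")
    # Escape the URL and the link text so that they are safe inside HTML.
    safe_href = escape(href, quote=True)
    safe_text = escape(text)
    return f'<a href="{safe_href}" referrerpolicy="{policy}">{safe_text}</a>'


if __name__ == "__main__":
    print(link_with_policy("https://example.com/contact", "Contact", "origin"))
```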

№ 4: Implicit or Inherited

A link without a defined Referrer Policy will inherit one from its parent(s): first from the page (if one exists), then from the server headers.

If a Referrer Policy is not defined, then the link will inherit the default browser behaviour as illustrated in fig. 1.
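The order of inheritance can be written down as a short lookup. In this Python sketch the function name and parameters are mine, and I treat the default browser behaviour of fig. 1 as equivalent to no-referrer-when-downgrade, since that state describes exactly the same table.

```python
from typing import Optional

# Assumption: the fig. 1 default matches the no-referrer-when-downgrade state.
BROWSER_DEFAULT = "no-referrer-when-downgrade"


def effective_policy(link_policy: Optional[str] = None,
                     page_policy: Optional[str] = None,
                     header_policy: Optional[str] = None) -> str:
    """Pick the policy that applies to a link: link, then page, then server header."""
    for candidate in (link_policy, page_policy, header_policy):
        if candidate:
            return candidate
    # Nothing was defined anywhere, so fall back to the browser default.
    return BROWSER_DEFAULT
```

For example, a link with referrerpolicy="origin" on a page whose meta tag says no-referrer resolves to origin, while a bare link on the same page resolves to no-referrer.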

Final Words

The blogosphere (and the Web in general) is rapidly moving towards a TLS-by-default profile, and rightly so. In the transitional period, there are, inevitably, TLS-based websites that have links to the insecure websites of their peers, links that therefore carry no referrer data. Left unaddressed, this information loss deprives webmasters of invaluable perspective on the ecosystem around their websites.

With Referrer Policies, webmasters can enjoy fine-grained control of the referrer data their websites emit, right down to the link level.

Referrer Policies are well supported, with all the major browsers able to interpret and respond to their directives.[4] So there’s no reason not to use them today.

[1] Yay, this time it’s spelled correctly!

[2] I can use the unsafe-url Referrer Policy because my URLs have no sensitive details to reveal. If you consider using this Referrer Policy you have to really think about the security implications. You should probably not use this Referrer Policy if you run a secure website or web-application, or if you pass application parameters in query-strings.

[3] For the Apache web-server. Please refer to your server’s documentation if yours is not an Apache installation.

[4] Excluding the HTTP Header delivery method.
