Why Does HTTPS Matter?
HTTPS has been around for a while, but it’s generally not well understood. Many people know that sites using HTTPS instead of HTTP will display a lock in the address bar to tell users that the site is safe to use. Everyone knows that big ecommerce websites have to use HTTPS. Many people are aware that HTTPS is a good idea for login pages and other form pages on any site. But does it matter for everyday web pages and sites? Increasingly, the answer is YES, even for small sites and non-form pages.
HTTPS protects end users from eavesdroppers and other threats. Because of all the security ramifications of plain HTTP, Google is putting its considerable weight behind efforts to encourage websites to become more secure with an “HTTPS Everywhere” initiative:
Google has started giving extra SEO juice to sites served over HTTPS, promoting them in search results above their HTTP competitors. They’ve also begun to prefer indexing HTTPS pages over their HTTP counterparts when both exist.
In January of 2017 the Chrome browser will start adding a visual warning to any site using a plain HTTP connection, noting it as insecure. The first step is to flag HTTP sites that transmit passwords or credit cards, but eventually all HTTP sites will be marked.
HTTPS is also a requirement for some new interactive functionality, like taking pictures, recording audio, enabling offline app experiences, or geolocation, all of which require explicit user permissions. So, there are many reasons for website owners and users to pay attention to it.
What Does Insecurity Look Like?
As an experiment, to see exactly what level of security HTTPS gives the user, I visited two sites, one HTTP, and one HTTPS. Our Senior Systems Administrator, Ben Chavet, acted like an eavesdropper. He wasn’t even sitting next to me. He was 800 miles away watching my traffic over the VPN I was using. It took him just a few minutes to pick up what I was doing. What he did could have been done by someone in a coffee shop on a shared network, or by a “Man-in-the-Middle” somewhere between me and the sites I was accessing.
When I logged into the plain HTTP site, my “eavesdropper” could see everything I did, in plain text, including the full path I was visiting, along with my login name and password. He could even get my session cookie, which would allow him to impersonate me. Here’s a screen shot of some of the information he was able to view.
But when I logged into a site protected by HTTPS, the only thing that was legible to my “eavesdropper” was the domain name of the site, and a couple of other bits of information from the security certificate as it was being processed. Everything else was encrypted.
There are other problems with plain HTTP. An eavesdropper could steal session cookies and use them to impersonate a legitimate user, gaining access to information they shouldn’t be able to see. An attacker who can intercept a plain HTTP page could change links on the page, perhaps to redirect a user to another site. And if a site encrypts only its form submissions but serves the form itself over plain HTTP, an attacker can modify the form so that it posts to a different URL. A valid HTTPS page is not vulnerable to these kinds of changes.
Clearly, HTTPS offers a huge security benefit!
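To make the eavesdropping experiment a little more concrete, here is a rough Python sketch of what a form login looks like on the wire when it travels over plain HTTP. The host, path, form field names, cookie name, and credentials are all made up for illustration; they are not the sites or values from the experiment. The point is only that the bytes are readable by anyone who can see them.

```python
from urllib.parse import urlencode


def build_login_request(host, path, username, password, session_id):
    """Return the raw bytes of a form-login POST as plain HTTP would send them.

    Every value here is hypothetical; nothing is actually sent anywhere.
    """
    body = urlencode({"name": username, "pass": password})
    lines = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body.encode('ascii'))}",
        f"Cookie: SESSID={session_id}",
        "",  # blank line separates headers from the body
        body,
    ]
    return "\r\n".join(lines).encode("ascii")


raw = build_login_request("example.com", "/user/login", "alice", "hunter2", "abc123")

# This is what an eavesdropper on the same network would be able to read.
print(raw.decode("ascii"))
```

Over HTTPS the same request is still built, but it is encrypted before it leaves the machine, so the path, the credentials, and the cookie are not readable in transit.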
What Does HTTPS Provide?
Let’s back up a bit. What exactly does HTTPS give us? It’s two things, really. First, it’s a way to ensure data integrity and make sure that traffic sent over the internet is encrypted. Secondly, it’s a system that provides authentication, meaning an assurance that the site a user is looking at is the site they think they are looking at.
In addition to encrypting the user’s activity and data, HTTPS means the identity of the site is authenticated using a certificate that has been verified by a trusted third party.
If you get to a site using HTTPS instead of HTTP, you are accessing a site that purports to be secure. On an HTTPS connection, the browser you use (e.g. Internet Explorer, Safari, Chrome, or Firefox) and the site’s server will communicate with each other. The browser expects the server to provide a certificate of authenticity and a key the browser can use to set up encryption of the messages between the browser and the server. If the browser gets the information it requires from a secure site, it will display a safety lock in the address bar. If anything seems amiss, the browser will warn the user. Problems on an HTTPS page could include a missing, invalid, or expired certificate or key, or “mixed content” (HTTP content or resources that should never be included on an HTTPS page).
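Mixed content is the easiest of these problems to check for yourself. The following is a minimal sketch, not a complete audit tool: it scans a page’s HTML for a handful of tags that load resources and reports any that point at a plain `http://` URL. The tag list and the helper names (`MixedContentFinder`, `find_mixed_content`) are my own assumptions for illustration, and a real browser checks many more cases, such as resources loaded from CSS or JavaScript.

```python
from html.parser import HTMLParser

# Tags that make the browser fetch a resource as part of the page, and the
# attribute holding the URL. Ordinary <a href> links are navigation rather
# than page content, so they are deliberately left out.
RESOURCE_ATTRS = {"img": "src", "script": "src", "iframe": "src", "link": "href"}


class MixedContentFinder(HTMLParser):
    """Collect (tag, url) pairs for resources requested over plain HTTP."""

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # Only treat <link> as a resource when it pulls in a stylesheet.
        if tag == "link" and "stylesheet" not in (attr_map.get("rel") or "").lower():
            return
        attr = RESOURCE_ATTRS.get(tag)
        url = attr_map.get(attr) if attr else None
        if url and url.lower().startswith("http://"):
            self.insecure.append((tag, url))


def find_mixed_content(html):
    """Return the insecure resources found in an HTML string."""
    finder = MixedContentFinder()
    finder.feed(html)
    finder.close()
    return finder.insecure


sample_page = (
    '<html><head>'
    '<link rel="stylesheet" href="https://example.com/site.css">'
    '<script src="http://example.com/app.js"></script>'
    '</head><body>'
    '<img src="http://example.com/logo.png">'
    '<a href="http://example.com/">Home</a>'
    '</body></html>'
)

for tag, url in find_mixed_content(sample_page):
    print(f"insecure <{tag}> resource: {url}")
```

In the sample page the script and the image are flagged, while the HTTPS stylesheet and the ordinary link are not.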
Identity, data integrity, and encryption are all important. A bogus site could still be encrypting its traffic, and a site that is totally legitimate might not be encrypting its traffic. A really secure site will both encrypt its traffic and also provide evidence that it is the site it purports to be.
How Do Users Know a Site is Secure?
Browsers provide messages for insecure sites. The specific messages vary from browser to browser, and depend on the situation, but might include text like “This page may not be secure.” or “The certificate is not trusted because it is self signed.” Most browsers display some color-coding that is expected to help convey the security status.
If a site is rendered only over HTTP, browsers usually don’t indicate anything at all about the security of the site. They just provide a plain URL without a lock. This provides no information, but also no assurance of any kind. And as noted above, unencrypted internet traffic over HTTP is still a potential security risk.
The following chart illustrates a range of possibilities for browser security status indicators (note that EV is a special type of HTTPS certificate that provides extra assurance, like for bank and financial sites, more about that later):
For more information about the HTTPS security, users can click on the lock icon. The specific details they see will vary from browser to browser, but generally, there is a link with text like “More details” or “View certificate” that will allow the user to see who owns the certificate and other details about it.
Research about how well end users understand HTTPS security status and messages found that most users don’t understand and ultimately ignore security warnings. Users often miss the lock, or lack of a lock, and find the highly technical browser messages to be confusing. The focus on color to indicate security status is a problem for those who are color blind. Also, so many sites still use HTTP or are otherwise potentially insecure that it is easy for users to discount the risk and proceed regardless. The conclusion of all this research is that better systems need to be put in place to make it clear to users which sites are secure and which aren’t, and to encourage more sites to adhere to recommended security best practices.
A while ago, Chrome started to let users understand how secure a site is. These examples use a combination of color and shape to convey what’s secure and what isn’t. Currently, the plain HTTP site is more noticeably a security threat.
Starting in January of 2017, they plan to add text saying ‘Secure’ or ‘Not secure’ for even more emphasis:
Other browsers may follow suit to make plain HTTP look more noticeably insecure. Between the user safety, the SEO hit, and the security warnings that may scare people away from sites using plain HTTP, no legitimate site can really afford to ignore the implications of not serving content over HTTPS.
What Do All the Terms Mean?
HTTPS terminology is confusing. There is a lot of jargon and countless acronyms. If you read anything about HTTPS, you can quickly get lost in a sea of unfamiliar terminology. Here is a list of definitions to help make things clearer.
Secure Sockets Layer (SSL)
SSL is the original standard used for encrypted traffic sent over HTTP. It has actually been superseded by TLS, but the term is still used in a generic way to refer to either SSL or TLS.
Transport Layer Security (TLS)
TLS is the successor to SSL, a newer and more stringent protocol. TLS is not just for web browsers and HTTP. It can also be used with non-HTTP applications. For instance, it can be used to provide secure email delivery. TLS is the layer where encryption takes place.
HTTPS
HTTPS is simply HTTP sent over a connection that has the extra layer of security provided by TLS.
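That layering can be sketched in a few lines of Python using the standard `ssl` module: open an ordinary TCP connection, wrap it in TLS, and then send the very same HTTP text you would have sent over plain HTTP. This is a simplified sketch rather than how a browser is implemented. The helper names and the choice of TLS 1.2 as the minimum version are my own assumptions.

```python
import socket
import ssl


def make_client_context():
    """Build a TLS context that verifies the server like a browser would."""
    # The default context requires a certificate that chains to a trusted CA
    # and checks that it matches the hostname being visited.
    ctx = ssl.create_default_context()
    # Assumption for this sketch: refuse anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def fetch_status_line(hostname, port=443, timeout=10.0):
    """Send a HEAD request over TLS and return the first response line.

    Raises ssl.SSLCertVerificationError if the certificate is missing,
    expired, untrusted, or issued for a different hostname.
    """
    ctx = make_client_context()
    with socket.create_connection((hostname, port), timeout=timeout) as tcp:
        # Everything written to `tls` is encrypted before it reaches `tcp`.
        with ctx.wrap_socket(tcp, server_hostname=hostname) as tls:
            request = (
                f"HEAD / HTTP/1.1\r\nHost: {hostname}\r\nConnection: close\r\n\r\n"
            )
            tls.sendall(request.encode("ascii"))
            response = tls.recv(4096)
    return response.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
```

Calling `fetch_status_line("example.com")` needs network access, so it is left out of the block. The HTTP request itself is unchanged. All of the security comes from the TLS wrapper underneath it.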
Certificate Authority (CA)
A CA is an organization that issues and verifies HTTPS certificates. “Self-signed” certificates carry no third-party verification of who they belong to. Certificates should be signed by a known third party.
Certificate Chain of Trust
Between a site’s certificate and the CA that ultimately vouches for it, there can be one or more intermediate certificates, creating a chain. This chain should take you from the current certificate all the way back to a trusted CA.
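The idea of following a chain back to a trusted CA can be illustrated with a toy model. The sketch below is deliberately simplified and the names in it (`Cert`, `build_chain`, the example subjects) are hypothetical. It only follows the “issued by” links from one certificate to the next. Real validation also checks each cryptographic signature, the validity dates, and more, none of which is modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    """Toy certificate: just who it is for and who issued it."""
    subject: str
    issuer: str


def build_chain(leaf, intermediates, trusted_roots):
    """Follow issuer links from the leaf until a trusted root is reached.

    Returns the list of names in the chain, or None if the chain never
    reaches a trusted root (for example, a self-signed certificate).
    """
    by_subject = {cert.subject: cert for cert in intermediates}
    chain = [leaf.subject]
    seen = {leaf.subject}
    current = leaf
    while True:
        if current.issuer in trusted_roots:
            chain.append(current.issuer)
            return chain
        issuer_cert = by_subject.get(current.issuer)
        # Stop if the issuer is unknown or we are going around in a loop.
        if issuer_cert is None or issuer_cert.subject in seen:
            return None
        seen.add(issuer_cert.subject)
        chain.append(issuer_cert.subject)
        current = issuer_cert


trusted = {"Example Root CA"}
site = Cert(subject="www.example.com", issuer="Example Intermediate CA")
intermediate = Cert(subject="Example Intermediate CA", issuer="Example Root CA")
self_signed = Cert(subject="self.example", issuer="self.example")

print(build_chain(site, [intermediate], trusted))
print(build_chain(self_signed, [], trusted))
```

The first call finds a path from the site, through the intermediate, to the trusted root. The second returns `None` because a self-signed certificate only points back at itself.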
Domain Validation (DV)
A DV certificate indicates that the applicant has control over the specified DNS domain. DV certificates do not assure that any particular legal entity is connected to the certificate, even if the domain name may imply that. The name of the organization will not appear next to the lock since the controlling organization is not validated. DV certificates are relatively inexpensive, or even free. It’s a low level of authentication, but it provides assurance that the user is really connected to the domain shown in the address bar and not to an impostor intercepting the traffic.
Extended Validation (EV)
Extended Validation certificates validate the legal entity that controls the domain as well as the fact that they have actual control over the domain. The name of the verified legal identity is displayed in the browser, in green, next to the lock. EV certificates are more expensive than DV certificates because of the extra work they require from the CA. EV certificates convey more trust, so are appropriate for financial and commerce sites.
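One visible difference between these certificate types is what the certificate says about its owner. Here is a small sketch that reads the subject of a certificate in the dictionary format returned by Python’s `ssl.SSLSocket.getpeercert()` and reports whether an organization is named. The sample certificates and the helper names are made up for illustration. One important caveat: an organization name shows that the certificate carries more than a domain, but it does not by itself prove the certificate is EV. As far as I know, EV status is signalled by policy identifiers that this dictionary does not include.

```python
def subject_fields(cert):
    """Flatten the nested 'subject' tuples from getpeercert() into a dict."""
    fields = {}
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            fields[key] = value
    return fields


def describe_identity(cert):
    """Summarize who the certificate claims to belong to."""
    fields = subject_fields(cert)
    name = fields.get("commonName", "(no common name)")
    org = fields.get("organizationName")
    if org:
        return f"{name}: organization listed as {org}"
    return f"{name}: domain only, no organization listed"


# Hypothetical examples in the same shape getpeercert() returns.
dv_style_cert = {"subject": ((("commonName", "example.org"),),)}
ev_style_cert = {
    "subject": (
        (("countryName", "US"),),
        (("organizationName", "Example Bank, Inc."),),
        (("commonName", "www.examplebank.com"),),
    )
}

print(describe_identity(dv_style_cert))
print(describe_identity(ev_style_cert))
```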
Next Steps
It seems pretty clear that HTTPS is important. In my next article, HTTPS Everywhere: Making the Switch, I’ll talk about what it takes to migrate a site from HTTP to HTTPS.
More Reading
How HTTPS works
https://developers.google.com/web/fundamentals/security/encrypt-in-transit/why-https
How HTTPS affects SEO ranking
https://security.googleblog.com/2015/12/indexing-https-pages-by-default.html
https://fourdots.com/blog/why-you-need-ssl-to-rank-better-in-2016-and-how-to-set-it-2169
https://fourdots.com/blog/redefining-google-search-2015-1707
Browser clues about website security
https://www.usenix.org/system/files/conference/soups2016/soups2016-paper-porter-felt.pdf
http://cs.jhu.edu/~sdoshi/jhuisi650/papers/spimacs/SPIMACS_CD/ccsw/p19.pdf
https://nakedsecurity.sophos.com/2016/09/09/google-to-slap-warnings-on-non-https-sites/
https://groups.google.com/a/chromium.org/forum/#!topic/security-dev/aAtvHYFXRVo
https://www.chromium.org/Home/chromium-security/marking-http-as-non-secure
https://security.googleblog.com/2016/09/moving-towards-more-secure-web.html
How a password can be stolen over an insecure connection
http://security.stackexchange.com/questions/55433/how-is-password-stolen-over-non-ssl-connection
Types of Certificates
https://en.wikipedia.org/wiki/Extended_Validation_Certificate
https://en.wikipedia.org/wiki/Domain-validated_certificate
https://en.wikipedia.org/wiki/Chain_of_trust