Planet.debian.org

Francois Marier: How Safe Browsing works in Firefox

2016-04-01

Firefox has had support for Google's Safe Browsing since 2005 when it
started as a stand-alone Firefox extension. At first it was only available
in the USA, but it was opened up to the rest of the world in 2006 and moved
to the Google Toolbar. It then got integrated directly into Firefox 2.0
before the public launch of the service in 2007.

Many people seem confused by this phishing and malware protection system,
and while there is a pretty good explanation of how it works on our support
site, it doesn't go into technical details. This post will hopefully be of
interest to those who have more questions about it.

Browsing Protection

The main part of the Safe Browsing system is the one that watches for bad
URLs as you're browsing. Browsing protection currently protects users from:

malware sites,

deceptive sites
(including phishing
and social engineering
sites), and

sites hosting potentially unwanted software.

If a Firefox user attempts to visit one of these sites, a warning page will
show up instead, which you can see for yourself here:

fake malware page

fake unwanted software page

fake phishing page

The first two warnings can be toggled using the browser.safebrowsing.malware.enabled
preference (in about:config) whereas the last one is controlled by
browser.safebrowsing.enabled.

List updates

It would be too slow (and privacy-invasive) to contact a trusted server
every time the browser wants to establish a connection with a web server.
Instead, Firefox downloads a list of bad URLs every 30 minutes from the
server (browser.safebrowsing.provider.google.updateURL) and does a
lookup against its local database
before displaying a page to the user.

Downloading the entire list of sites flagged by Safe Browsing would be
impractical due to
its size
so the following transformations are applied:

each URL on the list is canonicalized,

then hashed,

and then truncated so that only the first 32 bits of the hash are kept.
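The three transformations above can be sketched in a few lines of Python. This is a minimal illustration, not Firefox's implementation: `simple_canonicalize` and `hash_prefix` are hypothetical names, the canonicalization shown is far simpler than the real rules, and SHA-256 is used because that is the hash specified by the published Safe Browsing protocol.

```python
import hashlib
from urllib.parse import urlsplit


def simple_canonicalize(url: str) -> str:
    """Simplified canonicalization (an assumption for illustration):
    lowercase the host, drop the scheme, port and fragment, and default
    the path to "/". The real protocol has many more rules."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    path = parts.path or "/"
    query = f"?{parts.query}" if parts.query else ""
    return f"{host}{path}{query}"


def hash_prefix(canonical: str, length: int = 4) -> bytes:
    """Hash the canonical form and keep only the first 4 bytes (32 bits)."""
    return hashlib.sha256(canonical.encode("utf-8")).digest()[:length]


canonical = simple_canonicalize("HTTP://Example.COM/a/b.html#frag")
prefix = hash_prefix(canonical)
print(canonical, prefix.hex())
```

Storing 4 bytes per entry instead of a full URL or a full 32-byte hash is what keeps the downloaded lists small.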

The lists that are requested from the Safe Browsing server and used to flag
pages as malware/unwanted or phishing can be found in
urlclassifier.malwareTable and urlclassifier.phishTable respectively.

If you want to see some debugging information in your terminal while Firefox
is downloading updated lists, turn on browser.safebrowsing.debug.

Once downloaded, the lists can be found in the cache directory:

~/.cache/mozilla/firefox/XXXX/safebrowsing/ on Linux

~/Library/Caches/Firefox/Profiles/XXXX/safebrowsing/ on Mac

C:\Users\XXXX\AppData\Local\mozilla\firefox\profiles\XXXX\safebrowsing\ on Windows
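If you would rather locate these directories programmatically, the following sketch may help. It assumes the default locations listed above, treats `XXXX` as "any profile directory", and guesses that the Windows path lives under the current user's home directory; `safebrowsing_dirs` is a hypothetical helper, not part of Firefox.

```python
import sys
from pathlib import Path
from typing import List, Optional


def safebrowsing_dirs(platform: str = sys.platform,
                      home: Optional[Path] = None) -> List[Path]:
    """Return the safebrowsing cache directories of all local profiles."""
    home = Path.home() if home is None else home
    if platform.startswith("linux"):
        base = home / ".cache/mozilla/firefox"
    elif platform == "darwin":
        base = home / "Library/Caches/Firefox/Profiles"
    elif platform.startswith("win"):
        base = home / "AppData/Local/mozilla/firefox/profiles"
    else:
        return []  # unknown platform: nothing to report
    if not base.is_dir():
        return []
    # One directory per profile, e.g. <base>/<profile>/safebrowsing
    return sorted(p for p in base.glob("*/safebrowsing") if p.is_dir())


for directory in safebrowsing_dirs():
    print(directory)
```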

Resolving partial hash conflicts

Because the Safe Browsing database only contains partial hashes, it is
possible for a safe page to share the same 32-bit hash prefix as a bad page.
Therefore when a URL matches the local list, the browser needs to know
whether or not the rest of the hash matches the entry on the Safe Browsing
list.

In order to resolve such conflicts, Firefox requests from the Safe Browsing
server (browser.safebrowsing.provider.google.gethashURL) all of the
hashes that start with the affected 32-bit prefix and adds these full-length
hashes to its local database. Turn on browser.safebrowsing.debug to see
some debugging information on the terminal while these "completion" requests
are made.

If the current URL doesn't match any of these full hashes, the load
proceeds as normal. If it does match one of them, a warning interstitial
page is shown and the load is canceled.
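The two-stage lookup described in this section can be sketched as follows. This is a rough model under stated assumptions: `is_blocked` and `fetch_full_hashes` are hypothetical names, the server round trip is replaced by a callable, and SHA-256 is assumed as the hash function.

```python
import hashlib
from typing import Callable, Iterable, Set


def full_hash(canonical_url: str) -> bytes:
    return hashlib.sha256(canonical_url.encode("utf-8")).digest()


def is_blocked(canonical_url: str,
               local_prefixes: Set[bytes],
               fetch_full_hashes: Callable[[bytes], Iterable[bytes]]) -> bool:
    digest = full_hash(canonical_url)
    prefix = digest[:4]
    if prefix not in local_prefixes:
        # No local match: the load proceeds and no request is sent.
        return False
    # Prefix match: fetch every full hash sharing this prefix (the
    # "completion" request) and compare against the whole digest.
    return digest in set(fetch_full_hashes(prefix))


# Stand-in for the remote server, for illustration only.
bad = full_hash("evil.example/")
local = {bad[:4]}
server = lambda prefix: [bad] if prefix == bad[:4] else []
print(is_blocked("evil.example/", local, server))
```

Note that the server is only contacted when the 32-bit prefix matches locally, which is the rare case.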

Download Protection

The second part of the Safe Browsing system protects users against malicious
downloads. It was launched in 2011, implemented in Firefox 31 on Windows and
enabled in Firefox 39 on Mac and Linux.

It roughly works like this:

1. Download the file.

2. Check the main URL, referrer and redirect chain against a local
blocklist (urlclassifier.downloadBlockTable) and block the download
in case of a match.

3. On Windows, if the binary is signed, check the signature against a local
whitelist (urlclassifier.downloadAllowTable) of known good publishers and
release the download if a match is found.

4. If the file is not a binary file then release the download.

5. Otherwise, send the binary file's metadata to the remote application
reputation server (browser.safebrowsing.downloads.remote.url) and block the
download if the server indicates that the file isn't safe.
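The order of these checks can be summarized in a short sketch. It is only a model of the decision flow described above: the `Download` type, `check_download` and `remote_says_unsafe` are hypothetical names, and the lists are plain sets of strings here, whereas the real tables hold hashed entries.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional, Set


class Verdict(Enum):
    ALLOW = "allow"
    BLOCK = "block"


@dataclass
class Download:
    urls: List[str]               # main URL, referrer and redirect chain
    is_binary: bool
    signer: Optional[str] = None  # publisher taken from the signature, if any


def check_download(dl: Download,
                   blocklist: Set[str],
                   allowlist: Set[str],
                   remote_says_unsafe: Callable[[Download], bool],
                   on_windows: bool) -> Verdict:
    # Local blocklist check on every URL involved in the download.
    if any(url in blocklist for url in dl.urls):
        return Verdict.BLOCK
    # Signature allowlist, only consulted on Windows.
    if on_windows and dl.signer is not None and dl.signer in allowlist:
        return Verdict.ALLOW
    # Non-binary files are released without a remote lookup.
    if not dl.is_binary:
        return Verdict.ALLOW
    # Last resort: ask the remote reputation service (stubbed here).
    return Verdict.BLOCK if remote_says_unsafe(dl) else Verdict.ALLOW


dl = Download(urls=["https://cdn.example/setup.exe"], is_binary=True)
print(check_download(dl, set(), set(), lambda d: True, on_windows=False))
```

The remote lookup comes last on purpose: every local check that settles the question avoids sending anything over the network.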

Blocked downloads can be unblocked by right-clicking on them in the download
manager and selecting "Unblock".

While the download protection feature is automatically disabled when malware
protection (browser.safebrowsing.malware.enabled) is turned off, it can
also be disabled independently via the
browser.safebrowsing.downloads.enabled preference.

Note that Step 5 is the only point at which any information about the
download is shared with Google. That remote lookup can be suppressed via the
browser.safebrowsing.downloads.remote.enabled preference for those users
concerned about sending that metadata to a third party.

Types of malware

The original application reputation service would protect users against
"dangerous" downloads, but it has recently been expanded to also warn users
about
unwanted software
as well as software that's not commonly downloaded.

These various warnings can be turned on and off in Firefox through the
following preferences:

browser.safebrowsing.downloads.remote.block_dangerous

browser.safebrowsing.downloads.remote.block_dangerous_host

browser.safebrowsing.downloads.remote.block_potentially_unwanted

browser.safebrowsing.downloads.remote.block_uncommon

These warnings can be tested using Google's test page.

If you want to see how often each "verdict" is returned by the server, you
can have a look at the
telemetry results for Firefox Beta.

Privacy

One of the most persistent misunderstandings about Safe Browsing is the idea
that the browser needs to send all visited URLs to Google in order to verify
whether or not they are safe.

While this was an option in version 1 of the Safe Browsing protocol (as
disclosed in Google's privacy policy at the time), support for this
"enhanced mode" was removed in Firefox 3. The version 1 server was then
decommissioned in late 2011 in favor of version 2 of the Safe Browsing API,
which doesn't offer this type of real-time lookup.

Google explicitly states that the information collected as part
of operating the Safe Browsing service
"is only used to flag malicious activity and is never used anywhere else at Google"
and that
"Safe Browsing requests won't be associated with your Google Account".
In addition, Firefox adds a few privacy protections:

Query string parameters are
stripped
from URLs we check as part of the download protection feature.

Cookies set by the Safe Browsing servers to protect the service from
abuse are stored in a
separate cookie jar
so that they are not mixed with regular browsing/session cookies.

When requesting complete hashes for a 32-bit prefix, Firefox throws in a
number of extra
"noise" entries
to obfuscate the original URL further.
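Two of the protections listed above lend themselves to a small sketch. Everything here is illustrative: `strip_query` and `with_noise` are hypothetical names, the number of decoys is arbitrary, and drawing decoys from the local prefix list is an assumption rather than a description of how Firefox selects its noise entries.

```python
import random
from typing import List, Optional, Sequence
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Drop the query string (and the fragment) before checking a URL."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def with_noise(real_prefix: bytes,
               local_prefixes: Sequence[bytes],
               count: int = 4,
               rng: Optional[random.Random] = None) -> List[bytes]:
    """Mix the real prefix with decoy prefixes from the local list."""
    if rng is None:
        rng = random.Random()
    candidates = [p for p in local_prefixes if p != real_prefix]
    decoys = rng.sample(candidates, min(count, len(candidates)))
    request = decoys + [real_prefix]
    rng.shuffle(request)  # hide which entry is the real one
    return request


print(strip_query("https://example.com/f.exe?token=abc"))
# prints "https://example.com/f.exe"
```

With decoys in the request, the server sees several prefixes and cannot tell which one the user actually visited.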

On balance, we believe that most users will want to keep Safe Browsing enabled,
but we also make it easy for
users with particular needs
to turn it off.

Learn More

If you want to learn more about how Safe Browsing works in Firefox, you can
find all of the technical details on the
Safe Browsing and
Application Reputation
pages of the Mozilla wiki or you can ask questions on our
mailing list.

Google provides some interesting statistics about what their systems detect in their
transparency report
and offers a tool to find out
why a particular page has been blocked.
Some information on
how phishing sites are detected
is also available on the Google Security blog, but for more detailed information
about all parts of the Safe Browsing system, see the following papers:

All Your IFrames Are Belong to Us

Content-Agnostic Malware Protection

Ghost in the Browser