2014-02-28

Part 2 describes the earliest encounters, known and unknown, with the hack.

Matt Farrell (professional hacker): If that guy knew half the shit that I know, his fuzzy little head would explode.
— Live Free or Die Hard (2007) —

SQL Injection Attacks

As has been repeatedly mentioned, Skeptical Science endures frequent SQL injection attacks.

A SQL injection attack works like this.  When you enter data into a form on a web site, for instance to log on, to post a comment, or to run a search, what you enter is combined with computer commands that do something in the database.  For example, if you search for “climate change” on a web site, the programs that run the site might issue a database command which is the programming equivalent of “find all pages with the words ‘climate change’ in the text”.  The SQL statement, the programming version of that command, might look something like this:

SELECT * FROM PAGES WHERE CONTENT LIKE '%climate change%';

Skeptical Science endures SQL injection attacks at least six times a year, if not more.

Clever hackers can use this to trick the system into doing something it was never intended to do, by submitting carefully constructed search criteria.  If the data entered into the form can be structured so that it surreptitiously alters the database command, making it do something other than what the site's programs intended, then the attacker has a way of manipulating the site or viewing data to which he shouldn’t have access.  Done correctly, such an attack can trick the site into betraying private information, including user names and their passwords.

For example, suppose you enter the following in the search box, instead of simply ‘climate change’:

climate change' UNION SELECT * FROM USERS WHERE USERNAME LIKE '

The program might put that together with the normal database command to get this:

SELECT * FROM PAGES WHERE CONTENT LIKE '%climate change' UNION SELECT * FROM USERS WHERE USERNAME LIKE '%';

That particular command will return as the search results every post that ends with ‘climate change’ and also every username in the database.

[This is far from a complete, or even completely accurate, demonstration of SQL injection.  This example is merely intended to demonstrate the basic flavor of a SQL injection attack, without introducing details that would only further confuse things.]
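To make the idea concrete, here is a minimal sketch in Python using an in-memory SQLite database.  Everything in it is an assumption made for illustration: the table names, the column names, and the sample rows are invented, and none of it is the actual Skeptical Science code.  It contrasts a search that pastes user text straight into the SQL string with one that passes the text as a bound parameter.

```python
import sqlite3

# Hypothetical schema and data, invented purely for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (title TEXT, content TEXT);
    CREATE TABLE users (username TEXT, password TEXT);
    INSERT INTO pages VALUES ('Post A', 'An article about climate change');
    INSERT INTO pages VALUES ('Post B', 'An article about sea ice');
    INSERT INTO users VALUES ('alice', 'secret1');
    INSERT INTO users VALUES ('bob', 'secret2');
""")

def search_unsafe(term):
    # Vulnerable: the user's text is pasted directly into the SQL string,
    # so a quote character in the text can end the literal and add SQL.
    sql = "SELECT title, content FROM pages WHERE content LIKE '%" + term + "%'"
    return conn.execute(sql).fetchall()

def search_safe(term):
    # Safer: the text is passed as a bound parameter and is only ever
    # treated as data, never parsed as part of the command.
    sql = "SELECT title, content FROM pages WHERE content LIKE ?"
    return conn.execute(sql, ("%" + term + "%",)).fetchall()

# The same shape of input as the example above, adapted to two columns.
attack = ("climate change' UNION SELECT username, password "
          "FROM users WHERE username LIKE '")

leaked = search_unsafe(attack)    # includes rows from the users table
harmless = search_safe(attack)    # no page contains that literal text
```

In this sketch the unsafe version returns the user rows alongside the matching page, while the parameterized version simply finds nothing.  Bound parameters are one common defense, not necessarily the one used on the site.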

That's not the only way to use SQL injection.  There are hundreds of ways.  That's why first probes are usually done by bots trying dozens or hundreds of possible combinations.  They just want to find out if anything at all worked, and then they report back to the hacker when they detect something that has.

One of these attacks, a mildly successful one, is why the database SQL injection log files were created.

When Skeptical Science was first hit this way, Doug Bostrom helped John to proof the site against SQL injection attacks.  The solutions are complex, but along with their efforts at prevention they added a bit of detection: a log file recording every SQL statement used by the system.  That file could in turn be scanned for various SQL injection tricks, so we'd know when an attack had occurred and could study its nature and determine whether or not there were still any holes in our shields.
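As a rough illustration of what scanning such a log for injection tricks could look like, here is a small Python sketch.  The signature list is an assumption and deliberately tiny, the sample statements are invented, and this is not the scanner John and Doug actually used.

```python
import re

# Illustrative signatures only; real attacks are far more varied.
SUSPICIOUS_PATTERNS = [
    re.compile(r"\bUNION\b\s+(ALL\s+)?\bSELECT\b", re.IGNORECASE),
    re.compile(r"\bOR\b\s+'?1'?\s*=\s*'?1", re.IGNORECASE),
    re.compile(r";\s*(DROP|DELETE|UPDATE|INSERT)\b", re.IGNORECASE),
    re.compile(r"--|/\*"),
]

def find_suspicious(statements):
    """Return (line_number, statement) pairs matching any signature."""
    hits = []
    for line_no, stmt in enumerate(statements, start=1):
        if any(p.search(stmt) for p in SUSPICIOUS_PATTERNS):
            hits.append((line_no, stmt))
    return hits

# Invented sample of logged statements, one per line.
sample_log = [
    "SELECT * FROM PAGES WHERE CONTENT LIKE '%climate change%';",
    "SELECT * FROM PAGES WHERE CONTENT LIKE '%climate change' "
    "UNION SELECT * FROM USERS WHERE USERNAME LIKE '%';",
    "SELECT * FROM USERS WHERE USERNAME = 'x' OR '1'='1';",
]

flagged = find_suspicious(sample_log)
```

A scanner like this only catches patterns someone thought to list, which is why it complements prevention rather than replacing it.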

But now that bit of detection, at least at first glance, looked like it might have become the source of our current troubles.  Because the file contained every SQL statement from a given day, it also inadvertently contained the values which users had entered as passwords.  When the logs were set up to automatically record every database query issued by the system, it never occurred to John and Doug that some of those values might be unencrypted passwords, that the files might not be hidden from public view because of where they were being written, or that someone some day might find a way to locate one of them.  Anyone who somehow got at a file would have the username and password of every person who'd logged on during the period covered by that particular log file.

This was the first hole we found, and at first glance it looked like it could have been the source of our troubles.  After all, the hacker had pointedly released the location of one of those files, and the hacker had gotten in somehow.  It made simplistic sense that this was how it was done.

But, on further thought, that explanation made no sense. The log files only contained a small portion of the database. The hacked files contained the entire user database and every forum thread ever posted. It was impossible to have obtained the entire forum from a few days of SQL queries. How could he have stolen the entire table of registered users?  Each log only contained one day's worth of updates, and they were migrated off of the site within days of creation.  And how could he have constructed the "enhanced" version of the forum from only the log files, or an ID extracted from the logs?  It didn't make sense.

The SQL injection log file wouldn't have been easy to get, either.  As a log file, it should have been in a secure place, accessible only to a site administrator.  Unfortunately, the file was inadvertently made accessible to everyone, but only if you knew the exact directory location and file name.  There were no directory listings of the site available through the Internet.  The logs directory was locked down, so if you tried to get a list of the available log files, you got the standard 403 "Forbidden" error.  You had to know exactly where a file was and what it was named to grab it.

One could randomly try various combinations of file names, hoping to guess at the naming convention used for any log files, but that is far more difficult than it seems.  Are the files .txt, .log or .zip files?  Or .tar or .gz or something else entirely?  Are they hourly or daily or weekly or monthly?  If daily, is the date format year-month-day, or month-day-year?  Is the day Gregorian or Julian?  Is the year two digits or four?  Is the month numeric, or a three-letter abbreviation?  Did any character delimit the date segments, and if so what character?  Is there also a prefix to the name, like "log" or "dblog" or "sqllog"?  The number of permutations makes it almost impossible for someone to have guessed at the existence, location and exact name of any actual log files.
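A back-of-the-envelope count shows how quickly those choices multiply.  The Python sketch below is only illustrative: every option list, the number of candidate directories, and the single day being guessed are assumptions, not a record of what anyone actually tried.

```python
from itertools import product

# Every option list below is an assumption made up for this sketch.
prefixes = ["", "log", "dblog", "sqllog", "sql", "db"]
extensions = [".txt", ".log", ".zip", ".tar", ".gz", ".tar.gz", ".sql"]
date_orders = ["ymd", "mdy", "dmy"]
year_styles = ["2-digit", "4-digit"]
month_styles = ["numeric", "abbrev"]
delimiters = ["", "-", "_", "."]

# One naming convention is one choice from each list.
conventions = list(product(prefixes, extensions, date_orders,
                           year_styles, month_styles, delimiters))
conventions_per_directory = len(conventions)  # 6 * 7 * 3 * 2 * 2 * 4 = 2016

# The directory name is unknown too; assume 50 plausible guesses,
# and assume the attacker only tries a single day's file.
candidate_dirs = 50
days_to_try = 1
total_requests = conventions_per_directory * candidate_dirs * days_to_try
```

Even with these short lists, guessing one day's file means on the order of a hundred thousand requests, each of which would leave a failed entry in the server's access log.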

For clarity, the particular SQL statement log file explicitly named in the hack was 2012-03-21.txt, stored in a directory named logs.

If there were any doubt, however, there would have been a long record of failed access attempts, looking for wrong names, in the system's access log files — there would have been a string of 403 and 404 errors.  There were none.  That was not how he did it.  There must have been another way in.
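Checking for that kind of trail is straightforward in principle.  The sketch below assumes an Apache-style access log in the common log format, with invented sample lines and documentation-range IP addresses; the real server's log format and paths may well have differed.

```python
import re
from collections import Counter

# Assumes the common log format; a different server would need a
# different pattern.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)

def failed_probes(lines, path_prefix="/logs/"):
    """Count 403/404 responses per client for paths under path_prefix."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that do not parse
        if (m.group("path").startswith(path_prefix)
                and m.group("status") in ("403", "404")):
            counts[m.group("ip")] += 1
    return counts

# Invented sample lines, for illustration only.
sample = [
    '203.0.113.5 - - [21/Mar/2012:10:00:00 +0000] "GET /logs/ HTTP/1.1" 403 199',
    '203.0.113.5 - - [21/Mar/2012:10:00:02 +0000] "GET /logs/log-03-21-12.txt HTTP/1.1" 404 208',
    '198.51.100.7 - - [21/Mar/2012:10:01:00 +0000] "GET /index.php HTTP/1.1" 200 5120',
]

probes = failed_probes(sample)
```

A guessing campaign would show up as one or a few clients with very large counts; an empty result is what the absence of guessing looks like.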

Alan Bradley: I still don't understand why you want to break into the system.
Kevin Flynn: Because, man, somewhere in one of these memories is the evidence!  If I got in far enough, I could reconstruct it.
— Tron (1982) —

March 22, 10:41 PM, CST — Reflections of the Heartland Affair

The announcement of the hack was curious.  First, it was made on a very obscure pseudo-skeptical blog, one which receives almost no comments and in fact has little content other than the posting of links to various and sundry articles elsewhere on the web on both skeptic and conservative political talking points.

Beyond this, the phrasing and the claims of the announcement were absurd.

Dear Friends:

In the interest of transparency, I think you should see these files from Skeptical Science.

An anonymous whistleblower has brought to my attention some database logs and other files (e.g., http://www.skepticalscience.com/logs/2012-03-21.zip (the current day is txt, past days zip)). These files detail everything that happens on the site, from forum conversations to user accounts. I have collated some of the data in a more readable form.

This is a plausible-sounding distraction, for the eagerly gullible.  The data provided could never have come just from the log files.  They don't go back far enough, since the hack contained data back to the site's inception in 2007.  They are also stored off-site and deleted every few days from the web site itself.  One simply could not reconstruct the entire database from a handful of logs, or even several months' worth, or even every log ever generated.  If the hack had released a day or a week or even a couple of months' worth of data, then this claim might have been credible.  But it did not.  The hack released data, complete data, going back years.

One thing that has always been of interest was that the Skeptical Science hack was announced just a little over a month after the Heartland Affair began, on February 16, 2012.  Not coincidentally, the words announcing the Skeptical Science hack vaguely mirror the first lines of the DeSmogBlog post releasing the Heartland documents:

An anonymous donor calling him (or her)self "Heartland Insider" has released the Heartland Institute's budget, fundraising plan, its Climate Strategy for 2012 and sundry other documents (all attached) that prove all of the worst allegations that have been levelled against the organization.

The first thing of interest here is that only two people on earth knew about the database logs, John Cook and Doug Bostrom, so there is no way it could have been an anonymous whistleblower.  "In the interest of transparency" is also a laughable start.  Beyond this, the Skeptical Science hack release got really silly:

Why has SkS chosen to publish all this on the public internet? Is it the first step towards transparency, or a catastrophic error? This is what I first intended to ask Mr. Cook.

Publish?  Bypassing security, uploading programs, hijacking accounts, dumping the entire database, and then reformatting it to look like the forum with personal information included is the equivalent of SkS choosing to "publish all this on the public internet [sic]?"

Then there was this:

This "leak" is just a format conversion of already public material.

As shown earlier, this statement is a falsehood.  It is simply impossible to recreate the entire user and forum database from a few days' worth of SQL queries.  Yet this false narrative was quickly accepted and adopted by those who uncritically wished to believe it.

I don't want to commit theft or forgery which, as I understand, would be required to raise this to the heroic level, but you gotta start somewhere. This is an anonymous leak per the standard, but I will consider stepping bravely forward if I get caught.

"Heroic level?"  "Step bravely forward?"  "Anonymous leak per the standard?"

And exactly how does someone step bravely forward, but only if they've been caught?  I mean, if they've been caught, they're being dragged forward.  It's a little late to step forward.  Or does he mean he'll be dragged bravely forward?  Or not run, if caught?

I found it difficult to read much of the discussion, so some crowdsourcing is needed here. To a layman it seems a dark and unpleasant world these people are living in. It's unlike anything on the more skeptically oriented mailing lists I have followed.

Once again, this paralleled comments on DeSmogBlog (not on the original release, but in a subsequent article):

We are releasing the entire trove of documents now to allow crowd-sourcing of the material.

Included amidst this curiously crafted diatribe were two links, one to the previously mentioned SQL injection attack log and another to a zip file containing the cloned-plus forum, the user table dump, and other files.

SID 6.7: Uh-uh-uh. I thought of that one. Better try again... Faster.
— Virtuosity (1995) —

February 25, 12:33 PM AEST — Another Day, Another Dollar

It was two days after The German had briefly opened the forum, and almost a whole month before his efforts would be released on a pseudo-skeptic's blog.  The German returned on February 25, 2:33 AM local time (CET), again using Tor, although the browser was now reported to the Skeptical Science web site as Chrome.  Perhaps he’d upgraded his version of the Tor browser, in the hopes of improving performance during long downloads. Perhaps he'd abandoned the Tor browser and was accessing Tor in a more sophisticated way, one which let him choose the browser he used during any incursion.

The slowness of the Tor connection meant that it was several minutes before he could start work.  He first viewed the recent comments on the site for two full minutes, and then began downloading all of the SQL injection database logs from the 23rd and the 24th.  He didn’t need a hacked user ID to get these, as has been explained, because they were erroneously accessible to anyone if you knew exactly where to look, which the hacker did.  He tried to get the log for the 25th, which didn’t exist yet (in zipped form), as well as the logs for the 20th through the 22nd, which had already been moved to an off-site backup area that he didn’t know about and couldn’t reach.  Downloading the files (6 megabytes each) through Tor, with its slow performance, took close to twenty minutes.

After that he went straight to the forum, using his original, hacked administrative ID.  As this was his first return since his activity on the 23rd (the 22nd in his time zone), he immediately went to the thread titled “Attempted SkS hacking happening right now,” which he read for two and a half minutes.  The title no doubt gave him a jolt, but finding that the thread was only about a SQL injection attack must have set him at ease.  Satisfied, if perhaps still a bit rattled, that we hadn’t really caught on at all, he moved on to his favorite subject, reading the thread titled “Heartland releases emails and Gleick's story still checks out”.  He seems to have missed the thread that discussed the way he had opened the forum to public access for a few scant hours.

He must have taken a break at that point.  It’s quite possible he visited the site using his real IP address, without the hindrance of Tor, but there would be no way to connect the dots from all of the thousands of IP addresses which visited the site in that short period.  He returned more than two and a half hours later to briefly pull up Dana Nuccitelli’s Skeptical Science user table entry, using the administrative panel, and then he quit for the day.

Razor: Remember, hacking is more than just a crime. It's a survival trait.
— Hackers (1995) —

February 26, 2:30 PM AEST — And Another, and Another, and…

On February 25th in Germany, the hacker did nothing more than look at the database user entry for John Mashey.  On February 26th the hacker stayed away.

On February 27th, he downloaded a SQL injection log file, the smaller, zipped version for February 26th.  That was it.  He didn’t even bother to look around.

The German didn’t return until March 5th in Germany, at 1:26 AM CET.  That day he got very busy.  First, he grabbed the zipped log file from March 3rd.  It took a full forty minutes to download all 6.8 megabytes via Tor.  Then he uploaded another program that he had written, one named “u1”.  For this visit, he’d clearly prepared some programs ahead of time.  They weren’t the same interactive programs he’d been using to this point.  They had a specific, predetermined purpose, which was executed upon command without further direction.

“u1” didn’t run correctly the first time.  He had to change it, re-upload it, and re-run it four more times before he got it right.  Then he downloaded the zipped database, and uploaded a second prepared program, “u2”.

“u2” removed “u1”, and itself.  The hacker verified the result by accessing “u2” a second time, which happily gave him the desired 404 — file not found — response.

The process had taken two hours and three minutes.

He didn’t return until Thursday, March 8th, at 3:25 PM CET, and was immediately frustrated to find out that the March 4th SQL injection log was no longer available, and the March 8th log wasn’t yet available.  It shouldn’t have mattered, but I suppose he wanted to be certain.  His plan of action was now clear.  He’d use one database dump, and then amend that, over time, by applying the SQL injection log files.  Each log file would, in turn, bring the database up to date with that log file.  As long as he grabbed all of the log files, and had one full database dump, he could always reconstruct the entire database.
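The idea of one full dump plus replayed logs can be sketched in a few lines of Python.  This is only an illustration of the technique under stated assumptions: the table, the statements, and the dates are invented, SQLite stands in for whatever database the site really used, and nothing here reflects the hacker's actual tooling.

```python
import sqlite3

def rebuild(dump_sql, daily_logs):
    """Load a full dump, then replay each day's logged writes in date order."""
    db = sqlite3.connect(":memory:")
    db.executescript(dump_sql)
    for day in sorted(daily_logs):  # ISO dates sort chronologically
        for stmt in daily_logs[day]:
            # Only statements that change data matter for reconstruction.
            if stmt.lstrip().upper().startswith(("INSERT", "UPDATE", "DELETE")):
                db.execute(stmt)
    db.commit()
    return db

# Hypothetical dump and logs, invented for this sketch.
dump = """
    CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT);
    INSERT INTO comments VALUES (1, 'first comment');
"""
logs = {
    "2012-03-06": ["UPDATE comments SET body = 'edited' WHERE id = 1;"],
    "2012-03-05": ["SELECT * FROM comments;",
                   "INSERT INTO comments VALUES (2, 'second comment');"],
}

db = rebuild(dump, logs)
rows = db.execute("SELECT id, body FROM comments ORDER BY id").fetchall()
```

The sketch also shows why a missing day matters: skip one log and every later replay is applied to a database that is already out of step.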

For safety, The German first downloaded all log files that were available, those from March 5th, 6th, and 7th.  That took him nine minutes.  Then he uploaded and ran his “u1” program again, grabbed a fresh copy of the database, and ran “u2” to erase his tracks.  This foray had lasted 59 minutes.

From then on, the hacker made certain to visit every few days, only missing once, on the 13th.  At first he tried to list the logs directory, but only got the 403 Forbidden error for his trouble.  Uploading his directory listing program each time apparently was too much trouble, so instead he simply hunted and pecked to find out which files were available.

Each day he would grab what copies he could of recent logs, the smaller, zipped versions when available, but also the larger text versions, just in case.  He would grab files he already had, in case a previously downloaded copy was incomplete.  Because the logs weren’t properly secured, he didn’t need to use any of his hacked user IDs, but he was still always careful to get into the site via Tor, to mask his identity.

The German quietly, patiently, visited and revisited the site.  He no longer bothered looking at any forum threads, at least not on the site itself.  He could browse them at leisure on his own copy of the database.  At Skeptical Science, he just methodically kept grabbing the data.  He had found a convenient way of continuously stealing every bit of content in the system.

To be continued...

Post updated to more accurately reflect the age of the SQL injection logging feature.

2013-02-28

I misunderstood the timeline of the logging feature in the Skeptical Science code.  The feature was actually implemented in early 2010.  It does not, however, substantively change the reasoning.  It would not have been possible to construct the deleted comments and user table contents from the logs, as they include information prior to 2010.  It is also a near impossibility that someone guessed the location of the log files and succeeded in downloading every single file for two full years, without missing one, in order to reconstruct the complete forum contents from those files, especially when they did not have an empty database structure with which to start, and there would have been database structure changes interwoven with, but not reflected within, the logs.  Lastly, our server log files clearly show that no such downloads were ever made, except as a result of the hack and then the subsequent announcement of the hack.

 
