2016-04-03

Aaron Swartz would be proud of Alexandra Elbakyan.

The 27-year-old is at the center of a lawsuit brought by a leading science publisher that is labeling her a hacker and infringer.
Courtesy of Alexandra Elbakyan
Stop us if you’ve heard this before: a young academic with coding savvy has become frustrated with the incarceration of information.
Some of the world's best research continues to be trapped behind subscriptions and paywalls.

This academic turns activist, and this activist then plots and executes the plan.
It's time to free information from its chains—to give it to the masses free of charge.

Along the way, this research Robin Hood is accused of being an illicit, criminal hacker.
This, of course, describes the tale of the late Aaron Swartz. His situation captured the Internet’s collective attention as the data crusader attacked research paywalls.
Swartz was notoriously charged as a hacker for trying to free millions of articles from popular academic hub JSTOR.

At age 26, he tragically committed suicide just ahead of his federal trial in 2013.
But suddenly in 2016, the tale has new life. The Washington Post decries it as academic research's Napster moment, and it all stems from a 27-year-old bioengineer turned Web programmer from Kazakhstan (who's living in Russia). Just as Swartz did, this hacker is freeing tens of millions of research articles from paywalls, metaphorically hoisting a middle finger to the academic publishing industry, which, by the way, has again reacted with labels like "hacker" and "criminal."
Meet Alexandra Elbakyan, the developer of Sci-Hub, a Pirate Bay-like site for the science nerd.
It's a portal that offers free and searchable access "to most publishers, especially well-known ones." Search for it, download, and you're done.
It's that easy.
"The more known the publisher is, the more likely Sci-Hub will work," she told Ars via e-mail.

A message to her site's users says it all: "SCI-HUB...to remove all barriers in the way of science."
"Guerilla Open Access Manifesto"
Swartz found himself in the crosshairs of criminal hacking charges in a US court of law for being caught liberating the JSTOR research database.

Elbakyan, by contrast, finds herself entwined in a US copyright and hacking lawsuit (PDF) brought by one of the world's leading scientific publishers, New York-based Elsevier.

That's the same publisher Swartz named in his 2008 "Guerilla Open Access Manifesto," a brief paper extolling the virtues of illegally freeing scientific research stuck behind the paywall.

Elbakyan says Swartz was not a direct source of inspiration for Sci-Hub, but she's happy to note the two share the same goal of open access to science literature.
"I also found the idea of open access in science very inspiring, and I even dreamed of start(ing) my own open access journal," she said. "That was a year before I created Sci-Hub.

And it was not related to Aaron, [but if Swartz were alive], who knows—perhaps he would became one of my good friends and collaborators? His writing on open access is good."
The civil hacking and copyright infringement case against Elbakyan has been going on for months.

To the consternation of Elsevier's attorneys, she altered the site's URL from sci-hub.org to sci-hub.io and changed others because of a court order blocking the .org domain.

Elbakyan rarely even bothered to respond in court to the ongoing New York federal lawsuit—after all, she lives overseas and isn't worried about US law.

For now, she said she'd only actively participate in the lawsuit if one condition was met: "If there will be lawyers who are interested in the case for the sake of idea, not money."
For Elbakyan, that's what this is all about—ideas.
In her own words, here's why she built Sci-Hub in 2011:

I started the website because it was a great demand for such service in research community.
In 2011, I was an active participant in various online communities for scientists (i.e. forums, the technology preceding social networks and still surviving to the present day). What all students and researchers were doing there is helping each other to download literature behind paywalls.
I became interested and very involved.

Two years before, I already had to pirate many paywalled papers while working on my final university project (which was dedicated to brain-machine interfaces).
So I knew well how to do this and had necessary tools. After sending tens or hundreds of research papers manually, I wanted to develop a script that will automate my work.

That's how Sci-Hub started.

The first users of the script were members of the online forum about molecular biology.
At first, there was no goal to make all knowledge free.

The script was simply intended to make the life of researcher easier, i.e. to make the process of unlocking papers more fast and convenient. But this turned out to be such an important improvement it changed the way research was accessed in our community.

After some time, everyone was using Sci-Hub.

Publishers of academic research made a combined $10 billion last year, much of it funded by university research libraries.
Subscription fees range from the thousands for single titles to millions for bundled packages.

Annual profits hover around 30 percent.

Currently, Elbakyan's site freely doles out tens of thousands of journal articles per day to millions of annual visitors.
Elsevier says it's not the bad guy.
It boasts all sorts of free-access programs for its literature, from allowing authors to share a link of the work for 50 days to providing free access to journalists.

The organization also grants its university subscribers the option to allow free, walk-in access.

Elsevier even comports with a rule requiring National Institutes of Health-funded research to be publicly available no later than 12 months after final publication.

The company said it published some 20,000 open access articles last year that were not behind a paywall. Overall, it published 400,000 manuscripts in 2,500 journals via 700,000 peer reviewers.

The Robin Hood coder
With all the different protections in place with these varied journals, how did Elbakyan ensure Sci-Hub users get access to zillions of articles? While hesitant to provide all the details, here's what she could share:

The project works by downloading content from university proxies.
It is the same technology anonymizer websites use. You need proxy of the subscribed university to be able to download the content.

The script will iterate through tens of different universities, trying to locate one that has subscribed.
Some papers can be downloaded only from one university out of 30, for example.
I would also note that university proxies are different from ordinary ones that are used by anonymizers, so I had to implement their support.

Though the algorithm itself sounds simple, and indeed the first working alpha version of the project was drafted by me in three days, by 2016 the project grew into complex system with lots of code implementing various features.

When asked whether she has insiders at universities supplying passwords, Elbakyan also had to decline. "That is confidential."
For his part, Swartz never released the millions of papers he obtained in 2010 from JSTOR. His collection process started out a little more hands-on.

Although Swartz was a Harvard fellow, he chose MIT and created a guest account at the school, accessed JSTOR, and executed a program called "keepgrabbing" that would allow a massive number of downloads at a time.

This happened to be in violation of JSTOR's terms of service. MIT eventually figured it out and blocked the IP address from where this was happening.
Swartz changed IP addresses, but again it was blocked.

This game continued a few times, and JSTOR subsequently blocked MIT's access to the database.

At his most extreme, Swartz was even accused of hardwiring his computer into the network from a network closet in an MIT basement.
Elsevier markets itself as a leading provider of "information solutions" in the science, medical, and health sectors. Just as JSTOR did with Swartz, Elsevier believes Elbakyan is breaking the rules.

The company believes Sci-Hub violates its copyright to its subscription database called "ScienceDirect," and it sees Elbakyan as a hacker in violation of the Computer Fraud and Abuse Act.
The lawsuit says ScienceDirect is "home to almost one-quarter of the world's peer-reviewed, full-text scientific, technical, and medical content." And according to Elsevier, Sci-Hub exploits access features in the ScienceDirect service.

According to the lawsuit:
"Elsevier maintains the integrity and security of the copyrighted works accessible on ScienceDirect by allowing only authenticated users access to the platform. Elsevier authenticates educational users who access ScienceDirect through their affiliated university’s subscription by verifying that they are able to access ScienceDirect from a computer system or network previously identified as belonging to a subscribing university," according to the lawsuit. "Elsevier does not track individual educational users’ access to ScienceDirect. Instead, Elsevier verifies only that the user has authenticated access to a subscribing university."
The lawsuit describes the Sci-Hub operation as "an international network of piracy and copyright infringement by circumventing legal and authorized means of access to the ScienceDirect database. Defendants’ piracy is supported by the persistent intrusion and unauthorized access to the computer networks of Elsevier and its institutional subscribers...."
What's more, the suit says that Sci-Hub has unlawfully obtained "student or faculty access credentials which permit proxy connections to universities which subscribe to ScienceDirect" to obtain "copyrighted scientific journal articles therefrom without valid authorization."
Joseph DeMarco, Elsevier's attorney, told Ars that Elbakyan "and her confederates" have illegally obtained university student login credentials without some students' consent. "We believe she is expropriating identities," DeMarco said. "It's certainly possible that some people are giving her their credentials. We have reason to believe that many students are not willingly giving her their credentials."
A federal judge has already ordered Elbakyan to shutter the site; she didn't. "I think everyone is a little disturbed that she is basically flagrantly violating a judge’s order," DeMarco said. "I think what she’s doing does show you she is perfectly willing to violate US law. Whether those laws are criminal or civil in a sense is an academic discussion."
Capitalism and the moral dilemma
Regardless of US law, Elbakyan doesn't believe what Sci-Hub does is wrong.

For her, there's no moral dilemma about pilfering science journals.

Since 2011 I've encountered extremely wide range of opinions on ethics of Sci-Hub among website users.
Some say: to steal is not good, but what else do I have to do if I do not have money to pay for papers I need for research? Other(s) say: to steal is good, but it should be done quietly.
It is better to hide my identity and not to reply to interview requests.

Another opinion: of course copying and free distribution of information is not theft; however, the majority of people are not going to understand that.

The world is broken, and money always wins.
So it is better to hide our activities and not to reply to interview requests. And etc.
I would like to reference Robert K. Merton, the founder of sociology of science. He studied ethos of research communities.

And what he found is that communism is one of the four important ethical norms (along with universalism, disinterestedness, and organized skepticism) that makes science work.

By communism, he meant the common ownership of scientific discoveries, according to which scientists give up intellectual property in exchange for recognition.
I think that his work is very relevant to what is happening in science now.
So it sounds weird for me when someone says Sci-Hub is unethical; because what is really unethical is to restrict access to scientific information, and for what reason? To make money! Someone can argue that publishers need to pay for expenditures, however I see that research papers published more than 20 years ago are also behind paywalls; it is hard to believe that expenditures to publish these papers are still not covered by 2015.

If those ideas feel familiar, it's not unjustified déjà vu.

The themes of her comments have an uncanny resemblance to Swartz' manifesto. Consider the first half of the Swartz manifesto:

Information is power.

But like all power, there are those who want to keep it for themselves.

The world's entire scientific and cultural heritage, published over centuries in books and journals, is increasingly being digitized and locked up by a handful of private corporations. Want to read the papers featuring the most famous results of the sciences? You'll need to send enormous amounts to publishers like Reed Elsevier.
There are those struggling to change this.

The Open Access Movement has fought valiantly to ensure that scientists do not sign their copyrights away but instead ensure their work is published on the Internet, under terms that allow anyone to access it.

But even under the best scenarios, their work will only apply to things published in the future.

Everything up until now will have been lost.

That is too high a price to pay.

Forcing academics to pay money to read the work of their colleagues? Scanning entire libraries but only allowing the folks at Google to read them? Providing scientific articles to those at elite universities in the First World, but not to children in the Global South? It's outrageous and unacceptable.
"I agree," many say, "but what can we do? The companies hold the copyrights, they make enormous amounts of money by charging for access, and it's perfectly legal—there's nothing we can do to stop them." But there is something we can, something that's already being done: we can fight back.
Those with access to these resources—students, librarians, scientists—you have been given a privilege. You get to feed at this banquet of knowledge while the rest of the world is locked out.

But you need not—indeed, morally, you cannot—keep this privilege for yourselves. You have a duty to share it with the world.

And you have: trading passwords with colleagues, filling download requests for friends.

Issues of legality aside, do the two hacktivists have a point? Should scholarly academia—science journals in particular—be behind paywalls? The answer, obviously, changes depending on who you talk to.
"It's an issue of 'that’s how the world works.' Does it help or hurt the scientific process? I think it does both," Steven Aftergood, an electrical engineer who directs the Federation of American Scientists' Project on Government Secrecy, told Ars. "It helps by elevating material of particular value and of improving it for publication. And it hurts by excluding those who can’t get access because of the subscription fee."
Andrew Kniss, an associate professor of weed biology and ecology at the University of Wyoming, told Ars in an e-mail that this question has moved him to only publish research in open access journals from now on.
"My research is very applied, and thus it often has potential to be useful by people who don't have access to the scientific literature. Putting my work behind a paywall, I feel, is not consistent with my goals as a public researcher," he said. "But I'm also not in a position to 'shame' other researchers for not doing the same. I've come to my position fairly recently, and there are some legitimate reasons, I think, for those paywalls to exist. Especially for science journals in niche areas, for example."

Internet domain Whac-A-Mole
For now, the lawsuit continues to linger.

And ignoring the philosophy at play, Elsevier and DeMarco can see some parallels between Swartz and Elbakyan.
"The similarities that there are, are somewhat superficial. They both stole digital content of scholarly journals," he said. "They both felt they were above the law and entitled to their own interpretation of how the law should be. I think that's where the similarities end."
Soon, DeMarco will ask a federal judge to order that the newest Sci-Hub domains be terminated from the Internet.
"Since the issuance of the Preliminary Injunction, Elsevier has become aware of a number of additional domains being used by the Defendants in this action to continue their infringing activities and, in so doing, violate the terms of the Preliminary Injunction," DeMarco wrote in a recent court filing. "Elsevier is in the process of investigating these new domains and may, should it be appropriate to do so, submit a motion requesting that the Preliminary Injunction be expanded to include those domains."
But just as it played out before, this may not stop Elbakyan.
In the face of previous legal threats, she simply dropped her .org domain for another bit of Internet real estate.

And this latest filing is unlikely to convince this programmer to show up in a US courtroom.
In this sense, Elbakyan's saga may begin to parallel another famous Internet tale.
She's engaged in a game of domain Whac-A-Mole with major US companies and courts.
If that sounds familiar, it should.

The Pirate Bay has been playing the same game for a decade—and that pirate ship is still afloat.
