2014-11-03

Remember how Google has stated one of its corporate goals is organizing the world’s data (whether the world likes it or not)?  Well…they didn’t say that last part.

But in a way they did say exactly that.  Because that is where all of this leads you, whether it’s the YouTube data honeypot, the Google Books data honeypot, or the search data honeypot, not to mention Gmail, Google Voice, Maps, Wallet, the works. It’s the data, stupid.

And Google largely gets the data for free.  For the music business, this is the “we’re not going to let another MTV build a business off our backs” take 22.

Why is MTV a good analogy?  Because until MTV made enough progress that they didn’t need music videos anymore, record companies bore the entire cost of producing MTV’s core product which MTV essentially sold back to us.

The same is true of Google’s non-display uses of music, movies, television and books.  While they have us arguing about copyright issues in Google’s game of data mining thimblerig, they are making bank on the data they scrape from the users we drive to their properties.

I have banged the table on this issue for years and I really think that very few people in our business understand how important it is. Here’s the point:  Google combines all of the data it scrapes from all of its products in the background and then uses that to sell advertising and other media.

If you were wondering why it’s worth it for Google to fight us on 30 million takedown notices a month and the money-losing Gmail product, that’s why. They are harvesting data from our users, paying us peanuts for our works of authorship, and then selling the data, including selling the data back to us in one form or another.  And now Pandora is doing the same thing–and if we’re stupid enough to take that hillbilly deal, why wouldn’t they?  Wakey wakey.

Thankfully, as I will show later, we have some good analysis on the issue as it relates to Gmail and Google Apps for Education, Google for Work and other Google products.  All of which leads to this conclusion: Google is now not only achieving a dominant position in search, video search (YouTube), etc., etc., but they are also achieving near total dominance of the data. Here are two new ways that they are solidifying data dominance even further–“free” Google wifi at Starbucks and the NY public school system.

Starbucks, the Barista Data Dominators

If you are a habitué of Starbucks, you may have noticed that Google is providing wifi service.  If that doesn’t send a chill through you, think again.  Look for this screen:

[screenshot]

Let’s take a closer look at this:

[screenshot]

So whenever you see anything “free” from Google, you can bet that Google’s much-hated privacy policy will apply.  When you click on that link, here’s where it takes you:

[screenshot]

In case you were wondering, this is the privacy policy that allows Google to scrape a continuous stream of damn near any information touched by a Google product and use it for Google’s data mining operations–essentially forever.

As we saw in the Google Gmail Litigation, Google’s privacy policy applies to nearly all of Google’s products, including Google Apps for Education. By harvesting data from users of Google Fiber, including Google’s WiFi services at Starbucks, Google can harvest indirectly that which they may not be permitted to scrape directly.  (See my article from Texas Lawyer regarding data harvesting from attorney-client communications.)

And if you live in Austin or any other city where Google is launching Google Fiber, you, too, will be doing your bit to help Google’s data domination if you use Google Fiber. This back door attack through broadband is highlighted when Google provides the broadband connection to Starbucks using the same Google Fiber broadband service that Google is rapidly installing in homes across America.

But it’s particularly instructive if you were deciding whether to give Google access to broadband for a government agency, say for example, the New York City school system.

When YouTube is Not Enough: Uncle Sugar Wants to Organize the Data From Your Schools

Eric Schmidt, or as he’s known at MTP, “Uncle Sugar”, managed to get himself made a member of New York’s “Smart Schools Commission”.  And what does Uncle Sugar think New York schools need the most?  Broadband. And you know what Google just happens to have?  Broadband.

Dang, y’all.  What a coincidence.  (I wonder if Uncle Sugar has ever been inside a New York public middle school, for example.)

How is New York going to pay for this broadband that it needs so desperately?  (That will solve the problem of 14,000 teachers being attacked annually, high school students being sold or selling drugs at school, or the 40% of New York teachers who believe bad behavior interferes with teaching, for example, according to the U.S. Dept. of Education.)

That’s right–the taxpayer is going to borrow the money!  In New York City, the most taxed city in America, if not the world.  Because you know what’s cool?  BILLIONS are cool.

“There are more than 500 schools that don’t have one connection that meets the broadband criteria that most of you have in your homes. That is a tragedy,” Schmidt said during an appearance in Mineola, WCBS 880 Long Island Bureau Chief Mike Xirinachs reported.

Schmidt is a member of the state’s Smart Schools Commission, which is proposing a $2 billion plan to bring broadband and other integrated technology into every public school. The Smart Schools Bond Act is on next week’s ballot.

So you don’t think that Google is going to somehow get its paws on scraping data from New York school kids, do you? Mayor de Blasio may be interested in recently passed legislation protecting students from Google in California that Google opposed.  According to Education Week:

The Student Online Personal Information Protection Act, or SOPIPA [interesting acronym], prohibits operators of online educational services from selling student data and using such information to target advertising to students or to “amass a profile” on students for a non-educational purpose.  [Which is exactly what Google does with its Apps for Education product, Gmail, Google Apps for Work or any product that uses its Content OneBox technology.]

The law also requires online service providers to maintain adequate security procedures and to delete student information at the request of a school or district.  James Steyer, CEO and founder of Common Sense Media, a San Francisco-based nonprofit that helped craft the law, described SOPIPA in an interview with Education Week as the nation’s “first truly comprehensive student-data-privacy legislation” and said he expects it to become a model for other states around the country.

“It’s a major step forward in creating a trusted online learning environment,” Steyer said. “I think this is a blunt call to industry to say that school data is for educational purposes.  Period.”

Protecting student data has become an increasingly contentious issue in recent months, with parents and activists expressing growing concern about the nature and volume of digital data on children that schools now share with third-party vendors [which is what Starbucks does]….

California’s new law is unique in putting the responsibility for ensuring the privacy of student data on industry. Governor Brown also signed into law a related bill that would require districts’ contracts with vendors to include certain privacy-related provisions. Steyer said that many of the “major players” in the ed-tech industry attempted to “water down” SOPIPA during the legislative process, but praised the bill’s sponsor, state Senate President Pro tempore Darrell Steinberg, a Democrat, for “standing up to the tech industry and saying ‘no.’” The final legislation does include some key accommodations to industry concerns, such as specifying that operators be allowed to maintain and use “de-identified” or anonymous student information to develop and improve their own educational products and services.

Student data mining is big business.  After student data mining startup InBloom hit the wall, the Electronic Privacy Information Center (a group we have a lot of time for at MTP) told the New York Times in a letter to the editor:

The recent collapse of inBloom, the student data company, is a powerful reminder that in the era of Big Data, privacy still matters. As we see it, the problem was not misunderstanding by the public, but a lack of meaningful privacy protections.

The Department of Education also bears some responsibility…. Instead of defending important privacy laws that help protect student data, the department chose to loosen the rules so that private vendors could pull sensitive data out of local schools. Schools were also encouraged to collect far more information than they had in the past. Parents did not know what information was being collected, who would have access to it or what impact it might have on their children’s future. Not surprisingly, many objected.

The Education Department could help restore confidence in these data-intensive programs by strengthening privacy rules and establishing a Student Privacy Bill of Rights. Students should know what information about them is being collected and how it is being used. And schools should be more cautious about turning over their students’ data to others.

MARC ROTENBERG AND KHALIAH BARNES

Mr. Rotenberg is president of the Electronic Privacy Information Center, and Ms. Barnes is director of EPIC’s Student Privacy Project.

While I don’t disagree with the EPICs that a federal solution might be elegant, forgive me if I am skeptical that the federal government will ever go after Google about anything that Google doesn’t want to be gone after by the federal government.

After the $500,000,000 fine that Google paid for violating the Controlled Substances Act, I’m sure that Google and its lawyers like former Deputy Attorney General Jamie Gorelick decided that would never happen again.  And if you don’t believe me, remember that the Department of Justice did apologize to Google for honest statements about Larry Page’s role in the case, among other things.

That’s why it’s encouraging to see that those with concerns about school systems being conned into paying for better access to data mining are going the state route.  At least for the moment, Google doesn’t control all the state legislatures, although they’re certainly trying to do that, and Google Fiber is one vehicle for doing it, as we have seen in Austin. As Education Week found with the SOPIPA law, Google’s reaction was rather tight-lipped–there’s a shocker.

A spokeswoman for Mountain View, Calif.-based online-services giant Google, which came under intense criticism earlier this year after acknowledging to Education Week that the company had been scanning and data-mining the contents of student email messages, said she could not comment on pending or active legislation. Google also declined to clarify whether it scans student email messages sent using its wildly popular Apps for Education tool suite in order to build profiles that might be used for commercial purposes other than targeted advertising, as was alleged in a recent lawsuit against the company. California’s new law expressly prohibits vendors from using “information, including persistent unique identifiers, created or gathered by the operator’s site, service, or application, to amass a profile about a K-12 student except in furtherance of K-12 school purposes.” In April, Google announced that it had stopped ad-related scanning of student email messages for advertising purposes.

And why did it do that?  Because Judge Lucy Koh pretty much ordered them to stop as part of the Gmail class action litigation.

Google took the position that their users did not have a reasonable expectation of privacy in their communications through Gmail and Google Apps for Education, relying on Smith v. Maryland–that’s right, the same case that the National Security Agency used to excuse its behavior following Snowdengate.

After reviewing Google’s rather circular arguments that users of Google products (including non-user recipients of email sent using Google products) had consented to Google’s privacy policies because “everyone knows” what Google does (you know, the we’re here, because we’re here, because we’re here, because we’re here defense), Judge Koh rejected Google’s position in her Sept. 26, 2013 order on Google’s motion to dismiss, stating “[i]mportantly, Plaintiffs who are not Gmail or Google Apps users are not subject to any of Google’s express agreements.”  She went on to rule that:

[Google's privacy policies] could mislead users into believing that user communications to each other or to nonusers were not intercepted and used to target advertising or create user profiles. As such, these Privacy Policies do not demonstrate explicit consent, and in fact suggest the opposite.

So you would think that Google would steer way far away from getting in the middle of scraping data from kids, right?  You would think that if you didn’t understand that with Google, it’s all about the Benjamins. That data is very valuable.  As Jeff Gould tells us at Medium:

Schools and school districts are another possible source of valuable segmentation data. They publish aggregate data on student test scores, income levels and ethnicity, and are well correlated with other geographically tagged data sources (e.g. by ZIP or Census district). Google says that its Google Apps for Education (GAFE) service has 30 million users worldwide, of which many millions are in the United States. GAFE thus gives it a vast pool of users whose profiles it can compare with external data sources.

It would be…straightforward…to compare user clusters derived from GAFE with the data published by schools and school districts. Once calibrated by comparison with external data in this way, Google’s clustering algorithms would no longer need to access that data, which has the disadvantage of being cumbersome to manage and static. Instead, the algorithms could extract on a dynamic basis valuable segments of youth consumers directly from the stream of email flowing into GAFE student accounts. The resulting data could be used to target ads, improve search results or even provide Google advertisers with insights into purchasing trends among fine-grained segments of this population. For example, Google could tell brands in real time what the latest shoe buying trends are among urban teenage boys in selected cities, or which retail fashion brands are preferred by teenage girls whose families fall in a given income bracket and geographical region. [Emphasis mine.]

Repeat after me:  We are not going to make the same mistake we made with MTV.  Which I guess is true.  It’s not the same mistake, it’s a colossally bigger mistake.

It’s the data domination, stupid.
