2016-07-18

Kick Off:

One of the most important ingredients in building a new consumer app is data about your customers. Data about users’ preferences and past behaviours allows the app to be not just useful, but personalized. For a long time, developers could get the data they needed from other web properties. But as the social media landscape matures and consolidates, the big properties are becoming increasingly tight-fisted with their data. As Ben Schippers writes in an interesting TechCrunch story, “Sure, you can have the data at desired speed, but once you hit the API threshold, the feed will go from Niagara Falls to a leaky sink faucet.” The big players need developers to build apps on their platforms to maintain their dominance. So they promise developers access to rich data to lure them in, only to throw up walls once those developers finally start to succeed. I’ve rarely seen as clear an exposition of how the data you and I produce as we use the internet has become a currency, and more importantly, how that currency is controlled by an oligopoly. Ben’s piece is worth reading, and as you do, think about who should own your data, because it’s not an easy question.

This Week:

The battle between robots and humans is coming, and apparently we (collectively) have buried our heads in the sand. Here’s an interesting new chart from a Pew Research study that finds that two-thirds of Americans believe robots will soon do most of the work humans currently do. The catch? Eighty percent of Americans think that won’t eliminate their own jobs. These are interesting questions to put to ourselves: will robots replace lots of work? And why would your job be protected?

——

MIT Technology Review has a good inside look at the servers Facebook uses to solve A.I. problems. Facebook, like Google and lots of other companies, is using huge farms of GPU chips to train its neural networks (the key component in all the new AI solutions). One thing that’s cool is that Facebook has been open sourcing its server designs so others can use them (and improve them). The story notes that Facebook is looking into making its own chips, too, something that Google already does. Making a new chip is incredibly expensive – so this is a sign of how committed Facebook is to AI.

In Industry:

Putting the right story in front of the right readers is the core of the business model for many media companies today. A new tool called Multiworld Testing, created by Microsoft researchers, could help many of them. It implements a technique called reinforcement learning, where a system tries to learn which actions to take to earn a reward. A great example of this problem is deciding which article a web site, like WSJ.com or NYTimes.com, should show you next. The site can play it safe and show you stories that its past data indicates you will definitely like. But then the site never gets to test stories that differ much from those, and it misses out on finding out about your other interests. Many media companies are using traditional A/B testing, but Microsoft’s approach has already provided a 25% lift on clicks for suggested stories in Microsoft News. (Fairly technical write-up here posted by John Langford, an extremely smart researcher at Microsoft.)
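To make the explore-versus-exploit trade-off concrete, here is a minimal sketch in Python. It is not Microsoft’s Multiworld Testing implementation; it uses epsilon-greedy, one of the simplest strategies of this kind, as a stand-in. The class name, the article categories, and the click rates are all made up for illustration.

```python
import random


class EpsilonGreedyRecommender:
    """Hypothetical article picker: mostly exploit the best-known story,
    but explore a random one a small fraction (epsilon) of the time."""

    def __init__(self, articles, epsilon=0.1, seed=0):
        self.articles = list(articles)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.shows = {a: 0 for a in self.articles}
        self.clicks = {a: 0 for a in self.articles}

    def click_rate(self, article):
        # Observed click-through rate so far (0.0 if never shown).
        shown = self.shows[article]
        return self.clicks[article] / shown if shown else 0.0

    def choose(self):
        if self.rng.random() < self.epsilon:
            # Explore: show a random article to learn about it.
            return self.rng.choice(self.articles)
        # Exploit: show the article with the best observed click rate.
        return max(self.articles, key=self.click_rate)

    def record(self, article, clicked):
        # The "reward" is simply whether the reader clicked.
        self.shows[article] += 1
        self.clicks[article] += int(clicked)


def simulate(true_click_rates, rounds=5000, epsilon=0.1, seed=0):
    """Run the recommender against simulated readers whose real click
    rates (an assumption of this sketch) are hidden from it."""
    recommender = EpsilonGreedyRecommender(true_click_rates, epsilon, seed)
    reader = random.Random(seed + 1)
    for _ in range(rounds):
        article = recommender.choose()
        clicked = reader.random() < true_click_rates[article]
        recommender.record(article, clicked)
    return recommender


if __name__ == "__main__":
    result = simulate({"politics": 0.05, "sports": 0.10, "tech": 0.30})
    for name in result.articles:
        print(name, result.shows[name], round(result.click_rate(name), 3))
```

With epsilon set to zero the picker only ever plays it safe, which is exactly the trap described above. A small epsilon costs a few clicks in the short run but lets the system discover that a reader likes something it had no data on. Real systems, presumably including Microsoft’s, also take the reader’s context into account, which this sketch ignores.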

——

Keep your driver’s license handy. This week, Consumer Reports warned that society is not ready for Tesla’s autopilot. The extremely limited data suggests that the accident rate under autopilot is around 1 accident per 130 million miles, while for a human driver it is around 1 per 60 million miles. That makes it seem like the Tesla autopilot is actually better than a person; at the very least, it’s not obviously much worse. So is Consumer Reports being overly cautious? I think so. And I suspect that a lot of future innovations from AI will face this same problem: we will naturally hold automated systems to a much higher standard than humans. Mostly, this is because preventing accidents requires us to restrict people’s behavior, which they don’t like. Restricting an AI’s behavior is much easier, and AIs don’t get a vote. At least not yet.

——

Computer vision is undergoing a renaissance. A technology once used for mundane tasks like reading the amount written on a check at an ATM is now fueling new robotics, facial recognition in social media, and driverless cars. I’ve always been interested in the subject and I just ran across this slide deck by deep learning pioneer Yann LeCun (who now works at Facebook). It’s from a speech he gave at the 2015 Computer Vision Conference, but it’s a great survey of both the history of computer vision and current research and still worth reading through today!

Quirky Corner:

It’s fun to see people talk about how data can be used in sports, such as perfecting a golf swing or tracking all the riders in the Tour de France.

Google’s DeepMind, the team behind AlphaGo, which beat a human grandmaster in Go for the first time earlier this year, is working on building an AI agent as smart as a rat. The video is short, and it’s interesting to hear how they think about testing their AI rat. Hint: pick a rat-behavior study from the 1960s and replace “cheese” with something more palatable to an AI.

What’s Happening at Ufora:

I spoke this past week on a panel called “Datafication in Asset Management” moderated by my friend Peter Knez. It was a lively discussion, and the audience had some great questions about collecting and using alternative datasets as part of an asset management strategy. In particular, someone asked about the legality of building bots to pull data out of websites that require you to log in. Answer: probably OK, but much of the law here is untested.

I’ve been giving a lot of presentations lately and I think this is a good write-up on how to effectively present on data.



Braxton McKee is the technical lead and founder of Ufora, a software company that has built an adaptively distributed, implicitly parallel runtime. Before founding Ufora with backing from Two Sigma Ventures and others, Braxton led the ten-person MBS/ABS Credit Modeling team at Ellington Management Group, a multi-billion dollar mortgage hedge fund. He holds a BS (Mathematics), MS (Mathematics), and MBA from Yale University.

The post This Week in Data by Braxton Mckee (CEO, Ufora) & MLconf Alumni Speaker, Issue #6 appeared first on The Machine Learning Conference.
