2015-10-16


“Yeah, pretty frazzled after a long day writing clickbait headlines. You?” Photo by peyri on Flickr.

You can now sign up to receive each day’s Start Up post by email. You’ll need to click a confirmation link, so no spam.

A selection of 8 links for you. Hand-picked by fingers. I’m charlesarthur on Twitter. Observations and links welcome.

Page weight matters » Chris Zacharias

At YouTube, Zacharias was challenged to get the standard 1.2MB page down below 100KB:

Having just finished writing the HTML5 video player, I decided to plug it in instead of the far heavier Flash player. Bam! 98KB and only 14 requests. I threaded the code with some basic monitoring and launched an opt-in to a fraction of our traffic.

After a week of data collection, the numbers came back… and they were baffling. The average aggregate page latency under Feather had actually INCREASED. I had decreased the total page weight and number of requests to a tenth of what they were previously and somehow the numbers were showing that it was taking LONGER for videos to load on Feather. This could not be possible. Digging through the numbers more and after browser testing repeatedly, nothing made sense. I was just about to give up on the project, with my world view completely shattered, when my colleague discovered the answer: geography.

The explanation is rather smart.
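The extract stops short of spelling it out, so here is a minimal sketch of one way geography could produce such a result. It assumes (my assumption, not something quoted above) that a lighter page becomes usable for people on slow connections who previously never finished loading it, so their long load times enter the data for the first time. The region names and timings are invented for illustration and are not YouTube's figures.

```python
from statistics import mean

# Hypothetical load times in seconds, invented for illustration only.
# With the heavy page, slow-region users give up before the page loads,
# so they contribute no samples at all.
heavy_page = {"fast_region": [2.0] * 90}

# With the light page, fast-region users get quicker AND slow-region
# users now finish loading, so their long times are recorded.
light_page = {"fast_region": [1.0] * 90, "slow_region": [120.0] * 10}

def overall_mean(samples_by_region):
    """Average load time across every recorded sample, ignoring region."""
    return mean(t for samples in samples_by_region.values() for t in samples)

heavy_avg = overall_mean(heavy_page)
light_avg = overall_mean(light_page)
print(f"heavy page: {heavy_avg:.1f}s, light page: {light_avg:.1f}s")
```

In this made-up data every fast-region visitor loads the page twice as quickly, yet the overall average rises because a new, slower audience is now being measured.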

Forbes: a quick adtech video » Medium

Rob Leathern wanted to read an article – you know, one of those text things – on Forbes:

In order for me to read that one article I had to receive 1,083 URL calls from 197 different domains adding up to 18.3 Megabytes of data, summarized here in an Excel spreadsheet. I closed any videos as soon as I could if they had the ability to do so.

Is it worth it? I like Alex Konrad and the article was probably a good one, but given I’m not sure where my data is going, or who some of these entities are (jwpltx.com? wishabi.com?) I just don’t know.


Auto-generating clickbait with recurrent neural networks » Lars Eidnes’ blog

To generate clickbait, we’ll train such an RNN [recurrent neural network] on ~2,000,000 headlines, scraped from Buzzfeed, Gawker, Jezebel, Huffington Post and Upworthy.

How realistic can we expect the output of this model to be? Even if it can learn to generate text with correct syntax and grammar, it surely can’t produce headlines that contain any new knowledge of the real world? It can’t do reporting? This may be true, but it’s not clear that clickbait needs to have any relation to the real world in order to be successful. When this work was begun, the top story on BuzzFeed was “50 Disney Channel Original Movies, Ranked By Feminism”. More recently they published “22 Faces Everyone Who Has Pooped Will Immediately Recognized”. It’s not clear that these headlines are much more than a semi-random concatenation of topics their userbase likes, and as seen in the latter case, 100% correct grammar is not a requirement.

The training converges after a few days of number crunching on a GTX980 GPU. Let’s take a look at the results.

The results are spooky – such as “Taylor Swift Becomes New Face Of Victim Of Peace Talks” and “This Guy Thinks His Cat Was Drunk For His Five Years, He Gets A Sex Assault At A Home”. Because, you know, if you looked out of the corner of your eye, isn’t that what was on some site somewhere? (They weren’t.)

One feels Eidnes’s work should have happened in a Transylvanian laboratory in a thunderstorm. Next you get a machine to write the story that fits the headline, and… we can all knock off for the century.

Broadband in the UK ‘to stay top of the 5 major EU countries until 2020’ » ISPreview UK

Mark Jackson:

A new BT-commissioned report from telecoms analyst firm Analysys Mason has perhaps unsurprisingly found that the take-up and availability of superfast broadband (30Mbps+) connectivity in the United Kingdom is ahead of Spain, Germany, Italy and France, and will remain there until at least 2020.

The benchmarking report marks the United Kingdom as the “most competitive broadband market of all the countries it features”, although there are a few caveats to its findings. For example, the report overlooks most of Europe’s other states, including those with superior broadband infrastructure to ours, and seems to only focus on fixed line networks.

Furthermore it also makes an assumption that the current roll-out progress will hold to the Government’s promised targets, which may well be the case but we won’t know for certain until 2020. In addition, the study only appears to consider “superfast” services (defined as 30Mbps+ in the report), which overlooks the important area of “ultrafast” (100Mbps+) connectivity.

BT tweeted this headline and added “thanks to BT’s rollout of fibre”, and the culture/media/sport minister Ed Vaizey retweeted it without comment.

Is it really healthy that during an Ofcom examination of BT’s position a minister is doing that? Meanwhile Jackson’s longer analysis provides much-needed scepticism about the claims, and the lack of data in the report.

Adobe Flash Player security vulnerability: how to protect yourself » BGR

Zach Epstein:

The fun never ends with Adobe Flash.

Just one day after Adobe released its monthly security patches for various software including Flash Player, the company confirmed a major security vulnerability that affects all versions of Flash for Windows, Mac and Linux computers. You read that correctly… all versions. Adobe said it has been made aware that this vulnerability is being used by hackers to attack users, though it says the attacks are limited and targeted. Using the exploit, an attacker can crash a target PC or even take complete control of the computer.

And now for the fun part: The only way to effectively protect yourself against this serious security hole is to completely uninstall Flash Player from your machine.

Here’s the security note: “Adobe is aware of a report that an exploit for this vulnerability is being used in limited, targeted attacks. Adobe expects to make an update available during the week of October 19.” Spear phishing, no doubt; but Flash really is beginning to look like the worst thing you can have on your machine, especially if you’re in any sort of sensitive work.

Why Google is wrong to say advertisers should shift 24% of their TV budgets to YouTube » Business Insider

Lindsey Clay is chief executive of Thinkbox, which just happens to be a commercial TV marketing body, and doesn’t like Google’s suggestion:

why would an advertiser remove a quarter of the money they invest in the most effective part of their advertising and give it to something that hasn’t shown any proof of actually selling anything?

However, it needs a response lest anyone believes Google on this. Here are some things to consider:

This is Google’s data. We’ve asked to see the data itself, but usually Google doesn’t share. If and when it does, we’ll comment on it but we obviously need to comment now. We understand the TV elements are based around a panel of Google users managed by Kantar that does not measure all TV and that the YouTube element is provided by Google themselves.

If that isn’t flaky and biased enough, it is also unaudited. They even called it the “Google Extra Reach Tool”; it is a self-fulfilling prophecy. And does it take account of the 50% of online ads that are not seen by humans? And how does it square with the report in the FT recently revealing that YouTube has been selling fraudulent ad views to advertisers?

Their recommendation also seriously challenges common sense when official industry sources including comScore show that YouTube accounts for 7.5% of 16 to 24-year-olds’ video time, with TV at 65%. The numbers for the whole population are 3.5% and 81%. Ad minutage on commercial TV is approximately 15% of that time, but is much lower on YouTube, and that is before you consider users’ impatient use of its ‘Skip ad’ button.

Clay is hardly impartial, but she raises worthwhile points.

Apple’s biggest fan has died » The Washington Post

Michael Rosenwald:

There are plenty of goofballs — like me — who stand outside Apple stores all night waiting for the company’s latest, thinnest, must-have offering.

There was nobody like Gary Allen, who died Sunday from brain cancer at 67.

Allen didn’t care so much about Apple’s new products (though he bought many of them.) He cared about the stores, the sleek and often innovative ways Apple presented itself to the world — the winding staircases, the floor-to-ceiling glass, the exposed brick.

Allen, a retired EMS dispatcher, traveled around the world — obsessively and expensively — to be among the first in line at the company’s new stores. He attended more than 140 openings, collecting all sorts of trivia. He could even tell you where Apple store tables are made (Utah; he stopped by the factory once to say thanks).

The headline is a trifle unfair; Allen was a fan of the stores, and their design. Rosenwald recounts a story of someone who just liked paying attention to detail; it’s a delightful mini-obituary.

How is NSA breaking so much crypto? » Freedom To Tinker

Alex Halderman and Nadia Heninger:

The Snowden documents also hint at some extraordinary capabilities: they show that NSA has built extensive infrastructure to intercept and decrypt VPN traffic and suggest that the agency can decrypt at least some HTTPS and SSH connections on demand.

However, the documents do not explain how these breakthroughs work, and speculation about possible backdoors or broken algorithms has been rampant in the technical community. Yesterday at ACM CCS, one of the leading security research venues, we and twelve coauthors presented a paper that we think solves this technical mystery.

The key is, somewhat ironically, Diffie-Hellman key exchange, an algorithm that we and many others have advocated as a defense against mass surveillance. Diffie-Hellman is a cornerstone of modern cryptography used for VPNs, HTTPS websites, email, and many other protocols. Our paper shows that, through a confluence of number theory and bad implementation choices, many real-world users of Diffie-Hellman are likely vulnerable to state-level attackers.

Estimated cost: $100m for a system that could break a single Diffie-Hellman prime per year. But after two years, with the correctly chosen primes, you could passively eavesdrop on 20% of the top million HTTPS sites. Don’t underestimate the NSA. But of course, don’t underestimate the Chinese, Russians, and so on.
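For anyone hazy on what Diffie-Hellman actually does, here is a toy sketch of the exchange. The prime 23, generator 5 and the two private numbers are textbook-sized values chosen for readability and are my own illustration, not anything from the paper; real systems use primes of 1024 bits or (preferably) far more.

```python
# Toy Diffie-Hellman key exchange. The tiny prime is for readability only.
P = 23  # public prime modulus, shared by everyone using this group
G = 5   # public generator

def public_value(private: int) -> int:
    """Value sent over the wire: G^private mod P."""
    return pow(G, private, P)

def shared_secret(their_public: int, my_private: int) -> int:
    """Combine the other side's public value with our own private number."""
    return pow(their_public, my_private, P)

# Hypothetical private numbers; these are never transmitted.
alice_private, bob_private = 4, 3

alice_public = public_value(alice_private)
bob_public = public_value(bob_private)

# Each side computes the secret independently and gets the same result.
alice_secret = shared_secret(bob_public, alice_private)
bob_secret = shared_secret(alice_public, bob_private)
assert alice_secret == bob_secret
```

An eavesdropper sees P, G and both public values, and has to solve a discrete logarithm modulo P to recover a private number. As I read the paper, most of the work in that computation depends only on P, which is why a prime shared by a great many servers makes such an attractive target.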

Filed under: links
