The Dark Side of CAPTCHA, Everyone’s Least-Favorite Security Tool

If they’re so easy to crack, then why do CAPTCHAs still exist? It’s not just to infuriate us

Last week, I got to the train station and realized I’d forgotten my wallet with my contactless debit cards. Fortunately, it’s now 2020 and physical cash cards are for Boomers, so I whipped out my iPhone and loaded up Google Pay. Alas, as my train approached, my screen presented me with a grid of pictures. “Click all the images where you can see road signs,” it read. I sighed, and resigned myself to the fact that I’d once again been beaten down by Big Tech. 

I’m sure you, too, have been forced to complete a CAPTCHA: a sequence of blurry, wavy computer-generated numbers or letters, a mathematical problem or (in my case) a visual puzzle. They’re based on the Turing Test, proposed in 1950 by British mathematician Alan Turing, in which a human judge questions an unseen respondent and tries to tell whether the answers come from a person or a machine. In the 2000s, when bots, malware and viruses polluted the internet by imitating human patterns, the CAPTCHA, or “Completely Automated Public Turing Test (to tell) Computers and Humans Apart,” was developed. By 2010, CAPTCHA had become one of the most popular tools in cybersecurity, which led Google to acquire the technology for around $27.8 million.

Valuable as it is, the tech isn’t perfect.

Today, it’s hard to do much of anything online without solving some kind of CAPTCHA. The vast majority of people complete them with gritted teeth, believing that in the long term, their privacy and security will be better. But while that might have been true in the early 2000s, that’s no longer necessarily the case.

In 2013, the A.I. research company Vicarious developed a bot that it claimed could outsmart around 70 percent of the CAPTCHAs in use online, and close to 100 percent of those that involved deciphering blurred text. In a 2017 paper outlining the methodology, its researchers argued that improved predictive-modeling technology allows bots to learn how to solve CAPTCHAs more easily. Basically, Vicarious’ bots showed that it doesn’t take much to build a machine capable of studying a CAPTCHA’s pixel structure and coming up with a fairly accurate guess at its string of numbers and letters. (Meanwhile, Microsoft, eBay and Yahoo! ditched audio CAPTCHAs in 2011 after Stanford University researchers built an A.I. bot that could decipher them through language and speech patterns.)

“It’s inevitable that as machine-learning and A.I. becomes more advanced and integrated, the bar for Turing tests will get higher,” says Nick Flont, a researcher at Shape Security, an international cybersecurity firm in Santa Clara, California. He adds that as websites, social media platforms and apps demand more data from users, A.I. bots have “a much clearer idea of human behavior patterns on the internet than ever before, making it harder to detect whether someone is really a human user or an automated machine.”

That said, in a 2017 blog post, Flont wrote that CAPTCHA-solving services (which pay people low wages, usually in developing countries, to solve CAPTCHA puzzles) pose a bigger threat. Based on his analysis, Flont estimated that an average CAPTCHA solver would likely earn around $2 a day if they solved 10 CAPTCHAs, an amount that can go far in countries where the local currency is highly inflated or where no minimum wage exists. “It’s much more profitable for criminals who treat these people as automated humans,” Flont says. “They can hire dozens of people to crack a site or a network in a brute force attack. And because there are humans behind the screen, the A.I. can’t actually recognize what’s going on.”

So if they’re so easy to crack, why then do CAPTCHAs still exist? Just to infuriate the rest of us?

The answer probably has less to do with security and privacy, and more to do with tech companies like Google, Facebook and Amazon pivoting to A.I.-driven automation and relying on us to train their bots for free, without our knowledge. For example, when Google forces users to click on dozens of images of cats in a CAPTCHA, it can feed those clicks to its A.I., so that the software gets better at recognizing cats and other four-legged animals.

And, of course, there’s a darker side to all of this. Last year, the Intercept’s Lee Fang discovered that hundreds of low-paid, gig-economy workers were solving CAPTCHAs on Google’s network for around $1 an hour. Unbeknownst to them, their work contributed to enhancing the accuracy of the Pentagon’s drone targeting system.

“There are lots of questions about who will own and get to use powerful machine-learning tools, and how they’ll affect everything from who gets employed in certain jobs, to who’s entitled to credit, or even healthcare,” Flont tells me. “All of that might be decided by machines rather than people who can be held accountable.” 

In that case, the days of identifying how many bicycles are in a grid to prove you’re a human being won’t seem so bad after all.