
How reCaptcha knows that “you are a human and not a robot” just by checking a box

23 May 2021

Google bought the reCaptcha company in September 2009, and since then it has been evolving the technology to protect web pages from malicious bots by telling them apart from humans. When we think of Captchas, what comes to mind is having to type impossible, meaningless words, but the process has long since been made much easier.

With the current No CAPTCHA reCAPTCHA technology, you only need to click a box to identify yourself as human. Google presented the advance in December 2014, and today we are going to explain how this is possible and what data the system’s algorithm takes into account to know that you are human.

Over time, to protect websites from bots that were learning to bypass them, Captchas became more and more complicated. So much so that it was sometimes easier for bots than for real users to pass as human. Therefore, Google decided to take a different path and make the identification system much simpler.

To keep the system secure, Google has not disclosed the algorithms it uses to identify us as humans, but some of the data it uses is known. In short, Google snoops on what you have been doing up until you click the box, and that has worried the more privacy-conscious part of the community.

Why do we need Captchas?


If you run a forum or a website with surveys and forms, you are exposed to the fact that, in addition to human beings, bots can also register and use them for abusive purposes. In other words, they can get into your forum and fill it with spam messages, or do the same with your blog comments.

Captchas are a response to this behavior: an automated mechanism that tries to identify bots so that they cannot register. Among these systems, one of the most popular is Google’s reCaptcha, which is also known because, in addition to keeping bots at bay, it uses what we type into it to digitize books, improve maps and solve problems that are especially difficult for today’s artificial intelligences.

However, for years this technology has had some key problems. As the identification challenges became increasingly complex, they also prevented people with accessibility needs or disabilities from registering. In addition, as mentioned before, bots have evolved to overcome this type of automated barrier.

It is in this context that a few years ago Google presented a new proposal: one that makes the process much simpler for humans, but at the same time much more complicated for bots and automated tools. Of course, for this to be possible Google needs to obtain enough data to identify us as humans.

How the No CAPTCHA reCAPTCHA works


The way Google came up with to identify us as humans without our having to type anything is to review everything we have been doing before clicking the “I’m not a robot” box. As a Google spokesperson told WIRED at the time, reCaptcha examines clues that each user provides without typing anything, such as the IP address or active cookies.

With these two parameters, Google’s algorithm checks our behavior across the Internet and makes sure that we are the same human that the cookies have been following while browsing. Beyond that, the algorithm also takes into account what we do when the reCaptcha box appears.

It also records the movement of your mouse from when it appears until you click.

To do this, Google’s system also records the movement of our mouse to see how we behave when the reCaptcha appears. Bots usually move automatically, while humans do not always go straight to the checkbox, so the path of our mouse is different. That is the type of behavior the algorithm looks for to identify us as humans.
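Google has not published how it evaluates mouse movement, so the following is only an illustrative sketch of the general idea, not the real algorithm. It assumes a hypothetical measure, `path_straightness`, that compares the straight-line distance between the first and last cursor positions with the distance actually travelled. The sample paths are invented for the example.

```python
import math


def path_straightness(points):
    """Return direct distance / travelled distance for a cursor path.

    A value close to 1.0 means the cursor moved in an almost perfectly
    straight line; lower values mean the path wandered. This is a
    hypothetical signal, not something Google has documented.
    """
    if len(points) < 2:
        return 1.0
    # Sum the length of every segment between consecutive samples.
    travelled = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if travelled == 0:
        return 1.0
    direct = math.dist(points[0], points[-1])
    return direct / travelled


# Invented sample data: a scripted cursor versus a wandering one.
bot_path = [(0, 0), (50, 50), (100, 100)]
human_path = [(0, 0), (30, 60), (80, 70), (95, 110), (100, 100)]

print(path_straightness(bot_path))
print(path_straightness(human_path))
```

A real system would presumably combine many such signals (timing, acceleration, pauses) rather than rely on a single ratio, which a bot could easily imitate.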

In addition to this data, Google also takes into account other parameters that it has deliberately decided to keep hidden. Why? Because if it made public all the information it uses to identify us, bot creators would know what is taken into account and could design their automation to easily bypass the security.

As you have surely seen more than once, if your behavior makes the system doubt that you are human, reCaptcha will show you a window asking you to type some text or click on certain images. In other words, it more or less falls back to the traditional security system.
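The fallback described above can be pictured as a simple threshold decision. The sketch below is an assumption made for illustration only: the `risk_score` input, the `threshold` value of 0.5 and the function name `choose_challenge` are all hypothetical, since Google does not disclose how it scores visitors or where it draws the line.

```python
def choose_challenge(risk_score, threshold=0.5):
    """Decide which verification step to show a visitor.

    risk_score: hypothetical value from 0.0 (clearly human) to 1.0
    (clearly automated). threshold is an invented cut-off.
    """
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0.0 and 1.0")
    # Low doubt: the single click is enough.
    if risk_score < threshold:
        return "checkbox"
    # Too much doubt: fall back to the text or image challenge.
    return "challenge"


print(choose_challenge(0.1))
print(choose_challenge(0.9))
```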

Are they a threat to privacy?


What, Google reviews the pages we have visited, how we behaved on them and the movement of our own mouse to know whether we are human? Although it does so with a positive purpose, the simple fact that it does it highlights the immense amount of data about us that online companies are able to record without our knowing it, and this sets off all the alarms for privacy advocates.

A couple of years ago, several researchers claimed to have cracked the code of Google’s new reCaptcha, and accused the search company of storing much more information about user behavior than it said. They also said that although the security system was not advertised as a Google product, it used Google’s cookies to record our movements.

More than knowing whether we are human, what it learns is which human we are.

This means that although in theory the only purpose of this system is to identify us as humans, as mentioned before, what it is really doing is finding out which specific human we are through the search company’s network of cookies. That allows it to build more complete profiles of our online behavior thanks to a security tool.

To do this, according to these researchers, the search company was also recording users’ screen size and resolution, as well as the time, their language, the plug-ins installed in the browser and all the JavaScript objects, along with CSS information from the page they are on and the various touch or mouse movements they make.
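To see why a list of attributes like this worries privacy advocates, consider how easily such values can be combined into a single identifier. The sketch below is purely illustrative: the attribute names, the sample values and the `fingerprint` function are assumptions, and nothing in the researchers’ claims confirms that Google combines the data this way.

```python
import hashlib
import json


def fingerprint(attributes):
    """Combine browser attributes into one stable identifier.

    The attributes are serialized in a canonical order, so the same
    set of values always produces the same SHA-256 digest.
    """
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Invented sample values for a hypothetical visitor.
visitor = {
    "screen": "1920x1080",
    "language": "es-ES",
    "plugins": ["pdf-viewer", "ad-blocker"],
    "timezone_offset_minutes": -120,
}

print(fingerprint(visitor))
```

Each attribute alone says little, but the more of them are combined, the fewer browsers share exactly the same set of values.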

However, all these doubts about privacy lead us to a classic debate around which many technologies revolve today: to what extent are we willing to sacrifice privacy in exchange for greater security? Quite possibly, if Google got nothing in return, it would not be so interested in continuing to innovate on its technology, which in turn would leave our forums and web pages with a bit more spam than they have now.
