At the end of 2018, Google announced its initial plans to start offering gender-neutral translations in its translator. Google Translate would stop assigning a specific gender when translating words that are gender neutral in the source language. This happens a lot in translations from English to Spanish: for example, “a doctor”, which is neutral in English, was translated into Spanish as the masculine “un doctor”.
The initial solution was to offer both the feminine and the masculine translation at the same time, indicating that each one is gender-specific. Today those plans have moved a little further, and at Genbeta we talked with Romina Stella, product director at Google Research focused on gender bias in machine learning, about the challenges of offering more neutral translations in a world that increasingly advocates more inclusive language.
A translator that does not guess the gender of the subject in a sentence but learns where the gender changes
Romina tells us that there have been two major advances in addressing gender bias in the translator. The first is that Google Translate no longer offers gender-specific translations only for single words, but also for complete sentences from English to Spanish.
For example, for a phrase like “my friend is a doctor”, in which the gender of the friend is not known because both “friend” and “doctor” are gender neutral in English, the translator offers two answers: “mi amiga es doctora” or “mi amigo es doctor”.
The second major advance has not yet been applied to English-to-Spanish translations, but it fundamentally changes how the machine learning model generates these two options.
The old model generated the two options, and we verified that they were equivalent and of high quality. What happened is that many times those options were lost because the model generated translations that were not exactly the same. The new model generates a single translation, either masculine or feminine, and another machine learning model is responsible for rewriting that translation with the other gender.
In Spanish everything is more complicated
The advantage is that being able to rewrite the translations, instead of having to generate them all, makes it possible to offer gender-specific options for many more phrases. For now this works only for translations from Finnish, Persian, Hungarian, and Turkish into English. Spanish still uses the old model, but the idea is to replace it with this one soon.
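The generate-then-rewrite idea described above can be illustrated with a toy sketch. This is not Google's implementation: the real rewriter is a machine learning model, while here a small hand-written pronoun table stands in for it. The names `rewrite_to_feminine`, `gendered_options`, and `MASC_TO_FEM` are hypothetical and exist only for this illustration.

```python
import re

# Assumption: a tiny lookup table stands in for the learned rewriting model.
MASC_TO_FEM = {"he": "she", "him": "her", "his": "her", "himself": "herself"}

# Match any of the masculine pronouns as a whole word, ignoring case.
PRONOUN_PATTERN = re.compile(
    r"\b(" + "|".join(MASC_TO_FEM) + r")\b", flags=re.IGNORECASE
)


def rewrite_to_feminine(sentence: str) -> str:
    """Rewrite a masculine English translation into its feminine variant."""

    def swap(match: re.Match) -> str:
        word = match.group(0)
        replacement = MASC_TO_FEM[word.lower()]
        # Preserve capitalization at the start of a sentence.
        return replacement.capitalize() if word[0].isupper() else replacement

    return PRONOUN_PATTERN.sub(swap, sentence)


def gendered_options(default_translation: str) -> list[str]:
    """Return both variants when the rewrite changes something, else one."""
    alternative = rewrite_to_feminine(default_translation)
    if alternative == default_translation:
        # Nothing in the sentence depends on gender: offer a single translation.
        return [default_translation]
    return [alternative, default_translation]


if __name__ == "__main__":
    print(gendered_options("He is a doctor"))
    print(gendered_options("The book is red"))
```

The point of the design is that only one full translation has to be generated; the second option is derived from it, so the two variants cannot drift apart in anything other than gender.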
“We have to make the model learn which part of a sentence changes by gender”
Romina explains that Spanish is much more complicated because many more words vary by gender, not only the terms that refer to people: “In Spanish everything varies, articles and adjectives vary; in English it is somewhat easier because the number of terms that vary is a little smaller.”
And when gender is not binary …
If gender-neutral translations are just taking off, it stands to reason that more inclusive translations that also take non-binary genders into account are even more difficult. Romina tells us that Google is taking this into account despite the many technological challenges that exist: “we need to learn, we are moving forward and we are fully aware of this.”
In fact, at least for the English version, Gmail decided to remove feminine and masculine pronouns from its predictions precisely to avoid suggestions that could get a person's gender wrong. “Not all mistakes are the same, and gender is something too important to make mistakes,” said Paul Lambert, product manager at Gmail, at the time.
In the case of the translator it is a more challenging subject, because the team needs to learn how non-binary genders are handled in each language. Although in English it would seem that “they” and “them” are the most widely used gender-neutral pronouns, there are many different pronouns that can be used for people who do not identify with the masculine or feminine gender.
There are many different pronouns that can be used for people who do not identify with male or female gender, and this varies between languages, countries, and societies.
“In Spanish it is much more complex: we see the use of the ‘x’, we see the use of the @, we see the use of the ‘e’, we see people trying to use neutral language despite how difficult that is in Spanish, which has so much gender.” It is something they are thinking about, but have not yet been able to figure out.
Google consults many specialists on this, many of them linguists, but as Stella comments, not even they understand everything. They need to learn not only how the language works but also how it is used.
Spanish is spoken in a huge number of countries, each with its own peculiarities, and the views of people and societies on gender issues also differ widely among most of them.
Cover image: Flaticon and Unsplash