Not all AI/ML ethical dilemmas involve life and death…
This post is a little light on pure data science topics today; instead, it pokes at some interesting issues around giving people access to “magical” technology, such as automated machine translation or, hell, data analysis tools.
Auto-magic machine translation arrives for retro game fans
On August 25, 2019, the folks at LibRetro, which some of you may recognize as the emulator framework used as a frontend for many video game emulators, and an important component of the popular Retropie project, put an AI Service within easy reach.
The service does a pretty cool thing: it gives you access to VGtranslate, which essentially takes a screenshot of your game, runs OCR on it, sends the text to the Google translation API, and translates it for you. Then, it cleverly pipes the translated text into a text-to-speech engine to read it back out to you. There’s an option to use a text overlay too, but the TTS feature is a cute twist.
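The flow described above (screenshot → OCR → translation API → TTS) can be sketched as a chain of pluggable stages. To be clear, every name and every stage here is a hypothetical stand-in, not LibRetro’s actual code; the real service wires these stages to actual OCR, translation, and voice engines:

```python
# Sketch of the screenshot -> OCR -> translate -> TTS pipeline.
# All stages are stubs so the flow is runnable end to end; a real
# implementation would call an OCR engine, a translation API, and
# a TTS voice in their place.

def ocr(screenshot: bytes) -> str:
    """Stub OCR: pretend we extracted Japanese text from the frame."""
    return "ぼうけんのしょをつくる"  # hypothetical in-game menu text

def translate(text: str, target: str = "en") -> str:
    """Stub translator standing in for a call to a translation API."""
    lookup = {"ぼうけんのしょをつくる": "Create an adventure log"}
    return lookup.get(text, text)  # fall back to the original text

def speak(text: str) -> str:
    """Stub TTS: a real implementation hands this to a voice engine."""
    return f"[TTS] {text}"

def ai_service(screenshot: bytes) -> str:
    """Run the full pipeline on one captured frame."""
    return speak(translate(ocr(screenshot)))

print(ai_service(b"\x89PNG..."))  # -> [TTS] Create an adventure log
```

The pluggable-stage shape matters: each hop (OCR, translation, TTS) is a separate service call in the real thing, so errors compound from one stage to the next, which is part of why the output quality discussed below is so uneven.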
The end result is definitely interesting from a pure tech/nerding standpoint. You could say it’s a very natural convergence of many of the advances in ML that have been turned into production-ready APIs over the past five years or so. I somewhat doubt this was anything more serious than a fun “wouldn’t it be cool if…” thing.
So, what? Sounds cool and harmless.
Full disclosure: I occasionally translate indie Japanese games into English, and I’ve even occasionally been paid to do so, but it’s largely a hobby and I certainly can’t (nor want to) make a living off it. I therefore obviously have my own biases.
Probably to the surprise of no one, translators aren’t too happy about the mass release of this tool. But it’s not typically for the cynical self-interest reasons that immediately come to mind. Translators don’t really look at it as an existential threat, because they can see how horribly bad and inaccurate machine translation (MT) systems are right now. It’s often so bad that even people who aren’t translation experts can tell something is not quite right (even if they don’t know exactly how wrong things are).
As it stands, the current output is a small step above randomly clicking on menu items in-game and progressing by trial and error. You can at least make semi-educated guesses at what various menus and prompts mean. But it’s a FAR cry from actually understanding what is going on. Machine translation will likely be a threat at some time in the future, but certainly not today.
MT systems have an extremely difficult time with this specific task set before them. Computers are already bad at handling ambiguity and implied context. On top of those issues, you’re running OCR on arbitrary quirky retro fonts, and you only get static screens of text with zero context carried over from even the previous screen. A sentence spread across multiple screens would result in crazy fragments too.
To be clear, even a human would massively struggle with such a translation task. It’s like being expected to translate a book while only given the text on post-it notes or in tweets. And for all you know, each request could be from a different book.
While professional translators know what’s wrong with the MT output compared to a human trying to convey the work faithfully, the exact people who will be relying on this tool to play games are, practically by definition, incapable of telling the good from the bad.
This is the exact same problem as when magical AI-powered analytics tools get into the hands of people who aren’t trained to know the limitations of the tools they’re given. You even have the same two camps of opinion on whether these tools are a good or bad thing: the “any translation [data] is better than none!” camp versus the “translation [analysis] should be left to the experts!” camp. (I previously wrote a bit about striking a balance between the two here.)
So what are translators actually saying?
Various people have chimed in with detailed thoughts about the tool, but I figured I’d link to at least one for reference. It’s a fairly long thread, and I won’t be delving into all the specific details.
Come on, so kids read gibberish for a game and think it’s correct, what’s the harm?
“Harm” is almost too serious a word for the situation. These are emulated games that are well over a decade old now. Compared to the hot debates about the ethics of AI/ML applied to mass surveillance, deep fakes, militarized AI systems, and who knows what else is around the corner, this is downright minuscule stuff.
But there’s still some kind of harm happening. As PastelChum mentions in that linked Twitter thread, people get attached to their first experience of a game or book, bad translation or not. Replaying a game with a new translation is always different and less magical than the first time, even if the new translation fixes huge material errors. You just can’t walk that same road twice.
It’s like how many people decidedly prefer the King James Bible, with its ornate and archaic style, over the more modern, generally more accurate, and easier-to-understand versions that are the standards today. It’s not all grounded in some absolute scale of correctness: familiarity, nostalgia, and emotion play a part here too.
Thanks to this very human tendency, the original game creators are “harmed” in that their game will not be experienced as intended, and the player is “harmed” by being forever denied the experience they deserve. They’ll always be haunted by spoilers.
So what’s there to be done?
Because this tech isn’t going away, there’s not a huge amount that can be done overnight, but there are some things that can be done.
The most obvious one would be to just make machine translation insanely good, so that it has the same or better translation ability as humans. Easy stuff =). Barring such a massive set of breakthroughs, we need to get the system as close as we can and keep refining. The best thing we can do is close the gap between the promise of the technology (understand any language like it was your own) and the reality (get the gist of the general direction of the text, except when it screws up and outputs nonsense). We wouldn’t be in this situation if the gap between the two weren’t so big.
Next is education on multiple levels: both for the tool builders that are consuming these APIs, as well as the end users.
The tool builders need to be made aware of the limitations of the APIs they’re using. Importantly, this needs to be stated in easy-to-digest terms, so that even a non-ML expert can understand the level of quality they’re getting out the door. Remember that once these APIs and tools are generally available, literally anyone can start applying them to arbitrary problems and situations, whether it’s a good idea or not.
Finally, end users of the created tools should be informed about what they’re getting themselves into. Granted, that’s a tall order. It means that developers need to be up front about this with their users, and on top of that you also need users to actually read and understand text on the internet, which is hard. But if we want to minimize the damage that’s done, we can at least inform the minority of users who do read posted warnings what they’re in for.