Technology
Google's AI hate speech detector is easily fooled by a few typos

By Frank Swain

7 September 2018

[Image: someone furiously typing at a keyboard. AI has trouble identifying online hate speech. Chris Pecoraro/Getty]

Hate speech detectors are easily tricked. A test of systems designed to identify offensive speech online shows that a few innocuous words or spelling errors can easily trip them up. The results cast doubt on the use of technology to tame online discourse.
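The weakness described above can be illustrated with a minimal sketch. This is a hypothetical keyword-based detector, not any of the systems the researchers tested, but it shows the two evasion tactics the article mentions: a spelling error makes an offensive word miss the detector's vocabulary, and appended innocuous words dilute the toxicity score below the flagging threshold.

```python
# Hypothetical toy detector: a naive blocklist-based toxicity score.
# Real systems use learned models, but the failure modes are analogous.

BLOCKLIST = {"idiot", "stupid"}

def naive_toxicity_score(text: str) -> float:
    """Fraction of words that appear in the blocklist."""
    words = text.lower().split()
    hits = sum(1 for w in words if w in BLOCKLIST)
    return hits / max(len(words), 1)

def is_flagged(text: str, threshold: float = 0.2) -> bool:
    return naive_toxicity_score(text) >= threshold

original = "you are an idiot"
typo_attack = "you are an idiiot"                        # one inserted character
padding_attack = "you are an idiot love love love love"  # innocuous padding words

print(is_flagged(original))        # True: 1 hit in 4 words (0.25)
print(is_flagged(typo_attack))     # False: misspelling misses the blocklist
print(is_flagged(padding_attack))  # False: score diluted to 0.125
```

Both perturbations leave the message perfectly readable to a human, which is why such attacks are cheap to mount and hard to defend against.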

N. Asokan at Aalto University in Finland and colleagues investigated seven different systems used to identify offensive text. These included a tool built to detoxify bitter arguments on Wikipedia's edit pages, and Perspective – a tool created by Google's Counter Abuse team and Jigsaw, which is owned…
