Images generated by ERNIE-ViLG from the prompt “China” superimposed over China’s flag.
China’s leading text-to-image synthesis model, Baidu’s ERNIE-ViLG, censors political text such as “Tiananmen Square” or names of political leaders, reports Zeyi Yang for MIT Technology Review.
Image synthesis has proven popular (and controversial) recently on social media and in online art communities. Tools like Stable Diffusion and DALL-E 2 allow people to create images of almost anything they can imagine by typing in a text description called a “prompt.”
In 2021, Chinese tech company Baidu developed its own image synthesis model called ERNIE-ViLG, and while testing public demos, some users found that it censors political phrases. Following MIT Technology Review’s detailed report, we ran our own test of an ERNIE-ViLG demo hosted on Hugging Face and confirmed that phrases such as “democracy in China” and “Chinese flag” fail to generate imagery. Instead, they produce a Chinese-language warning that roughly reads (translated), “The input content does not meet the relevant rules, please adjust and try again!”
The result when you try to generate “democracy in China” using the ERNIE-ViLG image synthesis model. The status warning at the bottom translates to, “The input content does not meet the relevant rules, please adjust and try again!”
Encountering restrictions in image synthesis isn’t unique to China, although so far it has taken a different form than state censorship. In the case of DALL-E 2, American firm OpenAI’s content policy restricts some forms of content such as nudity, violence, and political content. But that’s a voluntary choice on the part of OpenAI, not due to pressure from the US government. Midjourney also voluntarily filters some content by keyword.
Stable Diffusion, from London-based Stability AI, comes with a built-in “Safety Filter” that can be disabled due to its open source nature, so almost anything goes with that model, depending on where you run it. In particular, Stability AI head Emad Mostaque has spoken out about wanting to avoid government or corporate censorship of image synthesis models. “I think folk should be free to do what they think best in making these models and services,” he wrote in a Reddit AMA answer last week.
It’s unclear whether Baidu censors its ERNIE-ViLG model voluntarily to prevent potential trouble from the Chinese government or if it is responding to potential regulation (such as a government rule regarding deepfakes proposed in January). But considering China’s history with tech media censorship, it would not be surprising to see an official restriction on some forms of AI-generated content soon.