China May Blacklist Illegal AI Training Data

China's National Information Security Standardization Committee wants to assess how much of the data used to train AI models is illegal or harmful, and to blacklist sources that exceed a set threshold, Reuters reports.

Every body of content used to train publicly accessible generative AI models will be subject to a security evaluation, and any found to contain "more than 5% of illegal and harmful information" will be blacklisted, according to the committee.

Illegal and harmful content, as specified by the committee, includes any information censored on the Chinese internet, as well as content promoting terrorism or violence, calling for the collapse of the socialist system, harming the nation's reputation, or undermining social stability and national unity.

Photo: Cutting-edge applications of artificial intelligence on display at the Artificial Intelligence Pavilion of Zhangjiang Future Park during a state-organized media tour on June 18, 2021, in Shanghai, China. (Andrea Verdelli/Getty Images)

The proposed security requirements would reportedly oblige companies building these AI models to obtain consent from people whose personal information, including biometric data, is used for training. They also include instructions on preventing intellectual property violations.

The proposal came from the committee, which is composed of the Cyberspace Administration of China (CAC), the Ministry of Industry and Information Technology, and the police.

The proposal follows rules regulating generative AI services that China finalized in July and that took effect last August, according to the Library of Congress. Those rules are reportedly temporary, as the word "interim" remains in their provisions.

China's AI Regulation

The stated goals of the provisions released in July are to promote the healthy growth and controlled application of generative AI while "safeguarding national security and social public interests" and "protecting the lawful rights and interests of citizens, legal persons, and other organizations."

The provisions, however, explicitly state that they cover only generative AI services that are available to the public.

Regarding providers' duties to protect users' input information and usage records, the July rules also reportedly prohibit "collecting unnecessary personal information" as well as "illegally retaining input information that can identify a user or providing users' input information to others." Publicly available generative AI services are reported to be regulated by the CAC together with six other central government bodies, including China's Ministry of Education and the Ministry of Science and Technology.

AI Regulation Enforcement

China's recent provisions are expected to significantly affect the success of the country's AI industry, according to a Times article. Experts reportedly describe the finalized rules as among the "strictest" in the world. Censoring generative AI output is difficult for developers, a challenge that even Chinese regulators acknowledged when they relaxed their strict provisions.

In response, regulators gradually loosened the rules for AI moderation after concluding that innovation could suffer without more "flexible enforcement."

The article also noted that this flexible approach leaves AI systems in China under a cloud of uncertainty: authorities can enforce the rules whenever they deem it necessary and at their own discretion, which often makes enforcement "arbitrary and less consistent."

AI regulation remains unsettled and contested around the world, but China's recent provisions are described as the "world's earliest and most detailed regulations governing artificial intelligence."