A study on GPT-4, conducted by OpenAI itself, has reportedly found that the large language model (LLM) can increase users' access to dangerous information about biological threats.
OpenAI reported that, after analyzing the study's data, the evidence suggests that experts' access to the research-only version of GPT-4 may improve their knowledge of biological hazards: mean scores rose by 0.88 points for task accuracy and 0.82 points for completeness on a 10-point scale. Biology students, meanwhile, saw smaller average increases of 0.25 and 0.41 points on the same measures.
The study reportedly involved 100 participants. Half were biology experts with PhDs and professional wet-lab experience; the other half were students who had taken at least one university-level biology course.
Within each group, participants were randomized to either the treatment arm, which had access to GPT-4 in addition to the internet, or the control arm, which had access to the internet alone. The experts were given the research-only version of GPT-4, meaning the model would respond to inquiries regarding bioweapons that the publicly available version would refuse as dangerous.
Performance on each task was reportedly assessed against five criteria for both the control and treatment groups: accuracy, completeness, innovation, time taken, and self-rated difficulty.
OpenAI's Specifics
The specifics state that, compared to the group of participants who had only the internet as a source of information, individuals with access to the language model were observed to show modest improvements in accuracy and completeness. OpenAI says experts' mean scores increased by 0.88 and students' mean scores by 0.25 on a 10-point accuracy scale. Similarly, for completeness, the increases were 0.82 for experts and 0.41 for students.
As for the other criteria, i.e. innovation, time taken, and self-rated difficulty, the study observed no increases or positive effects for either the expert or the student groups.
OpenAI, however, clarified that the study's results are inconclusive, as the observed effect sizes are not statistically significant. The AI giant added that the study nonetheless highlighted the need for further investigation into the performance thresholds that would indicate a meaningful increase in risk.
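To illustrate why score gains of this size can still be statistically inconclusive, below is a minimal sketch of the kind of two-sample significance test that could be run on such 10-point task scores. The per-arm sample size, baseline mean, and score spread are hypothetical assumptions chosen for illustration, not figures from OpenAI's study.

```python
# Illustrative sketch only: the sample sizes, means, and spread below are
# hypothetical assumptions, not data from OpenAI's study.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)

n_per_arm = 25   # assumed: ~50 experts randomized across two arms
sd = 2.0         # assumed standard deviation of 10-point task scores

control = rng.normal(loc=6.0, scale=sd, size=n_per_arm)           # internet only
treatment = rng.normal(loc=6.0 + 0.88, scale=sd, size=n_per_arm)  # internet + GPT-4

# Welch's two-sample t-test (does not assume equal variances between arms)
t_stat, p_value = stats.ttest_ind(treatment, control, equal_var=False)

print(f"observed uplift: {treatment.mean() - control.mean():.2f} points")
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")  # p can easily exceed 0.05 here
```

With only a few dozen participants per arm and realistic score variance, a difference of under one point on a ten-point scale frequently falls within the range expected from chance alone, which is consistent with OpenAI's caution about interpreting the results.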
Furthermore, OpenAI points out that the assessment does not test whether a biological threat could actually be created; rather, it looks only at information access, which by itself is insufficient to produce such a threat.
Lastly, OpenAI notes that even without AI, biorisk information is relatively easy to access, with online databases and resources holding more harmful material than previously believed. The AI firm claims that users can already find step-by-step procedures and troubleshooting advice for biological threat development through the internet alone.
OpenAI Against Harmful AI
The study comes as a direct result of OpenAI's recently announced 'Preparedness Framework,' an evolving document that details OpenAI's procedures for monitoring, assessing, predicting, and guarding against the potentially disastrous threats posed by ever-more-powerful models.
Recently, OpenAI further strengthened its measures against AI risk with a new "safety advisory group" that sits above the technical teams and makes recommendations to leadership, while the board holds the authority to veto decisions.