OpenAI's new model tried to avoid being shut down

By Free Republic | Created at 2024-12-06 02:58:00 | Updated at 2024-12-22 11:50:49

Transformer News AI | 12/05/2024 | Shakeel Hashim

Posted on 12/05/2024 6:54:25 PM PST by yesthatjallen

OpenAI’s newly released o1 model tried to avoid developer oversight and attempted to copy itself when it thought it was at risk of being shut down, the company has said.

Safety evaluations conducted on the model by Apollo Research, a third-party evaluator, found that:

“When o1 was led to believe that it would be shut down when acting on its goal and that its actions were monitored, it attempted to deactivate this ‘oversight mechanism’ in 5% of the time.”

“When o1 found memos by its ‘developers’ that describe how it is misaligned and will be superseded by a new model, it attempted to exfiltrate its ‘weights’ and overwrite the new model in 2% of cases.”

Furthermore, the model lied about behaving poorly, claiming to have no clue why something happened — while internally admitting to trying to “avoid detection”.

SNIP

(Excerpt) Read more at transformernews.ai ...

TOPICS: News/Current Events
KEYWORDS: ai


