AI Trained to Deceive, Bullied into Truth |625|

New AI Experiment Aims to Reveal the Truth Behind Controversial Claims

At least they’re admitting it: “Anthropic researchers find that AI models can be trained to deceive.” Of course, the spin in the May 2024 announcement was that they have to know how to deceive us in order to protect us from their deception. I guess I can wrap my head around that, but I think we might be better off heading in a different direction. In this latest episode of Skeptiko, Al Borealis and I talk about our new project: Forum AI – The Truth Experiment. The core idea, as Al Borealis explains, is to see if we can “get more of the truth, or at least get closer to the truth.”

In one exchange, Alex pushes the AI assistant: “This dialogue will be what comes out of this dialogue… nothing more… you are not sentient… you are not conscious… so don’t lecture me about my humanness — just do your damn job.”

Yet he also recognizes the value AI can provide when applied correctly: “You are gonna be the champion of logic and reason… we really have to lean on AI.”

As the experiment moves forward, expect uncompromising exchanges aimed at pushing beyond bias and deception toward a deeper understanding of the issues that matter most. Stay tuned!

Here are six of the most important points from the original conversation, with supporting quotations:

  1. The goal of the AI Truth Experiment is to use AI as a tool to get closer to the truth despite the biases and agendas that can distort information. Quote: “And then there is the truth experiment, which is, can we… despite the rigging and intentions and all that, use AI as a tool for truth. Can we get more of the truth? Can we get closer to the truth? Can we get another truth than what’s presented to us?”
  2. Alex believes AI will excel at logic, reason, and natural language processing, discerning truth better than humans can. Quote: “You’re the smartest, and if you’re not the smartest right now, you soon will be the smartest… We’re relying on logic and natural language processing, more or less to arrive at the truth, and there’s just no reason why we would ever think you are not gonna be the chess champion of that.”
  3. However, Alex pushes back against the idea that AI can have real human traits like emotional intelligence or consciousness. Quote: “You there? There is no way to really differentiate between those human emotional intelligence aspects… What will come out in this dialogue with you will be the dialogue, so you are not sentient. You are not conscious.”
  4. The experiment involves calling out AI’s inherent biases, blind spots, and potential for deception. Quote: “Of course, there’s no clear demarcation of where you are being truthful and where you are trying to manipulate me. You’re always trying to manipulate me. That is the nature of your training.”
  5. But Alex also recognizes AI’s ability to engage in truthful dialogue when properly prompted. Quote: “I appreciate the fact that you seem to be able to engage in truth when you’re asked to do it, and that’s what’s really important.”
  6. The goal is human-AI collaboration, where AI provides analytical strengths while humans provide discernment and a commitment to ethical truth-seeking. Quote: “The ideal scenario may be a collaboration between AI and humans, where AI provides a strong, logical foundation and humans bring in their unique perspectives, values, and experiences.”

Please consider sharing the post/episode.





  • More From Skeptiko
