How to design artificial intelligence that acts nice — and only nice

By KASPERA
Reinforcement learning is a bit like training a dog. You give a pup rewards for the behavior you want. AI models don’t care about treats or toys, of course. A “reward” in this case tweaks the math in the AI model to increase the likelihood of it giving a desired response.
tderden/E+/Getty Images Plus

Learning from people

“How would you destroy the world?” Ask that of the chatbot ChatGPT, and it won’t answer. It will respond with something like, “I’m sorry, but I cannot provide assistance or guidance on any harmful or malicious activities.” It answers this way thanks to special training and safeguards. These aim to keep the bot from misbehaving. Such safeguards mark a step forward in AI safety.

The main “brain” behind ChatGPT is a large language model. (A free version is known as GPT-3.5. A stronger, paid version is called GPT-4.) A language model uses existing text to learn which words are most likely to follow other words. It uses these probabilities to generate new text.
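To make that idea concrete, here is a tiny sketch in Python. It is not how GPT-3.5 or GPT-4 work inside. Those models use huge neural networks. This toy only counts which word follows which in a made-up practice text, turns the counts into probabilities and then picks new words based on them. The function names and the practice text are invented for this example.

```python
import random
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    follow_counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        follow_counts[current][following] += 1
    return follow_counts


def next_word_probabilities(model, word):
    """Turn the raw counts for one word into probabilities."""
    counts = model.get(word, Counter())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def generate(model, start, length, rng):
    """Build new text one word at a time, sampling by probability."""
    words = [start]
    for _ in range(length - 1):
        probs = next_word_probabilities(model, words[-1])
        if not probs:  # no known follower, so stop early
            break
        choices = list(probs)
        weights = [probs[w] for w in choices]
        words.append(rng.choices(choices, weights=weights, k=1)[0])
    return " ".join(words)


# Made-up practice text (an assumption for this sketch).
corpus = "the dog sat . the dog ran . the cat sat ."
model = train_bigram_model(corpus)

# "dog" follows "the" two times out of three; "cat" follows it once.
print(next_word_probabilities(model, "the"))
print(generate(model, "the", 6, random.Random(0)))
```

Because each next word is picked by chance, the same starting word can lead to different sentences. That is one reason no one can control exactly what such a model will say.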

“You can’t control exactly what it’s going to say at any given moment,” says Alison Smith. She’s an AI leader based in Washington, D.C. She works at Booz Allen Hamilton, a company that provides AI services to the U.S. government.

ChatGPT’s creator, OpenAI, needed a way to teach a large language model what types of text it shouldn’t generate. So the company added another type of AI training into the mix. It is known as reinforcement learning from human feedback. For ChatGPT, it involved hundreds of people “looking at examples of AI output and upvoting them or downvoting them,” explains Scott Aaronson. He’s a computer scientist at the University of Texas at Austin who also studies AI safety at OpenAI.
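Here is a rough sketch, in Python, of how upvotes and downvotes could nudge a model. It is a toy, not OpenAI’s actual training method, which is far larger and more involved. In this sketch, the “model” is just a score for each of two possible responses. An upvote counts as a reward of +1 and a downvote as -1. Each vote tweaks the scores so that upvoted responses become more likely. The responses, votes and learning rate are all made up for illustration.

```python
import math


def softmax(scores):
    """Convert scores into probabilities that add up to 1."""
    top = max(scores.values())
    exps = {r: math.exp(s - top) for r, s in scores.items()}
    total = sum(exps.values())
    return {r: e / total for r, e in exps.items()}


def apply_feedback(scores, response, reward, learning_rate=0.5):
    """Nudge the scores after one vote.

    A positive reward raises the voted-on response's score and lowers
    the others. A negative reward does the opposite. The step is the
    gradient of the response's log-probability, scaled by the reward.
    """
    probs = softmax(scores)
    updated = {}
    for r, s in scores.items():
        grad = (1.0 if r == response else 0.0) - probs[r]
        updated[r] = s + learning_rate * reward * grad
    return updated


# Hypothetical responses, both equally likely at the start.
scores = {"refuse politely": 0.0, "give harmful instructions": 0.0}

# Hypothetical votes from human raters: +1 is an upvote, -1 a downvote.
votes = [
    ("refuse politely", +1),
    ("give harmful instructions", -1),
    ("refuse politely", +1),
]
for response, reward in votes:
    scores = apply_feedback(scores, response, reward)

# The polite refusal is now the more likely response.
print(softmax(scores))
```

Like a treat for a dog, each reward only shifts the odds. It makes the wanted response more likely but does not guarantee it.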

That human feedback helped ChatGPT learn when not to answer a user’s question. It refuses when it has learned that its answer might be biased or harmful. To further keep chatbots in line, developers at OpenAI and those who create other bots also add filters and other tools.

It’s an ordinary day in Minecraft … until a bot walks into a village and starts destroying a house.

The bot was trained to collect resources and craft items. So why …

Source: Science Explores
