During conversation, listeners focus on a speaker's lips roughly half the time. Robots struggle to match this, often looking stiff or creepy, a problem known as the uncanny valley: the unsettling feeling when something looks almost, but not quite, human. Now, researchers have built a robot that learns to move its lips convincingly for speaking and singing, without relying on hand-coded rules.
The robot has a flexible face driven by 26 small motors. It first learned by watching its own reflection in a mirror, making random faces to discover how its motor commands shape its expressions. This amounts to a vision-to-action model, a system that links what the robot sees to the actions it takes, as sketched below.
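To make the idea concrete, here is a minimal, hypothetical sketch (not the authors' code) of that mirror stage: the robot issues random motor commands, observes the resulting face, and fits an inverse model that maps a desired face shape back to motor commands. The landmark count, network sizes, and the stand-in `true_forward` map are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

N_MOTORS = 26          # actuators in the robot's face (from the article)
N_LANDMARKS = 2 * 68   # assumption: 68 (x, y) facial landmarks from a camera

# Hypothetical stand-in for the physical face: an unknown motors -> landmarks map.
true_forward = nn.Sequential(nn.Linear(N_MOTORS, 128), nn.Tanh(),
                             nn.Linear(128, N_LANDMARKS))
for p in true_forward.parameters():
    p.requires_grad_(False)  # the real world is not trainable

# Inverse model the robot learns: desired landmarks -> motor commands.
inverse_model = nn.Sequential(nn.Linear(N_LANDMARKS, 128), nn.ReLU(),
                              nn.Linear(128, N_MOTORS))
opt = torch.optim.Adam(inverse_model.parameters(), lr=1e-3)

for step in range(2000):
    motors = torch.rand(64, N_MOTORS)       # random "babbling" commands
    landmarks = true_forward(motors)        # what the mirror would show
    pred_motors = inverse_model(landmarks)  # guess the commands back
    loss = nn.functional.mse_loss(pred_motors, motors)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Once trained, the inverse model lets the robot pick a target face shape and recover the motor commands that should produce it.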
Next, it watched YouTube videos of humans talking and singing to learn the lip shapes that go with different sounds. The robot's artificial intelligence (AI) converts audio into matching lip movements. The approach worked across several languages, and the robot even sang a song from its own AI-generated album.
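A hedged sketch of this audio-to-lip stage, under the assumption that a sequence model maps a stream of audio features (such as mel-spectrogram frames) to per-frame motor commands. The class name, feature dimension, and network sizes are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

N_MELS, N_MOTORS = 80, 26  # assumed audio feature size; 26 face motors

class AudioToLips(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(N_MELS, 128, batch_first=True, bidirectional=True)
        self.head = nn.Linear(256, N_MOTORS)

    def forward(self, mels):                # mels: (batch, frames, N_MELS)
        h, _ = self.rnn(mels)               # h: (batch, frames, 256)
        return torch.sigmoid(self.head(h))  # motor commands scaled to [0, 1]

model = AudioToLips()
fake_audio = torch.randn(1, 200, N_MELS)    # stand-in for ~2 s of audio features
lip_trajectory = model(fake_audio)          # (1, 200, 26) per-frame motor targets
```

In practice such a model would be trained on lip shapes extracted from the videos, paired frame by frame with the accompanying audio.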
The robot is not perfect yet; it still struggles with lip-heavy sounds like "B" or "W," but it improves with more practice.
How the robot learns and improves
The researchers tested the robot across different sounds and contexts. Paired with a conversational AI such as ChatGPT, it adds emotional depth to conversations. Facial gestures are essential for robots in roles like teaching, healthcare, and entertainment.
This work favors learning over programmed rules, helping robots connect more naturally with people. Faces are central to communication, and lifelike ones could prove useful in many fields. Experts predict that humanoid robots will soon become widespread, and all of them will need convincing faces to avoid the uncanny valley.
There are risks, so development must proceed carefully. Still, this step makes robots' interactions feel more natural, using observation to master subtle movements.
The researchers describe the methods and results of this study in a paper published in Science Robotics as the cover story.