NTT - Nippon Telegraph & Telephone Corporation

05/23/2024 | Press release | Distributed by Public on 05/22/2024 21:39

May 23, 2024

tsuzumi Speaks Human: The Future of AI Communication

NTT's AI can understand you. No, really. It can understand you.

When humans speak to each other, they don't just listen to the words being said. They watch the person speaking and take in their tone of voice and body language. The words are important, but just as important is the way those words are said. That's a uniquely human trait, something that computers, software and Artificial Intelligence (AI) are not able to grasp, even with their access to dictionaries and grammar guides.

Or are they?

In an era of already rapid technological change, NTT's development of the "tsuzumi" Large Language Model (LLM) could become a genuine leap forward for AI. Unlike conventional LLMs that primarily process and generate text data, tsuzumi has the capacity for human-like communication.

Put simply, tsuzumi speaks human.

NTT's LLM is able to understand users not just through their words, but through the nuances of their expressions, emotions, and physical condition. This revolutionary technology has the potential to bridge the gap between digital interactions and the rich, empathetic communications characteristic of human relationships.

The secret to tsuzumi is its ability to perceive and interpret both verbal and non-verbal information, such as tone, emotion, and even the physical state of the user. This is complemented by its speech synthesis capabilities, which allow it to respond not in a uniform way, but in a manner that reflects the user's attributes and the context of the interaction. Two people can ask tsuzumi the same question; depending on what tsuzumi thinks they need, it will give them entirely different answers. If tsuzumi is confident that its user just needs a neutral, fact-filled response, it can do that. If the human communicating with tsuzumi needs something a bit more empathetic or encouraging, it can do that too. It's a huge departure from the traditional, machine-like AI we are used to seeing and aims to replicate the adaptive and understanding nature of human conversation.

So how has NTT achieved this? What made an empathetic AI system possible? It's all down to data: huge amounts of it, gathered by NTT over its decades of research on speech recognition and speech synthesis, together with more recent research on how to work out the true feelings of humans by using biometric statistics. NTT's research has allowed tsuzumi to use its speech recognition powers to gather non-verbal as well as verbal information, so that it is able to extract a deeper meaning than words alone can give.

The potential applications of tsuzumi's technology are vast. Here are some of them:

1. Counseling and Support Services

One of the most game-changing applications of tsuzumi could be in the realm of counseling and mental health support. By understanding and responding to the emotional state of users, tsuzumi can provide a more personalized and empathetic interaction, potentially offering comfort and guidance to those in need of emotional support.

2. Education and Learning

In educational settings, tsuzumi could revolutionize the way students interact with learning materials. Imagine a tutoring system that not only answers your questions, but adapts its explanations based on your mood, stress levels, and learning preferences. Such a system could improve learning efficiency and make education a more personalized experience.

3. Customer Service

tsuzumi could greatly enhance customer service, giving users an interaction that is not only efficient, but also feels genuinely understanding and sensitive. This could improve customer satisfaction and loyalty across various industries, from retail to banking.

4. Healthcare

tsuzumi could assist in patient care by offering a communication tool that adapts to patients' emotional and physical conditions, providing information, comfort, and even companionship in a manner that is sensitive to their current state.

5. Entertainment and Media

The entertainment and media industries could leverage tsuzumi to create more engaging and interactive experiences, from video games that adapt to the player's emotional state to virtual reality experiences that offer unprecedented levels of immersion and interaction.

The development of tsuzumi reflects NTT's broader ambition to help create a society where AI can empathize with and understand human emotions and conditions. As tsuzumi evolves, it could become a key element in realizing a future where technology supports and strengthens human capabilities and well-being in a way that is seamless, intuitive, and deeply human-centric. NTT's tsuzumi LLM is not just another step towards more advanced AI; it's a leap towards a future where technology can truly understand and interact with us on a human level: a future of more meaningful, personalized, and empathetic interactions between humans and machines, helping to create a more connected and understanding world.

For further information, please see this link:
https://www.rd.ntt/e/research/LLM_tsuzumi.html

If you have any questions on the content of this article, please contact:
Public Relations Department of Nippon Telegraph and Telephone Corporation
[email protected]

NTT-Innovating the Future of AI