"I want to open the sunroof and listen to Jay Chou's old songs on the way to Quyuan Fenghe." Say this to a person and they will easily pick out your three intentions: one, go to Quyuan Fenghe; two, open the sunroof; three, listen to Jay Chou's old songs. But if the listener is a machine, such as a car, will it understand the sentence and carry out the corresponding operations?

Voice is naturally one of the most suitable ways to interact with a car because it is convenient and safe to use while driving. It has become close to a standard feature of in-vehicle solutions, although the voice solutions offered by different companies vary widely. The multi-task semantic understanding described at the beginning, for example, is still a relatively new application in the industry, and few companies have been able to implement it. Most manufacturers focus on improving the accuracy of speech recognition and natural language understanding.

Chen Hualiang, head of AliOS data intelligence, revealed that the team is currently upgrading its voice technology, with a focus on improving the experience of Scene-based Spoken Language Understanding (SSLU). SSLU builds on natural language understanding and adds scene awareness, including better handling of tasks that span multiple domains. A common dialogue system is generally composed of several modules: automatic speech recognition (ASR), natural language understanding (NLU), dialogue management (DM), natural language generation (NLG), and text to speech (TTS).
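The module chain above can be pictured as a simple function pipeline. The following is a minimal sketch for illustration only, not AliOS's implementation: every function name is hypothetical, keyword spotting stands in for a trained NLU model, and the ASR and TTS stages are text pass-throughs so the example runs without audio.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    """One recognized user intention plus its parameters (slots)."""
    name: str
    slots: dict = field(default_factory=dict)


def asr(audio: str) -> str:
    # Hypothetical stub: a real ASR module converts audio to text.
    return audio.strip()


def nlu(text: str) -> list:
    # Hypothetical stub: keyword spotting instead of a trained model.
    lowered = text.lower()
    intents = []
    if "sunroof" in lowered:
        intents.append(Intent("open_sunroof"))
    if "play" in lowered:
        intents.append(Intent("play_music"))
    return intents


def dm(intents: list) -> list:
    # Dialogue management: one action per intent, or ask the user to repeat.
    if not intents:
        return ["ask_clarification"]
    return ["execute:" + intent.name for intent in intents]


def nlg(actions: list) -> str:
    # Turn the chosen actions into a textual reply.
    if actions == ["ask_clarification"]:
        return "Sorry, could you say that again?"
    done = ", ".join(action.split(":", 1)[1] for action in actions)
    return "OK, done: " + done + "."


def tts(reply: str) -> str:
    # Hypothetical stub: a real TTS module would synthesize speech.
    return reply


def handle_utterance(audio: str) -> str:
    """Run one utterance through ASR -> NLU -> DM -> NLG -> TTS."""
    return tts(nlg(dm(nlu(asr(audio)))))


print(handle_utterance("Open the sunroof"))
```

Keeping each stage behind its own function mirrors the modular structure the article lists; an end-to-end model would replace several of these stages with a single learned component.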
Reportedly, AliOS has implemented self-play generation of dialogue training data together with crowdsourcing. Drawing on a combined understanding of people, cars, and scenarios, it transfers linguistic and semantic prior knowledge and knowledge-graph information into the dialogue system and trains end-to-end deep learning dialogue models. The aim is to improve scenario coverage and dialogue fluency so that the system understands voice commands better in context.

Taking the command mentioned at the beginning as an example, AliOS first recognizes each word of the sentence "I want to open the sunroof and listen to Jay Chou's old songs on the way to Quyuan Fenghe." It then combines the user's current usage scenario to understand the meaning of the sentence and calls the related services to perform the compound operation: navigating to Quyuan Fenghe, opening the sunroof, and playing Jay Chou's old songs.

Chen Hualiang said: "Spoken language is usually vague and incomplete in meaning. It is not enough to achieve understanding of spoken expression by relying solely on massive corpus data. We believe that only with more information such as people, cars, and scenes can we achieve scene-based intelligent natural language understanding capabilities and provide users with a better voice experience." He said that AliOS has so far focused on optimizing and upgrading its voice technology in several high-frequency in-vehicle scenarios, such as navigation, music, audiobooks, and radio, to support multi-condition search, multi-task navigation, changing preferences during navigation, multi-slot queries, and more.
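To make the multi-intent example concrete, here is a minimal sketch of splitting that one sentence into three service calls. It is an illustration under stated assumptions, not AliOS's method: the article describes trained end-to-end models, whereas this sketch uses hand-written regular expressions, and the `Scene` class, `parse_command` function, and service names are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Scene:
    """Hypothetical snapshot of the vehicle context when the command is spoken."""
    current_location: str


def parse_command(text: str, scene: Scene) -> list:
    """Return a (service, parameters) pair for every intent found in the text."""
    calls = []

    nav = re.search(r"(?:on the way to|navigate to) ([^,.]+)", text)
    if nav:
        # The utterance names only the destination; the origin comes from the scene.
        calls.append(("navigation", {
            "origin": scene.current_location,
            "destination": nav.group(1).strip(),
        }))

    if re.search(r"open the sunroof", text, re.IGNORECASE):
        calls.append(("vehicle_control", {"part": "sunroof", "action": "open"}))

    music = re.search(r"listen to (.+?)'s (old )?songs", text)
    if music:
        params = {"artist": music.group(1)}
        if music.group(2):
            params["era"] = "old"
        calls.append(("music", params))

    return calls


command = ("I want to open the sunroof and listen to Jay Chou's old songs "
           "on the way to Quyuan Fenghe")
calls = parse_command(command, Scene(current_location="current GPS position"))
for service, params in calls:
    print(service, params)
```

Each detected intent carries its own slots (destination, artist, era), which is what lets a single utterance fan out to navigation, vehicle control, and music services at once. The scene object fills in what the speaker left unsaid, here the starting point of the route.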
A few concrete examples: asked "How far is it from here to Zhongshan Park?", AliOS understands the question as the distance from the current location to Zhongshan Park; told "Delete the previous waypoints", it removes the most recently added waypoints; asked "Play some songs that suit the occasion for me", it picks appropriate songs based on the current weather and time. In addition, AliOS has now achieved, at the system level, multimodal fusion of voice, vision, gestures, and other interaction methods, aiming to give users an immersive experience. The capability is expected to be widely used in scenarios such as in-car music, news broadcasts, audiobooks, and in-car navigation.