Market Research Report
Product code
1400759
Automotive Voice Industry Report, 2023-2024
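This report analyzes the global and Chinese automotive voice market and industry. It covers the technology overview, the basic market structure, the development and application status of automotive voice at OEMs, and the major automotive voice providers, including their profiles, main technologies, and business strategies.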
The automotive voice interaction market is characterized by the following:
From 2019 through the first nine months of 2023, both the number of vehicles installed with automotive voice and the installation rate rose. In the first three quarters of 2023, nearly 12 million vehicles were pre-installed with automotive voice, an installation rate of nearly 80%.
In 2023, 46 passenger car brands have an automotive voice installation rate of 100%, including AITO, Avatr, HiPhi, Rising Auto, ZEEKR, Voyah, Li Auto, Lynk & Co, Tank, NIO, and Xpeng. In 2023, over 20 million vehicles are equipped with automotive voice, an installation rate higher than 80%.
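The relationship between an installation count and an installation rate can be made explicit. The following is a minimal sketch, assuming the installation rate is defined as pre-installed vehicles divided by total vehicles sold in the same period. The report does not state the denominator, the function name is hypothetical, and the inputs are the rounded "nearly 12 million" and "nearly 80%" figures, so the result is only an order-of-magnitude check.

```python
def implied_total_vehicles(installed: float, rate: float) -> float:
    """Back out the total vehicle count implied by an installation count and rate."""
    if not 0 < rate <= 1:
        raise ValueError("rate must be in (0, 1]")
    return installed / rate


# Rounded figures from the text for the first three quarters of 2023.
installed_q1_q3 = 12_000_000  # "nearly 12 million" vehicles with automotive voice
rate_q1_q3 = 0.80             # "nearly 80%" installation rate (assumed definition)

total = implied_total_vehicles(installed_q1_q3, rate_q1_q3)
print(f"Implied total: about {total / 1e6:.0f} million vehicles")
```

Under these assumptions, the implied base for the first three quarters is roughly 15 million vehicles.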
OEMs' differentiated demand for intelligent automotive voice, together with their preference for in-house development, enables Tier 2 vendors in the conventional voice supply chain to cooperate directly with OEMs. The boundaries between the upstream, midstream and downstream of the industry chain are blurring. For example, automakers such as GWM, ZEEKR and Wuling cooperate directly with AISpeech, which raises both the installation rate and the intelligence level of their voice functions.
As industry chain relationships change, the competitive landscape of automotive voice changes accordingly. By installations from January to September 2023, AISpeech ranked third, supporting more than 150 models from over 30 automakers.
In ResearchInChina's China Automotive Voice Industry Report, 2021-2022, "see-and-speak" was only installed by some emerging carmakers and leading Chinese independent brands, the longest continuous conversation duration was only 90 seconds, and dual-sound-zone recognition was still the mainstream solution.
In 2023, "see-and-speak" has become a standard configuration in emerging carmakers' flagship models, with continuous dialogue of up to 120 seconds. Xpeng Motor has also introduced the "Full-time Dialogue at Driver's Seat" function (when turned on, it allows the driver to use see-and-speak while looking at the center console screen, without needing to wake up the content on the screen). Meanwhile, four-sound-zone recognition has become the new mainstream solution, and Li Auto and Xpeng Motor have also introduced six-sound-zone recognition solutions.
In addition, more advanced voice functions became available on cars in 2023.
Parallel instruction: supports up to 10 actions in one instruction;
Cross-sound-zone inheritance: available on models of Xpeng, ZEEKR, and Li Auto (after one occupant finishes an instruction, other passengers who want the same thing can trigger this function by saying "I want too").
Offline instruction: more controllable content. Jiyue 01 supports all-zone, full offline voice. In offline state, Jiyue 01 still enables extremely fast interaction with occupants.
Out-of-vehicle voice: this function in Changan Nevo A07 allows for voice control on trunk, windows, music, air conditioning, pull-out/in, and other functions; this function in Jiyue 01 allows for voice control on car/parking, air conditioning, audio, lights, windows, doors, tailgate, and charging cover.
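The cross-sound-zone inheritance behavior described above can be sketched in a few lines. This is an illustrative model only, not any carmaker's implementation: the class and field names are hypothetical, speech recognition and intent parsing are assumed to have already produced an action name, and only the trigger phrase "I want too" comes from the text.

```python
from dataclasses import dataclass
from typing import Optional

INHERIT_PHRASE = "I want too"  # trigger phrase quoted in the text


@dataclass(frozen=True)
class Instruction:
    zone: str    # sound zone the action applies to, e.g. "driver"
    action: str  # already-parsed action name (hypothetical), e.g. "open_window"


class ZoneDialogue:
    """Remembers the last completed instruction so another zone can inherit it."""

    def __init__(self) -> None:
        self.last: Optional[Instruction] = None

    def handle(self, zone: str, utterance: str,
               action: Optional[str] = None) -> Optional[Instruction]:
        if utterance.strip().lower() == INHERIT_PHRASE.lower():
            if self.last is None:
                return None  # nothing to inherit yet
            # Re-issue the previous action for the new speaker's zone.
            result = Instruction(zone=zone, action=self.last.action)
        elif action is not None:
            result = Instruction(zone=zone, action=action)
        else:
            return None  # unrecognized utterance
        self.last = result
        return result


dialogue = ZoneDialogue()
first = dialogue.handle("driver", "Open the window", action="open_window")
second = dialogue.handle("rear_left", "I want too")
print(first, second)
```

The key design point in this sketch is that the inherited instruction copies the action but takes its zone from the new speaker, which is why sound-zone recognition is a prerequisite for the feature.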
The ChatGPT boom has allowed the underlying foundation model technology to spread rapidly from AI to other sectors. In 2023, foundation models gained pace in the automotive industry, and quite a few automakers are exploring opportunities to implement foundation models in intelligent cockpit, intelligent driving and other scenarios.
In intelligent cockpit scenarios, voice interaction is the first stop for foundation models on their way into vehicles. In February 2023, Baidu released ERNIE Bot, a Chinese counterpart to ChatGPT, and brands like GWM, Geely, and Voyah followed. In April 2023, Alibaba disclosed that the AliOS intelligent vehicle operating system had been connected to the Tongyi Qianwen foundation model for testing and would later be applied by IM Motors. In August 2023, in Huawei HarmonyOS 4.0, the intelligent assistant Xiaoyi was connected to the Pangu model for the first time, mainly to improve capabilities in intelligent interaction, scenario arrangement, language understanding, productivity and personalized service.
Besides conventional Internet companies, voice providers such as iFLYTEK, AISpeech and Unisound are also important foundation model players and have launched related products.
iFLYTEK Spark cognitive foundation model has six core capabilities: penetrative understanding of multi-round dialogues, knowledge application, empathic chat & dialogue, self-guided reply in multi-round dialogues, file-based rapid learning of new knowledge, and evolution based on correction feedback from a massive user base;
AISpeech DFM-2 is an industry language foundation model with generalized intelligence. In the field of in-vehicle interaction, AISpeech integrates Lyra automotive voice assistant with DFM-2, which significantly improves capabilities in planning, creation, knowledge, intervention, plug-in, multi-level semantic dialogue, and documentation, and supports multi-modal, multi-intent, multi-sound-zone, and all-scenario multi-round continuous dialogues.