Many people still don't fully grasp what a smart speaker is, and some have lost interest in finding out. Today, however, Xiaomi provided a compelling demonstration.
The challenge for smart speakers lies in finding the right fulcrum in the market. A smart speaker is more than an audio device; it is a stepping stone to the future, so the fulcrum has to be strong enough to bear that weight. A fulcrum worth only a few hundred thousand units in sales won’t have the strength to support an immense future. One million units, however, will create momentum, and five million units could fundamentally transform the industry.
The key issue, then, is determining which product category can sustain five million unit sales. This is why intelligence and sound must be combined in smart speakers. In short, the first company to reach five million units will secure the first ticket to the AI Internet era.
Many might not realize that the annual domestic sales of speakers are expected to be around 50 million units. Some of these are Sonos-branded products, while most are low-end Bluetooth speakers. This indicates that there is already a fulcrum of five million units within the Bluetooth speaker market, though this segment is highly price-sensitive.
When a company ventures into smart speakers, it needs to define its objective clearly: is it aiming to create a new category or to replace an existing one? If the latter, what are the characteristics of the existing market? Without that clarity, simply imitating Echo is akin to copying appearances without capturing the essence.
Most companies making smart speakers have a clear business positioning. If a product is priced at $799, it’s essentially creating a new category and brand. At this price point, subsequent investments and technological support are crucial. Without them, the product risks failure.
On the other hand, pricing a product at $299 positions it to replace the existing low-end Bluetooth speaker market. This price has the potential to disrupt the market entirely. If the user experience is superior, it could easily overshadow the Bluetooth speaker segment. As we’ve seen in the smartphone market, Xiaomi has disrupted an established market before. The question remains: what market did Xiaomi initially target, and how did it succeed?
Therefore, when developing a smart speaker, it's essential to know your audience and purpose.
Throughout the startup phase, we often overestimate the initial changes and underestimate the rapid growth phase. This pattern persists in the smart speaker category. There are too many distractions at the beginning, and too few companies focus on the long-term vision. Many doubt the viability of this category, which is fatal when the success of the product hinges directly on its execution.
When the product encounters hard tech challenges:
We all know that technology creates value. Many chip companies joke that they work for ARM. However, many companies may not have anticipated that this round of technological innovation would create value by drastically reducing product costs.
In Echo’s smart speaker architecture, the microphone array board, motherboard, and power amplifier are separate boards. The primary reason for this design is that analog microphones need corresponding A/D converters and other components, which occupy significant space. That leaves room for optimization. A straightforward approach is to use digital microphones, which allows the microphone array board to be simplified and potentially lets three boards be combined into one. While this reduces overall cost, it comes at the expense of signal quality, which is inferior to that of analog microphones.
To compensate, higher demands are placed on the algorithms: wakeup, beamforming, noise reduction, echo cancellation, dereverberation, and so on all require extensive optimization. Many people may not appreciate the difference between acoustic algorithms and other common algorithms. Classical numerical and non-numerical algorithms, including deep learning, fall more under computer science. Scientists in these fields can experiment, collect data, and refine parameters without needing to iterate on hardware. Acoustic algorithms, however, span two domains, requiring both CS-based algorithms and extensive lab testing and refinement.
Xiaomi’s AI speaker exemplifies the value of hard technology companies, showing how close collaboration can significantly shift market trends. However, the story doesn’t end here. Once a product gains a substantial user base, it can spawn new features such as voice calling, environmental monitoring, and voiceprint recognition, which translate into a better user experience.
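To make one of the algorithms named above concrete, here is a minimal sketch of delay-and-sum beamforming, the simplest way a microphone array can favor sound from one direction. It is only an illustration and not the method used in Echo or in Xiaomi’s AI speaker. It assumes noise-free signals and whole-sample delays that are known in advance, and the helper names (simulate_mic, delay_and_sum, energy) are hypothetical.

```python
import math

def simulate_mic(source, delay):
    """Hypothetical helper: a copy of `source` that arrives `delay` samples late."""
    return [0.0] * delay + source[:len(source) - delay]

def delay_and_sum(channels, delays):
    """Shift each channel back by its assumed delay, then average the channels."""
    n = len(channels[0])
    out = [0.0] * n
    for samples, delay in zip(channels, delays):
        for i in range(n):
            j = i + delay  # index at which this microphone heard sample i
            if j < n:
                out[i] += samples[j]
    return [value / len(channels) for value in out]

def energy(signal, start, stop):
    """Sum of squared samples over [start, stop)."""
    return sum(v * v for v in signal[start:stop])

# Assumed setup: four microphones, each hearing the source 2 samples later
# than the previous one.
source = [math.sin(2 * math.pi * i / 16) for i in range(64)]
delays = [0, 2, 4, 6]
channels = [simulate_mic(source, d) for d in delays]

steered = delay_and_sum(channels, delays)                 # aligned with the source
unsteered = delay_and_sum(channels, [0] * len(channels))  # plain average, no alignment

# The aligned sum keeps the full signal; the unaligned sum partly cancels.
print(energy(steered, 8, 56) > energy(unsteered, 8, 56))  # prints True
```

In a real device the delays are not known beforehand and must be estimated in a noisy, reverberant room, which is part of why the lab testing described above matters.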
When product companies find a perfect collaboration model with such hard technology firms, the pace of change accelerates.
Next Steps for Smart Speakers:
Many assume smart speakers are a minor business even if they succeed. This overlooks the parallel with the iPod and iPhone, which sit on the same continuum. Smart speakers represent the beginning of far-field interaction but aren’t the endpoint. Thus, the next phase of smart speakers requires examining both the speaker market and advancements in far-field speech interaction.
Regarding smart speakers, two distinct localization approaches are emerging in China:
One is Xiaomi’s $299 route, emphasizing speaker attributes to target the low-end Bluetooth speaker market. Prices will likely drop, with the core challenge being whether hard-tech companies like Sonic Technology can offer an experience surpassing existing products while improving continuously. Xiaomi’s AI speakers are poised for an experience upgrade, but prices are declining rapidly. Even optimistic estimates suggest this process won’t grow as fast as apps like Momo and may require cycles akin to Amazon’s refinement.
Now that Xiaomi’s speaker has launched, competitors that fail to find their own hard-tech partners like Sonic Technology have slim chances. Looking at current competitors, even reaching a comparable experience is a challenge, let alone matching the performance of Xiaomi’s AI speaker.
Another route combines smart speakers with TV boxes, targeting the box upgrade market. This approach is technically more challenging and demands higher standards for acoustics and interaction accuracy.
Regardless of the path, the ultimate goal is simple: whoever reaches five million users first secures a firmer advantage.
Summary:
The most intriguing aspect is that Xiaomi opened the mobile internet era in 2011. Now, it seems the curtain on the AI Internet era will likely rise with Xiaomi as well. The only difference is that this time Xiaomi will achieve it with partners. If Xiaomi succeeds, an even more interesting question arises: will the previous trinity (phone, MIUI, and Mi Chat) return?