Maybe we already hold the key to the smart home era

OFweek Smart Home Network news: In his 12th year of audio technology research and development, Zhou Zhengyou once again felt the magnetic pull of an ideal.

The possible scenes of the smart age unfolded before his eyes like a peacock spreading its tail, kindling a spark in the heart of the 36-year-old engineer. Life was happy and stable, but was there more that could be done with voice technology? The future is uncertain and the whole picture cannot yet be seen clearly, but the word “intelligence” has been shining brightly. For an enterprise, giving up on voice intelligence means giving up the future, while moving one step ahead may mean holding the key that opens the smart home era.

The times make heroes. In the early spring of 2015, a core team of 34 people from HKUST Telecom, suitcases in hand, gathered in Beijing's cool breeze at Exit A of Yiyuan's Wanyuan Street Subway Station and walked together toward Chaolin Mansion, where Jingdong's smart team was based. People doing their morning exercises nearby looked on curiously. They could hardly have guessed that this moment would largely determine the development path of China's smart speaker market and ecosystem.

As one of the "world's 50 smartest companies", HKUST has intelligent voice core technology that represents the highest level in the world. As China's largest retailer across online and offline channels, Jingdong has become the country's most important channel for digital and smart products and was the first to establish a complete smart hardware strategy. Working with JD.com, HKUST soon launched the “叮咚” smart speaker. At present, 叮咚 accounts for about 80% of the domestic smart speaker market.

On the road of exploration in the smart field, they are happy to see fellow travelers. It took 30 years for users to go from accepting the mouse and keyboard to using the touch screen. In the same way, the revolution in voice interaction triggered by smart speakers will also take time.

Over these two years, what has satisfied them most is that their relationship with users has grown ever closer, and that they understand users' needs, preferences and expectations better and better. The questions users raise, the functions they hope to use, and their daily interactions with the speaker provide the basis and direction for product optimization. Users see the functions they want being realized one after another. Beyond the simplicity and novelty that smart speakers bring, they also feel a sense of participation, as if they were designers themselves.

Recently, the product's performance, market and prospects have drawn both applause and criticism. Zhou Zhengyou, head of the Linglong R&D Center, said that the future holds boundless possibilities and that the team will walk steadily on this road and go even further.

“Who are you?” - Engineers who traveled between south and north and suffered nosebleeds set off a voice revolution

On April 24, 2015, Beijing Jingdong Science and Technology Co., Ltd. was established.

In June of the same year, the first smart speaker was released to realize human-machine voice interaction and voice control of smart homes.

User Kobayashi uses a "叮咚" smart speaker in his living room

Two years later, when the new 叮咚TOP sold out on its debut during the 618 sales event and the whole team was immersed in joy, the days of developing the first speaker still seemed vivid to Zhou Zhengyou and the engineers.

As soon as the company's team arrived in Beijing, they linked up with Jingdong Smart and began their intense work. Within a week of arriving, the dry Beijing spring had given everyone on the team nosebleeds. A common scene at the time was one engineer pulling a tissue from a pack and handing it to a colleague whose nose was bleeding, who plugged it in and went on discussing the plan.

When the product actually went live, praise from the leadership did not excite them much. What thrilled them was crowding together to look at the user interaction data.

The back-end data shows that human-machine voice interaction has brought a lot of fun to users' lives. For example, after switching the speaker on, many users teasingly ask "Who are you?", and it answers, "I'm a smart speaker." Since the custom question-and-answer function was added, the answers edited by users have been varied. Some are simple, such as "I'm a dog" or "What flower am I"; some are very cute, such as "I am the first man in the universe, loved by everyone, for whom even the flowers bloom"; and some are very interesting: "I'm your best friend. I supervise you to complete your homework every day."

Tang Bo, chief scientist of Linglong Technology, recalled that at the start of product design they debated the speaker's persona heatedly and finally voted to define it as a "20-year-old intellectual woman." In the user's life, “she” should be a knowledgeable, elegant, gentle yet rigorous family assistant who can help users solve all kinds of problems.

"We are not a speaker hardware company. We are dedicated to voice intelligence, with the speaker serving as the hub or entrance of the smart home," Tang Bo said. As an entry-level product, the speaker is meant to guide users in adapting to voice interaction.

In fact, when we review the development of the mobile Internet and mobile smart devices, the core trend is the evolution of user interaction. From the mid-1980s, when the mouse and keyboard of the traditional desktop computer became the benchmark for user experience, to the iPhone establishing touch as the interaction standard for smartphones, this evolution took more than thirty years.

In the same way, the revolution of speech interaction triggered by smart speakers will also require time.

Technically speaking, as early as 2002, Google launched voice search and started research on voice input, semantic recognition, and speech-to-text conversion. After more than a decade of evolution, voice interaction has reached a level that can offer users an accurate, easy-to-operate and relatively mature interactive experience.

Data show that more than 60.5 million people in the U.S. are using voice assistants this year. Millennials (currently 25 to 34 years old) account for more than 29.9 million of them, which means that more than one-third of that generation uses voice assistants. By 2019, this proportion is projected to rise to 44.3 percent, or 39.3 million users.
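As a rough consistency check on these figures, the arithmetic can be sketched in Python. The sketch assumes, which the source does not state, that the size of the 25-to-34 cohort stays roughly constant between this year and 2019. The constant and function names are illustrative only.

```python
# Figures quoted in the article (users, in millions).
VOICE_USERS_NOW = 60.5         # all U.S. voice assistant users this year
MILLENNIAL_USERS_NOW = 29.9    # of which Millennials
MILLENNIAL_USERS_2019 = 39.3   # projected Millennial users in 2019
MILLENNIAL_SHARE_2019 = 0.443  # projected share of the generation in 2019


def implied_cohort_size() -> float:
    """Cohort size (millions) implied by the 2019 projection."""
    return MILLENNIAL_USERS_2019 / MILLENNIAL_SHARE_2019


def share_of_generation_now() -> float:
    """Share of the generation using voice assistants this year.

    Assumption: the cohort is the same size now as in 2019.
    """
    return MILLENNIAL_USERS_NOW / implied_cohort_size()


def share_of_voice_users_now() -> float:
    """Millennials as a share of all voice assistant users this year."""
    return MILLENNIAL_USERS_NOW / VOICE_USERS_NOW


if __name__ == "__main__":
    print(f"implied cohort: {implied_cohort_size():.1f} million")
    print(f"share of generation now: {share_of_generation_now():.1%}")
    print(f"share of all voice users: {share_of_voice_users_now():.1%}")
```

Under that assumption the implied cohort is about 88.7 million people, so 29.9 million users correspond to roughly 33.7 percent of the generation, which is in line with the "more than one-third" figure. Millennials would also make up about 49.4 percent of all voice assistant users.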

Voice interaction actually frees users from being part of the “heads-down tribe” glued to their screens. The ideal state of the smart home is that, through the Internet and voice technology, every part of the home becomes an "entrance", so that the "entrance" disappears from the user's consciousness. When someone knocks on the door while you are doing yoga, you can give a voice command without stopping: "Hey, open the door." When you feel hungry while busy with work, you can ask for help: "Hey, help me order a braised pork rice takeout." Voice commands can be applied to any scene in life and happen naturally as part of it. The more often you use the speaker, the better she “knows” you, and the smoother human-machine interaction becomes.

In the field of voice technology, HKUST Newsletter is the company with the longest history of basic research, the best results in past evaluations, and the highest market share. Its intelligent voice core technology represents the highest level in the world. In June 2017, it was named to MIT Technology Review's 2017 list of the “50 Smartest Companies in the World”.

However, explorers in any field face many difficulties.

Before the first-generation speaker launched, the R&D team upgraded a semantic recognition model. Having achieved a technical breakthrough, the whole team was excited and expected the new model to further improve the accuracy of semantic recognition once it went live. Unexpectedly, the result was the opposite. After repeated checks, they found a problem with a value in one line of code. The team went from the excitement of conceiving the project, to the frustration of bad test results, to the satisfaction of solving the problem. Zhou Zhengyou said with a smile that people think "programmers" are boring and dull, but in fact they go through plenty of ups and downs.

“What is a black hole?” - Building together in the same boat, complainers become die-hard fans, and interaction data forms a new kind of user relationship

At present, 叮咚 has launched 6 products, and it is also used in more than 10 other brands of speakers. Together they form a "big family." The A1 is stationary, the A3 is portable, and the TOP opens up more entrances in the home. Meanwhile, the software has been updated through more than 30 versions and the open platform has been launched for the first time. More than 150 applications have been developed and are ready to go live.

“叮咚 seems to be getting smarter!” Seeing a message like this in the customer group makes Zhong Bo, head of the product center, happier than anything.

"叮咚" smart speakers are loved by many home users

“Many users really do like it very much.” According to interaction data, each user issues about 10 voice commands a day on average. While the daily active rate of a typical app is below 10%, the speaker's daily active rate reaches 30%.

Interacting with users is a fun and sincere process. Zhong Bo recalled that when the product first went live, some users complained about it furiously. But once the team answered them and genuinely valued their suggestions, they became die-hard fans. When users proposed a feature, the team soon implemented it. Delighted, these users recommended the speaker to everyone around them, and the TOP set a new sales record at launch. Looking back at users' posts from two years ago, the team found that users had wanted smart speakers to do many things, and many of those functions have since been implemented. Seeing the many "problems" that users raise through the voice interaction data, the team feels it is on the right track.

Users also leave many amusing messages and share their stories.

One user put the speaker on the balcony, and “fans” ended up shouting “hello” through the window to wake it up. Another user urgently needed to reach home, but the parents did not hear the phone ringing. The user set a custom alarm from a mobile phone so that the speaker at home announced: “Hurry up, call your daughter!”...

"The speed and sincerity with which this team responds to users is quite surprising."

Mr. Zhang, a Beijing "old-timer", has bought more than 40 叮咚 speakers. Apart from those he gave away as gifts, he kept more than a dozen for his own home and connected all his smart appliances through the JD Micro-Link platform to lead a smart life.

At first, Mr. Zhang enjoyed using it and found it easy and convenient. Later he discovered that his child could not wake the speaker. After he reported this in the user group, the team got busy. Initially, the target users were young people born in the 1980s and 1990s. The team had not expected the elderly and children to use it so often.

Linglong passed the feedback to the Information Technology Research Institute. Researchers went to an elementary school in Hefei and had the students line up, each saying “hey” into a microphone, to collect a large number of children's wake-up voice samples. They then analyzed the samples and built models to raise the wake-up success rate for children. Mr. Zhang noticed the change right after the system upgrade and was touched by how sincerely the team valued its users.

"The more users there are, the more types of demand they represent." Tang Bo explained that in the more than two years the product has been running, the team has been committed to using data to make voice interaction more mature, while expanding the amount of information and knowledge available.

For example, after Trump took office as President of the United States in 2017, many users asked "Who is the American President's wife?" For the speaker to chat smoothly with its owner, hot news topics like this must be added to the database quickly.

Another unexpected case involved children. With their strong thirst for knowledge, they naturally treat the speaker as a companion in study and life and often ask popular science questions. For example, a child asks: "Hey, what is a black hole?" The speaker answers: "A black hole is a kind of celestial body in the universe..." The child then asks: "What is the universe?"

“The diversity of user needs often exceeds our imagination.” The team collects user feedback on new features every day and has added a great deal of content over the past two years. Examples include the bedtime stories and English nursery rhymes that children like, and the horoscopes that young people like, which tell them whether Mercury is in retrograde today...

Zhong Bo recalled that it was difficult at first, because users differ greatly and so do the interests of different age groups. In the long run, however, the needs of each type of user still follow habits and patterns. As long as users' core needs keep being supplemented and segmented, specialized products for different types of users may emerge in the future.

When it comes to smart speakers, there is one recognized difficulty: the machine's understanding of natural language. When people are relaxed at home, their language is very casual, which is very complex for any algorithm. In particular, when judging semantics, satisfying one need may mean giving up another, and the choice is very hard to make.

For example, when the user says "Who are you?", should the speaker play the song of the same name or give a reply? Research found that, for simple commands, what users object to most is not an off-target answer but a higher interaction cost. Therefore, the speaker has to weigh the options and settle on the answer most likely to be right.

In the initial design, the speaker was too rational and proper, and customers found it honest but not cute. This left a group of serious technical men at a loss. In the end, the young women on the operations team could not stand it any longer and added many witty answers. For example, when a user says, "You are ugly!", the speaker replies, "I think I'm okay. In the robot world I'm rather handsome."

On the JD.com platform, the 叮咚 speaker has been the best-selling Wi-Fi speaker ever since its launch, with sales equal to those of the second and third places combined, and it basically monopolizes the smart speaker market in China. It sold especially well during this year's 618, and the team and its users are growing ever closer. In this moment of joy, the birth of the first product, the revisions and the arguments all come vividly to mind. They understand that the introduction and maturing of hardware will promote the optimization of interaction and user experience, and that this experience will in turn stimulate the upgrading of hardware devices. There is still a long way to go, and they intend to walk it well.

"Hey!..." - Voice communication, an open platform and a smart entrance add to a better life

As a new way of human-computer interaction in the Internet of Things, the smart speaker with voice interaction at its core represents a family lifestyle and a middle-class lifestyle. Sales of Amazon's Echo will soon exceed 10 million units. Echo is also rapidly moving from the niche circle of early adopters into the mass family market.

Echo's success proves that users can accept voice interaction as a means of controlling the smart home. Subsequently, Google's Google Home, Apple's HomePod and Microsoft's Invoke also came onto the market. Domestically, JD.com launched the "叮咚" speaker in cooperation with HKUST, Tencent Cloud released "Little Micro," Baidu announced the acquisition of KITT.AI, an American voice interaction technology company, and Alibaba released the smart speaker "Tmall Elf X1".

According to statistics, there are currently 50 domestic smart speaker companies, and at least 500 related hardware companies or technology providers are active at the forefront. Another boom has taken shape.

“We are very happy to see fellow travelers. At this stage, it takes more players working together to build the ecosystem.” Zhou Zhengyou said they particularly hope the smart speaker field will see better products with features they had not thought of, pushing everyone to break through existing standards.

A comparison shows that smart speakers are currently positioned in several different ways. For example, Apple believes that speakers should first of all satisfy music enjoyment; Google and Amazon position speakers as artificial intelligence controllers; Kugou Music and Himalaya FM are looking for a hardware carrier for their own content businesses, trying to seize scenario advantages and user entrances amid the twin waves of artificial intelligence and smart hardware.

From an industry point of view, competition in smart speakers is quite fierce. There are 112 voice intelligence companies within a one-kilometer radius in Shenzhen's Nanshan District, and Cheetah Mobile CEO Fu Sheng, an investor in Orion Star, said last year that he would rather go bankrupt than not pursue robotics and artificial intelligence. This shows how firmly the industry believes in the prospects of intelligence.

In fact, drawing on more than ten years of accumulated voice data, the company had long been trying to make smart hardware, but without lasting acclaim. The company was aware that it had no advantage in grasping end users' needs. Jingdong is the largest sales channel for electronic products in China and, beyond its value as a channel, has a stronger grasp of user needs. At the same time, Jingdong Smart Micro-Link is China's first and largest cross-brand smart device interconnection platform, laying the foundation for the smart home. Linglong, which combines the strengths of both, has the technology accumulation, resource accumulation and supply chain support that voice interaction requires, and even holds advantages over other products.

Beyond maintaining and updating the database, the company keeps fine-tuning product features and positioning to meet users' needs.

For example, among existing applications, New Oriental's spoken-language practice is very popular. Through data transmission, teachers can grade in the back end. In the future, interaction may not only be between human and machine but may also extend to interaction between people. Meanwhile, users have shown strong enthusiasm for shaping their own speaker, for example with custom wake-up words, custom voices and custom power on/off modes. These directions are also part of the framework for product optimization.

Meanwhile, Micro-Link has been running since early 2013 and now has a very large user base, and the Alpha intelligent platform that Jingdong is building provides a wealth of content and Internet service resources. So, besides playing songs by voice, another expression of "smart" is using Micro-Link to control curtains, lights and smart appliances, and to reach taxi hailing, express delivery, other third-party applications and shopping. These features mark the initial shape of the home Internet of Things, letting users consume content directly through the cloud without a desktop or an app.

At present, although smart devices have not yet become a rigid necessity, they can significantly improve the quality of family life. From this perspective, every family needs a smart central control device. With commands such as "I got up" or "I went to bed", a single instruction can trigger a chain of responses. In the morning, as soon as the user wakes up and says "I got up", the curtains open automatically, the lights come on, the speaker plays the user's favorite music, and the water heater starts heating...
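The idea of one spoken command fanning out into several device actions can be sketched roughly as follows. This is only an illustration, not how 叮咚 or Micro-Link actually work: the function names, the `SCENES` table and the bedtime actions are all hypothetical, and a real hub would call device APIs instead of returning strings.

```python
from typing import Callable, Dict, List

# Hypothetical device actions. Each one just reports what it would do.
def open_curtains() -> str:
    return "curtains: open"

def lights_on() -> str:
    return "lights: on"

def play_favorite_music() -> str:
    return "speaker: playing favorite music"

def start_water_heater() -> str:
    return "water heater: heating"

def close_curtains() -> str:
    return "curtains: closed"

def lights_off() -> str:
    return "lights: off"

# One phrase maps to an ordered chain of actions.
# The "i got up" chain follows the article; the bedtime chain is assumed.
SCENES: Dict[str, List[Callable[[], str]]] = {
    "i got up": [open_curtains, lights_on, play_favorite_music, start_water_heater],
    "i went to bed": [close_curtains, lights_off],
}

def run_scene(utterance: str) -> List[str]:
    """Run every action bound to the recognized phrase, in order.

    Unknown phrases trigger nothing and return an empty list.
    """
    actions = SCENES.get(utterance.strip().lower(), [])
    return [action() for action in actions]

if __name__ == "__main__":
    for line in run_scene("I got up"):
        print(line)
```

Keeping the phrase-to-actions mapping in a plain table means a new scene can be added without touching the dispatch logic.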

At the same time, in terms of content, the Jingdong Alpha open platform is gradually being perfected, and taxi, laundry, express delivery and housekeeping services can all be accessed. Tang Bo believes that the service platform can gather high-quality resources. The question is how to spare users from downloading an app for each service, so that after a one-time setup each service can be invoked with a single voice command. Opening up this connection is an important direction for future development.

Everyone has a family, and the team members try out each product in their own homes. They feel lucky about the present situation: of China's population of 1.4 billion, 600 million are potential consumers. As long as the products are good, they need not fear a lack of market.

Recently, Zhou Zhengyou returned home from Anhui. “Yibao” was just over 1 year old. Watching his son toddle about, he suddenly remembered how, when the first speaker came off the production line, everyone carried it around every corner of the living room and talked to it as if it were their own child, a living thing.

Like other children, "Erbao" is especially fond of 叮咚. As soon as the familiar melody of "The Snail and the Oriole" plays, he sways left and right to the music. Parents and family gather together while the children run around their knees. In his own living room, he thought of the many users who have put the speaker in theirs. Maybe they, too, are now enjoying the happiness of family life.

Ah, how wonderful...
