Big model battle in full swing? "Small model" may be the way out

巴比特

2023-08-03 05:42:34

Text: Qingcheng Finance; Author: Qing Mu; Editor: Liu Zi

Image source: Generated by Unbounded AI

On July 26, OpenAI launched the Android version of ChatGPT. Although it is currently available only in the United States, India, Bangladesh, and Brazil, OpenAI said it will roll the Android version out to more countries next week. The launch brought ChatGPT, whose popularity had cooled somewhat, back into the public eye.

After its launch, ChatGPT took only two months to pass 100 million users, faster than any application in history. The global technology market, quiet for a long time, came to a boil again, and Chinese investors and entrepreneurs flew to Silicon Valley to seek advice.

Faced with this turbulent AI wave, Chinese entrepreneurs and investors acted quickly. Within a few months, China’s technology industry was in the midst of a spectacular “Hundred Models War.” In the first half of 2023, more than 80 large-model products appeared in China. According to the latest data, 130 companies in the domestic market are already building large models. Globally, more than 400 large models were released in the first half of this year.

While China’s large-model players chase commercial returns and the future of technology, they also carry a patriotic aspiration: to become the Chinese version of OpenAI.

According to news reports on July 24, before the Android version of ChatGPT launched, IDC released an evaluation report on large-model technical capabilities. It showed Baidu’s Wenxin large model 3.5 earning full marks on 7 of 12 indicators and ranking first in the overall score. Wu Tian, vice president of Baidu, said that the new Wenxin Yiyan 3.5 has surpassed ChatGPT 3.5 in capability, an important milestone for related technical work in China.

iFlytek previously announced that it will release the third iteration of its Xinghuo (Spark) large model on October 24, fully benchmarked against ChatGPT, with Chinese-language ability surpassing GPT-3.5 and English ability on par with GPT-3.5.

01 Scenarios, scenarios

In fact, as Li Zhifei, a former Google scientist and the founder and CEO of Mobvoi, has said, China may never have an organization like OpenAI.

Compared with general-purpose large models such as ChatGPT, domestic large-model products pay more attention to applications and scenarios. They are, in other words, vertical or industry-specific large models. On this point, the leading figures in technology and venture capital have said almost the same thing.

Robin Li, the founder of Baidu, has long stated publicly: “It doesn’t make much sense for a start-up to recreate ChatGPT. I think there is a great opportunity in developing applications on top of these large language models. There is no need to reinvent the wheel. Once you have the wheel, you can build cars and airplanes, whose value may be much greater than the wheel’s.”

Zhu Xiaohu, managing director of GSR Ventures, wrote on WeChat Moments: “Don’t be superstitious about general-purpose models, because next year GPT-3.5 will become a commodity (general infrastructure), and in three years GPT-4 will too. For most entrepreneurs, scenarios come first and data is king!”

Fu Sheng, chairman and CEO of Cheetah Mobile, believes large models will follow two roads. One is to keep making the model better and better, which amounts to “building an Einstein.” But many jobs do not require an Einstein; a college graduate can do them. That is the other road, and he believes a large number of people will end up building “civilian large models.”

Zhang Pingan, CEO of Huawei Cloud, said at the Pangu Large Model 3.0 press conference: “The Pangu Large Model has no time to write poems and chat. No matter how many parameters there are and how good the dialogue ability is, if it can’t solve practical problems, it will not be of much use.”

Most of the large models released in China recently target vertical industries, such as the Yanxi large model released by JD and “Ziyue,” a vertical large model for education released by Youdao.

The JD Yanxi large model draws on the knowledge JD has accumulated over many years in retail, logistics, health, finance, and other industries. It is trained on a mix of 70% general data and 30% of JD’s own supply chain data, which gives it capabilities in areas such as product recommendations, financial policies, financial management rules, and logistics experience. Cao Peng, President of the JD Cloud Division, believes that large-model technology cannot generate value directly on its own; it generates real value only when it is applied in a concrete scenario.

Ctrip’s travel model, Wendao, was built by screening 20 billion pieces of unstructured travel data and combining them with Ctrip’s existing structured real-time data and its previously trained chatbots and search algorithms to train a self-developed vertical model. Ctrip also invested substantial manpower in generating and verifying the model’s general travel responses. Liang Jianzhang, founder and chairman of the board of directors of Ctrip, said that Ctrip will spare no effort in investing in large models, with no cap on the amount.

On the application side, Baidu recently began cooperating with Lenovo in the field of AIGC. Lenovo’s custom-order business has fully adopted Baidu’s Wenxin Yige, so consumers can customize the appearance of their laptops through AIGC-themed painting activities on the official website. Huawei Cloud’s Pangu large model and Meitu’s visual large model MiracleVision jointly launched an AI model try-on function, which can improve the efficiency of selling clothing online.

Although a vertical large model does not demand as many parameters or as much computing power as a general-purpose one, it places higher demands on scenarios and data. Developers need professional knowledge and extensive experience with industry applications, and because the tolerance for errors is lower, the AI must be highly stable and reliable. Therefore, the closer a developer is to the vertical industry, the greater its advantage in building a vertical model.

“A general-purpose large model can solve 70%-80% of the problems in 100 scenarios, but it may not meet 100% of the needs of one particular enterprise scenario. If an enterprise fine-tunes an industry large model with its own data, it can build a dedicated model and create a highly available intelligent service. Such a model has fewer parameters than a general-purpose large model, costs less to train and run, and is easier to optimize,” said Tang Daosheng, Senior Executive Vice President of Tencent Group and CEO of its Cloud and Smart Industry Business Group.

From this perspective, “small models” may be more attractive and better able to solve specific problems.

SenseTime has launched a large model with 100 billion parameters and is also launching small models with 10 billion parameters for different vertical fields. The advantage of a large model is that it can find new solutions and help solve new problems. Once a problem is solved, the large model can generate a large amount of data in that narrow field, which is then used to train a small model. Some small models can even run on end devices at lower cost. But the small model would not exist without the big model.

02 The giants take all: where are the opportunities for start-ups?

There is a view in the industry that the Chinese version of ChatGPT can only come from one of five companies: Baidu, Alibaba, Tencent, ByteDance, and Huawei.

The Internet era followed a typical “7-2-1” pattern: the leader lives well, the runner-up barely survives, and third place is in danger.

Right now, a hundred models are battling, and everyone wants a share of the large-model market. But there is a very real problem: in building large models, the big companies have advantages that start-ups cannot match. For a small, lean start-up of three or five people, overthrowing a tech giant is probably just an illusion.

Large models cannot be separated from the cloud platform. Deploying a large model requires continuous fine-tuning and training, all of which run on a cloud platform. Baidu, Alibaba, Tencent, ByteDance, and Huawei all have their own cloud businesses. Baidu and Huawei have also built a full stack from chips to applications: Baidu has “Kunlun chip + PaddlePaddle platform + Wenxin large model,” and Huawei has “Ascend chip + MindSpore framework + Pangu large model.” Start-ups cannot match these advantages.

In addition, large companies have natural advantages in capital reserves, talent, usage scenarios, and data accumulation. A start-up without a deployment scenario cannot iterate on or continuously optimize its technology, and cannot build a data network effect.

So do small companies have no chance at all?

Let’s revisit the metaphor of the gold rush: “This era is very similar to the gold rush era. If you went to California to pan for gold at that time, a lot of people would die. But people who sell spoons and shovels can always make money.” Lu Qi, the founder and CEO of Qiji Chuangtan (MiraclePlus), recently shared this view with entrepreneurs. Lu Qi hopes to help Chinese entrepreneurs recognize this historic turning point, understand where the present era stands, and find their own position in it.

In early July, Stuart Russell, a professor of computer science at the University of California, Berkeley and author of “Artificial Intelligence: A Modern Approach,” warned that AI-powered bots such as ChatGPT could soon “run out of text in the universe,” and that the technique of training bots by collecting large amounts of text is “beginning to run into difficulties.”

Last week, more than 8,500 writers signed a letter asking the leaders of companies including OpenAI, Microsoft, Meta, and Alphabet not to use their work to train AI systems without permission or payment, and asking these companies to compensate them for copyright losses.

The stock of Internet data is close to exhausted, and high-quality data is becoming increasingly scarce. Whether a model is good or bad depends 20% on the algorithm and 80% on the quality of the data. In the “troika” of data, computing power, and algorithms, data is the core, the most enduring, and the most fundamental element. Large models must be fed massive amounts of data to keep improving and iterating.

Going forward, the real value will lie in a sustainable supply of high-quality data. How to keep obtaining data sources that are legal, compliant, and consistent with business logic will become a key factor in improving large-model performance. Data operators may therefore become an important constraint on the development of large models.

Ideally, the model continuously provides services to users, and users continuously generate new data for the model. The next stage of competition will be over private data. More personalized services require more private data, and people are unlikely to hand their private data over to a large model without reservation.

In any era, “selling water” is a good business. Whether you are a pioneer, an explorer, or a gold digger, you cannot do without water. Of course, you can also sell spoons and shovels.

03 Conclusion

Over the past few months, one post has circulated widely on social platforms:

Think of AI as a child. AI in Europe and the United States follows the elite education route: from the day he is born, his family pays for him to study all the way through a doctorate.

China’s AI follows the utilitarian education route: he is raised to survive from birth, and at 15 he is pushed to find ways to earn money for the family and to learn how to sell his skills.

Only a few words, but they reward careful reading.

Although not necessarily accurate, this may explain to some extent why OpenAI and ChatGPT did not appear in China. In fact, some domestic investors and entrepreneurs started out full of confidence and wanted to build the Chinese version of OpenAI. After a few months of struggle, they found that they still needed to find a profit model, explore business application scenarios, and build the ability to commercialize.

It is worth mentioning that some consumer users have recently felt that GPT-4 performs poorly on certain tasks. This is thought to be one consequence of OpenAI using a mixture-of-experts (MoE) model to cut costs and improve efficiency while shifting its focus to enterprise-level services.

Elsewhere, Apple is developing its own large language model, Apple GPT, and Qualcomm is studying how to let models with 10 billion to 15 billion parameters run offline on mobile phones, without cloud processing, by the end of this year.

Large models are a reshaping of productivity, a paradigm shift. Two hundred years ago, humans first used steam engines to convert thermal energy into kinetic energy, and the era of industrialization began. Today, humans use large models to convert electrical energy into brain power and general intelligence, and a new era is opening.

Of course we don’t need too many wheels, but we still need good wheels.

There is a long way to go.
