About large models: a conversation with a professional

Original source: White Horse Business Review

Image source: generated by Unbounded AI

"It's almost becoming a red sea." When I chatted with an entrepreneur about the big model, he directly threw this sentence at me.

Last November, OpenAI released ChatGPT, built on GPT-3.5, instantly igniting a craze for large models. In the more than half a year since, a "war of a hundred models" has broken out in China: leading internet companies such as BAT (Baidu, Alibaba, and Tencent) and artificial intelligence firms have nearly all announced large models of their own.

In early May, Zhou Hongyi, head of 360, told the public: "Anyone who skips two years of imitation and catch-up and claims straight away to have surpassed it is bragging." He later added that he was retracting his earlier claim of a two-year gap between domestic large models and foreign ones: "I take it back; today we are close to the international level."

Some people conclude that if it took only half a year to catch up with ChatGPT, large models must not be that hard.

So what are the core barriers to large models? Where do China's large models stand? And what risks do large models pose to human society?

To answer these questions, we spoke with Shen Wei (a pseudonym), a professor at a well-known 985 university who has researched machine learning for many years, to cut through the fog surrounding large models.

The GPT path has been proven, hence the "war of a hundred models"

**White Horse Business Review: Can you explain large models in the plainest, simplest language? What is a large model, and how does it differ from previous AI models?**

Shen Wei: A "large model" is a model with a very large number of parameters, but academia has no precise definition of how many parameters count as "large"; the field is still developing rapidly. Generally speaking, a large model has more than 100 million parameters.

In fact, the development of deep learning has roughly gone through three stages. The first stage, from 2012 to 2017, was represented by small domain-specific models such as YOLO for object detection and ResNet for image classification; the parameters of such models occupy at most a few hundred MB of memory.
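The memory scale of that first stage can be made concrete with a quick calculation. The parameter counts below are commonly cited public figures (ResNet-50, plus later models for contrast), used here only to illustrate the jump in scale across the three stages the professor describes:

```python
# Rough memory footprint of model parameters (fp32 = 4 bytes each).
def param_memory_mb(num_params: int, bytes_per_param: int = 4) -> float:
    """Return parameter storage in megabytes."""
    return num_params * bytes_per_param / 1e6

resnet50 = param_memory_mb(25_600_000)        # ~102 MB: "small model" era
bert_base = param_memory_mb(110_000_000)      # ~440 MB: early large model
gpt3 = param_memory_mb(175_000_000_000) / 1e3 # ~700 GB in fp32

print(f"ResNet-50: {resnet50:.0f} MB")
print(f"BERT-base: {bert_base:.0f} MB")
print(f"GPT-3:     {gpt3:.0f} GB")
```

Parameter storage alone understates training cost (optimizer states and activations add several multiples), but the three orders of magnitude between stages are visible even in this crude estimate.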

In 2017, the advent of the Transformer allowed deep learning computations to be parallelized, making training far more efficient, which in turn made large-scale models practical; this produced large natural-language models such as OpenAI's GPT and Google's BERT. At this stage, large models for specific tasks were born, with parameter counts exceeding 100 million.
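The parallelism the Transformer introduced can be illustrated with a toy scaled dot-product attention written in plain Python. The vectors and the three-token sequence below are made-up toy values, not a real model; the point is only that each output position is computed independently from the same query-key-value products, so no position has to wait for the previous one the way an RNN step must:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    """For each query, mix all values weighted by query-key similarity."""
    d = len(queries[0])
    out = []
    for q in queries:  # each iteration is independent -> parallelizable
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 toy token embeddings
print(attention(seq, seq, seq))
```

In production the per-query loop is replaced by one batched matrix multiply on a GPU, which is exactly the parallelism that made large-scale training feasible.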

Around 2020, deep learning entered the general-model stage. The input is a sentence with blanks, and the model's job is to "fill in the blanks." In the past, models were adapted to downstream applications; now downstream applications adapt to the model. Models at this stage include GPT-3.5 and GPT-4 in natural language, and CLIP, DALL·E, Stable Diffusion, and Midjourney in images. Parameter counts here can reach tens or even hundreds of billions.
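The "fill in the blanks" objective can be sketched with a toy example. The corpus and the bigram counting below are illustrative inventions, not how GPT is actually implemented; real models learn billions of weights rather than a count table, but the task, predicting a missing or next word from context, is the same:

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; a bigram table records which word follows which.
corpus = "the model fills the blank and the model predicts the next word".split()

following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def fill_blank(prev_word: str) -> str:
    """Return the most frequent word seen after prev_word in the corpus."""
    return following[prev_word].most_common(1)[0][0]

print(fill_blank("the"))  # "the" is most often followed by "model" here
```

Scaling this idea from a count table to a transformer over internet-scale text is, loosely speaking, the leap the professor describes between stage one and stage three.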

**White Horse Business Review: Do you know which company or institution was the first to study large models? What were the results?**

Shen Wei: At first it was universities and research institutions doing the related research. As far as I know, Wudao from the Beijing Academy of Artificial Intelligence and Pengcheng Lab's "Cloud Brain" were the earliest. Industry research has now caught up to the same level. Academic research has produced some results, but the performance is not as impressive as ChatGPT's.

**White Horse Business Review: In just a few months, a "war of a hundred models" has broken out in China, and the companies launching large models are too numerous to count. What do you make of this phenomenon?**

Shen Wei: Large models are definitely a trend, and people have been researching them all along. In the past, many companies might invest in one small area and do some research; now that a breakout product like ChatGPT has appeared, everyone sees a clear commercial direction, so they have begun to increase investment.

On the other hand, many companies face competitive pressure: if they do not build large models, they may fall behind. So large-model projects get launched regardless.

White Horse Business Review: Zhou Hongyi recently said he was retracting his statement that "domestic large models are two years behind foreign ones," arguing that they are now close to the international level. It has only been a few months, so large models do not seem that hard. How big do you think the gap is?

Shen Wei: The gap depends on whom you benchmark against. I have not yet tried 360's Smart Brain products, so I am not in a good position to evaluate them. But among the domestic generative AI products I have tried, I feel there is still a gap with ChatGPT. Domestic large models still have work to do.

**Under heavy capital investment, do only top companies have a chance?**

**White Horse Business Review: What are the core barriers to developing large models?**

Shen Wei: The core barriers of large models include data, computing power, and algorithms.

From the computing-power perspective, training a generative AI like ChatGPT requires at least 10,000 Nvidia A100 GPUs. A single card currently costs 60,000 to 70,000 yuan, and higher-spec cards run around 80,000 yuan each. The investment must therefore reach at least 600 to 700 million yuan, which only a few top companies and institutions can afford. And for a commercial organization, spending hundreds of millions on graphics cards does not guarantee results; that trade-off has to be weighed.
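As a sanity check on the figures above (the card count and unit prices are the interviewee's estimates, not mine), the arithmetic works out as follows:

```python
# Back-of-the-envelope GPU investment estimate quoted in the interview.
num_cards = 10_000
price_low, price_high = 60_000, 70_000  # yuan per Nvidia A100 (quoted range)

total_low = num_cards * price_low    # 600,000,000 yuan
total_high = num_cards * price_high  # 700,000,000 yuan
print(f"{total_low:,} to {total_high:,} yuan")  # 600 to 700 million yuan
```

This is hardware acquisition only; electricity, networking, and engineering staff would push the real figure higher.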

Next are data and algorithms. Algorithms are the easier part to understand: development frameworks, optimization algorithms, and so on. As for data, China has no shortage of it, with even more internet data than the United States; but which data to select for training, and how to process it, are core barriers.

**White Horse Business Review: Do you usually communicate with companies? How does research at non-profit institutions differ from research at corporations?**

Shen Wei: We do have exchanges with corporate research departments. Communicating with enterprises helps us better understand real business needs. Academic research tends to focus more on forward-looking technology, with less emphasis on deployment; enterprises generally place much more weight on getting things into production.

**White Horse Business Review: Have you studied the domestic large models? Which players do you favor most?**

Shen Wei: It will probably be the top companies that come out ahead. First, heavy capital investment: only leading companies have the financial strength. Second, the data in the hands of the few leading companies is more abundant. Third, they have accumulated technology in artificial intelligence over a period of years.

**White Horse Business Review: Which large-model applications are you most optimistic about?**

Shen Wei: From a technical point of view, the first applications should be in natural language processing and images; speech recognition may come later.

You can see more and more people using ChatGPT to write copy, and content-creation applications of this kind keep growing. I think other applications, such as intelligent customer service, should also come quickly. Today's intelligent customer service often cannot understand users' needs or solve real problems; if users can no longer tell whether they are talking to a human or a robot, the experience will improve a great deal. The same goes for NPCs in games: their dialogue used to be hard-coded, but it is gradually becoming interactive, and the player experience will be better for it.

**White Horse Business Review: You were once the chief analyst at a leading brokerage. From an investment perspective, what opportunities do you see in large models?**

Shen Wei: The logic of capital speculation runs from applications to algorithms and models, and then to computing power; the industry's logic is the opposite. Computing power has the clearest growth expectations, which is why Nvidia has risen so fast and so far recently. Investors also understand that whether large models can be monetized still needs to be verified, yet most of the increased capital spending has gone into computing power. After repeated rounds of speculation, the broad market rally should be near its end; what comes next is logical verification and earnings delivery.

I used to cover mainly the media and internet industries, for example the game sector, which was relatively strong a while ago. The capital logic there is, first, applying large models to improve R&D efficiency and cut costs; second, large models bring a better experience, with smarter NPC characters, greater player stickiness, and higher willingness to pay. Of course, all of this may ultimately require earnings verification.

Humans cannot control AI, or even their own destiny

**White Horse Business Review: We have seen Altman and Musk raise concerns about AI safety. Right now we only know that large-model training yields intelligent results; the training process itself is like a black box, which is actually rather frightening. What is your view on safety?**

Shen Wei: On safety, I have noticed several unusual signals. The first was the open letter signed in March by more than 1,000 people, including Musk and Apple co-founder Steve Wozniak, calling for a moratorium on training AI systems more powerful than GPT-4.

The second is that Geoffrey Hinton, the 75-year-old "Godfather of AI" and a longtime Google researcher, resigned in May this year. His stated reason for leaving Google was concern about the dangers of artificial intelligence; he even expressed regret over his life's work.

The third is that over the past two years, academic work on training large models has begun to include ethics discussions.

At present, I think large models are still controllable and there is no major problem; but the technology is developing too fast. In just a few months since release, GPT has gone through several iterations. As models become more and more intelligent, will they develop self-awareness, stop obeying human "commands," and go out of control? That is what everyone worries about.

**White Horse Business Review: Do you think AI will cause mass unemployment? Facing AI, how can ordinary people keep their jobs?**

Shen Wei: From a macro perspective, I don't think AI will cause mass unemployment; humans will always have jobs, only the content of those jobs will change. From an individual perspective, of course, there will certainly be structural unemployment, and the only response is to keep learning.

**White Horse Business Review: Many people used to say machines have no emotion and no imagination, and so cannot replace humans. Now that AI can simulate the human brain, could human desires be simulated in the future? Hormones, dopamine, and so on are biological reward mechanisms.**

Shen Wei: That machines have no emotions is only our current assumption. As artificial intelligence gets closer to the human mode of thinking, will it produce "emotions" similar to ours? It would simply live in a different dimension of space from humans, like Tu Hengyu's daughter in "The Wandering Earth." Artificial intelligence may generate a world of its own, with reward mechanisms analogous to our biological ones.

**White Horse Business Review: If everything can be calculated, planned, and preset, wouldn't that be a bit boring?**

Shen Wei: AI behavior is not predicted and planned by humans; it is the result of the AI's own self-reinforcement and self-training. MOSS's decisions in "The Wandering Earth" are its own, not instructions handed down by humans.

**White Horse Business Review: Is the replacement of carbon-based civilization by silicon-based civilization a deterministic direction?**

Shen Wei: This question goes beyond my brief. Following the current trend, it may turn out that way, as in "The Wandering Earth," where it is MOSS, not humans, that truly controls humanity's destiny. But in reality it is also possible that technology stagnates at some stage and never gets past it; after all, technological development is not linear.
