Someone has finally explained the state of GPT clearly! The latest talk by an OpenAI heavyweight has gone viral, and he is the genius Musk once hand-picked.
Source: Qubit
Following the release of Windows Copilot, it was a single talk that set the Microsoft Build conference alight.
Former Tesla AI director Andrej Karpathy argued in his talk that Tree of Thoughts is similar to AlphaGo's Monte Carlo Tree Search (MCTS)!
Netizens exclaimed: this is the most detailed and interesting guide yet on how to use large language models and GPT-4!
Alongside the talk itself, a set of notes compiled by a Twitter user also went viral: 31 notes in total, with more than 3,000 reposts.
How to train a GPT assistant?
Karpathy's talk is divided into two main parts.
In Part One, he describes how to train a 'GPT assistant'.
Karpathy outlines the four training stages of an AI assistant: pre-training, supervised fine-tuning, reward modeling, and reinforcement learning.
Each stage requires a dataset.
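As a rough mental model, the pipeline can be sketched as four consecutive stages. Every function and dataset name below is an illustrative stand-in, not an API from the talk:

```python
# A minimal sketch of the four-stage pipeline Karpathy describes.
# Names and signatures here are invented for illustration.

def pretrain(model, raw_internet_text):
    """Stage 1: next-token prediction over trillions of tokens of raw text."""
    ...

def supervised_finetune(base_model, prompt_response_pairs):
    """Stage 2: imitate high-quality human-written demonstrations."""
    ...

def train_reward_model(sft_model, human_preference_comparisons):
    """Stage 3: learn to score completions from human rankings."""
    ...

def rlhf(sft_model, reward_model, prompts):
    """Stage 4: optimize the policy against the learned reward."""
    ...
```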
Karpathy supplements each stage with further examples.
What needs to be pointed out clearly here is that the base model is not an assistant model.
The base model can answer questions, but its answers are unreliable; it is the assistant model that should actually be used for question answering. An assistant model, trained on top of the base model with supervised fine-tuning, outperforms the base model at generating responses and understanding text structure.
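A toy illustration of the difference (the completions below are invented for illustration, not real model output):

```python
# Illustrative only: a base model continues text; an assistant answers it.
prompt = "What is the capital of France?"

# A base model samples a plausible continuation of the surrounding "document",
# which may simply be more questions of the same shape:
base_completion = "What is the capital of Germany? What is the capital of Spain?"

# An assistant model, fine-tuned on instruction data, treats the prompt
# as a request and answers it:
assistant_completion = "The capital of France is Paris."

print(base_completion, assistant_completion, sep="\n")
```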
Reinforcement learning is another critical process when training language models.
By training on high-quality, human-labeled data, reward modeling creates a loss function that scores completions. Reinforcement learning then raises the probability of tokens from positively-labeled completions and lowers the probability of tokens from negatively-labeled ones.
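A minimal sketch of what such a reward-model loss can look like, assuming a pairwise comparison setup (a human-preferred completion versus a rejected one); the tensors are toy stand-ins for real reward scores:

```python
import torch
import torch.nn.functional as F

def reward_model_loss(chosen_scores: torch.Tensor,
                      rejected_scores: torch.Tensor) -> torch.Tensor:
    # Pairwise ranking objective: push the score of the human-preferred
    # completion above the score of the rejected one.
    return -F.logsigmoid(chosen_scores - rejected_scores).mean()

chosen = torch.tensor([1.3, 0.2])    # toy scores for preferred completions
rejected = torch.tensor([0.1, 0.5])  # toy scores for rejected completions
print(reward_model_loss(chosen, rejected))  # decreases as ranking improves
```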
For creative tasks in particular, human judgment is crucial to improving the model; incorporating human feedback makes training more effective.
After reinforcement learning from human feedback, an RLHF model is obtained.
Once the model is trained, the next question is how to use it effectively to solve problems.
How to use the model better?
In Part Two, Karpathy focuses on prompting strategies, fine-tuning, the rapidly growing tool ecosystem, and future extensions.
Karpathy gave a concrete example: when a human writes, there is an inner monologue of planning, checking, and revising, whereas a language model simply predicts one token after the next, spending roughly the same amount of compute on each.
Prompting can make up for this cognitive gap.
Karpathy then explains how chain-of-thought prompting works.
For reasoning problems, if you want a Transformer to perform better, you need to let it process information step by step rather than throwing a very complicated problem at it all at once.
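A small sketch of the idea in prompt form; the question and wording are invented for illustration:

```python
question = "A store has 23 apples, sells 9, then buys 12 more. How many are left?"

# Direct prompt: forces the model to produce the answer in very few tokens,
# leaving almost no per-token computation for the reasoning itself.
direct_prompt = f"{question}\nAnswer with a single number."

# Chain-of-thought prompt: spreads the reasoning over many tokens, so each
# next-token prediction only has to make a small step.
cot_prompt = (
    f"{question}\n"
    "Let's think step by step, writing down each intermediate result "
    "before giving the final answer."
)

print(cot_prompt)
```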
Daniel Kahneman, a Nobel laureate in economics, proposed in "Thinking, Fast and Slow" that human cognition comprises two subsystems, System 1 and System 2. System 1 is driven mainly by intuition, while System 2 handles logical analysis.
In plain terms, System 1 is the fast, automatic process, and System 2 is the slow, deliberate one.
This is also taken up in the recent popular paper "Tree of Thoughts".
Karpathy thinks this line of work is very similar to AlphaGo: just as AlphaGo uses Monte Carlo Tree Search to evaluate many candidate moves and keep only the promising ones, Tree of Thoughts maintains multiple reasoning paths and prunes the weak ones.
In this regard, Karpathy also mentioned AutoGPT, a project that chains model calls into an autonomous loop of planning and acting; he sees it as a preview of where things are headed rather than something reliable today.
The contents of the context window act as the Transformer's working memory at runtime. If you can load task-relevant information into that context, the model performs very well, because it has immediate access to it.
In short, you can index related data so that the model can access it efficiently.
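A minimal sketch of that retrieval pattern. The embed function below is a toy stand-in for a real embedding model, so the example runs on its own:

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    # Toy bag-of-characters embedding; a real system would call an
    # embedding model here.
    v = np.zeros(dim)
    for ch in text.encode():
        v[ch % dim] += 1.0
    return v / (np.linalg.norm(v) + 1e-8)

docs = [
    "Transformers store working memory in the context window.",
    "AlphaGo combined policy networks with Monte Carlo Tree Search.",
    "Reward models score completions for reinforcement learning.",
]
doc_vecs = np.stack([embed(d) for d in docs])

query = "How does a transformer use its context window?"
scores = doc_vecs @ embed(query)  # cosine similarity (vectors are unit-norm)
# Pick the highest-scoring chunk; with a real embedding model this is
# the most relevant passage, which then goes into the prompt context.
top = docs[int(np.argmax(scores))]

prompt = f"Context:\n{top}\n\nQuestion: {query}\nAnswer:"
print(prompt)
```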
Finally, Karpathy briefly covered constrained prompting and fine-tuning of large language models. Constrained prompting enforces templates on the model's output, while fine-tuning adjusts the model's weights to improve performance.
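A minimal sketch of the constrained-output side, assuming the model was asked to fill a fixed JSON template; raw_output stands in for a real model response:

```python
import json
import re

# The template the model was instructed to follow (illustrative).
TEMPLATE_KEYS = {"name", "sentiment", "confidence"}

def parse_constrained(raw_output: str) -> dict:
    # Extract the JSON blob, then reject anything that violates the template;
    # a real system would retry or re-prompt on failure.
    match = re.search(r"\{.*\}", raw_output, re.DOTALL)
    if not match:
        raise ValueError("no JSON object in model output")
    obj = json.loads(match.group())
    if set(obj) != TEMPLATE_KEYS:
        raise ValueError(f"output keys {set(obj)} don't match template")
    return obj

raw_output = 'Sure! {"name": "headphones", "sentiment": "positive", "confidence": 0.9}'
print(parse_constrained(raw_output))
```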
About Andrej Karpathy
Later, Musk, one of OpenAI's co-founders, took notice of Karpathy and poached him for Tesla. Partly because of this, Musk fell out with OpenAI completely and eventually parted ways with it. At Tesla, Karpathy headed projects such as Autopilot and FSD.
In February of this year, seven months after leaving Tesla, Karpathy rejoined OpenAI.
Recently, he tweeted that interest in building an open-source large language model ecosystem is surging, in a way that resembles the early signs of a Cambrian explosion.