sidebartitle: GPT-3
meta: GPT-3 is a trained neural network with 175 billion parameters, which allow it to be significantly better at text generation than previous models.


[GPT-3](https://arxiv.org/abs/2005.14165) is a neural network that was
trained by the [OpenAI](https://openai.com/) organization with 175 billion
parameters, which allows the model to be significantly better at natural
language processing and text generation than the prior model,
[GPT-2](https://openai.com/blog/gpt-2-1-5b-release/), which only had
1.5 billion parameters.

<img src="/img/logos/openai.jpg" width="100%" alt="OpenAI logo." class="shot rnd">


## What's so special about GPT-3?
The GPT-3 model can generate texts of up to 50,000 characters, with no
supervision. It can even generate creative Shakespearean-style fiction
stories in addition to fact-based writing. This is the first time that a
neural network model has been able to generate texts at an acceptable
quality that makes it difficult, if not impossible, for a typical
person to tell whether the output was written by a human or GPT-3.

## How does GPT-3 work?
GPT-3 is a transformer-based language model. Instead of whole words, it
works with a vocabulary of roughly 50,000 subword tokens produced by
byte-pair encoding, so common words are single tokens while rarer words
are assembled from smaller pieces. Given a sequence of tokens as input,
the network outputs a probability for every token in its vocabulary
being the next one in the sequence.

Text generation is then a loop:

* encode the prompt into tokens
* run the model to get probabilities for the next token
* pick a token, for example by sampling from that distribution
* append the token to the sequence and repeat until enough text is produced
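
GPT-3's real vocabulary is built by byte-pair encoding over a huge corpus;
the toy tokenizer below uses a tiny hypothetical vocabulary, but shows the
same matching idea of preferring the longest known piece at each position:

```python
# Toy greedy subword tokenizer. The vocabulary here is hypothetical and
# tiny; GPT-3's byte-pair-encoding vocabulary has roughly 50,000 entries
# learned from data, but it likewise splits rare words into known pieces.
VOCAB = {"the", "cat", "catapult", "apult", "a", "pult"}

def tokenize(text: str) -> list[str]:
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible piece first, falling back to shorter ones.
        for j in range(len(text), i, -1):
            if text[i:j] in VOCAB:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Unknown character: emit it as its own single-character token.
            tokens.append(text[i])
            i += 1
    return tokens

print(tokenize("catapult"))  # one known token
print(tokenize("thecat"))    # splits into two known tokens
```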

In addition, GPT-3 is able to understand negations, as well as the use
of tenses, which allows the model to generate sentences in the past,
present and future.
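
Under the hood, text comes out one token at a time: at each step the
network assigns a probability to every vocabulary token and one is chosen
to extend the sequence. A minimal sketch of that loop, with a hypothetical
toy model standing in for the real 175-billion-parameter network:

```python
import random

# Hypothetical stand-in for the network: maps the current context to a
# probability distribution over a tiny vocabulary. The real GPT-3 computes
# this with 175 billion parameters over roughly 50,000 subword tokens.
def next_token_probs(context: list[str]) -> dict[str, float]:
    if context[-1] == "the":
        return {"cat": 0.6, "dog": 0.4}
    if context[-1] in ("cat", "dog"):
        return {"sat": 0.7, "ran": 0.3}
    return {"the": 0.5, ".": 0.5}

def generate(prompt: list[str], max_tokens: int, seed: int = 0) -> list[str]:
    rng = random.Random(seed)
    tokens = list(prompt)
    for _ in range(max_tokens):
        probs = next_token_probs(tokens)
        # Sample the next token in proportion to its probability.
        choices, weights = zip(*probs.items())
        token = rng.choices(choices, weights=weights)[0]
        tokens.append(token)
        if token == ".":  # stop early at end of sentence
            break
    return tokens

print(" ".join(generate(["the"], max_tokens=5)))
```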


## Does GPT-3 matter to Python developers?
GPT-3 is not that useful right now for programmers other than as an
experiment. If you get access to [OpenAI's API](https://openai.com/blog/openai-api/)
then Python is an easy language to use for interacting with it, and
you could use its text generation as input into your applications.
Although there have been some initial impressive experiments in
generating code for
[the layout of the Google homepage](https://twitter.com/sharifshameem/status/1283322990625607681),
[JSX output](https://twitter.com/sharifshameem/status/1282676454690451457),
and [other technical demos](https://twitter.com/__MLT__/status/1287357881675853825),
the model will not (yet) put developers who are coding real-world
applications out of a job.
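
For a sense of what interacting with the beta API looks like from Python,
the sketch below builds a completion request and parses a canned response
in the documented shape. No network call is made, since real calls need an
API key; the `davinci` engine name and field layout follow OpenAI's 2020
beta documentation, and the key is a placeholder:

```python
import json

# Sketch of an OpenAI beta API completion call (mid-2020 shape): the
# request is JSON POSTed to the engine's completions endpoint with a
# Bearer token in the Authorization header.
def build_request(prompt: str, max_tokens: int = 64) -> dict:
    return {
        "url": "https://api.openai.com/v1/engines/davinci/completions",
        "headers": {
            "Authorization": "Bearer YOUR_API_KEY",  # hypothetical placeholder
            "Content-Type": "application/json",
        },
        "body": json.dumps({"prompt": prompt, "max_tokens": max_tokens}),
    }

def extract_text(response: dict) -> str:
    # Generated completions come back in the "choices" list.
    return response["choices"][0]["text"]

request = build_request("Write a haiku about Python:")
print(request["body"])

# A canned example response, in the documented shape:
canned = {"choices": [{"text": "Indentation guides the flow"}]}
print(extract_text(canned))
```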


## How was GPT-3 trained?
At a high level, training the GPT-3 neural network consists of two steps.

The first step is creating the vocabulary. Rather than whole words, GPT-3
uses byte-pair encoding: a large corpus of text is scanned and the most
frequent pairs of symbols are repeatedly merged into subword tokens,
until the vocabulary reaches roughly 50,000 entries.

The second step is the actual training. Hundreds of billions of tokens,
drawn from sources such as Common Crawl, books and Wikipedia, are fed to
the network. For each position in a sequence, the model must predict the
next token, and its 175 billion parameters are adjusted by gradient
descent to make those predictions more accurate.

The result of the training is a set of parameters that captures the
statistical patterns of the training text.
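
The prediction objective in the second step can be sketched as a
next-token loss: for each position, the penalty is the negative log of the
probability the model gave to the token that actually came next. The
numbers below are toy values, not real model outputs:

```python
import math

# Toy next-token training signal: predicted_probs[i] is the model's
# distribution for tokens[i + 1]. Lower loss means better predictions;
# training nudges parameters to reduce this value.
def next_token_loss(predicted_probs: list[dict[str, float]],
                    tokens: list[str]) -> float:
    losses = []
    for i, probs in enumerate(predicted_probs):
        target = tokens[i + 1]
        # Tiny floor avoids log(0) for tokens given zero probability.
        losses.append(-math.log(probs.get(target, 1e-9)))
    return sum(losses) / len(losses)

tokens = ["the", "cat", "sat"]
# Hypothetical model outputs for positions 0 and 1:
confident = [{"cat": 0.9, "dog": 0.1}, {"sat": 0.8, "ran": 0.2}]
uncertain = [{"cat": 0.5, "dog": 0.5}, {"sat": 0.5, "ran": 0.5}]

print(next_token_loss(confident, tokens))  # smaller loss
print(next_token_loss(uncertain, tokens))  # larger loss
```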

The model also has properties that improve its ability to generate
coherent text. Its attention mechanism weighs every token in a context
window of up to 2,048 tokens, rather than only the last word, which lets
it stay on topic and refer back to earlier parts of a passage. And
because it works with subword tokens rather than whole words, it can
complete partially written words and handle words it never saw during
training.

While those two steps may sound simple in theory, in practice they
require massive amounts of computation. Training a model with
175 billion parameters in mid-2020 cost in the ballpark of
[$4.6 million](https://lambdalabs.com/blog/demystifying-gpt-3/#:~:text=But%20to%20put%20things%20into,for%20a%20single%20training%20run.),
although some other estimates calculated it could take up
to $12 million depending on how the hardware was provisioned.
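
That ballpark can be reproduced with back-of-envelope arithmetic, using
the common rule of thumb of roughly 6 FLOPs per parameter per training
token. The GPU throughput and hourly price below are assumptions in the
style of the Lambda Labs estimate, not official figures:

```python
# Back-of-envelope training cost estimate for GPT-3.
params = 175e9               # model parameters
tokens = 300e9               # approximate training tokens
flops = 6 * params * tokens  # ~6 FLOPs per parameter per token

v100_flops_per_sec = 28e12   # assumed sustained V100 throughput
seconds_per_year = 365 * 24 * 3600
gpu_years = flops / (v100_flops_per_sec * seconds_per_year)

price_per_gpu_hour = 1.50    # assumed cloud price in USD
cost = gpu_years * 365 * 24 * price_per_gpu_hour

print(f"{flops:.2e} FLOPs, {gpu_years:.0f} GPU-years, ${cost / 1e6:.1f}M")
```

With these assumptions the estimate lands at a few hundred V100 GPU-years
and a cost in the same neighborhood as the $4.6 million figure above.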