Generative Pre-trained Transformer 3 (GPT-3) is an autoregressive language model that uses deep learning to produce human-like text.
It is the third-generation language prediction model in the GPT-n series created by OpenAI, a San Francisco-based artificial intelligence research laboratory. GPT-3’s full version has a capacity of 175 billion machine learning parameters. GPT-3, which was introduced in May 2020, is part of a trend in natural language processing (NLP) systems toward pre-trained language representations.
The quality of the text generated by GPT-3 is so high that it can be difficult to determine whether or not it was written by a human. David Chalmers, an Australian philosopher, described GPT-3 as “one of the most interesting and important AI systems ever produced.”
One of the most powerful features of GPT-3 is that it can perform new tasks (tasks it has never been trained on), sometimes at state-of-the-art levels, after being shown only a few examples of the task.
In another astonishing display of its power, GPT-3 was able to generate “news articles” almost indistinguishable from human-made pieces.
The GPT-3 neural network is so large a model in terms of power and dataset that it exhibits qualitatively different behavior. You do not apply it to a fixed set of tasks that were in the training dataset, retraining on additional data whenever you want to handle a new task. Instead, you interact with it, expressing any task in terms of natural language descriptions, requests, and examples, and tweaking the prompt until it “understands” and meta-learns the new task based on the high-level abstractions it learned during pretraining.
This is a rather different way of using a deep learning model, and it is better thought of as a new kind of programming, where the prompt is now a “program” that programs GPT-3 to do new things.
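The idea of a prompt as a “program” can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular library does it: the helper name build_few_shot_prompt, its parameters, and the labeled "English:/French:" layout are all hypothetical choices made for this example. The code only assembles the prompt text; sending it to a model is deliberately left out.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    task_description: str,
    examples: List[Tuple[str, str]],
    query: str,
    input_label: str = "Input",
    output_label: str = "Output",
) -> str:
    """Assemble a plain-text few-shot prompt (hypothetical helper).

    The prompt is a task description, a handful of worked examples,
    and a final unanswered query for the model to complete.
    """
    blocks = [task_description.strip()]

    # Each worked example shows the model one input/output pair.
    for source, target in examples:
        blocks.append(f"{input_label}: {source}\n{output_label}: {target}")

    # The last block leaves the output empty, so a language model
    # would continue the text from this point.
    blocks.append(f"{input_label}: {query}\n{output_label}:")

    return "\n\n".join(blocks)


# Illustrative task and examples; any task expressible as text would do.
prompt = build_few_shot_prompt(
    task_description="Translate English to French.",
    examples=[("cheese", "fromage"), ("sea otter", "loutre de mer")],
    query="peppermint",
    input_label="English",
    output_label="French",
)
print(prompt)
```

Changing the task here means changing the description and the examples, not retraining anything, which is the sense in which the prompt acts as the program.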