- In the midst of GPT-4 speculation surrounding NeurIPS 2022 in New Orleans, OpenAI introduced text-davinci-003, a new member of the GPT-3 family of AI-powered large language models, which reportedly outperforms its predecessors by handling more complex instructions and producing higher-quality, longer-form content.
Amid GPT-4 speculation surrounding NeurIPS 2022 in New Orleans this week (including murmurs that GPT-4 details would be published), OpenAI made significant news of its own.
On Monday, the company introduced text-davinci-003, a new member of the GPT-3 family of AI-powered large language models and part of the “GPT-3.5 series”, which reportedly outperforms its predecessors by handling more complex instructions and producing higher-quality, longer-form content.
According to a new Scale.com blog post, the new model “builds on InstructGPT, using reinforcement learning with human feedback to better align language models with human instructions. Unlike davinci-002, which uses supervised fine-tuning on human-written demonstrations and highly scored model samples to improve generation quality, davinci-003 is a true reinforcement learning with human feedback (RLHF) model.”
An early demo of ChatGPT offers some safeguards
Meanwhile, OpenAI launched an early demo of ChatGPT, an interactive, conversational model that is also part of the GPT-3.5 series and whose dialogue format “makes it possible for ChatGPT to answer follow-up questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests.”
A new OpenAI blog post said that the research release of ChatGPT is “the latest step in OpenAI’s iterative deployment of increasingly safe and useful AI systems. Many lessons from the deployment of earlier models like GPT-3 and Codex have informed the safety mitigations in place for this release, including substantial reductions in harmful and untruthful outputs achieved by the use of reinforcement learning from human feedback (RLHF).”
ChatGPT has “limitations”
OpenAI described ChatGPT’s “limitations” in a blog post, including the fact that it sometimes gives answers that sound plausible but are incorrect or nonsensical.
The blog further says, “Fixing this issue is challenging, as: (1) during RL training, there’s currently no source of truth; (2) training the model to be more cautious causes it to decline questions that it can answer correctly; and (3) supervised training misleads the model because the ideal answer depends on what the model knows, rather than what the human demonstrator knows.”
OpenAI emphasized that ChatGPT will “sometimes respond to harmful instructions or exhibit biased behavior. We’re using the Moderation API to warn or block certain types of unsafe content, but we expect it to have some false negatives and positives for now. We’re eager to collect user feedback to aid our ongoing work to improve this system.”
Sam Altman, OpenAI CEO, calls language interfaces a “big deal”
On Twitter, OpenAI CEO Sam Altman said that language interfaces “are going to be a big deal, I think. Talk to the computer (voice or text) and get what you want, for increasingly complex definitions of “want”!” He cautioned that ChatGPT is an early demo with “a lot of limitations–it’s very much a research release.”
Still, Altman added, “This is something that scifi really got right; until we get neural interfaces, language interfaces are probably the next best thing.”