Solving a machine-learning mystery | MIT News | Massachusetts Institute of Technology

RomanFebruary 7, 2023AI and Machine Learning

MIT researchers have explained how large language models like GPT-3 are able to learn new tasks without updating their parameters, despite not being trained to perform those tasks. They found that these large language models write smaller linear models inside their hidden layers, which the large models can train to complete a new task using simple learning algorithms.
— Read on news.mit.edu/2023/large-language-models-in-context-learning-0207

Share this:

Related