Generative Pretrained Transformer - How is GPT pretrained?


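GPT is pretrained with causal (autoregressive) language modeling: given the words so far, the model is trained to predict the next word in the sequence. This is different from masked language modeling, the method used for BERT-style models, in which some words in the input are concealed and the model learns to recover them from context on both sides. In GPT nothing in the input is hidden and filled back in. Instead, an attention mask stops each position from seeing any later position, so the model must predict every next word using only the context to its left, and training minimizes the cross-entropy between its predicted distribution and the word that actually follows.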
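
The next-word objective can be illustrated with a small sketch. It is not GPT's actual training code: the helper names (next_token_pairs, causal_mask, average_nll) are hypothetical, whole words stand in for the subword tokens a real model uses, and the probabilities passed to the loss are made up rather than produced by a network.

```python
import math

def next_token_pairs(tokens):
    """Return (context, target) pairs: each prefix is paired with the token after it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

def causal_mask(n):
    """Build an n x n lower-triangular mask: position i may attend to positions 0..i only."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

def average_nll(target_probs):
    """Average negative log-likelihood (cross-entropy) of the correct next tokens.

    target_probs holds the probability the model assigned to each true next token.
    """
    return -sum(math.log(p) for p in target_probs) / len(target_probs)

# Toy sequence; a real model would use subword token IDs instead of words.
tokens = ["the", "cat", "sat", "down"]
pairs = next_token_pairs(tokens)
mask = causal_mask(len(tokens))

for context, target in pairs:
    print(context, "->", target)

# Assumed (made-up) probabilities for the three true next tokens.
loss = average_nll([0.5, 0.25, 0.8])
print("loss:", loss)
```

Every position in the sequence supplies its own training example, so a single pass over a text yields one prediction per word. The loss falls toward zero as the model puts more probability on the word that actually comes next.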