It says it's tailored for beginners, but I don't know what kind of beginner can parse multiple paragraphs like this:
"How wrong was the prediction? We need a single number that captures "the model thought the correct answer was unlikely." If the model assigns probability 0.9 to the correct next token, the loss is low (0.1). If it assigns probability 0.01, the loss is high (4.6). The formula is −log(p), where p is the probability the model assigned to the correct token. This is called cross-entropy loss."
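For what it's worth, the numbers in the quoted passage do check out; a few lines of Python make the formula concrete:

```python
import math

# Cross-entropy loss for a single token is -log(p), where p is the
# probability the model assigned to the correct next token.
def token_loss(p: float) -> float:
    return -math.log(p)

# Confident and correct: low loss.
print(token_loss(0.9))   # ~0.105
# Assigned the correct token only 1% probability: high loss.
print(token_loss(0.01))  # ~4.605
```

So the "0.1" and "4.6" in the excerpt are just −log(0.9) and −log(0.01), rounded.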
I see. The problem with me writing these is that even though I'm not an expert, I do have a bit of knowledge on certain things, so I'm prone to saying things that make sense to me but not to beginners. I'll rethink it.
One of the downsides of using an expert LLM to write for you is that it knows all that perfectly well, even if you don't, and isn't too bothered by such a chunk. It's like reading any Wikipedia article on mathematics... This is the kind of thing that people are documenting in the LLM-user literature as creating an illusion of expertise (or 'illusion of transparency'): because the LLM explains it so fluently, you feel like you understand, even though you don't. Hence new phrases like 'cognitive debt' to try to deal with it.