
Chain of Thought Reasoning: Why Does It Work for LLMs?
Fine-tuning has proven to be one of the most effective ways to improve the performance of large language models on domain-specific tasks. While every task may look different to us humans, to a language model they all involve the same thing: predicting the next token. Each token is generated at a computational cost that depends on the length of the context, i.e. the prompt plus the tokens generated so far. In essence, tasks that produce approximately equal numbers of tokens require roughly the same amount of computation.
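
To make that concrete, here is a rough back-of-the-envelope sketch (not a precise cost model, and not from the original post): per-token forward-pass FLOPs for a dense decoder-only transformer are commonly approximated as about twice the parameter count, plus an attention term that grows with the current context length. The model shape (7B parameters, 32 layers, hidden size 4096) and the prompt/answer lengths below are hypothetical, chosen only to show that a longer chain-of-thought answer simply buys the model more total compute on the same problem.

```python
def flops_per_token(n_params: float, n_layers: int, d_model: int, context_len: int) -> float:
    """Approximate forward-pass FLOPs to generate a single token."""
    param_flops = 2 * n_params                          # matmuls against the weights
    attn_flops = 2 * n_layers * d_model * context_len   # attention over the current context
    return param_flops + attn_flops


def total_flops(n_params: float, n_layers: int, d_model: int,
                prompt_len: int, tokens_generated: int) -> float:
    """Total FLOPs to generate `tokens_generated` tokens after a prompt."""
    return sum(
        flops_per_token(n_params, n_layers, d_model, prompt_len + i)
        for i in range(tokens_generated)
    )


# Hypothetical 7B-parameter model: a terse answer vs. a chain-of-thought answer.
short = total_flops(7e9, 32, 4096, prompt_len=200, tokens_generated=10)
cot = total_flops(7e9, 32, 4096, prompt_len=200, tokens_generated=300)
print(f"short answer: {short:.2e} FLOPs, chain of thought: {cot:.2e} FLOPs")
```

Under these assumptions the chain-of-thought answer costs roughly 30x the compute of the terse one, which is the intuition the post builds on: generating more tokens is the model's only lever for spending more computation on a harder problem.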
[Read More]