Farouk's Blog

Chain of Thought Reasoning: Why Does It Work for LLMs?

Posted on March 3, 2025

Fine-tuning has proven to be the most effective way to improve the performance of large language models on domain-specific tasks. While every task may look different to us humans, to a language model they all involve one thing: predicting the next token. We know that each token is generated at a computational cost that is a function of the input length. In essence, tasks that produce an approximately equal number of tokens will require roughly the same computational cost. [Read More]

ai

The Anatomy of an AI Agent

Posted on January 17, 2024

“AI agents” as they are known today (in 2024) are software applications that utilise the capabilities of large language models (LLMs) to intelligently complete tasks and return results. In this post, I will explain what goes on behind the scenes in these kinds of systems, what they are made of and how they operate in more complex cases. Before you read! This post assumes that you already know about language models and have at least seen one working. [Read More]

ai

Reflecting on 7 months of re-learning Machine Learning

Posted on June 15, 2023

In November of 2022, I began to relearn machine learning. You may misunderstand this as simply refreshing or revising my knowledge, but believe me when I say I am actually “relearning” machine learning, and in this post, I will share the motivation for this. The motivation When I discovered machine learning in 2020, I was fascinated by the concept of making predictions based on the patterns identified from past data; I thought anything could be predicted when you throw machine learning at enough data, so I began to learn what I thought there was to know. [Read More]

Machine Learning

Software 2.0

Posted on February 21, 2023

There have been a lot of conversations around Software 2.0 recently: the next generation of software and, perhaps, a new way of building a different class of software. And yet, its meaning is still very much unclear. Some describe it as “software building other software”; others define it as “people building software” by simply feeding it data as opposed to writing actual code. But to me, these definitions say nothing about the kind of software that comes out of the process or the new experience it offers the user. [Read More]

Software 2.0

Why LLMs could be a big deal for HCI

Posted on February 20, 2023

Large Language Models are machine learning models that have been trained on a very large volume of textual data representing sequences of words in different languages. The goal is to predict the most probable next token (word) given a sequence of tokens (words). With this powerful fundamental attribute, they can be adopted in many use cases, including language translation, text summarisation and text classification, among others. A use case which I find particularly interesting is the classification of human-written sentences into actionable intents, which I believe can enable humans to communicate with computers in a more natural way. [Read More]

LLM Artificial Intelligence Software 2.0