
Reading: AI Engineering
This book, written in 2024, is a good introduction to the AI Engineering field. Do not expect a deep dive into topics or detailed technical explanations. One reviewer on Goodreads said that it gets too far into the weeds for beginners while lacking in-depth architectural analysis for seasoned readers, and I agree. My paperback edition has 500 pages filled with content, including foundation models (LLMs), prompt engineering, fine-tuning, RAG, model evaluation, agents, how to build systems, how to optimize inference, and more. It mostly takes a boots-on-the-ground approach.
What's AI Engineering? AI Engineering is the next iteration of classical Machine Learning that I learned at university. This time it focuses largely on LLMs (foundation models). AI engineers build systems. They don't spend much time researching new approaches. Instead, they focus on utilizing existing models, adapting them to their requirements, evaluating them, and shipping products. Think of it as being more business- and product-oriented than research-oriented. That's what this book is about.
Of course, the basics are introduced, e.g., tokens, embeddings, the attention mechanism, backpropagation, entropy, perplexity, transformers, self-supervision, etc. I am not sure how hard it would be for someone with no ML background to follow. But with some ML prerequisites, this book is rather easy to read without having to look up external explanations.
Read this book if you want to learn what decisions you will have to make when building an AI system, e.g., model selection. An expensive state-of-the-art model, which may also be slow, for a simple classification problem? Probably not. You will learn about the pros and cons of buying vs building models, what sampling is, and how to change the sampling strategy, etc.
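To make the sampling part more concrete, here is a minimal sketch of my own (not code from the book) showing two common knobs: temperature and top-k. The function names `softmax_with_temperature` and `sample_top_k` are made up for illustration, and the logits are toy numbers rather than real model output.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_top_k(logits, k, temperature=1.0, rng=random):
    """Sample a token index, considering only the k highest-scoring tokens."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax_with_temperature([logits[i] for i in top], temperature)
    return rng.choices(top, weights=probs, k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.1]  # hypothetical scores for three tokens
    print(softmax_with_temperature(toy_logits, temperature=0.5))
    print(softmax_with_temperature(toy_logits, temperature=2.0))
    print(sample_top_k(toy_logits, k=2, temperature=1.0))
```

With a low temperature the most likely token dominates, while a high temperature flattens the distribution and makes the output more varied. Setting k to 1 amounts to greedy decoding.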
There is an entire chapter on evaluation: the challenges during evaluation, common metrics and criteria, benchmarks, how to use AI as a judge, and finally, how to design your evaluation pipeline.
Prompt engineering is explained: how to manage context, what system and user prompts are, and best practices for prompting, such as clear instructions, breaking complex tasks into simple steps, chain-of-thought, prompt improvements, and storing your prompts in version control. Then, jailbreaking and prompt injection are covered, including how to defend against them and common security strategies.
When you reach the limits of context or experience hallucinations, you should check out RAG (retrieval-augmented generation), which simulates long-term memory for LLMs. It's about using data from external memory (documents, tables, chat history, etc.). Multiple search strategies are explained, from term-based retrieval to embedding-based retrieval and using vector databases. You will also learn how to improve retrieval with chunking strategies, splitters, reranking, or query rewriting.
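As a rough illustration of the retrieval side, here is a small sketch of my own (not taken from the book) that splits text into overlapping word chunks and ranks them with a naive term-count score. The helpers `chunk_words`, `term_score`, and `retrieve` are hypothetical names. A real system would more likely use something like BM25 or embeddings, so treat this as a toy.

```python
from collections import Counter


def chunk_words(text, size=50, overlap=10):
    """Split text into chunks of `size` words, sharing `overlap` words between neighbors."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):  # the last chunk already reached the end
            break
    return chunks


def term_score(query, chunk):
    """Count how often the distinct query terms occur in the chunk (case-insensitive)."""
    counts = Counter(chunk.lower().split())
    return sum(counts[term] for term in set(query.lower().split()))


def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks with the highest term score."""
    ranked = sorted(chunks, key=lambda c: term_score(query, c), reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    docs = ["cats purr", "dogs bark loudly", "birds sing"]
    print(chunk_words("a b c d e f g", size=3, overlap=1))
    print(retrieve("dogs bark", docs, top_k=1))
```

The retrieved chunks would then be placed into the prompt as context. The overlap is there so that a sentence cut at a chunk boundary still appears whole in at least one chunk.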
The book also explains what an agent is: which tools are typically available to an agent, what planning is, and how to handle failures and evaluate agents.
If other options are not enough, you use fine-tuning to adapt a model to a specific task. Fine-tuning is the process of further training the whole model or part of it. You will learn when to (or not to) fine-tune, memory considerations, quantization, and fine-tuning strategies (e.g., LoRA).
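To get a feeling for why LoRA helps with memory, here is a back-of-the-envelope sketch of my own (not from the book). It assumes the standard LoRA setup, where a frozen weight matrix of shape d_out x d_in gets a trainable low-rank update B·A, with B of shape d_out x r and A of shape r x d_in. The function name and the layer sizes are hypothetical.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable parameter counts for one weight matrix."""
    full = d_out * d_in            # training the whole matrix
    lora = rank * (d_out + d_in)   # training only B (d_out x r) and A (r x d_in)
    return full, lora


if __name__ == "__main__":
    # Hypothetical layer size, chosen only for illustration.
    full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=8)
    print(f"full fine-tuning: {full:,} trainable parameters")
    print(f"LoRA (rank 8):    {lora:,} trainable parameters")
    print(f"ratio:            {lora / full:.4%}")
```

For this one hypothetical 4096 x 4096 layer, rank 8 means training about 0.4% of the parameters that full fine-tuning would touch, which also shrinks the gradient and optimizer state that has to fit in memory.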
You will learn about dataset engineering, including data curation, quality, coverage, quantity, and acquisition. The book also covers what data synthesis and model distillation are, and how to inspect, deduplicate, and clean data. Remember, training data is the most important part of building a custom AI system.
How to optimize inference is also covered: what AI accelerators are, and how to optimize the model through pruning, caching, and batching.
Lastly, in the final chapter, all the theory is combined, explaining how to build an entire AI system: context construction, putting in guardrails, adding a model gateway, reducing latency with caching, adding agent patterns, using monitoring and observability, and gathering user feedback.
Read this book if you are in the process of building an AI system.