Christopher Bishop's Deep Learning Book: A Comprehensive Overview
Christopher Bishop, a renowned figure in the field of machine learning, has recently authored a new book on deep learning. This book aims to provide a comprehensive introduction to the core concepts underpinning deep learning, catering to both newcomers and experienced practitioners. Given Bishop's established reputation and the rapid evolution of deep learning, this new textbook has generated considerable interest within the machine learning community.
Target Audience and Scope
The book is designed to be accessible to individuals with varying levels of expertise, from newcomers to machine learning to experienced practitioners in the field. The material covers key concepts underlying contemporary architectures and techniques, equipping readers with a robust foundation for future specialization.
The book's structure is well-suited to teaching a two-semester undergraduate or postgraduate machine learning course, while remaining equally relevant to those engaged in active research or in self-study.
Content and Structure
The book is organized into numerous bite-sized chapters, each exploring a distinct topic. The narrative follows a linear progression, with each chapter building upon content from its predecessors. A full understanding of machine learning requires some mathematical background and so the book includes a self-contained introduction to probability theory. However, the focus of the book is on conveying a clear understanding of ideas, with emphasis on the real-world practical value of techniques rather than on abstract theory. Complex concepts are therefore presented from multiple complementary perspectives including textual descriptions, diagrams, mathematical formulae, and pseudo-code.
While some reviewers note that the book shares content with Bishop's earlier work, "Pattern Recognition and Machine Learning" (PRML), particularly in the initial sections, this is not necessarily a drawback. The inclusion of probabilistic models, a strength of PRML, remains a valuable aspect of the new book. That said, the first part reads as a patchwork of older material (mostly from PRML), and it is not always clear why some sections were kept, such as the in-depth discussions of maximum likelihood for certain distributions, or of K-means clustering and EM. The middle part covers the basic neural network layers (convolutional, recurrent, etc.); it is good, but the exposition is quick compared to other books. The final chapters cover generative models and are excellent, as one would expect. Overall, it is a fantastic addition to any library, but, like PRML, it is a more advanced book than most other introductions.
Chapters specific to deep learning appear to be 6-10, 12, 13, and 17-20.
Strengths
- Clarity and Accessibility: The book is praised for its clear and concise explanations, making complex ideas easier to understand. This is particularly beneficial for those new to the field.
- Comprehensive Coverage: The book provides an up-to-date treatment of a good range of topics, covering algorithms for supervised and unsupervised learning, modern deep learning architecture families, as well as how to apply all of this to various application areas.
- Practical Focus: The emphasis on real-world practical value, coupled with multiple perspectives (textual descriptions, diagrams, mathematical formulae, and pseudo-code), enhances the learning experience.
- Mathematical Rigor with Intuition: The book strikes a balance between mathematical rigor and intuitive explanations. It provides intuition through lots of graphs and pictures described with precise math.
- Motivating Key Decisions: The book motivates the key decisions in modern ML. This makes studying it not just informative but genuinely engaging.
- Author Expertise: Chris Bishop's extensive experience in explaining neural networks is evident in his ability to present complicated ideas in the simplest possible way.
Potential Weaknesses
- Overlapping Content: A surprising amount of the material is recycled nearly verbatim from Bishop's earlier book, PRML.
- Typos and Inaccuracies: Some reviewers have noted typos and inaccuracies, raising concerns about attention to detail. For example, in a presentation of the classic proof that there are infinitely many primes, the book never points out that the proof as presented is wrong: the constructed number Q is coprime to the other primes, but not necessarily prime itself.
- Relevance of Mathematical Background: A lot of the mathematical background seems irrelevant as presented. For example, design matrices and the Moore-Penrose pseudoinverse are introduced on page 116 and then never mentioned again.
- Terminology Issues: At several points, terms are introduced in a way that is unclear or inaccurate. For example, page 172 states that networks with more than one layer of learnable parameters are known as feed-forward networks or multilayer perceptrons; however, "feed-forward" should be reserved for acyclic networks, and MLPs would additionally be fully connected. Page 347 says that "a conditional independence property that is helpful when discussing more complex directed graphs is called the Markov blanket or Markov boundary", but these are not interchangeable terms, and neither is clearly defined in the text. The text also refers to "the" Markov blanket rather than "a" Markov blanket, even though Markov blankets are not generally unique; a Markov boundary specifically refers to a minimal Markov blanket.
- Non-Standard Terminology: A very minor quibble is the use of "error function" rather than "loss function". While not incorrect, this feels non-standard and results in E(x) denoting both loss functions and energy functions (and expectations, if you include \mathbb{E}). The book also discusses the error function erf(x), so "error function" means different things depending on the section.
- Limited Coverage of Specific Topics: Some reviewers have expressed a desire for more in-depth coverage of topics like geometric deep learning and flow matching.
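The point about the primes proof can be made concrete with a standard counterexample (a minimal sketch of the mathematics at issue; the specific primes are chosen for illustration and are not taken from the book):

```python
from math import gcd

# Q = 2*3*5*7*11*13 + 1 = 30031 is coprime to every prime in the
# product (Q leaves remainder 1 when divided by each of them), yet
# Q is not prime itself: 30031 = 59 * 509.
primes = [2, 3, 5, 7, 11, 13]
q = 1
for p in primes:
    q *= p
q += 1

assert all(gcd(q, p) == 1 for p in primes)  # coprime to each prime
assert q == 30031 == 59 * 509               # but composite
```

The correct conclusion of the proof is only that Q must have a prime factor outside the original list, not that Q itself is prime.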
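For readers wondering where a design matrix and the Moore-Penrose pseudoinverse would normally be used, the standard application is the closed-form least-squares solution for linear regression. A minimal NumPy sketch with hypothetical toy data (illustrative only, not an example from the book):

```python
import numpy as np

# Design matrix for a toy linear-regression problem: a bias column
# plus one input feature (hypothetical data, not from the book).
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
true_w = np.array([2.0, -3.0])
y = X @ true_w  # noise-free targets, so the fit is exact

# Least-squares weights via the Moore-Penrose pseudoinverse:
# w = pinv(X) @ y minimizes ||X w - y||_2.
w = np.linalg.pinv(X) @ y

assert np.allclose(w, true_w)  # recovers the generating weights
```

When X has full column rank, this is equivalent to solving the normal equations; the pseudoinverse additionally handles rank-deficient design matrices gracefully.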
Author Credentials
Chris Bishop is a Technical Fellow at Microsoft and Director of Microsoft Research AI4Science. He is a Fellow of Darwin College, Cambridge, a Fellow of the Royal Academy of Engineering, and a Fellow of the Royal Society. His co-author, Hugh Bishop, is an Applied Scientist at Wayve, a deep learning autonomous driving company in London, where he designs and trains deep neural networks. He completed his MPhil in Machine Learning and Machine Intelligence at the University of Cambridge.
Alternative Resources
For those seeking alternative or supplementary resources for learning about deep learning, the following options are worth considering:
- "Dive into Deep Learning" (d2l.ai): This book is available for free online and offers a practical, hands-on approach to learning deep learning. The paperback print (color!) quality is good.
- Karpathy's lectures: Karpathy teaches "bottom up": you start from first principles and build on them.
- fast.ai: fast.ai teaches "top down": you start with working examples and gradually peel back the layers to understand them.
- "Understanding Deep Learning" (Prince): Some reviewers have expressed a preference for this book over Bishop's, particularly in the initial chapters.

