Teaching Machines How to Draw

Teaching Machines to Draw: A Problem Solver’s Journey. This blog post is about using machine learning for image generation. However, its goal is not to serve as a comprehensive guide to the field or a formal treatise on the subject. It is not even intended to be historically accurate or to teach you the mathematical formulas behind the techniques. Instead, it is a fictional story written from the perspective of a curious explorer who is attempting to solve a challenging problem....

February 13, 2025 · 17 min · 3580 words · Victor Liu

Attention Mechanism in Transformers Derived with Statistical Mechanics Interpretation

A full explanation and interpretation of scaled dot-product attention via the Boltzmann distribution and probability theory.

September 23, 2024 · 20 min · 4099 words · Victor Liu

Stochastic Gradient Descent

An overview of Stochastic Gradient Descent (SGD).

September 22, 2024 · 6 min · 1222 words · Victor Liu

Learning to Do Math With LLMs

I want to comment on the recent results achieved by OpenAI’s new o1 model family. Specifically, I will be discussing formal verification, which can be applied to mathematics and computer programming. The name of this post is inspired by their September 12, 2024 blog post titled “Learning to Reason with LLMs”. The abstract reads: “We are introducing OpenAI o1, a new large language model trained with reinforcement learning to perform complex reasoning....

September 15, 2024 · 6 min · 1099 words · Victor Liu

Overview of Diffusion Models

An overview of diffusion models, including their mathematical foundations, key concepts, and practical applications.

September 14, 2024 · 11 min · 2162 words · Victor Liu

Machine Learning as Interpolation

Exploring how machine learning models generalize by interpolating on data manifolds, with concrete examples from image and language processing.

September 13, 2024 · 6 min · 1154 words · Victor Liu

Multi-Head Latent Attention

A short post on Multi-Head Latent Attention as presented in the DeepSeek-V2 paper.

September 13, 2024 · 3 min · 501 words · Victor Liu

Meaningful Feature Learning in Models

An overview of the challenges and solutions for learning meaningful features in machine learning models.

September 12, 2024 · 3 min · 528 words · Victor Liu

Expressiveness of Sparse Matrices

An evaluation of the expressiveness of sparse matrices in machine learning.

September 8, 2024 · 10 min · 2125 words · Victor Liu

Strassen's Algorithm in Ternary Matrices

An explanation of the potential benefits of Strassen’s algorithm in ternary matrices.

September 7, 2024 · 7 min · 1417 words · Victor Liu