Fine-Tuning Llama 3.2 3B on Medical QA: Week 4 - When Lower Loss Meant a Worse Model
What Happened This Week Week 3 produced a working fine-tuned model: one epoch, one dataset, a clear improvement over the base model. This week 4 was s…
Latest Testing & QA news from Tech News
Flash Attention: what it does and why it matters Your training job is paying for an A100 at $3/hour. The loss is going down, gradients are flowing, an…
How I got here On principle, you will never catch me parading myself as some sort of expert data scientist. Technically, that's what I do in my day …
Part 1: From Scratch to Systems. This machine learning series will be a real ride. It’s an interactive journey where I’ll be sharing and raising lots…
Motivation Robot manipulation is the ability of a robot to interact with and manipulate objects in the physical world, such as grasping objects, movin…
How does a neural network actually learn to be less wrong? Not the hand-wavy version. The real one. The one with the derivative, the chain rule, and t…
Inside AlphaEvolve: How Neural Networks and Evolutionary Algorithms Are Self-Optimizing Software For several years, the role of Artificial Intelligenc…
If you have ever tried to apply Machine Learning to financial time series, you know the heartbreak of the "perfect backtest." You build a model, train…
"The ability to reason step-by-step is not just a feature. It might be the difference between a language model that sounds intelligent and one that ac…
1. The Problem It Solves Imagine you’re a loan officer at a bank. You have thousands of past loan applications with features like income, credit score…
The Problem Nobody Talks About You've spent hours training your neural network. The loss converged, metrics look good, and you're ready to deploy. But…
Last week I read the 1958 Rosenblatt paper. The one that started everything. The Perceptron, the first learning machine, the idea that memory lives in…
Greetings all! Quick context: this is part of an ongoing series where I'm building Rainbow DQN one component at a time on Snake and measuring what eac…
This is a submission for the Gemma 4 Challenge: Write About Gemma 4 🔬 AI for Scientific Discovery in the Real World: What Gemma 4 Changes The Moment A…
In Week 11 Tenacious-Bench, we trained a LoRA adapter on Tenacious-style B2B sales emails using Supervised Fine-Tuning (SFT). We got a real performanc…