ai, deepseek, machinelearning
title: The Rise of China's LLMs: A Complete History from 2017 to 2026 published: true description: From Wu Dao 2.0 (1.75T params) to DeepSeek V3 ($5.6…
class TinyTransformer(nn.Module): def __init__(self): super().__init__() # setting the constructor for the initial values that we are ever gonna need…
Part 1: From Scratch to Systems. This machine learning series will be a real ride. It’s an interactive journey where I’ll be sharing and raising lots…
Motivation Robot manipulation is the ability of a robot to interact with and manipulate objects in the physical world, such as grasping objects, movin…
How does a neural network actually learn to be less wrong? Not the hand-wavy version. The real one. The one with the derivative, the chain rule, and t…
Inside AlphaEvolve: How Neural Networks and Evolutionary Algorithms Are Self-Optimizing Software For several years, the role of Artificial Intelligenc…
If you have ever tried to apply Machine Learning to financial time series, you know the heartbreak of the "perfect backtest." You build a model, train…
SANA-WM is worth watching for one reason: it combines longer video generation with explicit camera control. Five quick facts: It is an open-source 2.6…
Greetings all! Continuing the series where I build Rainbow DQN one component at a time on Snake. The first post covered encoding, the second covered m…
"The ability to reason step-by-step is not just a feature. It might be the difference between a language model that sounds intelligent and one that ac…
1. The Problem It Solves Imagine you’re a loan officer at a bank. You have thousands of past loan applications with features like income, credit score…
I recently built VoiceIQ — a complete Voice AI pipeline that listens to your voice, thinks using an LLM, and speaks the answer back. The best part? It…
Why does Claude respond faster when you pay more? And why does a longer conversation cost disproportionately more than a short one? For the longest ti…
The Problem Nobody Talks About You've spent hours training your neural network. The loss converged, metrics look good, and you're ready to deploy. But…
The Foundation of Modern AI Systems When people think of tools like ChatGPT, they often assume the intelligence comes from a single powerful system th…
Last week I read the 1958 Rosenblatt paper. The one that started everything. The Perceptron, the first learning machine, the idea that memory lives in…
Deep learning architectures are not random model names. DNN, CNN, RNN, and Transformer each appeared because data has different structure. Images need…
Greetings all! Quick context: this is part of an ongoing series where I'm building Rainbow DQN one component at a time on Snake and measuring what eac…
This is a submission for the Gemma 4 Challenge: Write About Gemma 4 🔬 AI for Scientific Discovery in the Real World: What Gemma 4 Changes The Moment A…
Deep learning is not just “a neural network with more layers.” That explanation is too small. The real idea is this: Deep learning lets models learn u…
CNNs changed computer vision because they stopped treating images like flat lists of numbers. An image has structure. Pixels near each other usually m…
In Week 11 Tenacious-Bench, we trained a LoRA adapter on Tenacious-style B2B sales emails using Supervised Fine-Tuning (SFT). We got a real performanc…
As I started my journey into AI and Data Science, I quickly realized that managing Python environments can be a total headache. Between broken depende…
Agriculture is the backbone of many economies, yet plant diseases continue to cause massive crop losses every year. What if farmers could detect disea…
1. What AGI Actually Requires (A Structural Definition) In open discussions, “AGI” is often described as: a very large model, a universal problem solv…