I published two IEEE Access papers as an undergrad, one on IoT security, one on cancer detection
I'm a 6th semester CS student at COMSATS University Islamabad. Over the past few months I've been doing deep learning research alongside my coursework…
Latest Architecture news from Tech News
Flash Attention: what it does and why it matters
Your training job is paying for an A100 at $3/hour. The loss is going down, gradients are flowing, an…
When people first hear about Transformers, they often encounter words like Query, Key, Value, and Attention Heads and feel confused. But the main idea…
Modern AI models are becoming increasingly powerful, but their growing capabilities come with rising risks of degradation: the loss of rare patterns, …
Training a robot to pick up an object sounds simple until you realize how many separate systems are involved: a vision model to understand the scene, …
title: The Rise of China's LLMs: A Complete History from 2017 to 2026 published: true description: From Wu Dao 2.0 (1.75T params) to DeepSeek V3 ($5.6…
class TinyTransformer(nn.Module): def __init__(self): super().__init__() # setting the constructor for the initial values that we are ever gonna need…
Part 1: From Scratch to Systems. This machine learning series will be a real ride. It’s an interactive journey where I’ll be sharing and raising lots…
Motivation
Robot manipulation is the ability of a robot to interact with and manipulate objects in the physical world, such as grasping objects, movin…
Inside AlphaEvolve: How Neural Networks and Evolutionary Algorithms Are Self-Optimizing Software
For several years, the role of Artificial Intelligenc…
If you have ever tried to apply Machine Learning to financial time series, you know the heartbreak of the "perfect backtest." You build a model, train…
Greetings all! Continuing the series where I build Rainbow DQN one component at a time on Snake. The first post covered encoding, the second covered m…
The Problem Nobody Talks About
You've spent hours training your neural network. The loss converged, metrics look good, and you're ready to deploy. But…
The Foundation of Modern AI Systems
When people think of tools like ChatGPT, they often assume the intelligence comes from a single powerful system th…
Last week I read the 1958 Rosenblatt paper. The one that started everything. The Perceptron, the first learning machine, the idea that memory lives in…
Deep learning architectures are not random model names. DNN, CNN, RNN, and Transformer each appeared because data has different structure. Images need…
Greetings all! Quick context: this is part of an ongoing series where I'm building Rainbow DQN one component at a time on Snake and measuring what eac…
Deep learning is not just “a neural network with more layers.” That explanation is too small. The real idea is this: Deep learning lets models learn u…
CNNs changed computer vision because they stopped treating images like flat lists of numbers. An image has structure. Pixels near each other usually m…
Agriculture is the backbone of many economies, yet plant diseases continue to cause massive crop losses every year. What if farmers could detect disea…
1. What AGI Actually Requires (A Structural Definition)
In open discussions, “AGI” is often described as: a very large model, a universal problem solv…