Tech News
Architecture

Latest Architecture news from Tech News

EN

Hermes-Crew Hybrid: A Hybrid Architecture for Secure Multi-Agent AI Workflows

I built a hybrid system that combines a central orchestrator (Hermes) wi…

ai, security, crewai, ollama
Dev.to Jun 15, 2026, 03:16 UTC
EN

Open Notebook Review: Self-Hosted NotebookLM Alternative

Originally published on andrew.ooo — visit the original for any updates, code snippets that aged out, or follow-up posts. TL;DR Open Notebook (by Luis…

opennotebook, notebooklmalternative, selfhosted, ollama
Dev.to Jun 9, 2026, 10:10 UTC
EN

Doubling Qwen3.6-27B on One RTX 3090: ollama → llama.cpp + MTP, Lever by Lever (35.7 → 80.2 tok/s)

A reader on my last post said Ollama was leaving a lot on the table — that a tuned backend with multi-token prediction (MTP) could roughly double my 3…

ollama, llm, performance, machinelearning
Dev.to Jun 9, 2026, 09:21 UTC
EN

Fitting WhisperX large-v3 + a 24B LLM on one 3090: a reproducible context-capping recipe

This is the technical, reproducible version of a fix I shipped on my own homelab. If you want the narrative version, that's on Medium. This one is the…

homelab, ollama, localllm, devops
Dev.to Jun 3, 2026, 03:35 UTC
EN

Do You Have a Homelab? Secure Your Local LLM Artifacts

We used to build homelabs around Linux servers, Docker containers, and NAS drives. It was about uptime, RAID levels, and monitoring CPU temps. Now, th…

homelab, llmsecurity, sbom, ollama
Dev.to Jun 2, 2026, 09:24 UTC
EN

Local-first: a Model on Your Own Machine, Zero Cloud

This is the concrete, runnable walkthrough for Post 1 of the Portway series. The goal: stand up a single model behind an OpenAI-compatible endpoint o…

ollama, python, ai, llm
Dev.to May 30, 2026, 18:27 UTC
EN

I Tried Building a Complex Security Tool with a 1.5B Local Model — Here's What Broke

Problem: I had aider running on Lubuntu, three API keys configured, a detailed architecture diagram, and a clear goal — build a modular forensic data …

ollama, aider, localai, cybersecurity
Dev.to May 27, 2026, 20:06 UTC
EN

Tesla P40 in a Homelab: 24GB of Inference on a Budget

The Tesla P40 is a seductive piece of hardware: 24GB of VRAM for a fraction of the cost of a modern RTX card. But after three weeks of fighting with i…

teslap40, nvidia, proxmox, ollama
Dev.to May 25, 2026, 16:15 UTC
EN

Building a Private RAG System: Lessons from a Local-First AI Journal

Most AI apps quietly send your data to the cloud. DiaryGPT does the opposite — and this is the full technical story. The Problem With AI + Private Dat…

ai, privacy, ollama, llm
Dev.to May 23, 2026, 10:19 UTC
EN

Using Ollama with the Laravel AI SDK: Run Local LLMs for Free

Originally published at hafiz.dev. API costs add up fast during AI development. You prompt an agent 50 times debugging a tool, that's 50 API calls. You…

laravel, aisdk, aidevelopment, ollama
Dev.to May 18, 2026, 05:11 UTC
EN

I shipped local LLM features two months ago. Production never ran them once.

This is a submission for the Gemma 4 Challenge: Build with Gemma 4. Two months ago I shipped local-LLM features in TextStack, an open-source reader fo…

devchallenge, gemmachallenge, gemma, ollama
Dev.to May 12, 2026, 11:23 UTC
EN

No More Hallucinated Citations: A Domain-Specific RAG System with Ollama, ChromaDB and AI Agents

TL;DR: I built a full-stack knowledge pipeline around a corpus of 2,514 academic PDFs focused on urban art. The system combines ChromaDB vector search…

rag, ollama, chromadb, aiagents
Dev.to May 11, 2026, 02:27 UTC
EN

Local LLMs in 2026: What Actually Works on Consumer Hardware

Local LLMs in 2026 work on three hardware lanes: 32-core CPU with 64GB+ RAM hits 10-25 tokens per second on Qwen 3 14B, an RTX 4090 hits 30-80 tokens …

ai, localllm, ollama, qwen
Dev.to May 10, 2026, 11:36 UTC
EN

pgvector + Ollama Setup

RAG Without the Chatbot: pgvector + Ollama for Operational Data. Most RAG tutorials start with "upload a PDF and ask questions about it." That's fine f…

java, langchain4j, ollama, postgres
Dev.to May 6, 2026, 23:00 UTC
EN

[Day 3] I Had a Local LLM Analyze a Year of My Credit Card Statements

Day 3: I'm going to hand a year of credit card statements over to a local …

localllm, ai, dgxspark, ollama
Dev.to May 5, 2026, 22:52 UTC
EN

Build a RAG agent with LangChain and Ollama

I started where a lot of us do: a LangChain RAG walkthrough. You chunk some text, embed it, retrieve top‑k chunks, and wire an LLM to answer questions…

python, rag, langchain, ollama
Dev.to May 5, 2026, 04:12 UTC
RU

Gefestych: Our Experience Automating Code Review with an LLM. Pitfalls, Solutions, Code

Hello, Habr! My name is Danil Chechkov, and I am the Team Lead of the High End Meta Backend team at Lesta Games. We handle the entire web side of Mir Korabley. …

llm, pydantic-ai, openwebui, llama.cpp, ollama, rag, code review, self-hosted, atlassian
Habr May 4, 2026, 11:08 UTC

© Tech News — Headline Aggregator