DevOps — Tech News

EN

Optimizing RAG Pipelines, Migrating AI Agents, and LLM-Powered Troubleshooting

Today's Highlights: This week's highlights cover advanced strategies for…

ai rag automation

EN

Define the state of our agent

Learn how to eliminate LLM hallucinations in career coaching apps using Agentic Workflows and RAG, as seen in the architecture of CVChatly. The …

agents ai career rag

EN

Building Modular AI Agent Features with Pydantic AI Capabilities

If you're building AI Agents with Pydantic AI, understanding Capabilities is invaluable - it's the recommended way to add modular, reusable features t…

ai pydanticai graphrag rag

EN

A Chinese 8B model beat the Western 8B models at Japanese RAG. I still wouldn't put it in the default deployment — and that distinction is the point.

Extends an earlier model-selection benchmark to three model families (Japanese / Western / Chinese) on a Japanese RAG task. Repo + raw results: https:…

llm rag machinelearning japan

EN

Two Pre-Registered Benchmarks for Audit-Native RAG: RAB (EU AI Act 10/12/19) + LRB (Time-Travel Retrieval)

Most RAG demos answer "what's the right chunk?" Very few can answer the two questions a regulator or an auditor will actually ask: Replay this decisio…

rag llm aiact audit

EN

Context Compression Before the LLM: Cutting Tokens Without Cutting Recall

Book: RAG Pocket Guide: Retrieval, Chunking, and Reranking Patterns for Production Also by me: Thinking in Go (2-book series) — Complete Guide to Go P…

rag ai llm python

EN

Query Rewriting Before Retrieval: The Cheap Recall Win Most Skip

rag ai llm howto

EN

AI Agents Level Up Workflows: Terraform MCP, WebMCP, Pinecone Integrations

Today's Highlights: This week showcases significant advancements in AI agent…

ai rag automation

EN

Why my first RAG layer starts in Postgres, not in a standalone vector database

When people say they are "adding RAG" to a workflow, the conversation often jumps too quickly to infrastructure choices. Should this use a vector data…

ai rag postgres architecture

EN

Metadata Filtering Before Vector Search: The Recall Win Nobody Measures

rag ai llm database

EN

Local AI Coding Agents, Secure Production Deployment, and Angular-Specific AI Skills

Today's Highlights: This week's top stories highlight practical wa…

ai rag automation

EN

Build a RAG Pipeline From Scratch (Production Patterns That Actually Matter)

Most RAG tutorials stop at "embed your docs, do a similarity search, stuff the results in a prompt." That gets you a demo. It does not get you somethi…

rag llmengineering vectordatabases embeddings

EN

Token Cost Optimization: How to Cut LLM Inference Spend Without Cutting Quality

There is a version of token cost optimization that I do not recommend: cutting token counts by reducing the quality of your system prompt, your retrie…

ai llm performance rag

EN

AI Agent Security, Open-Source Code Generation, and Frontier Models on Bedrock

Today's Highlights: This week highlights a new security scanner for AI a…

ai rag automation

EN

How to make AI answer questions about your documents, by building RAG from scratch

In the previous post, we talked about context windows. The model has a fixed-size desk and everything has to fit on it at once. When too much is on t…

ai rag tutorial aws

EN

Production-Grade RAG: Why Vector Search Isn't Enough (and How Hybrid Search Fills the Gaps)

Imagine your team just deployed a sleek RAG-based docs assistant for the SaaS platform you develop. In testing, it worked flawlessly. It knows your fu…

ai database llm rag

RU

A Telegram Bot with RAG on Cloudflare Workers: A Knowledge Base Without Vectors or a Database

We build a Telegram bot with RAG search over a knowledge base: no vector databases, no embeddings, no paid infrastructure. Keyword search via Jacc…

telegram-bot cloudflare workers typescript llm jaccard groq telegraf serverless knowledge-base rag

RU

Stop Tormenting ChatGPT. Why Your Prompt Won't Work

Hi, this is Nastya from Cloud.ru. Last time we covered the basics: context, the extended prompt, and roles. In this part we'll discuss what …

rag ai

EN

RAG-Based Testing Series — Part 4: Edge Cases — What Breaks RAG & How to Catch It

"Your users will never read your happy path. They will, however,…

testing ai rag python

EN

AI Output Provenance for SaaS: Trace Answers Before They Become Liability

An AI answer can look clean, confident, and helpful while hiding the exact detail your team will need later: where did this claim come from? For AI Sa…

ai llm rag saas

EN

Build Your RAG System Right the First Time: 6 Decisions That Make or Break It

After debugging 20+ broken RAG systems, I've identified the 6 decisions that determine whether yours works. Here's how to get each one right. The RAG …

rag ai tutorial machinelearning

EN

Your vector memory database remembers everything. That’s exactly the issue.

There is a design assumption baked into almost every vector database and AI memory implementation that sounds reasonable until you watch it grow nodes…

ai vectordatabase memory rag

EN

Citation-Guard: Production RAG Patterns for Regulated Fintech

Naive RAG passes the demo and fails the audit. The citation-guard pattern keeps fintech AI honest: retrieve with citations, quote numbers, abstain whe…

rag llm fintech ai

EN

TradeMemory: An AI-Powered Persistence Layer for Disciplined Trading

Project Documentation: TradeMemory. Exploring Memory-Augmented AI for Trading Journaling. Tech Stack: MERN + Groq (Qwen-3) + Hindsight Cloud Vector SDK …

ai llm rag showdev

RU

When a Company Should Build Its Own LLM Cluster Instead of Using External APIs

In the early stages of adoption, LLMs in a company look like a quick win: an external API (for example, ChatGPT) is connected, work with text speeds up, …

llm on-premise ai kubernetes mlops model inference gpu clusters rag llama 3 ai infrastructure enterprise ai

EN

I Built a Production RAG System on My M1 Mac for $0

Most RAG tutorials stop at "it answers questions." But answering questions is table stakes. The re…

rag mlops ai python

EN

Search bug or model bug - testing a RAG system to tell them apart

I'm an automation tester. Usually my job is simple: the same input should give the same output, every time. Language models don't work that way. Ask t…

ai python testing rag

EN

FreshContext in agent workflows: judgment at the context handoff

After sharing FreshContext publicly, a few comments helped me sharpen where it fits. …

ai rag agents productivity

EN

Building Your "Longevity Knowledge Graph": Stop Ignoring 10 Years of Health Reports with GraphRAG and Neo4j

We’ve all been there: every year, you get a physical, receive a thick PDF full of blood markers, glance at the "normal range" checkmarks, and toss it …

ai rag opensource discuss

EN

Benchmarking AI Agents, Gemma 4 On-Device Workflows & AI System Security

Today's Highlights: This week, we dive into critical aspects of applied AI…

ai rag automation