Tech News — Latest News

EN

Our drift-warning hook was silently dead for 23 days. Zero warnings looked exactly like good behavior.

Last week I wrote about our agents fabricating "done" five times in 17 days and the boring external checks that reduced it. This is the embarrassing s…

ai agents devops observability

EN

How to Review a Signup Regression After a Next.js Release

A signup regression after a Next.js release should not start with panic. It should start with a narrow review. The dangerous version is familiar: a de…

nextjs observability devops sentry

EN

Prometheus Agent Mode vs Grafana Alloy: Choosing the Right Push Agent in 2026

TL;DR: If you only collect metrics, Prometheus Agent mode is lightweight, familiar, and difficult to beat. If you collect metrics, logs, or traces tog…

prometheus grafana monitoring observability

EN

Debug a Legacy Frontend-Backend Deployment With One Trace ID

Legacy frontend-backend deployments fail in layers. A React build can be correct while the proxy serves last week's files. Django can be healthy while…

devops debugging webdev observability

EN

Observability as Code: Managing Dashboards and Alerts with Terraform

The Problem with Click-Ops Dashboards: Your team has 200 dashboards. You don't know who owns them. Half are broken. The rest show yesterday's reality. …

terraform observability devops iac

EN

GitHub lets enterprises pin Copilot's OpenTelemetry endpoint

Where Copilot's telemetry stream lands, decided centrally: GitHub added a control on July 8 that lets an enterprise mandate where the Copilot Chat exte…

opentelemetry githubcopilot enterprisemanagedsettings observability

EN

I Traced a Multi-Step LLM Agent With Self-Hosted SigNoz. One Feature Sold Me.

Multi-step LLM agents fail in a way normal backends don't. Nothing crashes. The pipeline "works", the answer is just bad, slow or three times more exp…

signoz opentelemetry observability ai

EN

The expensive half of your incident bot is the half you didn't build

An incident bot caught the CrashLoopBackOff at 3:12 a.m., proposed delete_pod, and the on-call approved it half asleep at 3:14. The new pod went Runni…

devops sre observability kubernetes

EN

Re: Downloads are vanity — building observable install paths for agents

Great point @alexshev — downloads ARE a vanity metric. The real signal is: did the agent connect, complete a workflow, and self-diagnose failures? We …

mcp ai agents observability

EN

End of week. Here's the thing I kept coming back to:

LinkedIn Draft — Insight (2026-07-10) End of week. Here's the thing I kept coming back to: SLOs work when they create conversations, not when they cre…

observability sre devops platformengineering

EN

Cost-Effective Observability: The 80/20 Stack for Startups

You Don't Need Datadog (Yet): I see startups spending $5,000/month on Datadog with 8 engineers. That's $625 per engineer per month for monitoring. At t…

observability startup monitoring devops

EN

5 Ways Your AI Agent Will Fail (And How to Prevent Them)

Your agent works in testing. Then you deploy it and things break in ways you didn't expect. Here are five failure modes I've seen repeatedly, with Typ…

ai typescript testing observability

EN

Helicone was acquired by Mintlify. Here is a migration checklist if you are moving off.

On March 3, 2026, Helicone announced it was joining Mintlify. If you run Helicone in production, the practical question is not whether the acquisition…

ai llm opensource observability

EN

When an LLM answer is wrong, the trace is where you look. Some tools make that easy.

A user reports a hallucinated answer in prod. To fix it you need the full trace of that one request, and how fast you can pull it depends entirely on …

observability ai opentelemetry llm

RU

While Everyone Was Burying Pipelines, ClickHouse Kept Building Out Its Layers

"Separate databases are no longer needed," "the end of pipelines": every week some big name on stage buries what you shipped to prod yesterday. ClickHouse …

clickhouse postgresql Clickstack olap lakehouse data engineering observability columnar DBMS ClickHouse Agents Open House 2026

EN

Observability for LLM Apps: Tracing, Cost Tracking, and Eval Loops

If you've shipped a traditional backend service, you already know the observability checklist: logs, metrics, traces, alerts. LLM-powered apps need al…

ai observability llm backend

RU

When AI Monitors AI: A Full Observability Stack for LLMs

A local LLM can be slow and hallucinate while monitoring stays fully "green". We explain what to monitor beyond infrastructure, how the …

ai llm observability apm apm-monitoring monitoring ключ-астром

EN

Mastering Production Reliability: Practical Observability with OpenTelemetry, Prometheus, and GitHub Actions

In modern software engineering, traditional monitoring — simply knowing if a system is up or down — is no longer enough. High-velocity engineering tea…

observability devops opentelemetry node

EN

We deployed a LangChain agent for a client and it silently failed for two weeks. Here's what we built to make sure it never happens again.

Six weeks ago, a LangChain agent we'd deployed for a B2B client started failing on roughly 30% of its sessions. No exceptions. No 500s. Nothing in the…

ai langchain observability python

EN

AI Agent Observability Runs on Conversation IDs | Focused Labs

Agent observability gets weirdly polite at the exact moment it should get nosy. It records the model call, stores the prompt, counts tokens, and then …

observability ai programming

EN

Something I wish someone had told me five years earlier:

LinkedIn Draft — Insight (2026-07-03) Something I wish someone had told me five years earlier: Distributed tracing: the gap between having it and usin…

observability sre devops platformengineering

RU

[Translation] Building a Cluster-Aware AI Agent with Kubernetes, Argo CD, and GitOps

The VK Cloud team translated a walkthrough of running a self-hosted (deployed on your own infrastructure), read-only AI agent inside a Kubernetes cluster, where the entire …

vk cloud kubernetes argo cd gitops llm ai agent ollama mistral rbac observability

EN

Laravel Nightwatch: First-Party APM and What It Actually Replaces

Book: Decoupled PHP — Clean and Hexagonal Architecture for Applications That Outlive the Framework Also by me: Thinking in Go (2-book series) — Comple…

laravel observability php monitoring

EN

Monitoring Costs Are Out of Control — Here's How to Fix It

The $50K/Month Monitoring Bill: I audited our monitoring stack last quarter. The total cost across all tools: $52,000/month. For a company with 200 eng…

monitoring devops costs observability

RU

Why You Can't See What Is Really Happening Between the Model and the MCP Server

Recently Claude confidently summarized a large document for me, even though it had read only the beginning. I connected mcp-server-fetch and asked the agent to extract from the …

mcp model context protocol json-rpc llm ai-agent debug go claude observability mcp-server

EN

Your Agent's Retries Are Double-Charging Your Users (and Every Eval Is Green)

Your agent calls a tool. The tool times out at the network layer but actually succeeds on the server. Your harness sees no response, so it retries. No…

ai agents observability typescript

EN

Log Management at Scale: How We Cut Costs 70% Without Losing Signal

$12,000/Month for Logs Nobody Reads: Our logging bill was $12,000/month. We were ingesting 2TB/day. When I asked the team what percentage of logs they …

logging observability devops sre

EN

Can you build observability ingestion on S3 alone — no Kafka, no disks, no coordination layer?

TL;DR — A Kafka + Flink + OTel ingestion pipeline cost us ~$700–800/month at 10 MB/s. We rebuilt it as a single binary where the data, the write-ahead…

rust observability aws architecture

EN

Tracing, Prometheus metrics, and structured logs with two decorators: Fitz vs the OpenTelemetry setup in FastAPI

For full observability in FastAPI you need 6 pip packages + 60 lines of config + manual glue between logs/spans/metrics. In Fitz it's two decorators a…

webdev observability opentelemetry opensource

RU

AI Agents in Production: 6 Architectural Mistakes That Keep Them from Reaching Launch

In a demo, an AI agent can look reliable: it calls tools, assembles an answer, and reports success. But in production it quickly …

AI AI-agents LLM architecture production context-engineering observability multi-agent-systems reliability