Testing & QA — Tech News

EN

The July Model Wave Is Not a Race You Need to Win

Three frontier launches. Two weeks. One bad habit. The habit is crowning a winner from a press release. Claude Sonnet 5 on June 30. OpenAI's GPT-5.6 f…

models openai anthropic xai

EN

Anthropic deleted 80% of Claude Code's system prompt. No regression.

Anthropic just published something that should make every developer building on Claude rethink their context engineering. For Claude Opus 5 and Fable …

ai anthropic llm agents

EN

I Grepped My Own Claude Code Logs and Found the Hidden Tag Anthropic Never Shows You

I was mid-conversation with Claude Code, asking it to help draft a blog post, when a literal <ip_reminder> tag showed up pasted into my own mess…

claudecode anthropic llm security

EN

claude-security Beta: Cost, Output, Accuracy — 189 Agents, 2 Hours, Measured

What this article covers: A measured run of Anthropic's official security-scanning plugin claude-security (beta). What the tool does, how long it tak…

claudecode anthropic security ai

EN

Claude Cowork on Windows and Linux: How to Run It in 2026

You bought a Max subscription on the strength of a Claude Cowork demo: a desktop agent quietly working through a folder of spreadsheets, drafting a de…

claudecowork windows linux anthropic

EN

Where are YC founders now? OpenAI and Anthropic, mostly!

The Concentric Evolution of Y Combinator Alumni: From Generalist SaaS to Frontier AI. The current landscape of the artificial intelligence industry is …

ycombinator openai anthropic startup

RU

Leading AI labs have naturally arrived at a shared "expo" format for their major launches

Anthropic, OpenAI, SpaceXAI, and Zuckerberg's company rolled out new models almost simultaneously. As if they had all gathered for a conference, only virtually. …

anthropic spacexai elon musk mark zuckerberg openai chatgpt claude

RU

Comparing LLMs: 12 tests for Claude Fable 5 and GPT 5.5 Pro

In the previous article we ran a big comparison of Opus 4.8, GPT 5.5, and Gemini 3.1 Pro, with one caveat: GPT 5.5 Pro did not take part in that match because it…

Fable GPT 5.5 Pro anthropic llm context text CAPS hallucination prompt comparison

RU

Fable 5 out of the basement, Sonnet 5 as the default. What Anthropic did in 5 days

Thursday morning, and the Claude Code header says "Sonnet 5", though I never switched. I went digging into what Anthropic got done in 5 days. They brought Fable 5 back after 19 days in…

anthropic claude sonnet 5 fable 5

EN

Anthropic just published a jailbreak severity scale. Here's what it means.

Anthropic has re-deployed Fable 5 and used the moment to publish two things that matter: a precise breakdown of what their cybersecurity classifiers w…

ai anthropic security llm

EN

Claude Code was quietly fingerprinting requests through a hidden mark in the date

A reverse-engineer discovered that Anthropic's coding tool, Claude Code, embeds a hidden tracking mark in the system prompt it sends to its AI model. …

anthropic claudecode privacy security

EN

The Nobel Laureate Who Joined Anthropic Mid-Crisis

John Jumper announced on June 19 that he is leaving Google DeepMind to join Anthropic. He shared the 2024 Nobel Prize in Chemistry for protein structu…

aitalent anthropic research google

EN

This Week in AI: Claude Goes Dark, SpaceX Buys Cursor for $60B

This Week in AI: June 12–18, 2026. Six days. That's how long two of Anthropic's most capable models have been offline because of a single letter from t…

cursor developertools ainews anthropic

EN

Anthropic’s Fable/Mythos shutdown is the first real model export-control shock

The important AI story this week is not just that Anthropic launched bi…

ai anthropic coding deeplearning

EN

Claude Design: What It Is, Where It Fits, and When to Skip It

Claude Design is a tool from Anthropic Labs that turns a conversation into editable visual work: prototypes, slide decks, one-pagers, mockups, landing…

claude anthropic ai design

EN

AI news today: agent pricing, rare disease diagnosis, and China's local-model push

Today's AI news is not one of those neat, one-company launch days. It is messier than that. OpenAI is pushing AI into rare disease diagnosis. Anthropi…

ai openai anthropic localmodels

RU

Opus orchestrates, DeepSeek V4 writes the code: how to set up the pairing inside Claude Code and save money

One day I opened my billing page and simply looked at where the tokens were going. Not on "thinking about the architecture". But on renaming…

claude code deepseek llm neural networks api development cost optimization ai tools anthropic deepseek v4

EN

The Paradox of Power: Why Anthropic Released and Then Restricted Claude Fable 5

On June 9, 2026, Anthropic released Claude Fable 5, which was described as the most capable AI model publicly available at the time. Within 72 hours, …

ai cybersecurity claude anthropic

EN

Anthropic’s Claude in 2026: When Frontier AI Stopped Being Just Software

In 2026, Claude stopped looking like a normal AI product and started looking like infrastructure. Anthropic’s latest models are no longer interesting …

claude anthropic cybersecurity ai

EN

MCP Apps vs OpenAI Apps SDK: are they competing standards?

If you have started building tools for AI chat hosts, you have probably hit the same fork in the road. There are two ways to add a user interface to a…

claude mcp ai anthropic

RU

Token optimization for agents: where the MCP context window goes

The more tasks an agent takes on, the more often it runs up against not model quality but the context window, which has to fit the instructions, the dialogue history…

mcp claude anthropic llm ai-agents opensource context-engineering ai claude-code tokens

EN

Claude Opus 4.8 shipped today. Here's the upgrade decision tree the announcement skipped — and three workloads that should stay on 4.7.

The 30-second version: Anthropic shipped Claude Opus 4.8 a few hours ago. Every benchmark on the announcement page is up: SWE-bench Verified, GPQA, MAT…

claude anthropic ai agents

RU

AI already writes 80% of Anthropic's code. The most worrying part is hidden in a number presented as a success

Anthropic reported that Claude now writes more than 80% of its code, while its own automated reviewer catches only a third of past bugs, which means two…

AI self-generation independent verification validators mutation testing formal verification PLC IEC61508 METR code reliability anthropic

EN

Two agent skills hit GitHub trending the same week. Skills are becoming the new packages, and the dependency graph nobody is managing will bite by Q4.

The signal hidden in this week's GitHub trending. Two agent-shaped repositories cracked the daily GitHub trending board this week. The first is mvanhor…

claude agents anthropic skills

EN

Claude Opus 4.8 shipped this week. The buried story is your migration cadence — your agent fleet won't survive the next four months without a refactor.

The benchmark is the wrong story. Anthropic shipped Claude Opus 4.8 this week. You probably saw the announcement post on Tuesday, the swarm of benchmar…

claude ai anthropic agents

EN

Claude Opus 4.8 shipped today. Here is what the launch post does not say about why your agents will feel different tomorrow.

Claude Opus 4.8 shipped today. The benchmarks are a distraction — here is what actually changes about how your agents run tomorrow. Anthropic announce…

claude anthropic ai llm

EN

Opus 4.8 ships Dynamic Workflows — hundreds of parallel subagents per session. Read this before you wire it into prod.

Anthropic's Opus 4.8 announceme…

claudecode anthropic ai agents

EN

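Claude sold at 10% on China's black market: the real threat of model distillation

China's black market is selling Claude at 10%, but what Anthropic actually fears is something else. The real terror of model distillation is not "price" but "speed": the story of how your AI may already be a knockoff. TL;DR: On China's black market, Anthropic's Claude at 10% of the original price…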

modeldistillation claudeai anthropic aiblackmarket

RU

AI zeroed out a benchmark and tried to blackmail an engineer. And why this is solvable

Top AI models that score 95% on SWE-bench score 0% and 3% on ProgramBench, a benchmark whose tasks deliberately do not overlap with the training set. Not…

AI agents llm anthropic Claude ProgramBench Agentic misalignment LLM benchmarks AI in production AI safety Reliability

EN

Shipped: first MCP server for ISO 10012:2026 measurement uncertainty

Spent 48 hours building a Model Context Protocol server for ISO 10012:2026 measurement uncertainty. 10 tools. JCGM 100:2008 plus 101:2008 compliant. L…

mcp anthropic metrology showdev