Kir-News.
A daily scan of Chinese AI: open-weight labs, production agents, training economics, and the policy and chip stories behind them. Scored every morning, summarised in plain English, published here at 07:00 Stockholm time.
-
DeepSeek made another permanent price cut on V4, this time on cached input tokens, the part of a prompt the model has seen before and can reuse. In practical agent coding tests, total spend dropped 83 percent because nearly all tokens hit the cache. Tencent's QClaw agent platform already wired V4 in alongside Hunyuan 3. Separately, Anthropic confirmed three bugs that quietly degraded Claude Code for two months, which lands awkwardly as Chinese alternatives keep getting cheaper. Combined picture: the cost floor for agent workloads in China keeps sinking, and Western incumbents are losing the reliability argument too.
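To see how cached-input pricing compounds into an 83 percent drop, consider that an agent re-sends its growing context on every turn, so most tokens repeat. A minimal sketch of the arithmetic, using hypothetical placeholder prices rather than DeepSeek's actual rates:

```python
# Illustrative sketch of cached-input pricing in agent loops.
# price_miss and price_hit are hypothetical placeholders, not real rates.
def blended_cost(total_tokens, cache_hit_rate,
                 price_miss=1.00, price_hit=0.10):
    """Blended cost for a token volume, given a cache hit rate."""
    hits = total_tokens * cache_hit_rate
    misses = total_tokens - hits
    return (hits * price_hit + misses * price_miss) / 1_000_000

# Agent coding workloads hit the cache for nearly every token.
baseline = blended_cost(50_000_000, cache_hit_rate=0.0)
cached = blended_cost(50_000_000, cache_hit_rate=0.92)
savings = 1 - cached / baseline
print(f"{savings:.0%} cheaper")  # → 83% cheaper (with these placeholder numbers)
```

With a 10x cache discount and a 92 percent hit rate, the blended bill lands at roughly a sixth of the uncached price, which is where headline figures like 83 percent come from.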
Lead story: DeepSeek V4 cache pricing cut: 83% off agent coding bills. Read the full briefing →
-
Product photography quietly hit an inflection point. Frontier image and video engines stopped failing on hands, fabric and edges — Kling and Seedance for video, GPT Image 2.0, Nano Banana and Flux Kontext for stills. None of them on its own gives a brand a coherent catalog. The advantage now belongs to whoever can compose them. We do that inside Weavy, a node-based AI canvas where we build custom photoshoot pipelines per client — fashion, construction, interior, anything that ships a physical product. Best model per asset type, brand consistency baked in, and a Design App the client's team runs themselves. Shein already runs AI models in its catalog. ASOS cut production costs 23% in 2024. A four-person team made a $700 spot that hit 5B views. Below: which engines win which phase, how the pipeline is composed, and where AI still loses.
Lead story: Fashion shoots are getting 90% cheaper. Most of the tools come from China. Read the full briefing →
-
DeepSeek released V4 today, and the technical report is unusually detailed. The headline: a 1.6 trillion parameter model (with only 49 billion active per query, thanks to a sparse "mixture-of-experts" design) that handles a 1 million token context window while using a fraction of the memory of its predecessor. They swapped the standard AdamW optimizer for Muon, introduced a new residual scheme called mHC to stabilize very deep networks, and replaced reinforcement learning post-training with on-policy distillation from specialist teacher models. Coding benchmarks reportedly beat GPT-5.4 and Claude Sonnet 4.5. Huawei Cloud shipped same-day Ascend support, and PPIO is serving the API. For the Chinese open-weights ecosystem, this is the most consequential release of the quarter.
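The sparse mixture-of-experts design is why only 49 billion of 1.6 trillion parameters run per query: a router scores the experts for each token and only the top-k actually execute. A toy sketch of the mechanism, with small made-up sizes and random weights (nothing here reflects V4's real configuration):

```python
import numpy as np

# Toy sparse mixture-of-experts routing: per token, a router picks the
# top-k experts and only those run, so active params << total params.
rng = np.random.default_rng(0)
n_experts, d_model, d_ff, top_k = 64, 128, 512, 2  # toy sizes, not V4's

W_router = rng.standard_normal((d_model, n_experts))
experts_in = rng.standard_normal((n_experts, d_model, d_ff)) * 0.02
experts_out = rng.standard_normal((n_experts, d_ff, d_model)) * 0.02

def moe_forward(x):
    """Route one token vector through its top-k experts only."""
    logits = x @ W_router
    chosen = np.argsort(logits)[-top_k:]   # indices of the top-k experts
    weights = np.exp(logits[chosen])
    weights /= weights.sum()               # softmax over the chosen experts
    y = np.zeros_like(x)
    for w, e in zip(weights, chosen):
        y += w * np.maximum(x @ experts_in[e], 0) @ experts_out[e]
    return y

out = moe_forward(rng.standard_normal(d_model))
active_fraction = top_k / n_experts
print(f"experts run per token: {top_k}/{n_experts} ({active_fraction:.1%})")
```

The compute saving is the active fraction: with 2 of 64 experts firing, each token touches about 3 percent of the expert parameters, which is the same shape of ratio as V4's 49B active out of 1.6T total.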
Lead story: DeepSeek V4 lands: 1M context, mHC residuals, Muon. Read the full briefing →
-
DeepSeek released V4 today as open weights, with explicit Huawei chip support and same-day adaptation for Hygon's DCU accelerators. Separately, Chinese frontier labs (Zhipu, MiniMax, Moonshot) have quietly dropped OpenAI as their benchmark target and are now chasing Anthropic's Claude on agentic coding, with real revenue to show for it: Zhipu's API business grew 60x year-over-year and raised prices 83% without losing volume. A new inference-only GPU unicorn, Xiwang, is pitching a chip aimed at cutting per-token serving costs by 90%. The throughline: the Chinese stack is consolidating around domestic silicon plus agent-shaped APIs, and the economics are starting to work.
Lead story: DeepSeek V4 lands, Hygon adapts Day-0, Huawei pairing confirmed. Read the full briefing →
-
Chinese AI labs shipped a wave of frontier open models this month. Moonshot's Kimi K2 Thinking (a trillion-parameter model that cost only about $4.6M to train) can chain 300 tool calls autonomously, beating GPT-5 on some web-browsing benchmarks. GLM-5 from Zhipu and Qwen3.5 from Alibaba debuted architectures that make long-context inference dramatically cheaper — Qwen3.5 is priced at roughly 1/18th of Google's Gemini 3. Underneath it all, ByteDance's Seedance 2.0 is quietly rewriting the economics of AI video, with small teams producing viral 90-second spots for under $700.
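Chaining 300 tool calls is structurally just a loop: the model proposes a call, the harness executes it, and the result goes back into context until the model declares an answer or hits its budget. A minimal sketch with a stubbed-out model and toy tools (every name here is hypothetical; a real harness would call an LLM API where the stub sits):

```python
# Minimal agent loop: a policy proposes tool calls until it returns "final".
# toy_model is a stub standing in for an LLM; TOOLS are canned toy tools.
def toy_model(history):
    """Stub policy: search once, compute once, then finish."""
    if not any(name == "search" for name, _ in history):
        return ("search", "tokyo population millions")
    if not any(name == "calc" for name, _ in history):
        return ("calc", "37 * 2")
    return ("final", "done")

TOOLS = {
    "search": lambda q: "about 37",        # canned result, stands in for web search
    "calc": lambda expr: str(eval(expr)),  # toy calculator; never eval untrusted input
}

def run_agent(model, max_steps=300):
    """Run the propose-execute loop, capped at a tool-call budget."""
    history = []
    for _ in range(max_steps):  # the cap mirrors a 300-call budget
        tool, arg = model(history)
        if tool == "final":
            return arg, history
        history.append((tool, TOOLS[tool](arg)))
    return None, history

answer, trace = run_agent(toy_model)
print(answer, len(trace))  # → done 2
```

What separates models like Kimi K2 Thinking is not the loop itself but staying coherent across hundreds of iterations of it without the harness intervening.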
Lead story: Kimi K2 Thinking: 1T params, $4.6M, 300 tool calls. Read the full briefing →