Kimi K2 Thinking: The True Awakening of China's Thinking Model
Kimi K2 Thinking’s open-source release marks China’s entry into thinking models. This article reviews its technical approach and compares it with Claude and Gemini.
Building Efficient LLM Inference with the Cloud Native Quartet: KServe, vLLM, llm-d, and WG Serving
Essential reading for cloud-native and AI-native architects: how KServe, vLLM, llm-d, and WG Serving form the cloud-native ‘quartet’ for large-model inference, covering their roles, how they work together, and ecosystem trends.
The Impact of Istio 1.28 on LLM Inference Infrastructure
Deep dive into Istio 1.28: how InferencePool, Ambient Multicluster, nftables, and dual-stack support enhance observability, reliability, and high-concurrency networking for LLM inference infrastructure.
From Kubernetes to Qwen: How "Open Source" Has Changed in the AI Era
Exploring how open source has transformed in the AI era, from Kubernetes to Qwen, and examining the fundamental differences, and new opportunities, in the open-source strategies of China and the US.