Speak at Ray Summit 2025

One of my favorite conferences takes place November 3-5 in San Francisco! This year’s conference spotlights the critical layers of AI development: open source infrastructure, multimodal data, post-training optimization, and scalable ML platforms, highlighted by a full track dedicated to vLLM. This is the definitive gathering for the community of builders, operators, and innovators shaping … Continue reading “Speak at Ray Summit 2025”

“Massive Scrum” of Models: New Data on China’s AI Gold Rush

Inside China’s AI Registry: China’s Cyberspace Administration (CAC) maintains the world’s only comprehensive, publicly accessible registry of generative AI tools (GAT). Every public-facing generative AI service—whether text, image, audio, video, or multimodal—must register before deployment. In April, Trivium posted an Excel file that lists all the GATs in this registry. The Excel file … Continue reading “‘Massive Scrum’ of Models: New Data on China’s AI Gold Rush”

From Demos to Dollars: Quiet Engineering, Big Commercial Pay-offs

Deploying generative AI systems is an engineering discipline rather than a science project. Foundation models and novel prototypes win headlines, but the commercial race will be decided in the production trenches—where reliability, cost, and governance matter more than benchmark scores. These infrastructure shifts are now separating fragile demos from revenue-generating services, and deserve the focus … Continue reading “From Demos to Dollars: Quiet Engineering, Big Commercial Pay-offs”

The “boring” truth about successful AI

From Demos to Dollars: Quiet Engineering, Big Commercial Pay-offs. Deploying generative AI systems is an engineering discipline rather than a science project. Foundation models and novel prototypes win headlines, but the commercial race will be decided in the production trenches—where reliability, cost, and governance matter more than benchmark scores. These infrastructure shifts are … Continue reading “The ‘boring’ truth about successful AI”

RAG’s Next Chapter: Agentic, Multimodal, and System-Optimized AI

While autonomous agents and large-scale reasoning models are currently attracting significant attention and investment, I find that Retrieval-Augmented Generation (RAG) and its variants remain foundational to building practical, knowledge-intensive AI applications. The RAG space isn’t static; it’s continually evolving, offering compelling solutions for real-world AI challenges. Take GraphRAG, for instance—a design pattern that garnered attention … Continue reading “RAG’s Next Chapter: Agentic, Multimodal, and System-Optimized AI”

RAG Reimagined: 5 Breakthroughs You Should Know

RAG’s Next Chapter: Agentic, Multimodal, and System-Optimized AI. While autonomous agents and large-scale reasoning models are currently attracting significant attention and investment, I find that Retrieval-Augmented Generation (RAG) and its variants remain foundational to building practical, knowledge-intensive AI applications. The RAG space isn’t static; it’s continually evolving, offering compelling solutions for real-world AI … Continue reading “RAG Reimagined: 5 Breakthroughs You Should Know”

Time Bought, Advantage Lost? The Limits of Semiconductor Sanctions

The technological and economic competition between the United States and China has increasingly centered on AI capabilities, with semiconductor access becoming the critical battleground. Since 2022, these export controls have evolved from targeted restrictions to a complex regulatory regime with far-reaching implications. I have tracked Washington’s semiconductor export controls since they were first rolled out, … Continue reading “Time Bought, Advantage Lost? The Limits of Semiconductor Sanctions”

Why AI Efficiency Outruns Hardware Shortages

Time Bought, Advantage Lost? The Limits of Semiconductor Sanctions. The technological and economic competition between the United States and China has increasingly centered on AI capabilities, with semiconductor access becoming the critical battleground. Since 2022, these export controls have evolved from targeted restrictions to a complex regulatory regime with far-reaching implications. I have … Continue reading “Why AI Efficiency Outruns Hardware Shortages”

Workflow, Not Wizardry: The Real Levers of AI Success at Work

After two years of breathless predictions about AI transformation, there remains a stark divide between promise and practice. Major tech companies continue their massive infrastructure investments – with capital expenditures approaching 30% of revenues – while many enterprise clients struggle to demonstrate meaningful returns. Recent data shows 42% of companies abandoning most of their generative … Continue reading “Workflow, Not Wizardry: The Real Levers of AI Success at Work”

🚨 New Data Reveals Why Most Gen-AI Pilots Fail

Workflow, Not Wizardry: The Real Levers of AI Success at Work. After two years of breathless predictions about AI transformation, there remains a stark divide between promise and practice. Major tech companies continue their massive infrastructure investments – with capital expenditures approaching 30% of revenues – while many enterprise clients struggle to demonstrate … Continue reading “🚨 New Data Reveals Why Most Gen-AI Pilots Fail”