The Four Wars of the AI Stack - Dec 2023 Recap
Latent Space

Published on Jan 26, 2024

The Data War - with OpenAI announcing a partnership with Axel Springer (see also its deal with the AP and its Data Partnerships program), the NYT bringing a lawsuit against OpenAI demanding shutdown of all GPTs, and Apple now offering $50m for data contracts with publishers. Meanwhile there is an undeniable rise in interest in synthetic data both at NeurIPS and at DeepMind.

The GPU/Inference War - with the price per million Mixtral output tokens starting at ~$2 and rapidly racing down to $0.27 in a week (details below), and fresh benchmark drama between Anyscale and other inference providers. Research into new model architectures (Mamba, RWKV) and efforts to move compute off Nvidia (Modular, tinycorp, Apple MLX) aim to make more out of existing GPU resources.

The Multimodality War - with Midjourney soft-launching v6 and a web UI, and now reportedly making over $200m/yr, Assembly AI raising a $50m Series C for “building the Stripe for AI models”, Replicate (historically Stable Diffusion-centric) raising a $40m Series B to serve AI Engineers, and Suno AI coming out of stealth and returning to monkey - all steady point solution improvements while OpenAI and Google continue work on God Models that compete with all of them at once.

The RAG/Ops War - the debate on whether you need a Vector DB, vs power users adopting new vector DBs like turbopuffer; the debate between LangChain (now at v0.1, with a TED talk and a State of AI survey) vs LlamaIndex (now with Step-Wise Agent Execution); and continuing LLMOps developments (HumanLoop’s new .prompt file, Openlayer) vs framework-driven tooling like LangSmith and new approaches like Martian (who announced their $9m seed).

The above “wars” are selected for being essential components in the “AI stack” where major money is being made and deployed.

Full notes: https://www.latent.space/p/dec-2023

Chapters:
00:00 Intro
01:42 The Four Wars of the AI stack: Data quality, GPU rich vs poor, Multimodality, and RAG/Ops
03:35 Selection process for the four wars and notable mentions
08:11 The end of low background tokens and the impact on data engineering
10:10 The Quality Data Wars (UGC, licensing, synthetic data, and more)
21:44 The GPU Rich/Poors War
26:29 The math behind Mixtral inference costs
34:27 Transformer alternatives and why they matter
41:33 The Multimodality Wars
45:40 Multiverse vs Metaverse
54:00 The RAG/Ops Wars
1:00:00 Will frameworks expand up, or will cloud providers expand down?
1:05:25 Syntax to Semantics
1:07:56 Outer Loop vs Inner Loop
1:11:00 Highlight of the month
