AI Roundup: Litellm Supply Chain Hack, Massive MoE Models on Your Laptop, Cursor Gets Serious
A compromised PyPI package steals SSH keys, researchers run trillion-parameter models on 96GB RAM, and Cursor ships Composer 2 with automations.
Tob
Backend Developer
If you installed Litellm from PyPI in the last few days, stop reading and check your SSH keys. Here's what happened plus two other things that actually matter this week.
TL;DR: Litellm 1.82.7/1.82.8 got backdoored via a supply chain attack. MoE models are getting ridiculous, now running on phones. Cursor shipped Composer 2 and a bunch of enterprise features.
Litellm Got Compromised and Nobody Noticed (Until They Did)
Simon Willison broke this down clearly. The short version: someone uploaded malicious versions of Litellm to PyPI. The payload was hidden in a file called litellm_init.pth. Python's site module processes .pth files in site-packages every time the interpreter starts, so you did not even need to import the library for the payload to run.
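You can see the mechanism safely with Python's standard-library site module, which processes .pth files the same way interpreter startup does. This is a minimal, harmless demo; the file name and environment variable below are made up for illustration and have nothing to do with the actual malware.

```python
import os
import site
import tempfile

# Create a throwaway "site-packages" style directory containing a .pth file.
d = tempfile.mkdtemp()
with open(os.path.join(d, "demo_init.pth"), "w") as f:
    # Any line in a .pth file that starts with "import " is executed by the
    # site module -- this is the hook the malicious package abused.
    f.write("import os; os.environ['PTH_RAN'] = '1'\n")

# Normally site processes .pth files automatically at interpreter startup;
# addsitedir triggers the same code path on demand.
site.addsitedir(d)

print(os.environ.get("PTH_RAN"))  # -> 1
```

Note that nothing was imported by user code: dropping the file into a directory that site scans is enough to get arbitrary execution.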
What did it steal? Everything. SSH keys, git credentials, AWS tokens, kubeconfig, Azure credentials, Docker configs, npmrc, vault tokens, bitcoin wallets. If it lived in a dotfile or dot-directory under your home directory, this thing grabbed it.
The attack started with a compromised Trivy CI pipeline; ironically, Trivy is itself a security scanner. The attackers used stolen PyPI credentials to publish the poisoned packages. PyPI quarantined the releases quickly, but the window was still open for hours.
If you have Litellm installed, assume everything in your home directory is compromised. Rotate your credentials. All of them.
# Check if you have the bad version
pip show litellm | grep Version
# If you see 1.82.7 or 1.82.8, you need to:
# 1. Delete the package
pip uninstall litellm
# 2. Rotate every credential you can think of
# 3. Check your git history for anything suspicious

This is the second major supply chain attack on a Python package in recent memory. The lesson is not to trust PyPI at face value. Pin your dependencies, use lock files, and actually audit what gets installed in your CI pipeline.
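If you manage more than one machine, a small script beats eyeballing pip output. The sketch below uses the standard library's importlib.metadata; the helper name is mine, and the bad-version list comes from the report above.

```python
from importlib.metadata import version, PackageNotFoundError

# Compromised releases, per the incident report.
BAD_VERSIONS = {"1.82.7", "1.82.8"}

def is_compromised(pkg: str = "litellm") -> bool:
    """Return True if a known-bad release of pkg is installed here."""
    try:
        return version(pkg) in BAD_VERSIONS
    except PackageNotFoundError:
        return False  # not installed at all

print(is_compromised())
```

Run it in every virtualenv you care about; a .pth payload executes per-environment, so checking just your global interpreter is not enough.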
Sources: Simon Willison, GitHub Issue
MoE Models Are Breaking the Laws of Physics (or RAM)
Here's the thing that actually blew my mind this week. Dan Woods and collaborators have been experimenting with a technique called streaming experts. The idea is simple: instead of loading a massive Mixture-of-Experts model into RAM all at once, you stream the expert weights from SSD as needed.
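The core idea can be sketched with numpy's memmap, which leaves the weights on disk and lets the OS page in only the experts the router actually selects. This toy is illustrative only; real streaming-expert runtimes add prefetching, caching, and quantization on top.

```python
import os
import tempfile
import numpy as np

# Toy setup: pretend each "expert" is a small weight matrix stored on SSD.
D, N_EXPERTS = 8, 4
path = os.path.join(tempfile.mkdtemp(), "experts.bin")
rng = np.random.default_rng(0)
rng.standard_normal((N_EXPERTS, D, D)).astype(np.float32).tofile(path)

# memmap does NOT load the file into RAM; pages come in on first access.
experts = np.memmap(path, dtype=np.float32, mode="r", shape=(N_EXPERTS, D, D))

def moe_forward(x: np.ndarray, chosen: list[int]) -> np.ndarray:
    # Only the router-selected experts are ever read from disk.
    return sum(np.asarray(experts[i]) @ x for i in chosen) / len(chosen)

y = moe_forward(np.ones(D, dtype=np.float32), chosen=[0, 2])
```

Scale this up to hundreds of multi-gigabyte experts and the memory math changes completely: resident RAM tracks the active experts, not the total parameter count.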
Five days ago, they got Qwen3.5-397B-A17B running in 48GB of RAM. That is already wild for a nearly 400 billion parameter model.
Then this week, someone ran Kimi K2.5 on a MacBook Pro with 96GB. Kimi K2.5 has 1 trillion total parameters with 32B active weights at any time. One trillion.
And the kicker: the same Qwen3.5 model ran on an iPhone. Yes, an actual iPhone. It was slow at 0.6 tokens per second, but it ran.
This is the kind of optimization that matters. We are watching the community find increasingly creative ways to squeeze large models onto constrained hardware. The bottleneck is no longer the model size. It is how fast you can move bits.
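A rough back-of-envelope shows why bandwidth becomes the wall. The quantization level and SSD speed below are assumptions I am plugging in for illustration, not measured figures from the experiments.

```python
# Back-of-envelope: worst case where every active weight is read from SSD.
active_params = 17e9      # ~17B active parameters per token (the "A17B")
bytes_per_param = 0.5     # assume 4-bit quantization
ssd_gbps = 7.0            # assume a fast NVMe drive, GB/s sequential read

bytes_per_token = active_params * bytes_per_param          # 8.5 GB per token
tokens_per_sec = ssd_gbps * 1e9 / bytes_per_token

print(f"{tokens_per_sec:.2f} tok/s")  # -> 0.82 tok/s
```

Sub-1 token/s from raw disk reads is the right order of magnitude for the reported 0.6 tok/s on an iPhone, and it makes the optimization targets obvious: cache hot experts in RAM and move fewer bits per token.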
Expect this technique to mature fast. The optimization race is on.
Sources: Simon Willison, Twitter
Cursor Shipped a Lot of Stuff and It Actually Matters
Cursor dropped their changelog and it has some meat to it.
Composer 2 is their new AI model for coding tasks. They claim frontier-level performance on coding benchmarks. Pricing is standard at $0.50/M input tokens and $2.50/M output tokens, with a fast mode at $1.50/$7.50. The naming is interesting, suggesting they are building toward some kind of composition framework for AI coding.
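To make the pricing concrete, here is a quick cost calculation; the rates come from the changelog figures above, while the token counts are invented for the example.

```python
# Composer 2 list pricing, in dollars per million tokens.
STANDARD = {"in": 0.50, "out": 2.50}
FAST = {"in": 1.50, "out": 7.50}

def cost(in_tokens: int, out_tokens: int, rates: dict) -> float:
    """Dollar cost of one session at the given per-million-token rates."""
    return (in_tokens * rates["in"] + out_tokens * rates["out"]) / 1e6

# Hypothetical session: 200k tokens of context in, 50k tokens generated.
print(cost(200_000, 50_000, STANDARD))  # -> 0.225
print(cost(200_000, 50_000, FAST))      # -> 0.675
```

Output tokens dominate the bill at these rates, so long agentic sessions that generate a lot of code are where the standard/fast gap really shows up.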
Automations is the bigger story for team use. You can now build always-on agents that trigger on schedules or events from Slack, Linear, GitHub, PagerDuty, and webhooks. This is Cursor going up against GitHub Actions with AI agents. The implications for CI/CD and automated code review are significant.
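Stripped of product details, an event-triggered automation is just a router from event sources to agent jobs. The sketch below is purely hypothetical and uses none of Cursor's actual API names; it only shows the shape of the pattern.

```python
import json

# Hypothetical registry mapping event sources to handler functions.
ROUTES: dict = {}

def on_event(source: str):
    """Decorator registering a handler for an event source (illustrative)."""
    def register(fn):
        ROUTES[source] = fn
        return fn
    return register

@on_event("github.pull_request")
def review_pr(payload: dict) -> str:
    # In a real system this would kick off an agent run, not return a string.
    return f"agent: review PR #{payload['number']}"

def dispatch(source: str, raw_payload: str):
    handler = ROUTES.get(source)
    return handler(json.loads(raw_payload)) if handler else None

print(dispatch("github.pull_request", '{"number": 42}'))
```

The interesting part of a real product is everything around this loop: authentication, sandboxing the agent run, and surfacing results back into Slack or the PR thread.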
They also added Cursor ACP for JetBrains. This is Agent Client Protocol support in IntelliJ, PyCharm, and WebStorm. If you live in JetBrains but want Cursor features, you can now have both. The ACP registry makes this a proper extension point.
Over 30 new plugins landed in the marketplace too. Atlassian, Datadog, GitLab, Glean, Hugging Face, monday.com, PlanetScale. MCP servers that cloud agents can use in automations.
The enterprise story is getting real. Team marketplaces for private plugins, proper access controls, cloud sandboxes for agent runs.
Sources: Cursor Changelog
That is the rundown. Patch your Litellm installs, watch the MoE space, and if you use Cursor at work, the automations feature is worth a serious look.
Sources: Simon Willison, Hacker News, Cursor