Cursor 3 and Gemma 4: Two Big Moves Reshaping AI Coding

    This week saw two significant developments in the AI developer tooling space. Cursor dropped version 3 with a radical agent-first redesign, while Google released Gemma 4, bringing frontier-level multimodal intelligence to devices you can hold in your hand.

    Tob

    Backend Developer

    4 min read · AI Engineering

    The AI coding tooling space moved fast this week. Two announcements stand out: Cursor's version 3 overhaul and Google's Gemma 4 release. Both point to a future where AI agents are first-class citizens in the development workflow.

    TL;DR: Cursor 3 rebuilds its interface around agents with parallel execution and self-hosted cloud options. Google Gemma 4 delivers strong multimodal performance in sizes that run comfortably on laptops and phones.

    Cursor 3: Agents Take Center Stage

    Cursor has been quietly winning the AI code editor race for over a year. Version 3 flips the script entirely. The old VS Code fork is gone, replaced by an interface built from scratch with agents as the primary unit of work.

    The headline feature is the Agents Window. Instead of a single conversation, you can spin up multiple agents across repos and environments. Local, cloud, SSH remote, worktrees. All visible in one sidebar. Agents run in parallel and hand off to each other seamlessly.

    The handoff UX is worth highlighting. Move an agent from cloud to local when you want to test and tweak on your own machine. Move it from local to cloud when you close your laptop and want work to keep running. This is the "always-on coding partner" idea finally done right.

    Self-hosted cloud agents are the enterprise play. Your codebase, build outputs, and secrets stay inside your network. Cursor handles the orchestration, but the data never leaves your infrastructure. For teams with compliance requirements, this closes a gap that held back adoption.

    Design Mode lets you point at UI elements in the browser and have the agent work directly on those. Shift-drag to select, Cmd-L to add to chat. Practical for iterating on frontend changes without describing what you see.

    Cursor 3 also shipped Automations, which are like GitHub Actions but for agents. Trigger on schedule or events from Slack, Linear, GitHub, or webhooks. The agent spins up a cloud sandbox, follows your instructions, and uses MCPs you have configured. Memory tools let agents learn from past runs.

    Composer 2 powers it all: Cursor's own frontier model, with strong results on hard coding tasks. Pricing is public: Standard at $0.50/M input tokens, Fast at $1.50/M input and $7.50/M output.
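    To put the published rates in perspective, here is a quick back-of-the-envelope cost sketch. The per-token prices come from the announcement above; the token counts in the example are made-up illustrative numbers, and Standard-tier output pricing isn't listed, so only the Fast tier is modeled end to end.

    ```python
    # Published Composer 2 per-token rates, converted to dollars per token.
    STANDARD_INPUT = 0.50 / 1_000_000
    FAST_INPUT = 1.50 / 1_000_000
    FAST_OUTPUT = 7.50 / 1_000_000

    def fast_tier_cost(input_tokens: int, output_tokens: int) -> float:
        """Dollar cost of one Fast-tier request at the published rates."""
        return input_tokens * FAST_INPUT + output_tokens * FAST_OUTPUT

    # Illustrative example: an agent run that reads 200k tokens of repo
    # context and writes 8k tokens of code.
    print(round(fast_tier_cost(200_000, 8_000), 2))  # 0.36
    ```

    At these rates, input dominates only for very read-heavy runs; output tokens cost 5x as much per token, so verbose agents get expensive fast.
    
    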

    The new plugin marketplace now has over 30 partners including Atlassian, Datadog, GitLab, and Hugging Face. MCPs ship with plugins, so cloud agents can read from and write to your existing stack automatically.

    Gemma 4: Multimodal Intelligence on the Edge

    Google DeepMind quietly published one of the more impressive model releases of the year. Gemma 4 comes in four sizes, all multimodal, all with 128k to 256k context windows.

    The standout numbers: the 31B dense model hits an estimated LMArena score of 1452. The 26B mixture-of-experts variant hits 1441 with only 4B active parameters. These are competitive with models twice their size.

    The four sizes serve different use cases. E2B and E4B are for on-device: 2.3B and 4.5B effective parameters, with audio support on top of text and image. The 31B is for serious workloads on a beefy machine. The 26B MoE balances quality and efficiency.
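    The size-to-device mapping above is mostly weight-memory arithmetic. A rough sketch of the footprint at common quantization widths; the bit-widths are illustrative assumptions, not official figures, and note that an MoE loads all expert weights even though only 4B parameters are active per token:

    ```python
    def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
        """Approximate memory for model weights alone (no KV cache, no activations)."""
        return params_billion * 1e9 * bits_per_weight / 8 / 1e9

    # Hypothetical deployment points based on the parameter counts above.
    for name, params in [("E2B", 2.3), ("E4B", 4.5),
                         ("26B MoE (all experts resident)", 26.0),
                         ("31B dense", 31.0)]:
        print(f"{name}: {weight_memory_gb(params, 4):.1f} GB at 4-bit, "
              f"{weight_memory_gb(params, 16):.1f} GB at 16-bit")
    ```

    At 4-bit, the E2B and E4B fit comfortably in phone-class memory, while the 31B lands around 15.5 GB, which is exactly the "beefy machine" territory the release notes imply.
    
    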

    Google added Per-Layer Embeddings (PLE), a second embedding pathway that gives each decoder layer its own token-specific signal without bloating the main hidden size. The shared KV cache reduces compute and memory for long context generation. Both optimizations matter for deployment on constrained hardware.
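    The KV-cache saving is easiest to see with the standard cache-size formula. The sketch below uses placeholder architecture numbers (48 layers, 8 KV heads, head dim 128, fp16 cache), not Gemma 4's actual config, and illustrates cross-layer sharing with an assumed "one private cache per 4 layers" pattern rather than Google's documented scheme:

    ```python
    def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                    seq_len: int, bytes_per_value: int = 2) -> float:
        """Memory for K and V caches: 2 tensors x layers x heads x head_dim x seq_len."""
        return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value / 1e9

    # Per-layer caches vs. caches shared across groups of 4 layers,
    # at the 256k context ceiling mentioned above.
    full = kv_cache_gb(48, 8, 128, 256_000)
    shared = kv_cache_gb(48 // 4, 8, 128, 256_000)
    print(f"{full:.1f} GB vs {shared:.1f} GB at 256k context")  # 50.3 GB vs 12.6 GB
    ```

    Whatever the exact sharing ratio, the cache, not the weights, is what blows up at 256k tokens, which is why this optimization matters for constrained hardware.
    
    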

    On the multimodal side, the vision encoder handles variable aspect ratios natively. You can encode images to different token budgets depending on what your use case needs. Speed, memory, or quality. Pick two.

    Deployment is broad. Transformers, llama.cpp, MLX (Apple Silicon), WebGPU in the browser, Rust via Mistral.rs. Google worked with the community to make it available where developers already work.

    What This Means

    Cursor 3 and Gemma 4 are pushing in the same direction from different angles. Cursor assumes AI agents are the future of software development and rebuilds everything around that assumption. Gemma 4 makes that future easier to run locally without sacrificing capability.

    The gap between cloud and local AI coding tools keeps narrowing. Self-hosted agents, efficient on-device models, and better handoffs between environments mean developers can keep sensitive work close while still leveraging AI.

    Both are worth watching closely. The tooling choices developers make today shape what building software looks like tomorrow.

    Sources: Cursor Blog, Hugging Face Gemma 4, Hacker News
