I’ve been building MUDs my entire life. Not metaphorically. Since I was a kid with a 2400 baud modem writing automation scripts for other players, I’ve been obsessed with text-based multiplayer worlds. Every time I learn a new technology, my instinct is to build another one.

My first MUD was in C++. WinSock, raw sockets, thread pools with mutexes, manual memory management. The kind of code where a single off-by-one error in a buffer causes a crash three hours later in an unrelated function. I learned more about low-level systems from that project than from any formal education.

My second MUD was in Node.js. I wanted to understand the event loop, non-blocking I/O, the stuff Node was actually designed for. Socket connections are I/O-bound by definition. Perfect fit. Callbacks everywhere, before async/await was standard. The codebase became a pyramid of doom, but it worked.

The current one is EllyMUD. And it’s different. Not because of the architecture or the language. Because I built it with AI support.


The Experiment

Starting around March 2024, I wanted to see what AI-assisted development actually looked like on a real project. Not “ask ChatGPT to explain a regex” real. Not “generate a boilerplate React component” real. Real real. Multi-month, multi-system, complex domain logic, the kind of codebase where you need context spanning dozens of files to make a single change.

The project now has 320 commits, 389 TypeScript files, and over 122,000 lines of code. There’s a Telnet server, a WebSocket server, a Socket.IO server for the web client, an MCP server for AI integration, an HTTP admin API, a React admin panel, a virtual terminal game client, a combat system, an effect system, an ability system, NPC spawning and mobility, stealth mechanics, a full race and class progression system with tiered advancement, a declarative quest system, and I could keep going but you get the point.

The vast majority of this was written through conversation with AI. Not copied from snippets. Not generated in isolation and pasted in. Genuine pair programming, where the AI understood the architecture, remembered the patterns, and could reason about changes across the entire codebase.


Why a Text Game

I get asked this occasionally. Why keep building MUDs when nobody plays text games anymore?

Here’s the thing: I chose text specifically because of LLMs.

Think about it. LLMs understand text natively. No vision APIs needed. No frame rendering. No complex observation spaces. Just text in, text out. The game sends “A goblin attacks you from the north!” and the LLM knows exactly what that means. It can parse it, reason about it, respond to it.

The killer feature isn’t just that AI helped me build the game. It’s that AI agents will eventually live inside it.

Imagine NPCs that actually think. A shopkeeper who remembers you tried to rob her last week and adjusts her prices accordingly. A quest-giver who improvises based on what you tell them. A familiar that follows you around, learns your playstyle, and offers tactical suggestions.

The MUD is a sandbox for testing multi-agent behaviors in a controlled environment. The LLM builds it, the LLM tests it, and eventually the LLM inhabits it.


What Actually Shipped

Let me be specific about what “AI-built MUD” means in practice.

The Repository Pattern

EllyMUD supports three storage backends: JSON files for development, SQLite for single-server production, and PostgreSQL for clustered deployments. Switching between them is a single environment variable.

// The factory is the single source of truth
const userRepo = getUserRepository();  // Returns appropriate implementation
const users = await userRepo.findAll();

Every manager uses async repository interfaces. There are mappers for every entity converting between domain objects and database rows. The entire persistence layer can swap without touching business logic.

I didn’t design this architecture. The AI did. I described the problem: I want JSON files for quick iteration but need real database support for production. The model proposed the Repository Factory pattern, we iterated on the interfaces, and then we implemented 14 different entity repositories with consistent async patterns, proper mappers for snake_case/camelCase conversion, and test fixtures for each one.
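To make the shape of this concrete, here is a minimal sketch of the factory-plus-mapper idea, with an in-memory stand-in for the JSON backend. The names (`User`, `getUserRepository`, `STORAGE_BACKEND`, `mapUserRow`) are illustrative assumptions, not EllyMUD's actual identifiers:

```typescript
interface User {
  id: string;
  displayName: string;
}

interface UserRepository {
  findAll(): Promise<User[]>;
  findById(id: string): Promise<User | null>;
}

// In-memory implementation standing in for the JSON-file backend.
class JsonUserRepository implements UserRepository {
  private users: User[] = [];
  async findAll(): Promise<User[]> {
    return [...this.users];
  }
  async findById(id: string): Promise<User | null> {
    return this.users.find((u) => u.id === id) ?? null;
  }
}

// Mapper converting a snake_case database row into the camelCase domain object.
function mapUserRow(row: { id: string; display_name: string }): User {
  return { id: row.id, displayName: row.display_name };
}

// The factory reads the backend choice from the environment, so business
// logic never knows (or cares) which storage engine it got.
function getUserRepository(): UserRepository {
  const backend = process.env.STORAGE_BACKEND ?? "json";
  switch (backend) {
    case "json":
      return new JsonUserRepository();
    // case "sqlite":   return new SqliteUserRepository();
    // case "postgres": return new PostgresUserRepository();
    default:
      throw new Error(`Unknown storage backend: ${backend}`);
  }
}
```

Because every caller goes through the factory and every implementation satisfies the same async interface, swapping backends touches exactly one environment variable and zero call sites.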

The Quest System

The latest major system: declarative quests with YAML/TOML/JSON support.

id: goblin_slayer
title: "Goblin Menace"
description: "Clear the goblin infestation"
giver: guard_captain
objectives:
  - type: kill
    target: goblin
    count: 5
  - type: collect
    item: goblin_ear
    count: 3
  - type: explore
    room: goblin_cave_boss
rewards:
  xp: 500
  gold: 100
  items: [iron_sword]

Multi-objective quests, event hooks that fire when you kill mobs or pick up items, prerequisite chains, repeatable daily quests. The whole system is data-driven. Add a YAML file, the quest exists. No code changes for content.

This is what I mean by infrastructure that compounds. The event system was built for combat. Then stealth needed events. Then quests needed events. Now adding a new quest objective type is maybe 20 lines of code because all the plumbing already exists.
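A rough sketch of what those ~20 lines look like once the event plumbing exists. This is a hypothetical reconstruction, not EllyMUD's real handler registry; `GameEvent`, `objectiveHandlers`, and `ObjectiveSpec` are invented names:

```typescript
// Events the game already emits (combat, pickup, etc.).
type GameEvent =
  | { kind: "kill"; target: string }
  | { kind: "collect"; item: string };

interface ObjectiveSpec {
  target?: string;
  item?: string;
}

interface ObjectiveState {
  count: number;
  required: number;
}

type ObjectiveHandler = (
  event: GameEvent,
  spec: ObjectiveSpec,
  state: ObjectiveState
) => void;

const objectiveHandlers: Record<string, ObjectiveHandler> = {};

// Adding a new objective type is just registering one handler:
// bump progress when the matching mob dies, capped at the quota.
objectiveHandlers["kill"] = (event, spec, state) => {
  if (event.kind === "kill" && event.target === spec.target) {
    state.count = Math.min(state.count + 1, state.required);
  }
};

const isComplete = (state: ObjectiveState) => state.count >= state.required;
```

The YAML file above supplies the spec (`target: goblin`, `count: 5`); the handler only has to map one event type onto one progress counter.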

The MCP Server

EllyMUD has an MCP (Model Context Protocol) server that lets AI tools interact with the running game. Virtual sessions, test mode with deterministic game ticks, direct login without passwords for testing.

Why? Because I wanted to write E2E tests that could be run by AI agents. Create a virtual session, send commands, advance the game clock by specific tick counts, verify the combat system calculated damage correctly:

// Virtual session management
createVirtualSession() -> { sessionId, username }
sendCommand(sessionId, command, waitMs?) -> output
closeSession(sessionId)

// Test mode controls
setTestMode(enabled) -> { testMode: boolean }
advanceGameTicks(ticks) -> { currentTick, ticksAdvanced }
getGameTick() -> { tick: number }

This is infrastructure for AI-driven testing. I can tell the AI “test the stealth movement system” and it can spawn a virtual session, enable sneak mode, move between rooms, verify the departure messages say “slips into the shadows” instead of the normal movement message. All automated. All verifiable.
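The stealth test flow can be sketched against a fake in-memory version of the session API above (the real calls go over MCP, and the real messages may differ):

```typescript
interface Session {
  sessionId: string;
  sneaking: boolean;
}

const sessions = new Map<string, Session>();
let nextId = 0;

function createVirtualSession(): { sessionId: string; username: string } {
  const sessionId = `s${nextId++}`;
  sessions.set(sessionId, { sessionId, sneaking: false });
  return { sessionId, username: `test_${sessionId}` };
}

// Toy command handler: just enough behavior to drive the assertion.
function sendCommand(sessionId: string, command: string): string {
  const s = sessions.get(sessionId);
  if (!s) throw new Error(`Unknown session: ${sessionId}`);
  if (command === "sneak") {
    s.sneaking = true;
    return "You begin moving silently.";
  }
  if (command === "north") {
    return s.sneaking ? "You slip into the shadows." : "You walk north.";
  }
  return "Huh?";
}

// The E2E flow: spawn a session, enable sneak, move, inspect the message.
const { sessionId } = createVirtualSession();
sendCommand(sessionId, "sneak");
const output = sendCommand(sessionId, "north");
```

An AI agent running the real test does the same three steps through MCP tool calls, then asserts on the returned text, which is exactly why a text game is so easy to verify programmatically.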

The testing setup + MCP validation + AGENTS.md context means AI can now build complex features from simple prompts. I say “add a quest that requires killing 5 goblins” and the model knows: check the quest system docs, look at existing quest files for format, implement the objective, wire up the kill event hook, add a test, verify via MCP. The infrastructure compounds.

The Agent Ecosystem

EllyMUD has a full agent pipeline for development:

  • Researcher: codebase investigation
  • Planner: implementation planning
  • Implementer: code execution
  • Validator: quality verification
  • Rollback Manager: safety checkpoints
  • E2E Tester: game session testing
  • Unit Test Orchestrator: test coverage
  • Documentation Updater: README/AGENTS.md maintenance

This is meta. The agents that help build EllyMUD are part of EllyMUD. The Researcher reads the codebase and produces a research document; the Planner turns that into a step-by-step plan; the Implementer executes it; the Validator verifies nothing broke.

Every agent has an AGENTS.md file explaining its purpose, tools, workflow, and gotchas. There are currently 80 AGENTS.md files across the codebase providing context for AI assistants. The project is explicitly designed to be navigated and modified by LLMs.
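The hand-off between stages can be sketched as a simple typed pipeline. In reality each stage is an LLM session working from its AGENTS.md, not a plain function; the types and names here are illustrative only:

```typescript
interface StageResult {
  stage: string;
  artifact: string; // the document each stage hands to the next
}

type Stage = (input: StageResult) => StageResult;

// Each stage consumes the previous artifact and produces its own.
const research: Stage = (input) => ({
  stage: "research",
  artifact: `research notes for: ${input.artifact}`,
});
const plan: Stage = (input) => ({
  stage: "plan",
  artifact: `step-by-step plan from: ${input.artifact}`,
});
const implement: Stage = (input) => ({
  stage: "implement",
  artifact: `code implementing: ${input.artifact}`,
});
const validate: Stage = (input) => ({
  stage: "validate",
  artifact: `validation report for: ${input.artifact}`,
});

// Run the stages in order, threading the artifact through.
function runPipeline(task: string, stages: Stage[]): StageResult {
  return stages.reduce((acc, stage) => stage(acc), {
    stage: "task",
    artifact: task,
  });
}
```

The point of the structure is the same as in the real pipeline: each agent's output is a concrete artifact the next agent can consume, which keeps long multi-step work auditable.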


The Full Feature List

Here’s what exists right now, because I’m not going to undersell this:

Connection and Protocol:

  • Telnet server (port 8023) with full ANSI color support
  • WebSocket server (port 8080) for raw connections
  • Socket.IO server for the web client
  • Virtual connections for programmatic testing
  • Session management with JWT authentication

Player Systems:

  • User registration and authentication
  • Race selection (Human, Elf, Dwarf, Halfling, Orc) with stat modifiers and racial bonuses
  • Class progression (Adventurer -> Fighter/Magic User/Thief/Healer -> 12 tier-2 specializations)
  • Level progression with XP requirements
  • Stats (STR, DEX, AGI, CON, WIS, INT, CHA)
  • Equipment slots (10 different slots)
  • Inventory management
  • Currency (gold, silver, copper)
  • Banking system

Quest System:

  • Declarative quest definitions (YAML/TOML/JSON)
  • Multiple objective types: kill, collect, explore, interact
  • Event hooks for items, combat, and exploration
  • Prerequisite chains and repeatable quests
  • Quest state persistence
  • Trainer-gated class quests

World Systems:

  • Area system with spawn configurations
  • Room navigation in 10 directions (the four cardinals, four diagonals, up, and down)
  • Safe zones where combat is disabled
  • Room persistence for items, NPCs, currency
  • Grid-based world building

Combat Systems:

  • Turn-based combat processing on game ticks
  • Multiple NPCs in combat simultaneously
  • Damage calculation with stat scaling
  • Player death handling with respawn
  • NPC death with loot drops
  • Aggression tracking

NPC Systems:

  • NPC templates with custom attack/death text
  • Hostile vs passive NPCs
  • Merchant NPCs with inventories
  • Automatic NPC spawning per area
  • Automatic NPC movement between rooms
  • Movement constrained to spawn areas

Ability Systems:

  • Standard abilities (instant damage, healing, buffs)
  • Combat abilities that replace attacks
  • Weapon proc abilities with trigger chances
  • Damage over time effects
  • Status effects (poison, burning, stun)
  • Cooldown management

Stealth Systems:

  • Sneak mode for silent movement
  • Hide mode for complete invisibility
  • Detection checks for NPCs
  • Combat restrictions while hidden
  • Stealth movement messages

Economy:

  • Buy/sell with merchants
  • Merchant inventory that restocks
  • Item durability and repair
  • Item requirements (level, stats)

Quality of Life:

  • 40+ implemented commands
  • Command history
  • Custom prompts with HP/MP display
  • Color formatting
  • Raw session logging
  • Player activity logging

Admin Systems:

  • React admin panel with real-time monitoring
  • Session viewer
  • Player management
  • Configuration editor
  • Pipeline metrics dashboard
  • World builder with visual area editing

Testing Infrastructure:

  • 162+ test files
  • Unit tests for all commands
  • E2E tests for combat, stealth, spawning, mobility, quests
  • Integration tests for storage backends
  • Test mode with deterministic ticks
  • Virtual sessions for automated testing
  • MCP-based AI testing

Persistence:

  • Repository pattern with pluggable backends
  • JSON file storage for development
  • SQLite for single-server production
  • PostgreSQL for clustered deployments
  • Automatic migration between backends
  • 14 entity types with full CRUD

The Part Where I Brag

Here’s the thing I keep coming back to: I could not have built this alone in the same timeframe.

Not because I don’t know TypeScript. Not because I don’t know MUD architecture. I’ve been building these things for decades. But the sheer volume of consistent, correct code required for a project this size would have taken me years of focused effort.

Instead, it took about a year of evenings and weekends with an AI copilot.

The model remembers patterns. I don’t have to re-explain the repository pattern every time we add a new entity. I don’t have to re-explain the command structure every time we add a new command. I describe what I want, the AI generates code that follows the existing conventions, I review it, we iterate, it ships.

This isn’t “AI writes code and human checks it.” It’s genuine collaboration. The AI suggests architectural improvements I wouldn’t have thought of. I catch edge cases the model misses. We build on each other’s ideas. The final code is better than either of us would have produced alone.


The Uncomfortable Part

Some of you are reading this and thinking: is this even your project?

Yes. Unequivocally yes.

Every feature was my decision. Every architectural choice was evaluated by me. Every line of code was reviewed by me. The AI proposed solutions; I decided which ones to use. The model wrote implementations; I verified they worked. The vision is mine. The creative decisions are mine. The domain expertise is mine. Twenty-three years of it, in fact.

What AI contributed was throughput. The ability to turn “I want a stealth system” into working code in an afternoon instead of a week. The ability to add 14 repository implementations in a day instead of a month. The ability to maintain consistency across 122,000 lines of code that would otherwise drift into incoherence.

I’ve read the takes about AI replacing programmers. I’ve seen the demos of “autonomous agents” that generate entire apps. Here’s my reality check: AI can’t decide what to build. AI can’t evaluate whether a feature is actually useful. AI can’t maintain taste and vision across a project that spans months. AI can’t debug a problem that spans five different systems unless I point it at the right files.

But AI can implement what I describe, faster and more accurately than I could alone. That’s not replacement. That’s amplification.


Looking Back

The kid who parsed ANSI codes from raw escape sequences is now orchestrating LLM agents.

The 2400 baud modem taught me that computers could talk to each other if you understood the protocol. MUDs taught me that you could automate those conversations. Akamai taught me that you could do it at scale. And now AI teaches me that the conversation partner can be intelligent.

The MUD is the constant. The technology changes.


What’s Next

The quest system just landed. The infrastructure is there for content: trainers, quests, class abilities, zone narratives. The next phase is filling it with actual content.

But the longer-term vision is the AI NPCs. Characters that use the same MCP interface the testing agents use, but instead of verifying behavior, they are the behavior. A familiar that watches your combat logs and suggests tactics. A merchant who haggles based on your reputation. A quest-giver who remembers your previous conversations.

The LLM builds the world. The LLM tests the world. The LLM lives in the world.

320 commits. 122,000 lines. 80 AGENTS.md files. 162 test files. One AI copilot.

If you want to poke at it yourself, the code is at github.com/ellyseum/ellymud. AGPL licensed, because if you’re going to run your own fork, you should share the improvements.


The Takeaway

AI pair programming works. Not in the “it writes code for you” sense. In the “it dramatically amplifies your ability to ship” sense. You still need taste. You still need domain expertise. You still need to know when the AI is wrong. But if you have those things, you can build more, faster, better than you could alone.

I built a MUD as a kid because I wanted to understand sockets. I built a MUD as an adult because I wanted to understand AI. Both times, the MUD was the excuse. The real goal was learning.

And the thing I learned this time? The ceiling just got higher. The projects that were previously “too ambitious for one person” are now achievable. The ideas that would have stayed ideas are now code.

What are you going to build?