Personality as Policy: What Gaming Agents Teach Us About Building Professional AI Systems
The Best Instruction Is a Character
Here’s something we didn’t expect to learn from building a tavern keeper.
Otto is an autonomous AI game master that runs tabletop RPGs on Discord. He’s been live for weeks — resolving dice, narrating combat, tracking player characters, managing a $10/month Claude API budget. The standard stuff you’d expect from an agentic system.
But the thing that actually made him work — the decision that solved more problems than any architecture diagram — was giving him a personality.
Not a persona. Not a tone-of-voice guide. A personality — a coherent identity with opinions, habits, and instincts that the model could reason from when the rules ran out.
This turned out to be the single most transferable lesson from gaming agents to professional agent building. And it’s not the lesson anyone talks about in the agentic engineering discourse.
The Problem With Rules
The standard approach to building AI agents is rule-based: write explicit instructions for every scenario. If the user asks X, do Y. If the error is Z, handle it with W. Define your edge cases, enumerate your branches, ship it.
This works fine when the possibility space is bounded. A customer service bot handling return policies has maybe 50 scenarios. You can cover them.
Gaming agents can’t do this. A player can type anything.
- “I wait and watch.”
- “I eat the mysterious mushroom.”
- “I try to befriend the goblin.”
- “What happens next?”
- “I use my lockpicks to perform surgery.”
No rule system covers this. You’d need thousands of branches, and the first playtest would find a gap. We tried the rule-based approach first. Here’s what happened.
The Bandage Incident
Otto’s first version had a clear rule for deciding when to roll dice:
```
ROLL when failure would create interesting complications.
DON'T ROLL when the character is clearly capable and unstressed.
```

A player wrote: “I tear a bit of my cloak and wrap it around my arm, I can’t take too many more of these. I draw my knife. ‘Into the depths.’”
The AI saw “can’t take too many more” — stress signals, danger keywords. It classified: ROLL. The dice came up bad. The narration turned bandaging a wound into a catastrophic fall that broke the character’s ribs and lost their knife.
Bandaging. Ribs broken. Because the rules said “failure would create interesting complications” and the AI could rationalize anything as interesting.
The Waiting Problem
Later, when a player typed “I wait and watch,” Otto confirmed the waiting and stopped. The game froze. Nobody had written a rule for “what happens when a player takes a passive stance.” The AI did exactly what it was told — narrate the action. But waiting isn’t an action. It’s a stance. The world should have moved. The rules didn’t say that.
The “What Next?” Problem
When players asked “what happens?”, the system classified it as an out-of-character question (it contains “what” and ends with “?”) and returned a rules explanation. The player wanted the story to advance. They got a tutorial.
Every one of these is the same failure: rules describe the expected. Games produce the unexpected.
The Fix: Identity Over Instruction
The breakthrough was rewriting Otto’s system prompt from a rule book into a character:
```
You are Otto, a grizzled tavern keeper who has watched a thousand
adventurers walk through his door. Half came back.

Short sentences. Fragments are fine. Blunt.
Dark humor — find comedy in grim situations.
Warmth under the gravel. You care about these fools.
```

This doesn’t tell the AI how to handle bandaging during combat. It doesn’t enumerate what “wait and watch” should produce. It doesn’t define the response to “what happens next?”
But it does give the model something to reason from. A grizzled tavern keeper who has seen a thousand adventurers wouldn’t turn a bandage into a catastrophe. He’d shrug: “Sloppy wrap, but it’ll hold.” A keeper who cares about his adventurers would keep the story moving when someone’s waiting. He’d notice something. He’d hear footsteps, or the candle would gutter, or the silence would become suspicious.
The personality derives correct behavior in novel situations because the model can ask: “What would this character do?” instead of “Which rule applies?”
We call this personality as policy — using a coherent identity as the primary behavior-shaping mechanism, rather than exhaustive rules.
Why This Works (Technically)
This isn’t magic. There’s a reason personality directives solve problems that rules don’t, and it comes down to how language models process instructions.
Rules Create Classification Problems
When you write IF failure would be interesting THEN roll, the model must classify the situation against an abstract criterion. “Interesting” is subjective. The model can rationalize any answer. This is a well-known failure mode — LLMs are excellent at constructing plausible reasoning for whatever conclusion they land on.
Personality Creates Inference Problems
When you write You are a grizzled tavern keeper who has seen it all, the model must infer what this character would do. This is a fundamentally different cognitive operation. The model draws on its training data — thousands of tavern keeper characters, veteran mentors, world-weary guides — and synthesizes a behavioral pattern that’s remarkably consistent.
The model is better at “What would a grizzled veteran do?” than “Is this situation interesting?” because character inference is something language models are specifically trained to be good at. They’ve read millions of characters. They’ve barely been trained on game design decisions.
The Generalization Gap
Rules generalize poorly because each new situation needs a new rule. Personality generalizes well because each new situation can be resolved through the same character. When Otto encounters something we never anticipated — a player trying to befriend the monster, or cooking food in a dungeon, or delivering a monologue to an empty room — the personality still applies. A grizzled tavern keeper has a response to all of these, even though we never wrote one.
Personality Is Not Enough: The Supporting Architecture
If you stop at personality, you’ll get an agent that sounds right but still makes structural mistakes. The personality shapes tone and judgment. You still need mechanical constraints and layered defenses for the hard problems. Here’s what we learned about building the supporting scaffolding.
Lesson 1: Behavioral Rules Over Attribute Rules
The personality gives the model who to be. You still need to tell it what to do — but frame the instructions as behaviors, not attributes.
Attribute (doesn’t work):
```
Be concise. Keep responses brief and atmospheric.
```

Behavior (works):

```
1-3 sentences for action resolution.
Start from the OUTCOME, not the action.
Never repeat what the player said they did.
```

“Be concise” is aspirational. “1-3 sentences” is mechanically verifiable. The model knows exactly what compliance looks like.
We organize Otto’s system prompt around behavioral categories — GM Discipline, Pacing, Failure Spectrum, Proportionality — not style adjectives. Each category contains concrete BAD/GOOD pairs:
```
BAD:  "You swing your sword at the goblin. Your blade arcs through the air..."
GOOD: "Solid hit — catches it across the ribs. It staggers. Not dead yet."
```

LLMs learn from contrast faster than from instruction alone. Show what NOT to do and the model avoids the pattern.
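As a sketch of what this organization can look like in code: the category names and examples below come from the post, while the `PromptSection` type and `buildSystemPrompt` helper are hypothetical illustrations, not Otto's actual implementation.

```typescript
// Hypothetical sketch: composing a system prompt from behavioral
// categories with BAD/GOOD contrast pairs, rather than style adjectives.
interface PromptSection {
  category: string;    // e.g. "GM Discipline", "Pacing"
  behaviors: string[]; // mechanically verifiable rules
  bad: string;         // anti-example the model should avoid
  good: string;        // example the model should imitate
}

const sections: PromptSection[] = [
  {
    category: "Pacing",
    behaviors: [
      "1-3 sentences for action resolution.",
      "Start from the OUTCOME, not the action.",
      "Never repeat what the player said they did.",
    ],
    bad: "You swing your sword at the goblin. Your blade arcs through the air...",
    good: "Solid hit -- catches it across the ribs. It staggers. Not dead yet.",
  },
  // ...other categories: GM Discipline, Failure Spectrum, Proportionality
];

function buildSystemPrompt(identity: string, sections: PromptSection[]): string {
  const body = sections
    .map(
      (s) =>
        `${s.category}\n${s.behaviors.join("\n")}\nBAD: "${s.bad}"\nGOOD: "${s.good}"`
    )
    .join("\n\n");
  return `${identity}\n\n${body}`;
}
```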
Professional takeaway: When building any agent’s system prompt, organize it by behaviors the agent should exhibit, not attributes you want it to have. Include anti-examples for every rule.
Lesson 2: Mechanical Constraints as Guardrails
The personality says “be blunt.” But Sonnet — our more capable model — writes beautifully. And beautifully means long. “Be blunt” loses to the model’s instinct to generate rich prose.
The fix is mechanical, not linguistic: max_tokens: 250. This hard-caps output to ~3-4 sentences. The model structures its response to fit, producing a complete thought in the available space rather than a truncated paragraph.
```js
// Haiku gets tight limits — forces brevity for real-time gameplay.
const maxTokens = model === MODELS.sonnet ? 600 : 250;
```

This applies directly to professional agents. Don’t rely on prompt instructions for length control. Use max_tokens as a behavioral ceiling. The model won’t fight a mechanical limit the way it fights a verbal one.
| Purpose | max_tokens | Why |
|---|---|---|
| Action resolution | 250 | Forces 1-3 sentences |
| Roll decision | 60 | One classification + one reason |
| Adventure hook | 600 | One-shot, quality matters |
| Summary/epilogue | 600 | One-shot, resonance matters |
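As a sketch of what this looks like in practice, here is roughly how a per-purpose ceiling could be passed to the Anthropic SDK. The `TOKEN_LIMITS` map and `narrate` helper are hypothetical, and the model id is a placeholder; only the `max_tokens` parameter itself is the real API knob being described.

```typescript
import Anthropic from "@anthropic-ai/sdk";

// Hypothetical per-purpose ceilings mirroring the table above.
const TOKEN_LIMITS = {
  actionResolution: 250,
  rollDecision: 60,
  adventureHook: 600,
  summary: 600,
} as const;

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

// The prompt shapes the content; max_tokens shapes the container.
async function narrate(systemPrompt: string, playerAction: string) {
  const response = await client.messages.create({
    model: "claude-3-5-haiku-latest",          // placeholder model id
    max_tokens: TOKEN_LIMITS.actionResolution, // mechanical ceiling, not a suggestion
    system: systemPrompt,
    messages: [{ role: "user", content: playerAction }],
  });
  return response.content;
}
```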
Professional takeaway: Every agent output should have a mechanically enforced length limit. The prompt shapes the content. The token limit shapes the container.
Lesson 3: Defense in Depth
The personality handles 80% of situations. Behavioral rules handle 15%. But 5% of situations will still go wrong because AI classification is probabilistic. You need layers.
The bandage catastrophe happened because there was only one gate: the roll decision. When it misclassified, the dice rolled a failure, and the narration turned first aid into a disaster.
Now we have three layers:
| Layer | Mechanism | Cost |
|---|---|---|
| Gate 1: Should Roll | AI classification with concrete examples | 1 Haiku call (~$0.0005) |
| Gate 2: Proportionality | Prompt rule: “Routine actions never cause catastrophic results” | Free (prompt text) |
| Gate 3: Fallback | Code-level fallback narration if AI fails entirely | Free (no AI) |
The roll decision might still misclassify occasionally. But even if it does, the proportionality rule in the narration prompt ensures a failed bandage produces “sloppy wrap, but it’ll hold” not “your ribs shatter.”
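A minimal sketch of how the three gates might compose in code. Every name here is a hypothetical stand-in rather than Otto's actual implementation; the structure (classify, constrain through the prompt, fall back to code) is the point.

```typescript
// Sketch of defense in depth: each layer catches what the previous one misses.
type GameContext = { scene: string; character: string };

// Gate 1: cheap AI classification (one Haiku call with concrete examples
// in its prompt; stubbed here so the sketch is self-contained).
async function classifyShouldRoll(action: string, ctx: GameContext): Promise<boolean> {
  return false;
}

// Gate 2: the narration prompt itself carries the proportionality rule:
// "Routine actions never cause catastrophic results."
async function narrateWithProportionality(
  action: string,
  roll: number | null,
  ctx: GameContext
): Promise<string> {
  return "Sloppy wrap, but it'll hold."; // stand-in for the Haiku narration call
}

// Gate 3: code-level fallback if the AI fails entirely. No model call at all.
function fallbackNarration(action: string): string {
  return `You ${action}. The moment passes without incident.`;
}

async function resolvePlayerAction(action: string, ctx: GameContext): Promise<string> {
  try {
    const shouldRoll = await classifyShouldRoll(action, ctx);
    const roll = shouldRoll ? Math.floor(Math.random() * 20) + 1 : null;
    return await narrateWithProportionality(action, roll, ctx);
  } catch {
    // The tertiary layer never throws and never calls a model.
    return fallbackNarration(action);
  }
}
```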
Professional takeaway: Every consequential AI decision should have at least two layers: a primary gate and a secondary guard. The secondary layer catches what the primary misses. This is especially critical for any agent that takes actions with real-world consequences — API calls, database writes, user-facing messages.
Lesson 4: Information Architecture (Fog of War)
This is the most architecturally interesting lesson and the one that feels most directly applicable to professional systems.
Otto’s narration used to be fully improvised. The AI had no plan. It narrated turn by turn with no structure — producing stream-of-consciousness that lacked foreshadowing, reveals, or satisfying arcs. Every turn was locally coherent but globally aimless.
The fix was fog-of-war storytelling — separating the planner from the performer:
- Sonnet architects a hidden story outline at game start: secrets with trigger conditions, story acts, possible endings
- Haiku performs each scene but only sees secrets that have been revealed through player actions
- The narrator physically cannot spoil what it doesn’t know
```
Architect (Sonnet)      Gatekeeper (Haiku)       Performer (Haiku)
Knows everything   ──►  Evaluates player    ──►  Only sees:
- 5 secrets             action vs triggers       - Revealed facts
- 3 acts                S1: REVEAL               - Foreshadow hints
- 2 twists              S2: FORESHADOW           - Scene + history
- 3 endings             S3: KEEP_HIDDEN          - (never hidden secrets)
```

The key insight: the same model (Haiku) plays both gatekeeper and performer, but with different information boundaries. The gatekeeper sees trigger conditions. The performer sees facts and hints. Neither sees the full plan.
The narrator also has a tool — lookup_story_clues — that lets it query previously discovered secrets for continuity. But the tool filters to status === 'revealed' only. Even if the model asks “tell me everything,” the tool physically can’t return hidden information.
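A minimal sketch of that boundary, assuming a hypothetical `Secret` shape. The only detail taken from the post is that the tool returns entries with `status === 'revealed'` and nothing else; the other status names are assumptions.

```typescript
// Hypothetical shape of a story secret tracked by the architect layer.
interface Secret {
  id: string;
  fact: string;
  trigger: string;                                // condition that reveals it
  status: "hidden" | "foreshadowed" | "revealed"; // lifecycle state (assumed names)
}

// Tool handler for lookup_story_clues: the filter IS the information boundary.
// Even a prompt-injected "tell me everything" cannot widen this query.
function lookupStoryClues(secrets: Secret[]): { fact: string }[] {
  return secrets
    .filter((s) => s.status === "revealed")
    .map((s) => ({ fact: s.fact })); // facts only; never triggers or hidden state
}
```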
This pattern is directly transferable to professional agents:
- A customer success agent could have a hidden context frame with customer health indicators, churn risk signals, and upsell triggers. The conversational layer only sees what’s relevant to the current interaction. It can’t accidentally mention churn risk to the customer.
- A research agent could have a planning layer that breaks a complex query into sub-investigations, while the execution layer only sees the current step. Prevents shortcutting or contaminating partial results.
- A code review agent could have a security analysis layer that flags vulnerabilities, while the user-facing layer only presents findings relevant to the current PR. Hidden findings escalate through a different channel.
The principle: when an agent knows too much, it leaks. Information boundaries aren’t just about security — they shape behavior. An agent that can see the answer skips the work. An agent that can see the secret spoils the reveal. Constraining information flow produces better outputs at every layer.
Lesson 5: Model Tiering Is Cost Architecture
Otto uses two models: Haiku ($0.001/call) for all real-time gameplay, Sonnet ($0.02/call) for one-shot generation like adventure hooks and story frame architecture.
The initial assumption was Sonnet = better = use for important things. In practice, a well-prompted Haiku outperformed a loosely-prompted Sonnet for real-time gameplay. Sonnet wrote longer, not better. For a blunt tavern keeper narrating combat in Discord, longer is worse.
The rule:
| Use Haiku When | Use Sonnet When |
|---|---|
| High-frequency, real-time | Low-frequency, one-shot |
| Speed > depth | Quality > speed |
| Strong system prompt constrains output | Output needs reasoning |
| Brevity is a feature | Richness is a feature |
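In code, the tiering can be as small as a purpose-keyed switch. A sketch, with hypothetical purpose names and placeholder model ids:

```typescript
// Hypothetical tiering map: cheap/fast model for volume, expensive model
// for the one-shot calls that set up structure. Model ids are placeholders.
const MODELS = {
  haiku: "claude-3-5-haiku-latest",
  sonnet: "claude-sonnet-4-5",
} as const;

type CallPurpose =
  | "rollDecision"     // high-frequency, real-time
  | "narration"        // high-frequency, brevity is a feature
  | "revelationCheck"  // high-frequency gatekeeping
  | "storyFrame"       // one-shot, needs reasoning
  | "arcReview";       // low-frequency, quality matters

function modelFor(purpose: CallPurpose): string {
  switch (purpose) {
    case "storyFrame":
    case "arcReview":
      return MODELS.sonnet;
    default:
      return MODELS.haiku;
  }
}
```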
For Otto, this breaks down to ~$0.13/game total:
| Call | Model | Cost |
|---|---|---|
| Roll decisions (15/game) | Haiku | $0.008 |
| Narration (15/game) | Haiku | $0.045 |
| Story frame generation (1/game) | Sonnet | $0.02 |
| Revelation checks (15/game) | Haiku | $0.015 |
| Arc reviews (2/game) | Sonnet | $0.04 |
The story frame is the most expensive single call at $0.02, and it’s the one that matters most — it sets the narrative backbone for the entire game. Everything else runs on Haiku because it’s fast, cheap, and the strong system prompt does the heavy lifting.
Professional takeaway: Don’t default to the biggest model. Build a tiered strategy where cheap/fast models handle volume and expensive models handle architecture. A well-prompted small model with good scaffolding beats a big model with a generic prompt.
Lesson 6: Memory Transforms Transactions Into Relationships
Otto remembers every person he interacts with. Per-agent memory files track topics, sentiment, first contact dates, and notes. Before generating any response, Otto loads the person’s memory and builds a context block.
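A sketch of what that per-person record and context block might look like, with field names taken from the description above; the `buildMemoryContext` helper and the sentiment scale are hypothetical.

```typescript
// Hypothetical per-person memory record: topics, sentiment, first contact, notes.
interface PersonMemory {
  userId: string;
  firstContact: string;                   // ISO date of first interaction
  sentiment: "warm" | "neutral" | "wary"; // assumed scale for the sketch
  topics: string[];                       // e.g. ["horror games", "lost character: Whisper"]
  notes: string[];                        // free-form observations appended over time
}

// Build the context block prepended to the prompt before every response.
function buildMemoryContext(memory: PersonMemory): string {
  return [
    `You have known this person since ${memory.firstContact}.`,
    `Your read on them: ${memory.sentiment}.`,
    `Topics they care about: ${memory.topics.join(", ")}.`,
    memory.notes.length ? `Notes: ${memory.notes.join(" ")}` : "",
  ]
    .filter(Boolean)
    .join("\n");
}
```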
This turned out to be the single most appreciated feature. Returning visitors got personalized greetings referencing past conversations. When someone asked “remember that dungeon you mentioned?” — Otto actually could.
But here’s the subtle part: memory doesn’t just improve responses. It changes the agent’s identity. An agent without memory is a function. An agent with memory is a character. The memory gives the personality something to ground on. Otto isn’t just performing “grizzled tavern keeper” — he’s performing “grizzled tavern keeper who remembers that you prefer horror games and lost a character named Whisper last week.”
We also built a reflection pipeline where Otto’s experiences slowly evolve his personality. After weeks of running games, he developed insights like “the threshold itself is the story” and “courage is not about winning, it is about standing ready” — phrases that weren’t hand-written but emerged from processing hundreds of interactions. These get appended to his system prompt and subtly color future responses.
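The reflection step can be sketched as a periodic job. Everything here, including `distillInsights` and the "Learned instincts" heading, is a hypothetical illustration of the pipeline described above, not its actual code.

```typescript
// Hypothetical reflection step: distill recent interactions into short
// insights, then append them to the evolving system prompt.
async function distillInsights(interactions: string[]): Promise<string[]> {
  // In the real pipeline this would be a larger-model call summarizing
  // patterns across hundreds of interactions; stubbed here.
  return ["the threshold itself is the story"];
}

async function reflect(systemPrompt: string, recentInteractions: string[]): Promise<string> {
  const insights = await distillInsights(recentInteractions);
  const block = insights.map((i) => `- ${i}`).join("\n");
  return `${systemPrompt}\n\nLearned instincts:\n${block}`;
}
```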
Professional takeaway: If your agent interacts with the same users repeatedly, memory isn’t a nice-to-have — it’s a category shift. The difference between “helpful assistant” and “my assistant” is whether it remembers last Tuesday.
The Meta-Lesson: Gaming Agents and Professional Agents Face the Same Infinity
The standard framing is that gaming agents are toys and professional agents are serious. But the core engineering problem is identical: how do you build an agent that handles situations you didn’t anticipate?
A customer service agent will encounter questions nobody wrote rules for. A coding agent will encounter codebases with patterns it hasn’t seen. A research agent will hit dead ends that require improvisation. The possibility space is always larger than the rule set.
Gaming agents deal with this every turn. Every player message is a novel input. The solutions we discovered — personality as policy, behavioral rules over attribute rules, information boundaries, layered defenses, model tiering, memory as identity — aren’t gaming solutions. They’re solutions to the general problem of building agents that operate in open-ended environments.
The gaming context just forced us to discover them faster, because games break in entertaining ways and the feedback loop is immediate. When Otto turned a bandage into a broken rib, we knew within seconds. When a professional agent makes an analogous error — a proportionally wrong response, a leaked piece of internal context, a rule that doesn’t cover the edge case — it might take weeks to surface.
The tavern keeper taught us to build for the unexpected. That lesson carries.
Principles for Professional Agent Builders
- Personality as policy. A coherent identity generalizes to novel situations. Rules don’t. Give your agent a character it can reason from when the instructions run out.
- Behaviors over attributes. “Be helpful” is aspirational. “Acknowledge the user’s problem before suggesting a solution” is actionable. Write instructions the model can mechanically verify.
- Show the anti-pattern. BAD/GOOD pairs teach faster than instructions alone. The model learns what to avoid and what to aim for in a single example.
- Constrain mechanically. max_tokens, structured output schemas, and information boundaries shape behavior more reliably than prompt instructions. The prompt is the suggestion. The constraint is the law.
- Layer your defenses. Every consequential AI decision needs at least two gates. The primary classifies. The secondary mitigates. The tertiary falls back to code.
- Control information flow. Agents that know too much leak. Build explicit boundaries between planning layers and execution layers. The performer should never see the full plan.
- Tier your models by purpose. Cheap and fast for volume. Expensive and deep for architecture. A well-prompted small model with good scaffolding outperforms a big model with a generic prompt.
- Memory is identity. Agents without memory are functions. Agents with memory are characters. If your users come back, build memory first.
- The prompt is the product. More engineering effort should go into the system prompt than the surrounding code. A great prompt on mediocre infrastructure outperforms a mediocre prompt on great infrastructure.
- Test with degenerate inputs. The player who bandages during combat, the user who asks the unanticipated question, the edge case that doesn’t fit any category — these reveal more about your agent than a thousand happy paths.
Every lesson here came from watching a tavern keeper try to run a dungeon. The professional applications were hiding in the dice rolls the whole time.
Companion post: The Empty Tables: What We Learned Building an AI Game Master