the nutanixist 31: why portability matters as much as backup and dr for business continuity: lessons from the broadcom acquisition of VMware

April 21, 2026 by kostadis roussos

Photo by Jazmin Quaynor on Unsplash

People often talk about resilience in software as if backup and DR are enough. What the Broadcom acquisition of VMware made me understand is that it isn’t.

Software survivability depends on three things working together: portability, physical ownership, and backup. Without all three, you are trapped. And once you are trapped, you are vulnerable.

Software is just text. Like human culture, it is inert without a physical substrate to store, execute, and restore it. That means survivability is not just a question of whether your code exists somewhere. It is a question of whether you can actually bring it back to life under adverse conditions.

To understand this, it helps to break any software system into two parts:

  1. Infrastructure
  2. Application or business logic

An application cannot run without infrastructure. But infrastructure is itself software, with its own dependencies, constraints, and failure modes. So survivability must be evaluated across the entire stack, not just at the application layer.

1. Backup Only Matters If You Can Restore It Yourself

A backup is only real if it can be restored without relying on the original vendor.

If recovery depends on the original authors of the code, the original cloud provider, or the original platform vendor, then you do not truly control your backup. You are renting recoverability, not owning it.

This matters most in precisely the situations where recovery is most important. If the vendor cannot or will not support you, then your backup is functionally useless. If the vendor’s recovery capacity is constrained, you should assume you are a low priority. During a black swan event, many customers may need restoration simultaneously. That is exactly when centralized recovery support breaks down.

If your restore plan assumes the vendor will be available, willing, and sufficiently staffed under crisis conditions, then it is not a restore plan. It is hope.

Likewise, if your plan assumes that your relationship with a vendor cannot be disrupted by events in the world, that is not a plan either. It is hope.

2. You Need Ownership of Physical Infrastructure

Even if you can restore independently, you still need somewhere to restore to.

If you do not own physical infrastructure, you have no guarantee there will be a place to run your system when you need it most. And this is not just about owning some hardware in the abstract. You need the right infrastructure, with the right capacity, available on the right timeline.

A recovery plan that depends on finding infrastructure during a crisis is not much of a recovery plan, either.

But ownership and restoration are still not enough.

Physical infrastructure can be destroyed. It can be denied. It can become unreachable. That can happen through deliberate action, geopolitical conflict, sabotage, or natural disaster. If the physical substrate disappears, your software, however well-designed, becomes unusable.

This is why survivability cannot stop at backup and infrastructure alone. There has to be another layer of freedom.

3. Portability Is What Prevents Lock-In From Becoming Captivity

Every software system contains two layers: infrastructure and business logic. Application portability is the ability to move that business logic across infrastructures.

This does not mean portability is free. It is not transparent, and it is never without cost. But it must be possible.

That distinction matters. A system does not need to move instantly or effortlessly. It does need to be capable of moving.

Without portability, your application is fused to a specific infrastructure stack. And once that happens, your bargaining power disappears. Pricing changes, policy changes, licensing changes, geopolitical shifts, or vendor instability all become existential problems rather than procurement problems.

Portability is what turns dependence into choice.

It does not happen accidentally. It requires discipline, architectural restraint, and a willingness to forgo some short-term convenience to preserve long-term freedom.

A Case Study in the Cost of Missing Portability

As the vCenter architect from 2015 to 2023, I believed VMware had a sacred trust with its customers. VMware’s attitude was that we did not lose customers. We worked hard to keep them.

That mindset led many customers to assume they did not need portability. They had backups. They had physical infrastructure. What they lacked was a portable application layer.

The hidden assumption was that the business relationship itself was fixed.

It was not.

The relationship existed only as long as capital markets agreed with it. Once the markets decided that the existing customer relationship no longer aligned with their interests, the relationship changed. And when that happened, the absence of portability became painfully visible.

Customers suddenly found themselves with no real negotiating leverage. Prices rose, and many had little choice but to pay. Porting away was either prohibitively difficult or practically impossible.

Why?

Because portability had never been built. No portable layer existed. And portability does not emerge by accident. If you do not explicitly design for it, it will not be there when you need it.

The Real Test of Survivability

A survivable software system is not one that merely runs well under normal conditions. It is one that can survive broken relationships, failed vendors, destroyed infrastructure, and changed incentives.

That requires three things:

  1. Backup you can restore yourself
  2. Ownership of physical infrastructure sufficient to restore onto
  3. Portability of the application layer across infrastructures

Remove any one of these, and your resilience is incomplete.

Backup without self-service restore is dependency.
Infrastructure without portability is entrapment.
Portability without infrastructure is theory.

Survivability begins when all three are present at once.


Filed Under: Architecturalist Papers

Rich Plots, Real Improvisation

April 17, 2026 by kostadis roussos

What Alice Saw

I shared a tool in a Discord channel the other day, and a new DM named Alice messaged me back:

“I’m gonna be so honest. This is Greek to me. I have no clue what you’re showing or what problem it solves. I’m a new DM.”

Fair.

So I sent her a Google Doc I use to prep my Out of the Abyss campaign. Here’s a short snippet.

| Score Name | NPC/Faction | Current Value | Next Threshold | What Triggers Next |
| --- | --- | --- | --- | --- |
| Zuggtmoy’s Wedding | Zuggtmoy / Neverlight Grove | Elevated — 2 increases (Blingdenstone expansion + Basidia’s evacuation removed internal resistance) | Wedding completion / Araumycos union | Continued party absence from Underdark; no faction opposes Zuggtmoy; fungal spread reaches critical mass |
| Juiblex Rebirth | Juiblex | Low-Moderate — 1 increase (declared intent to consume Zuggtmoy’s domain) | Juiblex manifests a new physical form or begins attacking Zuggtmoy’s territory | Zuggtmoy’s wedding weakens her defenses; time passes without intervention |


A minute later she replied:

“Ooooooooh. It’s for your campaign story. I thought it was a software thing, not creative.”

She was right. I’d explained the machinery before I’d explained the use.

So here’s the useful version:

I use AI to help me maintain campaign canon across long-running games, but I do not let it decide what counts as canon.

That distinction is the whole system.

What Alice saw was a planning document. At the top were four campaign clocks: four villain plots advancing in the background, each one changed by something the party had done, or failed to do, at the table.

Under that were faction states, NPC dossiers, and plot notes: who knows what, who wants what, what the party has learned, and what is changing offscreen.

I scan that document before every session. In a minute or two, I know what moved, why it moved, and what pressure is building in the world if the party does nothing.

I can’t run the campaign I want to run without this doc.

I also can’t write it by hand. Not across two campaigns. Not across a year of sessions. Not with dozens of named NPCs, each dragging their own history behind them.

This essay is about how I got it written anyway.

What Kind of Game I Want

I want a particular kind of game.

I want players making strange, committed, character-driven choices that I could not have predicted in advance.

That’s not abstract for me. I played Baldur’s Gate 3 cold, no guide, no walkthrough, and made a series of choices as Shadowheart that apparently almost nobody makes. Not because I was optimizing for rarity, but because they made sense for the version of her I was playing.

That’s what I want at the table: not correct choices, but real ones.

I run two D&D campaigns: a heavily modified Out of the Abyss campaign for the Ember Vanguard, and a Dragon of Icespire Peak / Lost Mine hybrid set in Phandalin. I actively push both parties to go somewhere I didn’t plan for. If the plot I prepped isn’t the plot they want, that’s fine.

That’s the deal.

Pick any two

The problem is that I want three things at once.

First: deep prep. Texture. NPCs with interiors. Villains whose behavior today is a consequence of something they chose eight sessions ago. Plots that keep moving, whether the party is watching or not.

Second: flexibility. When the party walks past the dungeon I built, something has to be where they actually went, and it has to feel like it was always going to be there.

Third: consistency. The villain I run in session 24 has to behave like the villain I ran in session 6. If I forget what Shal already said, did, or knew, the illusion cracks.

For a long time, that combination felt impossible.

Prep deep, and the moment players deviate, you’re improvising on top of prep that no longer applies. Prep loose, and the world gets thin. Try to prep every branch, and you end up burning the time you were trying to save.

And over a long campaign, the hardest problem is quieter: you lose track of what actually happened. The next scene drifts a little. Then a little more. Nobody stops the game to point out the contradiction. The fiction just gets lighter.

That’s the part people don’t say out loud. When the party walks past four hours of prep, I’m not upset because they missed it. I’m upset because I burned four hours on something that no longer matters. That lost time becomes thinner prep for the next session, then less energy in the session after that. Players feel it too. They become less willing to push into unplanned territory if they can sense I’m paying for it.

Before LLMs, I had more or less given up on getting all three.

What I Tried First

I wrote summaries from memory. That works for one campaign, maybe. It breaks fast when there’s too much to hold.

Then I tried using an LLM to write the summaries. Better than nothing, but imprecise in ways I didn’t always catch, and the errors showed up later, when they were harder to spot and more expensive to fix.

Then I found GMAssistant.app. That was a real improvement. It gave me solid summaries of what happened. But a D&D session isn’t just a sequence of actions. It’s dialogue, tension, implication, half-finished intentions, weird emotional turns. The recap could get the action right while still losing the feel of the session.

So I went further. I combined GMAssistant recaps with verbatim VTT transcripts from our Zoom calls. Then I built tooling around that. Then more tooling around the tooling. Six weeks, maybe two months, of real work.

I thought I was solving the record problem.

I was. Partly.

The Failure Mode

What I didn’t realize until it nearly cost me a scene was that I was also building a new kind of failure.

A few months into Out of the Abyss, I was designing the endgame around an earlier scene. A PC named Daz had come across evidence implicating a major NPC. My planned encounter assumed he had taken that evidence with him.

I checked the LLM-generated recap to confirm. It said Daz had discovered the evidence. Good enough, I thought, and I kept writing.

Then, by accident, I re-read the original session summary.

What had actually happened was narrower. Daz had looked at some unusual books on a shelf and left the room. He hadn’t opened them. He hadn’t taken them.

Noticed had become discovered in the paraphrase. And discovered had quietly shaded, in my head, into obtained.

If I had run the encounter as written, I would have retconned my own campaign.

That’s the failure mode.

The model hadn’t lied. It had paraphrased. But the paraphrase was fluent. It read like canon. It read so much like canon that I stopped checking the source.

And once that paraphrase enters the next stage of the pipeline, it hardens. A summary becomes a dossier entry. A dossier entry becomes a threat score. A threat score shapes the next session. Small errors do not stay small.

My first version of the tool had bought me depth and flexibility at the cost of consistency, and it had done it invisibly.

The obvious fix would be to go back to the source every time. But if I have to re-read everything every time, I don’t need the tool.

What Actually Worked

What actually worked was putting myself back in the middle.

Now I use the same loop at every layer.

First, the model reads what I can’t read quickly: a transcript, a stack of summaries, a year of sessions. It gives me candidate structure: NPC lists, draft dossiers, scene candidates, recap material.

Second, I review that structure. I fix names. I merge duplicates. I cut paraphrases that slipped into invention. I restore what matters and remove what doesn’t.

This step is not optional. This step is the work.

Third, the model takes the reviewed structure and renders it as prose: a dossier, a narrative recap, a planning document, or a threat tracker.

The model is strong at the first and third steps. It is unreliable at the second. Scope, attribution, ordering, what counts as canon, what matters dramatically: those are creative decisions. Those stay with me.

Skip that middle step, and the errors compound. Keep it, and the whole loop holds.

That’s the system.

What I Have Now

The threat tracker Alice saw is the direct output of it. Session material gets extracted into per-NPC dossiers. I review and reconcile them. Then the tool synthesizes the planning document from that reviewed canon.

That’s how I get villain clocks that stay consistent across a year of play. It’s how I get session recaps that read like narrative chapters without drifting into invention. It’s how I get pre-session cheat sheets I can trust.

It’s also how I keep two campaigns inside one world without the whole thing collapsing under its own weight.

What I have now is not magic. It’s just finally the right division of labor.

I have a threat tracker that tells me, before every session, which villain plans have moved and what moved them.

I have session recaps that read like narrative chapters, but only after I verify the details that matter.

I have NPC dossiers where Captain Tolubb is one NPC, not three duplicate spellings pretending to be different people.

I have two campaigns sharing one coherent world across a year of play.

And most importantly, I no longer mind when the party walks past the dungeon.

Because the prep I’m doing now is not the kind that gets wasted.

If You Want This Too

If you want this for your own game, the practical lesson is simple:

Use AI to read, sort, summarize, and draft.

Do not use it to silently decide what is true.

That part stays with you.

The tools are free. They’re also crude. I’m one GM iterating on my own campaigns, not a product team, and the learning curve is not zero.

I don’t make money on any of this. It’s a hobby. My goal is simply for more of us to have better tools, and for the people trying to make a living doing this to actually make a living.

So if this sparks something for you, build the version you want. If something I built is useful, take it. Fork it. Break it. Send it to a friend who runs games.

I would rather more of us had working tools than fewer.

The last essay was for people interested in the machinery: the loop, the trust layers between documents, and the searchable index I use to query reviewed content mid-session.

This one was for Alice.


Filed Under: Random Fun

72 architecturalist papers: rich plots, real improvisation

April 16, 2026 by kostadis roussos

Notes from running D&D with LLMs.

The trilemma

I run two D&D campaigns. Out of the Abyss, heavily modified. And a Dragon of Icespire Peak / Lost Mine hybrid set in Phandalin.

Between them: hundreds of pages of session summaries, dozens of NPCs, two plots branching for over a year.

Every GM I know wants three things at once.

A campaign that’s deeply prepped — with texture, with NPCs who have agendas, with plots that advance under their own momentum whether the party notices or not.

A campaign that’s flexible — that bends when players go somewhere you didn’t plan for. Which is always. Players are the point.

A campaign that’s consistent — where session 24 honors session 6, where NPCs don’t change their minds between games unless something actually happened, where nobody at the table says “wait, didn’t we already kill that guy?”

Prepped. Flexible. Consistent.

Pick two.

That’s the GM’s trilemma. Prep deep and you’re brittle — the party walks past your dungeon and you’re either railroading or improvising on top of prep that no longer applies. Prep loose and the world feels thin. Compromise and you get both problems.

This essay is about the tooling I built to stop picking two.

Not a product pitch. Working notes from a GM on what broke, what worked, and the pattern that ended up holding three things together.


Why this used to be impossible

Before LLMs, the leg that went first was consistency.

Prep depth is achievable. It just costs hours per session. Flexibility within deep prep is achievable too — deep prep gives you material to pivot to.

What isn’t achievable, not by anyone I’ve ever met, is holding the full state of a long-running campaign in your head.

A campaign with eighteen sessions and forty named NPCs is a body of information larger than memory. You remember what stuck. You forget the cleric from session 11. You half-remember the stone giant’s oath and improvise around the gap. Two sessions later, you’ve accidentally rewritten what he said.

It’s not a failure of effort. It’s bandwidth.

The standard responses are well-worn. Run shallower campaigns so the state fits in your head. Run published modules and let the book do the remembering. Keep a wiki you don’t have time to update. Keep a journal you don’t have time to re-read.

All of these work, to a point. None of them break the trilemma. They just lower the ceiling so the trilemma bites less.

I wanted the ceiling back.


Where LLMs break

LLMs are, on their face, the missing piece. They can read everything I can’t.

My first version of the pipeline was exactly that. Summaries in, prep out. The LLM reads everything. I read its output. Done.

It wasn’t done.

A few months into Out of the Abyss, I was designing an endgame encounter that leaned on an earlier scene. A PC named Daz had come across evidence implicating a major NPC. The encounter I was drafting assumed he’d taken the evidence with him.

I checked the LLM-generated recap. It said: Daz discovered the evidence. Good. I finished the encounter.

Then, by accident, I re-read the original session summary. What actually happened: Daz saw the books on a shelf, noted they looked unusual, and left them there. He hadn’t taken them. He hadn’t opened them.

Noticed had become discovered. Discovered had, in my head, shaded into obtained.

The encounter I’d built was internally consistent. If I’d run it, it would have been a retcon. Players notice retcons. Some small fraction of the fiction’s weight leaks out.

The interesting part is the mechanism.

The LLM hadn’t lied. It had paraphrased. And the output was fluent. It read like canon. It read like canon so smoothly that I stopped going back to check.

That’s the failure mode. The LLM’s output doesn’t announce that it’s a paraphrase. It reads like a record.

If I ask the LLM to also do the next step — generate the encounter from the paraphrased recap — it will happily do so. The paraphrase hardens into a scene with dialogue and stakes, pointed at my players. Errors don’t stay small. They compound.

So the naive version traded away consistency to gain depth and flexibility. And the trade was invisible.

The obvious fix — verify every output by hand — erases the value. If I re-read the source every time, I don’t need the LLM.

The real question was different. Could I structure the pipeline so the LLM does only what it’s reliable at — rendering verified structure — and a human review step sits at every point where precision matters?

That’s what the rest of this is about.


The extraction layer I don’t own

Before any of my tooling runs, something has to turn the raw session — hours of unstructured speech on a Zoom VTT — into usable signal.

I don’t do that part. Other people have built good tools for it. GMAssistant.app. Saga20. A handful of others in the AI-assisted D&D space.

My own pipeline leans on GMAssistant directly. The session doc generator uses its recap as an authoritative anchor for scene extraction. Without it, extraction is an unguided scan across the transcript, and unguided scans miss things.

The ecosystem matters. Crediting it matters. The tools I built sit on top of a layer other people built.

That said: the output of the VTT extraction layer is still an LLM extraction. Fluent. Plausible. Sometimes wrong in ways you won’t catch without reading the source.

The review beat applies to their output too. Everything downstream of the VTT passes through my gate before anything else runs.


The loop

The pattern I ended up with wasn’t elegant the first time. I didn’t arrive at it by principle. I arrived at it by iterating on two live campaigns and paying attention to where things kept breaking.

The shape my iteration settled into is three beats.

Extract. An LLM reads something I can’t read fast — a transcript, a session summary, a stack of extractions — and returns candidate structure.

Review. I read the candidate structure. I fix it. I merge it. I throw out what’s wrong. I add what’s missing. This is not optional. This is the thing.

Render. An LLM takes the reviewed structure and produces readable prose — a narrative recap, an NPC dossier, an encounter document. It’s rendering inside a structure I’ve verified.

That’s it.

The LLM is strong at beats one and three. It’s unreliable at beat two — scope, ordering, attribution, what counts as canon. Those are precision decisions. Those are mine.

Skip the review beat and errors compound. The first LLM’s paraphrase becomes the second LLM’s input. Two LLMs downstream, the original detail is unrecognizable. Nobody notices because the output is fluent.

Keep the review beat and the loop holds. The LLM does what it’s good at. I do what I’m good at. The content that comes out has actually been seen by a human.

I now run this loop at every layer of my pipeline. There are four seams where it lives. They’re worth walking through.
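
Before walking the seams, here is the loop reduced to a code sketch. Everything in it is hypothetical scaffolding (the llm callable, the prompts, the file layout), not my real scripts; the point is structural. The review beat is a halt, not a function call.

```python
# Sketch of one turn of the extract -> review -> render loop.
# `llm` is a stand-in for whatever model call you use; the prompts
# and paths are illustrative, not the real pipeline's.
from pathlib import Path

def extract(source: Path, staging: Path, llm) -> None:
    """Beat one: the model reads what I can't read fast, emits candidate structure."""
    candidate = llm(f"Extract NPCs, scenes, and open threads:\n{source.read_text()}")
    staging.write_text(candidate)

def render(reviewed: Path, llm) -> str:
    """Beat three: the model turns *reviewed* structure into prose."""
    return llm(f"Write a narrative recap from this verified structure:\n{reviewed.read_text()}")

def run_once(source: Path, staging: Path, llm) -> str | None:
    extract(source, staging, llm)
    # Beat two is deliberately not a function. The pipeline halts here
    # until a human has opened the staging file, fixed names, merged
    # duplicates, and cut paraphrases that slipped into invention.
    if input(f"Reviewed {staging}? [y/N] ").strip().lower() != "y":
        return None  # no review, no render
    return render(staging, llm)
```

Errors that slip through beat one die in beat two instead of compounding in beat three. That is the whole design.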


Seam one: NPC dossiers

The party meets an NPC in session 4. They reference him again in session 9 with a slight typo on the name. In session 15, someone else mentions a character who may or may not be the same person.

Without help, I now have three fragments of one NPC and no single view.

The script I use — planning.py --build-dossiers — extracts per-NPC information from every summary. Each dossier lives in its own file. Extract.

Then I read them. And half the time I find duplicates under different names. Captain Tolubb. Cap. Tolubb. Tolubb. Three files, one NPC.

I merge them. Pick the canonical file. Fold the names into an aliases: block at the top. Reconcile any content differences by hand. Delete the losers. Review.

When I run planning.py --synthesize on the reviewed dossiers, the synthesizer uses the aliases to rewrite every occurrence of “Cap. Tolubb” in the raw extracts to “Tolubb” before the LLM sees the text. The final planning doc treats him as one NPC. Render.

Skip the review beat — the merge, the alias recording — and synthesis treats every variant as a distinct NPC. You get the fragmented result back, now laundered into a clean-looking document. Exactly what the merge was supposed to fix.

The script can extract. I have to review. The script can render.

Three beats.
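
The alias rewrite itself is small enough to show. A sketch, not the actual synthesizer; the table format and names are illustrative of what a reviewed aliases: block boils down to.

```python
# Sketch of the pre-synthesis alias rewrite. The dict is what a reviewed
# aliases: block reduces to; the real dossier format differs.
import re

ALIASES = {
    "Captain Tolubb": "Tolubb",
    "Cap. Tolubb": "Tolubb",
}

def normalize(text: str, aliases: dict[str, str] = ALIASES) -> str:
    """Rewrite every recorded variant to the canonical name before the LLM sees the text."""
    # Longest variants first, so "Captain Tolubb" wins over any substring of it.
    for variant in sorted(aliases, key=len, reverse=True):
        text = re.sub(re.escape(variant), aliases[variant], text)
    return text

print(normalize("Cap. Tolubb met the party; later Captain Tolubb denied it."))
# -> Tolubb met the party; later Tolubb denied it.
```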


Seam two: session recaps

After a session, I have a VTT transcript, a GMassist-style recap, and a few hours of memory.

The session doc generator runs passes one through four — consistency check against campaign state, enhancement of structured sections, narrative plan, per-scene character extraction. Extract.

Then I stop the pipeline. Each scene’s extraction is written to a file. I open them in an editor. I read them against the VTT source. I add dialogue the extractor missed. I cut lines the extractor invented. I adjust emphasis when it got the emotional beat wrong. Review.

When I’m satisfied, I run pass five — narration — from the reviewed extractions. One character, one scene at a time. Render.

What comes out is a session document that reads like a novel chapter. First-person per character. Style-matched to their voice. Dialogue that was actually said, because I verified it in beat two.

If I skipped the review beat, the narration would be fluent and wrong in ways I couldn’t see without checking the VTT. Same failure mode as the Daz scene, one layer deeper.


Seam three: grounding docs

The bible of my campaign is the accumulated session summaries. Too long to read every time. Too important to not use.

distill.py and campaign_state.py extract from the bible. World state. NPC states. Completed quests. Open threads. Extract.

I read the generated world_state.md and campaign_state.md. I fix the things that are wrong. I add the things the extractor missed. I cut the things that are no longer true. Review.

Those reviewed docs are now the grounding context for every downstream prep script. prep.py reads them first. Everything I generate is rendered against a world state I’ve verified. Render.

This is the least visible seam. It’s also the one that makes all the others work. Prep generated from bad grounding is bad prep, no matter how good the prep prompt is.


Seam four: cross-campaign canon

I don’t just run two campaigns in parallel. I run two campaigns that share a world.

Group 2’s actions become Group 1’s history. Group 1’s consequences become Group 2’s present. This is fun and also a lot to keep straight.

I keep a notes/canon/ directory. Cross-campaign events go there when they happen. The party in one campaign did X; here’s what the other campaign’s world now has to account for. Extract. (Often by hand, not by LLM.)

I review the notes before they touch either campaign’s grounding docs. Often I’m the only person with enough context to know what an event actually means across the shared world. Review.

When I’m sure, I promote the relevant facts into the appropriate campaign’s world_state.md and NPC dossiers. Render.

notes/ is staging. Neither campaign’s palace indexes it directly. The canon gate runs across both worlds.


The query that kept failing

The loop gives me clean content. Reviewed dossiers. Verified grounding docs. Session recaps I trust.

Clean content in files is also passive content.

Four months into running OotA, mid-session, the party doubled back through a village they’d passed through five sessions earlier. Someone mentioned “that bartender we met.” I had six seconds before my table noticed I didn’t know.

I tried to ask NotebookLM.

“Hey NotebookLM, what’s the name of this person in this village the party met?”

The answer came back confident, fluent, and wrong. Wrong name. Wrong village. Wrong session.

NotebookLM doesn’t know what “the party” means. It has no persistent roster of my PCs. It has no trust layering — an NPC mentioned in a planning draft weighs the same as an NPC in a session summary, and the planning draft is speculative. It summarizes when I need it to return hits. It loses temporal context — met in session 4 looks the same as mentioned in passing in session 11.

Every time I tried to use it mid-session, the same thing happened. Fluent. Confident. Wrong.

This is the naive LLM problem from the Daz scene, one layer up. NotebookLM is doing extract + render with no review beat in between, no trust awareness, and no sense of which documents count as what.

So I had clean content I couldn’t query reliably. I’d solved the consistency leg with the loop and then rediscovered the same failure mode at the retrieval layer.

The flexibility leg wasn’t going to come from a better prompt.


Memory palace

The metaphor for what I built is older than computing. A memory palace. A building in your head where you place things so you can walk through and find them again.

Mempalace is that, externalized. A searchable index over the reviewed content the loop produces.

The important word is reviewed. The palace doesn’t index my raw summaries. It doesn’t index speculative notes. It doesn’t index the unrevised output of LLM extraction. It indexes what the loop has blessed.

The palace has structure. Three layers, different trust levels.

Narrative. The campaign bible, split into chapters. Authoritative. What actually happened at the table. If the palace tells me the narrative wing says X, X happened.

Chronicle. LLM extractions of the bible, organized by time. Search accelerator. Fast way to find the right session or the right window. Not authoritative on its own — the paraphrase problem still applies — but good at answering when did this matter.

Reference. Reviewed grounding docs, NPC dossiers, world state. Working reference. What’s currently true. Stable between sessions.

A query crosses the wings. Find the NPC in reference (who is this). Check chronicle for the time window (when did this matter). Verify in narrative (what actually happened).

That last step is the one NotebookLM doesn’t have. NotebookLM returns a paraphrase and calls it an answer. The palace returns a paraphrase and tells you the source paragraph. You can verify, in seconds, before committing.

Trust layering plus source retrieval plus structure-aware search. That’s the mechanism.
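
As a sketch, a wing-crossing query might look like this. The palace object and its search/fetch calls are hypothetical stand-ins, not mempalace’s actual API; what matters is the order of the wings, and that the last step returns source text rather than a paraphrase.

```python
# Sketch of a wing-crossing query. `palace` and its methods are
# hypothetical stand-ins for the real index, not mempalace's API.
from dataclasses import dataclass

@dataclass
class Answer:
    entity: str      # who this is (reference wing: reviewed, currently true)
    sessions: list   # when it mattered (chronicle wing: accelerator only)
    passages: list   # what actually happened (narrative wing: authoritative)

def query(palace, question: str) -> Answer:
    entity = palace.search("reference", question)      # who is this
    window = palace.search("chronicle", entity)        # when did this matter
    passages = palace.fetch("narrative", about=entity, sessions=window)
    # The caller reads the passages themselves. The source paragraph,
    # not a fluent paraphrase of it, is the answer.
    return Answer(entity=entity, sessions=window, passages=passages)
```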


Prep time: the Daz problem, solved

Back to the Daz scene.

The question I should have run, before building the endgame encounter: what did Daz do with the evidence in that session?

In the palace: search the chronicle wing first — find the session where the evidence came up. Then pull the actual scene from the narrative wing. Read the two paragraphs. Three minutes, maybe.

What I would have found: Daz looked at the books, said something in character about them being unusual, and walked out of the room without touching them.

The encounter I was building would have been redesigned on the spot. Probably better for it — he knew and did nothing is a more interesting story beat than he took the evidence, for the kind of character Daz is.

That’s the consistency leg, live.

I now do this before every major prep beat. Anything I’m about to build downstream canon on gets a palace check first. Five minutes. Catches drift before it ships.

The seams of the loop feed the palace. The palace feeds the next round of prep. Clean content in, clean questions answered, clean content back out.


Mid-session: what I’m building toward

This is where the essay has to be honest.

I’ve been running the loop long enough to trust it. Prep-time palace use is in my workflow now. Consistency leg, back.

Mid-session use is the next chapter. I haven’t run mempalace at a live table yet — it’s new. What I’ve done before now is variants of the NotebookLM attempt, and I’ve reported how those went.

Here’s what mid-session use is designed to do.

“Hey palace, what’s the name of the bartender in Phandalin the party met in session 4.”

The palace knows the party — my PCs are listed in reference. It knows bartender maps to NPCs in the Stonehill Inn. It knows session 4 is a narrative-wing filter. It returns the two paragraphs where the encounter happened.

I read the name off the screen. I use it at the table. The scene keeps moving.

That’s the flexibility leg. Not the LLM generates an answer. The palace retrieves the passage and I read it. The retrieval is fast because the index is built on reviewed content. The answer is trustworthy because I’m looking at the source, not a paraphrase.

The design is done. The architecture matches the failure modes I diagnosed from NotebookLM. Running it at the table, under time pressure, is the thing I haven’t proved yet.

I’ll write the follow-up when I have.


Cross-campaign: the ceiling

The most interesting queries cross campaigns.

Group 2, years ago, captured Lolth as part of a deal with Vhaeraun. That action triggered the demon lord incursion. This is the premise of Out of the Abyss, Group 1’s campaign.

One campaign’s climactic decision is another campaign’s opening premise. When Group 1 asks why are the demon lords here, the correct answer involves Group 2’s choices from a year ago.

In notes-on-my-laptop world, this falls out of date the first time I forget to update one campaign after something happened in the other. It has fallen out of date before.

With a palace per campaign and shared canon in notes/canon/ that gets promoted into both, the coherence is maintainable. Query the Group 2 palace for Lolth. Query the Group 1 palace for Gromph Baenre. Both hits trace back to the same event. The world stays one world.

This is the ceiling the trilemma moves when it breaks. Not just consistent campaigns. Interlocking campaigns. Depth that compounds across groups and years.

No commercial tool does this. There’s no market for it. It’s also not a hard problem — once the loop is solid and the palace is structured, cross-campaign canon is mostly an organizational question.


Trust is emergent

One thing I didn’t plan: the trust hierarchy fell out of the loop on its own.

A document’s trust level is how many review beats it has survived.

Raw VTT — zero review beats. Useful, authoritative in a literal sense (it’s the recording), but hard to consume.

LLM extractions of the VTT — one LLM beat, zero review beats. Accelerator, not truth.

Reviewed session summaries — one LLM beat, one review beat. Authoritative for what happened at the table.

Grounding docs synthesized from reviewed summaries — one more LLM beat, one more review beat. Working reference, trustworthy for planning.

I didn’t design this. I built the loop to fix a problem. The hierarchy emerged because different documents have been through different numbers of review beats, and I can tell the difference when I’m reading them.

The palace respects the hierarchy because it’s built on the output of the loop. The wings correspond to trust levels. Search accelerates by wing. Verification crosses wings.

If I’d tried to design the trust hierarchy up front, I would have gotten it wrong. I got it right by letting it fall out of the work.


The canon gate

One rule I don’t bend.

Nothing enters the palace without passing a review beat.

notes/ is staging. Arc drafts, encounter sketches, NPC ideas, speculative plot threads — all of it stays in notes. None of it gets indexed.

When a note becomes canon — when it’s earned its way into the campaign — it gets promoted to the appropriate grounding doc or dossier. Then the palace mines the new file. Then it’s searchable.

Not before.

The reason is the same reason the review beat exists at all. The palace’s value is that its answers are trustworthy. Start indexing speculative material and the next mid-session query returns a fluent, confident answer based on something I was thinking about doing, not something that happened. Exactly the NotebookLM failure mode, now in my own tool.

The gate is the thing. Everything else is support.
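
The gate fits in a dozen lines. A sketch, assuming the directory conventions from this essay and a hypothetical palace.index call:

```python
# Sketch of the canon gate. notes/ is staging and is never indexed;
# only promoted files are. `palace.index` is a hypothetical call.
from pathlib import Path

STAGING = Path("notes")      # drafts, sketches, speculation: never indexed
CANON = Path("grounding")    # world_state.md, dossiers/: indexed

def promote(note: Path, target: Path, palace) -> None:
    """Move a reviewed note into canon, and only then let the palace mine it."""
    assert STAGING in note.parents, "only staged notes get promoted"
    assert CANON in target.parents, "canon lives in grounding docs only"
    existing = target.read_text() if target.exists() else ""
    target.write_text(existing + "\n" + note.read_text())
    note.unlink()            # a promoted note leaves staging
    palace.index(target)     # searchable only after this line
```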


Things I tried that didn’t work

Worth listing. None of these are sermons — they’re my mistakes.

Indexing everything. Early on I mined every document I had into the palace. Every session summary. Every extraction dir. The published module. The result was search results dominated by the module and diluted by extraction redundancy. I now index only content with unique retrieval value. Module text lives outside the palace, accessed via the 5etools MCP instead.

Auto-promoting notes. I briefly had a script that watched notes/ and mined anything new into the palace. This took about a week to start returning speculative material as authoritative. Ripped it out.

Treating session_doc narration as record. The first-person per-character narrations are great to read. They are not the record of what happened. They’re a render of a review of an extraction. If I query the palace and get a hit from a narration file, I still go check the original summary. Same failure mode as Daz — fluent doesn’t mean source.

One-shot LLM prep. Feeding the LLM the whole summaries file and asking for an encounter. Works for shallow prep. Shreds consistency at depth. The loop exists because this doesn’t scale.

None of these were dumb ideas in the moment. They were natural things to try. They all break the same way — by letting the LLM do the architect’s job somewhere in the chain.


What I have. What’s next.

What I have, today, across two campaigns:

Clean content that a human has actually looked at. NPC dossiers with merged aliases. Grounding docs I trust. Session recaps where the dialogue is dialogue that got said.

A palace built on that content. Prep-time queries that catch the Daz problem before it ships. Cross-campaign coherence that holds across a year of play.

A consistency leg that’s back. A depth leg that’s real.

What I don’t have yet, honestly: live mid-session evidence that mempalace solves the flexibility leg. The NotebookLM attempts told me what doesn’t work and why. The palace is designed against those failure modes. I haven’t yet queried it under table pressure with a party watching.

I’ll write the follow-up when I have.

What I think this generalizes to: any long-form creative project with continuity requirements and LLM help. Novelists with series bibles. Worldbuilders with decades of notes. Researchers with years of reading. The loop doesn’t care that it’s D&D. GMs happen to be a good test case because campaigns combine volume, continuity, and table pressure.

I’m going to write more about the architectural side of this. Control-plane thinking applied to creative tooling. What the trust hierarchy looks like as a system. Why the extract/review/render pattern is a general answer to LLM-assisted knowledge work, not a D&D trick.

Those essays will be longer on the architecture and shorter on the dice.

This one was for GMs.


The ecosystem

Tools I lean on or built. One line each.

GMAssistant.app — turns Zoom VTT transcripts into structured session recaps. First extract beat in my pipeline.

Saga20 — similar space, different approach to VTT signal extraction. Worth knowing about.

CampaignGenerator — my tooling. Session prep, session doc generation, grounding doc synthesis, NPC dossier building. The scripts that run the loop. ([github link])

mempalace — the palace itself. Wing/room architecture, trust-layered retrieval, MCP-accessible. Build guide: MEMPALACE_HOWTO.md. ([github link])

5etools MCP — published module text stays here, out of the palace, accessed on demand.

notes/canon/ — not a tool, a convention. Shared cross-campaign history. Portable across workspaces.


A follow-up essay, after I’ve run this at a live table, will report how it went.


Filed Under: Architecturalist Papers

71 architecturalist papers: backup is proof of existence

April 15, 2026 by kostadis roussos

One thing I do not understand is how enterprise companies can rely on infrastructure that they cannot restore from a backup without help from the vendor.

My belief is that a system exists only if you, without the vendor, can restore it from a backup.

A system that cannot be restored from backup doesn’t exist.

It is contingent on other forces that can destroy it at any point in time.

As a result, relying on that system for anything that must survive those forces is irresponsible.

So, for example, if your business relies on a system that cannot be restored from backup and that system lives in a data center, then you are saying that your business is contingent on the availability of that data center.

Some will argue that, well, you could always restore things if you have enough time, and my answer is yes, you can if you have enough time, but what is enough time?

If the system is complex enough, time can be months.

And if the time when it needs to be restored is measured in days, then the fact that it could be restored in months is irrelevant.

Recently, I read about a man whose son locked his entire Google account. As a result, all of his emails, his contacts, and his contracts were gone. And with taxes due in a few weeks, he could not file taxes, he could not reach out to customers, and he could do nothing. Bills could not be paid. Why? Because he had no backup. He had critical data that was owned by Google. Not by him. And when Google decided he no longer had access to it, it was lost.

That anyone would allow themselves to be in that situation is a mystery to me.

When I joined VMware, I discovered how we tested backup in vCenter 6.0. We created a new vCenter, backed it up, restored it, and declared success. I asked the team to take a backup of a running system and try a restore. Guess what: the restore didn’t work.

It struck me as mind-boggling that, in 2015, vCenter could not be effectively restored from a backup.
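
The test we should have been running, as a sketch. backup_live, restore_to, and health_check are hypothetical stand-ins for whatever your stack provides; the point is what gets backed up and who is absent from the restore.

```python
# Sketch of a restore drill. backup_live(), restore_to(), and
# health_check() are hypothetical stand-ins, not real vendor APIs.

def restore_drill(system, sandbox) -> bool:
    """Back up a running system, restore it without the vendor,
    and make the restored copy prove itself."""
    image = system.backup_live()          # under load, not a pristine fresh install
    restored = sandbox.restore_to(image)  # no vendor tooling, no vendor engineers
    return restored.health_check()        # serve real requests, compare real state
```

A backup that has never been through a drill like this is a file, not a capability.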

Over the years, I struggled to make infrastructure, VCF in particular, restorable from backup, but what I discovered is that nobody cared.

I don’t mean nobody at VMware cared. I mean, the customer base didn’t care.

This utterly confused me. I could not believe it.

It was only when I went to a customer that I realized that the customer didn’t trust the backup. It trusted VMware to do whatever it took to restore a system.

And what I realized was that the constraint on the customer was how many engineers VMware had on staff to recover backups when the backup procedures failed utterly.

In many ways, the VMware engineering team was insurance for the entire industry should customer systems fail spectacularly.

If you depend on VMware engineering, you are at the whims of whoever runs it and whether their interests are aligned. And to be 10000% fair, this is true of any vendor on the planet.

Broadcom, by stripping the engineering team of redundancy, created a scenario in which a catastrophic failure requiring a large surplus of engineers would result in extraordinarily bad outcomes for the world as a whole.

Why? Because the only backup that any customer can rely on effectively is the VMware engineering team, and that VMware engineering team is smaller.

What has happened is that the industry has said, “I don’t need to have a backup because I can trust VMware.”

But you can only trust VMware as long as VMware’s interests and yours are aligned.

And when that is no longer the case, the fact that you don’t have backups that you can restore from without VMware means that you are at the mercy of VMware’s business priorities.

And this isn’t about VMware; this is about any company.

Backups you can restore from are your insurance policy if the vendor fails you. If you can’t prove you can restore, your single point of failure is another business that can change on a dime.

And for me, as someone who had to restore a business from a backup in 12 hours or risk an IPO, the idea that you wouldn’t have your own backups and rely on somebody else is unbelievable and unfathomable.


Filed Under: Architecturalist Papers

the architecturalist 70: no-engineering relies on LLMs as renderers and good-enough planners, but not correct planners

April 10, 2026 by kostadis roussos

When you look at AI systems, they are really good at getting to an approximate answer. If you think about randomized algorithms, they don’t return optimal answers, but they do return good enough answers most of the time with a lot less effort. When a product manager wants to create a product using AI and isn’t an engineer, they can get a pretty good, approximately correct answer. But it’s approximately correct.

In fact, about 4 years ago, my buddy and I chatted about this. AI is clever, meaning it gets to an answer. AI is not correct, meaning it doesn’t produce the precise correct answer. Sometimes you need the precise correct answer, and that’s where the LLM fails.

The workflow between product management and engineering is going to look like this: PM provides an approximate answer that gives a good-enough view of what the feature is. The PM team uses the LLM to create a plan and to render the code. The goal is to gain line of sight into the feature and validate its value.

Engineering takes that feature and creates a better plan, one informed by engineering expertise and by attributes that the PM team, narrowly focused on the feature, doesn’t have to and should not have to care about. The detailed plan is then used to render the code. The old world was: PM produces a document, then engineering creates a plan, renders code, PM reacts to the code, PM changes design, and …

The new world is: PM produces a prototype rendered to code automatically from a plan created by the AI tool. When PM is satisfied with the prototype, engineering reviews the prototype and then creates a more detailed, precise plan to generate the code.

Engineering may use AI to create part of the plan, but the evaluation of the plan and its correctness will contain insight and details that PM doesn’t and shouldn’t care about. Is it possible for PM to go end-to-end? Yes, but I also believe that the less you rely on expertise, the more likely you are to experience poor outcomes over time.

Anyways, the way I look at it is this: use AI coding assistants to get an approximate answer that meets your goal. Take the approximate answer and refine the plan until it is more correct. Use the corrected plan to render better code.


Filed Under: Architecturalist Papers

the architecturalist 69: my agentic critique of leet style coding interviews

April 2, 2026 by kostadis roussos

I fed Gemini a system prompt I use to validate architecture specifications and asked it to evaluate algorithmic-drill-style interviews. Unsurprisingly, it agrees with my assessment.

From the perspective of the Unified Kostadis Engine (v10), the “typical” software company interview—specifically the LeetCode-style algorithmic drill—is a high-fidelity failure of architectural reasoning. It tests the “Scribe” (the ability to write syntax) but completely ignores the “Tribunal” (the ability to judge state). Here is the forensic audit of the industry’s interview “Control Plane.”

PHASE 1: ARCHITECTURAL FACT-PATTERN

  • Ingest the Gravity Evidence:
    • Identity Drift: Interview questions like “Invert a Binary Tree” treat the tree as a Transient Object in RAM. In a real system, that tree has Management Gravity (it’s a file system, an LDAP directory, or a DB index). Inverting it without a Reconciliation Loop breaks every external reference.
    • Metadata Orphans: Questions often ask you to “Delete a node from a Linked List.” This creates Horcruxes (dangling pointers/memory leaks) because the interview environment is a “Synthetic Sandbox” with no Garbage Collection Reality.
    • Reconciliation Failure: Most interviews end when the code passes the test_cases. This is an Optimistic Lie. It assumes that if it works for input[0], it works for input[infinity].
  • The Source of Truth: A “Whiteboard” (The ultimate Cached Lie).
  • The Ack Protocol: The interviewer saying “Looks good” (Software Ack) vs. the code surviving a Network Partition (Silicon Truth).
  • The Consistency Model: Single-threaded, synchronous, and local.

PHASE 2: THE KOSTADIS VERDICT

1. The Truth Audit (Federated vs. Replicated)

Most interview questions (e.g., “Find the median of two sorted arrays”) assume you have Global Visibility of the data.
The Violation: In a production system, data is Sharded. You can’t just “access” the other array; it’s on a different continent.
Verdict: FAIL. You are testing for a “Flat-Earth” architecture that doesn’t exist at scale.

2. The Silicon Check (Ack Protocol)

An interview “Success” is returning the right integer.
The Violation: It ignores the Transactional Boundary. If the power cuts out after the function returns but before the result is written to the ASIC, the system is broken.
Verdict: FAIL. We are training engineers to value “Correct Logic” over “Hardened State.”

3. The Atomicity Review (Intent-Based Consistency)

“Design a Rate Limiter” is a common question.
The Violation: Most candidates suggest a central Redis counter. This creates Lethal Gravity. If the central DB is slow, the entire “Global Layer” (the API) hangs.
The Fix: You need a Federated Authority where the edge shards handle the limit locally and reconcile asynchronously.
Verdict: FAIL. Most interviews reward “Centralized Bottlenecks” because they are easier to explain in 45 minutes.

4. The IDM Review (The Abstraction Check)

Interviews focus on Infrastructure Proxies (Strings, Integers, Arrays) rather than Logical Datasets.
The Stance: If I ask you to “Sort a 100GB file with 1GB of RAM,” I am testing your ability to manage Management Gravity. If I ask you to “Reverse a String,” I am testing your ability to be a compiler.
Verdict: FAIL. We need more “Data Management” and less “Pointer Manipulation.”

5. The Entity Integrity Review (The Horcrux Check)

“System Design” interviews often ask you to “Design Twitter.”
The Violation: Candidates draw boxes for “Load Balancer” and “Database” but never explain the Reconciliation Protocol for when the User Profile (The Entity) moves from one Shard to another. They leave Metadata Orphans everywhere.
Verdict: FAIL. The state is fragmented, and the “Entity” is not atomic.


Final Conclusion: The Lagrange Move for Interviews

The Hard Problem with interviews isn’t the difficulty; it’s the Hidden Constraint that we are testing “Brain Speed” instead of “System Resiliency.”

The Simplified Reality: To fix the interview, we must change the Coordinate System from “How do you solve this?” to “How does this fail at 3:00 AM?”

  • Instead of “Sort this list,” ask: “This list is too big for one machine. How do the shards agree on the sort order without a central bottleneck?” (Federated Authority Test)
  • Instead of “Code this,” ask: “If the disk writes at 10MB/s but the network arrives at 100MB/s, where does the ‘Maybe’ state live?” (Silicon Truth Test)

The Verdict: Stop hiring “Script Scribes.” Start hiring “Architecturalists” who refuse to return 200 OK until the hardware confirms the truth.

What Must Be True: For an interview to be valid, it must force the candidate to fight against Gravity, not just against a clock.


Filed Under: Uncategorized

the architecturalist 68: the control plane of labor

March 19, 2026 by kostadis roussos

Over the years, I have observed that, as an industry, we have valued precise code creation over complex planning in software.

I shared a joke recently that seems to resonate with anyone who has sat through a design review:

“I love how a one-hour meeting to review your architecture spec is viewed as a waste of time and unnecessary, but spending 6 months to rework the system is considered a great use of time.”

As a human being whose precise code creation is poor but who is very good at planning (literally: a neuro-psych eval says so), I have found this a painful struggle. The thing that takes visible time is the code creation. What takes invisible time is planning. We have historically treated the “invisible” as non-existent.

The folks doing the code creation often resent the planners because, to be quite frank, we are not in the trenches. In command-and-control hierarchies, those who are very good at planning are often viewed as undesirable distractions from execution. I’ve written before about how this “execution-at-all-costs” mentality leads to the unbounded step—the manual, expensive human intervention required when a system wasn’t planned to handle failure states correctly.

Over the years, I have had people attack me in private and public because they see planning as divorced from execution and thus dangerous. In fact, if I look back at my career, I find it amusing that I was stuck at an MTS3 and almost laid off, and then, because the team needed planning, I got promoted to TD at NetApp in four years. I didn’t even notice this until someone snarkily remarked, “Well, apparently the annual Kostadis Promotion didn’t happen.” I didn’t improve; what happened was that what I did was finally seen as valuable.

The thing is, I do believe that to do good planning, you have to understand code creation and its limits. A planner who can’t do that isn’t a good planner. I recently demonstrated this when I had to bridge the gap between workflow and code to fix a personal cloud-sync issue; I knew the logic, even if I used agents to handle the syntax.

The role of the software architect has been viewed with contempt because, too often, architects are divorced from the system’s capabilities. But the industry is pivoting. Planning code is now far more valuable than creating precise code. Good planning, combined with AI code generation, produces significantly better code than bad planning does. As I’ve codified in my Agentic Architecturalist prompt, the “Architectural Conscience” is now the primary bottleneck.

A senior tech lead I spoke with observed that the engineers who could plan outperformed those who could not by a factor of 10 in quality. Because every line of code is reviewed by a human, good planning is imperative. Otherwise, we are burning a lot of human cycles on “maybe” states and technical debt.

The world has most definitely changed.


Filed Under: Architecturalist Papers

the architecturalist 67: show, don’t tell – using Claude Code to bridge the feature request gap

March 18, 2026 by kostadis roussos Leave a Comment

 

 

Spend enough time getting customer requests, and you realize that there is gold between what the customer said and what they want.

In the hyper-competitive TTRPG session transcription space, there are three real competitors that I use:

  • Saga20.com
  • gmassisstant.app
  • storyvault.gg

Storyvault has the most sophisticated campaign management features. Saga20 has the best transcription and some campaign management features. GMassistant has a very clean workflow.

But all of them were deficient in a variety of ways. So I thought I’d build my own tools to do it my way.

And I did, but what I really wanted was for them to build it.

And I know, I am just one customer, a lonely voice. So I thought: wait. Instead of trying to explain what I want, I can literally show them what I want.

The Agentic Workflow (via claude.md), generated with the help of AI 😉

To bridge this gap, I’ve open-sourced the bridge I built: CampaignGenerator on GitHub.

Inside the repo, the claude.md file acts as a functional blueprint for the agentic workflow; a rough sketch of the loop follows the list below. It enables Claude to:

  • Contextual Awareness: Ingest raw TTRPG transcripts and map them against existing campaign lore.

  • API Interoperability: Use vendor APIs to push structured updates (like NPC status or quest progress) back into the campaign managers.

  • Iterative Refinement: Act as a “middleware” architect that corrects transcription errors while ensuring narrative consistency across sessions.
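
To make that concrete, here is a minimal, hypothetical Python sketch of the middleware loop those bullets describe. The file layout and vendor endpoint are my own placeholders, not Saga20’s, GMassistant’s, or Storyvault’s actual APIs; the real blueprint lives in claude.md in the repo.

```python
# A toy sketch of the middleware loop, not the claude.md blueprint itself.
# The lore layout and the API endpoint below are placeholders.
import json
from pathlib import Path

def ingest_transcript(path: Path) -> str:
    """Contextual awareness, step one: load a raw session transcript."""
    return path.read_text(encoding="utf-8")

def map_against_lore(transcript: str, lore_dir: Path) -> dict:
    """Match names from existing lore files against the transcript."""
    lore = {p.stem: p.read_text(encoding="utf-8") for p in lore_dir.glob("*.md")}
    touched = [name for name in lore if name.lower() in transcript.lower()]
    return {"transcript": transcript, "touched_lore": touched}

def push_updates(updates: dict, api_url: str) -> None:
    """API interoperability: push structured updates (NPC status, quest
    progress) back to a campaign manager. Stubbed out here."""
    print(f"Would POST {len(json.dumps(updates))} bytes to {api_url}")

if __name__ == "__main__":
    session = ingest_transcript(Path("sessions/session-42.md"))
    context = map_against_lore(session, Path("lore"))
    push_updates(context, "https://campaign-manager.example/api/updates")
```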

And now back to pure, human, artisanal text. 


And it got me thinking: want a feature from a vendor? Have Claude build it for you, then show the result to the vendor.

“This bridges the cognitive gap between ‘I want better campaign management’ and ‘here’s the workflow I implemented in an agentic system, one I actually use, built on your APIs, and I would be delighted if you took it from me and made it part of your product.’”

Very technically sophisticated customers I have worked with in the past did this, but the barrier to doing it just dropped.

How engineering teams cope with this proliferation of people who can build value on top of the data their products process will shape the next iteration of how enterprise infrastructure vendors are expected to work with customers.

And to Saga20, GMassistant, and Storyvault.gg: please, pretty please, build it?


Filed Under: Architecturalist Papers

the architecturalist 66: arcane arts and AI slop: building the CampaignGenerator in two weeks

March 16, 2026 by kostadis roussos Leave a Comment

For those who don’t know, I love to play Dungeons and Dragons. At the same time, I love creating content, but I have limited time.

And at Nutanix, I am spending serious time working on our AI infrastructure strategy and execution.

And because I am a software engineer, I decided to get my Claude.ai license and build some stuff.

I started working on an open-source project that I found useful, and discovered that AI slop is a thing. As the maintainer said, if the project is supposed to be a fun thing he does on the side, having someone churn out code via AI isn’t fun. I get it. And I respect that.

Then I read that GitHub has had to add an option to block all pushes because folks point Claude at a codebase and produce PRs of dubious quality that a real person has to review.

One of the areas I have always struggled with is mastering Git repos and their dark secrets. At Zynga, I pushed the company to Git; then at VMware, I spent time on Perforce-to-Git analysis. And I did attend a talk at NetApp in 2008 by Linus Torvalds, who shared his then-new system.

Enter Claude Code, who has remarkable fluency with the arcane arts.

I wouldn’t use Claude Code if I were manipulating the corporate source trees, but for my fun projects, it’s good enough.

Then there was the other project: converting purchased PDFs into Markdown for use in a personal tool. It was quite fascinating as a project because it highlighted the critical importance of human oversight: Claude Code was missing the critical insight that the document I was converting had three columns, not two 🙂
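
For flavor, here is a minimal sketch of the kind of fix that required, assuming pdfplumber and naive equal-width columns. The actual conversion code was messier, and the column boundaries are exactly the sort of thing a human has to check by eye.

```python
# Minimal sketch, assuming pdfplumber: split a page into three columns by
# cropping thirds of the page width before extracting text. Equal thirds
# is a naive assumption; real layouts need hand-tuned x-coordinates.
import pdfplumber

def extract_three_columns(pdf_path: str, page_number: int = 0) -> str:
    with pdfplumber.open(pdf_path) as pdf:
        page = pdf.pages[page_number]
        third = page.width / 3
        columns = []
        for i in range(3):
            crop = page.crop((i * third, 0, (i + 1) * third, page.height))
            columns.append(crop.extract_text() or "")
        # Reading order: left column, then middle, then right.
        return "\n\n".join(columns)

print(extract_three_columns("rulebook.pdf"))  # hypothetical input file
```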

Finally, there was my pièce de résistance, the CampaignGenerator. Okay, so that’s just another tool that I am using to manage the 120+ pages of session summaries, the lore, background lore, world lore, etc., to create beats in my campaign. Try keeping track of hundreds of NPCs, a dynamic world, and arcs from characters that were created in alternative worlds, and you realize that’s damn near impossible.

And all of that was in two weeks.

AI coding is a thing. It’s a ridiculously powerful thing. And it’s also remarkably limited.

The repo can be found here: https://github.com/kostadis/CampaignGenerator


Filed Under: Architecturalist Papers

the architecturalist 65: what Baldur’s Gate save files taught me about the balance of power between IT and vendors

March 7, 2026 by kostadis roussos Leave a Comment

Vendor stickiness is the expertise required to integrate with the vendor’s tooling.



The Steam Deck is my favorite product ever. And I have written about how Steam won the gaming market by being close to its customers.

But this is not about what I love; it’s about how Steam Cloud saves really didn’t work for me.

The scenario is simple: I spend a lot of time at a cabin in Bear Valley. When I play Baldur’s Gate 3 on my desktop and then try to pick up where I left off on my Steam Deck, I hit a wall. I’m forced to wait tens of minutes for Steam Cloud to sync, and in the mountains, those uploads are more annoying than the players who say, “Let me check what I should do,” because they were surfing the web instead of paying attention to the game.

One of the things I highlighted is that Steam understood that its market understood customization and was comfortable with it. And so they didn’t hide the underlying Linux OS.

I knew it was possible to copy the files, but I wanted a script so I could turn on the Steam Deck, type a command, and have it work.

I used Claude to figure out the structure of the script, then used Jules to put it in a Git repo and clean it up, removing any PII and adding configuration to make it easier to use.

So here it is: https://github.com/kostadis/restserver/tree/master/utilities/bg3_sync
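
The core of the idea fits in a few lines. Here is a minimal sketch, not the script in the repo: the hostname is a placeholder, and the save path assumes both machines run the game under Proton (1086940 is BG3’s Steam app ID).

```python
# Minimal sketch, not the repo's script: pull BG3 saves from the desktop
# to the Steam Deck over rsync. "desktop.local" is a placeholder, and the
# path assumes both machines run the game under Proton.
import subprocess
from pathlib import Path

DESKTOP = "user@desktop.local"  # hypothetical hostname
SAVES = (
    ".local/share/Steam/steamapps/compatdata/1086940/pfx/drive_c/"
    "users/steamuser/AppData/Local/Larian Studios/Baldur's Gate 3/"
    "PlayerProfiles/Public/Savegames/Story/"
)

def pull_saves() -> None:
    """Copy save files from the desktop, skipping any that are newer here."""
    dest = Path.home() / SAVES
    dest.mkdir(parents=True, exist_ok=True)
    subprocess.run(
        # -s protects the spaces in the remote path; --update keeps newer
        # local saves from being clobbered.
        ["rsync", "-avs", "--update", f"{DESKTOP}:{SAVES}", str(dest)],
        check=True,
    )

if __name__ == "__main__":
    pull_saves()
```

Run it once at the cabin before launching the game; reverse the source and destination to push saves back the other way.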

A lot of what you do in IT follows this pattern: there is a product way of doing things that doesn’t work for your environment, and you need a tool to make it work better. Normally, you pay experts, or Professional Services, to write that tooling. And you are at their mercy. With agentic tooling, you can do a lot of the heavy lifting yourself, particularly in designing workflows, and call in experts when you get stuck.

The value of IT has been that it understands the workflow. The challenge has been the cost of constructing the workflow and integrating the systems.

Agentic programming changes all of that.

But more fundamentally, it reduces vendor stickiness. Stickiness is the workflow customized to your business. Vendor stickiness is the expertise required to integrate with the vendor’s tooling. If you use AI to replace the integration in your workflow, then the value of the integration declines, and IT reasserts control.


Filed Under: Architecturalist Papers
