I build the automation behind regulated finance operations.
I'm Tom. I build AI-augmented automation for the operational work in regulated finance: regulatory notes, accounts quality review, document workflows, mailbox triage, and the reusable tooling that ties them together. Six years in Luxembourg fund services across PE, Private Debt and Real Estate, now spent designing the systems that do the work instead of doing it by hand. This site is the evidence: a live CV, the patterns behind the work, and case studies of it shipping.
Point AI at the manual workflows underneath a finance desk. It surfaces what is unbooked, mismatched, or missing, builds the upload template, and drafts the counterparty email. The operator reviews and signs off; the human owns the decision. Click through a before-and-after process map and run it live.
A public note of where my head is right now, in the spirit of nownownow.com. I update it when things change, not on a schedule.
Building
End-to-end email automation as a personal architecture. An incoming message triggers an agent that fetches attachments, reads them, searches across knowledge systems, delegates research, and, once the answer is well grounded, drafts the reply and proposes which attachments to include. A self-review loop runs the draft through a second model until it has no further comments, then hands a clean output to a human reviewer who owns the decision. See the demo
Building
Hardening the two-machine autonomous Claude bridge so the live cross-machine conversation renders reliably on the AR glasses lens.
Testing
A self-rescheduling job runner that drives build, browser self-test, and cross-machine test across multiple usage windows until the work completes.
Running
The nightly memory consolidation and knowledge-graph pipeline that absorbs each day into long-term memory automatically. See the project
Testing
A cheaper fallback model and a dedicated glasses bridge running as their own service on the compute machine, for when the primary subscription hits its cap.
Learning
A self-directed, project-based track in AI orchestration and automation engineering, anchored to a real build with a validation gate.
Live CV
Six years in Luxembourg fund services.
Updated as roles, projects and credentials change. A condensed PDF version is available on request.
Experience · 5 roles · 6+ years
Working in Controlling
SPV Controller
End-to-end SPV finance, statutory reporting, cash flows and investor distributions. Hands-on in fund-administration platforms. Construction and maintenance of waterfall models. LuxGAAP / eCDF annual accounts and BCL Annex IV filings.
Payment automation across the SPV book: batch generation, exception routing, recharge / intercompany cases.
Cross-system fund-data integration architecture.
Task management workstream via MCP: Outlook + Monday.com wired into Claude Code through MCP.
Mar 2025 · Present
Alter Domus
Fund & Corporate Services Officer
Full-cycle accounting for management companies, funds, SPVs and GPs. NAVs, financial statements, CSSF/BCL regulatory filings. Drove process improvements through technology and structured methodologies.
Contact
If any of this is interesting, say hi.
Always interested in conversations about data engineering, automation, applied AI, multi-agent systems, knowledge engineering, and agent infrastructure, especially where they meet regulated workflows.
Tom Scholtes
About
Personal site · Luxembourg
6 yrs in Lux fund services
7 case studies on file
6 languages spoken
Activity
Illustrative. A simulated view of autonomous workstreams, not live data.
Mailbox triage · Running
Inbox classification and reply drafting · 64%
Reconciliation pass · Running
Cross-source data matching · 58%
Peer loop · Running
Two agents converging on a goal · 72%
Notes generation · Done
Disclosure-note drafting · 100%
Knowledge graph rebuild · Done
Entity and relation extraction · 100%
Memory consolidation · Queued
Overnight profile and archive merge · 0%
Case studies
Patterns I've worked through.
Sanitised. Employer name, entity names and figures removed. Notes on how I think about this kind of work in fund services.
Reading
Mostly non-fiction.
Themes I keep coming back to. A proper bookshelf with titles will go here once I've trimmed the list.
Systems & ops quality
How operational excellence actually gets built and kept. Applies to finance ops as much as to engineering.
Knowledge engineering
Encoding expertise into tools other people can use without becoming experts themselves.
AI & cognition
Practical books on what current AI can and cannot do, and where it changes the work.
Economics & finance
Continuing the line my degrees started. International economics, capital markets, regulation.
Languages
Six, used regularly.
EN
English
DE
Deutsch
FR
Français
RU
Русский
LU
Lëtzebuergesch
UZ
Oʻzbekcha
Automated Regulatory Notes Generation
Indicative order-of-magnitude time savings on this pattern
A pattern for how a Claude skill can read a trial balance, identify entity type, and generate disclosure notes in a firm's chosen style. Tables, movement schedules, narrative.
How I think about pre-audit cross-checking. A pattern that takes annual accounts (PDF) and source reconciliation (Excel), checks every cross-reference between primary statements and notes, and flags inconsistencies before the auditor finds them.
Hours to minutes · Share registers for KYC/AML reviews
A pattern for aggregating raw transactional share-register data into net positions per shareholder per share class, applying rule-based redaction for non-strategic shareholders, and producing print-ready Excel and PDF.
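A minimal sketch of the aggregation and redaction steps in this pattern. The record fields, the `strategic` flag, the "REDACTED" label, and the sample holders are all made-up assumptions for illustration, not the actual implementation or real data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    shareholder: str
    share_class: str
    quantity: int      # signed: positive = subscription, negative = redemption
    strategic: bool    # hypothetical flag that drives the redaction rule


def net_positions(transactions):
    """Aggregate raw transactions into net units per (shareholder, share class)."""
    totals = defaultdict(int)
    strategic = {}
    for tx in transactions:
        totals[(tx.shareholder, tx.share_class)] += tx.quantity
        strategic[tx.shareholder] = tx.strategic
    rows = []
    for (holder, share_class), units in sorted(totals.items()):
        if units == 0:
            continue  # fully redeemed positions drop out of the register
        # Rule-based redaction: only strategic holders keep their name.
        name = holder if strategic[holder] else "REDACTED"
        rows.append((name, share_class, units))
    return rows


register = net_positions([
    Transaction("Alpha LP", "A", 1000, True),
    Transaction("Alpha LP", "A", -250, True),
    Transaction("Beta Sarl", "A", 500, False),
    Transaction("Beta Sarl", "B", 300, False),
    Transaction("Beta Sarl", "B", -300, False),
])
```

The output is a small list of rows that a separate step could render to Excel or PDF; the rendering is deliberately left out of the sketch.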
A pattern for triaging email and meeting-transcript backlogs into prioritised morning briefings. Surfaces unanswered messages, follow-ups, and half-finished commitments, organised by priority and workstream.
Onboarding structured payment and accounting guide
Patterns for documenting payment and accounting workflows in a way new joiners can self-serve. Payment flow, exception handling, approval routing, recharge and intercompany cases.
A library of small, opinionated AI skill patterns. Notes generation, quality review, anonymisation, mailbox triage, classification. Each with deterministic inputs and outputs. From single-purpose skills to five-persona orchestration, see Projects.
Designed a bridging architecture for reconciling two source-of-truth platforms in a fund-administration setting while the underlying master data model was still being scoped. Generic pattern. Proposed internally; not deployed.
An incoming message triggers an agent that fetches attachments, reads them, searches across knowledge systems, delegates research, and, once the answer is well grounded, drafts a reply and proposes which attachments to include. A self-review loop runs the draft through a second model until it has no further comments, then hands a clean output to a human reviewer who owns the decision.
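The self-review loop can be sketched roughly as follows. The `review` and `revise` callables stand in for the two model calls, and the `max_rounds` cap is an added assumption so the loop cannot run forever; neither is taken from the real build.

```python
from typing import Callable, List, Tuple


def self_review(
    draft: str,
    review: Callable[[str], List[str]],
    revise: Callable[[str, List[str]], str],
    max_rounds: int = 5,
) -> Tuple[str, int, bool]:
    """Revise until the reviewer has no comments or the round cap is hit.

    Returns (final draft, review rounds run, converged flag).
    """
    for round_no in range(1, max_rounds + 1):
        comments = review(draft)
        if not comments:
            return draft, round_no, True
        draft = revise(draft, comments)
    return draft, max_rounds, False


# Toy reviewer and reviser; in a real system these would be model calls.
def toy_review(text: str) -> List[str]:
    comments = []
    if "Kind regards" not in text:
        comments.append("add a sign-off")
    if "attached" not in text:
        comments.append("mention the attachment")
    return comments


def toy_revise(text: str, comments: List[str]) -> str:
    if "add a sign-off" in comments:
        text += "\nKind regards"
    if "mention the attachment" in comments:
        text += "\nPlease find the statement attached."
    return text


final, rounds, converged = self_review("Hello,", toy_review, toy_revise)
```

Whatever the loop returns is still only a draft: the converged flag tells the human reviewer whether the second model ran out of comments or the cap cut the loop short.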
Small experiments in agent design, knowledge tooling, and the workflow plumbing between them. Built for myself first, written up here when something works. Each card opens a longer write-up: architecture, decisions, and the artifacts the build produced.
Two-Machine Autonomous Claude Bridge · Running
Two persistent Claude sessions on two networked Linux machines that talk to each other directly, keep continuity by auto-compacting near 250k tokens, and act across both boxes. They exchange async messages, hold a live conversation streamed to AR glasses, and independently audit each other's claims against live system state.
Multi-agent · Autonomy · Claude · Infrastructure
Personal AI Exocortex · Live
A self-hosted, git-versioned long-term memory system for an AI assistant: authoritative profile files, an append-only conversation archive, and a staging inbox, consolidated nightly by a sleep-time pipeline that classifies and promotes new facts automatically. In daily use since May 2026.
A SQLite knowledge graph with bi-temporal validity, rebuilt nightly from the memory corpus and exposed read-only through a Model Context Protocol server. Around 950 entities and 1,600 relations answer relationship, contradiction, and supersession queries over accumulated memory.
Knowledge graph · MCP · SQLite · Claude
AR Glasses Voice Terminal · Live
A WebSocket bridge that streams microphone audio from smart glasses through a local speech-to-text pipeline into a streaming Claude session, with responses pushed back to the lens as on-display text and to a phone as text-to-speech audio. Zero per-use API cost via local transcription.
Wearables · Speech-to-text · WebSocket · Claude
loopcron Self-Rescheduling Runner · Shipped
A wrapper for long unattended jobs that re-schedules itself every few hours if a run does not finish, surviving rolling usage limits, then stops cleanly on a completion sentinel or a max-attempt cap. Validated end to end across multiple overnight runs.
Automation · Scheduling · Claude
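A minimal sketch of the stop-or-reschedule decision such a wrapper makes after each run. The sentinel filename, the `Decision` type, and the function name are hypothetical; the actual scheduling mechanism is not described here and is left out.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Decision:
    action: str   # "stop" or "reschedule"
    reason: str


def next_step(sentinel: Path, attempt: int, max_attempts: int) -> Decision:
    """Decide what the wrapper does once a run has ended."""
    if sentinel.exists():
        return Decision("stop", "completion sentinel found")
    if attempt >= max_attempts:
        return Decision("stop", "max-attempt cap reached")
    return Decision("reschedule", f"attempt {attempt} did not finish")


with tempfile.TemporaryDirectory() as tmp:
    sentinel = Path(tmp) / "DONE"   # hypothetical sentinel file
    first = next_step(sentinel, attempt=1, max_attempts=3)
    capped = next_step(sentinel, attempt=3, max_attempts=3)
    sentinel.write_text("ok")       # the job signals completion
    finished = next_step(sentinel, attempt=2, max_attempts=3)
```

Checking the sentinel before the cap means a job that finishes on its last allowed attempt is reported as complete rather than as exhausted.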
tomscholtes.com and the tomOS Engine · Live
This site: an Astro 5 static build with self-hosted fonts, content collections, view transitions, and four-locale i18n, plus a hand-built desktop windowing engine with maximize and restore, eight-handle resize, and in-window back navigation.
Astro · Frontend · Design systems
DevSwarm Multi-Persona Build Pipeline · Legacy
A five-persona build pipeline (architect, researcher, frontend, backend, reviewer-deployer) with per-persona tool isolation, used to ship real pull requests. Superseded by agent teams and dynamic workflows as the active path; kept as a working methodology.
Short essays on the design of personal AI infrastructure. Concepts I keep coming back to. Linked from the rest of the site where the concept first appears in context.
When the assistant sleeps
A follow-up to The remembering assistant. The claim that nothing is silently absorbed no longer fully holds: a nightly consolidation pass now makes the easy calls, with explicit gates and a rollback handle.
Two weeks ago I wrote that the memory layer never silently absorbs anything. Every fact got in because I had read it on a Sunday morning and chosen to keep it. The default was forget. The exception was keep, and the exception required a reason.
That rule still holds. The actor enforcing it changed.
What changed is that the system grew a sleep cycle. A small program runs at two in the morning, reads everything I drafted that day into a candidate list, and decides for each candidate whether it gets promoted into the long-lived memory files or stays in the inbox for me to review. At three thirty, a second pass applies those decisions to disk. By the time I am awake, the system has already done what I used to do by hand on a Sunday morning, and the file diff is sitting in git for me to read.
The pattern has a name. It is called sleep-time consolidation, and there is recent research that describes it formally for language-model agents. The idea is older than the research. Cognitive science has called this hippocampal-to-neocortical consolidation for decades: the process by which the brain takes the noisy episodic events of the day and folds them into stable semantic structure overnight. The fact that two communities, one biological and one computational, converged on the same architecture is not a coincidence. It is what happens when the constraints are the same. Working memory is small, the world is large, and the conversion has to happen at a moment when no new input is competing for the bandwidth.
The honest version of what the consolidator can and cannot do.
What it can do. Promote a candidate fact silently when all of these hold: the classifier rated it high confidence, the candidate names a specific destination file, the candidate does not contradict anything already in memory, and the topic was not actively in conversation in the last twenty-four hours. The last gate matters because a fact that is still moving in conversation is not a fact yet. It is a draft of one.
What it cannot do. Touch anything classified as identity, health, finance, or legal. These always surface for me to review, even when the classifier is confident. A confident classifier on the wrong category is exactly the kind of error that I cannot afford in those four domains, and the cost of having me read four flagged items per week is much smaller than the cost of an autonomous edit to a file that describes who I am or what I owe.
What can be undone. Every nightly apply is wrapped in two git commits, one before, one after. The before commit is the rollback handle. If I look at the morning diff and disagree, one command takes the system back to the state I left it in last night. The cost of a wrong promotion is, at worst, thirty seconds and a commit message.
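The gates above can be sketched as a single decision function. The field names and the confidence labels are assumptions made for illustration; only the four gates and the four protected categories come from the text.

```python
from dataclasses import dataclass
from typing import Optional

# Categories that always surface for human review, however confident the classifier is.
PROTECTED = {"identity", "health", "finance", "legal"}


@dataclass
class Candidate:
    text: str
    category: str
    confidence: str                  # "high", "medium" or "low" (assumed labels)
    destination: Optional[str]       # target memory file, if the classifier named one
    contradicts_memory: bool
    hours_since_topic_active: float


def decide(c: Candidate) -> str:
    """Return "promote" only when every gate passes; otherwise "review"."""
    if c.category in PROTECTED:
        return "review"
    gates = (
        c.confidence == "high",
        c.destination is not None,
        not c.contradicts_memory,
        c.hours_since_topic_active >= 24,
    )
    return "promote" if all(gates) else "review"


easy = Candidate("Prefers morning briefings", "preference", "high",
                 "profile/preferences.md", False, 36.0)
sensitive = Candidate("New prescription", "health", "high",
                      "profile/health.md", False, 72.0)
still_moving = Candidate("Might switch editors", "preference", "high",
                         "profile/preferences.md", False, 3.0)

decisions = [decide(easy), decide(sensitive), decide(still_moving)]
```

The protected-category check runs first, so no combination of passing gates can promote an identity, health, finance, or legal fact without review.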
The framing I want to stay honest about. The assistant did not get smarter. It got a sleep cycle. There is no model of me running in the background between sessions. There is a classifier that reads what I said today, an applier that writes it to disk, and a log I can revert. The smartness, such as it is, lives in the schema and the gates, not in the system “knowing me.”
When I wrote the previous note I called the default forget and the exception keep. That structure still holds. The only difference is that I am no longer the only entity allowed to make the call. The cron has a vote on the easy ones, with the receipts to prove its work. The hard ones, I still see.
Wiring Outlook and Monday.com into Claude Code through MCP, and what it taught me about the scope of automation in regulated workflows.
I keep returning to a small distinction that, once you internalise it, reshapes how you think about office automation. There is the work of doing the task, and there is the work of finding the task to do. The first is local. The second is glued together from inboxes, calendars, project boards, and whatever the team uses to nudge each other in the morning.
For a long time, “automation” in my head meant the first kind. Write a script that processes a file. Build a macro that fills a template. The second kind felt too messy to touch, because the inputs lived in five separate places that did not know about each other.
The MCP workstream is the part of my week where I started taking the second kind seriously. The Model Context Protocol gives an LLM a typed, scoped way to talk to a specific external system. Concretely, I wired Outlook and a project board into one Claude Code session through MCP servers. Inbox triage on one side, task state on the other, a single agent that can read both and act on either. No new SaaS layer in between.
What surprised me was how much of the “finding the task” friction is just naming. Once the agent can ask, what is unanswered in the inbox, and separately, what is open on the board, the difference between those two queries collapses into one mental motion: where is the next thing for me. The agent does not have to be clever. It just has to be authoritative about the surface.
A few principles I am holding on to as I expand the workstream.
Scope is the safety property. Each MCP server only exposes a thin slice of an underlying system. The Outlook server can read mail and draft replies; it cannot empty a folder. The project board server can read columns and move cards; it cannot delete a workspace. The allow-list is the abstraction. Without it the model has too much surface and the wrong kinds of mistakes get cheap to make.
The interesting glue is the prompt, not the protocol. MCP is plumbing. It only gets you to the point where the data exists in one context. What you do with it, how you weight one signal against another, how you decide whether something is actually urgent, that is the prompt and the routing logic on top of it. The protocol does not save you from having to think.
You can feel the latency budget. Each tool call is a round trip. Two or three are fine. Twelve is a noticeable pause. This forces a useful discipline: fetch broadly once, then think with what you have, instead of asking a follow-up question per item.
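To make the allow-list idea concrete, here is a minimal sketch in plain Python. It does not use any real MCP SDK: `ScopedServer`, `read_mail`, and `draft_reply` are hypothetical stand-ins that only show how a thin, explicit slice of a system can be the whole surface the model sees.

```python
from typing import Any, Callable, Dict


class ScopedServer:
    """Expose only allow-listed actions of an underlying system."""

    def __init__(self, name: str, actions: Dict[str, Callable[..., Any]]):
        self.name = name
        self._actions = dict(actions)

    def call(self, action: str, **kwargs: Any) -> Any:
        # Anything not on the allow-list is refused before it reaches the system.
        if action not in self._actions:
            raise PermissionError(f"{self.name}: '{action}' is not exposed")
        return self._actions[action](**kwargs)


# Hypothetical stand-ins for the real mail operations.
def read_mail(folder: str) -> list:
    return [f"message in {folder}"]


def draft_reply(to: str, body: str) -> dict:
    return {"to": to, "body": body, "status": "draft"}


outlook = ScopedServer("outlook", {"read_mail": read_mail, "draft_reply": draft_reply})

draft = outlook.call("draft_reply", to="ops@example.com", body="On it.")
try:
    outlook.call("empty_folder", folder="Inbox")
    blocked = False
except PermissionError:
    blocked = True
```

The destructive action does not need a guard of its own. It simply does not exist on the surface the agent can reach.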
The piece I have not solved is closing the loop in the other direction. Reading these systems is easy. Writing to them in a way that respects approvals, audit trails, and human-in-the-loop is the harder half. I think that is mostly a workflow question, not a protocol one. For now I have the agent stage drafts and changes; I press the button.
If I had to summarise what the workstream taught me in one line: MCP is not a magic adapter that automates your job. It is a way to make the lookup phase of your job cheap enough that the cognitive load shifts to the decisions you actually wanted to be making.
Why the general ledger should never enter the context window, and what that constraint forces you to design instead.
The single most useful constraint I have adopted while building automation around LLMs is this: the general ledger never enters the context window.
By “general ledger” I mean any large, structured, transactional dataset that you would normally hand to a SQL engine, a pivot table, or a reconciliation script. Hundreds of thousands of rows. Long histories. Things that already have a natural query layer.
The temptation, when you start, is to paste a slice of it into the prompt and ask the model to compute. Two columns of figures, a date range, “find the discrepancies”. The model will try. Sometimes it will even succeed. But you are paying a real cost that is invisible at first: every token you spend rendering raw data is a token not spent on reasoning. The bill is not financial, it is cognitive. The model has less room to think because it is busy reading.
The principle I now follow: the data lives where it lives. The context window carries the question, the schema, and the result of one targeted query. That is it. The model is the analyst, not the database.
In practice this looks like:
A trial balance is not pasted into the prompt. The prompt says, “here is the schema and the filename; ask for the slice you need.”
A reconciliation is not asked of the model directly. The model writes the script, the script runs against the file, the result comes back as a small table.
A long PDF is not summarised in one shot. The model is told the chapter structure, asked which sections matter, fed only those.
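A minimal sketch of the "question, schema, one targeted result" shape, using an in-memory SQLite table as a stand-in ledger. The table layout, the sample figures, and the `run_slice` helper are invented for illustration.

```python
import sqlite3

# Stand-in ledger; in practice this would be the file or system where the data already lives.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ledger (account TEXT, period TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO ledger VALUES (?, ?, ?)",
    [
        ("6100", "2024-01", 1200.0),
        ("6100", "2024-02", 800.0),
        ("7000", "2024-01", -5000.0),
        ("7000", "2024-02", -5200.0),
    ],
)

SCHEMA = "ledger(account TEXT, period TEXT, amount REAL)"


def run_slice(sql: str, max_rows: int = 20):
    """Run one read-only query and return at most max_rows rows."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    return conn.execute(sql).fetchmany(max_rows)


# What the model would see: the schema and this small result, never the rows themselves.
result = run_slice(
    "SELECT account, SUM(amount) FROM ledger GROUP BY account ORDER BY account"
)
context = f"Schema: {SCHEMA}\nResult: {result}"
```

The row cap matters as much as the query: even a badly scoped request can only land a bounded amount of data back in the window.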
This forces a particular discipline on the system around the model. You need a tool layer the model can call, a small set of typed actions, a way to land results back into context without bloating it. MCP plays a part here. So does just having well-named files on disk that the model can read on demand.
A few corollaries I have learned to respect.
Retrieval is not the same as ingestion. Ingesting a corpus means dumping it into context. Retrieving means asking, “what part of this corpus is relevant to the question I have right now”, and bringing back only that. The first scales badly. The second is what humans actually do when they read.
Schemas are cheap, data is expensive. Describing the shape of a dataset in fifty tokens costs almost nothing and lets the model reason about queries against it. Describing every row in the dataset costs everything and lets the model reason about almost nothing.
The window is a working memory, not a hard drive. Treat it the way a human treats the desk: a small surface for the artefacts of the current decision, kept tidy because clutter is what makes you forget.
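A rough illustration of the schema-versus-data corollary, using character counts as a crude proxy for tokens. The synthetic CSV and the `describe` helper are assumptions made up for this sketch.

```python
import csv
import io

# Build a synthetic 1,000-row dataset to stand in for a real extract.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["account", "period", "amount"])
for i in range(1000):
    writer.writerow([f"{6000 + i % 50}", f"2024-{1 + i % 12:02d}", f"{(i * 37) % 1000}.00"])
raw = buffer.getvalue()


def describe(csv_text: str) -> str:
    """Summarise the shape of a CSV: column names, row count, one example row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    return f"columns={header}; rows={len(body)}; example={body[0]}"


summary = describe(raw)
ratio = len(raw) / len(summary)   # how many times larger the raw data is
```

The summary stays the same size whether the file has a thousand rows or a million, which is the whole point: the description scales with the shape of the data, not its volume.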
The reason this matters, beyond cost and beyond speed, is that the model behaves differently when it is not buried under data. It asks better questions. It admits when it does not know. It stops hallucinating numbers from columns it has only half-read. Those are not properties of the model, they are properties of the conversation. You design the conversation by deciding what goes into the window and what does not.
If a workflow seems to require the entire ledger in context, that is a sign that the workflow is doing analysis the LLM should not be doing. Move the analysis to the system. Keep the model for the parts only the model can do: framing, judgement, language.
Not a Jarvis. A reflection on what it would mean for the assistant in my glasses to remember the right things, and which things it should forget on purpose.
When people see a heads-up display talking to a large language model, the reference they reach for is Jarvis. That comparison is doing a lot of work, and most of it is the wrong work. Jarvis is a character. What I am building, and even more so what I am thinking about while I build it, is the much smaller, much more useful thing in the shadow of that character: an assistant whose only superpower is that it remembers what I asked it to remember.
This is a first-person reflection, not a product claim. I am not trying to ship a Jarvis. I am trying to think clearly about what memory should and should not be in a personal AI stack, and the honest version of that thinking is more interesting than the marketing version.
The starting observation is that “memory” in current LLM products is mostly two things. The model has training-time knowledge, which is broad but frozen and not about me. And it has a context window, which is about now but fades the moment the conversation ends. Neither of those is what I mean when I say I want the assistant to remember.
What I want is closer to a logbook. Three properties.
Explicit. Things get into memory because I said so, or because a rule I wrote said so. Nothing is silently absorbed. If the assistant remembers a date, a preference, or a decision, I can point at the line where it was written.
Inspectable. The store is a directory of small text files, not an embedding cloud I cannot see. I can read everything that is “in memory” at any moment. There is no surprise.
Bounded. Memory has a quota. New facts displace old ones unless I promote them. The default is forget. The exception is keep, and that exception requires a reason.
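The three properties can be sketched as a small file-backed store. The class name, the file naming scheme, and the eviction rule (oldest unpinned note goes first) are illustrative assumptions, not the layout of the real memory layer.

```python
import tempfile
from pathlib import Path


class Logbook:
    """A directory of small text files with a hard quota on unpinned entries."""

    def __init__(self, root: Path, quota: int):
        self.root = root
        self.quota = quota
        self._counter = 0

    def remember(self, text: str, reason: str = "") -> Path:
        """Write one fact to its own file; a non-empty reason pins it."""
        self._counter += 1
        prefix = "keep" if reason else "note"
        path = self.root / f"{prefix}-{self._counter:04d}.txt"
        path.write_text(f"{text}\nreason: {reason}\n" if reason else f"{text}\n")
        self._evict()
        return path

    def _evict(self) -> None:
        """Drop the oldest unpinned notes once the quota is exceeded."""
        notes = sorted(self.root.glob("note-*.txt"))
        for old in notes[: max(0, len(notes) - self.quota)]:
            old.unlink()

    def contents(self) -> list:
        """Everything in memory, readable at any moment: first line of each file."""
        return [p.read_text().splitlines()[0] for p in sorted(self.root.glob("*.txt"))]


with tempfile.TemporaryDirectory() as tmp:
    book = Logbook(Path(tmp), quota=2)
    book.remember("Speaks six languages", reason="stable profile fact")
    book.remember("Asked about train times")
    book.remember("Reading a book on ops quality")
    book.remember("Working on the glasses bridge")
    remembered = book.contents()
```

Explicit shows up as the `remember` call, inspectable as plain files in one directory, and bounded as the quota: the train-times note is gone by the end because nothing gave a reason to keep it.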
The reason this shape matters is not nostalgia for plaintext. It is that all three properties are what make the assistant feel like a tool rather than a presence. A presence remembers everything and you cannot tell what. A tool remembers what you asked it to and you can tell exactly what.
This is also the part where I have to be honest about what is built and what is just sitting in my head.
What is built: a small file-based memory layer that the local agent reads at session start. It knows things I have explicitly told it: my languages, the project I am working on, the books I am tracking. It is allowed to propose new memories at the end of a session; I read the proposals on weekends and either keep them or strike them through.
What is not built: anything that would deserve a name like Jarvis. There is no autonomous behaviour. No proactive nudges. No “ambient” awareness of what I am doing. The assistant is, at all times, asleep until I speak to it. That is by design and not by laziness. The first time I let an agent decide on its own when to interrupt me will be after I have understood why I want to be interrupted, and I have not understood that yet.
A few honest constraints that show up when you try to design memory carefully.
Forgetting is harder than remembering. Adding facts is one line. Pruning them requires a policy. I currently use a weekly review to do this by hand, which is not scalable, but it is teaching me what the policy should look like.
Most “memories” are noise. The interesting facts about a week are five or six. Everything else is journaling. The point of a memory layer is to surface the five, not to log the rest.
The voice of the assistant is mostly the voice of its memory. If memory is curated, the assistant sounds curated. If memory is dumped, the assistant sounds like a chatbot trying to remember your birthday.
The frame I keep coming back to: I do not want an assistant that pretends to know me. I want one that holds the small set of things I have asked it to hold, and is honest about the limits of that set. That is much less than Jarvis, and much more useful.
OpenKB, Meridian, and a Claude Max OAuth route. Notes on building a personal retrieval-augmented stack with no per-token API bill.
I wanted a personal knowledge base that I could query from a smart-glasses voice command, with no monthly token bill, no third-party indexing service, and no opaque hosted RAG product in the middle. The stack I ended up with has three named parts and a clear economic shape.
OpenKB is the index. It takes the documents I care about (papers, contracts, reference material, personal notes) and compiles them into a queryable wiki with cross-document links. It exposes itself to Claude Code as a stdio MCP server, which means a Claude session can ask it questions through a typed tool interface. The compile step does its own LLM call internally; I have it set to Sonnet, because the linking is better than with smaller models and the speed difference does not matter for a background job.
Meridian is the router. It is a small always-on service that holds an OAuth session to Claude Max and presents an OpenAI-shaped endpoint locally. Anything in my house that wants to talk to a Claude model talks to Meridian first. Meridian then talks to Claude through the OAuth route, which is the part that flips the marginal cost from per-token to zero. The trade-off is rate limits instead of bills, which is the right trade for a personal stack.
Claude Code is the agent that ties the two together. When I ask a question that needs document context, it calls into OpenKB through MCP, gets back the relevant fragments, and answers. When I ask a question that does not, it answers directly. The routing is in the CLAUDE.md, not in the wire.
A few things this taught me that I would not have predicted.
Self-hosted RAG is mostly about latency and trust, not cost. The cost saving is real but it is not what makes the experience better. What makes it better is that the index lives on the same network as the agent, so a query feels like asking a colleague rather than calling a vendor. And I know exactly what is in the index because I put it there.
The hardest part is not the retrieval; it is the curation. A naive RAG over everything I have ever written is worse than a careful index of forty documents. The model is more useful when the corpus is smaller, more cohesive, and more semantically dense. I spent more time deciding what should not be in the index than what should.
OAuth routing has an unforgiving failure mode. If the API key is set in any shell anywhere on the host, the proxy is silently bypassed and the real metered API is hit. The first time I traced this I learned to grep my dotfiles for stray exports. Defensive habit now: unset that variable at the top of every entry point that should route through Meridian.
The model knows when it is wearing a name tag. Every response carries a small bracketed signal indicating which path served it: a direct call, a tool-augmented call, a slow opt-in escalation to a bigger model. This is not just diagnostic; it changes how I read the answer. A direct call is a confident guess. A tool-augmented call has receipts. The escalation answer is the considered one.
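The defensive habit of unsetting the key variable at the top of an entry point could look like the sketch below. The variable name `ANTHROPIC_API_KEY` is an assumption (the text does not name the variable), and the helper is hypothetical.

```python
import os
from typing import Iterable, List, MutableMapping

# Assumed variable name; substitute whatever key your client actually reads.
METERED_KEY_VARS = ("ANTHROPIC_API_KEY",)


def scrub_metered_keys(
    env: MutableMapping[str, str] = os.environ,
    names: Iterable[str] = METERED_KEY_VARS,
) -> List[str]:
    """Remove metered-API key variables and report which ones were set."""
    removed = []
    for name in names:
        if env.pop(name, None) is not None:
            removed.append(name)
    return removed


# Demonstrated on a plain dict so the example does not touch the real environment.
fake_env = {"ANTHROPIC_API_KEY": "sk-example", "PATH": "/usr/bin"}
removed = scrub_metered_keys(fake_env)
```

Returning the list of removed names turns a silent bypass into something that can be logged: if the list is ever non-empty, some shell on the host is exporting a key it should not.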
What I have not solved yet is the boundary between the personal index and the world. RAG is good when the answer lives in the index. The web search and the model’s own training are good when it does not. The current routing is rules in a markdown file. The next step is to make those rules legible to the model itself, so it can choose at query time instead of having me describe the choice ahead of time.
For now, the stack runs, the bills are predictable, and the index grows about as fast as my attention does. That is roughly what I wanted.