Notes on Building an AI-First Fintech
Company design, the management layer, and the work that compounds.
Every company, small and large, is building agents today. The harder problem is in the layer above them: the system that coordinates those agents against the actual shape of the business.
Two months ago, I left Big Tech for Imprint to find out what it actually takes to build an AI-first company. Most of what passes for “AI-first” today is a strategy slide and an Anthropic license. The version I’m interested in is the one where AI sits at the core of how the company works: blended teams of humans and agents, operating through systems that learn.
I lead the AI, Infrastructure, Security, and IT foundations org at Imprint, the team driving the company’s AI transformation. Two months in, I wanted to document and share what I’m learning and where I think it’s headed. This is the first in what I expect will be a series. Today: the management layer that is load-bearing for a company, how we’re approaching enablement at two very different speeds inside the same company, and the company design underneath both.
The management layer is where the moat lives
Cognition has Devin. Cursor has Background Agents. GitHub has Copilot Agents. Notion has Agents. Linear has Agents. The pitch is always the same: give us your work and our agent will do it.
From inside an engineering org actually trying to deploy these things, the question that matters isn’t which agent. It’s: which agent should pick up which work? How do they share context about your product, your codebase, your conventions? How do you teach them the tribal knowledge that lives inside your team? How does the system improve?
None of those questions have an answer you can buy. They live in what I think of as the management layer: the system that sits above all of these agents and coordinates them against the actual shape of your business. The management layer is the operating system for human-agent work: it decides which work goes to which agent, gives that agent the company context it needs, evaluates whether the output is good, and compounds what the system learns over time.
The frontier isn’t the agent itself. It’s the system that orchestrates the agents: the layer where your company’s proprietary knowledge gets encoded, where work gets routed, where outputs get evaluated, and where the whole thing compounds with use.
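To make the four responsibilities concrete, here is a minimal sketch of a management layer in Python. It is an illustration under stated assumptions, not a description of any real system: `Agent`, `ManagementLayer`, `toy_agent`, and the skill-based routing rule are all hypothetical names invented for this example, and a real evaluator would be far richer than a single boolean check.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    """Hypothetical wrapper around any vendor's agent."""
    name: str
    skills: set[str]
    run: Callable[[str, str], str]  # (task, context) -> output


@dataclass
class ManagementLayer:
    agents: list[Agent]
    context: dict[str, str]          # skill -> company knowledge for that kind of work
    evaluate: Callable[[str], bool]  # output -> was it good?
    lessons: dict[str, list[str]] = field(default_factory=dict)

    def route(self, skill: str) -> Agent:
        # Routing: pick the first agent that declares the needed skill.
        for agent in self.agents:
            if skill in agent.skills:
                return agent
        raise LookupError(f"no agent registered for {skill!r}")

    def handle(self, skill: str, task: str) -> tuple[str, bool]:
        agent = self.route(skill)
        # Context: company knowledge plus whatever the system has learned so far.
        notes = [self.context.get(skill, "")] + self.lessons.get(skill, [])
        output = agent.run(task, "\n".join(n for n in notes if n))
        # Evaluation: score the output before anyone relies on it.
        ok = self.evaluate(output)
        # Compounding: a rejected output becomes context for the next run.
        if not ok:
            self.lessons.setdefault(skill, []).append(f"Rejected: {task}")
        return output, ok


def toy_agent(task: str, context: str) -> str:
    # Stand-in for a real agent call; it only echoes its inputs.
    return f"{task} | context: {context}"


layer = ManagementLayer(
    agents=[Agent("coder", {"code"}, toy_agent)],
    context={"code": "Follow the repo style guide."},
    evaluate=lambda output: "style guide" in output,
)
output, ok = layer.handle("code", "Add retries to the payment client")
```

The point of the sketch is the shape, not the details: the agent is a swappable callable, while routing, context, evaluation, and accumulated lessons all live in the layer above it.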
The AI enablement loop, at various speeds
The same loop runs across the whole company: adopt the tools, learn from what’s working, compound what works. But it runs at very different speeds depending on where you look.
At the leading edge, a group of engineers is operating a year or two ahead of the median. They’re delegating well-scoped work to coding agents. Designing review systems instead of doing individual reviews. Starting to think about evals before tests. This is where the next layer of leverage is being invented.
Across the long tail, meaning every function from Commercial to Risk to Marketing, the same loop is just getting started. Different tools, simpler workflows, same compounding logic. The work here is mostly distribution: getting what already exists into the right hands.
The leading edge invents what the long tail will eventually run on. The long tail is where the company-wide leverage actually lands. A good AI strategy holds both at once, and a lot of the work of an AI-first executive is staying honest about which curve you’re on for each part of the business.
The long tail: distribution is the bottleneck
AI tooling has a chicken-and-egg problem at most companies. People experiment with it on their own, get good at it in pockets, and never spread that knowledge sideways. The result is a tiny group of power users and a long tail of people who tried it six months ago, decided it wasn’t ready, and never gave the new version a chance.
AI adoption isn’t a tools problem in the long tail. It’s a distribution problem. The best skill in the world is worth nothing if it doesn’t travel. Here’s the loop we’re trying to run for every function at Imprint:
Adopt. Most people avoid AI because they aren’t sure where to start. So every new hire gets onboarded to AI in their first week. We built a “Claude Day 1” guide that walks people through what they have access to, what each tool is good for, and a few real workflows from across the company they can try in their first hour. The goal is to make AI feel approachable. To show that you don’t need to be technical, or even particularly curious, to start using it well.
Learn. We built a dashboard that breaks AI usage down by team across every tool we use. The point isn’t to promote token maxing. It’s to let leaders see who’s using AI well, who needs coaching, and where the next adoption wins are. The dashboard tells us where the gaps are. The real work is the closing mechanism: how people who are stuck get unstuck, and how what works at the leading edge travels sideways.
Compound. Build the infrastructure that lets one person’s automation become the team’s automation. We built a marketplace where anyone can browse, install, and contribute reusable AI workflows. Setup is frequently the #1 blocker to usage. The trick is that the workflows are automatically loaded for all employees, so adoption isn’t gated on setup. Commercial’s revenue reporting becomes a tool the whole company can run on demand. Risk’s triage workflow becomes a tool Marketing can run. Every contribution is a permanent capability the next person gets for free.
Done right, the long tail looks less like a training program and more like a flywheel.
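As a rough illustration of the Compound step, the sketch below models a workflow marketplace where a contribution from one function is loaded for every employee without a per-user install. Everything here is hypothetical: the `Workflow` and `Marketplace` names, the `load_for` method, and the toy triage function are assumptions made for the example, not a description of the actual marketplace.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Workflow:
    name: str
    owner: str                 # contributing function, e.g. "Risk"
    description: str
    run: Callable[[str], str]  # input text -> result


class Marketplace:
    """Hypothetical shared registry of reusable AI workflows."""

    def __init__(self) -> None:
        self._workflows: dict[str, Workflow] = {}

    def contribute(self, workflow: Workflow) -> None:
        if workflow.name in self._workflows:
            raise ValueError(f"workflow {workflow.name!r} already exists")
        self._workflows[workflow.name] = workflow

    def browse(self) -> list[str]:
        return sorted(self._workflows)

    def load_for(self, employee: str) -> dict[str, Workflow]:
        # Assumption: every employee gets every workflow by default,
        # so adoption is not gated on a setup step.
        return dict(self._workflows)


market = Marketplace()
market.contribute(
    Workflow(
        name="triage",
        owner="Risk",
        description="Sort an incoming item into a priority bucket.",
        run=lambda text: "high" if "fraud" in text.lower() else "normal",
    )
)

# Someone in Marketing runs Risk's workflow without installing anything.
tools = market.load_for("someone-in-marketing")
result = tools["triage"].run("Possible fraud report from a partner")
```

The design choice worth noticing is that `load_for` ignores who is asking. Making the default "everyone has everything" is what turns one person's automation into a capability the next person gets for free.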
A central AI team should build a foundation, not become the AI help desk.
It’s tempting to play that role, but it’s the wrong shape of org for what we’re actually trying to build.
The point of this loop is that every function eventually owns its own AI transformation. Commercial owns its reporting agents. Risk owns its triage agents. Marketing owns its campaign manager agents. My team’s job is to give them the patterns and the infrastructure to do it themselves. If we’re the only forcing function, this doesn’t scale. If every AI system runs through us, we’re the bottleneck we set out to remove.
The leading edge: AI writes, reviews, and tests code
Engineering is the part of the loop that’s furthest along, and not because engineers are special. It’s because that’s where most of the research and tooling investment has been pointed. The model providers are racing to automate engineering workflows specifically, and the result is that the engineering side of every company is operating with tooling that’s a year or two ahead of every other function.
Three layers, each compounding on the one before it.
AI writes code. Most engineering orgs have crossed this threshold by now. Engineers use AI to draft, complete, refactor. What changes now isn’t just speed. Engineers stop being the people writing code and become the people directing the work. The bottleneck moves from typing to thinking. We’re investing in a managed agent platform that runs autonomous coding agents in the cloud, so engineering teams can delegate well-scoped work to agents and stay focused on the higher-order decisions.
AI reviews code. The next layer of leverage is review. If you let AI write more code without raising the review floor, you’ve only moved your bottleneck. We’re building an internal review service that lets every repo owner define what “good” looks like for their code, and a configurable system enforces it on every PR. Engineers shift from doing individual reviews to designing the review system itself. This is what “raise the floor without growing the queue” looks like in practice.
AI tests code. Engineers at Imprint use AI to write tests today, but we haven’t built a full system around it yet. Once generated, the tests are deterministic, so traditional unit tests handle that fine. The interesting question is which tests the agent chose to write. Did it catch the edge cases that matter to the business? Did it test the boring 90% and miss the important 10%?
That’s not really a testing problem. It’s an agent-quality problem. And the way you measure agent quality is with evals.
Testing now has two halves: the deterministic half and the judgment half. Traditional tests answer the deterministic question: does this function return the expected value? Evals answer the judgment question: was this behavior actually good?
I expect this to become one of the defining engineering leadership problems of the next 18 months.
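The two halves of testing can be shown side by side in a small sketch. The business rule, the case names, and the `coverage_eval` rubric below are all hypothetical, chosen only to illustrate the distinction. In practice the judgment half is often model-graded or human-reviewed; a fixed checklist of required cases is the simplest stand-in.

```python
def late_fee(balance_cents: int, days_late: int) -> int:
    """Hypothetical rule: 1% fee once a 5-day grace period has passed."""
    if days_late <= 5:
        return 0
    return balance_cents // 100


# Deterministic half: does this function return the expected value?
def test_late_fee_after_grace() -> None:
    assert late_fee(10_000, 6) == 100


# Judgment half: did the agent choose to test the cases that matter?
# The required cases are an assumption a team would write down for itself.
REQUIRED_CASES = frozenset({"grace_boundary", "zero_balance", "after_grace"})


def coverage_eval(
    covered: set[str],
    required: frozenset[str] = REQUIRED_CASES,
    threshold: float = 1.0,
) -> tuple[float, bool, list[str]]:
    """Score a generated test suite by which required cases it covers."""
    score = len(covered & required) / len(required)
    missing = sorted(required - covered)
    return score, score >= threshold, missing


test_late_fee_after_grace()

# An agent that only wrote the happy-path test passes its own test
# but fails the eval.
score, passed, missing = coverage_eval({"after_grace"})
```

Here the unit test is green while the eval reports that the grace-period boundary and the zero-balance case were never exercised, which is the "boring 90%, missing 10%" failure in miniature.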
The underlying shape: a system every team can configure and extend
Writing, review, and testing compound, but only if the system underneath them is built right. Here’s my mental model underneath all three.
The leverage doesn’t come from picking one agent off the shelf and rolling it out company-wide. It comes from building a system that encodes your proprietary knowledge of how your products actually work, one every team can configure and extend for their own surface area. Teams own their agents. They teach their coding agents how to code on their stack, and they teach their review agents what “good” looks like in their repo. The team that owns the rewards product teaches its coding agent the nuances and the design system, so when a rewards ticket lands the agent already knows how to ship it well. The team that owns the backend encodes everything load-bearing about migrations as a review dimension, so every incoming PR is held against that bar automatically.
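One way to picture a team-owned review dimension is as a small piece of configuration that a shared service applies to every PR. The sketch below assumes hypothetical names throughout (`PullRequest`, `ReviewDimension`, `review`, the `migrations/` path convention, and the rollback rule), and it uses plain predicates where a real system would presumably hand the check to a review agent.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class PullRequest:
    repo: str
    changed_files: tuple[str, ...]
    description: str


@dataclass(frozen=True)
class ReviewDimension:
    """One thing a repo owner has decided 'good' means for their code."""
    name: str
    applies: Callable[[PullRequest], bool]  # is this dimension relevant to the PR?
    check: Callable[[PullRequest], bool]    # does the PR meet the bar?
    message: str


def review(
    pr: PullRequest,
    dimensions_by_repo: dict[str, list[ReviewDimension]],
) -> list[str]:
    """Hold a PR against every dimension its repo's owners have configured."""
    findings = []
    for dim in dimensions_by_repo.get(pr.repo, []):
        if dim.applies(pr) and not dim.check(pr):
            findings.append(f"[{dim.name}] {dim.message}")
    return findings


# The backend team encodes one load-bearing rule about migrations.
migration_safety = ReviewDimension(
    name="migration-safety",
    applies=lambda pr: any(f.startswith("migrations/") for f in pr.changed_files),
    check=lambda pr: "rollback" in pr.description.lower(),
    message="Schema migrations must describe a rollback plan.",
)
dimensions = {"backend": [migration_safety]}

pr = PullRequest(
    repo="backend",
    changed_files=("migrations/0042_add_index.sql",),
    description="Add index on accounts",
)
findings = review(pr, dimensions)
```

The service code in `review` never changes when a team adds a rule. Teams extend the system by adding entries to their own list, which is the sense in which engineers move from doing reviews to designing the review system.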
In this world, engineers stop being the people doing production and review, and start being the people designing and maintaining the system of agents that does. That’s where the next order of magnitude of leverage lives. Not in any single agent, but in the system of agents the team owns, shapes, and improves over time.
Configuration is the user acquisition funnel. Self-evolution is the product. This system gets sharper with every use. Every PR the review agent flags, every ticket the coding agent ships, every fix that lands becomes signal. The team trains a system that compounds their expertise over time. That compounding loop is the real IP, and the part that doesn’t transfer when someone swaps the underlying model.
Writing, review, and testing each change what a single engineer can take on. The system underneath them changes what an engineering org can ship without growing.
AI-first is a company-design bet, not a tech bet
Everything we’ve discussed is an AI company-design strategy: the management layer, the enablement loop, and the system underneath.
Most companies are treating AI adoption as a tooling decision. Pick a model, hand engineers access, hope for productivity gains. That may produce local gains. It doesn’t transform the company.
What does is treating AI as a constraint on company design. The moment you commit to scaling without scaling headcount the way most companies do, every decision downstream changes. Hiring criteria change. Onboarding changes. How you structure engineering work changes. Which functions even need to exist, and at what size, changes.
The tooling is downstream of the bet, not the bet itself.
Three predictions
If I’m right about the direction, here are three things I’d expect to be true within the next 18 months. I’d rather put markers down now and be wrong than stay vague, so consider this the part of the post you can come back and grade me on.
Engineers become agent system designers. The job shifts up a layer. The question stops being how do I write this function or how do I test this function, and starts being what knowledge does my agent need, what workflows is it running against, and what system moves our company forward fastest.
And this looks different depending on what kind of company you’re in. Take consumer vs. B2B:
- Consumer companies will build agents optimized for data and experimentation. Pull the right data, run methodical analyses, design and read experiments at scale. The big-consumer-tech-shaped problem: small directional signals across enormous populations, where the agent’s job is to systematically explore and surface what’s working.
- B2B companies will build agents optimized for synthesis and judgment. Read customer interviews, generalize from a handful of accounts to a product principle, hold complex workflows in their head, balance design tension against usability. The enterprise-shaped problem: small numbers and deep context, where the agent’s job is to reason carefully about a few high-stakes cases rather than aggregate over many.
These are not just different prompts. They’re different agent architectures, different evaluation criteria, different knowledge bases, and different definitions of *good*. The companies that recognize this early and build the management layer that fits *their* problem shape will probably build a real moat. The ones that try to apply a generic agent template across both regimes will likely end up with a system that fits neither.
Product judgment becomes executable. As more product behavior gets encoded into AI systems, the line between designing the product and building the product starts to blur. The work is no longer just specifying an experience and then implementing it. The work becomes defining what good looks like, encoding that judgment into a system, and improving the system as it runs. The highest-leverage people will be the ones who can hold product judgment, technical architecture, and evaluation in the same loop: decide what good looks like, build the system that enforces it, and keep making that system better.
Evals are the forcing function. They’re how “good” moves from a PM’s head, a designer’s mocks, or an engineer’s intuition during code review into a form a system can actually score against. The strongest product teams over the next 18 months will look smaller and more blended. The skill that distinguishes good system designers will be how well they can articulate “good” in a form a system can evaluate against.
Tool-only AI transformations fail. Cursor licenses for everyone, Claude seats for every department, an internal Slack channel where people share prompts. Very little of this compounds in a way that survives a model swap. In 12 months, leaders who picked that path are likely to be looking at flat productivity metrics and wondering why the trillion-dollar wave didn’t lift their boat. AI is a distribution and orchestration problem before it’s a tools problem.
The bet I’d make
The thing I’m most confident about after sixty days: companies that pick the AI-first bet early and commit to it in their hiring, their onboarding, their org design, and their AI infrastructure will operate at multiples of what their headcount would otherwise allow.
The rest is execution. That part is hard, and I’m in the middle of it.
More to come. Next up: the actual mechanics of the management layer, and what it looks like to build it inside a regulated business.
If you’re working through similar questions at your own company, or you think I’m getting any of this wrong, I’d love to compare notes. Let’s talk!


