Most companies approach AI by buying a tool and hoping someone uses it. Hiring an autonomous role is a different decision. You are not adding a feature to a system your team already operates; you are handing one function of the business to a named worker that runs on a schedule, acts without being prompted, and is held to the same standard you would hold a junior hire: did the work get done, and is there a record of it.

That framing changes how you choose the first role and how you judge it. This is a practical guide for a mid-market operator deciding where to start. The short version: pick the most painful, repetitive, deadline-driven job in the building, connect it to the systems where the work actually lives, decide which actions need a human nod, and then watch real outbound work land within weeks. Below is how to do each of those without getting it wrong.

What Hiring a Role Actually Means

An autonomous role is a worker, not a chatbot. It has a name, a remit, and a working schedule. It wakes up on its own cadence — daily, hourly, whatever the job needs — looks at the state of the world, and does the next thing the job requires. A credit controller chases the invoices that crossed their due date overnight. A compliance manager checks which certificates lapse this month and acts before they do.

The single most important property is what we call a delivery contract: a run cannot end quietly. Either the agent completes a real, logged outbound action — an email sent, a reminder raised, a record updated — or it escalates to a human and says so. It does not drift into "I had a look and nothing seemed urgent." That is the behaviour that separates an employee from an assistant, and it is the behaviour you should test for from day one.
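One way to picture the delivery contract is as a check that runs at the end of every scheduled run. The sketch below is illustrative only: `RunResult`, `SilentRunError` and `close_run` are hypothetical names, not the platform's actual API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunResult:
    """Outcome of one scheduled run (hypothetical structure)."""
    actions: List[str]                # logged outbound actions, e.g. "email sent"
    escalation: Optional[str] = None  # reason handed to a human, if any


class SilentRunError(Exception):
    """Raised when a run ends with neither an action nor an escalation."""


def close_run(result: RunResult) -> str:
    """Enforce the delivery contract: a run may not end quietly."""
    if result.actions:
        return "delivered"
    if result.escalation:
        return "escalated"
    # Neither outcome happened, so the run is treated as a defect.
    raise SilentRunError("run ended without an outbound action or an escalation")
```

The point of raising an error rather than returning a third status is that a quiet run becomes impossible to overlook.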

This is live, not theoretical. At Legacie, a property developer and block manager in Liverpool, named roles run credit control, property compliance, bid writing and management accounting. At WH Scott Group, an industrial lifting and inspection business across the UK and Ireland, the same pattern runs document control, HSQE and group credit control. The roles differ; the mechanics are the same.

Pick the Most Painful Deadline-Driven Function

Resist the temptation to start with the most exciting function. Start with the one that hurts. The first role should sit at the intersection of three properties:

• Painful: it costs you real money or risk when it slips, and your team dreads it.
• Repetitive: the same shape of task recurs constantly, so a standing schedule fits naturally.
• Deadline-driven: there is a clock, which makes "good" easy to define and easy to measure.

For most mid-market operators that points squarely at credit control or compliance.

Credit control

Chasing overdue invoices is the cleanest possible first role. The work is relentless, deeply unloved, and directly tied to cash. The logic is legible: an invoice has an amount, a due date, a customer, and a history. The agent can run every morning, find what aged past its terms overnight, and send a chase in the right tone for that customer's stage: a gentle nudge at seven days, something firmer at thirty. Take a £2,140 invoice that is 32 days overdue: the agent knows this is the second reminder, references the first, and asks for a payment date rather than repeating itself. The deadline is built in, so the definition of done is obvious.
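The staging logic can be sketched in a few lines. This is a minimal illustration, assuming the seven-day and thirty-day thresholds mentioned above; the `Invoice` fields, the stage names and `choose_chase` are hypothetical, and a real role would also weigh the customer's history.

```python
from dataclasses import dataclass
from datetime import date

GENTLE_AFTER_DAYS = 7   # gentle nudge from seven days overdue
FIRM_AFTER_DAYS = 30    # firmer wording from thirty days overdue


@dataclass
class Invoice:
    customer: str
    amount_gbp: float
    due_date: date
    reminders_sent: int  # how many chases have already gone out


def days_overdue(invoice: Invoice, today: date) -> int:
    return max((today - invoice.due_date).days, 0)


def choose_chase(invoice: Invoice, today: date) -> str:
    """Pick the chase stage from how far past due the invoice is."""
    overdue = days_overdue(invoice, today)
    if overdue < GENTLE_AFTER_DAYS:
        return "none"
    if overdue < FIRM_AFTER_DAYS:
        return "gentle_nudge"
    # Past thirty days: reference the earlier reminder if one was sent,
    # and ask for a payment date instead of repeating the first message.
    if invoice.reminders_sent >= 1:
        return "firm_reminder_referencing_previous"
    return "firm_reminder"


invoice = Invoice("Example Ltd", 2140.00, date(2024, 5, 1), reminders_sent=1)
print(choose_chase(invoice, today=date(2024, 6, 2)))  # 32 days overdue -> firm_reminder_referencing_previous
```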

Compliance

Compliance is the other natural starting point, especially in regulated or asset-heavy businesses. Gas safety certificates, electrical inspections, EPCs, insurance renewals — each has a due date and a consequence for missing it. A property and compliance manager can sweep the portfolio, find what lapses inside the window, and chase the responsible party before it expires. The cost of a miss is high and concrete, which again makes the role easy to judge.
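The portfolio sweep is, at its core, a date filter. The sketch below assumes a 30-day window and a simple `Certificate` record; both are illustrative choices, not details of any particular property system.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class Certificate:
    property_ref: str
    kind: str               # e.g. "gas_safety", "eicr", "epc", "insurance"
    expires_on: date
    responsible_party: str  # who gets chased


def lapsing_within(certs: List[Certificate], today: date,
                   window_days: int = 30) -> List[Certificate]:
    """Return certificates expiring inside the window, soonest first.

    Certificates that have already expired are included as well,
    since they are the most urgent to chase.
    """
    cutoff = today + timedelta(days=window_days)
    due = [c for c in certs if c.expires_on <= cutoff]
    return sorted(due, key=lambda c: c.expires_on)
```

Each certificate returned would then become a chase to its `responsible_party`, or an escalation if nobody is recorded.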

Pick one. Do not try to launch three roles at once. The first hire is as much about your team learning to work alongside an autonomous worker as it is about the work itself.

Connect the Systems of Record

A role is only as good as its access to where the work actually lives. This is the step buyers underestimate. The agent does not work from a copy or a spreadsheet export; it reads and writes the real systems, under proper authentication.

In practice that means connecting the genuine systems of record for that function. For credit control that is your finance and accounting platform plus email; Business Central, Microsoft 365 and Outlook are typical. For property compliance it is the property and maintenance stack: MRI, Blocks Online and Fixflo, plus Companies House for the corporate checks. Connections run over an OAuth 2.1 gateway through Entra ID, and each tenant gets its own isolated Azure environment, with its own subscription, its own Key Vault and its own container space. Your data does not sit in a shared pool.

You do not need to connect everything at once. Connect the two or three systems the first role genuinely touches. A credit controller needs the ledger and the mailbox; it does not need your CRM on day one. Scope the integrations to the role, not to some imagined future.

Set the Approval Gates

This is where you decide how much rope the new hire gets, and it is the part that earns trust. The governance is fail-closed by default: a read-only role is technically prevented from changing systems, not merely instructed not to. If a role has no business sending payments, the tooling will not let it, regardless of what any instruction says.
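Fail-closed is easiest to understand as default deny enforced in code rather than in instructions. The following is a minimal sketch of that idea; `ToolGateway` and the tool names are hypothetical and stand in for whatever mechanism the platform actually uses.

```python
from typing import Callable, Dict, Iterable


class ActionNotPermitted(Exception):
    """Raised when a role calls a tool it was never granted."""


class ToolGateway:
    """Hypothetical gateway: anything not explicitly granted is refused."""

    def __init__(self, granted: Iterable[str]) -> None:
        self._granted = frozenset(granted)
        self._tools: Dict[str, Callable[..., str]] = {}

    def register(self, name: str, fn: Callable[..., str]) -> None:
        self._tools[name] = fn

    def call(self, name: str, *args: object) -> str:
        # Default deny: the check runs before the tool does, so no
        # instruction given to the agent can talk its way past it.
        if name not in self._granted or name not in self._tools:
            raise ActionNotPermitted(f"{name!r} is not granted to this role")
        return self._tools[name](*args)


# A read-only role: it can read the ledger but cannot send payments.
gateway = ToolGateway(granted={"read_ledger"})
gateway.register("read_ledger", lambda customer: f"ledger for {customer}")
gateway.register("send_payment", lambda customer: f"paid {customer}")
```

Calling `gateway.call("send_payment", ...)` raises `ActionNotPermitted` even though the tool exists, because the role was never granted it.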

Three controls matter most when you set up the first role:

• Signed send authorisations. Every external message requires a signed (HMAC-SHA256), time-limited, recipient-bound authorisation. A chase cannot go to the wrong customer, and it cannot be replayed later.
• Human approval gates on the sensitive stuff. You choose the line. Routine reminders go out autonomously; anything touching spend, credit terms, or a legally sensitive customer waits for a human yes.
• A tamper-evident audit ledger. Every action lands in an append-only record. You can always answer "what did it do, and when", which is exactly the question your finance director or auditor will ask.
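To make the first control concrete, here is a minimal sketch of a recipient-bound, time-limited authorisation using Python's standard `hmac` module. The payload layout, function names and parameters are assumptions for illustration; the platform's real token format is not described here.

```python
import hashlib
import hmac
import json
import time
from typing import Optional


def _payload(recipient: str, message_id: str, expires_at: int) -> bytes:
    # JSON encoding keeps the fields unambiguous (no delimiter tricks).
    return json.dumps([recipient, message_id, expires_at],
                      separators=(",", ":")).encode("utf-8")


def sign_send(secret: bytes, recipient: str, message_id: str,
              expires_at: int) -> str:
    """Sign an authorisation bound to one recipient, message and expiry."""
    return hmac.new(secret, _payload(recipient, message_id, expires_at),
                    hashlib.sha256).hexdigest()


def verify_send(secret: bytes, recipient: str, message_id: str,
                expires_at: int, signature: str,
                now: Optional[int] = None) -> bool:
    """Accept only an unexpired signature that matches this exact recipient."""
    current = int(time.time()) if now is None else now
    if current > expires_at:
        return False  # time-limited: a stale authorisation is refused
    expected = sign_send(secret, recipient, message_id, expires_at)
    return hmac.compare_digest(expected, signature)  # constant-time compare
```

Because the recipient is part of the signed payload, a signature issued for one customer fails verification for any other. A production design would typically also record used message IDs so an authorisation cannot be reused inside its validity window; that is left out of this sketch.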

Autonomy without an audit trail is not a workforce; it is a liability. The ledger is what makes the rest safe to switch on.
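A common way to make an append-only record tamper-evident is a hash chain, where each entry's hash covers the previous entry's hash. The sketch below shows that general technique; it is an assumption about how such a ledger could work, not a description of the platform's implementation.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder hash before the first entry


def entry_hash(prev_hash: str, record: Dict[str, object]) -> str:
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


class Ledger:
    """Append-only list of records, each chained to the one before it."""

    def __init__(self) -> None:
        self._entries: List[Dict[str, object]] = []

    def append(self, record: Dict[str, object]) -> str:
        prev = str(self._entries[-1]["hash"]) if self._entries else GENESIS
        digest = entry_hash(prev, record)
        self._entries.append({"record": dict(record), "prev": prev, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute the chain; any edited or removed entry breaks it."""
        prev = GENESIS
        for entry in self._entries:
            if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
                return False
            prev = str(entry["hash"])
        return True
```

Editing an earlier record changes its hash, which no longer matches what the next entry recorded, so `verify()` returns `False`.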

Inbound that the agent did not generate — a reply from a customer, an attachment from outside — is treated as untrusted and sandboxed against prompt injection, so a cleverly worded email cannot talk the agent into doing something it should not. On certifications, be clear-eyed: the platform inherits Azure's certified infrastructure, and formal certifications such as SOC 2 are on the roadmap rather than held today.

What the First Weeks Look Like

This is weeks, not quarters. A realistic shape:

• Week one: connect the systems, load the role's context, and run it in a shadow mode where it drafts the actions it would take but a human releases each one. You are checking judgement: right customer, right tone, right amount, right timing.
• Week two: loosen the routine cases. Standard reminders go out autonomously under the signed-authorisation gate; the edge cases still route to a person. The audit ledger fills up and you read it.
• Weeks three and four: the role settles into its standing schedule. You stop reading every line and start reading the escalations and the weekly summary. The human moves from doing the work to supervising it.
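The progression from shadow mode to autonomous routine work amounts to a routing decision made for each drafted action. A minimal sketch, assuming two modes and a set of sensitivity tags taken from the approval-gate description; the names `Mode`, `SENSITIVE_TAGS` and `route` are hypothetical.

```python
from enum import Enum
from typing import Iterable


class Mode(Enum):
    SHADOW = "shadow"                          # week one: human releases everything
    AUTONOMOUS_ROUTINE = "autonomous_routine"  # week two onwards


# Illustrative tags for actions that always wait for a human yes.
SENSITIVE_TAGS = frozenset({"spend", "credit_terms", "legal"})


def route(mode: Mode, tags: Iterable[str]) -> str:
    """Decide what happens to a drafted action."""
    if mode is Mode.SHADOW:
        return "hold_for_human_release"
    if SENSITIVE_TAGS & set(tags):
        return "await_human_approval"
    return "send_autonomously"
```

Moving from week one to week two is then a change of `mode`, while adjusting where the line sits is a change to the set of sensitive tags.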

By the end of the first month you should be able to point at a stream of real, logged outbound actions and a short list of things the agent escalated rather than guessed at. If you cannot, something is wrong — usually a missing integration or a gate set too tight.

What Good Looks Like and What to Watch For

Good looks like quiet, consistent delivery. The invoices get chased on time, every time, in the right tone, and you can see each one in the ledger. The agent escalates when it genuinely should, not constantly. The weekly summary tells you something you would otherwise have had to dig for.

What to watch for in the early weeks:

• Silent runs. If an agent ever finishes without an action or an escalation, treat it as a defect. The delivery contract exists precisely to make this visible.
• Over-escalation. A role that kicks everything to a human is not yet doing its job. Tune the gates so routine work flows and only real judgement calls surface.
• Stale context. Durable memory means the agent should remember the customer who always pays on day 35 and the one mid-dispute. If it keeps relitigating settled facts, its memory needs attention.
• Gates set wrong. Too loose and you lose sleep; too tight and you have bought an expensive drafting tool. Expect to adjust the line in the first fortnight.
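The first two warning signs can be checked mechanically from run outcomes. The sketch below assumes each run is recorded as "delivered", "escalated" or "silent", and uses a 30% escalation threshold purely as a placeholder; pick a figure that suits the role.

```python
from collections import Counter
from typing import Dict, List, Union


def run_health(outcomes: List[str],
               max_escalation_rate: float = 0.3) -> Dict[str, Union[int, float, bool]]:
    """Summarise run outcomes and flag the two early warning signs."""
    counts = Counter(outcomes)
    total = len(outcomes)
    escalation_rate = counts["escalated"] / total if total else 0.0
    return {
        "silent_runs": counts["silent"],  # any value above zero is a defect
        "escalation_rate": escalation_rate,
        "over_escalating": escalation_rate > max_escalation_rate,
    }
```

A weekly review could run this over the ledger's run records and look first at `silent_runs`, then at whether the escalation rate is trending down as the gates are tuned.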

Expanding the Fleet

Once the first role is steady, expanding is an addition, not a migration. The hard parts — tenant isolation, the gateway, the audit ledger, the approval model — are already in place. The second hire reuses all of it.

Add roles where they compound. A credit controller pairs naturally with a management accountant; a compliance manager with document control. Channels broaden as you go — email first, then WhatsApp, Teams or SMS where the work calls for it. Keep the same discipline each time: one role, scoped integrations, gates set deliberately, judged on logged outbound work within weeks. A workforce is built one proven hire at a time, not bought as a bundle.

If you want to see a first role running against real systems, you can book a short walkthrough at /demo.

