
An MCP Server for Dynamics 365 Finance and Operations: Natural-Language Access to ERP

A finance controller asks "what is the open AP balance for vendor 3001 across all legal entities, and which invoices are over 60 days?" and gets a correct answer in twelve seconds, with a citation back to the source records. That is the promise of an MCP server in front of Dynamics 365 Finance and Operations. The reality is achievable, but only if you make the right choices about data entities, authentication, scoping, and which operations you let an LLM execute.

This post walks through how to actually build one — what to expose, what to never expose, how to handle multi-company data, and the gotchas that show up the first time a real user sits in front of it.

Why MCP for F&O Specifically

F&O has good APIs. OData entities, custom services, the Recurring Integrations API. None of this is new. So why a Model Context Protocol server in front of all of it?

Because the problem with F&O is rarely "we cannot get the data." The problem is that getting the data requires knowing which of the 7,000 data entities is the right one, which company context you are in, what the field naming convention is this release, and how to compose three OData calls to answer one business question. That is exactly the work an LLM is good at — if you give it the right tools.

An MCP server reframes the integration: instead of building bespoke endpoints for each use case, you expose a small, well-named tool surface that maps to the entities and actions a user actually thinks in. The model figures out the composition.

The Architecture, in One Picture

Claude Desktop / Claude Code
         │  (MCP over stdio or HTTP)
         ▼
   F&O MCP Server  ── auth ──▶  Microsoft Entra ID
         │                          │
         │                          ▼
         │                    OAuth token (S2S or OBO)
         │
         ├──── OData ────▶  F&O Data Entities
         ├──── Custom SVC ─▶  F&O Custom Services (X++)
         └──── DMF ───────▶  Data Management Framework (bulk)

The MCP server sits between the model and the F&O environment. It does three things: auth, tool surface, and result shaping. Everything else is delegation.

The Auth Decision

Before any of this works, you have to pick an auth model, and the choice matters more than people expect.

Service-to-service (S2S, application user). The MCP server holds an Entra ID app registration and authenticates as itself. Every call into F&O is performed by the same application user. This is simple, but it has a cost: you lose user-level audit trails. Every record change in F&O looks like it came from MCP-Service-User, regardless of who actually asked for it.

On-behalf-of (OBO). The MCP server receives a user token from the client and exchanges it for an F&O token via OBO flow. Calls into F&O are made as the actual user. F&O's role-based security and personalization apply. Audit trails are correct.

For anything that mutates F&O data, OBO is the right answer. Full stop. Auditors will not accept "an AI assistant did it." For pure read-only queries, S2S is acceptable, and meaningfully simpler to deploy.

The pattern I have shipped: OBO for everything that writes, S2S for read-only reporting tools, with a clear naming convention so the difference is visible in the tool list.
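
In code, both flows are a few lines with MSAL. A minimal sketch using @azure/msal-node — the environment variable names are mine, not canonical, and the scope construction assumes the usual F&O pattern of the environment URL plus /.default:

```typescript
import { ConfidentialClientApplication } from "@azure/msal-node";

// App registration details; the env var names are illustrative.
const msal = new ConfidentialClientApplication({
  auth: {
    clientId: process.env.AAD_CLIENT_ID!,
    clientSecret: process.env.AAD_CLIENT_SECRET!,
    authority: `https://login.microsoftonline.com/${process.env.AAD_TENANT_ID}`,
  },
});

// F&O accepts tokens scoped to the environment URL, e.g.
// https://myenv.operations.dynamics.com/.default
const FNO_SCOPE = `${process.env.FNO_BASE_URL}/.default`;

// S2S: the server authenticates as itself. Read-only reporting tools only.
export async function getS2SToken(): Promise<string> {
  const result = await msal.acquireTokenByClientCredential({ scopes: [FNO_SCOPE] });
  if (!result) throw new Error("Client credential flow returned no token");
  return result.accessToken;
}

// OBO: exchange the user's incoming token for an F&O token, so every write
// lands in F&O under the real user's identity.
export async function getOboToken(userAssertion: string): Promise<string> {
  const result = await msal.acquireTokenOnBehalfOf({
    oboAssertion: userAssertion,
    scopes: [FNO_SCOPE],
  });
  if (!result) throw new Error("OBO flow returned no token");
  return result.accessToken;
}
```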

What to Expose: Tools, Not Entities

The first instinct is to expose every data entity as a tool. Resist it. The model does not need 7,000 tools — it needs ten that map to actual business questions.

Tools I usually start with:

  • query_open_invoices(vendor_id?, days_overdue?, legal_entity?) — wraps the appropriate AP entities, defaults sensibly
  • query_customer_balance(customer_id, legal_entity?, as_of_date?) — wraps the customer transactions entity, applies aging
  • lookup_product(item_id_or_name, legal_entity?) — fuzzy matches on item name, returns the canonical record
  • get_purchase_order(po_number, legal_entity?) — returns header + lines + receipt status
  • query_inventory(item_id, warehouse?) — current on-hand by site/warehouse
  • list_legal_entities() — for the model to discover which DataAreaIds the user has access to
  • get_chart_of_accounts(legal_entity?) — for "what account did we book this to" questions

Each tool is a thin wrapper over one or more entity calls, with the naming and parameters of a business question, not an entity schema. The MCP server's job is the translation layer.
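
With the TypeScript MCP SDK, one of these tools looks roughly like this. queryOpenInvoices is a hypothetical helper standing in for the OData composition; only the SDK calls are real:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Hypothetical helper: composes the underlying OData calls and projects fields.
declare function queryOpenInvoices(args: {
  vendor_id?: string;
  days_overdue?: number;
  legal_entity?: string;
}): Promise<unknown[]>;

const server = new McpServer({ name: "fno-mcp", version: "0.1.0" });

// One business-shaped tool, not one tool per entity.
server.tool(
  "query_open_invoices",
  "Open AP invoices, optionally filtered by vendor, age, and legal entity",
  {
    vendor_id: z.string().optional(),
    days_overdue: z.number().optional(),
    legal_entity: z.string().optional(),
  },
  async (args) => {
    const rows = await queryOpenInvoices(args);
    return { content: [{ type: "text" as const, text: JSON.stringify(rows) }] };
  }
);

await server.connect(new StdioServerTransport());
```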

The Multi-Company Problem

F&O is multi-company by design. Every transactional entity carries a DataAreaId, and the same query against the same entity returns different data depending on which legal entity context you are in. This trips up every integration, and it absolutely trips up an LLM if you do not handle it.

Two patterns that work:

Make legal_entity an explicit parameter on every transactional tool. Default to none, and have the tool either error out asking the model to specify, or return data across all legal entities the user has access to. Never silently default to a single company — that is how you get correct-looking answers about the wrong subsidiary.

Expose list_legal_entities() early. The model will call it when it needs to disambiguate. A user asking "what is our cash balance" implicitly assumes one company; the model should ask back which one if the user has access to several.
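
In OData terms that means being explicit about cross-company scope on every call. A sketch of the URL construction — the helper is mine, but the cross-company query parameter and the dataAreaId filter are standard F&O OData:

```typescript
// Build an OData URL for a transactional entity with explicit legal-entity
// handling. Entity and filter strings follow F&O OData conventions.
function buildEntityUrl(
  baseUrl: string,
  entity: string,
  legalEntity?: string,
  extraFilter?: string
): string {
  const url = new URL(`${baseUrl}/data/${entity}`);
  // Always query cross-company; without this flag F&O silently scopes the
  // result to the caller's default company, which is exactly the failure
  // mode described above.
  url.searchParams.set("cross-company", "true");
  const filters: string[] = [];
  if (legalEntity) filters.push(`dataAreaId eq '${legalEntity}'`);
  if (extraFilter) filters.push(extraFilter);
  if (filters.length > 0) url.searchParams.set("$filter", filters.join(" and "));
  return url.toString();
}
```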

The Write Path Is Different

Reading data through an MCP server is mostly safe. Writing is not. A bad write in production F&O can post the wrong journal, ship the wrong inventory, or commit a payment that takes a week to reverse.

Two rules I apply:

Writes go through custom services, not direct entity updates. A custom service in X++ enforces business logic — validation, posting rules, workflow approvals. An entity update bypasses some of that. If the MCP server is going to write, it writes through services that the F&O team controls and reviews. The MCP server is not allowed to do anything a power user could not do through a UI.

Every write tool requires an explicit confirmation step. The MCP server returns a preview of the change ("I will create a purchase order for vendor X with these lines, totaling $Y") and waits for the user to confirm before actually calling the service. The model is not allowed to act first and apologize later. This is a hard discipline that pays for itself the first time a user catches a hallucinated vendor ID before it becomes a real PO.
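
One way to enforce this is a two-phase tool pair: a preview call that stages the write and returns a single-use token, and a confirm call that refuses anything unstaged. The request shape and the helpers (createPurchaseOrderPreview, callCreatePoService) are hypothetical:

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical request shape and helpers; the service wrapper corresponds to
// the X++ custom service described above.
interface PoRequest {
  vendorId: string;
  lines: { itemId: string; qty: number }[];
}
declare function createPurchaseOrderPreview(p: PoRequest): Promise<{ total: string }>;
declare function callCreatePoService(p: PoRequest): Promise<{ poNumber: string }>;

// Pending writes keyed by a single-use confirmation token. A Map is enough
// for a sketch; production wants a store with a TTL.
const pending = new Map<string, PoRequest>();

// Phase 1: return a human-readable preview plus a token. Nothing in F&O
// changes here.
export async function previewCreatePurchaseOrder(payload: PoRequest) {
  const preview = await createPurchaseOrderPreview(payload);
  const token = randomUUID();
  pending.set(token, payload);
  return {
    summary:
      `Will create a PO for vendor ${payload.vendorId}: ` +
      `${payload.lines.length} lines totaling ${preview.total}. Confirm to proceed.`,
    confirmation_token: token,
  };
}

// Phase 2: only runs once the user has said yes and the model echoes the token.
export async function confirmCreatePurchaseOrder(token: string) {
  const payload = pending.get(token);
  if (!payload) throw new Error("Unknown or already-used confirmation token");
  pending.delete(token); // single use: a retry cannot double-post
  return callCreatePoService(payload);
}
```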

What to Never Expose

Some things should not be one prompt away from anyone, no matter how senior:

  • Posting general journals. The blast radius is too large; ledger reversals are expensive.
  • Mass updates. Anything that touches >1 record per call should be human-driven, not LLM-driven.
  • User and role administration. Privilege management belongs in the admin UI, with proper change control.
  • Configuration changes. Number sequences, posting profiles, dimension structures — all bypass-the-MCP territory.
  • Bank file generation. Anything that produces an artifact a bank will act on.

The principle: if the worst-case outcome of a hallucinated tool call is "we have to call the bank to claw back a payment," the tool should not exist on the MCP server.

Schema, Pagination, and the Token Budget

F&O entities are wide. A single sales order header can have 200+ fields; the inventory transactions entity has more. If you return the full entity to the model, you burn tokens on Created, Modified, RecVersion, Partition, and dozens of audit fields nobody asked about.

Two practices keep this sane:

Project to a small, named field set per tool. query_open_invoices returns ~12 fields, not the full AP invoice header. The model can call a separate get_invoice_detail(invoice_number) tool when it needs the full record.

Page aggressively. Default page size 25, hard cap 100. If the model asks for "all open invoices," it gets the first page plus a next_cursor. F&O entities will happily return tens of thousands of rows; that is not a useful response to a model.
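
Both practices in one sketch — the entity and field names are illustrative, and paging leans on OData server-driven paging (the Prefer: odata.maxpagesize hint plus the @odata.nextLink F&O returns):

```typescript
// Named field set per tool; entity and field names are illustrative.
const INVOICE_FIELDS = [
  "InvoiceNumber", "InvoiceAccount", "InvoiceDate", "DueDate",
  "InvoiceAmount", "dataAreaId",
].join(",");

const DEFAULT_PAGE_SIZE = 25;
const HARD_CAP = 100;

export async function fetchInvoicePage(
  baseUrl: string,
  token: string,
  cursor?: string,
  requestedSize = DEFAULT_PAGE_SIZE
): Promise<{ rows: unknown[]; next_cursor: string | null }> {
  const pageSize = Math.min(requestedSize, HARD_CAP); // hard cap always wins
  // A cursor is just the @odata.nextLink from the previous page, handed back
  // to the model as an opaque string.
  const url =
    cursor ?? `${baseUrl}/data/VendorInvoiceHeaders?$select=${INVOICE_FIELDS}`;
  const res = await fetch(url, {
    headers: {
      Authorization: `Bearer ${token}`,
      // Ask the server to page; more rows remain when @odata.nextLink is set.
      Prefer: `odata.maxpagesize=${pageSize}`,
    },
  });
  if (!res.ok) throw new Error(`F&O returned ${res.status}`);
  const body = (await res.json()) as Record<string, any>;
  return { rows: body.value, next_cursor: body["@odata.nextLink"] ?? null };
}
```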

Caching: Reference Data vs Transactional

F&O has reference data that almost never changes (legal entities, chart of accounts, units of measure) and transactional data that changes constantly. Caching strategy follows that split.

  • Reference data: cache in-process with a TTL of an hour. Even better, warm it at server start. The model calling list_legal_entities() should not hit F&O.
  • Transactional data: do not cache at the MCP layer. The user is asking because they want the current state. F&O has its own caching where it makes sense.

Where this matters: tools like lookup_product benefit massively from cached reference data because they involve fuzzy matching — you do not want to hit F&O on every keystroke of an interactive flow.
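
The reference-data side needs nothing fancier than an in-process map with expiry. A sketch, with fetchLegalEntities standing in for the actual entity call:

```typescript
// Hypothetical loader over the legal entities entity.
declare function fetchLegalEntities(): Promise<unknown[]>;

const TTL_MS = 60 * 60 * 1000; // one hour, for reference data only

interface CacheEntry<T> { value: T; expiresAt: number }
const cache = new Map<string, CacheEntry<unknown>>();

export async function cached<T>(key: string, load: () => Promise<T>): Promise<T> {
  const hit = cache.get(key) as CacheEntry<T> | undefined;
  if (hit && hit.expiresAt > Date.now()) return hit.value;
  const value = await load();
  cache.set(key, { value, expiresAt: Date.now() + TTL_MS });
  return value;
}

// Reference data goes through the cache; transactional tools never do.
// Call this once at server start to warm it.
export const listLegalEntities = () => cached("legal_entities", fetchLegalEntities);
```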

Observability: Logs the Auditors Will Ask For

Every MCP tool call should produce an audit record with:

  • Authenticated user
  • Tool name and parameters
  • F&O calls made (entity, operation, response code)
  • Timestamp
  • Correlation ID linking back to the model's request

When a controller asks "did the AI assistant change this invoice?" — and they will — you need an answer in under a minute. A structured log to Application Insights or a dedicated audit table makes that trivial. Without it, the first incident becomes a forensic project.
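
A sketch of the audit record and the Application Insights write using the applicationinsights Node SDK — the event and property names are my convention, not anything the SDK prescribes:

```typescript
import * as appInsights from "applicationinsights";

appInsights.setup(process.env.APPLICATIONINSIGHTS_CONNECTION_STRING).start();

interface AuditRecord {
  user: string;            // authenticated user (UPN or object id)
  tool: string;            // MCP tool name
  params: unknown;         // tool parameters as given by the model
  fnoCalls: { entity: string; operation: string; status: number }[];
  correlationId: string;   // links back to the model's request
}

export function audit(record: AuditRecord): void {
  appInsights.defaultClient.trackEvent({
    name: "mcp_tool_call",
    properties: {
      user: record.user,
      tool: record.tool,
      params: JSON.stringify(record.params),
      fnoCalls: JSON.stringify(record.fnoCalls),
      correlationId: record.correlationId,
      timestamp: new Date().toISOString(),
    },
  });
}
```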

A Realistic Deployment Shape

The pattern that has held up for me:

  • MCP server hosted as an Azure Container App with managed identity to Entra ID
  • OBO auth for any tool that writes; S2S app registration for read-only reporting
  • F&O calls go through a small client library that handles retries, throttling responses (F&O returns 429s), and entity field projection
  • Audit log streamed to Application Insights with a dashboard for tool call volume and error rates
  • The tool list is generated from a YAML manifest, so adding a new tool is a config change rather than a code change for everyone who runs the server

The first version is a few hundred lines of TypeScript or Python. The MCP SDK is the easy part; the F&O integration is what takes the time, and most of that is dealing with auth, throttling, and the multi-company gymnastics.
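
The throttling piece deserves a concrete example, because it is the part everyone forgets until production. A minimal retry wrapper, assuming F&O's documented behavior of returning 429 with a Retry-After header:

```typescript
// Retry loop for F&O throttling. F&O returns 429 with a Retry-After header
// when its resource-based throttling limits are hit; honor it rather than
// hammering the environment.
export async function fnoFetch(
  url: string,
  init: RequestInit,
  maxRetries = 3
): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch(url, init);
    if (res.status !== 429 || attempt >= maxRetries) return res;
    const retryAfterSec = Number(res.headers.get("Retry-After") ?? "5");
    await new Promise((resolve) => setTimeout(resolve, retryAfterSec * 1000));
  }
}
```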

Where This Genuinely Wins

The places I have seen this pay off:

  • Period-close investigations. "Why did the COGS for SKU X spike this period?" The model can chain inventory transactions, costing entries, and journal lines without an analyst writing six queries.
  • Vendor and customer servicing. AP and AR teams answering account-status questions in seconds rather than navigating four forms.
  • Audit support. Read-only access for auditors with full audit trails on every query they ran. Faster than giving them a license and training them on the UI.
  • Internal reporting. Replaces a long tail of one-off Power BI reports for questions that get asked once.

The places it does not win:

  • High-volume transaction entry. A clerk entering 200 invoices a day still wants the form. The MCP path is for ad-hoc questions and exceptional flows.
  • Anything regulated to the field level. SOX-relevant changes still go through the controlled path.

Closing

A Finance and Operations MCP server is not magic. It is the same integration patterns F&O has always demanded — auth, multi-company awareness, custom services, audit — re-shaped into a tool surface that an LLM can actually use. The technical work is bounded. The discipline that matters is the same as any production F&O integration: be deliberate about what you expose, careful about how you write, and obsessive about audit trails.

Done well, it gives users the F&O answers they have always wanted to ask in English. Done poorly, it gives them the most expensive hallucination machine ever connected to an ERP. The line between the two is the choices in this post.
