6 min readRishi

Alternate Keys and Upsert in Dataverse: Reliable Integrations Without Duplicates

Integrations rarely know the Dataverse row GUID on day one. They know an account number, an external customer id, a policy number, or a combination that your business already treats as unique. Alternate keys let you make that business identifier enforceable, searchable, and usable for upsert instead of writing brittle find-then-create logic.

Alternate keys turn business identity into a platform contract

The key decision is ownership. If the source system owns a stable identifier, model it as an alternate key in Dataverse and make every integration call use it. That gives you idempotency: the same message can be retried safely after a timeout, gateway failure, or queue replay without creating a second account.

An alternate key can be a single column or multiple columns. Single-column keys are common for values like account number. Multi-column keys work when uniqueness only makes sense in context, such as external customer id plus source system.

Key shapeExampleWhen it works best
Single columnaccountnumberOne source owns a globally unique business id
Two columnsnew_sourcesystem, new_externalidSeveral systems may send overlapping ids
Composite natural keynew_country, new_taxidA real-world identifier is unique only inside a jurisdiction

Do not confuse validation with identity. A required column prevents blanks. A duplicate detection rule can warn users. An alternate key creates a unique index and lets the Web API address a row by the business key.

Key readiness is asynchronous, so deployments must wait

Creating the key is not the same as using it. Dataverse creates the backing index asynchronously. In solution deployment, the key can exist in metadata while its status is still pending or failed. If you fire integration traffic immediately, upsert calls may fail or fall back to unsafe compensating logic.

Before go-live, check the key status in maker tools or metadata. Failed key creation usually means existing duplicate data or unsupported column choices. Clean the data first, then reactivate or recreate the key.

{
  "KeyAttributes": [
    "new_sourcesystem",
    "new_externalid"
  ],
  "DisplayName": "Source System External Id",
  "EntityLogicalName": "account"
}

A release pipeline should treat alternate key readiness like a database migration. Deploy metadata, wait for the index, then enable the integration job. That sequence avoids the worst production pattern: the first batch both exposes old duplicates and creates new ones.

Upsert is a PATCH addressed by the alternate key

The cleanest pattern is one call per business record. Put the alternate key value in the entity URL and send the attributes you want to create or update. Dataverse resolves the key, updates the matching row if it exists, or creates a new row if it does not.

PATCH [Organization URI]/api/data/v9.2/accounts(accountnumber='A-1001')
Content-Type: application/json
If-None-Match: null

{
  "name": "Contoso Manufacturing",
  "telephone1": "+1 425 555 0100",
  "address1_city": "Redmond"
}

For a composite key, include each key column in the URL. Strings need quotes, and special characters must be URL encoded. Keep the key values immutable from the integration perspective. If the business changes the account number, treat that as a deliberate identity change with a migration plan, not a normal update payload.

PATCH [Organization URI]/api/data/v9.2/accounts(new_sourcesystem='ERP',new_externalid='CUST-7781')
Content-Type: application/json

{
  "name": "Northwind Traders",
  "accountnumber": "NW-7781"
}

The response status tells you what happened. 204 No Content usually means an update. 201 Created means Dataverse inserted a row. Your integration should handle both as success.

Idempotency beats find-then-create logic

The classic anti-pattern is query, branch, then write. It looks harmless: call $filter, check whether a row exists, then create or update. Under load, two workers can read no match at the same time and both create. Duplicate detection might catch one, or it might not, depending on timing and rule configuration.

Upsert makes the unique index part of the write. That changes the failure mode from silent duplication to a deterministic conflict or validation error. It also reduces round trips and simplifies retry handling.

$body = @{
  name = "A. Datum Corporation"
  accountnumber = "A-1001"
  telephone1 = "+1 425 555 0199"
} | ConvertTo-Json

Invoke-RestMethod `
  -Method Patch `
  -Uri "$org/api/data/v9.2/accounts(accountnumber='A-1001')" `
  -Headers $headers `
  -Body $body `
  -ContentType "application/json"

Retries become boring. If the first call succeeded but the network response was lost, the retry updates the same row. If the first call never reached Dataverse, the retry creates it. Either way, one business record remains one Dataverse row.

Key columns have limits you should design around

Not every column belongs in a key. Use stable, short, deterministic values. Avoid values users casually edit, values with inconsistent casing rules across source systems, and values generated late in the process.

Dataverse alternate keys support common scalar column types such as text, whole number, decimal, date and time, lookup, choice, and yes or no. They do not support every type. Large text, file, image, calculated, rollup, party list, and virtual data patterns are poor or invalid choices. Even when a type is technically allowed, long text keys are operationally painful because URL addressing and index size matter.

For text columns, decide casing and whitespace rules upstream. A-1001 and A-1001 are not the same integration contract. Normalize before the upsert call, and consider storing the original source value separately if humans need to see it.

Operational guardrails prevent quiet data drift

Keys need monitoring after launch. Track upsert counts by created versus updated. A sudden spike in created rows often means a source changed its identifier format, a key column mapping broke, or a test source started feeding production.

Also log the exact alternate key used for every failed request. Without that, support teams waste time searching by names that are not unique. For multi-column keys, log every component in one structured field.

When you import historical data, load and deduplicate before enabling the key. When you merge duplicates later, preserve the winning key values and redirect downstream references. Alternate keys are not just metadata. They are the line between integration traffic and your customer master.

Build the integration around the identity Dataverse can enforce. A well-chosen alternate key plus Web API upsert is simpler than custom duplicate checks, safer than GUID mapping tables, and far easier to operate when retries happen at scale.

Keep reading

Newsletter

New posts, straight to your inbox

One email per post. No spam, no tracking pixels, unsubscribe anytime.

Comments

No comments yet. Be the first.