Enforcing Compliance Guardrails with Azure Policy

When you audit a cloud estate long enough you start to notice that most findings are not exotic. They are the same handful of mistakes repeated across subscriptions: a storage account left open to the public, a resource deployed in a region you do not have a data residency agreement for, a VM with no tags so nobody knows who owns it. You can chase these one by one in a spreadsheet, or you can stop them at the door. That is what Azure Policy is for.

I have come to treat Policy as the place where written controls become enforced controls. A line in a policy document that says “all resources must be tagged with a cost centre” is a hope. The same rule expressed as an Azure Policy is a guardrail.

Prerequisites

An Azure subscription. Create one for free if you are just experimenting.
The Resource Policy Contributor or Owner role on the scope you want to govern.
The Azure CLI installed. Everything here can also be done from the portal under Policy.

Definitions, initiatives and assignments

Before touching anything it helps to be clear on three words, because the portal uses them everywhere.

A policy definition is a single rule, for example “deny storage accounts that allow public blob access”.
An initiative (also called a policy set) is a bundle of definitions grouped together, usually because they map to one standard such as ISO 27001 or CIS.
An assignment is where you point a definition or initiative at a scope (a management group, a subscription, or a single resource group), and it starts taking effect.

The mental model I use is simple. Definitions are the rules, initiatives are the rulebook, and assignments are where you decide who the rulebook applies to.
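To make the three words concrete, here is a minimal sketch of that mental model as plain Python data classes. The class names, field names and example values are my own illustration, not anything from the Azure SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    """A single rule, for example 'deny storage accounts with public access'."""
    name: str
    effect: str


@dataclass(frozen=True)
class Initiative:
    """A named bundle of definitions (a policy set)."""
    name: str
    definitions: tuple


@dataclass(frozen=True)
class Assignment:
    """Points a definition or an initiative at a scope."""
    name: str
    target: object  # either a Definition or an Initiative
    scope: str

    def rules(self):
        """Return the flat list of definitions this assignment puts into effect."""
        if isinstance(self.target, Initiative):
            return list(self.target.definitions)
        return [self.target]


# Hypothetical example values, used only for illustration.
tag_rule = Definition("require-costcentre-tag", "deny")
storage_rule = Definition("deny-storage-public-access", "deny")
rulebook = Initiative("baseline", (tag_rule, storage_rule))
assignment = Assignment("baseline-on-sub", rulebook, "/subscriptions/<your-sub-id>")

print([d.name for d in assignment.rules()])
```

Assigning the initiative puts both rules into effect at that scope, which is why I think of the assignment as the step where the rulebook meets its audience.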

1. Start with a predefined policy

You do not need to write anything to get value on day one. Azure ships with hundreds of predefined definitions, which the portal and documentation call built-in. A good first one is requiring a tag, because untagged resources are the root of most ownership confusion.

az policy assignment create \
  --name "require-costcentre-tag" \
  --display-name "Require a CostCentre tag on resources" \
  --policy "871b6d14-10aa-478d-b590-94f262ecfa99" \
  --params '{ "tagName": { "value": "CostCentre" } }' \
  --scope "/subscriptions/<your-sub-id>"

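That definition ID is the predefined “Require a tag on resources”. It uses the Deny effect, so once it is assigned any new deployment missing the tag is rejected rather than merely flagged. Existing resources without the tag are reported as non-compliant.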

2. Choose the effect deliberately

This is the part people rush and regret. Every policy has an effect, and the effect decides how aggressive the guardrail is. The ones you will use most are:

Audit allows the deployment but records the resource as non-compliant. Use this when you are introducing a rule and do not want to break existing workloads.
Deny blocks the deployment outright. Use this once you are confident the rule is correct.
DeployIfNotExists lets the resource through but automatically deploys a related setting, for example enabling diagnostic logs. The assignment needs a managed identity with permission to make that change. This is how you remediate at scale.

My advice from doing this on live tenants: start in Audit, watch the compliance results for a week, then promote the rule to Deny. Going straight to Deny on a busy subscription is how you end up explaining to a delivery team why their release failed at 5pm on a Friday.
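If it helps to see the difference between the effects side by side, here is a toy decision table in Python. It is a sketch of the behaviour described above, not of how Azure implements it. The function name and the outcome strings are illustrative labels I made up.

```python
# Toy model of the three effects described above. The outcome strings are
# illustrative labels, not values returned by Azure.
EFFECTS = {
    "audit": (True, "record as non-compliant"),
    "deny": (False, "block the request"),
    "deployifnotexists": (True, "deploy the related setting"),
}


def decide(effect, violates_rule):
    """Return (deployment_allowed, what_happens_next) for one request."""
    if not violates_rule:
        return (True, "nothing to do")
    return EFFECTS[effect.lower()]


for name in ("Audit", "Deny", "DeployIfNotExists"):
    allowed, action = decide(name, violates_rule=True)
    print(f"{name}: allowed={allowed}, then {action}")
```

Only Deny ever returns False for the deployment, which is why it is the one worth rehearsing in Audit first.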

3. Write a custom definition when the predefined ones fall short

Eventually you hit a rule that no predefined definition covers. Custom definitions are just JSON. Here is one that denies any storage account where public network access is not disabled.

{
  "properties": {
    "displayName": "Deny storage accounts with public network access",
    "mode": "All",
    "policyRule": {
      "if": {
        "allOf": [
          { "field": "type", "equals": "Microsoft.Storage/storageAccounts" },
          { "field": "Microsoft.Storage/storageAccounts/publicNetworkAccess", "notEquals": "Disabled" }
        ]
      },
      "then": { "effect": "deny" }
    }
  }
}

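The --rules parameter expects only the policyRule object, so save the if and then blocks on their own (without the properties wrapper) as deny-storage-public.json, then create the definition:

az policy definition create \
  --name "deny-storage-public-access" \
  --display-name "Deny storage accounts with public network access" \
  --rules @deny-storage-public.json \
  --mode All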

The structure is always the same: an if block describing what to match, and a then block describing what to do about it. Once you have read a few you can write your own quickly.
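To show how the if and then blocks fit together, here is a small Python sketch that evaluates the storage rule against a resource. It is a simplified stand-in for the real engine. It assumes the resource is a flat dictionary keyed by field name, and it supports only allOf, anyOf, not, equals and notEquals. String comparison is case-insensitive, which is my understanding of how Azure treats these two conditions.

```python
RULE = {
    "if": {
        "allOf": [
            {"field": "type", "equals": "Microsoft.Storage/storageAccounts"},
            {
                "field": "Microsoft.Storage/storageAccounts/publicNetworkAccess",
                "notEquals": "Disabled",
            },
        ]
    },
    "then": {"effect": "deny"},
}


def _same(actual, expected):
    """Compare two values, ignoring case when both are strings."""
    if isinstance(actual, str) and isinstance(expected, str):
        return actual.lower() == expected.lower()
    return actual == expected


def matches(condition, resource):
    """Recursively evaluate one condition of the 'if' block."""
    if "allOf" in condition:
        return all(matches(c, resource) for c in condition["allOf"])
    if "anyOf" in condition:
        return any(matches(c, resource) for c in condition["anyOf"])
    if "not" in condition:
        return not matches(condition["not"], resource)
    actual = resource.get(condition["field"])  # a missing field reads as None
    if "equals" in condition:
        return _same(actual, condition["equals"])
    if "notEquals" in condition:
        return not _same(actual, condition["notEquals"])
    raise ValueError(f"unsupported condition: {condition}")


def evaluate(policy_rule, resource):
    """Return the effect name if the 'if' block matches, otherwise None."""
    if matches(policy_rule["if"], resource):
        return policy_rule["then"]["effect"]
    return None


open_account = {
    "type": "Microsoft.Storage/storageAccounts",
    "Microsoft.Storage/storageAccounts/publicNetworkAccess": "Enabled",
}
print(evaluate(RULE, open_account))
```

Note that in this sketch a storage account that omits the property altogether also matches, because a missing value is not equal to "Disabled".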

4. Bundle definitions into an initiative

Individual rules are fine, but auditors think in frameworks, not individual settings. Group related definitions into an initiative so you can report against a standard as a whole. In the portal this is Policy > Definitions > Initiative definition, where you add the definitions you want, including any of the predefined regulatory ones Microsoft already maintains for ISO 27001, SOC 2 and NIST.

When you assign the initiative, the compliance dashboard rolls everything up into one percentage per standard. That number is the thing your leadership and your auditors actually want to see.
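As a rough illustration of that roll-up, the sketch below counts a resource as compliant only if it passes every definition in the initiative. This is a simplification I am assuming for the example. Azure's own calculation also accounts for states such as exempt resources, and the resource and definition names are hypothetical.

```python
def initiative_compliance(results):
    """Return the percentage of resources that pass every definition.

    results maps a resource id to {definition_name: passed}.
    """
    if not results:
        return 100.0
    compliant = sum(1 for checks in results.values() if all(checks.values()))
    return round(100.0 * compliant / len(results), 1)


# Hypothetical evaluation results for four resources and two definitions.
results = {
    "vm-web-01": {"require-tag": True, "deny-public-storage": True},
    "vm-web-02": {"require-tag": False, "deny-public-storage": True},
    "stlogs001": {"require-tag": True, "deny-public-storage": True},
    "stdata001": {"require-tag": True, "deny-public-storage": True},
}
print(initiative_compliance(results))
```

One failing check is enough to pull a resource out of the compliant count, so a single noisy definition can drag the whole number down.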

5. Check compliance and remediate

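After assignment, evaluation runs on a schedule (roughly every 24 hours) and whenever a resource is created or updated. If you do not want to wait, az policy state trigger-scan starts an evaluation on demand. To get a quick read from the CLI: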

az policy state summarize --subscription "<your-sub-id>"

For policies using DeployIfNotExists or Modify effects, existing resources will show as non-compliant until you run a remediation task, which is Azure going back and fixing them for you. You can trigger these from the Remediation blade or with az policy remediation create.
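Conceptually, a remediation task is a loop over existing resources that fail the rule. The sketch below shows that idea in Python. The function names and the diagnostics flag are hypothetical, and the real task runs inside Azure under the assignment's identity rather than in your own code.

```python
def remediate(resources, is_compliant, fix):
    """Apply fix to each existing resource that fails the check.

    Returns the ids of the resources that were changed.
    """
    touched = []
    for resource in resources:
        if not is_compliant(resource):
            fix(resource)
            touched.append(resource["id"])
    return touched


def has_diagnostics(resource):
    """Hypothetical check: is diagnostic logging switched on?"""
    return resource.get("diagnostics") is True


def enable_diagnostics(resource):
    """Hypothetical fix: switch diagnostic logging on."""
    resource["diagnostics"] = True


estate = [
    {"id": "vm-web-01", "diagnostics": True},
    {"id": "vm-web-02", "diagnostics": False},
    {"id": "vm-batch-01"},
]
print(remediate(estate, has_diagnostics, enable_diagnostics))
```

Running it a second time changes nothing, which is the property you want from any remediation: safe to repeat.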

A note on scope and management groups

If you manage more than one subscription, resist the urge to assign policies subscription by subscription. Assign them once at a management group and let them inherit downward. It is the difference between maintaining a rule in one place and maintaining it in fifteen, and it is the single biggest time saver I can recommend here.
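Inheritance is easiest to see with a small hierarchy. The Python sketch below assumes a made-up tree of management groups and subscriptions and walks upward from a scope to collect every assignment that applies to it. All names are hypothetical.

```python
# Hypothetical hierarchy: child scope -> parent scope.
PARENT = {
    "sub-prod": "mg-landing-zones",
    "sub-dev": "mg-landing-zones",
    "mg-landing-zones": "mg-root",
    "sub-sandbox": "mg-root",
}


def ancestors(scope):
    """Return the scope followed by each of its parents, nearest first."""
    chain = [scope]
    while scope in PARENT:
        scope = PARENT[scope]
        chain.append(scope)
    return chain


def effective_assignments(scope, assignments):
    """Collect assignments from the top of the tree down to the scope."""
    found = []
    for level in reversed(ancestors(scope)):
        found.extend(assignments.get(level, []))
    return found


assignments = {
    "mg-root": ["require-costcentre-tag"],
    "sub-dev": ["audit-only-experiment"],
}
print(effective_assignments("sub-prod", assignments))
print(effective_assignments("sub-dev", assignments))
```

The tag rule is written once at the root and reaches every subscription beneath it, while the experiment stays confined to the one subscription it was assigned to.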

Closing thoughts

Azure Policy is not glamorous, and that is exactly why I like it. It moves compliance from something you inspect after the fact to something the platform enforces continuously. Start small with an audit rule, prove it does not break anything, promote it to deny, then group your rules into initiatives that line up with the standards you actually report against. Do that consistently and a large part of your cloud compliance simply stops being a manual job.