Automating Azure VM Password Management with Terraform and Key Vault

Introduction

When deploying virtual machines at scale in Azure, one of the most overlooked yet crucial aspects is managing local administrator passwords.

Generating secure passwords, ensuring each VM gets a unique one, storing them safely, and enabling rotation without risking VM stability — none of these are trivial challenges.

In smaller setups, administrators might manage passwords manually or through basic automation, but as environments grow, this quickly becomes impractical and risky.

With Terraform, you can automate VM creation end-to-end. But by default, Terraform stores all generated values (like random passwords) in its state file. This poses a significant security concern — anyone with access to the state can view sensitive secrets in plaintext.

In this article, we will walk through two robust approaches for solving this problem:

    • Approach 1: Using Terraform’s random_password + Azure Key Vault + lifecycle ignore_changes (good, but passwords are stored in state)

    • Approach 2: Using Terraform ephemeral password generation + Azure Key Vault + ignore_changes (best practice — secrets are never stored in state)

Along the way, we will understand why the second approach is superior and how Terraform’s new ephemeral values feature (introduced in v1.10) has changed the game for secrets management.


The Problem with Passwords and Terraform State

Terraform is declarative. It maintains a state file which reflects the last known state of your infrastructure. Whenever you generate a random password and use it in a resource (like an Azure VM), Terraform records this password in its state so that future plans know what value was originally provisioned.

But here lies the problem:

      • Terraform state is usually stored in remote backends (Azure Storage Account, S3, etc.) → which might be accessed by teams and pipelines.
      • Anyone with access to this state can view the password in plain text.
      • This creates security and compliance concerns → especially when passwords or sensitive secrets are involved.

So our goal should be clear: Avoid Terraform state for secrets wherever possible.
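Note that marking a value as sensitive does not help here: sensitive only redacts CLI output, while the state file still stores the value verbatim. A minimal illustration (resource and output names hypothetical):

```hcl
resource "random_password" "example" {
  length  = 16
  special = true
}

output "admin_password" {
  # "sensitive" hides the value in plan/apply output,
  # but it is still written to terraform.tfstate in plain text.
  value     = random_password.example.result
  sensitive = true
}
```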


Solution Overview

Let’s explore the two approaches to solve this problem. Each has its pros and cons.

    • Approach 1: Use regular random_password. Password gets stored in state, but Key Vault is used as permanent store. We mitigate drift by using ignore_changes in the VM resource.
    • Approach 2: Use Terraform 1.10’s ephemeral block → generate password during apply only → pass it directly to VM and Key Vault → no state storage → best security.

Let’s dive deeper into each of them now.


Approach 1 — Regular random_password + ignore_changes

Before Terraform 1.10 introduced ephemeral values, this was the most common approach. You generate a password using the random_password resource and pass it to:

    • The VM → during creation, to set the local admin password.
    • Azure Key Vault → to securely store and retrieve the password later.

However, this password will be stored in Terraform state. To avoid an accidental overwrite in the future (when the password is rotated in Key Vault), we use ignore_changes for the admin_password attribute in the VM resource. This ensures that if the Key Vault value changes later, Terraform does not consider the VM “out of sync” and try to reapply the password.

Provider Block (provider.tf)

terraform {
  required_version = ">= 1.3.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 3.80.0"
    }
    random = {
      source  = "hashicorp/random"
      version = ">= 3.7.1"
    }
  }
}

provider "azurerm" {
  features {}
}

Password Generation and VM Creation (main.tf)

data "azurerm_key_vault" "kv" {
  name                = var.key_vault_name
  resource_group_name = var.resource_group_name
}

resource "random_password" "vm_password" {
  for_each = var.vms
  length   = 16
  special  = true
  override_special = "!@#$%^&*()_+-=[]{}|;:,.<>?"

  lifecycle {
    ignore_changes = all
  }
}

resource "azurerm_windows_virtual_machine" "vm" {
  for_each = var.vms

  name                  = each.value.name
  location              = var.location
  resource_group_name   = var.resource_group_name
  size                  = each.value.size
  admin_username        = each.value.admin_user
  admin_password        = random_password.vm_password[each.key].result
  network_interface_ids = each.value.nic_ids # NIC IDs supplied per VM via var.vms

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  source_image_reference {
    publisher = "MicrosoftWindowsServer"
    offer     = "WindowsServer"
    sku       = "2022-datacenter"
    version   = "latest"
  }

  lifecycle {
    ignore_changes = [ admin_password ]
  }
}

resource "azurerm_key_vault_secret" "vm_password_secret" {
  for_each = var.vms
  name         = "${each.value.name}-local-admin-password"
  value        = random_password.vm_password[each.key].result
  key_vault_id = data.azurerm_key_vault.kv.id
}
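For completeness, the configuration above assumes input variables along these lines. A minimal variables.tf sketch (the exact object shape, including nic_ids, is an assumption inferred from how the values are referenced):

```hcl
variable "vms" {
  description = "VMs to create, keyed by a stable identifier."
  type = map(object({
    name       = string
    size       = string
    admin_user = string
    nic_ids    = list(string) # assumed: network interface IDs per VM
  }))
}

variable "key_vault_name" {
  type = string
}

variable "resource_group_name" {
  type = string
}

variable "location" {
  type = string
}
```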

Summary of Approach 1

This approach works and solves the drift issue, but storing the password in Terraform state remains a concern. Anyone with read access to the backend state file (Azure Storage, S3, or local disk) can see the password in plain text.

In environments with strong security requirements → this is not acceptable.


Terraform 1.10 Ephemeral Values → Game Changer for Secrets Management

With Terraform 1.10, ephemeral values are now Generally Available. This allows secrets to exist ONLY during terraform apply → they are never saved in the state file. This is ideal for:

    • Passwords (like VM local admin)
    • Private keys and certificates
    • API tokens
    • Any sensitive value which should not be persisted in Terraform state

This is the approach you should now consider first for any secret management requirement.
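Ephemerality is not limited to resources: since 1.10, input variables can also be marked ephemeral, so a secret passed into a configuration is never persisted. A minimal sketch (variable name hypothetical):

```hcl
variable "api_token" {
  type      = string
  ephemeral = true # accepted at runtime, never written to plan or state
}
```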


Approach 2 — Ephemeral random_password + ignore_changes (Recommended)

This approach solves the biggest problem of Approach 1 → Terraform state storage of secrets.

In this pattern:

    • Password is generated during apply → ephemeral → never saved to state.
    • Password is directly passed to Azure VM and stored in Key Vault.
    • VM resource still has ignore_changes → any future password rotation in Key Vault will NOT trigger VM change.
    • After apply → password is gone from Terraform’s memory → only Key Vault keeps the permanent copy.

This is the most secure pattern.

Provider Block (provider.tf)

terraform {
  required_version = ">= 1.10.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 3.80.0"
    }
    random = {
      source  = "hashicorp/random"
      version = ">= 3.7.1"
    }
  }
}

provider "azurerm" {
  features {}
}

Password Generation and VM Creation (main.tf)

data "azurerm_key_vault" "kv" {
  name                = var.key_vault_name
  resource_group_name = var.resource_group_name
}

ephemeral "random_password" "vm_password" {
  for_each = var.vms
  length   = 16
  special  = true
  override_special = "!@#$%^&*()_+-=[]{}|;:,.<>?"
}

resource "azurerm_windows_virtual_machine" "vm" {
  for_each = var.vms

  name                  = each.value.name
  location              = var.location
  resource_group_name   = var.resource_group_name
  size                  = each.value.size
  admin_username        = each.value.admin_user
  admin_password        = ephemeral.random_password.vm_password[each.key].result
  network_interface_ids = each.value.nic_ids # NIC IDs supplied per VM via var.vms

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  source_image_reference {
    publisher = "MicrosoftWindowsServer"
    offer     = "WindowsServer"
    sku       = "2022-datacenter"
    version   = "latest"
  }

  lifecycle {
    ignore_changes = [ admin_password ]
  }
}

resource "azurerm_key_vault_secret" "vm_password_secret" {
  for_each = var.vms
  name         = "${each.value.name}-local-admin-password"
  value        = ephemeral.random_password.vm_password[each.key].result
  key_vault_id = data.azurerm_key_vault.kv.id
}

Summary of Approach 2

This is the most secure approach. Passwords are never stored in Terraform state → removing a major security concern. Permanent storage and retrieval are delegated to Azure Key Vault → where access is auditable and protected via RBAC/policies.


Password Rotation

One important operational aspect → password rotation.

With both approaches (especially Approach 2), you do not need to run Terraform to rotate the password.

    • Password in Key Vault can be rotated manually or via Key Vault’s rotation policy.
    • VM will continue using the existing password until you manually update it or use DSC/Script.
    • Terraform’s ignore_changes ensures that it does not attempt to overwrite or “fix drift” when password changes outside.

This makes password lifecycle independent of Terraform after initial provisioning → which is ideal.
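One caveat worth noting: the azurerm_key_vault_secret resource itself still tracks its value in state, so an external rotation would show up as drift there too. If you rotate outside Terraform, consider ignoring changes on the secret as well; a sketch based on the secret resource from the examples above:

```hcl
resource "azurerm_key_vault_secret" "vm_password_secret" {
  for_each     = var.vms
  name         = "${each.value.name}-local-admin-password"
  value        = random_password.vm_password[each.key].result
  key_vault_id = data.azurerm_key_vault.kv.id

  lifecycle {
    # Without this, the next apply would overwrite a rotated secret
    # with the originally generated value.
    ignore_changes = [value]
  }
}
```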


Final Recommendation

While Approach 1 is workable, Approach 2 (Ephemeral + Key Vault + ignore_changes) is the recommended pattern going forward.

    • No sensitive secrets in Terraform state.
    • Secure storage in Azure Key Vault.
    • No drift / unnecessary VM changes when password rotated outside Terraform.
    • Meets modern security and compliance needs.

For all new designs, this should be the preferred approach for password and secret management with Terraform.

Terraform Import Block — Deep Dive for Real-World Enterprise Scenarios

 

Terraform import is essential when real-world infrastructure needs to be brought under Terraform management without destroying or recreating resources. But with custom modules and enterprise scenarios, importing becomes tricky. This article explains the process in detail, including the confusing parts.

Terraform Import Methods Comparison

1. The terraform import Command

This was the traditional way of importing resources into Terraform before version 1.5.

terraform import azurerm_virtual_network.example /subscriptions/.../resourceGroups/rg-example/providers/Microsoft.Network/virtualNetworks/vnet-example

What it does:

      • Imports the resource into the state file, one resource per command
      • Requires you to hand-write the matching configuration yourself (configuration generation with the -generate-config-out flag works only with import blocks during terraform plan, not with this command)

Enterprise limitations:

      • You are importing resources directly into the state file, without seeing a plan first – risky for an enterprise
      • Painful with module-based architectures – long module addresses must be typed by hand for every resource
      • Tedious with for_each/count – one command per instance, with carefully quoted index keys
      • Imports resources in isolation, with no awareness of related resources or dependencies

2. Import Blocks (Terraform 1.5+)

From version 1.5 onward, Terraform offers a declarative alternative for importing resources: the import block.

In this article, we will cover how to import Azure resources into Terraform state using the import block feature.

import {
  to = azurerm_virtual_network.example
  id = "/subscriptions/.../virtualNetworks/vnet-example"
}

Important: If you’re using Terraform version 1.5 or above, import blocks are the preferred option. Their biggest advantage is the ability to see the plan before actual import, which isn’t possible with the terraform import command.
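Import blocks also enable configuration generation: if an import block has no matching resource block yet, terraform plan can draft one for you. A minimal sketch (resource ID abridged as elsewhere in this article; the generated HCL usually needs manual cleanup before applying):

```hcl
# import.tf — no matching resource block exists in the configuration yet
import {
  to = azurerm_virtual_network.example
  id = "/subscriptions/xxx/resourceGroups/rg-example/providers/Microsoft.Network/virtualNetworks/vnet-example"
}

# Then run:
#   terraform plan -generate-config-out=generated.tf
# Terraform writes a draft resource block for azurerm_virtual_network.example
# into generated.tf; review it, then plan and apply as usual.
```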

Understanding the Import Block

import {
  to = <terraform state address>
  id = <real-world resource ID>
}

to: The address in Terraform state where the resource should be imported.
id: The real resource ID in Azure.

You can store these blocks in a file (say import.tf) within the Terraform working directory, the same directory that holds your tfvars and state and from which you run terraform plan and apply.

Likewise, you can create many import blocks one after another, one import block for each Azure resource that you want to import.

So if you want to import 4 Azure resources, the structure of import.tf file will be as follows :

import.tf 

import {
  to = <terraform state address>
  id = <real-world resource ID>
}
import {
  to = <terraform state address>
  id = <real-world resource ID>
}
import {
  to = <terraform state address>
  id = <real-world resource ID>
}
import {
  to = <terraform state address>
  id = <real-world resource ID>
}

🎯 Two Practical Scenarios

✅ Scenario 1: Easy-to-Identify Resources (Modules with Keys)

When modules support keys, Terraform plan will clearly show the address.

Example from terraform plan:

module.network.azurerm_subnet.subnet["web"]

Azure Resource ID:

/subscriptions/xxx/resourceGroups/rg-network/providers/Microsoft.Network/virtualNetworks/vnet-01/subnets/web

Import block:

import {
  to = module.network.azurerm_subnet.subnet["web"]
  id = "/subscriptions/xxx/resourceGroups/rg-network/providers/Microsoft.Network/virtualNetworks/vnet-01/subnets/web"
}

Direct copy from plan + ID = Simple!

❗ Scenario 2: Auto-generated Keys (Role Assignments etc)

When no keys are used in modules → Terraform falls back to auto-generated keys (often GUIDs), and matching state addresses to real Azure resources becomes tricky.

Terraform plan shows:

module.managed_id.azurerm_role_assignment.this["1e71f885-b502-49b4-8d7d-6ea99381234a"]

But which Azure resource does this belong to?

Solution:

    1. Check Azure Portal/CLI → Note principalId, scope, roleDefinitionId.
    2. Use terraform state list to find existing addresses.
    3. Use terraform state show to inspect and match principalId/scope → Identify correct mapping.

Once matched → prepare import block accordingly.
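Once the mapping is identified, the import block follows the same shape. A sketch with hypothetical GUIDs and resource group names (a role assignment ID is its scope plus /providers/Microsoft.Authorization/roleAssignments/<guid>):

```hcl
import {
  to = module.managed_id.azurerm_role_assignment.this["1e71f885-b502-49b4-8d7d-6ea99381234a"]
  id = "/subscriptions/xxx/resourceGroups/rg-app/providers/Microsoft.Authorization/roleAssignments/0b1f6471-1bf0-4dda-aec3-cb9272f09590"
}
```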

📦 Working with Remote State (Backend)

Typically, in an enterprise environment, state files are stored in a remote backend. However, for the import process it is convenient to work on a local copy of the state file.

Therefore, if our state file is stored in a remote backend, we will create a local copy from it by running the terraform state pull command. Once the import process is done, we will push the local state file back to the remote backend with the terraform state push command.

✅ Downloading state (terraform state pull)

terraform state pull > terraform.tfstate

Working with local state:

Remember, when you run terraform plan or terraform apply, Terraform uses the configured remote backend by default.

Therefore, if you want to work against the local state file, you have to point to it with the -state parameter. Failure to do so will send the commands to the remote backend.

terraform plan -state=terraform.tfstate
terraform apply -state=terraform.tfstate

✅ Updating the remote backend (terraform state push)

Important: Always take a backup of remote state before overwriting!

Once import is done and terraform plan shows no changes, it’s time to update the remote backend.

terraform state push terraform.tfstate          # Push local to remote backend

Validate:

terraform plan

This plan now runs against the remote state file, since you did not specify a local state file location.

✅ Should show → No changes. Your infrastructure matches the configuration.

Bingo! The import operation is fully successful!

📌 Best Practices

      • Take backup of remote state file before the activity.
      • Ensure the plan shows no resources to add, change, or destroy before applying the import.
      • For random keys, use terraform state list and terraform state show → match carefully.
      • Keep import.tf under version control.

📌 Step-by-Step Import Process (Final Flow)

      1. Make a local copy of the backend state (terraform state pull).
      2. Prepare import.tf file (in same directory as tfvars and state file).
      3. Run terraform plan (with local state) to see import plan (should be 0 to add, 0 to change, 0 to destroy, x to import)
      4. Run terraform apply with local state
      5. Run terraform plan again (with local state) → Should show “No changes. Your infrastructure matches the configuration.” (0 to add, 0 to change, 0 to destroy)
      6. Make sure a backup exists of the remote state file.
      7. Push local state file to remote backend → terraform state push terraform.tfstate.
      8. Run terraform plan with the default remote backend → Should show “No changes. Your infrastructure matches the configuration.”
      9. Delete local state file
      10. Delete import.tf file.

 

Terraform alias — Solving Multi-Subscription Deployment Challenges

 

In cloud environments, especially in Azure, infrastructure is often spread across multiple subscriptions for security and organizational reasons.

Hub and Spoke Topology is a classic pattern where:

    • The Hub VNet (shared resources, security services, DNS zones etc.) lives in its own subscription.
    • The Spoke VNets (application workloads) live in different subscriptions, each managing their own state files.

While everything works smoothly for independent deployments, the real problem starts when you need cross-subscription interactions, like:

    • VNet Peering between Hub and Spoke (both sides need peering objects)
    • Private Endpoint + Private DNS Zones (Private Endpoint in spoke, DNS zone in hub)

By default, Terraform executes operations only against a single provider configuration (in our case, a single Azure subscription).

So, how can we create resources in two different subscriptions at the same time from within the spoke configuration?

Answer → the Terraform provider alias.

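To make the idea concrete, a minimal sketch (subscription IDs, variable names, and resource names are all hypothetical): the default provider targets the spoke subscription, while an aliased provider targets the hub, so both sides of the peering can be created from one configuration.

```hcl
# Default provider → spoke subscription
provider "azurerm" {
  features {}
  subscription_id = var.spoke_subscription_id
}

# Aliased provider → hub subscription
provider "azurerm" {
  alias           = "hub"
  features {}
  subscription_id = var.hub_subscription_id
}

# Spoke-to-hub peering is created with the default provider...
resource "azurerm_virtual_network_peering" "spoke_to_hub" {
  name                      = "spoke-to-hub"
  resource_group_name       = var.spoke_rg
  virtual_network_name      = var.spoke_vnet_name
  remote_virtual_network_id = var.hub_vnet_id
}

# ...while the hub-to-spoke side is created in the hub subscription
# by pointing the resource at the aliased provider.
resource "azurerm_virtual_network_peering" "hub_to_spoke" {
  provider                  = azurerm.hub
  name                      = "hub-to-spoke"
  resource_group_name       = var.hub_rg
  virtual_network_name      = var.hub_vnet_name
  remote_virtual_network_id = var.spoke_vnet_id
}
```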