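Building VM images with Packer

What is Packer?

Packer is a command-line tool for building and managing virtual machine (VM) images. It creates an image from a template, provisions it, and finally publishes the result to a cloud service. Packer works with the major cloud providers, including AWS, Google Cloud, Azure, and DigitalOcean, so the same template-driven workflow can produce equivalent images for different providers.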

Key features of Packer:

Building VM images from templates
Customizing images by running provisioning scripts
Publishing machine images to cloud services

Installing Packer

Packer can be downloaded from the official site, or installed with a package manager such as Homebrew.

`brew install packer`

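Basic usage

A typical Packer workflow looks like this:

Write a template: written in JSON or HCL (HCL is the preferred format since Packer 1.7), it contains the instructions for building the VM image.
Run Packer: build the image with a command such as `packer build template.json`.
Deploy: the resulting image is published to the cloud service according to the settings in the template.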
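
The steps above can be sketched as a small shell wrapper. This is a minimal sketch, not an official workflow: `packer init`, `packer validate`, and `packer build` are real Packer subcommands, while `run`, `build_image`, and the `DRY_RUN` switch are hypothetical helpers introduced here, and `example.pkr.hcl` is an assumed file name. Note that `packer init` applies to HCL templates only.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# build_image TEMPLATE
# Runs the usual init -> validate -> build sequence and stops at the first failure.
build_image() {
  template="$1"
  # Install the plugins declared in the template's required_plugins block.
  run packer init "$template" || return 1
  # Check the template for syntax and configuration errors.
  run packer validate "$template" || return 1
  # Build the image.
  run packer build "$template"
}

# Usage: build_image example.pkr.hcl
```

Calling `DRY_RUN=1 build_image example.pkr.hcl` prints the three commands without running them, which is a cheap way to check the sequence before a real build.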

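Packer is useful for automating repetitive work such as setting up development and test environments. It keeps environments consistent, reduces differences between developers' setups, and makes bugs easier to track down.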

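GCE VM image example: a Deep Learning environment

Below is an example Packer template (HCL) and provisioning script that build a Deep Learning VM image. The script installs the GPU driver and pulls a container image. Installing the GPU driver requires the build VM (the machine Packer connects to over SSH) to have a GPU attached.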

example.pkr.hcl
packer {
  required_plugins {
    googlecompute = {
      source  = "github.com/hashicorp/googlecompute"
      version = "~> 1"
    }
  }
}

source "googlecompute" "basic-example" {
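  # Project in which the temporary build VM runs.
  project_id        = "working-project-id"
  # Project that receives the finished image.
  image_project_id  = "destination-project-id"
  source_image      = "c0-deeplearning-common-cu121-v20231105-debian-11"
  ssh_username      = "packer"
  zone              = "asia-northeast1-a"
  network           = "foo"
  subnetwork        = "bar"
  image_name        = "deep-learning-example-{{timestamp}}"
  image_family      = "deep-learning-example"
  disk_size         = 50
  preemptible       = true
  # The build VM has no external IP; Packer connects to its internal IP through an IAP tunnel.
  use_iap           = true
  use_internal_ip   = true
  omit_external_ip  = true
  image_description = "deep learning image"
  # A GPU must be attached so that the driver installer can run.
  # The project and zone in this path must match project_id and zone above.
  accelerator_type  = "projects/working-project-id/zones/asia-northeast1-a/acceleratorTypes/nvidia-tesla-t4"
  accelerator_count = 1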
}

build {
  sources = ["sources.googlecompute.basic-example"]
  provisioner "shell" {
    scripts = ["./foo.sh"]
    execute_command = "echo 'packer' | sudo -S env {{ .Vars }} {{ .Path }}"
  }
}

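If the script edits files that require root privileges, use sudo in the `execute_command` setting. (ref)

The example above connects over SSH through IAP to a machine that only has a private IP, so the account running Packer needs the IAP-secured Tunnel User role (`roles/iap.tunnelResourceAccessor`).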
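
Granting that role might look like the sketch below. It assumes the `gcloud` CLI is installed and authenticated; `run`, `grant_iap_access`, `allow_iap_ssh`, the `DRY_RUN` switch, and the firewall rule name `allow-ssh-from-iap` are hypothetical names used for illustration. The firewall function covers a related prerequisite not mentioned above: IAP TCP forwarding reaches VMs from the range 35.235.240.0/20, so SSH from that range has to be allowed on the network.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# grant_iap_access PROJECT MEMBER
# MEMBER uses IAM syntax, e.g. user:alice@example.com or serviceAccount:NAME@PROJECT.iam.gserviceaccount.com
grant_iap_access() {
  project="$1"
  member="$2"
  run gcloud projects add-iam-policy-binding "$project" \
    --member="$member" \
    --role="roles/iap.tunnelResourceAccessor"
}

# allow_iap_ssh PROJECT NETWORK
# Allows SSH (tcp:22) from the IAP TCP forwarding range into NETWORK.
allow_iap_ssh() {
  project="$1"
  network="$2"
  run gcloud compute firewall-rules create allow-ssh-from-iap \
    --project="$project" \
    --network="$network" \
    --direction=INGRESS \
    --allow=tcp:22 \
    --source-ranges=35.235.240.0/20
}

# Usage:
#   grant_iap_access working-project-id user:alice@example.com
#   allow_iap_ssh working-project-id foo
```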

foo.sh
#!/bin/bash
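# Install the NVIDIA GPU driver with the installer bundled in the Deep Learning VM image.
sudo /opt/deeplearning/install-driver.sh
# Write the Docker daemon config. With `sudo cat ... > file`, sudo applies only to cat
# and not to the redirection, so tee is used to write the file as root.
sudo tee /etc/docker/daemon.json > /dev/null << 'EOF'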
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    },
    "exec-opts": ["native.cgroupdriver=cgroupfs"],
    "log-driver": "gcplogs"
}
EOF

gcloud auth configure-docker asia-northeast1-docker.pkg.dev --quiet
docker pull \
    asia-northeast1-docker.pkg.dev/working-project-id/deep-learning-example/hello-world-gpu:latest

Running Packer from Terraform

With terraform-provider-packer, Packer can be run from Terraform.

Example

The following example, based on the official one, writes the value of a variable to a file inside the VM image.

main.tf
terraform {
  required_providers {
    packer = {
      source = "toowoxx/packer"
      version = "0.1.0"
    }
  }
}

provider "packer" {}

data "packer_version" "version" {}

data "packer_files" "files1" {
  file = "demo.pkr.hcl"
}

resource "packer_image" "image1" {
  file = data.packer_files.files1.file
  variables = {
    test_var1 = "test 1"
  }

  triggers = {
    packer_version = data.packer_version.version.version
    files_hash     = data.packer_files.files1.files_hash
  }
}

output "packer_version" {
  value = data.packer_version.version.version
}

output "build_uuid_1" {
  value = resource.packer_image.image1.build_uuid
}

output "file_hash_1" {
  value = data.packer_files.files1.files_hash
}

demo.pkr.hcl
variable "test_var1" {
  type = string
}

packer {
  required_plugins {
    googlecompute = {
      source  = "github.com/hashicorp/googlecompute"
      version = "~> 1"
    }
  }
}

source "googlecompute" "basic-example" {
  project_id = "myproject-123456"
  source_image = "ubuntu-2004-focal-v20231130"
  ssh_username = "packer"
  zone         = "asia-northeast1-a"
  network      = "mynetwork"
  subnetwork   = "mysubnet"
  image_name   = "demo-1"
  image_family = "demo"
  disk_size    = 15
  preemptible  = true
}

build {
  sources = ["sources.googlecompute.basic-example"]
  provisioner "shell" {
    env = {
      "test_var1" = var.test_var1
    }
    scripts = ["./demo.sh"]
    execute_command = "echo 'packer' | sudo -S env {{ .Vars }} {{ .Path }}"
  }
}

demo.sh
#!/bin/bash
# Write the value passed through the provisioner's env block into hello.txt.
cat <<EOF > hello.txt
$test_var1
EOF

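Terraform detects changes to the HCL file, but not to the shell script, so some extra work is needed. Using the `filesha256` function may make it possible to detect changes in dependent files (see: file hash). As for state management, the existing VM image is not deleted; a new image is created instead.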
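
As a shell-side alternative to `filesha256` (a different technique from the one named above), a single digest covering the template and its scripts can be computed outside Terraform and passed in as an input variable. The sketch below is an assumption-laden illustration: `provision_hash`, `sha256_stdin`, and the variable name `provision_hash` are hypothetical, and `main.tf` would need a matching `variable "provision_hash"` block referenced from `triggers`. `TF_VAR_<name>` is Terraform's standard way to set input variables from the environment.

```shell
#!/bin/sh
# Read stdin and print its SHA-256 line, using whichever tool is available.
sha256_stdin() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum
  else
    shasum -a 256
  fi
}

# provision_hash FILE...
# Prints one SHA-256 hex digest over the concatenated contents of all given files.
provision_hash() {
  cat "$@" | sha256_stdin | cut -d ' ' -f 1
}

# Usage (hypothetical variable name):
#   export TF_VAR_provision_hash="$(provision_hash demo.pkr.hcl demo.sh)"
#   terraform apply
```

Because the digest changes whenever any listed file changes, a trigger that references it would cause a rebuild after an edit to `demo.sh` as well as to the template.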

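Setup time comparison for a VM image with the GPU driver and container

Installing the GPU driver and pulling the container at VM startup

GPU driver setup: 3.5 min
Pull of the container image (10 GB after extraction): 3.5 min

Including the VM startup overhead, it takes about 9 minutes until the container runs.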

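Booting a VM with all dependencies baked in

Building a VM image that already contains the GPU driver and the container image removes the install and pull steps at startup and shortens boot time. The time from VM creation to the container running can be cut to about 1 to 2 minutes.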
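
Booting from the baked image could look like the following sketch. The project IDs, zone, image family, and accelerator type are taken from the template earlier on this page; the instance name, `run`, `create_gpu_vm`, and the `DRY_RUN` switch are hypothetical. `--maintenance-policy=TERMINATE` is included because GPU instances cannot live-migrate. Network flags are omitted and would need to match your environment.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# create_gpu_vm NAME
# Creates a T4 GPU instance from the newest image in the deep-learning-example family.
create_gpu_vm() {
  name="$1"
  run gcloud compute instances create "$name" \
    --project=working-project-id \
    --zone=asia-northeast1-a \
    --image-family=deep-learning-example \
    --image-project=destination-project-id \
    --accelerator=type=nvidia-tesla-t4,count=1 \
    --maintenance-policy=TERMINATE
}

# Usage: create_gpu_vm demo-vm
```

Referencing the image family rather than a specific image name means each new VM picks up the most recent build without changing the command.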

References

[Performance] Add Packer image generation scripts for GCP and AWS by yika-luo · Pull Request #4068 · skypilot-org/skypilot · GitHub