When you create a persistent resource, the training service first finds resources from the Compute Engine resource pool based on the specifications you provided, and then provisions a long-running cluster for you. This page shows you how to create a persistent resource for running your custom training jobs by using the Vertex AI API or the Google Cloud CLI.
Create a persistent resource
Select one of the following tabs for instructions on how to create a persistent resource.
gcloud
A persistent resource can have one or more resource pools. To create multiple resource pools in a persistent resource, specify multiple --resource-pool-spec flags.
Each resource pool can have autoscaling either enabled or disabled. To enable autoscaling, specify min_replica_count and max_replica_count.
You can specify all resource pool configurations as part of the command line, or use the --config flag to specify the path to a YAML file that contains the configurations.
Before using any of the command data below, make the following replacements:
- PROJECT_ID: The Project ID of the Google Cloud project where you want to create the persistent resource.
- LOCATION: The region where you want to create the persistent resource. For a list of supported regions, see Feature availability.
- PERSISTENT_RESOURCE_ID: The ID of the persistent resource.
- DISPLAY_NAME: (Optional) The display name of the persistent resource.
- MACHINE_TYPE: The type of VM to use. For a list of supported VMs, see Machine types. This field corresponds to the machineSpec.machineType field in the ResourcePool API message.
- ACCELERATOR_TYPE: (Optional) The type of GPU to attach to each VM in the resource pool. For a list of supported GPUs, see GPUs. This field corresponds to the machineSpec.acceleratorType field in the ResourcePool API message.
- ACCELERATOR_COUNT: (Optional) The number of GPUs to attach to each VM in the resource pool. The default value is 1. This field corresponds to the machineSpec.acceleratorCount field in the ResourcePool API message.
- REPLICA_COUNT: The number of replicas to create when creating this resource pool. This field corresponds to the replicaCount field in the ResourcePool API message. This field is required if you're not specifying MIN_REPLICA_COUNT and MAX_REPLICA_COUNT.
- MIN_REPLICA_COUNT: (Optional) The minimum number of replicas that autoscaling can scale down to for this resource pool. Both MIN_REPLICA_COUNT and MAX_REPLICA_COUNT are required to enable autoscaling on this resource pool.
- MAX_REPLICA_COUNT: (Optional) The maximum number of replicas that autoscaling can scale up to for this resource pool. Both MIN_REPLICA_COUNT and MAX_REPLICA_COUNT are required to enable autoscaling on this resource pool.
- BOOT_DISK_TYPE: (Optional) The type of disk to use as the boot disk of each VM in the resource pool. This field corresponds to the diskSpec.bootDiskType field in the ResourcePool API message. Acceptable values include the following:
  - pd-standard (default)
  - pd-ssd
- BOOT_DISK_SIZE_GB: (Optional) The size in GiB of the boot disk of each VM in the resource pool. Acceptable values are 100 (default) to 64000. This field corresponds to the diskSpec.bootDiskSizeGb field in the ResourcePool API message.
- CONFIG: Path to the persistent resource YAML configuration file. This file should contain a list of ResourcePool messages. If an option is specified in both the configuration file and the command-line arguments, the command-line arguments override the configuration file. Note that keys with underscores are invalid.
Example YAML configuration file:
resourcePoolSpecs:
  machineSpec:
    machineType: n1-standard-4
  replicaCount: 1
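Because keys with underscores are invalid in the configuration file, it can help to scan a configuration for them before passing it to gcloud. The following is a minimal sketch, not part of any Google Cloud tooling: the helper name find_underscore_keys is hypothetical, and it assumes the YAML file has already been loaded into plain Python dicts and lists.

```python
from typing import Any, Iterator


def find_underscore_keys(node: Any, path: str = "") -> Iterator[str]:
    """Yield the dotted path of every mapping key that contains an underscore."""
    if isinstance(node, dict):
        for key, value in node.items():
            key_path = f"{path}.{key}" if path else str(key)
            if "_" in str(key):
                yield key_path
            # Keep descending so nested offenders are reported too.
            yield from find_underscore_keys(value, key_path)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_underscore_keys(item, f"{path}[{index}]")


# The example configuration above, expressed as a Python dict.
good_config = {
    "resourcePoolSpecs": {
        "machineSpec": {"machineType": "n1-standard-4"},
        "replicaCount": 1,
    }
}

# A hypothetical configuration that mixes in a snake_case key.
bad_config = {
    "resourcePoolSpecs": {
        "machineSpec": {"machine_type": "n1-standard-4"},
        "replicaCount": 1,
    }
}

print(list(find_underscore_keys(good_config)))  # []
print(list(find_underscore_keys(bad_config)))   # ['resourcePoolSpecs.machineSpec.machine_type']
```

An empty result only means that no key contains an underscore. It does not confirm that the keys are valid ResourcePool fields.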
Execute the following command:
Linux, macOS, or Cloud Shell
gcloud beta ai persistent-resources create \
    --persistent-resource-id=PERSISTENT_RESOURCE_ID \
    --display-name=DISPLAY_NAME \
    --project=PROJECT_ID \
    --region=LOCATION \
    --resource-pool-spec="replica-count=REPLICA_COUNT,min-replica-count=MIN_REPLICA_COUNT,max-replica-count=MAX_REPLICA_COUNT,machine-type=MACHINE_TYPE,accelerator-type=ACCELERATOR_TYPE,accelerator-count=ACCELERATOR_COUNT,disk-type=BOOT_DISK_TYPE,disk-size=BOOT_DISK_SIZE_GB"
Windows (PowerShell)
gcloud beta ai persistent-resources create `
    --persistent-resource-id=PERSISTENT_RESOURCE_ID `
    --display-name=DISPLAY_NAME `
    --project=PROJECT_ID `
    --region=LOCATION `
    --resource-pool-spec="replica-count=REPLICA_COUNT,min-replica-count=MIN_REPLICA_COUNT,max-replica-count=MAX_REPLICA_COUNT,machine-type=MACHINE_TYPE,accelerator-type=ACCELERATOR_TYPE,accelerator-count=ACCELERATOR_COUNT,disk-type=BOOT_DISK_TYPE,disk-size=BOOT_DISK_SIZE_GB"
Windows (cmd.exe)
gcloud beta ai persistent-resources create ^
    --persistent-resource-id=PERSISTENT_RESOURCE_ID ^
    --display-name=DISPLAY_NAME ^
    --project=PROJECT_ID ^
    --region=LOCATION ^
    --resource-pool-spec="replica-count=REPLICA_COUNT,min-replica-count=MIN_REPLICA_COUNT,max-replica-count=MAX_REPLICA_COUNT,machine-type=MACHINE_TYPE,accelerator-type=ACCELERATOR_TYPE,accelerator-count=ACCELERATOR_COUNT,disk-type=BOOT_DISK_TYPE,disk-size=BOOT_DISK_SIZE_GB"
You should receive a response similar to the following:
Using endpoint [https://us-central1-aiplatform.googleapis.com/]
Operation to create PersistentResource [projects/123456789012/locations/us-central1/persistentResources/mypersistentresource/operations/1234567890123456789] is submitted successfully.

You may view the status of your PersistentResource create operation with the command

  $ gcloud beta ai operations describe projects/sample-project/locations/us-central1/operations/1234567890123456789
Example gcloud command:
gcloud beta ai persistent-resources create \
    --persistent-resource-id=my-persistent-resource \
    --region=us-central1 \
    --resource-pool-spec="min-replica-count=4,max-replica-count=12,machine-type=n1-highmem-2,accelerator-type=NVIDIA_TESLA_K80,accelerator-count=1,disk-type=pd-standard,disk-size=200" \
    --resource-pool-spec="replica-count=4,machine-type=n1-standard-4"
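When resource pools are generated by a script, the value of each --resource-pool-spec flag can be assembled from named parameters. The following is a minimal sketch under stated assumptions: build_resource_pool_spec is a hypothetical helper, not part of gcloud, and it uses only the keys that appear in the commands above. It omits unset options and enforces the rule that a pool needs either a replica count or both autoscaling bounds.

```python
from typing import Optional


def build_resource_pool_spec(
    machine_type: str,
    replica_count: Optional[int] = None,
    min_replica_count: Optional[int] = None,
    max_replica_count: Optional[int] = None,
    accelerator_type: Optional[str] = None,
    accelerator_count: Optional[int] = None,
    disk_type: Optional[str] = None,
    disk_size: Optional[int] = None,
) -> str:
    """Return the comma-separated value for one --resource-pool-spec flag."""
    autoscaling = min_replica_count is not None or max_replica_count is not None
    if autoscaling and (min_replica_count is None or max_replica_count is None):
        raise ValueError("autoscaling needs both min_replica_count and max_replica_count")
    if not autoscaling and replica_count is None:
        raise ValueError("replica_count is required when autoscaling is not configured")

    # Same key order as the command template; unset options are skipped.
    fields = [
        ("replica-count", replica_count),
        ("min-replica-count", min_replica_count),
        ("max-replica-count", max_replica_count),
        ("machine-type", machine_type),
        ("accelerator-type", accelerator_type),
        ("accelerator-count", accelerator_count),
        ("disk-type", disk_type),
        ("disk-size", disk_size),
    ]
    return ",".join(f"{key}={value}" for key, value in fields if value is not None)


# Reproduces the second pool of the example command.
print(build_resource_pool_spec("n1-standard-4", replica_count=4))
# replica-count=4,machine-type=n1-standard-4
```

The returned string is meant to be passed as a single quoted argument, for example --resource-pool-spec="...". The sketch does not escape values, so it assumes that none of them contains a comma or an equals sign.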
Advanced gcloud configurations
If you want to specify configuration options that are not available in the preceding examples, you can use the --config flag to specify the path to a config.yaml file in your local environment that contains the fields of persistentResources. For example:
gcloud beta ai persistent-resources create \
    --persistent-resource-id=PERSISTENT_RESOURCE_ID \
    --project=PROJECT_ID \
    --region=LOCATION \
    --config=CONFIG
REST
A persistent resource can have one or more resource pools (machine_spec), and each resource pool can have autoscaling either enabled or disabled.
Before using any of the request data, make the following replacements:
- PROJECT_ID: The Project ID of the Google Cloud project where you want to create the persistent resource.
- LOCATION: The region where you want to create the persistent resource. For a list of supported regions, see Feature availability.
- PERSISTENT_RESOURCE_ID: The ID of the persistent resource.
- DISPLAY_NAME: (Optional) The display name of the persistent resource.
- MACHINE_TYPE: The type of VM to use. For a list of supported VMs, see Machine types. This field corresponds to the machineSpec.machineType field in the ResourcePool API message.
- ACCELERATOR_TYPE: (Optional) The type of GPU to attach to each VM in the resource pool. For a list of supported GPUs, see GPUs. This field corresponds to the machineSpec.acceleratorType field in the ResourcePool API message.
- ACCELERATOR_COUNT: (Optional) The number of GPUs to attach to each VM in the resource pool. The default value is 1. This field corresponds to the machineSpec.acceleratorCount field in the ResourcePool API message.
- REPLICA_COUNT: The number of replicas to create when creating this resource pool. This field corresponds to the replicaCount field in the ResourcePool API message. This field is required if you're not specifying MIN_REPLICA_COUNT and MAX_REPLICA_COUNT.
- MIN_REPLICA_COUNT: (Optional) The minimum number of replicas that autoscaling can scale down to for this resource pool. Both MIN_REPLICA_COUNT and MAX_REPLICA_COUNT are required to enable autoscaling on this resource pool.
- MAX_REPLICA_COUNT: (Optional) The maximum number of replicas that autoscaling can scale up to for this resource pool. Both MIN_REPLICA_COUNT and MAX_REPLICA_COUNT are required to enable autoscaling on this resource pool.
- BOOT_DISK_TYPE: (Optional) The type of disk to use as the boot disk of each VM in the resource pool. This field corresponds to the diskSpec.bootDiskType field in the ResourcePool API message. Acceptable values include the following:
  - pd-standard (default)
  - pd-ssd
- BOOT_DISK_SIZE_GB: (Optional) The size in GiB of the boot disk of each VM in the resource pool. Acceptable values are 100 (default) to 64000. This field corresponds to the diskSpec.bootDiskSizeGb field in the ResourcePool API message.
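The constraints in this list can be checked locally before a request is sent. The following is a minimal sketch, assuming the rules are exactly those stated above; check_resource_pool is a hypothetical helper, and the check that the minimum does not exceed the maximum is an added sanity check rather than a documented rule.

```python
from typing import List, Optional

# Boot disk size bounds in GiB, as stated in the list above.
MIN_BOOT_DISK_GB = 100
MAX_BOOT_DISK_GB = 64000


def check_resource_pool(
    replica_count: Optional[int] = None,
    min_replica_count: Optional[int] = None,
    max_replica_count: Optional[int] = None,
    boot_disk_size_gb: int = MIN_BOOT_DISK_GB,
) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems: List[str] = []
    has_min = min_replica_count is not None
    has_max = max_replica_count is not None

    if has_min != has_max:
        problems.append("set both MIN_REPLICA_COUNT and MAX_REPLICA_COUNT to enable autoscaling")
    elif has_min and has_max:
        # Assumption: a minimum above the maximum is treated as a mistake.
        if min_replica_count > max_replica_count:
            problems.append("MIN_REPLICA_COUNT must not exceed MAX_REPLICA_COUNT")
    elif replica_count is None:
        problems.append("REPLICA_COUNT is required when autoscaling is not configured")

    if not MIN_BOOT_DISK_GB <= boot_disk_size_gb <= MAX_BOOT_DISK_GB:
        problems.append(
            f"BOOT_DISK_SIZE_GB must be between {MIN_BOOT_DISK_GB} and {MAX_BOOT_DISK_GB}"
        )
    return problems


print(check_resource_pool(replica_count=1))      # []
print(check_resource_pool(min_replica_count=2))  # one problem: max is missing
```

The service remains the authority on what is valid. A clean result here only rules out the mistakes that this list describes.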
HTTP method and URL:
POST https://us-central1-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/persistentResources?persistent_resource_id=PERSISTENT_RESOURCE_ID
Request JSON body:
{
  "display_name": "DISPLAY_NAME",
  "resource_pools": [
    {
      "machine_spec": {
        "machine_type": "MACHINE_TYPE",
        "accelerator_type": "ACCELERATOR_TYPE",
        "accelerator_count": ACCELERATOR_COUNT
      },
      "replica_count": REPLICA_COUNT,
      "autoscaling_spec": {
        "min_replica_count": MIN_REPLICA_COUNT,
        "max_replica_count": MAX_REPLICA_COUNT
      },
      "disk_spec": {
        "boot_disk_type": "BOOT_DISK_TYPE",
        "boot_disk_size_gb": BOOT_DISK_SIZE_GB
      }
    }
  ]
}
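The request path and body can also be assembled programmatically. The following is a minimal sketch that uses only the Python standard library: build_create_request is a hypothetical helper, the field names are taken from the JSON body above, and optional fields are omitted when they are not set. It builds the request but does not send it, because authentication is outside the scope of this sketch.

```python
import json
from typing import Any, Dict, Optional, Tuple
from urllib.parse import urlencode


def build_create_request(
    project_id: str,
    location: str,
    persistent_resource_id: str,
    machine_type: str,
    replica_count: Optional[int] = None,
    min_replica_count: Optional[int] = None,
    max_replica_count: Optional[int] = None,
    display_name: Optional[str] = None,
) -> Tuple[str, Dict[str, Any]]:
    """Return (URL path with query string, JSON-serializable request body)."""
    pool: Dict[str, Any] = {"machine_spec": {"machine_type": machine_type}}
    if replica_count is not None:
        pool["replica_count"] = replica_count
    # Autoscaling is only described when both bounds are supplied.
    if min_replica_count is not None and max_replica_count is not None:
        pool["autoscaling_spec"] = {
            "min_replica_count": min_replica_count,
            "max_replica_count": max_replica_count,
        }

    body: Dict[str, Any] = {"resource_pools": [pool]}
    if display_name is not None:
        body["display_name"] = display_name

    query = urlencode({"persistent_resource_id": persistent_resource_id})
    path = f"/v1beta1/projects/{project_id}/locations/{location}/persistentResources?{query}"
    return path, body


# Illustrative values only; "my-project" and "my-persistent-resource" are placeholders.
path, body = build_create_request(
    "my-project", "us-central1", "my-persistent-resource", "n1-standard-4", replica_count=1
)
print(path)
print(json.dumps(body, indent=2))
```

Append the returned path to the API host shown in the HTTP method and URL above, and send the body as JSON in an authenticated POST request.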
You should receive a JSON response similar to the following:
{
  "name": "projects/123456789012/locations/us-central1/persistentResources/mypersistentresource/operations/1234567890123456789",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.v1beta1.CreatePersistentResourceOperationMetadata",
    "genericMetadata": {
      "createTime": "2023-02-08T21:17:15.009668Z",
      "updateTime": "2023-02-08T21:17:15.009668Z"
    }
  }
}
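The name field of the response identifies the long-running create operation. The following is a minimal sketch of extracting its components so that the operation can be looked up later. parse_operation_name is a hypothetical helper, and the pattern assumes that the name always has the shape shown in the sample response above.

```python
import json
import re
from typing import Dict

# Assumed shape, taken from the sample response:
# projects/*/locations/*/persistentResources/*/operations/*
OPERATION_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/persistentResources/(?P<persistent_resource>[^/]+)"
    r"/operations/(?P<operation>[^/]+)$"
)


def parse_operation_name(response_text: str) -> Dict[str, str]:
    """Split the operation name in a create response into its components."""
    name = json.loads(response_text)["name"]
    match = OPERATION_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected operation name: {name}")
    return match.groupdict()


sample_response = json.dumps(
    {
        "name": "projects/123456789012/locations/us-central1/"
        "persistentResources/mypersistentresource/operations/1234567890123456789"
    }
)
parts = parse_operation_name(sample_response)
print(parts["persistent_resource"], parts["operation"])
# mypersistentresource 1234567890123456789
```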
Resource stockout
Scarce resources such as A100 GPUs can be out of stock. If no capacity is available in the region that you specified, creation of the persistent resource fails. In this case, try reducing the number of replicas, changing to a different accelerator type, or trying again during non-peak hours.
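The fallback strategy described above can be sketched as a loop over progressively less demanding configurations. Everything in this sketch is an assumption made for illustration: StockoutError is a hypothetical exception, create stands for whatever function you use to submit the request and wait for the result, and fake_create only simulates a stockout so that the example can run without calling any API.

```python
from typing import Any, Callable, Dict, Iterable, Optional, Tuple

Spec = Dict[str, Any]


class StockoutError(Exception):
    """Hypothetical error raised by the caller's create function on a stockout."""


def create_with_fallbacks(
    create: Callable[[Spec], Any], candidates: Iterable[Spec]
) -> Optional[Tuple[Spec, Any]]:
    """Try each candidate in order; return the first (spec, result) that succeeds."""
    for spec in candidates:
        try:
            return spec, create(spec)
        except StockoutError:
            continue  # No capacity for this candidate; try the next one.
    return None


def fake_create(spec: Spec) -> str:
    # Stand-in for a real create call: pretend large A100 pools are out of stock.
    if spec["accelerator_type"] == "NVIDIA_TESLA_A100" and spec["replica_count"] > 2:
        raise StockoutError("no capacity")
    return "created"


# Ordered from most preferred to least preferred; the values are illustrative.
candidates = [
    {"accelerator_type": "NVIDIA_TESLA_A100", "replica_count": 8},
    {"accelerator_type": "NVIDIA_TESLA_A100", "replica_count": 2},
    {"accelerator_type": "NVIDIA_TESLA_T4", "replica_count": 8},
]
result = create_with_fallbacks(fake_create, candidates)
print(result)
```

Ordering the candidates from most to least preferred keeps the choice of trade-off, fewer replicas or a different accelerator, in your hands. Retrying during non-peak hours would be a separate scheduling concern that this sketch does not cover.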
What's next
- Run training jobs on a persistent resource.
- Learn about persistent resources.
- Get information about a persistent resource.
- Delete a persistent resource.