Set up for Ray on Vertex AI

Before you begin using Ray on Vertex AI, complete the following steps to set up your Google Cloud project, the Vertex AI SDK for Python, and a VPC peering network:

  1. Follow the steps at Set up a project and a development environment to set up billing for your project, enable the Vertex AI API, and install the gcloud CLI.

  2. If you haven't already, enable the Compute Engine API and the Service Networking API.

  3. Prerequisite: You should know how to develop programs using open source Ray.

  4. Set up a VPC peering network and a private services connection to access Vertex AI. To connect to the head node of the Ray cluster by using the Vertex AI SDK for Python, the connecting environment (for example, a Compute Engine VM or a Vertex AI Workbench notebook) must be on the same peered VPC network as the cluster.

  5. The Ray on Vertex AI SDK for Python used here is a version of the Vertex AI SDK for Python that includes the functionality of the Ray Client, Ray BigQuery connector, Ray cluster management on Vertex AI, and predictions on Vertex AI.

    • If you're using Ray on Vertex AI in the Google Cloud console, a Colab Enterprise notebook guides you through the Vertex AI SDK for Python installation process after you create a Ray cluster.

    • If you're using Ray on Vertex AI in Vertex AI Workbench or another interactive Python environment, install the Vertex AI SDK for Python:

      # The latest image in the Ray cluster includes Ray 2.9.
      # The latest supported Python version is Python 3.10.
      # The quotes keep shells such as zsh from expanding the brackets.
      $ pip install "google-cloud-aiplatform[ray]"

      After you install the SDK, restart the kernel before you import packages.

  6. The maximum total number of nodes (M) that you can scale a Ray cluster on Vertex AI up to depends on the initial total number of nodes (N) that you set up. After you create the cluster, you can scale the total number of nodes to any number between P and M inclusive, where P is the number of pools in your cluster.

    Assuming f(x) = min(29, 32 - ceiling(log2(x))), use the following formulas to make sure you don't exceed the maximum number of nodes (M):

    • f(2 * M) = f(2 * N)
    • f(64 * M) = f(64 * N)
    • f(max(32, 16 + M)) = f(max(32, 16 + N))

  7. (Optional) If you plan to read from BigQuery, create a new BigQuery dataset or use an existing one.

  8. (Optional) To mitigate the risk of data exfiltration from Vertex AI, you can enable VPC Service Controls. For more information, see VPC Service Controls with Vertex AI.

    If you enable VPC Service Controls, you won't be able to reach resources outside the perimeter, such as files in a Cloud Storage bucket.
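
The node-count limit described in step 6 can be checked with a short script. The following is a minimal sketch, not part of the Vertex AI SDK: the helper names `ceil_log2`, `f`, `can_scale_to`, and `max_nodes` are hypothetical, and the sketch assumes that all three formulas must hold at the same time for a target node count to be valid. It doesn't check the lower bound P (the number of pools).

```python
def ceil_log2(x: int) -> int:
    """Return ceiling(log2(x)) for a positive integer, using exact integer math."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    return (x - 1).bit_length()


def f(x: int) -> int:
    """f(x) = min(29, 32 - ceiling(log2(x))), as defined in step 6."""
    return min(29, 32 - ceil_log2(x))


def can_scale_to(n: int, m: int) -> bool:
    """Check the three formulas for initial node count n and target node count m."""
    return (
        f(2 * m) == f(2 * n)
        and f(64 * m) == f(64 * n)
        and f(max(32, 16 + m)) == f(max(32, 16 + n))
    )


def max_nodes(n: int) -> int:
    """Return the largest m >= n for which all three formulas still hold."""
    m = n
    # f never increases as x grows, so the valid targets form an unbroken
    # range starting at n; stop at the first target that fails.
    while can_scale_to(n, m + 1):
        m += 1
    return m


if __name__ == "__main__":
    print(max_nodes(5))  # prints 8
```

Under these assumptions, a cluster created with 5 nodes can scale up to 8 nodes, and a cluster created with 17 nodes can scale up to 32. If you expect to need more nodes later, choose the initial node count with this ceiling in mind.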

Supported locations

The Feature availability table for Custom model training lists the available locations for Ray on Vertex AI.

What's next