Set up for Ray on Vertex AI

Before you start using Ray on Vertex AI, follow these steps to set up your Google project, the Vertex AI SDK for Python, and a VPC peering network:

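Follow the steps in "Set up a project and a development environment" to set up billing for your project, enable the Vertex AI API, and install the gcloud CLI.
Enable the Compute Engine API and the Service Networking API if they aren't already enabled.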
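Prerequisite: You should be familiar with developing programs using open source Ray.
Set up a VPC peering network and private services access to Vertex AI. To connect to the head node of the Ray cluster with the Vertex AI SDK for Python, the connecting environment (for example, a Compute Engine VM or a Vertex AI Workbench notebook) must be in the same peered VPC network as the cluster.
- If you use Ray on Vertex AI in the Google Cloud console, you can set up the private services access connection when you create the Ray cluster on Vertex AI.
The Ray on Vertex AI SDK for Python used here is a version of the Vertex AI SDK for Python that includes the Ray Client, the Ray BigQuery connector, Ray cluster management on Vertex AI, and predictions on Vertex AI.
- If you use Ray on Vertex AI in the Google Cloud console, a Colab Enterprise notebook guides you through installing the Vertex AI SDK for Python after you create the Ray cluster.
- If you use Ray on Vertex AI in Vertex AI Workbench or another interactive Python environment, install the Vertex AI SDK for Python: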
```
# The latest image in the Ray cluster includes Ray 2.9
# The latest supported Python version is Python 3.10.
$ pip install "google-cloud-aiplatform[ray]"
```
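  After you install the SDK, restart the kernel before you import packages.
  
  Note: If you use a Vertex AI Workbench notebook as your client environment and a Deep Learning VM as your machine image, TensorFlow Enterprise comes with Ray and the Vertex AI SDK for Python preinstalled.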
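
After restarting the kernel, one way to confirm what is installed is to read the package metadata. The following is a minimal sketch that uses only the Python standard library. The helper name `installed_versions` is hypothetical, and it assumes the distributions are named `google-cloud-aiplatform` (as in the install command above) and `ray` (the open source Ray package name on PyPI).

```python
from importlib import metadata


def installed_versions(packages=("google-cloud-aiplatform", "ray")):
    """Return a {distribution name: version string or None} mapping.

    A value of None means the distribution is not installed in the
    current Python environment.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

If either entry is reported as not installed, rerun the install command and restart the kernel again before importing.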
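The maximum total number of nodes (M) that a Ray cluster on Vertex AI can scale up to depends on the initial total number of nodes (N) that you set. After you create the Ray on Vertex AI cluster, you can scale the total number of nodes to any value between P and M, inclusive, where P is the number of pools in your cluster.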

Let f(x) = min(29, 32 - ceiling(log2(x))). Use the following formulas to make sure that you don't exceed the maximum number of nodes (M):
- f(2 * M) = f(2 * N)
- f(64 * M) = f(64 * N)
- f(max(32, 16 + M)) = f(max(32, 16 + N))
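
The three conditions can be checked programmatically. The sketch below is one reading of the formulas, not an official tool: it assumes that a target node count M is allowed for an initial count N when all three equalities hold, and that the upper limit is the largest such M at or above N. The function names (`ceil_log2`, `can_scale_to`, `max_total_nodes`) are hypothetical. It does not check the lower bound P (the number of pools).

```python
def ceil_log2(x: int) -> int:
    """Exact ceiling(log2(x)) for an integer x >= 1, without floating point."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    return (x - 1).bit_length()


def f(x: int) -> int:
    """f(x) = min(29, 32 - ceiling(log2(x)))."""
    return min(29, 32 - ceil_log2(x))


def can_scale_to(initial_nodes: int, target_nodes: int) -> bool:
    """True if target_nodes (M) satisfies all three equalities for initial_nodes (N)."""
    n, m = initial_nodes, target_nodes
    return (
        f(2 * m) == f(2 * n)
        and f(64 * m) == f(64 * n)
        and f(max(32, 16 + m)) == f(max(32, 16 + n))
    )


def max_total_nodes(initial_nodes: int) -> int:
    """Largest M >= N for which the three equalities still hold.

    f is non-increasing, so the valid values of M form one contiguous
    range starting at N, and a simple upward scan finds its end.
    """
    m = initial_nodes
    while can_scale_to(initial_nodes, m + 1):
        m += 1
    return m


if __name__ == "__main__":
    for n in (10, 17, 40):
        print(f"N={n}: M={max_total_nodes(n)}")
```

Under this reading, an initial size of 10 nodes gives M = 16, 17 nodes gives M = 32, and 40 nodes gives M = 48. Treat these as illustrations of the formulas rather than as documented quotas.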
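(Optional) If you plan to read data from BigQuery, create a new BigQuery dataset or use an existing one.
Note: If you run code on a Ray cluster on Vertex AI that interacts with Google services such as BigQuery, the Vertex AI Custom Code Service Agent is used for authentication.
(Optional) To reduce the risk of data exfiltration from Vertex AI, you can enable VPC Service Controls. For more information, see "VPC Service Controls with Vertex AI".

If you enable VPC Service Controls, you can't access resources outside the perimeter, such as files in a Cloud Storage bucket.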

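Supported locations

The feature availability table for custom model training lists the locations where Ray on Vertex AI is available.

What's next

Create a Ray cluster on Vertex AI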