LlamaIndex on Vertex AI for RAG API

Retrieval-Augmented Generation (RAG) is a technique that enhances the capabilities of generative models, specifically, large language models (LLMs). It combines the power of LLMs with external knowledge sources, such as documents, and databases, to generate more accurate and informative responses.

For more information about how RAG works, see Retrieval-Augmented Generation Overview (RAG).

Supported models

Model Version
Gemini gemini-experimental
Gemini 1.0 Pro gemini-1.0-pro-001
gemini-1.0-pro-002
Gemini 1.0 Pro Vision gemini-1.0-pro-vision-001
Gemini 1.5 Pro (Preview) gemini-1.5-pro-preview-0514

Example syntax

Syntax to create a RAG corpus.

curl

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  http://${LOCATION}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION}/ragCorpora\
  -d '{
  "display_name" : "...",
  "description": ".."
}'

Python

corpus = rag.create_corpus(display_name=..., description=...)
print(corpus)

Parameters list

See examples for implementation details.

Corpus management

For information about a RAG corpus, see Corpus management.

Create RagCorpus

Parameters

display_name

Optional: string

The display name of the RagCorpus.

description

Optional: string

The description of the RagCorpus.

List RagCorpora

Parameters

page_size

Optional: int

The standard list page size.

page_token

Optional: string

The standard list page token. Typically obtained from [ListRagCorporaResponse.next_page_token][] of the previous [VertexRagDataService.ListRagCorpora][] call.

Get RagCorpus

Parameters

rag_corpus_id

string

The ID of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}

Delete RagCorpus

Parameters

rag_corpus_id

string

The ID of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}

Upload RagFile

Parameters

rag_corpus_id

string

The ID of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}

display_name

Optional: string

The display name of the RagCorpus.

description

Optional: string

The description of the RagCorpus.

File management

For information about a RAG file, see Corpus management.

Import RagFile

Parameters

rag_corpus_id

string

The ID of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}

gcs_source.uris

list

Cloud StorageStorage URI that contains the upload file

google_drive_source.resource_id

Optional: string

The type of the Google Drive resource.

google_drive_source.resource_ids.resource_type

Optional: string

The ID of the Google Drive resource.

rag_file_chunking_config.chunk_size

Optional: int

Number of tokens each chunk should have.

rag_file_chunking_config.chunk_overlap

Optional: int

Number of tokens overlap between two chunks.

List RagFiles

Parameters

rag_corpus_id

string

The ID of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus_id}

page_size

Optional: int

The standard list page size.

page_token

Optional: string

The standard list page token. Typically obtained from [ListRagCorporaResponse.next_page_token][] of the previous [VertexRagDataService.ListRagCorpora][]< call.

Get RagFile

Parameters

rag_file_id

string

The ID of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_file_id}

Delete RagFile

Parameters

rag_file_id

string

The ID of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_file_id}

Retrieval and prediction

Retrieval

Parameter Description
similarity_top_k Controls the maximum number of contexts that are retrieved.
vector_distance_threshold Only contexts with a distance smaller than the threshold are considered.

Prediction

Parameters

model_id

string

LLM model for content generation.

rag_corpora

string

The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}

text

string (list)

The text to LLM for content generation. Maximum value: 1 list.

vector_distance_threshold

Optional: double

Only contexts with a vector distance smaller than the threshold are returned.

similarity_top_k

Optional: int

The number of top contexts to retrieve.

Examples

Create a RAG corpus

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • CORPUS_DISPLAY_NAME: The display name of the RagCorpus.
  • CORPUS_DESCRIPTION: The description of the RagCorpus.

HTTP method and URL:

POST http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora

Request JSON body:

{
  "display_name" : "CORPUS_DISPLAY_NAME",
  "description": "CORPUS_DESCRIPTION"
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$headers = @{  }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora" | Select-Object -Expand Content
You should receive a successful status code (2xx).

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# display_name = "test_corpus"
# description = "Corpus Description"

# Initialize Vertex AI API once per session
vertexai.init(project=project_id, location="us-central1")

corpus = rag.create_corpus(display_name=display_name, description=description)
print(corpus)

List a RAG corpus

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • PAGE_SIZE: The standard list page size. You may adjust the number of RagCorpora to return per page by updating the page_size parameter.
  • PAGE_TOKEN: The standard list page token. Obtained typically using ListRagCorporaResponse.next_page_token of the previous VertexRagDataService.ListRagCorpora call.

HTTP method and URL:

GET http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
"http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN"

PowerShell

Execute the following command:

$headers = @{  }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN" | Select-Object -Expand Content
You should receive a successful status code (2xx) and a list of RagCorpora under the given PROJECT_ID.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"

# Initialize Vertex AI API once per session
vertexai.init(project=project_id, location="us-central1")

corpora = rag.list_corpora()
print(corpora)

Get a RAG corpus

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.

HTTP method and URL:

GET http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
"http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"

PowerShell

Execute the following command:

$headers = @{  }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID" | Select-Object -Expand Content
A successful response returns the RagCorpus resource.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# corpus_name = "projects/{project_id}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=project_id, location="us-central1")

corpus = rag.get_corpus(name=corpus_name)
print(corpus)

Delete a RAG corpus

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.

HTTP method and URL:

DELETE http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X DELETE \
"http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"

PowerShell

Execute the following command:

$headers = @{  }

Invoke-WebRequest `
-Method DELETE `
-Headers $headers `
-Uri "http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID" | Select-Object -Expand Content
A successful response returns the DeleteOperationMetadata.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# corpus_name = "projects/{project_id}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=project_id, location="us-central1")

rag.delete_corpus(name=corpus_name)
print(f"Corpus {corpus_name} deleted.")

Upload a RAG file

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • INPUT_FILE: The path of a local file.
  • FILE_DISPLAY_NAME: The display name of the RagFile.
  • RAG_FILE_DESCRIPTION: The description of the RagFile.

HTTP method and URL:

POST http://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload

Request JSON body:

{
 "rag_file": {
  "display_name": "FILE_DISPLAY_NAME",
  "description": "RAG_FILE_DESCRIPTION"
 }
}

To send your request, choose one of these options:

curl

Save the request body in a file named INPUT_FILE, and execute the following command:

curl -X POST \
-H "Content-Type: application/json; charset=utf-8" \
-d @INPUT_FILE \
"http://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload"

PowerShell

Save the request body in a file named INPUT_FILE, and execute the following command:

$headers = @{  }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile INPUT_FILE `
-Uri "http://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload" | Select-Object -Expand Content
A successful response returns the RagFile resource. The last component of the RagFile.name field is the server-generated rag_file_id.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# corpus_name = "projects/{project_id}/locations/us-central1/ragCorpora/{rag_corpus_id}"
# path = "path/to/local/file.txt"
# display_name = "file_display_name"
# description = "file description"

# Initialize Vertex AI API once per session
vertexai.init(project=project_id, location="us-central1")

rag_file = rag.upload_file(
    corpus_name=corpus_name,
    path=path,
    display_name=display_name,
    description=description,
)
print(rag_file)

Import RAG files

Files and folders can be imported from Google Drive or Cloud Storage.

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • GCS_URIS: A list of Cloud Storage locations. Example: gs://my-bucket1, gs://my-bucket2.
  • DRIVE_RESOURCE_ID: The ID of the Google Drive resource. Examples:
    • http://drive.go888ogle.com.fqhub.com/file/d/ABCDE
    • http://drive.go888ogle.com.fqhub.com/corp/drive/u/0/folders/ABCDEFG
  • DRIVE_RESOURCE_TYPE: Type of the Google Drive resource. Options:
    • RESOURCE_TYPE_FILE - File
    • RESOURCE_TYPE_FOLDER - Folder
  • CHUNK_SIZE: Optional: Number of tokens each chunk should have.
  • CHUNK_OVERLAP: Optional: Number of tokens overlap between chunks.

HTTP method and URL:

POST http://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import

Request JSON body:

{
  "import_rag_files_config": {
    "gcs_source": {
      "uris": GCS_URIS
    },
    "google_drive_source": {
      "resource_ids": {
        "resource_id": DRIVE_RESOURCE_ID,
        "resource_type": DRIVE_RESOURCE_TYPE
      },
    }
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"http://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$headers = @{  }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "http://LOCATION-aiplatform.googleapis.com/upload/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import" | Select-Object -Expand Content
A successful response returns the ImportRagFilesOperationMetadata resource.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# corpus_name = "projects/{project_id}/locations/us-central1/ragCorpora/{rag_corpus_id}"
# paths = ["http://drive.go888ogle.com.fqhub.com/file/123", "gs://my_bucket/my_files_dir"]  # Supports Google Cloud Storage and Google Drive Links

# Initialize Vertex AI API once per session
vertexai.init(project=project_id, location="us-central1")

response = rag.import_files(
    corpus_name=corpus_name,
    paths=paths,
    chunk_size=512,  # Optional
    chunk_overlap=100,  # Optional
)
print(f"Imported {response.imported_rag_files_count} files.")

Get a RAG file

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • RAG_FILE_ID: The ID of the RagFile resource.

HTTP method and URL:

GET http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
"http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"

PowerShell

Execute the following command:

$headers = @{  }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID" | Select-Object -Expand Content
A successful response returns the RagFile resource.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# file_name = "projects/{project_id}/locations/us-central1/ragCorpora/{rag_corpus_id}/ragFiles/{rag_file_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=project_id, location="us-central1")

rag_file = rag.get_file(name=file_name)
print(rag_file)

List RAG files

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • PAGE_SIZE: The standard list page size. You may adjust the number of RagFiles to return per page by updating the page_size parameter.
  • PAGE_TOKEN: The standard list page token. Obtained typically using ListRagFilesResponse.next_page_token of the previous VertexRagDataService.ListRagFiles call.

HTTP method and URL:

GET http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
"http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN"

PowerShell

Execute the following command:

$headers = @{  }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN" | Select-Object -Expand Content
You should receive a successful status code (2xx) along with a list of RagFiles under the given RAG_CORPUS_ID.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# corpus_name = "projects/{project_id}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=project_id, location="us-central1")

files = rag.list_files(corpus_name=corpus_name)
for file in files:
    print(file)

Delete a RAG file

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • RAG_CORPUS_ID: The ID of the RagCorpus resource.
  • RAG_FILE_ID: The ID of the RagFile resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file_id}.

HTTP method and URL:

DELETE http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X DELETE \
"http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"

PowerShell

Execute the following command:

$headers = @{  }

Invoke-WebRequest `
-Method DELETE `
-Headers $headers `
-Uri "http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID" | Select-Object -Expand Content
A successful response returns the DeleteOperationMetadata resource.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# file_name = "projects/{project_id}/locations/us-central1/ragCorpora/{rag_corpus_id}/ragFiles/{rag_file_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=project_id, location="us-central1")

rag.delete_file(name=file_name)
print(f"File {file_name} deleted.")

Retrieval query

When a user asks a question or provides a prompt, the retrieval component in RAG searches through its knowledge base to find information that is relevant to the query.

REST

Before using any of the request data, make the following replacements:

  • LOCATION: The region to process the request.
  • PROJECT_ID: Your project ID.
  • RAG_CORPUS_RESOURCE: The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}.
  • VECTOR_DISTANCE_THRESHOLD: Only contexts with a vector distance smaller than the threshold are returned.
  • TEXT: The query text to get relevant contexts.
  • SIMILARITY_TOP_K: The number of top contexts to retrieve.

HTTP method and URL:

POST http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts

Request JSON body:

{
 "vertex_rag_store": {
    "rag_resources": {
      "rag_corpus": "RAG_CORPUS_RESOURCE",
    },
    "vector_distance_threshold": 0.8
  },
  "query": {
   "text": "TEXT",
   "similarity_top_k": SIMILARITY_TOP_K
  }
 }

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$headers = @{  }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts" | Select-Object -Expand Content
You should receive a successful status code (2xx) and a list of related RagFiles.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# rag_corpora = ["9183965540115283968"] # Only one corpus is supported at this time
# text = "Your Query"

# Initialize Vertex AI API once per session
vertexai.init(project=project_id, location="us-central1")

response = rag.retrieval_query(
    rag_corpora=rag_corpora,
    text=text,
    similarity_top_k=10,  # Optional
)
print(response)

Prediction

A prediction controls the LLM method that generates content.

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request.
  • MODEL_ID: LLM model for content generation. Example: gemini-1.5-pro-preview-0514
  • GENERATION_METHOD: LLM method for content generation. Options: generateContent, streamGenerateContent
  • INPUT_PROMPT: The text sent to the LLM for content generation. Try to use a prompt relevant to the uploaded rag Files.
  • RAG_CORPUS_RESOURCE: The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}.
  • SIMILARITY_TOP_K: Optional: The number of top contexts to retrieve.
  • VECTOR_DISTANCE_THRESHOLD: Optional: Contexts with a vector distance smaller than the threshold are returned.

HTTP method and URL:

POST http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD

Request JSON body:

{
 "contents": {
  "role": "user",
  "parts": {
    "text": "INPUT_PROMPT"
  }
 },
 "tools": {
  "retrieval": {
   "disable_attribution": false,
   "vertex_rag_store": {
    "rag_resources": {
      "rag_corpus": "RAG_CORPUS_RESOURCE",
    },
    "similarity_top_k": SIMILARITY_TOP_K,
    "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
   }
  }
 }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$headers = @{  }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "http://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD" | Select-Object -Expand Content
A successful response returns the generated content with citations.

Python

To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.


from vertexai.preview import rag
from vertexai.preview.generative_models import GenerativeModel, Tool
import vertexai

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
# rag_corpora = ["9183965540115283968"] # Only one corpus is supported at this time

# Initialize Vertex AI API once per session
vertexai.init(project=project_id, location="us-central1")

rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_corpora=rag_corpora,
            similarity_top_k=3,  # Optional
        ),
    )
)

rag_model = GenerativeModel(
    model_name="gemini-1.0-pro-002", tools=[rag_retrieval_tool]
)
response = rag_model.generate_content("Why is the sky blue?")
print(response.text)

What's Next