Reranking documentation (#25867)

aninibread · Oxyjun · Naapperas · web-flow · commit 960009f14a22 · 2025-10-28T11:46:27.000-04:00
* reranking documentation

* add parameters

* changelog and system prompt changes

* Apply suggestions from code review

* Update src/content/docs/ai-search/usage/rest-api.mdx

Co-authored-by: Nuno Pereira <nunoafonso2002@gmail.com>

* Update src/content/docs/ai-search/usage/rest-api.mdx

Co-authored-by: Nuno Pereira <nunoafonso2002@gmail.com>

---------

Co-authored-by: Jun Lee <junlee@cloudflare.com>
Co-authored-by: Nuno Pereira <nunoafonso2002@gmail.com>
diff --git a/src/content/changelog/ai-search/2025-10-27-ai-search-reranking-system-prompt.mdx b/src/content/changelog/ai-search/2025-10-27-ai-search-reranking-system-prompt.mdx
@@ -0,0 +1,54 @@
+---
+title: Reranking and API-based system prompt configuration in AI Search
+description: Improve result accuracy with reranking and dynamically control AI Search responses by setting system prompts in API requests.
+products:
+  - ai-search
+date: 2025-10-28
+---
+
+[AI Search](/ai-search/) now supports reranking for improved retrieval quality and allows you to set the system prompt directly in your API requests.
+
+## Rerank for more relevant results
+
+You can now enable [reranking](/ai-search/configuration/reranking/) to reorder retrieved documents based on their semantic relevance to the user’s query. Reranking helps improve accuracy, especially for large or noisy datasets where vector similarity alone may not produce the optimal ordering.
+
+You can enable and configure reranking in the dashboard or directly in your API requests:
+
+```javascript
+const answer = await env.AI.autorag("my-autorag").aiSearch({
+  query: "How do I train a llama to deliver coffee?",
+  model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
+  reranking: {
+    enabled: true,
+    model: "@cf/baai/bge-reranker-base"
+  }
+});
+```
+
+## Set system prompts in API
+
+Previously, [system prompts](/ai-search/configuration/system-prompt/) could only be configured in the dashboard. You can now define them directly in your API requests, giving you per-query control over behavior. For example:
+
+```javascript
+// Dynamically set query and system prompt in AI Search
+async function getAnswer(query, tone) {
+  const systemPrompt = `You are a ${tone} assistant.`;
+
+  const response = await env.AI.autorag("my-autorag").aiSearch({
+    query: query,
+    system_prompt: systemPrompt
+  });
+
+  return response;
+}
+
+// Example usage
+const query = "What is Cloudflare?";
+const tone = "friendly";
+
+const answer = await getAnswer(query, tone);
+console.log(answer);
+```
+
+Learn more about [Reranking](/ai-search/configuration/reranking/) and [System Prompt](/ai-search/configuration/system-prompt/) in AI Search.
+
diff --git a/src/content/docs/ai-search/configuration/index.mdx b/src/content/docs/ai-search/configuration/index.mdx
@@ -22,6 +22,7 @@ The table below lists all available configuration options:
 | [Query rewrite system prompt](/ai-search/configuration/system-prompt/)         | yes                     | Custom system prompt to guide query rewriting behavior                                     |
 | [Match threshold](/ai-search/configuration/retrieval-configuration/)           | yes                     | Minimum similarity score required for a vector match                                       |
 | [Maximum number of results](/ai-search/configuration/retrieval-configuration/) | yes                     | Maximum number of vector matches returned (`top_k`)                                        |
+| [Reranking](/ai-search/configuration/reranking/)                               | yes                     | Reorder retrieved results by semantic relevance using a reranking model after initial retrieval |
 | [Generation model](/ai-search/configuration/models/)                           | yes                     | Model used to generate the final response                                                  |
 | [Generation system prompt](/ai-search/configuration/system-prompt/)            | yes                     | Custom system prompt to guide response generation                                          |
 | [Similarity caching](/ai-search/configuration/cache/)                          | yes                     | Enable or disable caching of responses for similar (not just exact) prompts                |
diff --git a/src/content/docs/ai-search/configuration/models/supported-models.mdx b/src/content/docs/ai-search/configuration/models/supported-models.mdx
@@ -50,6 +50,11 @@ Production models are the actively supported and recommended models that are sta
 | **Workers AI** | `@cf/baai/bge-m3` | 1,024 | 512 | cosine |
 |  | `@cf/baai/bge-large-en-v1.5` | 1,024 | 512 | cosine |
 
+### Reranking
+| Provider | Alias | Input tokens |
+|---|---|---|
+| **Workers AI** | `@cf/baai/bge-reranker-base` | 512 |
+
 ## Transition models
 
 There are currently no models marked for end-of-life.
diff --git a/src/content/docs/ai-search/configuration/reranking.mdx b/src/content/docs/ai-search/configuration/reranking.mdx
@@ -0,0 +1,74 @@
+---
+pcx_content_type: concept
+title: Reranking
+sidebar:
+  order: 4
+---
+
+import { DashButton } from "~/components";
+
+Reranking can improve the quality of AI Search results by reordering retrieved documents based on semantic relevance to the user’s query. It applies a secondary model after retrieval to rerank the top results before they are returned.
+
+## How it works
+
+By default, reranking is **disabled** for all AI Search instances. You can enable it during creation or later from the settings page.
+
+When enabled, AI Search will:
+
+1. Retrieve a set of relevant results from your index, constrained by your `max_num_results` and `score_threshold` parameters.
+2. Pass those results through a [reranking model](/ai-search/configuration/models/supported-models/).
+3. Return the reranked results, which the text generation model can use for answer generation.
+
+Reranking helps improve accuracy, especially for large or noisy datasets where vector similarity alone may not produce the optimal ordering.
+
+## Configuration
+
+You can configure reranking in several ways:
+
+### Configure via API
+
+When you make a `/search` or `/ai-search` request using the [Workers Binding](/ai-search/usage/workers-binding/) or [REST API](/ai-search/usage/rest-api/), you can:
+
+- Enable or disable reranking per request
+- Specify the reranking model
+
+For example:
+
+```javascript
+const answer = await env.AI.autorag("my-autorag").aiSearch({
+  query: "How do I train a llama to deliver coffee?",
+  model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
+  reranking: {
+    enabled: true,
+    model: "@cf/baai/bge-reranker-base"
+  }
+});
+```
+
+### Configure in dashboard for new AI Search
+
+When creating a new AI Search instance in the dashboard:
+
+1. Go to **AI Search** in the Cloudflare dashboard.
+
+   <DashButton url="/?to=/:account/ai/ai-search" />
+   
+2. Select **Create** > **Get started**.
+3. In the **Retrieval configuration** step, open the **Reranking** dropdown.
+4. Toggle **Reranking** on.
+5. Select the reranking model.
+6. Complete your setup.
+
+### Configure in dashboard for existing AI Search
+
+To update reranking for an existing instance:
+
+1. Go to **AI Search** in the Cloudflare dashboard.
+
+   <DashButton url="/?to=/:account/ai/ai-search" />
+
+2. Select an existing AI Search instance.
+3. Go to the **Settings** tab.
+4. Under **Reranking**, toggle reranking on.
+5. Select the reranking model.
+
diff --git a/src/content/docs/ai-search/configuration/system-prompt.mdx b/src/content/docs/ai-search/configuration/system-prompt.mdx
@@ -21,13 +21,7 @@ System prompts are particularly useful for:
 - Applying domain-specific tone or terminology
 - Encouraging consistent, high-quality output
 
-## How to set your system prompt
-
-The system prompt for your AI Search can be set after it has been created by:
-
-1. Navigating to the [Cloudflare dashboard](https://dash.cloudflare.com/?to=/:account/ai/autorag), and go to AI > AI Search
-2. Select your AI Search
-3. Go to Settings page and find the System prompt setting for either Query rewrite or Generation
+## System prompt configuration
 
 ### Default system prompt
 
@@ -39,6 +33,31 @@ You can view the effective system prompt used for any AI Search's model call thr
 The default system prompt can change and evolve over time to improve performance and quality.
 :::
 
+### Configure via API
+
+When you make an `/ai-search` request using the [Workers Binding](/ai-search/usage/workers-binding/) or [REST API](/ai-search/usage/rest-api/), you can set the system prompt programmatically.
+
+For example:
+
+```javascript
+const answer = await env.AI.autorag("my-autorag").aiSearch({
+  query: "How do I train a llama to deliver coffee?",
+  model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
+  system_prompt: "You are a helpful assistant."
+});
+```
+
+import { DashButton } from "~/components";
+
+### Configure via dashboard
+The system prompt for your AI Search can be set after it has been created:
+
+1. Go to **AI Search** in the Cloudflare dashboard.
+   <DashButton url="/?to=/:account/ai/ai-search" />
+2. Select an existing AI Search instance.
+3. Go to the **Settings** tab.
+4. Go to **Query rewrite** or **Generation**, and edit the **System prompt**.
+
 ## Query rewriting system prompt
 
 If query rewriting is enabled, you can provide a custom system prompt to control how the model rewrites user queries. In this step, the model receives:
diff --git a/src/content/docs/ai-search/usage/rest-api.mdx b/src/content/docs/ai-search/usage/rest-api.mdx
@@ -52,7 +52,11 @@ curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai-search/rags/{
 	"rewrite_query": false,
 	"max_num_results": 10,
 	"ranking_options": {
 		"score_threshold": 0.3
+	},
+	"reranking": {
+		"enabled": true,
+		"model": "@cf/baai/bge-reranker-base"
 	},
 	"stream": true,
 }'
@@ -89,9 +93,13 @@ curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai-search/rags/{
 	"rewrite_query": true,
 	"max_num_results": 10,
 	"ranking_options": {
 		"score_threshold": 0.3
 	},
+	"reranking": {
+		"enabled": true,
+		"model": "@cf/baai/bge-reranker-base"
+	}
 }'
 
 ```
 
diff --git a/src/content/docs/ai-search/usage/workers-binding.mdx b/src/content/docs/ai-search/usage/workers-binding.mdx
@@ -46,8 +46,12 @@ const answer = await env.AI.autorag("my-autorag").aiSearch({
 	rewrite_query: true,
 	max_num_results: 2,
 	ranking_options: {
-		score_threshold: 0.3,
+		score_threshold: 0.3
 	},
+	reranking: {
+		enabled: true,
+		model: "@cf/baai/bge-reranker-base"
+	},
 	stream: true,
 });
 ```
@@ -115,8 +119,12 @@ const answer = await env.AI.autorag("my-autorag").search({
 	rewrite_query: true,
 	max_num_results: 2,
 	ranking_options: {
-		score_threshold: 0.3,
+		score_threshold: 0.3
 	},
+	reranking: {
+		enabled: true,
+		model: "@cf/baai/bge-reranker-base"
+	}
 });
 ```
 
diff --git a/src/content/docs/r2/api/tokens.mdx b/src/content/docs/r2/api/tokens.mdx
@@ -20,7 +20,7 @@ To create an API token:
 1. In the Cloudflare dashboard, go to the **R2 object storage** page.
 
    <DashButton url="/?to=/:account/r2/overview" />
-2. Select **Manage API tokens**.
+2. Select **Manage in API tokens**.
 3. Choose to create either:
    - **Create Account API token** - These tokens are tied to the Cloudflare account itself and can be used by any authorized system or user. Only users with the Super Administrator role can view or create them. These tokens remain valid until manually revoked.
    - **Create User API token** - These tokens are tied to your individual Cloudflare user. They inherit your personal permissions and become inactive if your user is removed from the account.
diff --git a/src/content/partials/ai-search/ai-search-api-params.mdx b/src/content/partials/ai-search/ai-search-api-params.mdx
@@ -12,6 +12,10 @@ The input query.
 
 The text-generation model that is used to generate the response for the query. For a list of valid options, check the AI Search Generation model Settings. Defaults to the generation model selected in the AI Search Settings.
 
+`system_prompt` <Type text="string" /> <MetaInfo text="optional" />
+
+The system prompt for generating the answer.
+
 `rewrite_query` <Type text="boolean" /> <MetaInfo text="optional" />
 
 Rewrites the original query into a search optimized query to improve retrieval accuracy. Defaults to `false`.
@@ -27,6 +31,16 @@ Configurations for customizing result ranking. Defaults to `{}`.
 - `score_threshold` <Type text="number" /> <MetaInfo text="optional" />
   - The minimum match score required for a result to be considered a match. Defaults to `0`. Must be between `0` and `1`.
 
+`reranking` <Type text="object" /> <MetaInfo text="optional" />
+
+Configurations for customizing reranking. Defaults to `{}`.
+
+- `enabled` <Type text="boolean" /> <MetaInfo text="optional" />
+  - Enables or disables reranking, which reorders retrieved results based on semantic relevance using a reranking model. Defaults to `false`.
+
+- `model` <Type text="string" /> <MetaInfo text="optional" />
+  - The reranking model to use when reranking is enabled.
+
 `stream` <Type text="boolean" /> <MetaInfo text="optional" />
 
 Returns a stream of results as they are available. Defaults to `false`.
diff --git a/src/content/partials/ai-search/search-api-params.mdx b/src/content/partials/ai-search/search-api-params.mdx
@@ -23,6 +23,16 @@ Configurations for customizing result ranking. Defaults to `{}`.
 - `score_threshold` <Type text="number" /> <MetaInfo text="optional" />
   - The minimum match score required for a result to be considered a match. Defaults to `0`. Must be between `0` and `1`.
 
+`reranking` <Type text="object" /> <MetaInfo text="optional" />
+
+Configurations for customizing reranking. Defaults to `{}`.
+
+- `enabled` <Type text="boolean" /> <MetaInfo text="optional" />
+  - Enables or disables reranking, which reorders retrieved results based on semantic relevance using a reranking model. Defaults to `false`.
+
+- `model` <Type text="string" /> <MetaInfo text="optional" />
+  - The reranking model to use when reranking is enabled.
+
 `filters` <Type text="object" /> <MetaInfo text="optional" />
 
 Narrow down search results based on metadata, like folder and date, so only relevant content is retrieved. For more details, refer to [Metadata filtering](/ai-search/configuration/metadata).