===/overview===
# Welcome to xAI Documentation
Build with Grok, the AI model designed to deliver truthful, insightful answers.
***
## Get started
## Quick start with the API
===/console/billing===
#### Key Information
# Manage Billing
**Ensure you are in the desired team before changing billing information. When you save the billing information or make a purchase for the first time, the billing information is saved to the team you are in and shared with its members.**
There are two ways of billing:
* **Prepaid credits:** You can pre-purchase credits for your team. Your API consumption will be deducted from your remaining prepaid credits.
* **Monthly invoiced billing:** When you don't have prepaid credits available, xAI will generate a monthly invoice based on your API consumption. xAI will charge your default payment method with the invoiced amount at the end of each month.
**Monthly invoiced billing is disabled by default, with a default Invoiced Spending Limit of $0.** This means your service will be disrupted once you have consumed all of your prepaid credits. To enable monthly invoiced billing, set an Invoiced Spending Limit higher than $0 at [Billing -> API Credits](https://console.x.ai/team/default/billing) on xAI Console.
Your API consumption will be accounted for in the following order:
* Free/Promotional credits
* Prepaid credits
* Monthly invoiced billing (if Invoiced Spending Limit > $0)
**Any prepaid credits and added payment method will be made available to the team you made the purchase in.**
## Prepaid credits
You can only purchase prepaid credits with Guest Checkout at the moment, due to regulatory
requirements.
This is the most common way to build with xAI API. Before using the API, you purchase a given amount of credits. When you use the API, xAI will track your consumption and deduct the amount from the credits available in your account.
You can add prepaid credits on the xAI Console [Billing -> API Credits](https://console.x.ai/team/default/billing) page.
On the same page, you can view your remaining prepaid credits, enter a promo code, and see any free credits granted by the xAI team.
Note: When you make a purchase via bank transfer instead of credit card, the payment will take 2-3 business days to process. You will be granted credits after the payment has completed.
## Monthly invoiced billing and invoiced billing limit
Enterprise customers might find it beneficial to enroll in monthly invoiced billing to avoid disruption to their services.
When you have set a **$0 invoiced billing limit** (default), xAI will only use your available prepaid credits. **Your API requests will be automatically rejected once your prepaid credits are depleted.**
If you want to use monthly billing, you can **increase your invoiced billing limit** on [Billing -> API Credits](https://console.x.ai/team/default/billing) page. xAI will attempt to use your prepaid credits first, and the remaining amount will be charged to your default payment method at the end of the month. This ensures you won't experience interruption while consuming the API.
Once your monthly invoiced billing amount has reached the invoiced billing limit, you won't be able to get responses until you raise the invoiced billing limit.
## Saving payment method
When you make a purchase, we automatically keep your payment method on file to make your next purchase easier. You can also manually add a payment method on xAI Console [Billing -> Billing details -> Add Payment Information](https://console.x.ai/team/default/billing).
Currently, we don't allow users to remove the last payment method on file. This might change in the future.
## Invoices
You can view your invoices for prepaid credits and monthly invoices on [Billing -> Invoices](https://console.x.ai/team/default/billing/invoices).
## Billing address and tax information
Enter your billing information carefully, as it will appear on your invoices. We are not able to
regenerate the invoices at the moment.
Your billing address and tax information will be displayed on the invoice. On [Billing -> Payment](https://console.x.ai/team/default/billing), you can also add/change your billing address. When you add/change billing address, you can optionally add your organization's tax information.
===/console/collections===
#### Guides
# Using Collections in Console
This guide walks you through managing collections using the [xAI Console](https://console.x.ai) interface.
## Creating a new collection
Navigate to the **Collections** tab in the [xAI Console](https://console.x.ai). Make sure you are in the correct team.
Click on "Create new collection" to create a new `collection`.
You can choose whether to generate embeddings on document upload. We recommend leaving the generate embeddings setting on.
## Viewing and editing collection configuration
You can view and edit the Collection's configuration by clicking on Edit Collection.
This opens up the following modal where you can view the configuration and make changes.
## Adding a document to the collection
Once you have created the new `collection`, you can click on it in the collections table to view the `documents` included in the `collection`.
Click on "Upload document" to upload a new `document`.
Once the upload has completed, each document is given a File ID. You can view the File ID, Collection ID and hash of the `document` by clicking on the `document` in the documents table.
## Deleting documents and collections
You can delete `documents` and `collections` by clicking on the more button on the right side of the collections or documents table.
===/console/faq/accounts===
#### FAQ
# Accounts
## How do I create an account for the API?
You can create an account at https://accounts.x.ai, or https://console.x.ai. To link your X account automatically to
your xAI account, choose to sign up with X account.
You can create multiple accounts with different sign-in methods under the same email.
When you sign up with a new sign-in method using the same email, we will prompt you to either
create a new account or link to the existing account. We are not able to merge the content,
subscriptions, etc. of different accounts.
## How do I update my xAI account email?
You can visit [xAI Accounts](https://accounts.x.ai). On the Account page, you can update your email.
## How do I add other sign-in methods?
Once you have signed-up for an account, you can add additional sign-in methods by going to [xAI Accounts](https://accounts.x.ai).
## I've forgotten my Multi-Factor Authentication (MFA) method, can you remove it?
You can generate your recovery codes at [xAI Accounts](https://accounts.x.ai) Security page.
Due to security considerations, we can't remove or reset your MFA method unless you have recovery codes. Please reach out to support@x.ai if you would like to delete the account instead.
## If I already have an account for Grok, can I use the same account for API access?
Yes, the account is shared between Grok and xAI API. You can manage the sign-in details at https://accounts.x.ai.
However, the billing is separate for Grok and xAI API. You can manage your billing for xAI API on [xAI Console](https://console.x.ai).
To manage billing for Grok, visit https://grok.com -> Settings -> Billing, or directly with Apple/Google if you made the
purchase via Apple App Store or Google Play.
## How do I manage my account?
You can visit [xAI Accounts](https://accounts.x.ai) to manage your account.
Please note the xAI account is different from the X account, and xAI cannot assist you with X account issues. Please
contact X via [X Help Center](https://help.x.com/) or Premium Support if you encounter any issues with your X account.
## I received an email of someone logging into my xAI account
xAI will send an email to you when someone logs into your xAI account. The login location is an approximation based on your IP address, which is dependent on your network setup and ISP and might not reflect exactly where the login happened.
If you think the login is not you, please [reset your password](https://accounts.x.ai/request-reset-password) and [clear your login sessions](https://accounts.x.ai/sessions). We also recommend all users to [add a multi-factor authentication method](https://accounts.x.ai/security).
## How do I delete my xAI account?
We are sorry to see you go!
You can visit [xAI Accounts](https://accounts.x.ai/account) to delete your account. You can cancel the deletion within 30 days by logging in again to any xAI website and following the prompt to confirm restoring the account.
For privacy requests, please go to: https://privacy.x.ai.
===/console/faq/billing===
#### FAQ
# Billing
## I'm having payment issues with an Indian payment card
Unfortunately we cannot process Indian payment cards for our API service. We are working toward supporting it but you might want to consider using a third-party API in the meantime. As Grok Website and Apps' payments are handled differently, those are not affected.
## When will I be charged?
* Prepaid Credits: If you choose to use prepaid credits, you’ll be charged when you buy them. These credits will be assigned to the team you select during purchase.
* Monthly Invoiced Billing: If you set your [invoiced spending limit](/console/billing#monthly-invoiced-billing-and-invoiced-billing-limit) above $0, any usage beyond your prepaid credits will be charged at the end of the month.
* API Usage: When you make API requests, the cost is calculated immediately. The amount is either deducted from your available prepaid credits or added to your monthly invoice if credits are exhausted.
Check out [Billing](/console/billing) for more information.
## Can you retroactively generate an invoice with new billing information?
We are unable to retroactively generate an invoice. Please ensure your billing information is correct on [xAI Console](https://console.x.ai) Billing -> Payment.
## Can prepaid API credits be refunded?
Unfortunately, we are not able to offer refunds on any prepaid credit purchase except in regions where required by law. For details, please visit https://x.ai/legal/terms-of-service-enterprise.
## My prompt token consumption from the API is different from the token count I get from the xAI Console Tokenizer or tokenize text endpoint
The inference endpoints add pre-defined tokens to help us process the request, and these tokens count toward your total prompt token consumption. For more information, see
[Estimating consumption with tokenizer on xAI Console or through API](/developers/rate-limits#estimating-consumption-with-tokenizer-on-xai-console-or-through-api).
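You can compare the two counts yourself. Below is a minimal sketch; it assumes the tokenize text endpoint is `POST /v1/tokenize-text` and returns a `token_ids` list, so adjust to the actual schema in the API Reference.
```pythonRequests
import os

import requests

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.getenv('XAI_API_KEY')}",
}
prompt = "Why is the sky blue?"

# Count tokens for the raw prompt text via the tokenize text endpoint
tokenize = requests.post(
    "https://api.x.ai/v1/tokenize-text",
    headers=headers,
    json={"text": prompt, "model": "grok-4-1-fast-reasoning"},
)
raw_count = len(tokenize.json()["token_ids"])

# Send the same text as a chat completion and read the billed prompt tokens
completion = requests.post(
    "https://api.x.ai/v1/chat/completions",
    headers=headers,
    json={"messages": [{"role": "user", "content": prompt}], "model": "grok-4-1-fast-reasoning"},
)
billed_count = completion.json()["usage"]["prompt_tokens"]

# billed_count is typically slightly higher because of the pre-defined tokens
print(f"Tokenizer count: {raw_count}, billed prompt tokens: {billed_count}")
```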
===/console/faq/security===
#### FAQ
# Security
## Does xAI train on customers' API requests?
xAI never trains on your API inputs or outputs without your explicit permission.
API requests and responses are temporarily stored on our servers for 30 days in case they need to be audited for potential abuse or misuse. This data is automatically deleted after 30 days.
## Is the xAI API HIPAA compliant?
To inquire about a Business Associate Agreement (BAA), please complete our [BAA Questionnaire](https://forms.gle/YAEdX3XUp6MvdEXW9). A member of our team will review your responses and reach out with next steps.
## Is xAI GDPR and SOC II compliant?
We are SOC 2 Type 2 compliant. Customers with a signed NDA can refer to our [Trust Center](https://trust.x.ai/) for up-to-date information on our certifications and data governance.
## Do you have Audit Logs?
Team admins are able to view an audit log of user interactions. This lists all of the user interactions with our API server. You can view it at [xAI Console -> Audit Log](https://console.x.ai/team/default/audit).
The admin can also search by Event ID, Description, or User to filter the results shown (for example, filtering by description matching `ListApiKeys`), and can view the audit log across a range of dates with the time filter.
## How can I securely manage my API keys?
Treat your xAI API keys as sensitive information, like passwords or credit card details. Do not share keys between teammates to avoid unauthorized access. Store keys securely using environment variables or secret management tools. Avoid committing keys to public repositories or source code.
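For example, a minimal way to keep a key out of source code is to read it from an environment variable at runtime:
```pythonXAI
import os

from xai_sdk import Client

# Read the key from the environment instead of hardcoding it in source code
client = Client(api_key=os.getenv("XAI_API_KEY"))
```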
Rotate keys regularly for added security. If you suspect a compromise:
1. Log into the xAI Console and make sure you are viewing the correct team, as API keys are tied to specific teams.
2. Navigate to the "API Keys" section via the sidebar.
3. In the API Keys table, click the vertical ellipsis (three dots) next to the key, then select "Disable key" to deactivate it temporarily or "Delete key" to remove it permanently.
4. Click the "Create API Key" button to generate a new key and update your applications.
xAI partners with GitHub's Secret Scanning program to detect leaked keys. If a leak is found, we disable the key and notify you via email. Monitor your account for unusual activity to stay protected.
===/console/usage===
#### Key Information
# Usage Explorer
As a team admin, you might want to monitor API consumption, either to track spending or to detect anomalies. xAI Console provides an easy-to-use [Usage Explorer](https://console.x.ai/team/default/usage) for team admins to track API usage across API keys, models, and more.
## Basic usage
The [Usage Explorer](https://console.x.ai/team/default/usage) page provides intuitive dropdown menus for customizing how you view consumption.
For example, you can view your daily credit consumption with `Granularity: Daily`:
By default, usage is measured by cost in USD. You can select `Dimension: Tokens` or `Dimension: Billing items` to change the dimension to token count or billing item count.
You can also view usage with grouping, which makes it easy to compare consumption across groups. In this case, we are comparing consumption across test and production API keys, so we select `Group by: API Key`:
## Filters
The basic view should suffice if you only need general information. However, you can also use filters to display information conditionally.
The filters dropdown gives you the options to filter by a particular API key, a model, a request IP, a cluster, or the token type.
===/developers/advanced-api-usage/async===
#### Advanced API Usage
# Asynchronous Requests
When working with the xAI API, you may need to process hundreds or even thousands of requests. Sending these requests sequentially can be extremely time-consuming.
To improve efficiency, you can use `AsyncClient` from `xai_sdk` or `AsyncOpenAI` from `openai`, which allow you to send multiple requests concurrently. The example below is a Python script demonstrating how to use `AsyncClient` to batch and process requests asynchronously, significantly reducing the overall execution time:
You can also use our Batch API to queue the requests and fetch them later. Please visit [Batch API](/developers/advanced-api-usage/batch-api) for more information.
## Rate Limits
Adjust the `max_concurrent` parameter to control the maximum number of parallel requests.
You cannot run concurrent requests beyond the rate limits shown in the xAI Console.
```pythonXAI
import asyncio
import os

from xai_sdk import AsyncClient
from xai_sdk.chat import Response, user


async def main():
    client = AsyncClient(
        api_key=os.getenv("XAI_API_KEY"),
        timeout=3600,  # Override default timeout with longer timeout for reasoning models
    )

    model = "grok-4-1-fast-reasoning"

    requests = [
        "Tell me a joke",
        "Write a funny haiku",
        "Generate a funny X post",
        "Say something unhinged",
    ]

    # Define a semaphore to limit concurrent requests (e.g., max 2 concurrent requests at a time)
    max_in_flight_requests = 2
    semaphore = asyncio.Semaphore(max_in_flight_requests)

    async def process_request(request) -> Response:
        async with semaphore:
            print(f"Processing request: {request}")
            chat = client.chat.create(model=model, max_tokens=100)
            chat.append(user(request))
            return await chat.sample()

    tasks = []
    for request in requests:
        tasks.append(process_request(request))

    responses = await asyncio.gather(*tasks)

    for i, response in enumerate(responses):
        print(f"Total tokens used for response {i}: {response.usage.total_tokens}")


if __name__ == "__main__":
    asyncio.run(main())
```
```pythonOpenAISDK
import asyncio
import os
from asyncio import Semaphore

import httpx
from openai import AsyncOpenAI

client = AsyncOpenAI(
    api_key=os.getenv("XAI_API_KEY"),
    base_url="https://api.x.ai/v1",
    timeout=httpx.Timeout(3600.0),  # Override default timeout with longer timeout for reasoning models
)


async def send_request(sem: Semaphore, request: str) -> dict:
    """Send a single request to xAI with semaphore control."""
    # The 'async with sem' ensures only a limited number of requests run at once
    async with sem:
        return await client.chat.completions.create(
            model="grok-4-1-fast-reasoning",
            messages=[{"role": "user", "content": request}],
        )


async def process_requests(requests: list[str], max_concurrent: int = 2) -> list[dict]:
    """Process multiple requests with controlled concurrency."""
    # Create a semaphore that limits how many requests can run at the same time.
    # Think of it like having only 2 "passes" to make requests simultaneously.
    sem = Semaphore(max_concurrent)

    # Create a list of tasks (requests) that will run using the semaphore
    tasks = [send_request(sem, request) for request in requests]

    # asyncio.gather runs all tasks concurrently but respects the semaphore limit.
    # It waits for all tasks to complete and returns their results.
    return await asyncio.gather(*tasks)


async def main() -> None:
    """Main function to handle requests and display responses."""
    requests = [
        "Tell me a joke",
        "Write a funny haiku",
        "Generate a funny X post",
        "Say something unhinged",
    ]

    # This starts processing all requests asynchronously, but only 2 at a time.
    # Instead of waiting for each request to finish before starting the next,
    # we have 2 requests running at once, making it faster overall.
    responses = await process_requests(requests)

    # Print each response in order
    for i, response in enumerate(responses):
        print(f"# Response {i}:")
        print(response.choices[0].message.content)


if __name__ == "__main__":
    asyncio.run(main())
```
===/developers/advanced-api-usage/batch-api===
#### Advanced API Usage
# Batch API
The Batch API lets you process large volumes of requests asynchronously with reduced pricing and higher rate limits. For pricing details, see [Batch API Pricing](/developers/models#batch-api-pricing).
## What is the Batch API?
When you make a standard API call to Grok, you send a request and wait for an immediate response. This approach is perfect for interactive applications like chatbots, real-time assistants, or any use case where users are waiting for a response.
The Batch API takes a different approach. Instead of processing requests immediately, you submit them to a queue where they're processed in the background. You don't get an instant response—instead, you check back later to retrieve your results.
**Key differences from real-time API requests:**
| | Real-time API | Batch API |
|---|---|---|
| **Response time** | Immediate (seconds) | Typically within 24 hours |
| **Cost** | Standard pricing | Reduced pricing ([see details](/developers/models#batch-api-pricing)) |
| **Rate limits** | Per-minute limits apply | Requests don't count towards rate limits |
| **Use case** | Interactive, real-time | Background processing, bulk jobs |
**Processing time:** Most batch requests complete within **24 hours**, though processing time may vary depending on system load and batch size.
You can also create, monitor, and manage batches through the [xAI Console](https://console.x.ai/team/default/batches). The Console provides a visual interface for tracking batch progress and viewing results.
## When to use the Batch API
The Batch API is ideal when you don't need immediate results and want to **reduce your API costs**:
* **Running evaluations and benchmarks** - Test model performance across thousands of prompts
* **Processing large datasets** - Analyze customer feedback, classify support tickets, extract entities
* **Content moderation at scale** - Review backlogs of user-generated content
* **Document summarization** - Process reports, research papers, or legal documents in bulk
* **Data enrichment pipelines** - Add AI-generated insights to database records
* **Scheduled overnight jobs** - Generate daily reports or prepare data for dashboards
## How it works
The Batch API workflow consists of four main steps:
1. **Create a batch** - A batch is a container that groups related requests together
2. **Add requests** - Submit your inference requests to the batch queue
3. **Monitor progress** - Poll the batch status to track completion
4. **Retrieve results** - Fetch responses for all processed requests
Let's walk through each step.
## Step 1: Create a batch
A batch acts as a container for your requests. Think of it as a folder that groups related work together—you might create separate batches for different datasets, experiments, or job types.
When you create a batch, you receive a `batch_id` that you'll use to add requests and retrieve results.
```bash
curl -X POST https://api.x.ai/v1/batches \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -d '{
    "name": "customer_feedback_analysis"
  }'
```
```pythonXAI
from xai_sdk import Client
client = Client()
# Create a batch with a descriptive name
batch = client.batch.create(batch_name="customer_feedback_analysis")
print(f"Created batch: {batch.batch_id}")
# Store the batch_id for later use
batch_id = batch.batch_id
```
## Step 2: Add requests to the batch
With your batch created, you can now add requests to it. Each request is a standard chat completion that will be processed asynchronously.
**With the xAI SDK, adding batch requests is simple:** create `Chat` objects the same way you would for regular chat completions, then pass them as a list. You don't need to construct JSONL files or deal with complex request formats. Just use the familiar `chat.create()` and `chat.append()` pattern you already know.
**Important:** Assign a unique `batch_request_id` to each request. This ID lets you match results back to their original requests, which becomes important when you're processing hundreds or thousands of items. If you don't provide an ID, we generate a UUID for you. Using your own IDs is useful for idempotency (ensuring a request is only processed once) and for linking batch requests to records in your own system.
```pythonXAI
from xai_sdk import Client
from xai_sdk.chat import system, user

client = Client()

# Sample data to process
feedback_items = [
    {"id": "feedback_001", "text": "The product exceeded my expectations!"},
    {"id": "feedback_002", "text": "Shipping took way too long."},
    {"id": "feedback_003", "text": "It works as described, nothing special."},
]

# Build batch requests using familiar Chat objects
batch_requests = []
for item in feedback_items:
    # Create a Chat exactly like you would for a regular request
    chat = client.chat.create(
        model="grok-4-1-fast-reasoning",
        batch_request_id=item["id"],  # Add an ID to track this request
    )
    # Append messages the same way as always
    chat.append(system("Classify the sentiment as positive, negative, or neutral."))
    chat.append(user(item["text"]))
    batch_requests.append(chat)

# Pass the list of Chat objects to the batch
client.batch.add(batch_id=batch.batch_id, batch_requests=batch_requests)
print(f"Added {len(batch_requests)} requests to batch")
```
```bash
curl -X POST https://api.x.ai/v1/batches/{batch_id}/requests \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -d '{
    "batch_requests": [
      {
        "batch_request_id": "feedback_001",
        "batch_request": {
          "chat_get_completion": {
            "messages": [
              {"role": "system", "content": "Classify the sentiment as positive, negative, or neutral."},
              {"role": "user", "content": "The product exceeded my expectations!"}
            ],
            "model": "grok-4-1-fast-reasoning"
          }
        }
      },
      {
        "batch_request_id": "feedback_002",
        "batch_request": {
          "chat_get_completion": {
            "messages": [
              {"role": "system", "content": "Classify the sentiment as positive, negative, or neutral."},
              {"role": "user", "content": "Shipping took way too long."}
            ],
            "model": "grok-4-1-fast-reasoning"
          }
        }
      }
    ]
  }'
```
## Step 3: Monitor batch progress
After adding requests, they begin processing in the background. Since batch processing is asynchronous, you need to poll the batch status to know when results are ready.
The batch state includes counters for pending, successful, and failed requests. Poll periodically until `num_pending` reaches zero, which indicates all requests have been processed (either successfully or with errors).
```bash
# Check batch status
curl https://api.x.ai/v1/batches/{batch_id} \
  -H "Authorization: Bearer $XAI_API_KEY"

# Response includes state with request counts:
# {
#   "state": {
#     "num_requests": 100,
#     "num_pending": 25,
#     "num_success": 70,
#     "num_error": 5
#   }
# }
```
```pythonXAI
import time

from xai_sdk import Client

client = Client()

# Poll until all requests are processed
print("Waiting for batch to complete...")
while True:
    batch = client.batch.get(batch_id=batch.batch_id)

    pending = batch.state.num_pending
    completed = batch.state.num_success + batch.state.num_error
    total = batch.state.num_requests

    print(f"Progress: {completed}/{total} complete, {pending} pending")

    if pending == 0:
        print("Batch processing complete!")
        break

    # Wait before polling again (avoid hammering the API)
    time.sleep(5)
```
### Understanding batch states
The Batch API tracks state at two levels: the **batch level** and the **individual request level**.
**Batch-level state** shows aggregate progress across all requests in a given batch,
accessible through the `batch.state` object returned by the `client.batch.get()` method:
| Counter | Description |
|---|---|
| `num_requests` | Total number of requests added to the batch |
| `num_pending` | Requests waiting to be processed |
| `num_success` | Requests that completed successfully |
| `num_error` | Requests that failed with an error |
| `num_cancelled` | Requests that were cancelled |
When `num_pending` reaches zero, all requests have been processed (either successfully, with errors, or cancelled).
**Individual request states** describe where each request is in its lifecycle, accessible through the `batch_request_metadata` object returned by the `client.batch.list_batch_requests()` [method](#check-individual-request-status):
| State | Description |
|---|---|
| `pending` | Request is queued and waiting to be processed |
| `succeeded` | Request completed successfully, result is available |
| `failed` | Request encountered an error during processing |
| `cancelled` | Request was cancelled (e.g., when the batch was cancelled before this request was processed) |
**Batch lifecycle:** A batch can also be cancelled or expire. [If you cancel a batch](#cancel-a-batch), pending requests won't be processed, but already-completed results remain available. Batches have an expiration time after which results are no longer accessible—check the `expires_at` field when retrieving batch details.
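For example, a quick check of the expiration time when retrieving batch details (a minimal sketch, assuming the batch object returned by the SDK exposes `expires_at` directly):
```pythonXAI
from xai_sdk import Client

client = Client()

# Retrieve the batch and check when its results stop being accessible
batch = client.batch.get(batch_id=batch.batch_id)
print(f"Batch {batch.batch_id} expires at: {batch.expires_at}")
```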
## Step 4: Retrieve results
You can retrieve results at any time, even before the entire batch completes. Results are available as soon as individual requests finish processing, so you can start consuming completed results while other requests are still in progress.
Each result is linked to its original request via the `batch_request_id` you assigned earlier. The `result.response` object is the same SDK `Response` you'd get from a regular chat completion, with all the familiar fields: `.content`, `.usage`, `.finish_reason`, and more.
The SDK provides convenient `.succeeded` and `.failed` properties to separate successful responses from errors.
**Pagination:** Results are returned in pages. Use the `limit` parameter to control page size and `pagination_token` to fetch subsequent pages. When `pagination_token` is `None`, you've reached the end.
```pythonXAI
from xai_sdk import Client

client = Client()

# Paginate through all results
all_succeeded = []
all_failed = []
pagination_token = None

while True:
    # Fetch a page of results (limit controls page size)
    page = client.batch.list_batch_results(
        batch_id=batch.batch_id,
        limit=100,
        pagination_token=pagination_token,
    )

    # Collect results from this page
    all_succeeded.extend(page.succeeded)
    all_failed.extend(page.failed)

    # Check if there are more pages
    if page.pagination_token is None:
        break
    pagination_token = page.pagination_token

# Process all results
print(f"Successfully processed: {len(all_succeeded)} requests")
for result in all_succeeded:
    # Access the full Response object
    print(f"[{result.batch_request_id}] {result.response.content}")
    print(f"  Tokens used: {result.response.usage.total_tokens}")

if all_failed:
    print(f"\nFailed: {len(all_failed)} requests")
    for result in all_failed:
        print(f"[{result.batch_request_id}] Error: {result.error_message}")
```
```bash
# Fetch first page
curl "https://api.x.ai/v1/batches/{batch_id}/results?page_size=100" \
  -H "Authorization: Bearer $XAI_API_KEY"

# Use pagination_token from response to fetch next page
curl "https://api.x.ai/v1/batches/{batch_id}/results?page_size=100&pagination_token={token}" \
  -H "Authorization: Bearer $XAI_API_KEY"
```
## Additional operations
Beyond the core workflow, the Batch API provides additional operations for managing your batches.
### Cancel a batch
You can cancel a batch before all requests complete. Already-processed requests remain available in the results, but pending requests will not be processed. You cannot add more requests to a cancelled batch.
```bash
curl -X POST https://api.x.ai/v1/batches/{batch_id}:cancel \
  -H "Authorization: Bearer $XAI_API_KEY"
```
```pythonXAI
from xai_sdk import Client
client = Client()
# Cancel processing
cancelled_batch = client.batch.cancel(batch_id=batch.batch_id)
print(f"Cancelled batch: {cancelled_batch.batch_id}")
print(f"Completed before cancellation: {cancelled_batch.state.num_success} requests")
```
### List all batches
View all batches belonging to your team. Batches are retained until they expire (check the `expires_at` field). This endpoint supports the same `limit` and `pagination_token` parameters for paginating through large lists.
```bash
curl "https://api.x.ai/v1/batches?page_size=20" \
  -H "Authorization: Bearer $XAI_API_KEY"
```
```pythonXAI
from xai_sdk import Client

client = Client()

# List recent batches
response = client.batch.list(limit=20)

for batch in response.batches:
    status = "complete" if batch.state.num_pending == 0 else "processing"
    print(f"{batch.name} ({batch.batch_id}): {status}")
```
### Check individual request status
For detailed tracking, you can inspect the metadata for each request in a batch. This shows the status, timing, and other details for individual requests. This endpoint supports the same `limit` and `pagination_token` parameters for paginating through large batches.
```bash
curl "https://api.x.ai/v1/batches/{batch_id}/requests?page_size=50" \
  -H "Authorization: Bearer $XAI_API_KEY"
```
```pythonXAI
from xai_sdk import Client

client = Client()

# Get metadata for individual requests
metadata = client.batch.list_batch_requests(batch_id=batch.batch_id)

for request in metadata.batch_request_metadata:
    print(f"Request {request.batch_request_id}: {request.state}")
```
### Track costs
Each batch tracks the total processing cost. Access the cost breakdown after processing to understand your spending. For pricing details, see [Batch API Pricing on the Models and Pricing page](/developers/models#batch-api-pricing).
```pythonXAI
from xai_sdk import Client
client = Client()
# Get batch with cost information
batch = client.batch.get(batch_id=batch.batch_id)
# Cost is returned in ticks (1e-10 USD) for precision
total_cost_usd = batch.cost_breakdown.total_cost_usd_ticks / 1e10
print("Total cost: $%.4f" % total_cost_usd)
```
## Complete example
This end-to-end example demonstrates a realistic batch workflow: analyzing customer feedback at scale. It creates a batch, submits feedback items for sentiment analysis, waits for processing, and outputs the results. For simplicity, this example doesn't paginate results—see [Step 4](#step-4-retrieve-results) for pagination when processing larger batches.
```pythonXAI
import time

from xai_sdk import Client
from xai_sdk.chat import system, user

client = Client()

# Sample dataset: customer feedback to analyze
feedback_data = [
    {"id": "fb_001", "text": "Absolutely love this product! Best purchase ever."},
    {"id": "fb_002", "text": "Delivery was late and the packaging was damaged."},
    {"id": "fb_003", "text": "Works fine, nothing special to report."},
    {"id": "fb_004", "text": "Customer support was incredibly helpful!"},
    {"id": "fb_005", "text": "The app keeps crashing on my phone."},
]

# Step 1: Create a batch
print("Creating batch...")
batch = client.batch.create(batch_name="feedback_sentiment_analysis")
print(f"Batch created: {batch.batch_id}")

# Step 2: Build and add requests
print("\nAdding requests...")
batch_requests = []
for item in feedback_data:
    chat = client.chat.create(
        model="grok-4-1-fast-reasoning",
        batch_request_id=item["id"],
    )
    chat.append(system(
        "Analyze the sentiment of the customer feedback. "
        "Respond with exactly one word: positive, negative, or neutral."
    ))
    chat.append(user(item["text"]))
    batch_requests.append(chat)

client.batch.add(batch_id=batch.batch_id, batch_requests=batch_requests)
print(f"Added {len(batch_requests)} requests")

# Step 3: Wait for completion
print("\nProcessing...")
while True:
    batch = client.batch.get(batch_id=batch.batch_id)
    pending = batch.state.num_pending
    completed = batch.state.num_success + batch.state.num_error
    print(f"  {completed}/{batch.state.num_requests} complete")
    if pending == 0:
        break
    time.sleep(2)

# Step 4: Retrieve and display results
print("\n--- Results ---")
results = client.batch.list_batch_results(batch_id=batch.batch_id)

# Create a lookup for original feedback text
feedback_lookup = {item["id"]: item["text"] for item in feedback_data}

for result in results.succeeded:
    original_text = feedback_lookup.get(result.batch_request_id, "")
    sentiment = result.response.content.strip().lower()
    print(f"[{sentiment.upper()}] {original_text[:50]}...")

# Report any failures
if results.failed:
    print("\n--- Errors ---")
    for result in results.failed:
        print(f"[{result.batch_request_id}] {result.error_message}")

# Display cost
cost_usd = batch.cost_breakdown.total_cost_usd_ticks / 1e10
print("\nTotal cost: $%.4f" % cost_usd)
```
## Limitations
**Batches**
* A team can have an **unlimited** number of batches.
* Maximum batch creation rate: **1** batch creation per second per team.
**Batch Requests**
* A batch can contain an **unlimited** number of requests in theory, but extremely large batches (>1,000,000 requests) may be throttled for processing stability.
* Each individual request that can be added to a batch has a maximum payload size of **25MB**.
* A team can send up to **100** add-batch-requests API calls every **30 seconds** (this is a rolling limit shared across all batches in the team).
**Unsupported Features**
* **Agentic requests** using [server-side tools](/developers/guides/tools/overview) (such as web search, code execution, or MCP tools) are not supported in batch requests.
* **Client-side tools** (function calling) are not supported in batch requests.
## Related
* [API Reference: Batch endpoints](/developers/rest-api-reference/inference/batches#create-a-new-batch)
* [gRPC Reference: Batch management](/developers/grpc-api-reference#batch-management)
* [Models and pricing — Batch API Pricing](/developers/models#batch-api-pricing)
* [xAI Python SDK](https://github.com/xai-org/xai-sdk-python)
===/developers/advanced-api-usage/deferred-chat-completions===
#### Advanced API Usage
# Deferred Chat Completions
Deferred Chat Completions are currently available only via REST requests or the xAI SDK.
Deferred Chat Completions allow you to create a chat completion, get a `response_id`, and retrieve the response at a later time. The result can be retrieved exactly once within 24 hours, after which it is discarded.
Your deferred completion rate limit is the same as your chat completions rate limit. To view your rate limit, please visit [xAI Console](https://console.x.ai).
After you send the request to the xAI API, the response body will contain a request ID, e.g. `{'request_id': 'f15c114e-f47d-40ca-8d5c-8c23d656eeb6'}`. Insert this `request_id` value into the `deferred-completion` endpoint path and send a GET request to `https://api.x.ai/v1/chat/deferred-completion/{request_id}` to retrieve the result.
When the completion result is not ready, the request will return `202 Accepted` with an empty response body.
You can access the model's raw thinking trace via the `message.reasoning_content` of the chat completion response.
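As a minimal sketch (reusing the example `request_id` above), you could read the trace from the retrieved result like this:
```pythonRequests
import os

import requests

headers = {"Authorization": f"Bearer {os.getenv('XAI_API_KEY')}"}
request_id = "f15c114e-f47d-40ca-8d5c-8c23d656eeb6"  # returned by the initial deferred request

resp = requests.get(
    f"https://api.x.ai/v1/chat/deferred-completion/{request_id}",
    headers=headers,
)
if resp.status_code == 200:
    message = resp.json()["choices"][0]["message"]
    print(message.get("reasoning_content"))  # the model's raw thinking trace
    print(message["content"])                # the final answer
```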
## Example
A code example is provided below, where we retry retrieving the result until it has been processed:
```pythonXAI
import os
from datetime import timedelta

from xai_sdk import Client
from xai_sdk.chat import user, system

client = Client(api_key=os.getenv('XAI_API_KEY'))

chat = client.chat.create(
    model="grok-4-1-fast-reasoning",
    messages=[system("You are Zaphod Beeblebrox.")],
)
chat.append(user("126/3=?"))

# Poll the result every 10 seconds for a maximum of 10 minutes
response = chat.defer(
    timeout=timedelta(minutes=10), interval=timedelta(seconds=10)
)

# Print the result when it is ready
print(response.content)
```
```pythonRequests
import json
import os

import requests
from tenacity import retry, wait_exponential

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.getenv('XAI_API_KEY')}",
}
payload = {
    "messages": [
        {"role": "system", "content": "You are Zaphod Beeblebrox."},
        {"role": "user", "content": "126/3=?"}
    ],
    "model": "grok-4-1-fast-reasoning",
    "deferred": True
}

response = requests.post(
    "https://api.x.ai/v1/chat/completions",
    headers=headers,
    json=payload
)
request_id = response.json()["request_id"]
print(f"Request ID: {request_id}")


@retry(wait=wait_exponential(multiplier=1, min=1, max=60))
def get_deferred_completion():
    response = requests.get(
        f"https://api.x.ai/v1/chat/deferred-completion/{request_id}",
        headers=headers,
    )
    if response.status_code == 200:
        return response.json()
    elif response.status_code == 202:
        raise Exception("Response not ready yet")
    else:
        raise Exception(f"{response.status_code} Error: {response.text}")


completion_data = get_deferred_completion()
print(json.dumps(completion_data, indent=4))
```
```javascriptWithoutSDK
const axios = require('axios');
const retry = require('retry');

const headers = {
  'Content-Type': 'application/json',
  'Authorization': `Bearer ${process.env.XAI_API_KEY}`
};
const payload = {
  messages: [
    { role: 'system', content: 'You are Zaphod Beeblebrox.' },
    { role: 'user', content: '126/3=?' }
  ],
  model: 'grok-4-1-fast-reasoning',
  deferred: true
};

async function main() {
  const requestId = (await axios.post('https://api.x.ai/v1/chat/completions', payload, { headers })).data.request_id;
  console.log(`Request ID: ${requestId}`);

  const operation = retry.operation({
    minTimeout: 1000,
    maxTimeout: 60000,
    factor: 2
  });

  const completion = await new Promise((resolve, reject) => {
    operation.attempt(async () => {
      const res = await axios.get(`https://api.x.ai/v1/chat/deferred-completion/${requestId}`, { headers });
      if (res.status === 200) resolve(res.data);
      else if (res.status === 202) operation.retry(new Error('Not ready'));
      else reject(new Error(`${res.status}: ${res.statusText}`));
    });
  });

  console.log(JSON.stringify(completion, null, 4));
}

main().catch(console.error);
```
```bash
RESPONSE=$(curl -s https://api.x.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are Zaphod Beeblebrox."},
      {"role": "user", "content": "126/3=?"}
    ],
    "model": "grok-4-1-fast-reasoning",
    "deferred": true
  }')

REQUEST_ID=$(echo "$RESPONSE" | jq -r '.request_id')
echo "Request ID: $REQUEST_ID"

sleep 10

curl -s https://api.x.ai/v1/chat/deferred-completion/$REQUEST_ID \
  -H "Authorization: Bearer $XAI_API_KEY"
```
The response body will be the same as what you would expect with non-deferred chat completions:
```json
{
  "id": "3f4ddfca-b997-3bd4-80d4-8112278a1508",
  "object": "chat.completion",
  "created": 1752077400,
  "model": "grok-4-1-fast-reasoning",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Whoa, hold onto your improbability drives, kid! This is Zaphod Beeblebrox here, the two-headed, three-armed ex-President of the Galaxy, and you're asking me about 126 divided by 3? Pfft, that's kid stuff for a guy who's stolen starships and outwitted the universe itself.\n\nBut get this\u2014126 slashed by 3 equals... **42**! Yeah, that's right, the Ultimate Answer to Life, the Universe, and Everything! Deep Thought didn't compute that for seven and a half million years just for fun, you know. My left head's grinning like a Vogon poet on happy pills, and my right one's already planning a party. If you need more cosmic math or a lift on the Heart of Gold, just holler. Zaphod out! \ud83d\ude80",
        "refusal": null
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 26,
    "completion_tokens": 168,
    "total_tokens": 498,
    "prompt_tokens_details": {
      "text_tokens": 26,
      "audio_tokens": 0,
      "image_tokens": 0,
      "cached_tokens": 4
    },
    "completion_tokens_details": {
      "reasoning_tokens": 304,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    },
    "num_sources_used": 0
  },
  "system_fingerprint": "fp_44e53da025"
}
```
For more details, refer to [Chat completions](/developers/rest-api-reference/inference/chat#chat-completions) and [Get deferred chat completions](/developers/rest-api-reference/inference/chat#get-deferred-chat-completions) in our REST API Reference.
===/developers/advanced-api-usage/fingerprint===
#### Advanced API Usage
# Fingerprint
For each request to the xAI API, the response body will include a unique `system_fingerprint` value. This fingerprint serves as an identifier for the current state of the backend system's configuration.
Example:
```bash
curl https://api.x.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -d '{
    "messages": [
      {
        "role": "system",
        "content": "You are Grok, a chatbot inspired by the Hitchhiker'\''s Guide to the Galaxy."
      },
      {
        "role": "user",
        "content": "What is the meaning of life, the universe, and everything?"
      }
    ],
    "model": "grok-4-1-fast-reasoning",
    "stream": false,
    "temperature": 0
  }'
```
Response:
```json
{..., "system_fingerprint":"fp_6ca29cf396"}
```
You can automate your system to keep track of the `system_fingerprint` along with token consumption and other metrics.
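For instance, a minimal sketch of such tracking (the record fields below are illustrative, not a prescribed schema):
```pythonRequests
import json
import os
import time

import requests

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.getenv('XAI_API_KEY')}",
}
payload = {
    "messages": [{"role": "user", "content": "Ping"}],
    "model": "grok-4-1-fast-reasoning",
}

body = requests.post(
    "https://api.x.ai/v1/chat/completions", headers=headers, json=payload
).json()

# Log the fingerprint alongside token consumption for later auditing
record = {
    "timestamp": time.time(),
    "system_fingerprint": body["system_fingerprint"],
    "total_tokens": body["usage"]["total_tokens"],
}
print(json.dumps(record))
```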
## Usage of fingerprint
* **Monitoring System Changes:** The system fingerprint acts as a version control for the backend configuration. If any part of the backend system—such as model parameters, server settings, or even the underlying infrastructure—changes, the fingerprint will also change. This allows developers to track when and how the system has evolved over time. This is crucial for debugging, performance optimization, and ensuring consistency in API responses.
* **Security and Integrity:** The fingerprint can be used to ensure the integrity of the response. If a response's fingerprint matches the expected one based on a recent system configuration, it helps verify that the data hasn't been tampered with during transmission or that the service hasn't been compromised. **The fingerprint will change over time; this is expected.**
* **Compliance and Auditing:** For regulated environments, this fingerprint can serve as part of an audit trail, showing when specific configurations were in use for compliance purposes.
===/developers/advanced-api-usage/grok-code-prompt-engineering===
#### Advanced API Usage
# Prompt Engineering for Grok Code Fast 1
## For developers using agentic coding tools
`grok-code-fast-1` is a lightweight agentic model designed to excel as your pair-programmer inside most common coding tools. To optimize your experience, we present a few guidelines so that you can fly through your day-to-day coding tasks.
### Provide the necessary context
Most coding tools will gather the necessary context for you on their own. However, it is often better to explicitly select the code you want to use as context. This allows `grok-code-fast-1` to focus on your task and prevents unnecessary deviations. Try to specify relevant file paths, project structures, or dependencies, and avoid providing irrelevant context.
* No-context prompt to avoid
> Make error handling better
* Good prompt with specified context
> My error codes are defined in @errors.ts, can you use that as reference to add proper error handling and error codes to @sql.ts where I am making queries
### Set explicit goals and requirements
Clearly define your goals and the specific problem you want `grok-code-fast-1` to solve. Detailed and concrete queries can lead to better performance. Try to avoid vague or underspecified prompts, as they can result in suboptimal results.
* Vague prompt to avoid
> Create a food tracker
* Good, detailed prompt
> Create a food tracker which shows the breakdown of calorie consumption per day divided by different nutrients when I enter a food item. Make it such that I can see an overview as well as get high level trends.
### Continually refine your prompts
`grok-code-fast-1` is a highly efficient model, delivering up to 4x the speed and 1/10th the cost of other leading agentic models. This enables you to test your complex ideas at an unprecedented speed and affordability. Even if the initial output isn’t perfect, we strongly suggest taking advantage of the uniquely rapid and cost-effective iteration to refine your query—using the suggestions above (e.g., adding more context) or by referencing the specific failures from the first attempt.
* Good prompt example with refinement
> The previous approach didn’t consider the IO heavy process which can block the main thread, we might want to run it in its own threadloop such that it does not block the event loop instead of just using the async lib version
### Assign agentic tasks
We encourage users to try `grok-code-fast-1` for agentic-style tasks rather than one-shot queries. Our Grok 4 models are more suited for one-shot Q&A, while `grok-code-fast-1` is your ideal companion for navigating large mountains of code with tools to deliver you precise answers.
A good way to think about this is:
* `grok-code-fast-1` is great at working quickly and tirelessly to find you the answer or implement the required change.
* Grok 4 is best for diving deep into complex concepts and tough debugging when you provide all the necessary context upfront.
## For developers building coding agents via the xAI API
With `grok-code-fast-1`, we wanted to bring an agentic coding model into the hands of developers. Outside of our launch partners, we welcome all developers to try out `grok-code-fast-1` in tool-call-heavy domains, as its fast speed and low cost make it both efficient and affordable when many tool calls are needed to reach the correct answer.
As mentioned in the blog post, `grok-code-fast-1` is a reasoning model with interleaved tool-calling during its thinking. We also send summarized thinking via the OpenAI-compatible API for better UX support. More API details can be found at [function calling](/developers/tools/function-calling).
### Reasoning content
`grok-code-fast-1` is a reasoning model, and we expose its thinking trace via `chunk.choices[0].delta.reasoning_content`. Please note that the thinking traces are only accessible when using streaming mode.
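For example, a minimal streaming sketch with the OpenAI-compatible SDK (the prompt here is just a placeholder):
```pythonOpenAISDK
import os

from openai import OpenAI

client = OpenAI(api_key=os.getenv("XAI_API_KEY"), base_url="https://api.x.ai/v1")

stream = client.chat.completions.create(
    model="grok-code-fast-1",
    messages=[{"role": "user", "content": "Refactor this loop into a list comprehension: ..."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta
    # The thinking trace arrives in reasoning_content; the answer in content
    if getattr(delta, "reasoning_content", None):
        print(delta.reasoning_content, end="", flush=True)
    elif delta.content:
        print(delta.content, end="", flush=True)
```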
### Use native tool calling
`grok-code-fast-1` offers first-party support for native tool-calling and was specifically designed with native tool-calling in mind. We encourage you to use it instead of XML-based tool-call outputs, which may hurt performance.
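As a brief sketch of declaring a tool natively via the OpenAI-compatible `tools` parameter (the `read_file` tool here is hypothetical, for illustration only):
```pythonOpenAISDK
import os

from openai import OpenAI

client = OpenAI(api_key=os.getenv("XAI_API_KEY"), base_url="https://api.x.ai/v1")

# Declare tools natively instead of describing them as XML inside the prompt
tools = [
    {
        "type": "function",
        "function": {
            "name": "read_file",  # hypothetical tool, for illustration
            "description": "Read a file from the repository and return its contents.",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "Repository-relative file path"},
                },
                "required": ["path"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="grok-code-fast-1",
    messages=[{"role": "user", "content": "What does src/main.py do?"}],
    tools=tools,
)
print(response.choices[0].message.tool_calls)
```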
### Give a detailed system prompt
Be thorough and give many details in your system prompt. A well-written system prompt which describes the task, expectations, and edge-cases the model should be aware of can make a night-and-day difference. For more inspiration, refer to the User Best Practices above.
### Introduce context to the model
`grok-code-fast-1` is accustomed to seeing a lot of context in the initial user prompt. We recommend developers to use XML tags or Markdown-formatted content to mark various sections of the context and to add clarity to certain sections. Descriptive Markdown headings/XML tags and their corresponding definitions will allow `grok-code-fast-1` to use the context more effectively.
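For instance, a sketch of a first user prompt with XML-tagged context sections (the tag names are illustrative, not a required schema):
```python
# Illustrative only: tag names and file contents are placeholders
prompt = """<task>
Add error handling to the database query functions.
</task>

<relevant_files>
<file path="errors.ts">
// ... contents of errors.ts ...
</file>
<file path="sql.ts">
// ... contents of sql.ts ...
</file>
</relevant_files>

<constraints>
Use the error codes defined in errors.ts; do not invent new ones.
</constraints>"""
```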
### Optimize for cache hits
Our cache hits are a big contributor to `grok-code-fast-1`’s fast inference speed. In agentic tasks where the model uses multiple tools in sequence, most of the prefix remains the same and thus is automatically retrieved from the cache to speed up inference. We recommend against changing or augmenting the prompt history, as that could lead to cache misses and therefore significantly slower inference speeds.
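A sketch of the append-only pattern this implies (illustrative message shapes; the point is that earlier entries are never rewritten between turns, so the shared prefix stays cacheable):
```python
# Keep the conversation history append-only so the prefix stays cacheable
history = [
    {"role": "system", "content": "You are a coding agent..."},
    {"role": "user", "content": "Fix the failing tests in this repo."},
]

# After each model turn or tool result, append new entries;
# never edit or reorder earlier ones
history.append({"role": "assistant", "content": "...tool call..."})
history.append({"role": "tool", "content": "...tool output..."})

# Re-sending `history` unchanged lets the prefix hit the cache on the next call
```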
===/developers/advanced-api-usage===
#### Advanced API Usage
# Advanced API Usage
Advanced guides for scaling, optimizing, and integrating xAI APIs.
## In this section
* [Batch API](/developers/advanced-api-usage/batch-api)
* [Deferred Completions](/developers/advanced-api-usage/deferred-chat-completions)
* [Fingerprint](/developers/advanced-api-usage/fingerprint)
* [Async Requests](/developers/advanced-api-usage/async)
* [Use with Code Editors](/developers/advanced-api-usage/use-with-code-editors)
* [Prompt Engineering for Grok Code](/developers/advanced-api-usage/grok-code-prompt-engineering)
===/developers/advanced-api-usage/use-with-code-editors===
# Use with Code Editors
You can use Grok with coding assistant plugins to help you code. Our Code models are specifically optimized for this task, providing a smoother experience.
For pricing and limits of Code models, check out [Models and Pricing](/developers/models).
## Using Grok Code models with Cline
To use Grok with Cline, first download Cline from the VSCode marketplace. Once you have installed Cline in VSCode, open Cline.
Click on "Use your own API key".
Then, you can save your xAI API key to Cline.
After setting up your xAI API key with Cline, you can select a coding model. Go to Cline settings -> API Configuration and choose `grok-code-fast-1` as the model.
## Using Grok Code models with Cursor
You can also use Grok with Cursor to help you code.
After installing Cursor, head to Cursor Settings -> Models.
Open the API Keys settings, enter your xAI API key, and set Override OpenAI Base URL to `https://api.x.ai/v1`.
In the "Add or search model" input box, enter a coding model such as `grok-code-fast-1`. Then click on "Add Custom Model".
## Other code assistants supporting Grok Code models
Besides Cline and Cursor, you can also use our code model with [GitHub Copilot](https://github.com/features/copilot), [opencode](https://opencode.ai/), [Kilo Code](https://kilocode.ai/), [Roo Code](https://roocode.com/) and [Windsurf](https://windsurf.com/).
===/developers/community===
#### Resources
# Community Integrations
Grok is also accessible via your favorite community integrations, enabling you to connect Grok to other parts of your system easily.
## Third-party SDK/frameworks
### LiteLLM
LiteLLM provides a simple SDK or proxy server for calling different LLM providers. If you're using LiteLLM, integrating xAI as your provider is straightforward: just swap in xAI's Grok model name and your xAI API key in your configuration.
For the latest information and more examples, visit the [LiteLLM xAI Provider Documentation](https://docs.litellm.ai/docs/providers/xai).
As a quick start, you can use LiteLLM in the following fashion:
```pythonWithoutSDK
from litellm import completion
import os

os.environ['XAI_API_KEY'] = ""

response = completion(
    model="xai/grok-4-1-fast-reasoning",
    messages=[
        {
            "role": "user",
            "content": "What's the weather like in Boston today in Fahrenheit?",
        }
    ],
    max_tokens=10,
    response_format={"type": "json_object"},
    seed=123,
    stop=["\n\n"],
    temperature=0.2,
    top_p=0.9,
    tool_choice="auto",
    tools=[],
    user="user",
)
print(response)
```
### Vercel AI SDK
[Vercel's AI SDK](https://sdk.vercel.ai/) supports an [xAI Grok Provider](https://sdk.vercel.ai/providers/ai-sdk-providers/xai) for integrating with the xAI API.
By default, it reads your xAI API key from the `XAI_API_KEY` environment variable.
To generate text, use the `generateText` function:
```javascriptAISDK
import { xai } from '@ai-sdk/xai';
import { generateText } from 'ai';

const { text } = await generateText({
  model: xai.responses('grok-4-1-fast-reasoning'),
  prompt: 'Write a vegetarian lasagna recipe for 4 people.',
});
```
You can also customize the setup like the following:
```javascriptAISDK
import { createXai } from '@ai-sdk/xai';

const xai = createXai({
  apiKey: 'your-api-key',
});
```
You can also generate images with the `generateImage` function:
```javascriptAISDK
import { xai } from '@ai-sdk/xai';
import { experimental_generateImage as generateImage } from 'ai';

const { image } = await generateImage({
  model: xai.image('grok-imagine-image'),
  prompt: 'A cat in a tree',
});
```
## Coding assistants
### Continue
You can use the Continue extension in VSCode or JetBrains with xAI's models.
To start using xAI models with Continue, add the following to Continue's config file: `~/.continue/config.json` (macOS and Linux) or `%USERPROFILE%\.continue\config.json` (Windows).
```json
"models": [
  {
    "title": "grok-4-1-fast-reasoning",
    "provider": "xAI",
    "model": "grok-4-1-fast-reasoning",
    "apiKey": "[XAI_API_KEY]"
  }
]
```
Visit [Continue's Documentation](https://docs.continue.dev/chat/model-setup#grok-2-from-xai) for more details.
===/developers/debugging===
#### Getting Started
# Debugging Errors
When you send a request, you will normally get a `200 OK` response from the server with the expected response body.
If there is an error with your request or with our service, the API endpoint will typically return an error code with an error message.
If there is an ongoing service disruption, you can visit
[https://status.x.ai](https://status.x.ai) for the latest updates. The status is also available
via RSS at [https://status.x.ai/feed.xml](https://status.x.ai/feed.xml).
The service status is also indicated in the navigation bar of this site.
Most errors are accompanied by a self-explanatory message. For typical status codes of each endpoint, visit the [API Reference](/developers/rest-api-reference) or view our [OpenAPI Document](/openapi.json).
## Status Codes
Here is a list of potential errors and statuses arranged by status codes.
### 4XX Status Codes
| Status Code | Endpoints | Cause | Solution |
| --- | --- | --- | --- |
| 400 Bad Request | All Endpoints | - A `POST` method request body specified an invalid argument, or a `GET` method with a dynamic route has an invalid param in the URL.<br>- An incorrect API key is supplied. | Check your request body or request URL. |
| 401 Unauthorized | All Endpoints | No authorization header or an invalid authorization token is provided. | Supply an `Authorization: Bearer <your API key>` header in the request. You can get a new API key on [xAI Console](https://console.x.ai). |
| 403 Forbidden | All Endpoints | - Your API key/team doesn't have permission to perform the action.<br>- Your API key/team is blocked. | Ask your team admin for permission. |
| 404 Not Found | All Endpoints | - A model specified in a `POST` method request body is not found.<br>- Trying to reach an invalid (misspelled) endpoint URL. | Check your request body and endpoint URL against our [API Reference](/developers/rest-api-reference). |
| 405 Method Not Allowed | All Endpoints | The request method is not allowed. For example, sending a `POST` request to an endpoint supporting only `GET`. | Check your request method against our [API Reference](/developers/rest-api-reference). |
| 415 Unsupported Media Type | All Endpoints Supporting `POST` Method | - An empty request body in `POST` requests.<br>- Missing `Content-Type: application/json` header. | - Add a valid request body.<br>- Ensure the `Content-Type: application/json` header is present. |
| 422 Unprocessable Entity | All Endpoints Supporting `POST` Method | An invalid format for a field in the `POST` request body. | Check that your request body is valid. You can find more information in the [API Reference](/developers/rest-api-reference). |
| 429 Too Many Requests | All Inference Endpoints | You are sending requests too frequently and reaching your rate limit. | Reduce your request rate or increase your rate limit. You can find your current rate limit on [xAI Console](https://console.x.ai). |
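For `429 Too Many Requests` specifically, a common client-side mitigation is to retry with exponential backoff. A minimal sketch (the retry count and delays are illustrative, not an official recommendation):
```pythonRequests
import os
import time
import requests

url = "https://api.x.ai/v1/chat/completions"
headers = {"Authorization": f"Bearer {os.getenv('XAI_API_KEY')}"}
payload = {"model": "grok-4-fast", "messages": [{"role": "user", "content": "Hello"}]}

for attempt in range(5):
    response = requests.post(url, headers=headers, json=payload)
    if response.status_code != 429:
        break
    # Back off exponentially (1s, 2s, 4s, 8s) before retrying
    time.sleep(2 ** attempt)
print(response.status_code)
```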
### 2XX Status Codes
| Status Code | Endpoints | Cause | Solution |
| --- | --- | --- | --- |
| 202 Accepted | `/v1/chat/deferred-completion/{request_id}` | Your deferred chat completion request is queued for processing, but the response is not available yet. | Wait for request processing and poll again. |
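Since `202` means the deferred result is not ready yet, the client should poll until the endpoint returns `200`. A minimal polling sketch, assuming a `GET` on the documented path and an illustrative `request_id` from your original deferred request (the polling interval is arbitrary):
```pythonRequests
import os
import time
import requests

request_id = "your-request-id"  # illustrative; returned by your original deferred request
url = f"https://api.x.ai/v1/chat/deferred-completion/{request_id}"
headers = {"Authorization": f"Bearer {os.getenv('XAI_API_KEY')}"}

while True:
    response = requests.get(url, headers=headers)
    if response.status_code == 200:
        print(response.json())
        break
    # 202 Accepted: still processing; wait before polling again
    time.sleep(5)
```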
## Bug Report
If you believe you have encountered a bug and would like to contribute to our development process, email [support@x.ai](mailto:support@x.ai?subject=API%20Bug%20Report) with the subject "API Bug Report", including your API request, response, and relevant logs.
You can also chat in the `#help` channel of our [xAI API Developer Discord](https://discord.gg/x-ai).
===/developers/faq/accounts===
#### FAQ
# Accounts
## How do I create an account for the API?
You can create an account at https://accounts.x.ai or https://console.x.ai. To link your X account automatically to your xAI account, choose to sign up with your X account.
You can create multiple accounts with the same email using different sign-in methods. When you sign up with a new sign-in method under the same email, we will ask whether you want to create a new account or link it to the existing account. We are not able to merge the content, subscriptions, etc. of different accounts.
## How do I update my xAI account email?
You can visit [xAI Accounts](https://accounts.x.ai). On the Account page, you can update your email.
## How do I add other sign-in methods?
Once you have signed-up for an account, you can add additional sign-in methods by going to [xAI Accounts](https://accounts.x.ai).
## I've forgotten my Multi-Factor Authentication (MFA) method, can you remove it?
You can generate your recovery codes at [xAI Accounts](https://accounts.x.ai) Security page.
We can't remove or reset your MFA method unless you have recovery codes due to security considerations. Please reach out to support@x.ai if you would like to delete the account instead.
## If I already have an account for Grok, can I use the same account for API access?
Yes, the account is shared between Grok and xAI API. You can manage the sign-in details at https://accounts.x.ai.
However, the billing is separate for Grok and xAI API. You can manage your billing for xAI API on [xAI Console](https://console.x.ai).
To manage billing for Grok, visit https://grok.com -> Settings -> Billing, or directly with Apple/Google if you made the
purchase via Apple App Store or Google Play.
## How do I manage my account?
You can visit [xAI Accounts](https://accounts.x.ai) to manage your account.
Please note the xAI account is different from the X account, and xAI cannot assist you with X account issues. Please
contact X via [X Help Center](https://help.x.com/) or Premium Support if you encounter any issues with your X account.
## I received an email of someone logging into my xAI account
xAI will send an email to you when someone logs into your xAI account. The login location is an approximation based on your IP address, which is dependent on your network setup and ISP and might not reflect exactly where the login happened.
If you think the login is not you, please [reset your password](https://accounts.x.ai/request-reset-password) and [clear your login sessions](https://accounts.x.ai/sessions). We also recommend all users to [add a multi-factor authentication method](https://accounts.x.ai/security).
## How do I delete my xAI account?
We are sorry to see you go!
You can visit [xAI Accounts](https://accounts.x.ai/account) to delete your account.
You can cancel the deletion within 30 days by logging in again to any xAI website and following the prompt to confirm restoring your account.
For privacy requests, please go to: https://privacy.x.ai.
===/developers/faq/billing===
#### FAQ
# Billing
## I'm having payment issues with an Indian payment card
Unfortunately, we cannot process Indian payment cards for our API service. We are working toward supporting them, but you might want to consider using a third-party API in the meantime. Payments for the Grok website and apps are handled differently and are not affected.
## When will I be charged?
* Prepaid Credits: If you choose to use prepaid credits, you’ll be charged when you buy them. These credits will be assigned to the team you select during purchase.
* Monthly Invoiced Billing: If you set your [invoiced spending limit](/console/billing#monthly-invoiced-billing-and-invoiced-billing-limit) above $0, any usage beyond your prepaid credits will be charged at the end of the month.
* API Usage: When you make API requests, the cost is calculated immediately. The amount is either deducted from your available prepaid credits or added to your monthly invoice if credits are exhausted.
Check out [Billing](/console/billing) for more information.
## Can you retroactively generate an invoice with new billing information?
We are unable to retroactively generate an invoice. Please ensure your billing information is correct on [xAI Console](https://console.x.ai) Billing -> Payment.
## Can prepaid API credits be refunded?
Unfortunately, we are not able to offer refunds on any prepaid credit purchase unless in regions required by law. For details, please visit https://x.ai/legal/terms-of-service-enterprise.
## My prompt token consumption from the API is different from the token count I get from the xAI Console Tokenizer or tokenize text endpoint
The inference endpoints add pre-defined tokens to help us process the request, so these tokens are added to the total prompt token consumption. For more information, see
[Estimating consumption with tokenizer on xAI Console or through API](/developers/rate-limits#estimating-consumption-with-tokenizer-on-xai-console-or-through-api).
===/developers/faq/general===
#### FAQ
# Frequently Asked Questions - General
Frequently asked questions by our customers.
### Does the xAI API provide access to live data?
Yes! Live data is available via the agentic server-side [Web Search](/developers/tools/web-search) and [X Search](/developers/tools/x-search) tools.
### How do I contact Sales?
For customers with bespoke needs or to request custom pricing, please fill out our [Grok for Business form](https://x.ai/grok/business). A member of our team will reach out with next steps. You can also email us at [sales@x.ai](mailto:sales@x.ai).
### Where are your Terms of Service and Privacy Policy?
Please refer to our [Legal Resources](https://x.ai/legal) for our Enterprise Terms of Service and Data Processing Addendum.
### Does xAI sell crypto tokens?
xAI is not affiliated with any cryptocurrency. We are aware of several scam websites that unlawfully use our name and logo.
===/developers/faq===
#### Resources
# FAQ - xAI Console
Frequently asked questions on using the [xAI Console](https://console.x.ai), including creating teams, managing roles, and configuring settings.
===/developers/faq/security===
#### FAQ
# Security
## Does xAI train on customers' API requests?
xAI never trains on your API inputs or outputs without your explicit permission.
API requests and responses are temporarily stored on our servers for 30 days in case they need to be audited for potential abuse or misuse. This data is automatically deleted after 30 days.
## Is the xAI API HIPAA compliant?
To inquire about a Business Associate Agreement (BAA), please complete our [BAA Questionnaire](https://forms.gle/YAEdX3XUp6MvdEXW9). A member of our team will review your responses and reach out with next steps.
## Is xAI GDPR and SOC II compliant?
We are SOC 2 Type 2 compliant. Customers with a signed NDA can refer to our [Trust Center](https://trust.x.ai/) for up-to-date information on our certifications and data governance.
## Do you have Audit Logs?
Team admins are able to view an audit log listing all user interactions with our API server. You can view it at [xAI Console -> Audit Log](https://console.x.ai/team/default/audit).
The admin can also search by Event ID, Description, or User to filter the results shown, for example, filtering by description matching `ListApiKeys`. You can also view the audit log across a range of dates with the time filter.
## How can I securely manage my API keys?
Treat your xAI API keys as sensitive information, like passwords or credit card details. Do not share keys between teammates to avoid unauthorized access. Store keys securely using environment variables or secret management tools. Avoid committing keys to public repositories or source code.
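For example, rather than hardcoding a key in source code, read it from an environment variable at runtime (a minimal sketch):
```pythonXAI
import os
from xai_sdk import Client

# Read the key from the environment instead of embedding it in source code
api_key = os.getenv("XAI_API_KEY")
if not api_key:
    raise RuntimeError("XAI_API_KEY is not set")

client = Client(api_key=api_key)
```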
Rotate keys regularly for added security. If you suspect a compromise:
1. Log into the xAI Console and ensure you are viewing the correct team, as API keys are tied to specific teams.
2. Navigate to the "API Keys" section via the sidebar.
3. In the API Keys table, click the vertical ellipsis (three dots) next to the key, then select "Disable key" to deactivate it temporarily or "Delete key" to remove it permanently.
4. Click the "Create API Key" button to generate a new one and update your applications.
xAI partners with GitHub's Secret Scanning program to detect leaked keys. If a leak is found, we disable the key and notify you via email. Monitor your account for unusual activity to stay protected.
===/developers/faq/team-management===
#### FAQ
# Team Management
## What are teams?
Teams are the level at which xAI tracks API usage, processes billing, and issues invoices.
* If you’re the team creator and don’t need a new team, you can rename your Personal Team and add members instead of creating a new one.
* Each team has **roles**:
* **Admin**: Can modify team name, billing details, and manage members.
* **Member**: Cannot make these changes.
* The team creator is automatically an Admin.
## Which team am I on?
When you sign up for xAI, you’re automatically assigned to a **Personal Team**, which you can view in the top bar of [xAI Console](https://console.x.ai).
## How can I manage teams and team members?
### Create a Team
1. Click the dropdown menu in the xAI Console.
2. Select **+ Create Team**.
3. Follow the on-screen instructions. You can edit these details later.
### Rename or Describe a Team
Admins can update the team name and description on the [Settings page](https://console.x.ai/team/default/settings).
### Manage Team Members
Admins can add or remove members by email on the [Users page](https://console.x.ai/team/default/users).
* Assign members as **Admin** or **Member**.
* If a user is removed, their API keys remain with the team.
### Delete a Team
Deleting a team removes its prepaid credits.
To permanently delete a team:
1. Go to the [Settings page](https://console.x.ai/team/default/settings).
2. Follow the instructions under **Delete Team**.
## How do I automatically add users to a team using my organization's email domain?
Admins can enable automatic team joining for users with a shared email domain:
1. Go to the [Settings page](https://console.x.ai/team/default/settings).
2. Add the domain under **Verified Domains**.
3. Add a `domain-verification` key to your domain’s DNS TXT record to verify ownership.
Users signing up with a verified domain email will automatically join the team.
===/developers/files/collections/api===
#### Files & Collections
# Using Collections via API
This guide walks you through managing collections programmatically using the xAI SDK and REST API.
## Creating a Management Key
To use the Collections API, you need to create a Management API Key with the `AddFileToCollection` permission. This permission is required for uploading documents to collections.
1. Navigate to the **Management Keys** section in the [xAI Console](https://console.x.ai)
2. Click on **Create Management Key**
3. Select the `AddFileToCollection` permission along with any other permissions you need
4. If you need to perform operations other than uploading documents (such as creating, updating, or deleting collections), enable the corresponding permissions in the **Collections Endpoint** group
5. Copy and securely store your Management API Key
Make sure to copy your Management API Key immediately after creation. You won't be able to see it again.
## Creating a collection
```python customLanguage="pythonXAI"
import os
from xai_sdk import Client
client = Client(
api_key=os.getenv("XAI_API_KEY"),
management_api_key=os.getenv("XAI_MANAGEMENT_API_KEY"),
timeout=3600,
)
collection = client.collections.create(
name="SEC Filings",
)
print(collection)
```
```javascript customLanguage="javascriptWithoutSDK"
const response = await fetch('https://management-api.x.ai/v1/collections', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${process.env.XAI_MANAGEMENT_API_KEY}`,
},
body: JSON.stringify({ collection_name: 'SEC Filings' }),
});
const collection = await response.json();
console.log(collection);
```
```bash
curl https://management-api.x.ai/v1/collections \
-X POST \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY" \
-d '{"collection_name": "SEC Filings"}'
```
## Listing collections
```python customLanguage="pythonXAI"
# ... Create client
collections = client.collections.list()
print(collections)
```
```javascript customLanguage="javascriptWithoutSDK"
const response = await fetch('https://management-api.x.ai/v1/collections', {
headers: {
'Authorization': `Bearer ${process.env.XAI_MANAGEMENT_API_KEY}`,
},
});
const collections = await response.json();
console.log(collections);
```
```bash
curl https://management-api.x.ai/v1/collections \
-H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY"
```
## Viewing collection configuration
```python customLanguage="pythonXAI"
# ... Create client
collection = client.collections.get("collection_dbc087b1-6c99-493d-86c6-b401fee34a9d")
print(collection)
```
```javascript customLanguage="javascriptWithoutSDK"
const collectionId = 'collection_dbc087b1-6c99-493d-86c6-b401fee34a9d';
const response = await fetch(`https://management-api.x.ai/v1/collections/${collectionId}`, {
headers: {
'Authorization': `Bearer ${process.env.XAI_MANAGEMENT_API_KEY}`,
},
});
const collection = await response.json();
console.log(collection);
```
```bash
curl https://management-api.x.ai/v1/collections/collection_dbc087b1-6c99-493d-86c6-b401fee34a9d \
-H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY"
```
## Updating collection configuration
```python customLanguage="pythonXAI"
# ... Create client
collection = client.collections.update(
"collection_dbc087b1-6c99-493d-86c6-b401fee34a9d",
name="SEC Filings (New)"
)
print(collection)
```
```javascript customLanguage="javascriptWithoutSDK"
const collectionId = 'collection_dbc087b1-6c99-493d-86c6-b401fee34a9d';
const response = await fetch(`https://management-api.x.ai/v1/collections/${collectionId}`, {
method: 'PUT',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${process.env.XAI_MANAGEMENT_API_KEY}`,
},
body: JSON.stringify({ collection_name: 'SEC Filings (New)' }),
});
const collection = await response.json();
console.log(collection);
```
```bash
curl https://management-api.x.ai/v1/collections/collection_dbc087b1-6c99-493d-86c6-b401fee34a9d \
-X PUT \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY" \
-d '{"collection_name": "SEC Filings (New)"}'
```
## Uploading documents
Uploading a document to a collection is a two-step process:
1. Upload the file to the xAI API
2. Add the uploaded file to your collection
```python customLanguage="pythonXAI"
# ... Create client
with open("tesla-20241231.html", "rb") as file:
file_data = file.read()
document = client.collections.upload_document(
collection_id="collection_dbc087b1-6c99-493d-86c6-b401fee34a9d",
name="tesla-20241231.html",
data=file_data,
)
print(document)
```
```javascript customLanguage="javascriptWithoutSDK"
const collectionId = 'collection_dbc087b1-6c99-493d-86c6-b401fee34a9d';
// Step 1: Upload file
const formData = new FormData();
formData.append('file', file);
formData.append('purpose', 'assistants');
const uploadResponse = await fetch('https://api.x.ai/v1/files', {
method: 'POST',
headers: { 'Authorization': `Bearer ${process.env.XAI_API_KEY}` },
body: formData,
});
const { id: fileId } = await uploadResponse.json();
// Step 2: Add to collection
await fetch(`https://management-api.x.ai/v1/collections/${collectionId}/documents/${fileId}`, {
method: 'POST',
headers: { 'Authorization': `Bearer ${process.env.XAI_MANAGEMENT_API_KEY}` },
});
```
```bash
# Step 1: Upload file
curl https://api.x.ai/v1/files \
-H "Authorization: Bearer $XAI_API_KEY" \
-F file=@tesla-20241231.html
# Step 2: Add file to collection (use file_id from step 1)
curl -X POST https://management-api.x.ai/v1/collections/$COLLECTION_ID/documents/$FILE_ID \
-H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY"
```
### Uploading with metadata fields
If your collection has [metadata fields](/developers/files/collections/metadata) defined (set via `field_definitions` when the collection is created or updated; see the linked metadata page for details), include them using the `fields` parameter:
```python customLanguage="pythonXAI"
# ... Create client
with open("paper.pdf", "rb") as file:
file_data = file.read()
document = client.collections.upload_document(
collection_id="collection_dbc087b1-6c99-493d-86c6-b401fee34a9d",
name="paper.pdf",
data=file_data,
fields={
"author": "Sandra Kim",
"year": "2024",
"title": "Q3 Revenue Analysis"
},
)
print(document)
```
```bash
curl https://management-api.x.ai/v1/collections/collection_dbc087b1-6c99-493d-86c6-b401fee34a9d/documents \
-H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY" \
-F "name=paper.pdf" \
-F "data=@paper.pdf" \
-F "content_type=application/pdf" \
-F 'fields={"author": "Sandra Kim", "year": "2024", "title": "Q3 Revenue Analysis"}'
```
## Searching documents
You can also search documents using the Responses API with the `file_search` tool. See the [Collections Search Tool](/developers/tools/collections-search) guide for more details.
```python customLanguage="pythonXAI"
# ... Create client
response = client.collections.search(
query="What were the key revenue drivers based on the SEC filings?",
collection_ids=["collection_dbc087b1-6c99-493d-86c6-b401fee34a9d"],
)
print(response)
```
```javascript customLanguage="javascriptWithoutSDK"
const response = await fetch('https://api.x.ai/v1/documents/search', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${process.env.XAI_API_KEY}`,
},
body: JSON.stringify({
query: 'What were the key revenue drivers based on the SEC filings?',
source: {
collection_ids: ['collection_dbc087b1-6c99-493d-86c6-b401fee34a9d'],
},
}),
});
const results = await response.json();
console.log(results);
```
```bash
curl https://api.x.ai/v1/documents/search \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-d '{
"query": "What were the key revenue drivers based on the SEC filings?",
"source": {
"collection_ids": ["collection_dbc087b1-6c99-493d-86c6-b401fee34a9d"]
}
}'
```
### Search modes
There are three search methods available:
* **Keyword search**
* **Semantic search**
* **Hybrid search** (combines both keyword and semantic methods)
By default, the system uses hybrid search, which generally delivers the best and most comprehensive results.
| Mode | Description | Best for | Drawbacks |
|------|-------------|----------|-----------|
| Keyword | Searches for exact matches of specified words, phrases, or numbers | Precise terms (e.g., account numbers, dates, specific financial figures) | May miss contextually relevant content |
| Semantic | Understands meaning and context to find conceptually related content | Discovering general ideas, topics, or intent even when exact words differ | Less precise for specific terms |
| Hybrid | Combines keyword and semantic search for broader and more accurate results | Most real-world use cases | Slightly higher latency |
The hybrid approach balances precision and recall, making it the recommended default for the majority of queries.
An example that sets hybrid mode explicitly:
```bash
curl https://api.x.ai/v1/documents/search \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-d '{
"query": "What were the key revenue drivers based on the SEC filings?",
"source": {
"collection_ids": [
"collection_dbc087b1-6c99-493d-86c6-b401fee34a9d"
]
},
"retrieval_mode": {"type": "hybrid"}
}'
```
You can set `"retrieval_mode": {"type": "keyword"}` for keyword search and `"retrieval_mode": {"type": "semantic"}` for semantic search.
## Deleting a document
```python customLanguage="pythonXAI"
# ... Create client
client.collections.remove_document(
collection_id="collection_dbc087b1-6c99-493d-86c6-b401fee34a9d",
file_id="file_55a709d4-8edc-4f83-84d9-9f04fe49f832",
)
```
```javascript customLanguage="javascriptWithoutSDK"
const collectionId = 'collection_dbc087b1-6c99-493d-86c6-b401fee34a9d';
const fileId = 'file_55a709d4-8edc-4f83-84d9-9f04fe49f832';
await fetch(`https://management-api.x.ai/v1/collections/${collectionId}/documents/${fileId}`, {
method: 'DELETE',
headers: { 'Authorization': `Bearer ${process.env.XAI_MANAGEMENT_API_KEY}` },
});
```
```bash
curl https://management-api.x.ai/v1/collections/collection_dbc087b1-6c99-493d-86c6-b401fee34a9d/documents/file_55a709d4-8edc-4f83-84d9-9f04fe49f832 \
-X DELETE \
-H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY"
```
## Deleting a collection
```python customLanguage="pythonXAI"
# ... Create client
client.collections.delete(collection_id="collection_dbc087b1-6c99-493d-86c6-b401fee34a9d")
```
```javascript customLanguage="javascriptWithoutSDK"
const collectionId = 'collection_dbc087b1-6c99-493d-86c6-b401fee34a9d';
await fetch(`https://management-api.x.ai/v1/collections/${collectionId}`, {
method: 'DELETE',
headers: { 'Authorization': `Bearer ${process.env.XAI_MANAGEMENT_API_KEY}` },
});
```
```bash
curl https://management-api.x.ai/v1/collections/collection_dbc087b1-6c99-493d-86c6-b401fee34a9d \
-X DELETE \
-H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY"
```
## Next Steps
[Metadata Fields →](/developers/files/collections/metadata) - Learn how to attach structured attributes to documents for filtering and contextual embeddings
===/developers/files/collections/metadata===
#### Files & Collections
# Metadata Fields
Metadata fields allow you to attach structured attributes to documents in a collection. These fields enable:
* **Filtered retrieval** — Narrow search results to documents matching specific criteria (e.g., `author="Sandra Kim"`)
* **Contextual embeddings** — Inject metadata into chunks to improve retrieval accuracy (e.g., prepending document title to each chunk)
* **Data integrity constraints** — Enforce required fields or uniqueness across documents
## Creating a Collection with Metadata Fields
Define metadata fields using `field_definitions` when creating a collection:
```bash
curl -X POST "https://management-api.x.ai/v1/collections" \
-H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"collection_name": "research_papers",
"field_definitions": [
{ "key": "author", "required": true },
{ "key": "year", "required": true, "unique": true },
{ "key": "title", "inject_into_chunk": true }
]
}'
```
### Field Definition Options
| Option | Description |
|--------|-------------|
| `required` | Document uploads must include this field. Defaults to `false`. |
| `unique` | Only one document in the collection can have a given value for this field. Defaults to `false`. |
| `inject_into_chunk` | Prepends this field's value to every embedding chunk, improving retrieval by providing context. Defaults to `false`. |
## Uploading Documents with Metadata
Include metadata as a JSON object in the `fields` parameter:
```bash
curl -X POST "https://management-api.x.ai/v1/collections/{collection_id}/documents" \
-H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY" \
-F "name=paper.pdf" \
-F "data=@paper.pdf" \
-F "content_type=application/pdf" \
-F 'fields={"author": "Sandra Kim", "year": "2024", "title": "Q3 Revenue Analysis"}'
```
## Filtering Documents in Search
Use the `filter` parameter to restrict search results based on metadata values. The filter uses AIP-160 syntax:
```bash
curl -X POST "https://api.x.ai/v1/documents/search" \
-H "Authorization: Bearer $XAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"query": "revenue growth",
"source": { "collection_ids": ["collection_xxx"] },
"filter": "author=\"Sandra Kim\" AND year>=2020"
}'
```
### Supported Filter Operators
| Operator | Example | Description |
|----------|---------|-------------|
| `=` | `author="Jane"` | Equals |
| `!=` | `status!="draft"` | Not equals |
| `<`, `>`, `<=`, `>=` | `year>=2020` | Numeric/lexical comparison |
| `AND` | `a="x" AND b="y"` | Both conditions must match |
| `OR` | `a="x" OR a="y"` | Either condition matches |
`OR` has higher precedence than `AND`. Use parentheses for clarity: `a="x" AND (b="y" OR b="z")`.
Wildcard matching (e.g., `author="E*"`) is not supported. All string comparisons are exact matches.
Filtering on fields that don't exist in your documents returns no results. Double-check that field names match your collection's `field_definitions`.
## AIP-160 Filter String Examples
### Basic Examples
```bash
# Equality (double or single quotes for strings with spaces)
author="Sandra Kim"
author='Sandra Kim'
# Equality (no quotes needed for simple values)
year=2024
status=active
# Not equal
status!="archived"
status!='archived'
```
### Comparison Operators
```bash
# Numeric comparisons
year>=2020
year>2019
score<=0.95
price<100
# Combined comparisons (range)
year>=2020 AND year<=2024
```
### Logical Operators
```bash
# AND - both conditions must match
author="Sandra Kim" AND year=2024
# OR - either condition matches
status="pending" OR status="in_progress"
# Combined (OR has higher precedence than AND)
department="Engineering" AND status="active" OR status="pending"
# Use parentheses for clarity
department="Engineering" AND (status="active" OR status="pending")
```
### Complex Examples
```bash
# Multiple conditions
author="Sandra Kim" AND year>=2020 AND status!="draft"
# Nested logic with parentheses
(author="Sandra Kim" OR author="John Doe") AND year>=2020
# Multiple fields with mixed operators
category="finance" AND (year=2023 OR year=2024) AND status!="archived"
```
## Quick Reference
| Use Case | Filter String |
|----------|---------------|
| Exact match | `author="Sandra Kim"` |
| Numeric comparison | `year>=2020` |
| Not equal | `status!="archived"` |
| Multiple conditions | `author="Sandra Kim" AND year=2024` |
| Either condition | `status="pending" OR status="draft"` |
| Grouped logic | `(status="active" OR status="pending") AND year>=2020` |
| Complex filter | `category="finance" AND year>=2020 AND status!="archived"` |
===/developers/files/collections===
#### Files & Collections
# Collections
Collections offers xAI API users a robust set of tools and methods to seamlessly integrate their enterprise requirements and internal knowledge bases with the xAI API. Whether you're building a RAG application or need to search across large document sets, Collections provides the infrastructure to manage and query your content.
**Looking for Files?** If you want to attach files directly to chat messages for conversation context, see [Files](/developers/files). Collections are different—they provide persistent document storage with semantic search across many documents.
## Core Concepts
There are two entities that users can create within the Collections service:
* **File** — A single entity of a user-uploaded file.
* **Collection** — A group of files linked together, with an embedding index for efficient retrieval.
* When you create a collection you have the option to automatically generate embeddings for any files uploaded to that collection. You can then perform semantic search across files in multiple collections.
* A single file can belong to multiple collections but must be part of at least one collection.
## What You Can Do
With Collections, you can:
* **Create collections** to organize your documents
* **Upload documents** in various formats (HTML, PDF, text, etc.)
* **Search semantically** across your documents using natural language queries
* **Configure chunking and embeddings** to optimize retrieval
* **Manage documents** by listing, updating, and deleting them
## Getting Started
Choose how you want to work with Collections:
* [Using the Console →](/console/collections) - Create collections and upload documents through the xAI Console interface
* [Using the API →](/developers/files/collections/api) - Programmatically manage collections with the SDK and REST API
## Metadata Fields
Collections support **metadata fields** — structured attributes you can attach to documents for enhanced retrieval and data integrity:
* **Filtered retrieval** — Narrow search results to documents matching specific criteria (e.g., `author="Sandra Kim"`)
* **Contextual embeddings** — Inject metadata into chunks to improve retrieval accuracy (e.g., prepending document title to each chunk)
* **Data integrity constraints** — Enforce required fields or uniqueness across documents
When creating a collection, define metadata fields with options like `required`, `unique`, and `inject_into_chunk` to control how metadata is validated and used during search.
[Learn more about metadata fields →](/developers/files/collections/metadata)
## Usage Limits
To upload files and add them to a collection, you must have credits in your account.
* **Maximum file size**: 100 MB
* **Maximum number of files**: 100,000 files uploaded globally
* **Maximum total size**: 100 GB
Please [contact us](https://x.ai/contact) to increase any of these limits.
## Data Privacy
We do not use user data stored on Collections for model training purposes.
## Supported MIME Types
While we support any `UTF-8` encoded text file, we also have special file conversion and chunking techniques for certain MIME types.
The following is a non-exhaustive list of the MIME types we support:
* application/csv
* application/dart
* application/ecmascript
* application/epub
* application/epub+zip
* application/json
* application/ms-java
* application/msword
* application/pdf
* application/typescript
* application/vnd.adobe.pdf
* application/vnd.curl
* application/vnd.dart
* application/vnd.jupyter
* application/vnd.ms-excel
* application/vnd.ms-outlook
* application/vnd.oasis.opendocument.text
* application/vnd.openxmlformats-officedocument.presentationml.presentation
* application/vnd.openxmlformats-officedocument.presentationml.slide
* application/vnd.openxmlformats-officedocument.presentationml.slideshow
* application/vnd.openxmlformats-officedocument.presentationml.template
* application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
* application/vnd.openxmlformats-officedocument.spreadsheetml.template
* application/vnd.openxmlformats-officedocument.wordprocessingml.document
* application/x-csh
* application/x-epub+zip
* application/x-hwp
* application/x-hwp-v5
* application/x-latex
* application/x-pdf
* application/x-php
* application/x-powershell
* application/x-sh
* application/x-shellscript
* application/x-tex
* application/x-zsh
* application/xhtml
* application/xml
* application/zip
* text/cache-manifest
* text/calendar
* text/css
* text/csv
* text/html
* text/javascript
* text/jsx
* text/markdown
* text/n3
* text/php
* text/plain
* text/rtf
* text/tab-separated-values
* text/troff
* text/tsv
* text/tsx
* text/turtle
* text/uri-list
* text/vcard
* text/vtt
* text/x-asm
* text/x-bibtex
* text/x-c
* text/x-c++hdr
* text/x-c++src
* text/x-chdr
* text/x-coffeescript
* text/x-csh
* text/x-csharp
* text/x-csrc
* text/x-d
* text/x-diff
* text/x-emacs-lisp
* text/x-erlang
* text/x-go
* text/x-haskell
* text/x-java
* text/x-java-properties
* text/x-java-source
* text/x-kotlin
* text/x-lisp
* text/x-lua
* text/x-objcsrc
* text/x-pascal
* text/x-perl
* text/x-perl-script
* text/x-python
* text/x-python-script
* text/x-r-markdown
* text/x-rst
* text/x-ruby-script
* text/x-rust
* text/x-sass
* text/x-scala
* text/x-scheme
* text/x-script.python
* text/x-scss
* text/x-sh
* text/x-sql
* text/x-swift
* text/x-tcl
* text/x-tex
* text/x-vbasic
* text/x-vcalendar
* text/xml
* text/xml-dtd
* text/yaml
===/developers/files/managing-files===
#### Files & Collections
# Managing Files
The Files API provides a complete set of operations for managing your files. Before using files in chat conversations, you need to upload them using one of the methods described below.
## Uploading Files
You can upload files in several ways: from a file path, raw bytes, BytesIO object, or an open file handle.
### Upload from File Path
```pythonXAI
import os
from xai_sdk import Client
client = Client(api_key=os.getenv("XAI_API_KEY"))
# Upload a file from disk
file = client.files.upload("/path/to/your/document.pdf")
print(f"File ID: {file.id}")
print(f"Filename: {file.filename}")
print(f"Size: {file.size} bytes")
print(f"Created at: {file.created_at}")
```
```pythonOpenAISDK
import os
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("XAI_API_KEY"),
base_url="https://api.x.ai/v1",
)
# Upload a file
with open("/path/to/your/document.pdf", "rb") as f:
file = client.files.create(
file=f,
purpose="assistants"
)
print(f"File ID: {file.id}")
print(f"Filename: {file.filename}")
```
```pythonRequests
import os
import requests
url = "https://api.x.ai/v1/files"
headers = {
"Authorization": f"Bearer {os.getenv('XAI_API_KEY')}"
}
with open("/path/to/your/document.pdf", "rb") as f:
files = {"file": f}
data = {"purpose": "assistants"}
response = requests.post(url, headers=headers, files=files, data=data)
file_data = response.json()
print(f"File ID: {file_data['id']}")
print(f"Filename: {file_data['filename']}")
```
```bash
curl https://api.x.ai/v1/files \
-H "Authorization: Bearer $XAI_API_KEY" \
-F file=@/path/to/your/document.pdf \
-F purpose=assistants
```
### Upload from Bytes
```pythonXAI
import os
from xai_sdk import Client
client = Client(api_key=os.getenv("XAI_API_KEY"))
# Upload file content directly from bytes
content = b"This is my document content.\nIt can span multiple lines."
file = client.files.upload(content, filename="document.txt")
print(f"File ID: {file.id}")
print(f"Filename: {file.filename}")
```
### Upload from File Object
```pythonXAI
import os
from xai_sdk import Client
client = Client(api_key=os.getenv("XAI_API_KEY"))
# Upload from an open file handle
file = client.files.upload(open("document.pdf", "rb"), filename="document.pdf")
print(f"File ID: {file.id}")
print(f"Filename: {file.filename}")
```
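### Upload from BytesIO
As mentioned above, you can also upload from a `BytesIO` object. A short sketch mirroring the file-object upload:
```pythonXAI
import io
import os
from xai_sdk import Client

client = Client(api_key=os.getenv("XAI_API_KEY"))

# Wrap in-memory bytes in a file-like BytesIO object and upload it
buffer = io.BytesIO(b"This is my document content.")
file = client.files.upload(buffer, filename="document.txt")
print(f"File ID: {file.id}")
print(f"Filename: {file.filename}")
```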
## Upload with Progress Tracking
Track upload progress for large files using callbacks or progress bars.
### Custom Progress Callback
```pythonXAI
import os
from xai_sdk import Client
client = Client(api_key=os.getenv("XAI_API_KEY"))
# Define a custom progress callback
def progress_callback(bytes_uploaded: int, total_bytes: int):
percentage = (bytes_uploaded / total_bytes) * 100 if total_bytes else 0
mb_uploaded = bytes_uploaded / (1024 * 1024)
mb_total = total_bytes / (1024 * 1024)
print(f"Progress: {mb_uploaded:.2f}/{mb_total:.2f} MB ({percentage:.1f}%)")
# Upload with progress tracking
file = client.files.upload(
"/path/to/large-file.pdf",
on_progress=progress_callback
)
print(f"Successfully uploaded: {file.filename}")
```
### Progress Bar with tqdm
```pythonXAI
import os
from xai_sdk import Client
from tqdm import tqdm
client = Client(api_key=os.getenv("XAI_API_KEY"))
file_path = "/path/to/large-file.pdf"
total_bytes = os.path.getsize(file_path)
# Upload with tqdm progress bar
with tqdm(total=total_bytes, unit="B", unit_scale=True, desc="Uploading") as pbar:
file = client.files.upload(
file_path,
on_progress=pbar.update
)
print(f"Successfully uploaded: {file.filename}")
```
## Listing Files
Retrieve a list of your uploaded files with pagination and sorting options.
### Available Options
* **`limit`**: Maximum number of files to return. If not specified, uses server default of 100.
* **`order`**: Sort order for the files. Either `"asc"` (ascending) or `"desc"` (descending).
* **`sort_by`**: Field to sort by. Options: `"created_at"`, `"filename"`, or `"size"`.
* **`pagination_token`**: Token for fetching the next page of results.
```pythonXAI
import os
from xai_sdk import Client
client = Client(api_key=os.getenv("XAI_API_KEY"))
# List files with pagination and sorting
response = client.files.list(
limit=10,
order="desc",
sort_by="created_at"
)
for file in response.data:
print(f"File: {file.filename} (ID: {file.id}, Size: {file.size} bytes)")
```
```pythonOpenAISDK
import os
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("XAI_API_KEY"),
base_url="https://api.x.ai/v1",
)
# List files
files = client.files.list()
for file in files.data:
print(f"File: {file.filename} (ID: {file.id})")
```
```pythonRequests
import os
import requests
url = "https://api.x.ai/v1/files"
headers = {
"Authorization": f"Bearer {os.getenv('XAI_API_KEY')}"
}
response = requests.get(url, headers=headers)
files = response.json()
for file in files.get("data", []):
print(f"File: {file['filename']} (ID: {file['id']})")
```
```bash
curl https://api.x.ai/v1/files \
-H "Authorization: Bearer $XAI_API_KEY"
```
## Getting File Metadata
Retrieve detailed information about a specific file.
```pythonXAI
import os
from xai_sdk import Client
client = Client(api_key=os.getenv("XAI_API_KEY"))
# Get file metadata by ID
file = client.files.get("file-abc123")
print(f"Filename: {file.filename}")
print(f"Size: {file.size} bytes")
print(f"Created: {file.created_at}")
print(f"Team ID: {file.team_id}")
```
```pythonOpenAISDK
import os
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("XAI_API_KEY"),
base_url="https://api.x.ai/v1",
)
# Get file metadata
file = client.files.retrieve("file-abc123")
print(f"Filename: {file.filename}")
print(f"Size: {file.bytes} bytes")
```
```pythonRequests
import os
import requests
file_id = "file-abc123"
url = f"https://api.x.ai/v1/files/{file_id}"
headers = {
"Authorization": f"Bearer {os.getenv('XAI_API_KEY')}"
}
response = requests.get(url, headers=headers)
file = response.json()
print(f"Filename: {file['filename']}")
print(f"Size: {file['bytes']} bytes")
```
```bash
curl https://api.x.ai/v1/files/file-abc123 \
-H "Authorization: Bearer $XAI_API_KEY"
```
## Getting File Content
Download the actual content of a file.
```pythonXAI
import os
from xai_sdk import Client
client = Client(api_key=os.getenv("XAI_API_KEY"))
# Get file content
content = client.files.content("file-abc123")
# Content is returned as bytes
print(f"Content length: {len(content)} bytes")
print(f"Content preview: {content[:100]}")
```
```pythonOpenAISDK
import os
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("XAI_API_KEY"),
base_url="https://api.x.ai/v1",
)
# Get file content
content = client.files.content("file-abc123")
print(f"Content: {content.text}")
```
```pythonRequests
import os
import requests
file_id = "file-abc123"
url = f"https://api.x.ai/v1/files/{file_id}/content"
headers = {
"Authorization": f"Bearer {os.getenv('XAI_API_KEY')}"
}
response = requests.get(url, headers=headers)
content = response.content
print(f"Content length: {len(content)} bytes")
```
```bash
curl https://api.x.ai/v1/files/file-abc123/content \
-H "Authorization: Bearer $XAI_API_KEY"
```
## Deleting Files
Remove files when they're no longer needed.
```pythonXAI
import os
from xai_sdk import Client
client = Client(api_key=os.getenv("XAI_API_KEY"))
# Delete a file
delete_response = client.files.delete("file-abc123")
print(f"Deleted: {delete_response.deleted}")
print(f"File ID: {delete_response.id}")
```
```pythonOpenAISDK
import os
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("XAI_API_KEY"),
base_url="https://api.x.ai/v1",
)
# Delete a file
delete_response = client.files.delete("file-abc123")
print(f"Deleted: {delete_response.deleted}")
print(f"File ID: {delete_response.id}")
```
```pythonRequests
import os
import requests
file_id = "file-abc123"
url = f"https://api.x.ai/v1/files/{file_id}"
headers = {
"Authorization": f"Bearer {os.getenv('XAI_API_KEY')}"
}
response = requests.delete(url, headers=headers)
result = response.json()
print(f"Deleted: {result['deleted']}")
print(f"File ID: {result['id']}")
```
```bash
curl -X DELETE https://api.x.ai/v1/files/file-abc123 \
-H "Authorization: Bearer $XAI_API_KEY"
```
## Limitations and Considerations
### File Size Limits
* **Maximum file size**: 48 MB per file
* **Processing time**: Larger files may take longer to process
### File Retention
* **Cleanup**: Delete files when no longer needed to manage storage
* **Access**: Files are scoped to your team/organization
### Supported Formats
While many text-based formats are supported, the system works best with:
* Structured documents (with clear sections, headings)
* Plain text and markdown
* Documents with clear information hierarchy
Supported file types include:
* Plain text files (.txt)
* Markdown files (.md)
* Code files (.py, .js, .java, etc.)
* CSV files (.csv)
* JSON files (.json)
* PDF documents (.pdf)
* And many other text-based formats
## Next Steps
Now that you know how to manage files, learn how to use them in [chat conversations](/developers/model-capabilities/files/chat-with-files).
===/developers/files===
#### Files & Collections
# Files
The Files API enables you to upload documents and use them in chat conversations with Grok. When you attach files to a chat message, the system automatically activates the `attachment_search` tool, transforming your request into an agentic workflow where Grok can intelligently search through and reason over your documents to answer questions.
You can view more information at [Files API Reference](/developers/rest-api-reference/files).
**Looking for Collections?** If you need persistent document storage with semantic search across many documents, see [Collections](/developers/files/collections). Files are different—they're for attaching documents to chat conversations for immediate context.
## How Files Work with Chat
Behind the scenes, when you attach files to a chat message, the xAI API implicitly adds the `attachment_search` server-side tool to your request. This means:
1. **Automatic Agentic Behavior**: Your chat request becomes an agentic request, where Grok autonomously searches through your documents
2. **Intelligent Document Analysis**: The model can reason over document content, extract relevant information, and synthesize answers
3. **Multi-Document Support**: You can attach multiple files, and Grok will search across all of them
This seamless integration allows you to simply attach files and ask questions—the complexity of document search and retrieval is handled automatically by the agentic workflow.
## Understanding Document Search
When you attach files to a chat message, the xAI API automatically activates the `attachment_search` [server-side tool](/developers/tools/overview). This transforms your request into an [agentic workflow](/developers/tools/overview#how-it-works) where Grok:
1. **Analyzes your query** to understand what information you're seeking
2. **Searches the documents** intelligently, finding relevant sections across all attached files
3. **Extracts and synthesizes information** from multiple sources if needed
4. **Provides a comprehensive answer** with the context from your documents
### Agentic Workflow
Just like other agentic tools (web search, X search, code execution), document search operates autonomously:
* **Multiple searches**: The model may search documents multiple times with different queries to find comprehensive information
* **Reasoning**: The model uses its reasoning capabilities to decide what to search for and how to interpret the results
* **Streaming visibility**: In streaming mode, you can see when the model is searching your documents via tool call notifications
### Token Usage with Files
File-based chats follow similar token patterns to other agentic requests:
* **Prompt tokens**: Include the conversation history and internal processing. Document content is processed efficiently
* **Reasoning tokens**: Used for planning searches and analyzing document content
* **Completion tokens**: The final answer text
* **Cached tokens**: Repeated document content benefits from prompt caching for efficiency
The actual document content is processed by the server-side tool and doesn't directly appear in the message history, keeping token usage optimized.
### Pricing
Document search is billed per tool invocation, in addition to standard token costs. Each time the model searches your documents, it counts as one tool invocation. For complete pricing details, see the [Tools Pricing](/developers/models#tools-pricing) table.
## Getting Started
To use files with Grok, you'll need to:
1. **[Upload and manage files](/developers/files/managing-files)** - Learn how to upload, list, retrieve, and delete files using the Files API
2. **[Chat with files](/developers/model-capabilities/files/chat-with-files)** - Discover how to attach files to chat messages and ask questions about your documents
## Quick Example
Here's a quick example of the complete workflow:
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user, file
client = Client(api_key=os.getenv("XAI_API_KEY"))
# 1. Upload a document
document_content = b"""Quarterly Sales Report - Q4 2024
Total Revenue: $5.2M
Growth: +18% YoY
"""
uploaded_file = client.files.upload(document_content, filename="sales.txt")
# 2. Chat with the file
chat = client.chat.create(model="grok-4-fast")
chat.append(user("What was the total revenue?", file(uploaded_file.id)))
# 3. Get the answer
response = chat.sample()
print(response.content) # "The total revenue was $5.2M"
# 4. Clean up
client.files.delete(uploaded_file.id)
```
## Key Features
### Multiple File Support
Attach [multiple documents](/developers/model-capabilities/files/chat-with-files#multiple-file-attachments) to a single query and Grok will search across all of them to find relevant information.
### Multi-Turn Conversations
File context persists across [conversation turns](/developers/model-capabilities/files/chat-with-files#multi-turn-conversations-with-files), allowing you to ask follow-up questions without re-attaching files.
### Code Execution Integration
Combine files with the [code execution tool](/developers/model-capabilities/files/chat-with-files#combining-files-with-code-execution) to perform advanced data analysis, statistical computations, and transformations on your uploaded data. The model can write and execute Python code that processes your files directly.
## Limitations
* **File size**: Maximum 48 MB per file
* **No batch requests**: File attachments with document search are agentic requests and do not support batch mode (`n > 1`)
* **Agentic models only**: Requires models that support agentic tool calling (e.g., `grok-4-fast`, `grok-4`)
* **Supported file formats**:
* Plain text files (.txt)
* Markdown files (.md)
* Code files (.py, .js, .java, etc.)
* CSV files (.csv)
* JSON files (.json)
* PDF documents (.pdf)
* And many other text-based formats
===/developers/grpc-api-reference===
# gRPC API Reference
The xAI gRPC API is a robust, high-performance gRPC interface designed for seamless integration into existing systems.
The base URL for all services is `api.x.ai`. For all services, you must authenticate with the header `Authorization: Bearer <your xAI API key>`.
Visit [xAI API Protobuf Definitions](https://github.com/xai-org/xai-proto) to view and download our protobuf definitions.
***
===/developers/introduction===
#### Introduction
# What is Grok?
Grok is a family of Large Language Models (LLMs) developed by [xAI](https://x.ai).
Inspired by the Hitchhiker's Guide to the Galaxy, Grok is a maximally truth-seeking AI that provides insightful, unfiltered truths about the universe.
xAI offers an API for developers to programmatically interact with our Grok [models](/developers/models). The same models power our consumer-facing services such as [Grok.com](https://grok.com), the [iOS](https://apps.apple.com/us/app/grok/id6670324846) and [Android](https://play.google.com/store/apps/details?id=ai.x.grok) apps, as well as the [Grok in X experience](https://grok.x.com).
## What is the xAI API? How is it different from Grok in other services?
The xAI API is a toolkit for developers to integrate xAI's Grok models into their own applications; it provides the building blocks to create new AI experiences.
To get started building with the xAI API, please head to [The Hitchhiker's Guide to Grok](/developers/quickstart).
## xAI API vs Grok in other services
| Category | xAI API | Grok.com | Mobile Apps | Grok in 𝕏 |
|-------------------------------|----------------------------------|-----------------------------------|----------------------------|------------------------------------|
| **Accessible** | API (api.x.ai) | grok.com + PWA (Android) | App Store / Play Store | X.com + 𝕏 apps |
| **Billing** | xAI | xAI / 𝕏 | xAI / 𝕏 | 𝕏 |
| **Programming Required** | Yes | No | No | No |
| **Description** | Programmatic access for developers | Full-featured web AI assistant | Mobile AI assistant | X-integrated AI (fewer features) |
Because these are separate offerings, your purchase on X (e.g. X Premium) won't affect your service status on xAI API, and vice versa.
This documentation is intended for users of the xAI API.
===/developers/management-api-guide===
#### Key Information
# Using Management API
Some enterprise users may prefer to manage their account details programmatically rather than manually through the xAI Console. For this reason, we have developed a Management API to enable enterprise users to efficiently manage their team details.
You can read the endpoint specifications and descriptions at [Management API Reference](/developers/rest-api-reference/management).
You need to get a management key, which is separate from your API key, to use the management API. The management key can be obtained at [xAI Console](https://console.x.ai) -> Settings -> Management Keys.
The base URL is `https://management-api.x.ai`, which is also different from the inference API.
## Operations related to API Keys
You can create, list, update, and delete API keys via the management API.
You can also manage the access control lists (ACLs) associated with the API keys.
The available ACL types are:
* `api-key:model`
* `api-key:endpoint`
To enable all models and endpoints available to your team, use:
* `api-key:model:*`
* `api-key:endpoint:*`
Or, if you need to specify the particular endpoints available to the API key:
* `api-key:endpoint:chat` for chat and vision models
* `api-key:endpoint:image` for image generation models
And to specify models the API key has access to:
* `api-key:model:<model-name>`
### Create an API key
An example of creating an API key with all models and endpoints enabled, limited to 5 queries per second and 100 queries per minute, with no token restrictions:
```bash
curl https://management-api.x.ai/auth/teams/{teamId}/api-keys \
-X POST \
-H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY" \
-d '{
"name": "My API key",
"acls": ["api-key:model:*", "api-key:endpoint:*"],
"qps": 5,
"qpm": 100,
"tpm": null
}'
```
Set `tpm` to an integer to limit the number of tokens produced/consumed per minute. When the token rate limit is triggered, new requests will be rejected, while in-flight requests will continue processing.
The newly-created API key will be returned in the `"apiKey"` field of the response object. The API Key ID is returned as `"apiKeyId"` in the response body as well, which is useful for updating and deleting operations.
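As a variant, you can scope a key using the ACL strings described above. A sketch using Python's `requests` (the team ID and model name are illustrative; the model must be enabled for your team):
```pythonRequests
import os
import requests

team_id = "your-team-id"  # illustrative placeholder
response = requests.post(
    f"https://management-api.x.ai/auth/teams/{team_id}/api-keys",
    headers={"Authorization": f"Bearer {os.getenv('XAI_MANAGEMENT_API_KEY')}"},
    json={
        "name": "Chat-only key",
        # Restrict the key to the chat endpoint and a single model
        "acls": ["api-key:endpoint:chat", "api-key:model:grok-4-fast"],
        "qps": 5,
        "qpm": 100,
        "tpm": None,
    },
)
print(response.json())
```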
### List API keys
To retrieve a list of API keys from a team, you can run the following:
```bash
curl "https://management-api.x.ai/auth/teams/{teamId}/api-keys?pageSize=10&paginationToken=" \
-H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY"
```
You can customize the query parameters such as `pageSize` and `paginationToken`.
### Update an API key
You can update an API key after it has been created. For example, to update the `qpm` of an API key:
```bash
curl https://management-api.x.ai/auth/api-keys/{apiKeyId} \
-X PUT \
-H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY" \
-d '{
  "apiKey": {
    "qpm": 200
  },
  "fieldMask": "qpm"
}'
```
Or to update the `name` of an API key:
```bash
curl https://management-api.x.ai/auth/api-keys/{apiKeyId} \
-X PUT \
-H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY" \
-d '{
  "apiKey": {
    "name": "Updated API key"
  },
  "fieldMask": "name"
}'
```
### Delete an API key
You can also delete an API key with the following:
```bash
curl https://management-api.x.ai/auth/api-keys/{apiKeyId} \
-X DELETE \
-H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY"
```
### Check propagation status of API key across clusters
There could be a slight delay between creating an API key and the key being available for use across all clusters.
You can check the propagation status of the API key via the API:
```bash
curl https://management-api.x.ai/auth/api-keys/{apiKeyId}/propagation \
-H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY"
```
### List all models available for the team
You can list all the available models for a team with our management API as well.
The model names in the output can be used when setting the ACL string on an API key, as `api-key:model:<model-name>`.
```bash
curl https://management-api.x.ai/auth/teams/{teamId}/models \
-H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY"
```
## Access Control List (ACL) management
We also offer an endpoint to list the possible ACLs for a team. You can then apply these endpoint ACL strings to your API keys.
To view possible endpoint ACLs for a team's API keys:
```bash
curl https://management-api.x.ai/auth/teams/{teamId}/endpoints \
-H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY"
```
## Validate a management key
You can check if your key is a valid management key. If validation succeeds, the endpoint returns meta information about the management key.
This endpoint does not require any Access Control List (ACL) permissions.
```bash
curl https://management-api.x.ai/auth/management-keys/validation \
-H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY"
```
## Audit Logs
You can retrieve audit logs for your team. Audit events track changes to team settings, API keys, team membership, and other administrative actions.
### List audit events
To retrieve audit events for a team:
```bash
curl "https://management-api.x.ai/audit/teams/{teamId}/events?pageSize=10" \\
-H "Authorization: Bearer "
```
You can customize the query parameters:
* `pageSize` - Number of events per page
* `pageToken` - Token for fetching the next page of results
* `eventFilter.userId` - Filter events to a specific user
* `eventFilter.query` - Full-text search in event descriptions
* `eventTimeFrom` - Filter events from a specific time (ISO 8601 format)
* `eventTimeTo` - Filter events up to a specific time (ISO 8601 format)
To fetch the next page of results, use the `nextPageToken` from the response:
```bash
curl "https://management-api.x.ai/audit/teams/{teamId}/events?pageSize=10&pageToken={nextPageToken}" \\
-H "Authorization: Bearer "
```
Example with time filter:
```bash
curl "https://management-api.x.ai/audit/teams/{teamId}/events?pageSize=50&eventTimeFrom=2025-01-01T00:00:00Z" \\
-H "Authorization: Bearer "
```
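To walk all pages programmatically, loop on `nextPageToken` until it is absent. A minimal sketch; the `events` response field name and the `XAI_MANAGEMENT_KEY` environment variable are assumptions to verify against the actual response:
```python
import os
import requests

TEAM_ID = "your-team-id"  # placeholder
headers = {"Authorization": f"Bearer {os.environ['XAI_MANAGEMENT_KEY']}"}  # assumed env var name

page_token = None
while True:
    params = {"pageSize": 50}
    if page_token:
        params["pageToken"] = page_token
    resp = requests.get(
        f"https://management-api.x.ai/audit/teams/{TEAM_ID}/events",
        params=params,
        headers=headers,
    )
    resp.raise_for_status()
    body = resp.json()
    for event in body.get("events", []):  # "events" is an assumed field name
        print(event)
    page_token = body.get("nextPageToken")
    if not page_token:
        break
```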
===/developers/migration/models===
#### Key Information
# Migrating to New Models
As we release newer, more advanced models, we are focusing resources on supporting customers with these models and will
be phasing out older versions.
You will see a `deprecated` tag next to deprecated model names on the [xAI Console](https://console.x.ai) models page. You should consider moving to a newer model when the model of your choice is deprecated.
We may transition a `deprecated` model to `obsolete` and discontinue serving the model across our services.
An `obsolete` model will be removed from our [Models and Pricing](../models) page as well as from [xAI Console](https://console.x.ai).
## Moving from an older generation model
When you move from an older model generation to a newer one, you usually won't need to make significant changes to how you use the API. In your request body, switch the `"model"` field from the deprecated model to a current model listed on the [xAI Console](https://console.x.ai) models page.
The newer models are more performant, but you should verify that your prompts and other parameters work with the new model and adjust them if necessary.
## Moving to the latest endpoints
When you are setting up to use new models, it's also a good idea to ensure you're using the latest endpoints. The latest endpoints have more stable support for model functionality. Endpoints marked `legacy` might not receive updates that support newer functionality.
In general, the following endpoints are recommended:
* Text and image input, text output: [Chat Completions](/developers/rest-api-reference/inference/chat#chat-completions) - `/v1/chat/completions`
* Text input, image output: [Image Generation](/developers/rest-api-reference/inference/images#image-generation) - `/v1/image/generations`
* Tokenization: [Tokenize Text](/developers/rest-api-reference/inference/other#tokenize-text) - `/v1/tokenize-text`
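In practice, the migration is usually a one-line change: only the `"model"` field in your request body changes. A minimal sketch against the Chat Completions endpoint (the model name here is illustrative; pick a current model from the Console models page):
```python
import os
import requests

resp = requests.post(
    "https://api.x.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['XAI_API_KEY']}"},
    json={
        "model": "grok-4-fast",  # previously a deprecated model; only this field changes
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```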
===/developers/model-capabilities/audio/voice-agent===
#### Model Capabilities
# Voice Agent API
Build interactive voice conversations with Grok models using WebSocket. The Grok Voice Agent API accepts audio and text inputs and creates text and audio responses in real-time.
**WebSocket Endpoint:** `wss://api.x.ai/v1/realtime`
The Voice Agent API is billed at a flat per-minute rate. See [Voice Agent API Pricing](/developers/models#voice-agent-api-pricing) for details.
## Authentication
You can authenticate [WebSocket](#connect-via-websocket) connections using the xAI API key or an ephemeral token.
**IMPORTANT:** It is **recommended to use an ephemeral token** when authenticating from the client side (e.g. browser).
If you use the xAI API key to authenticate from the client side, **the client may see the API key and make unauthorized API requests with it.**
### Fetching Ephemeral Tokens
You need to set up another server or endpoint to fetch the ephemeral token from xAI. The ephemeral token gives its holder scoped access to resources.
**Endpoint:** `POST https://api.x.ai/v1/realtime/client_secrets`
```bash
curl --url https://api.x.ai/v1/realtime/client_secrets \\
-H "Content-Type: application/json" \\
-H "Authorization: Bearer $XAI_API_KEY" \\
--data '{
"expires_after": {
"seconds": 300
}
}'
# Note: Does not support "session" or "expires_after.anchor" fields
```
```pythonWithoutSDK
# Example ephemeral token endpoint with FastAPI
import os
import httpx
from fastapi import FastAPI
app = FastAPI()
SESSION_REQUEST_URL = "https://api.x.ai/v1/realtime/client_secrets"
XAI_API_KEY = os.getenv("XAI_API_KEY")
@app.post("/session")
async def get_ephemeral_token():
# Send request to xAI endpoint to retrieve the ephemeral token
async with httpx.AsyncClient() as client:
response = await client.post(
url=SESSION_REQUEST_URL,
headers={
"Authorization": f"Bearer {XAI_API_KEY}",
"Content-Type": "application/json",
},
json={"expires_after": {"seconds": 300}},
)
# Return the response body from xAI with ephemeral token
return response.json()
```
```javascriptWithoutSDK
// Example ephemeral token endpoint with Express
import express from 'express';
const app = express();
const SESSION_REQUEST_URL = "https://api.x.ai/v1/realtime/client_secrets";
app.use(express.json());
app.post("/session", async (req, res) => {
const r = await fetch(SESSION_REQUEST_URL, {
method: "POST",
headers: {
Authorization: \`Bearer \${process.env.XAI_API_KEY}\`,
"Content-Type": "application/json",
},
body: JSON.stringify({
expires_after: { seconds: 300 }
}),
});
const data = await r.json();
res.json(data);
});
app.listen(8081);
```
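Once your endpoint returns the ephemeral token, the client uses it in place of the API key as the bearer credential when opening the WebSocket connection. A minimal sketch; how you extract the token from the response body depends on its schema, so `ephemeral_token` below is a placeholder:
```pythonWithoutSDK
import asyncio
import websockets

async def main():
    # Placeholder: obtain this value from your own /session endpoint (see above)
    ephemeral_token = "..."
    # Connect with the ephemeral token instead of the long-lived API key
    async with websockets.connect(
        "wss://api.x.ai/v1/realtime",
        additional_headers={"Authorization": f"Bearer {ephemeral_token}"},
    ) as ws:
        print("Connected with ephemeral token")

asyncio.run(main())
```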
### Using API Key Directly
For server-side applications where the API key is not exposed to clients, you can authenticate directly with your xAI API key.
**Server-side only:** Only use API key authentication from secure server environments. Never expose your API key in client-side code.
```pythonWithoutSDK
import asyncio
import os
import websockets

XAI_API_KEY = os.getenv("XAI_API_KEY")
base_url = "wss://api.x.ai/v1/realtime"

async def main():
    # Connect with API key in Authorization header
    async with websockets.connect(
        uri=base_url,
        ssl=True,
        additional_headers={"Authorization": f"Bearer {XAI_API_KEY}"}
    ) as websocket:
        # WebSocket connection is now authenticated
        pass

asyncio.run(main())
```
```javascriptWithoutSDK
import WebSocket from "ws";
const baseUrl = "wss://api.x.ai/v1/realtime";
// Connect with API key in Authorization header
const ws = new WebSocket(baseUrl, {
headers: {
Authorization: "Bearer " + process.env.XAI_API_KEY,
"Content-Type": "application/json",
},
});
ws.on("open", () => {
console.log("Connected with API key authentication");
});
```
## Voice Options
The Grok Voice Agent API supports 5 different voice options, each with distinct characteristics. Select the voice that best fits your application's personality and use case.
### Available Voices
| Voice | Type | Tone | Description |
|-------|------|------|-------------|
| **`Ara`** | Female | Warm, friendly | Default voice, balanced and conversational |
| **`Rex`** | Male | Confident, clear | Professional and articulate, ideal for business applications |
| **`Sal`** | Neutral | Smooth, balanced | Versatile voice suitable for various contexts |
| **`Eve`** | Female | Energetic, upbeat | Engaging and enthusiastic, great for interactive experiences |
| **`Leo`** | Male | Authoritative, strong | Decisive and commanding, suitable for instructional content |
### Selecting a Voice
Specify the voice in your session configuration using the `voice` parameter:
```pythonWithoutSDK
# Configure session with a specific voice
session_config = {
"type": "session.update",
"session": {
"voice": "Ara", # Choose from: Ara, Rex, Sal, Eve, Leo
"instructions": "You are a helpful assistant.",
# Audio format settings (these are the defaults if not specified)
"audio": {
"input": {"format": {"type": "audio/pcm", "rate": 24000}},
"output": {"format": {"type": "audio/pcm", "rate": 24000}}
}
}
}
await ws.send(json.dumps(session_config))
```
```javascriptWithoutSDK
// Configure session with a specific voice
const sessionConfig = {
type: "session.update",
session: {
voice: "Ara", // Choose from: Ara, Rex, Sal, Eve, Leo
instructions: "You are a helpful assistant.",
// Audio format settings (these are the defaults if not specified)
audio: {
input: { format: { type: "audio/pcm", rate: 24000 } },
output: { format: { type: "audio/pcm", rate: 24000 } }
}
}
};
ws.send(JSON.stringify(sessionConfig));
```
## Audio Format
The Grok Voice Agent API supports multiple audio formats for real-time audio streaming. Audio data must be encoded as base64 strings when sent over WebSocket.
### Supported Audio Formats
The API supports three audio format types:
| Format | Encoding | Container Types | Sample Rate |
|--------|----------|-----------------|-------------|
| **`audio/pcm`** | Linear16, Little-endian | Raw, WAV, AIFF | Configurable (see below) |
| **`audio/pcmu`** | G.711 μ-law (Mulaw) | Raw | 8000 Hz |
| **`audio/pcma`** | G.711 A-law | Raw | 8000 Hz |
### Supported Sample Rates
When using `audio/pcm` format, you can configure the sample rate to one of the following supported values:
| Sample Rate | Quality | Description |
|-------------|---------|-------------|
| **8000 Hz** | Telephone | Narrowband, suitable for voice calls |
| **16000 Hz** | Wideband | Good for speech recognition |
| **21050 Hz** | Standard | Balanced quality and bandwidth |
| **24000 Hz** | High (Default) | Recommended for most use cases |
| **32000 Hz** | Very High | Enhanced audio clarity |
| **44100 Hz** | CD Quality | Standard for music / media |
| **48000 Hz** | Professional | Studio-grade audio / Web Browser |
**Note:** Sample rate configuration is only applicable for `audio/pcm` format. The `audio/pcmu` and `audio/pcma` formats use their standard encoding specifications.
### Audio Specifications
| Property | Value | Description |
|----------|-------|-------------|
| **Sample Rate** | Configurable (PCM only) | Sample rate in Hz (see supported rates above) |
| **Default Sample Rate** | 24kHz | 24,000 samples per second (for PCM) |
| **Channels** | Mono | Single channel audio |
| **Encoding** | Base64 | Audio bytes encoded as base64 string |
| **Byte Order** | Little-endian | 16-bit samples in little-endian format (for PCM) |
### Configuring Audio Format
You can configure the audio format and sample rate for both input and output in the session configuration:
```pythonWithoutSDK
# Configure audio format with custom sample rate for input and output
session_config = {
"type": "session.update",
"session": {
"audio": {
"input": {
"format": {
"type": "audio/pcm", # or "audio/pcmu" or "audio/pcma"
"rate": 16000 # Only applicable for audio/pcm
}
},
"output": {
"format": {
"type": "audio/pcm", # or "audio/pcmu" or "audio/pcma"
"rate": 16000 # Only applicable for audio/pcm
}
}
},
"instructions": "You are a helpful assistant.",
}
}
await ws.send(json.dumps(session_config))
```
```javascriptWithoutSDK
// Configure audio format with custom sample rate for input and output
const sessionConfig = {
type: "session.update",
session: {
audio: {
input: {
format: {
type: "audio/pcm", // or "audio/pcmu" or "audio/pcma"
rate: 16000 // Only applicable for audio/pcm
}
},
output: {
format: {
type: "audio/pcm", // or "audio/pcmu" or "audio/pcma"
rate: 16000 // Only applicable for audio/pcm
}
}
},
instructions: "You are a helpful assistant.",
}
};
ws.send(JSON.stringify(sessionConfig));
```
## Connect via WebSocket
You can connect to the realtime model via WebSocket. The audio data needs to be serialized into base64-encoded strings.
The examples below show connecting to the WebSocket endpoint from the server environment.
```pythonWithoutSDK
import asyncio
import json
import os
from typing import Any
import websockets
from websockets.asyncio.client import ClientConnection
XAI_API_KEY = os.getenv("XAI_API_KEY")
base_url = "wss://api.x.ai/v1/realtime"
# Process received message
async def on_message(ws: ClientConnection, message: websockets.Data):
data = json.loads(message)
print("Received event:", json.dumps(data, indent=2))
# Optionally, you can send an event after processing message
# You can create an event dictionary and send:
# await send_message(ws, event)
# Send message with an event to server
async def send_message(ws: ClientConnection, event: dict[str, Any]):
await ws.send(json.dumps(event))
# Example event to be sent on connection open
async def on_open(ws: ClientConnection):
print("Connected to server.")
# Configure the session with voice, audio format, and instructions
session_config = {
"type": "session.update",
"session": {
"voice": "Ara",
"instructions": "You are a helpful assistant.",
"turn_detection": {"type": "server_vad"},
"audio": {
"input": {"format": {"type": "audio/pcm", "rate": 24000}},
"output": {"format": {"type": "audio/pcm", "rate": 24000}}
}
}
}
await send_message(ws, session_config)
# Send a user text message content
event = {
"type": "conversation.item.create",
"item": {
"type": "message",
"role": "user",
"content": [{"type": "input_text", "text": "hello"}],
},
}
await send_message(ws, event)
# Send an event to request a response, so Grok will start processing on our previous message
event = {
"type": "response.create",
"response": {
"modalities": ["text", "audio"],
},
}
await send_message(ws, event)
async def main():
# Connect to the secure websocket
async with websockets.connect(
uri=base_url,
ssl=True,
additional_headers={"Authorization": f"Bearer {XAI_API_KEY}"}
) as websocket:
# Send request on connection open
await on_open(ws=websocket)
while True:
try:
# Receive message and print it
message = await websocket.recv()
await on_message(websocket, message)
except websockets.exceptions.ConnectionClosed:
print("Connection Closed")
break
asyncio.run(main())
```
```javascriptWithoutSDK
import WebSocket from "ws";
const baseUrl = "wss://api.x.ai/v1/realtime";
const ws = new WebSocket(baseUrl, {
headers: {
Authorization: "Bearer " + process.env.XAI_API_KEY,
"Content-Type": "application/json",
},
});
ws.on("open", function open() {
console.log("Connected to server.");
// Configure the session with voice, audio format, and instructions
const sessionConfig = {
type: "session.update",
session: {
voice: "Ara",
instructions: "You are a helpful assistant.",
turn_detection: { type: "server_vad" },
audio: {
input: { format: { type: "audio/pcm", rate: 24000 } },
output: { format: { type: "audio/pcm", rate: 24000 } }
}
}
};
ws.send(JSON.stringify(sessionConfig));
// Create a new conversation message and send to server
let event = {
type: "conversation.item.create",
item: {
type: "message",
role: "user",
content: [{ type: "input_text", text: "hello" }],
},
};
ws.send(JSON.stringify(event));
// Send an event to request a response, so Grok will start processing on our previous message
event = {
type: "response.create",
};
ws.send(JSON.stringify(event));
});
ws.on("message", function incoming(message) {
const serverEvent = JSON.parse(message);
console.log(serverEvent);
});
```
## Message types
There are a few message types used when interacting with the models. [Client events](#client-events) are sent by the client to the server, and [server events](#server-events) are sent by the server to the client.
## Session Messages
### Client Events
* **`"session.update"`** - Update session configuration such as system prompt, voice, audio format and search settings
```json
{
"type": "session.update",
"session": {
"instructions": "pass a system prompt here",
"voice": "Ara",
"turn_detection": {
"type": "server_vad" or null,
},
"audio": {
"input": {
"format": {
"type": "audio/pcm",
"rate": 24000
}
},
"output": {
"format": {
"type": "audio/pcm",
"rate": 24000
}
}
}
}
}
```
**Session Parameters:**
| Parameter | Type | Description |
|-----------|------|-------------|
| `instructions` | string | System prompt |
| `voice` | string | Voice selection: `Ara`, `Rex`, `Sal`, `Eve`, `Leo` (see [Voice Options](#voice-options)) |
| `turn_detection.type` | string \| null | `"server_vad"` for automatic detection, `null` for manual text turns |
| `audio.input.format.type` | string | Input format: `"audio/pcm"`, `"audio/pcmu"`, or `"audio/pcma"` |
| `audio.input.format.rate` | number | Input sample rate (PCM only): 8000, 16000, 21050, 24000, 32000, 44100, 48000 |
| `audio.output.format.type` | string | Output format: `"audio/pcm"`, `"audio/pcmu"`, or `"audio/pcma"` |
| `audio.output.format.rate` | number | Output sample rate (PCM only): 8000, 16000, 21050, 24000, 32000, 44100, 48000 |
### Receiving and Playing Audio
Decode and play base64 PCM16 audio received from the API. Use the same sample rate as configured:
```pythonWithoutSDK
import base64
import json
import numpy as np
# `ws` is the open WebSocket connection from the earlier example
# Configure session with 16kHz sample rate for lower bandwidth (input and output)
session_config = {
"type": "session.update",
"session": {
"instructions": "You are a helpful assistant.",
"voice": "Ara",
"turn_detection": {
"type": "server_vad",
},
"audio": {
"input": {
"format": {
"type": "audio/pcm",
"rate": 16000 # 16kHz for lower bandwidth usage
}
},
"output": {
"format": {
"type": "audio/pcm",
"rate": 16000 # 16kHz for lower bandwidth usage
}
}
}
}
}
await ws.send(json.dumps(session_config))
# When processing audio, use the same sample rate
SAMPLE_RATE = 16000
# Convert audio data to PCM16 and base64
def audio_to_base64(audio_data: np.ndarray) -> str:
"""Convert float32 audio array to base64 PCM16 string."""
# Normalize to [-1, 1] and convert to int16
audio_int16 = (audio_data * 32767).astype(np.int16)
# Encode to base64
audio_bytes = audio_int16.tobytes()
return base64.b64encode(audio_bytes).decode('utf-8')
# Convert base64 PCM16 to audio data
def base64_to_audio(base64_audio: str) -> np.ndarray:
"""Convert base64 PCM16 string to float32 audio array."""
# Decode base64
audio_bytes = base64.b64decode(base64_audio)
# Convert to int16 array
audio_int16 = np.frombuffer(audio_bytes, dtype=np.int16)
# Normalize to [-1, 1]
return audio_int16.astype(np.float32) / 32768.0
```
```javascriptWithoutSDK
// Configure session with 16kHz sample rate for lower bandwidth (input and output)
const sessionConfig = {
type: "session.update",
session: {
instructions: "You are a helpful assistant.",
voice: "Ara",
turn_detection: { type: "server_vad" },
audio: {
input: {
format: {
type: "audio/pcm",
rate: 16000 // 16kHz for lower bandwidth usage
}
},
output: {
format: {
type: "audio/pcm",
rate: 16000 // 16kHz for lower bandwidth usage
}
}
}
}
};
ws.send(JSON.stringify(sessionConfig));
// When processing audio, use the same sample rate
const SAMPLE_RATE = 16000;
// Create AudioContext with matching sample rate
const audioContext = new AudioContext({ sampleRate: SAMPLE_RATE });
// Helper function to convert Float32Array to base64 PCM16
function float32ToBase64PCM16(float32Array) {
const pcm16 = new Int16Array(float32Array.length);
for (let i = 0; i < float32Array.length; i++) {
const s = Math.max(-1, Math.min(1, float32Array[i]));
pcm16[i] = s < 0 ? s * 0x8000 : s * 0x7FFF;
}
const bytes = new Uint8Array(pcm16.buffer);
return btoa(String.fromCharCode(...bytes));
}
// Helper function to convert base64 PCM16 to Float32Array
function base64PCM16ToFloat32(base64String) {
const binaryString = atob(base64String);
const bytes = new Uint8Array(binaryString.length);
for (let i = 0; i < binaryString.length; i++) {
bytes[i] = binaryString.charCodeAt(i);
}
const pcm16 = new Int16Array(bytes.buffer);
const float32 = new Float32Array(pcm16.length);
for (let i = 0; i < pcm16.length; i++) {
float32[i] = pcm16[i] / 32768.0;
}
return float32;
}
```
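To stream captured audio to the server, wrap each encoded chunk in an `input_audio_buffer.append` event (documented under [Input audio buffer messages](#input-audio-buffer-messages) below). A minimal sketch reusing `audio_to_base64` from the Python example above:
```pythonWithoutSDK
import json

# `ws` is the open WebSocket connection; `audio_chunk` is a float32 numpy array
async def send_audio_chunk(ws, audio_chunk):
    event = {
        "type": "input_audio_buffer.append",
        "audio": audio_to_base64(audio_chunk),
    }
    await ws.send(json.dumps(event))
```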
### Server Events
* **`"session.updated"`** - Acknowledge the client's `"session.update"` message that the session has been updated
```json
{
"event_id": "event_123",
"type": "session.updated",
"session": {
"instructions": "You are a helpful assistant.",
"voice": "Ara",
"turn_detection": {
"type": "server_vad"
}
}
}
```
## Using Tools with Grok Voice Agent API
The Grok Voice Agent API supports various tools that can be configured in your session to enhance the capabilities of your voice agent. Tools can be configured in the `session.update` message.
### Available Tool Types
* **Collections Search (`file_search`)** - Search through your uploaded document collections
* **Web Search (`web_search`)** - Search the web for current information
* **X Search (`x_search`)** - Search X (Twitter) for posts and information
* **Custom Functions** - Define your own function tools with JSON schemas
### Configuring Tools in Session
Tools are configured in the `tools` array of the session configuration. Here are examples showing how to configure different tool types:
### Collections Search with `file_search`
Use the `file_search` tool to enable your voice agent to search through document collections. You'll need to create a collection first using the [Collections API](/developers/rest-api-reference/collections).
```pythonWithoutSDK
COLLECTION_ID = "your-collection-id" # Replace with your collection ID
session_config = {
"type": "session.update",
"session": {
...
"tools": [
{
"type": "file_search",
"vector_store_ids": [COLLECTION_ID],
"max_num_results": 10,
},
],
},
}
```
```javascriptWithoutSDK
const COLLECTION_ID = "your-collection-id"; // Replace with your collection ID
const sessionConfig = {
type: "session.update",
session: {
...
tools: [
{
type: "file_search",
vector_store_ids: [COLLECTION_ID],
max_num_results: 10,
},
],
},
};
```
### Web Search and X Search
Configure web search and X search tools to give your voice agent access to current information from the web and X (Twitter).
```pythonWithoutSDK
session_config = {
"type": "session.update",
"session": {
...
"tools": [
{
"type": "web_search",
},
{
"type": "x_search",
"allowed_x_handles": ["elonmusk", "xai"],
},
],
},
}
```
```javascriptWithoutSDK
const sessionConfig = {
type: "session.update",
session: {
...
tools: [
{
type: "web_search",
},
{
type: "x_search",
allowed_x_handles: ["elonmusk", "xai"],
},
],
},
};
```
### Custom Function Tools
You can define custom function tools with JSON schemas to extend your voice agent's capabilities.
```pythonWithoutSDK
session_config = {
"type": "session.update",
"session": {
...
"tools": [
{
"type": "function",
"name": "generate_random_number",
"description": "Generate a random number between min and max values",
"parameters": {
"type": "object",
"properties": {
"min": {
"type": "number",
"description": "Minimum value (inclusive)",
},
"max": {
"type": "number",
"description": "Maximum value (inclusive)",
},
},
"required": ["min", "max"],
},
},
],
},
}
```
```javascriptWithoutSDK
const sessionConfig = {
type: "session.update",
session: {
...
tools: [
{
type: "function",
name: "generate_random_number",
description: "Generate a random number between min and max values",
parameters: {
type: "object",
properties: {
min: {
type: "number",
description: "Minimum value (inclusive)",
},
max: {
type: "number",
description: "Maximum value (inclusive)",
},
},
required: ["min", "max"],
},
},
],
},
};
```
### Combining Multiple Tools
You can combine multiple tool types in a single session configuration:
```pythonWithoutSDK
session_config = {
"type": "session.update",
"session": {
...
"tools": [
{
"type": "file_search",
"vector_store_ids": ["your-collection-id"],
"max_num_results": 10,
},
{
"type": "web_search",
},
{
"type": "x_search",
},
{
"type": "function",
"name": "generate_random_number",
"description": "Generate a random number",
"parameters": {
"type": "object",
"properties": {
"min": {"type": "number"},
"max": {"type": "number"},
},
"required": ["min", "max"],
},
},
],
},
}
```
```javascriptWithoutSDK
const sessionConfig = {
type: "session.update",
session: {
...
tools: [
{
type: "file_search",
vector_store_ids: ["your-collection-id"],
max_num_results: 10,
},
{
type: "web_search",
},
{
type: "x_search",
},
{
type: "function",
name: "generate_random_number",
description: "Generate a random number",
parameters: {
type: "object",
properties: {
min: { type: "number" },
max: { type: "number" },
},
required: ["min", "max"],
},
},
],
},
};
```
For more details on Collections, see the [Collections API documentation](/developers/rest-api-reference/collections). For search tool parameters and options, see the [Web Search](/developers/tools/web-search) and [X Search](/developers/tools/x-search) guides.
### Handling Function Call Responses
When you define custom function tools, the voice agent will call these functions during conversation. You need to handle these function calls, execute them, and return the results to continue the conversation.
### Function Call Flow
1. **Agent decides to call a function** → sends `response.function_call_arguments.done` event
2. **Your code executes the function** → processes the arguments and generates a result
3. **Send result back to agent** → sends `conversation.item.create` with the function output
4. **Request continuation** → sends `response.create` to let the agent continue
### Complete Example
```pythonWithoutSDK
import json
import websockets
# Define your function implementations
def get_weather(location: str, units: str = "celsius"):
"""Get current weather for a location"""
# In production, call a real weather API
return {
"location": location,
"temperature": 22,
"units": units,
"condition": "Sunny",
"humidity": 45
}
def book_appointment(date: str, time: str, service: str):
"""Book an appointment"""
# In production, interact with your booking system
import random
confirmation = f"CONF{random.randint(1000, 9999)}"
return {
"status": "confirmed",
"confirmation_code": confirmation,
"date": date,
"time": time,
"service": service
}
# Map function names to implementations
FUNCTION_HANDLERS = {
"get_weather": get_weather,
"book_appointment": book_appointment
}
async def handle_function_call(ws, event):
"""Handle function call from the voice agent"""
function_name = event["name"]
call_id = event["call_id"]
arguments = json.loads(event["arguments"])
print(f"Function called: {function_name} with args: {arguments}")
# Execute the function
if function_name in FUNCTION_HANDLERS:
result = FUNCTION_HANDLERS[function_name](**arguments)
# Send result back to agent
await ws.send(json.dumps({
"type": "conversation.item.create",
"item": {
"type": "function_call_output",
"call_id": call_id,
"output": json.dumps(result)
}
}))
# Request agent to continue with the result
await ws.send(json.dumps({
"type": "response.create"
}))
else:
print(f"Unknown function: {function_name}")
# In your WebSocket message handler
async def on_message(ws, message):
event = json.loads(message)
# Listen for function calls
if event["type"] == "response.function_call_arguments.done":
await handle_function_call(ws, event)
elif event["type"] == "response.output_audio.delta":
# Handle audio response
pass
```
```javascriptWithoutSDK
// Define your function implementations
const functionHandlers = {
get_weather: async (args) => {
// In production, call a real weather API
return {
location: args.location,
temperature: 22,
units: args.units || "celsius",
condition: "Sunny",
humidity: 45
};
},
book_appointment: async (args) => {
// In production, interact with your booking system
const confirmation = \`CONF\${Math.floor(Math.random() * 9000) + 1000}\`;
return {
status: "confirmed",
confirmation_code: confirmation,
date: args.date,
time: args.time,
service: args.service
};
}
};
// Handle function calls from the voice agent
async function handleFunctionCall(ws, event) {
const functionName = event.name;
const callId = event.call_id;
const args = JSON.parse(event.arguments);
console.log(\`Function called: \${functionName} with args:\`, args);
// Execute the function
const handler = functionHandlers[functionName];
if (handler) {
const result = await handler(args);
// Send result back to agent
ws.send(JSON.stringify({
type: "conversation.item.create",
item: {
type: "function_call_output",
call_id: callId,
output: JSON.stringify(result)
}
}));
// Request agent to continue with the result
ws.send(JSON.stringify({
type: "response.create"
}));
} else {
console.error(\`Unknown function: \${functionName}\`);
}
}
// In your WebSocket message handler
ws.on("message", (message) => {
const event = JSON.parse(message);
// Listen for function calls
if (event.type === "response.function_call_arguments.done") {
handleFunctionCall(ws, event);
} else if (event.type === "response.output_audio.delta") {
// Handle audio response
}
});
```
### Function Call Events
| Event | Direction | Description |
|-------|-----------|-------------|
| `response.function_call_arguments.done` | Server → Client | Function call triggered with complete arguments |
| `conversation.item.create` (function\_call\_output) | Client → Server | Send function execution result back |
| `response.create` | Client → Server | Request agent to continue processing |
### Real-World Example: Weather Query
When a user asks "What's the weather in San Francisco?", here's the complete flow:
| Step | Direction | Event | Description |
|:----:|:---------:|-------|-------------|
| 1 | Client → Server | `input_audio_buffer.append` | User speaks: "What's the weather in San Francisco?" |
| 2 | Server → Client | `response.function_call_arguments.done` | Agent decides to call `get_weather` with `location: "San Francisco"` |
| 3 | Client → Server | `conversation.item.create` | Your code executes `get_weather()` and sends result: `{temperature: 68, condition: "Sunny"}` |
| 4 | Client → Server | `response.create` | Request agent to continue with function result |
| 5 | Server → Client | `response.output_audio.delta` | Agent responds: "The weather in San Francisco is currently 68°F and sunny." |
Function calls happen automatically during conversation flow. The agent decides when to call functions based on the function descriptions and conversation context.
## Conversation messages
### Server Events
* **`"conversation.created"`** - The first message at connection. Notifies the client that a conversation session has been created
```json
{
"event_id": "event_9101",
"type": "conversation.created",
"conversation": {
"id": "conv_001",
"object": "realtime.conversation"
}
}
```
## Conversation item messages
### Client
* `"conversation.item.create"`: Create a new user message with text.
```json
{
"type": "conversation.item.create",
"previous_item_id": "", // Optional, used to insert turn into history
"item": {
"type": "message",
"role": "user",
"content": [
{
"type": "input_text",
"text": "Hello, how are you?"
}
]
}
}
```
### Server
* `"conversation.item.added"`: Responding to the client that a new user message has been added to conversation history, or if an assistance response has been added to conversation history.
```json
{
"event_id": "event_1920",
"type": "conversation.item.added",
"previous_item_id": "msg_002",
"item": {
"id": "msg_003",
"object": "realtime.item",
"type": "message",
"status": "completed",
"role": "user",
"content": [
{
"type": "input_audio",
"transcript": "hello how are you"
}
]
}
}
```
* `"conversation.item.input_audio_transcription.completed"`: Notify the client the audio transcription for input has been completed.
```json
{
"event_id": "event_2122",
"type": "conversation.item.input_audio_transcription.completed",
"item_id": "msg_003",
"transcript": "Hello, how are you?"
}
```
## Input audio buffer messages
### Client
* `"input_audio_buffer.append"`: Append chunks of audio data to the buffer. The audio needs to be base64-encoded. The server does not send back corresponding message.
```json
{
"type": "input_audio_buffer.append",
"audio": ""
}
```
* `"input_audio_buffer.clear"`: Clear input audio buffer. Server sends back `"input_audio_buffer.cleared"` message.
```json
{
"type": "input_audio_buffer.clear"
}
```
* `"input_audio_buffer.commit"`: Create a new user message by committing the audio buffer created by previous `"input_audio_buffer.append"` messages. Confirmed by `"input_audio_buffer.committed"` from server.
Only available when `"turn_detection"` setting in session is `"type": null`. Otherwise the
conversation turn will be automatically committed by VAD on the server.
```json
{
"type": "input_audio_buffer.commit"
}
```
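When VAD is disabled (`"turn_detection": {"type": null}`), a manual audio turn is: append chunks, commit the buffer, then request a response. A minimal sketch, where `base64_chunks` holds base64-encoded audio chunks:
```pythonWithoutSDK
import json

async def manual_audio_turn(ws, base64_chunks):
    # Stream the audio chunks into the input buffer
    for chunk in base64_chunks:
        await ws.send(json.dumps({"type": "input_audio_buffer.append", "audio": chunk}))
    # Commit the buffer as one user turn (server replies "input_audio_buffer.committed")
    await ws.send(json.dumps({"type": "input_audio_buffer.commit"}))
    # Ask the model to respond (server-side VAD would do this automatically)
    await ws.send(json.dumps({"type": "response.create"}))
```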
### Server
* `"input_audio_buffer.speech_started"`: Notify the client the server's VAD has detected the start of a speech.
Only available when `"turn_detection"` setting in session is `"type": "server_vad"`.
```json
{
"event_id": "event_1516",
"type": "input_audio_buffer.speech_started",
"item_id": "msg_003"
}
```
* `"input_audio_buffer.speech_stopped"`: Notify the client the server's VAD has detected the end of a speech.
Only available when `"turn_detection"` setting in session is `"type": "server_vad"`.
```json
{
"event_id": "event_1516",
"type": "input_audio_buffer.speech_stopped",
"item_id": "msg_003"
}
```
* `"input_audio_buffer.cleared"`: Input audio buffer has been cleared.
```json
{
"event_id": "event_1516",
"type": "input_audio_buffer.cleared"
}
```
* `"input_audio_buffer.committed"`: Input audio buffer has been committed.
```json
{
"event_id": "event_1121",
"type": "input_audio_buffer.committed",
"previous_item_id": "msg_001",
"item_id": "msg_002"
}
```
## Response messages
### Client
* `"response.create"`: Request the server to create a new assistant response when using client side vad. (This is handled automatically when using server side vad.)
```json
{
"type": "response.create"
}
```
### Server
* `"response.created"`: A new assistant response turn is in progress. Audio delta created from this assistant turn will have the same response id. Followed by `"response.output_item.added"`.
```json
{
"event_id": "event_2930",
"type": "response.created",
"response": {
"id": "resp_001",
"object": "realtime.response",
"status": "in_progress",
"output": []
}
}
```
* `"response.output_item.added"`: A new assistant response is added to message history.
```json
{
"event_id": "event_3334",
"type": "response.output_item.added",
"response_id": "resp_001",
"output_index": 0,
"item": {
"id": "msg_007",
"object": "realtime.item",
"type": "message",
"status": "in_progress",
"role": "assistant",
"content": []
}
}
```
* `"response.done"`: The assistant's response is completed. Sent after all the `"response.output_audio_transcript.done"` and `"response.output_audio.done"` messages. Ready for the client to add a new conversation item.
```json
{
"event_id": "event_3132",
"type": "response.done",
"response": {
"id": "resp_001",
"object": "realtime.response",
"status": "completed",
}
}
```
## Response audio and transcription messages
### Client
The client does not need to send messages to receive these audio and transcription responses; they are created automatically following the `"response.create"` message.
### Server
* `"response.output_audio_transcript.delta"`: Audio transcript delta of the assistant response.
```json
{
"event_id": "event_4950",
"type": "response.output_audio_transcript.delta",
"response_id": "resp_001",
"item_id": "msg_008",
"delta": "Text response..."
}
```
* `"response.output_audio_transcript.done"`: The audio transcript delta of the assistant response has finished generating.
```json
{
"event_id": "event_5152",
"type": "response.output_audio_transcript.done",
"response_id": "resp_001",
"item_id": "msg_008"
}
```
* `"response.output_audio.delta"`: The audio stream delta of the assistant response.
```json
{
"event_id": "event_4950",
"type": "response.output_audio.delta",
"response_id": "resp_001",
"item_id": "msg_008",
"output_index": 0,
"content_index": 0,
"delta": ""
}
```
* `"response.output_audio.done"`: Notifies client that the audio for this turn has finished generating.
```json
{
"event_id": "event_5152",
"type": "response.output_audio.done",
"response_id": "resp_001",
"item_id": "msg_008",
}
```
===/developers/model-capabilities/audio/voice===
#### Model Capabilities
# Voice Overview
We're introducing a new API for voice interactions with Grok, launching initially with the Grok Voice Agent API; dedicated speech-to-text and text-to-speech APIs are coming soon.
## Grok Voice Agent API
Build powerful real-time voice applications with the Grok Voice Agent API. Create interactive voice conversations with Grok models via WebSocket for voice assistants, phone agents, and interactive voice applications.
**WebSocket Endpoint:** `wss://api.x.ai/v1/realtime`
**The Voice Agent API is only available in the `us-east-1` region.**
### Enterprise Ready
Optimized for enterprise use cases across Customer Support, Medical, Legal, Finance, Insurance, and more. The Grok Voice Agent API delivers the reliability and precision that regulated industries demand.
* **Telephony** - Connect to platforms like Twilio, Vonage, and other SIP providers
* **Tool Calling** - CRMs, calendars, ticketing systems, databases, and custom APIs
* **Multilingual** - Serve global customers in their native language with natural accents
* **Low Latency** - Real-time responses for natural, human-like conversations
* **Accuracy** - Precise transcription and understanding of critical information:
* Industry-specific terminology including medical, legal, and financial vocabulary
* Email addresses, dates, and alphanumeric codes
* Names, addresses, and phone numbers
### Getting Started
The Grok Voice Agent API enables interactive voice conversations with Grok models via WebSocket. Perfect for building voice assistants, phone agents, and interactive voice applications.
**Use Cases:**
* Voice Assistants for web and mobile
* AI-powered phone systems with Twilio
* Real-time customer support
* Interactive Voice Response (IVR) systems
[Documentation →](/developers/model-capabilities/audio/voice-agent)
### Low Latency
Built for real-time conversations. The Grok Voice Agent API is optimized for minimal response times, enabling natural back-and-forth dialogue without awkward pauses. Stream audio bidirectionally over WebSocket for instant voice-to-voice interactions that feel like talking to a human.
### Multilingual with Natural Accents
The Grok Voice Agent API speaks over 100 languages with native-quality accents. The model automatically detects the input language and responds naturally in the same language, with no configuration required.
### Supported Languages
English, Spanish, French, German, Italian, Portuguese, Dutch, Russian, Chinese (Mandarin), Japanese, Korean, Arabic, Hindi, Turkish, Polish, Swedish, Danish, Norwegian, Finnish, Czech, and many more.
Each language features natural pronunciation, appropriate intonation patterns, and culturally-aware speech rhythms. You can also specify a preferred language or accent in your system instructions for consistent multilingual experiences.
### Tool Calling
Extend your voice agent's capabilities with powerful built-in tools that execute during conversations:
* **Web Search** - Real-time internet search for current information, news, and facts
* **X Search** - Search posts, trends, and discussions from X
* **Collections** - RAG-powered search over your uploaded documents and knowledge bases
* **Custom Functions** - Define your own tools with JSON schemas for booking, lookups, calculations, and more
Tools are called automatically based on conversation context. Your voice agent can search the web, query your documents, and execute custom business logic, all while maintaining a natural conversation flow.
### Voice Personalities
Choose from 5 distinct voices, each with unique characteristics suited to different applications:
| Voice | Type | Tone | Description |
|-------|------|------|-------------|
| **`Ara`** | Female | Warm, friendly | Default voice, balanced and conversational |
| **`Rex`** | Male | Confident, clear | Professional and articulate, ideal for business applications |
| **`Sal`** | Neutral | Smooth, balanced | Versatile voice suitable for various contexts |
| **`Eve`** | Female | Energetic, upbeat | Engaging and enthusiastic, great for interactive experiences |
| **`Leo`** | Male | Authoritative, strong | Decisive and commanding, suitable for instructional content |
### Flexible Audio Formats
Support for multiple audio formats and sample rates to match your application's requirements:
* **PCM (Linear16)** - High-quality audio with configurable sample rates (8kHz–48kHz)
* **G.711 μ-law** - Optimized for telephony applications
* **G.711 A-law** - Standard for international telephony
### Example Applications
Complete working examples are available demonstrating various voice integration patterns:
#### Web Voice Agent
Real-time voice chat in the browser with React frontend and Python/Node.js backends.
**Architecture:**
```
Browser (React) ←WebSocket→ Backend (FastAPI/Express) ←WebSocket→ xAI API
```
**Features:**
* Real-time audio streaming
* Visual transcript display
* Debug console for development
* Interchangeable backends
[GitHub →](https://github.com/xai-org/xai-cookbook/tree/main/voice-examples/agent/web)
#### Phone Voice Agent (Twilio)
AI-powered phone system using Twilio integration.
**Architecture:**
```
Phone Call ←SIP→ Twilio ←WebSocket→ Node.js Server ←WebSocket→ xAI API
```
**Features:**
* Phone call integration
* Real-time voice processing
* Function/tool calling support
* Production-ready architecture
[GitHub →](https://github.com/xai-org/xai-cookbook/tree/main/voice-examples/agent/telephony)
#### WebRTC Voice Agent
The Grok Voice Agent API uses WebSocket connections. Direct WebRTC connections are not available currently.
You can use a WebRTC server to connect the client to a server that then connects to the Grok Voice Agent API.
**Architecture:**
```
Browser (React) ←WebRTC→ Backend (Express) ←WebSocket→ xAI API
```
**Features:**
* Real-time audio streaming
* Visual transcript display
* Debug console for development
* WebRTC backend handles all WebSocket connections to xAI API
[GitHub →](https://github.com/xai-org/xai-cookbook/tree/main/voice-examples/agent/webrtc)
### Third Party Integrations
Build voice agents using popular third-party frameworks and platforms that integrate with the Grok Voice Agent API.
**LiveKit**
Build real-time voice agents using LiveKit's open-source framework with native Grok Voice Agent API integration and WebRTC support.
[Docs →](https://docs.livekit.io/agents/integrations/xai/) | [GitHub →](https://github.com/livekit/agents/tree/main/livekit-plugins/livekit-plugins-xai)
**Voximplant**
Build real-time voice agents using Voximplant's open-source framework with native Grok Voice Agent API integration and SIP support.
[Docs →](https://voximplant.com/products/grok-client) | [GitHub →](https://github.com/voximplant/grok-voice-agent-example)
**Pipecat**
Build real-time voice agents using Pipecat's open-source framework with native Grok Voice Agent API integration and advanced conversation management.
[Docs →](https://docs.pipecat.ai/server/services/s2s/grok) | [GitHub →](https://github.com/pipecat-ai/pipecat/blob/main/examples/foundational/51-grok-realtime.py)
===/developers/model-capabilities/files/chat-with-files===
#### Model Capabilities
# Chat with Files
Once you've uploaded files, you can reference them in conversations using the `file()` helper function in the xAI Python SDK. When files are attached, the system automatically enables document search capabilities, transforming your request into an agentic workflow.
## Basic Chat with a Single File
Reference an uploaded file in a conversation to let the model search through it for relevant information.
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user, file
client = Client(api_key=os.getenv("XAI_API_KEY"))
# Upload a document
document_content = b"""Quarterly Sales Report - Q4 2024
Revenue Summary:
- Total Revenue: $5.2M
- Year-over-Year Growth: +18%
- Quarter-over-Quarter Growth: +7%
Top Performing Products:
- Product A: $2.1M revenue (+25% YoY)
- Product B: $1.8M revenue (+12% YoY)
- Product C: $1.3M revenue (+15% YoY)
"""
uploaded_file = client.files.upload(document_content, filename="sales_report.txt")
# Create a chat with the file attached
chat = client.chat.create(model="grok-4-fast")
chat.append(user("What was the total revenue in this report?", file(uploaded_file.id)))
# Get the response
response = chat.sample()
print(f"Answer: {response.content}")
print(f"\\nUsage: {response.usage}")
# Clean up
client.files.delete(uploaded_file.id)
```
```pythonOpenAISDK
import os
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("XAI_API_KEY"),
base_url="https://api.x.ai/v1",
)
# Upload a file
document_content = b"""Quarterly Sales Report - Q4 2024
Revenue Summary:
- Total Revenue: $5.2M
- Year-over-Year Growth: +18%
"""
with open("temp_sales.txt", "wb") as f:
f.write(document_content)
with open("temp_sales.txt", "rb") as f:
uploaded_file = client.files.create(file=f, purpose="assistants")
# Create a chat with the file
response = client.responses.create(
model="grok-4-fast",
input=[
{
"role": "user",
"content": [
{"type": "input_text", "text": "What was the total revenue in this report?"},
{"type": "input_file", "file_id": uploaded_file.id}
]
}
]
)
final_answer = response.output[-1].content[0].text
print(f"Answer: {final_answer}")
# Clean up
client.files.delete(uploaded_file.id)
```
```pythonRequests
import os
import requests
api_key = os.getenv("XAI_API_KEY")
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {api_key}"
}
# Upload file first
upload_url = "https://api.x.ai/v1/files"
files = {"file": ("sales_report.txt", b"Total Revenue: $5.2M")}
data = {"purpose": "assistants"}
upload_response = requests.post(upload_url, headers={"Authorization": f"Bearer {api_key}"}, files=files, data=data)
file_id = upload_response.json()["id"]
# Create chat with file
chat_url = "https://api.x.ai/v1/responses"
payload = {
"model": "grok-4-fast",
"input": [
{
"role": "user",
"content": [
{"type": "input_text", "text": "What was the total revenue in this report?"},
{"type": "input_file", "file_id": file_id}
]
}
]
}
response = requests.post(chat_url, headers=headers, json=payload)
print(response.json())
```
```bash
# First upload the file
FILE_ID=$(curl https://api.x.ai/v1/files \\
-H "Authorization: Bearer $XAI_API_KEY" \\
-F file=@sales_report.txt \\
-F purpose=assistants | jq -r '.id')
# Then use it in chat
curl -X POST "https://api.x.ai/v1/responses" \\
-H "Authorization: Bearer $XAI_API_KEY" \\
-H "Content-Type: application/json" \\
-d "{
\\"model\\": \\"grok-4-fast\\",
\\"input\\": [
{
\\"role\\": \\"user\\",
\\"content\\": [
{\\"type\\": \\"input_text\\", \\"text\\": \\"What was the total revenue in this report?\\"},
{\\"type\\": \\"input_file\\", \\"file_id\\": \\"$FILE_ID\\"}
]
}
]
}"
```
## Streaming Chat with Files
Get real-time responses while the model searches through your documents.
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user, file
client = Client(api_key=os.getenv("XAI_API_KEY"))
# Upload a document
document_content = b"""Product Specifications:
- Model: XR-2000
- Weight: 2.5 kg
- Dimensions: 30cm x 20cm x 10cm
- Power: 100W
- Features: Wireless connectivity, LCD display, Energy efficient
"""
uploaded_file = client.files.upload(document_content, filename="specs.txt")
# Create chat with streaming
chat = client.chat.create(model="grok-4-fast")
chat.append(user("What is the weight of the XR-2000?", file(uploaded_file.id)))
# Stream the response
is_thinking = True
for response, chunk in chat.stream():
# Show tool calls as they happen
for tool_call in chunk.tool_calls:
print(f"\\nSearching: {tool_call.function.name}")
if response.usage.reasoning_tokens and is_thinking:
print(f"\\rThinking... ({response.usage.reasoning_tokens} tokens)", end="", flush=True)
if chunk.content and is_thinking:
print("\\n\\nAnswer:")
is_thinking = False
if chunk.content:
print(chunk.content, end="", flush=True)
print(f"\\n\\nUsage: {response.usage}")
# Clean up
client.files.delete(uploaded_file.id)
```
## Multiple File Attachments
Query across multiple documents simultaneously.
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user, file
client = Client(api_key=os.getenv("XAI_API_KEY"))
# Upload multiple documents
file1_content = b"Document 1: The project started in January 2024."
file2_content = b"Document 2: The project budget is $500,000."
file3_content = b"Document 3: The team consists of 5 engineers and 2 designers."
file1 = client.files.upload(file1_content, filename="timeline.txt")
file2 = client.files.upload(file2_content, filename="budget.txt")
file3 = client.files.upload(file3_content, filename="team.txt")
# Create chat with multiple files
chat = client.chat.create(model="grok-4-fast")
chat.append(
user(
"Based on these documents, when did the project start, what is the budget, and how many people are on the team?",
file(file1.id),
file(file2.id),
file(file3.id),
)
)
response = chat.sample()
print(f"Answer: {response.content}")
print("\\nDocuments searched: 3")
print(f"Usage: {response.usage}")
# Clean up
client.files.delete(file1.id)
client.files.delete(file2.id)
client.files.delete(file3.id)
```
## Multi-Turn Conversations with Files
Maintain context across multiple questions about the same documents. Use encrypted content to preserve file context efficiently across multiple turns.
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user, file
client = Client(api_key=os.getenv("XAI_API_KEY"))
# Upload an employee record
document_content = b"""Employee Information:
Name: Alice Johnson
Department: Engineering
Years of Service: 5
Performance Rating: Excellent
Skills: Python, Machine Learning, Cloud Architecture
Current Project: AI Platform Redesign
"""
uploaded_file = client.files.upload(document_content, filename="employee.txt")
# Create a multi-turn conversation with encrypted content
chat = client.chat.create(
model="grok-4-fast",
use_encrypted_content=True, # Enable encrypted content for efficient multi-turn
)
# First turn: Ask about the employee name
chat.append(user("What is the employee's name?", file(uploaded_file.id)))
response1 = chat.sample()
print("Q1: What is the employee's name?")
print(f"A1: {response1.content}\\n")
# Add the response to conversation history
chat.append(response1)
# Second turn: Ask about department (agentic context is retained via encrypted content)
chat.append(user("What department does this employee work in?"))
response2 = chat.sample()
print("Q2: What department does this employee work in?")
print(f"A2: {response2.content}\\n")
# Add the response to conversation history
chat.append(response2)
# Third turn: Ask about skills
chat.append(user("What skills does this employee have?"))
response3 = chat.sample()
print("Q3: What skills does this employee have?")
print(f"A3: {response3.content}\\n")
# Clean up
client.files.delete(uploaded_file.id)
```
## Combining Files with Other Modalities
You can combine file attachments with images and other content types in a single message.
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user, file, image
client = Client(api_key=os.getenv("XAI_API_KEY"))
# Upload a text document with cat care information
text_content = b"Cat Care Guide: Cats require daily grooming, especially long-haired breeds. Regular brushing helps prevent matting and reduces shedding."
text_file = client.files.upload(text_content, filename="cat-care.txt")
# Use both file and image in the same message
chat = client.chat.create(model="grok-4-fast")
chat.append(
user(
"Based on the attached care guide, do you have any advice about the pictured cat?",
file(text_file.id),
image("https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/1200px-Cat03.jpg"),
)
)
response = chat.sample()
print(f"Analysis: {response.content}")
print(f"\\nUsage: {response.usage}")
# Clean up
client.files.delete(text_file.id)
```
## Combining Files with Code Execution
For data analysis tasks, you can attach data files and enable the code execution tool. This allows Grok to write and run Python code to analyze and process your data.
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user, file
from xai_sdk.tools import code_execution
client = Client(api_key=os.getenv("XAI_API_KEY"))
# Upload a CSV data file
csv_content = b"""product,region,revenue,units_sold
Product A,North,245000,1200
Product A,South,189000,950
Product A,East,312000,1500
Product A,West,278000,1350
Product B,North,198000,800
Product B,South,156000,650
Product B,East,234000,950
Product B,West,201000,850
Product C,North,167000,700
Product C,South,134000,550
Product C,East,198000,800
Product C,West,176000,725
"""
data_file = client.files.upload(csv_content, filename="sales_data.csv")
# Create chat with both file attachment and code execution
chat = client.chat.create(
model="grok-4-fast",
tools=[code_execution()], # Enable code execution
)
chat.append(
user(
"Analyze this sales data and calculate: 1) Total revenue by product, 2) Average units sold by region, 3) Which product-region combination has the highest revenue",
file(data_file.id)
)
)
# Stream the response to see code execution in real-time
is_thinking = True
for response, chunk in chat.stream():
for tool_call in chunk.tool_calls:
if tool_call.function.name == "code_execution":
print("\\n[Executing Code]")
if response.usage.reasoning_tokens and is_thinking:
print(f"\\rThinking... ({response.usage.reasoning_tokens} tokens)", end="", flush=True)
if chunk.content and is_thinking:
print("\\n\\nAnalysis Results:")
is_thinking = False
if chunk.content:
print(chunk.content, end="", flush=True)
print(f"\\n\\nUsage: {response.usage}")
# Clean up
client.files.delete(data_file.id)
```
The model will:
1. Access the attached data file
2. Write Python code to load and analyze the data
3. Execute the code in a sandboxed environment
4. Perform calculations and statistical analysis
5. Return the results and insights in the response
## Limitations and Considerations
### Request Constraints
* **No batch requests**: File attachments with document search are agentic requests and do not support batch mode (`n > 1`)
* **Streaming recommended**: Use streaming mode for better observability of document search process
### Document Complexity
* Highly unstructured or very long documents may require more processing
* Well-organized documents with clear structure are easier to search
* Large documents with many searches can result in higher token usage
### Model Compatibility
* **Recommended models**: `grok-4-fast`, `grok-4` for best document understanding
* **Agentic requirement**: File attachments require [agentic-capable](/developers/tools/overview) models that support server-side tools.
## Next Steps
Learn more about managing your files:
===/developers/model-capabilities/images/generation===
#### Model Capabilities
# Image Generation
Generate images from text prompts, edit existing images with natural language, or iteratively refine images through multi-turn conversations. The API supports batch generation of multiple images and control over aspect ratio and resolution.
## Quick Start
Generate an image with a single API call:
```python customLanguage="pythonXAI"
import xai_sdk
client = xai_sdk.Client()
response = client.image.sample(
prompt="A collage of London landmarks in a stenciled street‑art style",
model="grok-imagine-image",
)
print(response.url)
```
```bash
curl -X POST https://api.x.ai/v1/images/generations \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-d '{
"model": "grok-imagine-image",
"prompt": "A collage of London landmarks in a stenciled street‑art style"
}'
```
```python customLanguage="pythonOpenAISDK"
from openai import OpenAI
client = OpenAI(
base_url="https://api.x.ai/v1",
api_key="YOUR_API_KEY",
)
response = client.images.generate(
model="grok-imagine-image",
prompt="A collage of London landmarks in a stenciled street‑art style",
)
print(response.data[0].url)
```
```javascript customLanguage="javascriptOpenAISDK"
import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.XAI_API_KEY,
baseURL: 'https://api.x.ai/v1',
});
const response = await client.images.generate({
model: "grok-imagine-image",
prompt: "A collage of London landmarks in a stenciled street‑art style",
});
console.log(response.data[0].url);
```
```javascript customLanguage="javascriptAISDK"
import { xai } from "@ai-sdk/xai";
import { generateImage } from "ai";
const { image } = await generateImage({
model: xai.image("grok-imagine-image"),
prompt: "A collage of London landmarks in a stenciled street‑art style",
});
console.log(image.base64);
```
Images are returned as URLs by default. URLs are temporary, so download or process promptly. You can also request [base64 output](#base64-output) for embedding images directly.
## Image Editing
Edit an existing image by providing a source image along with your prompt. The model understands the image content and applies your requested changes.
The OpenAI SDK's `images.edit()` method is not supported for image editing because it uses `multipart/form-data`, while the xAI API requires `application/json`. Use the xAI SDK, Vercel AI SDK, or direct HTTP requests instead.
With the xAI SDK, use the same `sample()` method — just add the `image_url` parameter:
```python customLanguage="pythonXAI"
import base64
import xai_sdk
client = xai_sdk.Client()
# Load image from file and encode as base64
with open("landmarks.jpg", "rb") as f:
image_data = base64.b64encode(f.read()).decode("utf-8")
response = client.image.sample(
prompt="Change the landmarks to be New York City landmarks",
model="grok-imagine-image",
image_url=f"data:image/jpeg;base64,{image_data}",
)
print(response.url)
```
```bash
# Using a public URL as the source image
curl -X POST https://api.x.ai/v1/images/edits \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-d '{
"model": "grok-imagine-image",
"prompt": "Change the landmarks to be New York City landmarks",
"image": {
"url": "https://example.com/landmarks.jpg",
"type": "image_url"
}
}'
```
```javascript customLanguage="javascriptAISDK"
import { xai } from "@ai-sdk/xai";
import { generateImage } from "ai";
import fs from "fs";
// Load image and encode as base64
const imageBuffer = fs.readFileSync("landmarks.jpg");
const base64Image = imageBuffer.toString("base64");
const { image } = await generateImage({
model: xai.image("grok-imagine-image"),
prompt: "Change the landmarks to be New York City landmarks",
providerOptions: {
xai: {
image: `data:image/jpeg;base64,${base64Image}`,
},
},
});
console.log(image.base64);
```
You can provide the source image as:
* A **public URL** pointing to an image
* A **base64-encoded data URI** (e.g., `data:image/jpeg;base64,...`)
## Multi-Turn Editing
Chain multiple edits together by using each output as the input for the next. This enables iterative refinement — start with a base image and progressively add details, adjust styles, or make corrections.
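A minimal sketch of such a chain, with hypothetical prompts, assuming each intermediate URL is still valid when the next call is made:
```python customLanguage="pythonXAI"
import xai_sdk

client = xai_sdk.Client()
# Turn 1: generate a base image
base = client.image.sample(
    prompt="A cozy cabin in a snowy forest",
    model="grok-imagine-image",
)
# Turn 2: edit the generated image, using its URL as the source
edited = client.image.sample(
    prompt="Add warm light glowing from the windows",
    model="grok-imagine-image",
    image_url=base.url,
)
# Turn 3: refine further, feeding the previous output back in
final = client.image.sample(
    prompt="Make it nighttime with an aurora in the sky",
    model="grok-imagine-image",
    image_url=edited.url,
)
print(final.url)
```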
## Style Transfer
The `grok-imagine-image` model excels across a wide range of visual styles — from ultra-realistic photography to anime, oil paintings, pencil sketches, and beyond. Transform existing images by simply describing the desired aesthetic in your prompt.
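For instance, a minimal sketch that restyles an existing photo in a single edit call (the source URL is a placeholder):
```python customLanguage="pythonXAI"
import xai_sdk

client = xai_sdk.Client()
response = client.image.sample(
    prompt="Render this photo as a ukiyo-e woodblock print",
    model="grok-imagine-image",
    image_url="https://example.com/portrait.jpg",  # placeholder source image
)
print(response.url)
```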
## Concurrent Requests
When you need to generate multiple images with **different prompts** — such as applying various style transfers to the same source image, or generating unrelated images in parallel — use `AsyncClient` with `asyncio.gather` to fire requests concurrently. This is significantly faster than issuing them one at a time.
If you want multiple variations from the **same prompt**, use [`sample_batch()` with the `n` parameter](#multiple-images) instead. That generates all images in a single request and is the most efficient approach for same-prompt generation.
```python customLanguage="pythonXAI"
import asyncio
import xai_sdk
async def generate_concurrently():
    client = xai_sdk.AsyncClient()
    source_image = "https://example.com/portrait.jpg"
    # Each request uses a different prompt
    prompts = [
        "Render this image as an oil painting in the style of impressionism",
        "Render this image as a pencil sketch with detailed shading",
        "Render this image as pop art with bold colors and halftone dots",
        "Render this image as a watercolor painting with soft edges",
    ]
    # Fire all requests concurrently
    tasks = [
        client.image.sample(
            prompt=prompt,
            model="grok-imagine-image",
            image_url=source_image,
        )
        for prompt in prompts
    ]
    results = await asyncio.gather(*tasks)
    for prompt, result in zip(prompts, results):
        print(f"{prompt}: {result.url}")
asyncio.run(generate_concurrently())
```
## Configuration
### Multiple Images
Generate multiple images in a single request using the `sample_batch()` method and the `n` parameter. This returns a list of `ImageResponse` objects.
```python customLanguage="pythonXAI"
import xai_sdk
client = xai_sdk.Client()
responses = client.image.sample_batch(
prompt="A futuristic city skyline at night",
model="grok-imagine-image",
n=4,
)
for i, image in enumerate(responses):
    print(f"Variation {i + 1}: {image.url}")
```
```bash
curl -X POST https://api.x.ai/v1/images/generations \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-d '{
"model": "grok-imagine-image",
"prompt": "A futuristic city skyline at night",
"n": 4
}'
```
### Aspect Ratio
Control image dimensions with the `aspect_ratio` parameter:
| Ratio | Use case |
|-------|----------|
| `1:1` | Social media, thumbnails |
| `16:9` / `9:16` | Widescreen, mobile, stories |
| `4:3` / `3:4` | Presentations, portraits |
| `3:2` / `2:3` | Photography |
| `2:1` / `1:2` | Banners, headers |
| `19.5:9` / `9:19.5` | Modern smartphone displays |
| `20:9` / `9:20` | Ultra-wide displays |
| `auto` | Model auto-selects the best ratio for the prompt |
```python customLanguage="pythonXAI"
import xai_sdk
client = xai_sdk.Client()
response = client.image.sample(
prompt="Mountain landscape at sunrise",
model="grok-imagine-image",
aspect_ratio="16:9",
)
print(response.url)
```
```bash
curl -X POST https://api.x.ai/v1/images/generations \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-d '{
"model": "grok-imagine-image",
"prompt": "Mountain landscape at sunrise",
"aspect_ratio": "16:9"
}'
```
### Resolution
You can specify the resolution of the output image with the `resolution` parameter. Currently supported resolutions:
* `1k`
* `2k`
```python customLanguage="pythonXAI"
import xai_sdk
client = xai_sdk.Client()
response = client.image.sample(
prompt="An astronaut performing EVA in LEO.",
model="grok-imagine-image",
resolution="2k"
)
```
```bash
curl -X POST https://api.x.ai/v1/images/generations \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-d '{
"model": "grok-imagine-image",
"prompt": "An astronaut performing EVA in LEO.",
"resolution": "2k"
}'
```
### Base64 Output
For embedding images directly without downloading, request base64:
```python customLanguage="pythonXAI"
import xai_sdk
client = xai_sdk.Client()
response = client.image.sample(
prompt="A serene Japanese garden",
model="grok-imagine-image",
image_format="base64",
)
# Save to file
with open("garden.jpg", "wb") as f:
f.write(response.image)
```
```bash
curl -X POST https://api.x.ai/v1/images/generations \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-d '{
"model": "grok-imagine-image",
"prompt": "A serene Japanese garden",
"response_format": "b64_json"
}'
```
### Response Details
The xAI SDK exposes additional metadata on the response object beyond the image URL or base64 data.
**Moderation** — Check whether the generated image passed content moderation:
```python customLanguage="pythonXAI"
if response.respect_moderation:
    print(response.url)
else:
    print("Image filtered by moderation")
```
**Model** — Get the actual model used (resolving any aliases):
```python customLanguage="pythonXAI"
print(f"Model: {response.model}")
```
## Pricing
Image generation uses flat per-image pricing rather than token-based pricing like text models. Each generated image incurs a fixed fee regardless of prompt length.
For image editing, you are charged for both the input image and the generated output image.
For full pricing details on the `grok-imagine-image` model, see the [model page](/developers/models/grok-imagine-image).
## Limitations
* **Maximum images per request:** 10
* **URL expiration:** Generated URLs are temporary
* **Content moderation:** Images are subject to content policy review
## Related
* [Models](/developers/models) — Available image models
* [Video Generation](/developers/model-capabilities/video/generation) — Animate generated images
* [API Reference](/developers/rest-api-reference) — Full endpoint documentation
===/developers/model-capabilities/images/understanding===
#### Model Capabilities
# Image Understanding
When sending images, it is advised not to store request/response history on the server (set `store: false` on the request); otherwise the request may fail.
Some models allow images in the input. The model will consider the image context when generating the response.
## Constructing the message body - difference from text-only prompt
The request message for image understanding is similar to a text-only prompt. The main difference is that instead of plain text input:
```json
[
{
"role": "user",
"content": "What is in this image?"
}
]
```
We send in `content` as a list of objects:
```json
[
{
"role": "user",
"content": [
{
"type": "input_image",
"image_url": "data:image/jpeg;base64,",
"detail": "high"
},
{
"type": "input_text",
"text": "What is in this image?"
}
]
}
]
```
The `image_url` field can also be a public image URL on the internet.
### Image understanding example
```python customLanguage="pythonXAI"
import os
from xai_sdk import Client
from xai_sdk.chat import user, image
client = Client(
api_key=os.getenv("XAI_API_KEY"),
management_api_key=os.getenv("XAI_MANAGEMENT_API_KEY"),
timeout=3600,
)
image_url = "https://science.nasa.gov/wp-content/uploads/2023/09/web-first-images-release.png"
chat = client.chat.create(model="grok-4-1-fast-reasoning")
chat.append(
user(
"What's in this image?",
image(image_url=image_url, detail="high"),
)
)
response = chat.sample()
print(response)
# The response ID that can be used to continue the conversation later
print(response.id)
```
```python customLanguage="pythonOpenAISDK"
import os
import httpx
from openai import OpenAI
client = OpenAI(
api_key="",
base_url="https://api.x.ai/v1",
timeout=httpx.Timeout(3600.0), # Override default timeout with longer timeout for reasoning models
)
image_url = (
"https://science.nasa.gov/wp-content/uploads/2023/09/web-first-images-release.png"
)
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[
{
"role": "user",
"content": [
{
"type": "input_image",
"image_url": image_url,
"detail": "high",
},
{
"type": "input_text",
"text": "What's in this image?",
},
],
},
],
)
print(response)
# The response ID that can be used to continue the conversation later
print(response.id)
```
```javascript customLanguage="javascriptOpenAISDK"
import OpenAI from "openai";
const client = new OpenAI({
apiKey: "",
baseURL: "https://api.x.ai/v1",
timeout: 360000, // Override default timeout with longer timeout for reasoning models
});
const image_url =
"https://science.nasa.gov/wp-content/uploads/2023/09/web-first-images-release.png";
const response = await client.responses.create({
model: "grok-4-1-fast-reasoning",
input: [
{
role: "user",
content: [
{
type: "input_image",
image_url: image_url,
detail: "high",
},
{
type: "input_text",
text: "What's in this image?",
},
],
},
],
});
console.log(response);
// The response ID that can be used to recall the conversation later
console.log(response.id);
```
```javascript customLanguage="javascriptAISDK"
import { xai } from '@ai-sdk/xai';
import { generateText } from 'ai';
const { text, response } = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
messages: [
{
role: 'user',
content: [
{
type: 'image',
image: new URL('https://science.nasa.gov/wp-content/uploads/2023/09/web-first-images-release.png'),
},
{
type: 'text',
text: "What's in this image?",
},
],
},
]
});
console.log(text);
// The response ID can be used to continue the conversation
console.log(response.id);
```
```bash
curl https://api.x.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-m 3600 \
-d '{
"model": "grok-4-1-fast-reasoning",
"input": [
{
"role": "user",
"content": [
{
"type": "input_image",
"image_url": "https://science.nasa.gov/wp-content/uploads/2023/09/web-first-images-release.png",
"detail": "high"
},
{
"type": "input_text",
"text": "What'\''s in this image?"
}
]
}
]
}'
```
### Image input general limits
* Maximum image size: `20MiB`
* Maximum number of images: No limit
* Supported image file types: `jpg/jpeg` or `png`.
* Any image/text input order is accepted (e.g. text prompt can precede image prompt)
===/developers/model-capabilities/legacy/chat-completions===
#### Model Capabilities
# Chat Completions
Chat Completions is offered as a legacy endpoint. Most of our new features will come to the Responses API first.
Looking to migrate? Check out our [Migrating to Responses API](/developers/model-capabilities/text/comparison) guide for a detailed comparison and migration steps.
Text in, text out. Chat is the most popular feature of the xAI API and can be used for anything from summarizing articles and generating creative writing to answering questions, providing customer support, and assisting with coding tasks.
## Prerequisites
* xAI Account: You need an xAI account to access the API.
* API Key: Ensure that your API key has access to the Chat Completions endpoint and the model you want to use is enabled.
If you don't have these and are unsure of how to create one, follow [the Hitchhiker's Guide to Grok](/developers/quickstart).
You can create an API key on the [xAI Console API Keys Page](https://console.x.ai/team/default/api-keys).
Set your API key in your environment:
```bash
export XAI_API_KEY="your_api_key"
```
## A basic chat completions example
You can also stream the response, which is covered in [Streaming Response](/developers/model-capabilities/text/streaming).
The user sends a request to the xAI API endpoint. The API processes this and returns a complete response.
```python customLanguage="pythonXAI"
import os
from xai_sdk import Client
from xai_sdk.chat import user, system
client = Client(
api_key=os.getenv("XAI_API_KEY"),
timeout=3600, # Override default timeout with longer timeout for reasoning models
)
chat = client.chat.create(model="grok-4-1-fast-reasoning")
chat.append(system("You are a PhD-level mathematician."))
chat.append(user("What is 2 + 2?"))
response = chat.sample()
print(response.content)
```
```python customLanguage="pythonOpenAISDK"
import os
import httpx
from openai import OpenAI
client = OpenAI(
api_key="",
base_url="https://api.x.ai/v1",
timeout=httpx.Timeout(3600.0), # Override default timeout with longer timeout for reasoning models
)
completion = client.chat.completions.create(
model="grok-4-1-fast-reasoning",
messages=[
{"role": "system", "content": "You are a PhD-level mathematician."},
{"role": "user", "content": "What is 2 + 2?"},
],
)
print(completion.choices[0].message)
```
```javascript customLanguage="javascriptOpenAISDK"
import OpenAI from "openai";
const client = new OpenAI({
apiKey: "",
baseURL: "https://api.x.ai/v1",
timeout: 360000, // Override default timeout with longer timeout for reasoning models
});
const completion = await client.chat.completions.create({
model: "grok-4-1-fast-reasoning",
messages: [
{
role: "system",
content: "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."
},
{
role: "user",
content: "What is the meaning of life, the universe, and everything?"
},
],
});
console.log(completion.choices[0].message);
```
```javascript customLanguage="javascriptAISDK"
import { xai } from '@ai-sdk/xai';
import { generateText } from 'ai';
const result = await generateText({
model: xai('grok-4-1-fast-reasoning'),
system:
"You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy.",
prompt: 'What is the meaning of life, the universe, and everything?',
});
console.log(result.text);
```
```bash
curl https://api.x.ai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-m 3600 \
-d '{
"messages": [
{
"role": "system",
"content": "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."
},
{
"role": "user",
"content": "What is the meaning of life, the universe, and everything?"
}
],
"model": "grok-4-1-fast-reasoning",
"stream": false
}'
```
Response:
```python customLanguage="pythonXAI"
'2 + 2 equals 4.'
```
```python customLanguage="pythonOpenAISDK"
ChatCompletionMessage(
content='2 + 2 equals 4.',
refusal=None,
role='assistant',
audio=None,
function_call=None,
tool_calls=None
)
```
```javascript customLanguage="javascriptOpenAISDK"
{
role: 'assistant',
content: `Ah, the ultimate question! According to Douglas Adams' "The Hitchhiker's Guide to the Galaxy," the answer to the ultimate question of life, the universe, and everything is **42**. However, the guide also notes that the actual question to which this is the answer is still unknown. Isn't that delightfully perplexing? Now, if you'll excuse me, I'll just go ponder the intricacies of existence.`
refusal: null
}
```
```javascript customLanguage="javascriptAISDK"
// result object structure
{
text: "Ah, the ultimate question! As someone...",
finishReason: "stop",
usage: {
inputTokens: 716,
outputTokens: 126,
totalTokens: 1009,
reasoningTokens: 167
},
totalUsage: { /* same as usage */ }
}
```
```bash
{
"id": "0daf962f-a275-4a3c-839a-047854645532",
"object": "chat.completion",
"created": 1739301120,
"model": "grok-4-1-fast-reasoning",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "The meaning of life, the universe, and everything is a question that has puzzled philosophers, scientists, and hitchhikers alike. According to the Hitchhiker's Guide to the Galaxy, the answer to this ultimate question is simply \"42\". However, the exact nature of the question itself remains unknown. So, while we may have the answer, the true meaning behind it is still up for debate. In the meantime, perhaps we should all just enjoy the journey and have a good laugh along the way!",
"refusal": null
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 41,
"completion_tokens": 104,
"total_tokens": 145,
"prompt_tokens_details": {
"text_tokens": 41,
"audio_tokens": 0,
"image_tokens": 0,
"cached_tokens": 0
}
},
"system_fingerprint": "fp_84ff176447"
}
```
## Conversations
The xAI API is stateless and does not process a new request with the context of your previous request history.
However, you can provide previous chat generation prompts and results to a new chat generation request to let the model process your new request with the context in mind.
An example message history:
```json
[
  {
    "role": "system",
    "content": [{ "type": "text", "text": "You are a helpful and funny assistant." }]
  },
  {
    "role": "user",
    "content": [{ "type": "text", "text": "Why don't eggs tell jokes?" }]
  },
  {
    "role": "assistant",
    "content": [{ "type": "text", "text": "They'd crack up!" }]
  },
  {
    "role": "user",
    "content": [{ "type": "text", "text": "Can you explain the joke?" }]
  }
]
```
By specifying roles, you can change how the model ingests the content.
The `system` role content should define, in an instructive tone, the way the model should respond to user request.
The `user` role content is usually used for user requests or data sent to the model.
The `assistant` role content is usually either in the model's response, or when sent within the prompt, indicates the model's response as part of conversation history.
The `developer` role is supported as an alias for `system`. Only a **single** system/developer message should be used, and it should always be the **first message** in your conversation.
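As a sketch of how such a history is actually resent, using the OpenAI SDK against the Chat Completions endpoint (message `content` may also be a plain string, as here):
```python customLanguage="pythonOpenAISDK"
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("XAI_API_KEY"),
    base_url="https://api.x.ai/v1",
)
# Previous turns, replayed verbatim as conversation history
history = [
    {"role": "system", "content": "You are a helpful and funny assistant."},
    {"role": "user", "content": "Why don't eggs tell jokes?"},
    {"role": "assistant", "content": "They'd crack up!"},
]
# Append the new user turn and resend the full history
history.append({"role": "user", "content": "Can you explain the joke?"})
completion = client.chat.completions.create(
    model="grok-4-1-fast-reasoning",
    messages=history,
)
print(completion.choices[0].message.content)
```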
## Image understanding
Some models allow images in the input. The model will consider the image context when generating the response.
### Constructing the message body - difference from text-only prompt
The request message for image understanding is similar to a text-only prompt. The main difference is that instead of plain text input:
```json
[
{
"role": "user",
"content": "What is in this image?"
}
]
```
We send in `content` as a list of objects:
```json
[
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": "data:image/jpeg;base64,",
"detail": "high"
}
},
{
"type": "text",
"text": "What is in this image?"
}
]
}
]
```
The `image_url.url` field can also be a public image URL on the internet.
### Image understanding example
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user, image
client = Client(api_key=os.getenv('XAI_API_KEY'))
image_url = "https://science.nasa.gov/wp-content/uploads/2023/09/web-first-images-release.png"
chat = client.chat.create(model="grok-4")
chat.append(
user(
"What's in this image?",
image(image_url=image_url, detail="high"),
)
)
response = chat.sample()
print(response.content)
```
```pythonOpenAISDK
import os
from openai import OpenAI
XAI_API_KEY = os.getenv("XAI_API_KEY")
image_url = (
"https://science.nasa.gov/wp-content/uploads/2023/09/web-first-images-release.png"
)
client = OpenAI(
api_key=XAI_API_KEY,
base_url="https://api.x.ai/v1",
)
messages = [
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": image_url,
"detail": "high",
},
},
{
"type": "text",
"text": "What's in this image?",
},
],
},
]
completion = client.chat.completions.create(
model="grok-4",
messages=messages,
)
print(completion.choices[0].message.content)
```
```javascriptOpenAISDK
import OpenAI from "openai";
const openai = new OpenAI({
apiKey: process.env.XAI_API_KEY,
baseURL: "https://api.x.ai/v1",
});
const image_url =
"https://science.nasa.gov/wp-content/uploads/2023/09/web-first-images-release.png";
const completion = await openai.chat.completions.create({
model: "grok-4",
messages: [
{
role: "user",
content: [
{
type: "image_url",
image_url: {
url: image_url,
detail: "high",
},
},
{
type: "text",
text: "What's in this image?",
},
],
},
],
});
console.log(completion.choices[0].message.content);
```
```javascriptAISDK
import { xai } from '@ai-sdk/xai';
import { generateText } from 'ai';
const result = await generateText({
model: "grok-4",
messages: [
{
role: 'user',
content: [
{
type: 'image',
image: new URL(
'https://science.nasa.gov/wp-content/uploads/2023/09/web-first-images-release.png',
),
},
{
type: 'text',
text: "What's in this image?",
},
],
},
],
});
console.log(result.text);
```
### Image input general limits
* Maximum image size: `20MiB`
* Maximum number of images: No limit
* Supported image file types: `jpg/jpeg` or `png`.
* Any image/text input order is accepted (e.g. text prompt can precede image prompt)
### Image detail levels
The `"detail"` field controls the level of pre-processing applied to the image that will be provided to the model. It is optional and determines the resolution at which the image is processed. The possible values for `"detail"` are:
* **`"auto"`**: The system will automatically determine the image resolution to use. This is the default setting, balancing speed and detail based on the model's assessment.
* **`"low"`**: The system will process a low-resolution version of the image. This option is faster and consumes fewer tokens, making it more cost-effective, though it may miss finer details.
* **`"high"`**: The system will process a high-resolution version of the image. This option is slower and more expensive in terms of token usage, but it allows the model to attend to more nuanced details in the image.
===/developers/model-capabilities/text/comparison===
#### Model Capabilities
# Comparison with Chat Completions API
The Responses API is the recommended way to interact with xAI models. Here's how it compares to the legacy Chat Completions API:
| Feature | Responses API | Chat Completions API (Deprecated) |
|---------|---------------|-----------------------------------|
| **Stateful Conversations** | Built-in support via `previous_response_id` | Stateless - must resend full history |
| **Server-side Storage** | Responses stored for 30 days | No storage - manage history yourself |
| **Reasoning Models** | Full support with encrypted reasoning content | Limited - only `grok-3-mini` returns `reasoning_content` |
| **Agentic Tools** | Native support for tools (search, code execution, MCP) | Function calling only |
| **Billing Optimization** | Automatic caching of conversation history | Full history billed on each request |
| **Future Features** | All new capabilities delivered here first | Legacy endpoint, limited updates |
## Key API Changes
### Parameter Mapping
| Chat Completions | Responses API | Notes |
|-----------------|---------------|-------|
| `messages` | `input` | Array of message objects |
| `max_tokens` | `max_output_tokens` | Maximum tokens to generate |
| — | `previous_response_id` | Continue a stored conversation |
| — | `store` | Control server-side storage (default: `true`) |
| — | `include` | Request additional data like `reasoning.encrypted_content` |
### Response Structure
The response format differs between the two APIs:
**Chat Completions** returns content in `choices[0].message.content`:
```json
{
"id": "chatcmpl-123",
"choices": [{
"message": {
"role": "assistant",
"content": "Hello! How can I help you?"
}
}]
}
```
**Responses API** returns content in an `output` array with typed items:
```json
{
"id": "resp_123",
"output": [{
"type": "message",
"role": "assistant",
"content": [{
"type": "output_text",
"text": "Hello! How can I help you?"
}]
}]
}
```
### Multi-turn Conversations
With Chat Completions, you must resend the entire conversation history with each request. With Responses API, you can use `previous_response_id` to continue a conversation:
```pythonWithoutSDK
# First request
response = client.responses.create(
model="grok-4",
input=[{"role": "user", "content": "What is 2+2?"}],
)
# Continue the conversation - no need to resend history
second_response = client.responses.create(
model="grok-4",
previous_response_id=response.id,
input=[{"role": "user", "content": "Now multiply that by 10"}],
)
```
## Migration Path
Migrating from Chat Completions to Responses API is straightforward. Here's how to update your code for each SDK:
### Vercel AI SDK
Switch from `xai()` to `xai.responses()`:
```javascriptAISDK deletedLines="1" addedLines="2"
model: xai('grok-4'),
model: xai.responses('grok-4'),
```
### OpenAI SDK (JavaScript)
Switch from `client.chat.completions.create` to `client.responses.create`, and rename `messages` to `input`:
```javascriptWithoutSDK deletedLines="1,3" addedLines="2,4"
const response = await client.chat.completions.create({
const response = await client.responses.create({
messages: [
input: [
{ role: "user", content: "Hello!" }
],
});
```
### OpenAI SDK (Python)
Switch from `client.chat.completions.create` to `client.responses.create`, and rename `messages` to `input`:
```pythonWithoutSDK deletedLines="1,3" addedLines="2,4"
response = client.chat.completions.create(
response = client.responses.create(
messages=[
input=[
{"role": "user", "content": "Hello!"}
],
)
```
### cURL
Change the endpoint from `/v1/chat/completions` to `/v1/responses`, and rename `messages` to `input`:
```bash deletedLines="1,5" addedLines="2,6"
curl https://api.x.ai/v1/chat/completions \
curl https://api.x.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-d '{ "model": "grok-4", "messages": [{"role": "user", "content": "Hello!"}] }'
-d '{ "model": "grok-4", "input": [{"role": "user", "content": "Hello!"}] }'
```
This will work for most use cases. If you have a unique integration, refer to the [Responses API documentation](/developers/model-capabilities/text/generate-text) for detailed guidance.
===/developers/model-capabilities/text/generate-text===
#### Model Capabilities
# Generate Text
The Responses API is the preferred way of interacting with our models via API. It allows optional **stateful interactions** with our models,
where **previous input prompts, reasoning content, and model responses are stored on xAI's servers**. You can continue the interaction by appending new
prompt messages instead of resending the full conversation. This behavior is on by default. If you would prefer to keep your request/response history locally, see [Disable storing previous request/response on server](#disable-storing-previous-requestresponse-on-server).
Although you don't need to enter the conversation history in the request body, you will still be
billed for the entire conversation history when using the Responses API. The cost might be reduced because
part of the conversation history is cached.
**The responses will be stored for 30 days, after which they will be removed. This means you can use the response ID to retrieve or continue a conversation within 30 days of sending the request.**
If you want to continue a conversation after 30 days, please store your responses history and the encrypted thinking content locally, and pass them in a new request body.
For Python, we also offer our [xAI SDK](https://github.com/xai-org/xai-sdk-python), which covers all of our features and uses gRPC for optimal performance. It's fine to mix both. The xAI SDK lets you interact with all our products such as Collections, Voice API, API key management, and more, while the Responses API is better suited for chatbots and usage in RESTful APIs.
## Prerequisites
* xAI Account: You need an xAI account to access the API.
* API Key: Ensure that your API key has access to the Responses API endpoint and the model you want to use is enabled.
If you don't have these and are unsure of how to create one, follow [the Hitchhiker's Guide to Grok](/developers/quickstart).
You can create an API key on the [xAI Console API Keys Page](https://console.x.ai/team/default/api-keys).
Set your API key in your environment:
```bash
export XAI_API_KEY="your_api_key"
```
## Creating a new model response
The first step in using the Responses API is analogous to using the legacy Chat Completions API: you create a new response from your prompt messages. By default, your request/response history is stored on our server.
The `instructions` parameter is currently not supported. The API will return an error if it is specified.
```python customLanguage="pythonXAI"
import os
from xai_sdk import Client
from xai_sdk.chat import user, system
client = Client(
api_key=os.getenv("XAI_API_KEY"),
management_api_key=os.getenv("XAI_MANAGEMENT_API_KEY"),
timeout=3600,
)
chat = client.chat.create(model="grok-4-1-fast-reasoning")
chat.append(system("You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."))
chat.append(user("What is the meaning of life, the universe, and everything?"))
response = chat.sample()
print(response)
# The response ID that can be used to continue the conversation later
print(response.id)
```
```python customLanguage="pythonOpenAISDK"
import os
import httpx
from openai import OpenAI
client = OpenAI(
api_key="",
base_url="https://api.x.ai/v1",
timeout=httpx.Timeout(3600.0), # Override default timeout with longer timeout for reasoning models
)
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[
{"role": "system", "content": "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."},
{"role": "user", "content": "What is the meaning of life, the universe, and everything?"},
],
)
print(response)
# The response ID that can be used to continue the conversation later
print(response.id)
```
```javascript customLanguage="javascriptOpenAISDK"
import OpenAI from "openai";
const client = new OpenAI({
apiKey: "",
baseURL: "https://api.x.ai/v1",
timeout: 360000, // Override default timeout with longer timeout for reasoning models
});
const response = await client.responses.create({
model: "grok-4-1-fast-reasoning",
input: [
{
role: "system",
content: "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."
},
{
role: "user",
content: "What is the meaning of life, the universe, and everything?"
},
],
});
console.log(response);
// The response ID that can be used to recall the conversation later
console.log(response.id);
```
```javascript customLanguage="javascriptAISDK"
import { xai } from '@ai-sdk/xai';
import { generateText } from 'ai';
const { text, response } = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
system: "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy.",
prompt: "What is the meaning of life, the universe, and everything?",
});
console.log(text);
// The response ID can be used to continue the conversation
console.log(response.id);
```
```bash
curl https://api.x.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-m 3600 \
-d '{
"model": "grok-4-1-fast-reasoning",
"input": [
{
"role": "system",
"content": "You are Grok, a chatbot inspired by the Hitchhiker'\''s Guide to the Galaxy."
},
{
"role": "user",
"content": "What is the meaning of life, the universe, and everything?"
}
]
}'
```
The `developer` role is supported as an alias for `system`. Only a **single** system/developer message should be used, and it should always be the **first message** in your conversation.
### Disable storing previous request/response on server
If you do not want to store your previous request/response on the server, you can set `store: false` on the request.
```python customLanguage="pythonXAI"
import os
from xai_sdk import Client
from xai_sdk.chat import user, system
client = Client(
api_key=os.getenv("XAI_API_KEY"),
management_api_key=os.getenv("XAI_MANAGEMENT_API_KEY"),
timeout=3600,
)
chat = client.chat.create(model="grok-4-1-fast-reasoning", store_messages=False)
chat.append(system("You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."))
chat.append(user("What is the meaning of life, the universe, and everything?"))
response = chat.sample()
print(response)
```
```python customLanguage="pythonOpenAISDK"
import os
import httpx
from openai import OpenAI
client = OpenAI(
api_key="",
base_url="https://api.x.ai/v1",
timeout=httpx.Timeout(3600.0), # Override default timeout with longer timeout for reasoning models
)
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[
{"role": "system", "content": "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."},
{"role": "user", "content": "What is the meaning of life, the universe, and everything?"},
],
store=False
)
print(response)
```
```javascript customLanguage="javascriptOpenAISDK"
import OpenAI from "openai";
const client = new OpenAI({
apiKey: "",
baseURL: "https://api.x.ai/v1",
timeout: 360000, // Override default timeout with longer timeout for reasoning models
});
const response = await client.responses.create({
model: "grok-4-1-fast-reasoning",
input: [
{
role: "system",
content: "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."
},
{
role: "user",
content: "What is the meaning of life, the universe, and everything?"
},
],
store: false
});
console.log(response);
```
```bash
curl https://api.x.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-m 3600 \
-d '{
"model": "grok-4-1-fast-reasoning",
"input": [
{
"role": "system",
"content": "You are Grok, a chatbot inspired by the Hitchhiker'\''s Guide to the Galaxy."
},
{
"role": "user",
"content": "What is the meaning of life, the universe, and everything?"
}
],
"store": false
}'
```
### Returning encrypted thinking content
If you want the encrypted thinking traces to be returned, specify `use_encrypted_content=True` in the xAI SDK or gRPC request message, or `include: ["reasoning.encrypted_content"]` in the request body.
Modify the steps to create a chat client (xAI SDK) or change the request body as follows:
```python customLanguage="pythonXAI"
chat = client.chat.create(model="grok-4-1-fast-reasoning",
use_encrypted_content=True)
```
```python customLanguage="pythonOpenAISDK"
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[
{"role": "system", "content": "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."},
{"role": "user", "content": "What is the meaning of life, the universe, and everything?"},
],
include=["reasoning.encrypted_content"]
)
```
```javascript customLanguage="javascriptWithoutSDK"
const response = await client.responses.create({
model: "grok-4-1-fast-reasoning",
input: [
{"role": "system", "content": "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."},
{"role": "user", "content": "What is the meaning of life, the universe, and everything?"},
],
include: ["reasoning.encrypted_content"],
});
```
```javascript customLanguage="javascriptAISDK"
import { xai } from '@ai-sdk/xai';
import { generateText } from 'ai';
const { text, reasoning } = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
system: "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy.",
prompt: "What is the meaning of life, the universe, and everything?",
providerOptions: {
xai: {
include: ['reasoning.encrypted_content'],
},
},
});
console.log(text);
console.log(reasoning); // Contains encrypted reasoning content
```
```bash
curl https://api.x.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-m 3600 \
-d '{
"model": "grok-4-1-fast-reasoning",
"input": [
{
"role": "system",
"content": "You are Grok, a chatbot inspired by the Hitchhiker'\''s Guide to the Galaxy."
},
{
"role": "user",
"content": "What is the meaning of life, the universe, and everything?"
}
],
"include": ["reasoning.encrypted_content"]
}'
```
See [Adding encrypted thinking content](#adding-encrypted-thinking-content) on how to use the returned encrypted thinking content when making a new request.
## Chaining the conversation
We now have the `id` of the first response. With the Chat Completions API, we would typically send a new stateless request containing all of the previous messages.
With the Responses API, we can instead send the `id` of the previous response together with the new messages to append to it.
```python customLanguage="pythonXAI"
import os
from xai_sdk import Client
from xai_sdk.chat import user, system
client = Client(
api_key=os.getenv("XAI_API_KEY"),
management_api_key=os.getenv("XAI_MANAGEMENT_API_KEY"),
timeout=3600,
)
chat = client.chat.create(model="grok-4-1-fast-reasoning", store_messages=True)
chat.append(system("You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."))
chat.append(user("What is the meaning of life, the universe, and everything?"))
response = chat.sample()
print(response)
# The response ID that can be used to continue the conversation later
print(response.id)
# New steps
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
previous_response_id=response.id,
store_messages=True,
)
chat.append(user("What is the meaning of 42?"))
second_response = chat.sample()
print(second_response)
# The response ID that can be used to continue the conversation later
print(second_response.id)
```
```python customLanguage="pythonOpenAISDK"
# Previous steps
import os
import httpx
from openai import OpenAI
client = OpenAI(
api_key="",
base_url="https://api.x.ai/v1",
timeout=httpx.Timeout(3600.0), # Override default timeout with longer timeout for reasoning models
)
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[
{"role": "system", "content": "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."},
{"role": "user", "content": "What is the meaning of life, the universe, and everything?"},
],
)
print(response)
# The response ID that can be used to continue the conversation later
print(response.id)
# New steps
second_response = client.responses.create(
model="grok-4-1-fast-reasoning",
previous_response_id=response.id,
input=[
{"role": "user", "content": "What is the meaning of 42?"},
],
)
print(second_response)
# The response ID that can be used to continue the conversation later
print(second_response.id)
```
```javascript customLanguage="javascriptWithoutSDK"
// Previous steps
import OpenAI from "openai";
const client = new OpenAI({
apiKey: "",
baseURL: "https://api.x.ai/v1",
timeout: 360000, // Override default timeout with longer timeout for reasoning models
});
const response = await client.responses.create({
model: "grok-4-1-fast-reasoning",
input: [
{
role: "system",
content: "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."
},
{
role: "user",
content: "What is the meaning of life, the universe, and everything?"
},
],
});
console.log(response);
// The response ID that can be used to recall the conversation later
console.log(response.id);
const secondResponse = await client.responses.create({
model: "grok-4-1-fast-reasoning",
previous_response_id: response.id,
input: [
{"role": "user", "content": "What is the meaning of 42?"},
],
});
console.log(secondResponse);
// The response ID that can be used to recall the conversation later
console.log(secondResponse.id);
```
```javascript customLanguage="javascriptAISDK"
import { xai } from '@ai-sdk/xai';
import { generateText } from 'ai';
// First request
const result = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
system: "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy.",
prompt: "What is the meaning of life, the universe, and everything?",
});
console.log(result.text);
// Get the response ID from the response object
const responseId = result.response.id;
// Continue the conversation using previousResponseId
const { text: secondResponse } = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
prompt: "What is the meaning of 42?",
providerOptions: {
xai: {
previousResponseId: responseId,
},
},
});
console.log(secondResponse);
```
```bash
curl https://api.x.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-m 3600 \
-d '{
"model": "grok-4-1-fast-reasoning",
"previous_response_id": "The previous response ID",
"input": [
{
"role": "user",
"content": "What is the meaning of 42?"
}
]
}'
```
### Adding encrypted thinking content
After returning the encrypted thinking content, you can also add it to a new response's input:
```python customLanguage="pythonXAI"
import os
from xai_sdk import Client
from xai_sdk.chat import user, system
client = Client(
api_key=os.getenv("XAI_API_KEY"),
management_api_key=os.getenv("XAI_MANAGEMENT_API_KEY"),
timeout=3600,
)
chat = client.chat.create(model="grok-4-1-fast-reasoning", store_messages=True, use_encrypted_content=True)
chat.append(system("You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."))
chat.append(user("What is the meaning of life, the universe, and everything?"))
response = chat.sample()
print(response)
# The response ID that can be used to continue the conversation later
print(response.id)
# New steps
chat.append(response)  # Append the response; the SDK automatically adds its outputs to the message history
chat.append(user("What is the meaning of 42?"))
second_response = chat.sample()
print(second_response)
# The response ID that can be used to continue the conversation later
print(second_response.id)
```
```python customLanguage="pythonOpenAISDK"
# Previous steps
import os
import httpx
from openai import OpenAI
client = OpenAI(
api_key="",
base_url="https://api.x.ai/v1",
timeout=httpx.Timeout(3600.0), # Override default timeout with longer timeout for reasoning models
)
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[
{"role": "system", "content": "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."},
{"role": "user", "content": "What is the meaning of life, the universe, and everything?"},
],
include=["reasoning.encrypted_content"]
)
print(response)
# The response ID that can be used to continue the conversation later
print(response.id)
# New steps
second_response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[
*response.output, # Use response.output instead of the stored response
{"role": "user", "content": "What is the meaning of 42?"},
],
)
print(second_response)
# The response ID that can be used to continue the conversation later
print(second_response.id)
```
```javascript customLanguage="javascriptWithoutSDK"
// Previous steps
import OpenAI from "openai";
const client = new OpenAI({
apiKey: "",
baseURL: "https://api.x.ai/v1",
timeout: 360000, // Override default timeout with longer timeout for reasoning models
});
const response = await client.responses.create({
model: "grok-4-1-fast-reasoning",
input: [
{
role: "system",
content: "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."
},
{
role: "user",
content: "What is the meaning of life, the universe, and everything?"
},
],
include: ["reasoning.encrypted_content"],
});
console.log(response);
// The response ID that can be used to recall the conversation later
console.log(response.id);
const secondResponse = await client.responses.create({
model: "grok-4-1-fast-reasoning",
input: [
...response.output, // Use response.output instead of the stored response
{"role": "user", "content": "What is the meaning of 42?"},
],
});
console.log(secondResponse);
// The response ID that can be used to recall the conversation later
console.log(secondResponse.id);
```
```javascript customLanguage="javascriptAISDK"
import { xai } from '@ai-sdk/xai';
import { generateText } from 'ai';
// First request with encrypted reasoning content
const result = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
system: "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy.",
prompt: "What is the meaning of life, the universe, and everything?",
providerOptions: {
xai: {
include: ['reasoning.encrypted_content'],
},
},
});
console.log(result.text);
// Continue the conversation using previousResponseId
// The encrypted content is automatically included when using previousResponseId
const { text: secondResponse } = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
prompt: "What is the meaning of 42?",
providerOptions: {
xai: {
previousResponseId: result.response.id,
include: ['reasoning.encrypted_content'],
},
},
});
console.log(secondResponse);
```
```bash
curl https://api.x.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-m 3600 \
-d '{
"model": "grok-4-1-fast-reasoning",
"input": [
{
"role": "system",
"content": "You are Grok, a chatbot inspired by the Hitchhiker'\''s Guide to the Galaxy."
},
{
"role": "user",
"content": "What is the meaning of life, the universe, and everything?"
},
{
"id": "rs_51abe1aa-599b-80b6-57c8-dddc6263362f_us-east-1",
"summary": [],
"type": "reasoning",
"status": "completed",
"encrypted_content": "bvV88j99ILvgfHRTHCUSJtw+ISji6txJzPdZNbcSVuDk4OMG2Z9r5wOBBwjd3u3Hhm9XtpCWJO1YgTOlpgbn+g7DZX+pOagYYrCFUpQ19XkWz6Je8bHG9JcSDoGDqNgRbDbAUO8at6RCyqgPupJj5ArBDCt73fGQLTC4G3S0JMK9LsPiWz6GPj6qyzYoRzkj4R6bntRm74E4h8Y+z6u6B7+ixPSv8s1EFs8c+NUAB8TNKZZpXZquj2LXfx1xAie85Syl7qLqxLNtDG1dNBhBnHpYoE4gQzwyXqywf5pF2Q2imzPNzGQhurK+6gaNWgZbxRmjhdsW6TnzO5Kk6pzb5qpfgfcEScQeYHSj5GpD+yDUCNlhdbzhhWnEErH+wuBPpTG6UQhiC7m7yrJ7IY2E8K/BeUPlUvkhMaMwb4dA279pWMJdchNJ+TAxca+JVc80pXMG/PmrQUNJU9qdXRLbNmQbRadBNwV2qkPfgggL3q0yNd7Un9P+atmP3B9keBILif3ufsBDtVUobEniiyGV7YVDvQ/fQRVs7XDxJiOKkogjjQySyHgpjseO8iG5xtb9mrz6B3mDvv2aAuyDL6MHZRM7QDVPjUbgNMzDm5Sm3J7IhtzfR+3eMDws3qeTsxOt1KOslu983Btv1Wx37b5HJqX1pQU1dae/kOSJ7MifFd6wMkQtQBDgVoG3ka9wq5Vxq9Ki8bDOOMcwA2kUXhCcY3TZCXJfDWSKPTcCoNCYIv5LT2NFVdamiSfLIyeOjBNz459BfMvAoOZShFViQyc5YwjnReUQPQ8a18jcz8GoAK1O99e0h91oYxIgDV52EfS+IYrzqvJOEQbKQinB+LJwkPbBEp7ZtgAtiNBzm985hNgLfiBaVFWcRYwI3tNBCT1vkw2YI0NEEG0yOF29x+u64XzqyP1CX1pU6sGXEFn3RPdfYibf6bt/Y1BRqBL5l0CrXWsgDw02SqIFta8OvJ7Iwmq40/4acE/Ew6eWO/z2MHkWgqSpwGNjn7MfeKkTi44foZjfNqN9QOFQt6VG2tY+biKZDo0h9DAftae8Q2Xs2UDvsBYOm7YEahVkput6/uKzxljpXlz269qHk6ckvdN9hKLbaTO3/IZPCCPQ5a/a/sWn/1VOJj72sDk+23RNjBf0FL6bJMXZI5aQdtxbF1zij9mWcP9nJ9FHhj53ytuf1NiKl5xU8ZsaoKmCAJcXUz1n2FZvyWlqvgPYiszc7R8Y5dF6QbW2mlKnXzVy6qRMHNeQqGhCEncyT5nPNSdK5QlUwLokAIg"
},
{
"content": [
{
"type": "output_text",
"text": "42\n\nThis is, of course, the iconic answer from Douglas Adams'\'' *The Hitchhiker'\''s Guide to the Galaxy*, where a supercomputer named Deep Thought spends 7.5 million years computing the \"Answer to the Ultimate Question of Life, the Universe, and Everything\"—only to reveal it'\''s 42. (The real challenge, it turns out, is figuring out what the actual *question* was.)\n\nIf you'\''re asking in a more literal or philosophical sense, the universe doesn'\''t have a single tidy answer—it'\''s full of mysteries like quantum mechanics, dark matter, and why cats knock things off tables. But 42? That'\''s as good a starting point as any. What'\''s your take on it?",
"logprobs": null,
"annotations": []
}
],
"id": "msg_c2f68a9b-87cd-4f85-a9e9-b6047213a3ce_us-east-1",
"role": "assistant",
"type": "message",
"status": "completed"
},
{
"role": "user",
"content": "What is the meaning of 42?"
}
],
"include": [
"reasoning.encrypted_content"
]
}'
```
## Retrieving a previous model response
If you have a previous response's ID, you can retrieve the content of the response.
```python customLanguage="pythonXAI"
import os
from xai_sdk import Client
from xai_sdk.chat import user, system
client = Client(
api_key=os.getenv("XAI_API_KEY"),
management_api_key=os.getenv("XAI_MANAGEMENT_API_KEY"),
timeout=3600,
)
response = client.chat.get_stored_completion("")
print(response)
```
```python customLanguage="pythonOpenAISDK"
import os
import httpx
from openai import OpenAI
client = OpenAI(
api_key="",
base_url="https://api.x.ai/v1",
timeout=httpx.Timeout(3600.0), # Override default timeout with longer timeout for reasoning models
)
response = client.responses.retrieve("")
print(response)
```
```javascript customLanguage="javascriptOpenAISDK"
import OpenAI from "openai";
const client = new OpenAI({
apiKey: "",
baseURL: "https://api.x.ai/v1",
timeout: 360000, // Override default timeout with longer timeout for reasoning models
});
const response = await client.responses.retrieve("");
console.log(response);
```
```javascript customLanguage="javascriptAISDK"
// Note: The Vercel AI SDK does not provide a method to retrieve previous responses.
// Use the OpenAI SDK as shown above for this functionality.
import OpenAI from "openai";
const client = new OpenAI({
apiKey: "",
baseURL: "https://api.x.ai/v1",
timeout: 360000,
});
const response = await client.responses.retrieve("");
console.log(response);
```
```bash
curl https://api.x.ai/v1/responses/{response_id} \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-m 3600
```
## Delete a model response
If you no longer want to store the previous model response, you can delete it.
```python customLanguage="pythonXAI"
import os
from xai_sdk import Client
from xai_sdk.chat import user, system
client = Client(
api_key=os.getenv("XAI_API_KEY"),
management_api_key=os.getenv("XAI_MANAGEMENT_API_KEY"),
timeout=3600,
)
response = client.chat.delete_stored_completion("")
print(response)
```
```python customLanguage="pythonOpenAISDK"
import os
import httpx
from openai import OpenAI
client = OpenAI(
api_key="",
base_url="https://api.x.ai/v1",
timeout=httpx.Timeout(3600.0), # Override default timeout with longer timeout for reasoning models
)
response = client.responses.delete("")
print(response)
```
```javascript customLanguage="javascriptOpenAISDK"
import OpenAI from "openai";
const client = new OpenAI({
apiKey: "",
baseURL: "https://api.x.ai/v1",
timeout: 360000, // Override default timeout with longer timeout for reasoning models
});
const response = await client.responses.delete("");
console.log(response);
```
```javascript customLanguage="javascriptAISDK"
// Note: The Vercel AI SDK does not provide a method to delete previous responses.
// Use the OpenAI SDK as shown above for this functionality.
import OpenAI from "openai";
const client = new OpenAI({
apiKey: "",
baseURL: "https://api.x.ai/v1",
timeout: 360000,
});
const response = await client.responses.delete("");
console.log(response);
```
```bash
curl -X DELETE https://api.x.ai/v1/responses/{response_id} \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-m 3600
```
===/developers/model-capabilities/text/reasoning===
#### Model Capabilities
# Reasoning
`presencePenalty`, `frequencyPenalty` and `stop` parameters are not supported by reasoning models.
Adding them in the request would result in an error.
## Key Features
* **Think Before Responding**: Thinks through problems step-by-step before delivering an answer.
* **Math & Quantitative Strength**: Excels at numerical challenges and logic puzzles.
* **Reasoning Trace**: Usage metrics expose `reasoning_tokens`. Some models can also return encrypted reasoning via `include: ["reasoning.encrypted_content"]` (see below).
In Chat Completions, only `grok-3-mini` returns `message.reasoning_content`.
`grok-3`, `grok-4` and `grok-4-fast-reasoning` do not return `reasoning_content`. If supported, you can request [encrypted reasoning content](#encrypted-reasoning-content) via `include: ["reasoning.encrypted_content"]` in the Responses API instead.
### Encrypted Reasoning Content
For `grok-4`, the reasoning content is encrypted by us and can be returned if you pass `include: ["reasoning.encrypted_content"]` to the Responses API. You can send the encrypted content back to provide more context to a previous conversation. See [Adding encrypted thinking content](/developers/model-capabilities/text/generate-text#adding-encrypted-thinking-content) for more details on how to use the content.
## Control how hard the model thinks
`reasoning_effort` is not supported by `grok-3`, `grok-4`, or `grok-4-fast-reasoning`; specifying the `reasoning_effort` parameter with these models
will result in an error response. Only `grok-3-mini` supports `reasoning_effort`.
The `reasoning_effort` parameter controls how much time the model spends thinking before responding. It must be set to one of these values:
* **`low`**: Minimal thinking time, using fewer tokens for quick responses.
* **`high`**: Maximum thinking time, leveraging more tokens for complex problems.
Choosing the right level depends on your task: use `low` for simple queries that should complete quickly, and `high` for harder problems where response latency is less important.
## Usage Example
Here’s a simple example using `grok-3-mini` to multiply 101 by 3.
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import system, user
client = Client(
api_key=os.getenv("XAI_API_KEY"),
timeout=3600, # Override default timeout with longer timeout for reasoning models
)
chat = client.chat.create(
model="grok-3-mini",
reasoning_effort="high",
messages=[system("You are a highly intelligent AI assistant.")],
)
chat.append(user("What is 101\*3?"))
response = chat.sample()
print("Final Response:")
print(response.content)
print("Number of completion tokens:")
print(response.usage.completion_tokens)
print("Number of reasoning tokens:")
print(response.usage.reasoning_tokens)
```
```pythonOpenAISDK
import os
import httpx
from openai import OpenAI
client = OpenAI(
base_url="https://api.x.ai/v1",
api_key=os.getenv("XAI_API_KEY"),
timeout=httpx.Timeout(3600.0), # Override default timeout with longer timeout for reasoning models
)
response = client.responses.create(
model="grok-3-mini",
reasoning={"effort": "high"},
input=[
{"role": "system", "content": "You are a highly intelligent AI assistant."},
{"role": "user", "content": "What is 101*3?"},
],
)
message = next(item for item in response.output if item.type == "message")
text = next(c.text for c in message.content if c.type == "output_text")
print("Final Response:")
print(text)
print("Number of output tokens:")
print(response.usage.output_tokens)
print("Number of reasoning tokens:")
print(response.usage.output_tokens_details.reasoning_tokens)
```
```javascriptOpenAISDK
import OpenAI from "openai";
const client = new OpenAI({
apiKey: "",
baseURL: "https://api.x.ai/v1",
timeout: 360000, // Override default timeout with longer timeout for reasoning models
});
const response = await client.responses.create({
model: "grok-3-mini",
reasoning: { effort: "high" },
input: [
{
"role": "system",
"content": "You are a highly intelligent AI assistant.",
},
{
"role": "user",
"content": "What is 101*3?",
},
],
});
// Find the message in the output array
const message = response.output.find((item) => item.type === "message");
const textContent = message?.content?.find((c) => c.type === "output_text");
console.log("\\nFinal Response:", textContent?.text);
console.log("\\nNumber of output tokens:", response.usage.output_tokens);
console.log("\\nNumber of reasoning tokens:", response.usage.output_tokens_details.reasoning_tokens);
```
```javascriptAISDK
import { xai } from '@ai-sdk/xai';
import { generateText } from 'ai';
const result = await generateText({
model: xai.responses('grok-3-mini'),
system: 'You are a highly intelligent AI assistant.',
prompt: 'What is 101*3?',
});
console.log('Final Response:', result.text);
console.log('Number of completion tokens:', result.totalUsage.completionTokens);
console.log('Number of reasoning tokens:', result.totalUsage.reasoningTokens);
```
```bash
curl https://api.x.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-m 3600 \
-d '{
"input": [
{
"role": "system",
"content": "You are a highly intelligent AI assistant."
},
{
"role": "user",
"content": "What is 101*3?"
}
],
"model": "grok-3-mini",
"reasoning": { "effort": "high" },
"stream": false
}'
```
### Sample Output
```output
Final Response:
The result of 101 multiplied by 3 is 303.
Number of completion tokens:
14
Number of reasoning tokens:
310
```
## Notes on Consumption
When you use a reasoning model, the reasoning tokens are also added to your final consumption amount. The reasoning token consumption will likely increase when you use a higher `reasoning_effort` setting.
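For example, in the sample output above the request used 14 completion tokens plus 310 reasoning tokens, so roughly 324 output tokens count toward consumption in total.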
===/developers/model-capabilities/text/streaming===
#### Model Capabilities
# Streaming
Streaming outputs is **supported by all models with text output capability** (Chat, Image Understanding, etc.). It is **not supported by models with image output capability** (Image Generation).
Streaming outputs uses [Server-Sent Events (SSE)](https://en.wikipedia.org/wiki/Server-sent_events) that let the server send back the delta of content in event streams.
Streaming responses are beneficial for providing real-time feedback, enhancing user interaction by allowing text to be displayed as it's generated.
To enable streaming, you must set `"stream": true` in your request.
When using streaming output with reasoning models, you might want to **manually override the request timeout** to avoid prematurely closing the connection.
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user, system
client = Client(
    api_key=os.getenv('XAI_API_KEY'),
    timeout=3600,  # Override default timeout with longer timeout for reasoning models
)
chat = client.chat.create(model="grok-4-1-fast-reasoning")
chat.append(
    system("You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."),
)
chat.append(
    user("What is the meaning of life, the universe, and everything?")
)
for response, chunk in chat.stream():
    print(chunk.content, end="", flush=True)  # Each chunk's content
# The response object auto-accumulates the chunks
print(response.content)  # The full response
```
```pythonOpenAISDK
import os
import httpx
from openai import OpenAI
XAI_API_KEY = os.getenv("XAI_API_KEY")
client = OpenAI(
    api_key=XAI_API_KEY,
    base_url="https://api.x.ai/v1",
    timeout=httpx.Timeout(3600.0),  # Timeout after 3600s for reasoning models
)
stream = client.chat.completions.create(
    model="grok-4-1-fast-reasoning",
    messages=[
        {"role": "system", "content": "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."},
        {"role": "user", "content": "What is the meaning of life, the universe, and everything?"},
    ],
    stream=True,  # Set streaming here
)
for chunk in stream:
    print(chunk.choices[0].delta.content, end="", flush=True)
```
```javascriptOpenAISDK
import OpenAI from "openai";
const openai = new OpenAI({
  apiKey: "",
  baseURL: "https://api.x.ai/v1",
  timeout: 3600000, // Timeout after 3600s for reasoning models
});
const stream = await openai.chat.completions.create({
  model: "grok-4-1-fast-reasoning",
  messages: [
    { role: "system", content: "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy." },
    {
      role: "user",
      content: "What is the meaning of life, the universe, and everything?",
    },
  ],
  stream: true,
});
for await (const chunk of stream) {
  console.log(chunk.choices[0].delta.content);
}
```
```javascriptAISDK
import { xai } from '@ai-sdk/xai';
import { streamText } from 'ai';
const result = streamText({
  model: xai.responses('grok-4'),
  system:
    "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy.",
  prompt: 'What is the meaning of life, the universe, and everything?',
});
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
```
```bash
curl https://api.x.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -m 3600 \
  -d '{
    "messages": [
      {
        "role": "system",
        "content": "You are Grok, a chatbot inspired by the Hitchhiker'\''s Guide to the Galaxy."
      },
      {
        "role": "user",
        "content": "What is the meaning of life, the universe, and everything?"
      }
    ],
    "model": "grok-4-1-fast-reasoning",
    "stream": true
  }'
```
You'll get event streams like this:
```json
data: {
"id":"","object":"chat.completion.chunk","created":,
"model":"grok-4-1-fast-reasoning",
"choices":[{"index":0,"delta":{"content":"Ah","role":"assistant"}}],
"usage":{"prompt_tokens":41,"completion_tokens":1,"total_tokens":42,
"prompt_tokens_details":{"text_tokens":41,"audio_tokens":0,"image_tokens":0,"cached_tokens":0}},
"system_fingerprint":"fp_xxxxxxxxxx"
}
data: {
"id":"","object":"chat.completion.chunk","created":,
"model":"grok-4-1-fast-reasoning",
"choices":[{"index":0,"delta":{"content":",","role":"assistant"}}],
"usage":{"prompt_tokens":41,"completion_tokens":2,"total_tokens":43,
"prompt_tokens_details":{"text_tokens":41,"audio_tokens":0,"image_tokens":0,"cached_tokens":0}},
"system_fingerprint":"fp_xxxxxxxxxx"
}
data: [DONE]
```
It is recommended that you use a client SDK to parse the event stream.
Example streamed response from the Python/JavaScript examples:
```
Ah, the ultimate question! According to Douglas Adams, the answer is **42**. However, the trick lies in figuring out what the actual question is. If you're looking for a bit more context or a different perspective:
- **Philosophically**: The meaning of life might be to seek purpose, happiness, or to fulfill one's potential.
- **Biologically**: It could be about survival, reproduction, and passing on genes.
- **Existentially**: You create your own meaning through your experiences and choices.
But let's not forget, the journey to find this meaning might just be as important as the answer itself! Keep exploring, questioning, and enjoying the ride through the universe. And remember, don't panic!
```
===/developers/model-capabilities/text/structured-outputs===
#### Model Capabilities
# Structured Outputs
Structured Outputs is a feature that lets the API return responses in a specific, organized format, like JSON or other schemas you define. Instead of getting free-form text, you receive data that's consistent and easy to parse.
Ideal for tasks like document parsing, entity extraction, or report generation, it lets you define schemas using tools like
[Pydantic](https://pydantic.dev/) or [Zod](https://zod.dev/) to enforce data types, constraints, and structure.
When using structured outputs, the LLM's response is **guaranteed** to match your input schema.
## Supported models
Structured outputs is supported by all language models.
## Supported schemas
The following types are supported for structured output:
* string
  * `minLength` and `maxLength` properties are not supported
* number
* integer
* float
* object
* array
  * `minItems` and `maxItems` properties are not supported
  * `maxContains` and `minContains` properties are not supported
* boolean
* enum
* anyOf
`allOf` is not supported at the moment.
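As a quick illustration, a Pydantic model that exercises only the supported types (string, integer, number, array, enum, and `anyOf` via an optional field) might look like this; Pydantic emits `anyOf` in the generated JSON schema for union types:

```python
# Illustrative schema using only types supported by structured outputs.
from enum import Enum
from typing import Optional

from pydantic import BaseModel

class Status(str, Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"

class Profile(BaseModel):
    name: str                # string
    age: int                 # integer
    score: float             # number
    tags: list[str]          # array of strings
    status: Status           # enum
    nickname: Optional[str]  # anyOf: string or null
```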
## Example: Invoice Parsing
A common use case for Structured Outputs is parsing raw documents. For example, invoices contain structured data like vendor details, amounts, and dates, but extracting this data from raw text can be error-prone. Structured Outputs ensure the extracted data matches a predefined schema.
Let's say you want to extract the following data from an invoice:
* Vendor name and address
* Invoice number and date
* Line items (description, quantity, price)
* Total amount and currency
We'll use structured outputs to have Grok generate a strongly-typed JSON for this.
### Step 1: Defining the Schema
You can use [Pydantic](https://pydantic.dev/) or [Zod](https://zod.dev/) to define your schema.
```pythonWithoutSDK
from datetime import date
from enum import Enum
from pydantic import BaseModel, Field
class Currency(str, Enum):
    USD = "USD"
    EUR = "EUR"
    GBP = "GBP"

class LineItem(BaseModel):
    description: str = Field(description="Description of the item or service")
    quantity: int = Field(description="Number of units", ge=1)
    unit_price: float = Field(description="Price per unit", ge=0)

class Address(BaseModel):
    street: str = Field(description="Street address")
    city: str = Field(description="City")
    postal_code: str = Field(description="Postal/ZIP code")
    country: str = Field(description="Country")

class Invoice(BaseModel):
    vendor_name: str = Field(description="Name of the vendor")
    vendor_address: Address = Field(description="Vendor's address")
    invoice_number: str = Field(description="Unique invoice identifier")
    invoice_date: date = Field(description="Date the invoice was issued")
    line_items: list[LineItem] = Field(description="List of purchased items/services")
    total_amount: float = Field(description="Total amount due", ge=0)
    currency: Currency = Field(description="Currency of the invoice")
```
```javascriptWithoutSDK
import { z } from "zod";
const CurrencyEnum = z.enum(["USD", "EUR", "GBP"]);
const LineItemSchema = z.object({
  description: z.string().describe("Description of the item or service"),
  quantity: z.number().int().min(1).describe("Number of units"),
  unit_price: z.number().min(0).describe("Price per unit"),
});
const AddressSchema = z.object({
  street: z.string().describe("Street address"),
  city: z.string().describe("City"),
  postal_code: z.string().describe("Postal/ZIP code"),
  country: z.string().describe("Country"),
});
const InvoiceSchema = z.object({
  vendor_name: z.string().describe("Name of the vendor"),
  vendor_address: AddressSchema.describe("Vendor's address"),
  invoice_number: z.string().describe("Unique invoice identifier"),
  invoice_date: z.string().date().describe("Date the invoice was issued"),
  line_items: z.array(LineItemSchema).describe("List of purchased items/services"),
  total_amount: z.number().min(0).describe("Total amount due"),
  currency: CurrencyEnum.describe("Currency of the invoice"),
});
```
### Step 2: Prepare The Prompts
### System Prompt
The system prompt instructs the model to extract invoice data from text. Since the schema is defined separately, the prompt can focus on the task without explicitly specifying the required fields in the output JSON.
```text
Given a raw invoice, carefully analyze the text and extract the relevant invoice data into JSON format.
```
### Example Invoice Text
```text
Vendor: Acme Corp, 123 Main St, Springfield, IL 62704
Invoice Number: INV-2025-001
Date: 2025-02-10
Items:
- Widget A, 5 units, $10.00 each
- Widget B, 2 units, $15.00 each
Total: $80.00 USD
```
### Step 3: The Final Code
Use the structured outputs feature of the SDK to parse the invoice.
```pythonXAI
import os
from datetime import date
from enum import Enum
from pydantic import BaseModel, Field
from xai_sdk import Client
from xai_sdk.chat import system, user
# Pydantic Schemas
class Currency(str, Enum):
    USD = "USD"
    EUR = "EUR"
    GBP = "GBP"

class LineItem(BaseModel):
    description: str = Field(description="Description of the item or service")
    quantity: int = Field(description="Number of units", ge=1)
    unit_price: float = Field(description="Price per unit", ge=0)

class Address(BaseModel):
    street: str = Field(description="Street address")
    city: str = Field(description="City")
    postal_code: str = Field(description="Postal/ZIP code")
    country: str = Field(description="Country")

class Invoice(BaseModel):
    vendor_name: str = Field(description="Name of the vendor")
    vendor_address: Address = Field(description="Vendor's address")
    invoice_number: str = Field(description="Unique invoice identifier")
    invoice_date: date = Field(description="Date the invoice was issued")
    line_items: list[LineItem] = Field(description="List of purchased items/services")
    total_amount: float = Field(description="Total amount due", ge=0)
    currency: Currency = Field(description="Currency of the invoice")

client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(model="grok-4-1-fast-reasoning")
chat.append(system("Given a raw invoice, carefully analyze the text and extract the invoice data into JSON format."))
chat.append(
    user("""
    Vendor: Acme Corp, 123 Main St, Springfield, IL 62704
    Invoice Number: INV-2025-001
    Date: 2025-02-10
    Items: - Widget A, 5 units, $10.00 each - Widget B, 2 units, $15.00 each
    Total: $80.00 USD
    """)
)

# The parse method returns a tuple of the full response object as well as the parsed pydantic object.
response, invoice = chat.parse(Invoice)
assert isinstance(invoice, Invoice)

# Can access fields of the parsed invoice object directly
print(invoice.vendor_name)
print(invoice.invoice_number)
print(invoice.invoice_date)
print(invoice.line_items)
print(invoice.total_amount)
print(invoice.currency)

# Can also access fields from the raw response object such as the content.
# In this case, the content is the JSON schema representation of the parsed invoice object
print(response.content)
```
```pythonOpenAISDK
from openai import OpenAI
from pydantic import BaseModel, Field
from datetime import date
from enum import Enum
# Pydantic Schemas
class Currency(str, Enum):
    USD = "USD"
    EUR = "EUR"
    GBP = "GBP"

class LineItem(BaseModel):
    description: str = Field(description="Description of the item or service")
    quantity: int = Field(description="Number of units", ge=1)
    unit_price: float = Field(description="Price per unit", ge=0)

class Address(BaseModel):
    street: str = Field(description="Street address")
    city: str = Field(description="City")
    postal_code: str = Field(description="Postal/ZIP code")
    country: str = Field(description="Country")

class Invoice(BaseModel):
    vendor_name: str = Field(description="Name of the vendor")
    vendor_address: Address = Field(description="Vendor's address")
    invoice_number: str = Field(description="Unique invoice identifier")
    invoice_date: date = Field(description="Date the invoice was issued")
    line_items: list[LineItem] = Field(description="List of purchased items/services")
    total_amount: float = Field(description="Total amount due", ge=0)
    currency: Currency = Field(description="Currency of the invoice")

client = OpenAI(
    api_key="",
    base_url="https://api.x.ai/v1",
)
completion = client.beta.chat.completions.parse(
    model="grok-4-1-fast-reasoning",
    messages=[
        {"role": "system", "content": "Given a raw invoice, carefully analyze the text and extract the invoice data into JSON format."},
        {"role": "user", "content": """
            Vendor: Acme Corp, 123 Main St, Springfield, IL 62704
            Invoice Number: INV-2025-001
            Date: 2025-02-10
            Items:
            - Widget A, 5 units, $10.00 each
            - Widget B, 2 units, $15.00 each
            Total: $80.00 USD
        """}
    ],
    response_format=Invoice,
)
invoice = completion.choices[0].message.parsed
print(invoice)
```
```javascriptOpenAISDK
import OpenAI from "openai";
import { zodResponseFormat } from "openai/helpers/zod";
import { z } from "zod";
const CurrencyEnum = z.enum(["USD", "EUR", "GBP"]);
const LineItemSchema = z.object({
  description: z.string().describe("Description of the item or service"),
  quantity: z.number().int().min(1).describe("Number of units"),
  unit_price: z.number().min(0).describe("Price per unit"),
});
const AddressSchema = z.object({
  street: z.string().describe("Street address"),
  city: z.string().describe("City"),
  postal_code: z.string().describe("Postal/ZIP code"),
  country: z.string().describe("Country"),
});
const InvoiceSchema = z.object({
  vendor_name: z.string().describe("Name of the vendor"),
  vendor_address: AddressSchema.describe("Vendor's address"),
  invoice_number: z.string().describe("Unique invoice identifier"),
  invoice_date: z.string().date().describe("Date the invoice was issued"),
  line_items: z.array(LineItemSchema).describe("List of purchased items/services"),
  total_amount: z.number().min(0).describe("Total amount due"),
  currency: CurrencyEnum.describe("Currency of the invoice"),
});
const client = new OpenAI({
  apiKey: "",
  baseURL: "https://api.x.ai/v1",
});
const completion = await client.beta.chat.completions.parse({
  model: "grok-4-1-fast-reasoning",
  messages: [
    { role: "system", content: "Given a raw invoice, carefully analyze the text and extract the invoice data into JSON format." },
    { role: "user", content: `
      Vendor: Acme Corp, 123 Main St, Springfield, IL 62704
      Invoice Number: INV-2025-001
      Date: 2025-02-10
      Items:
      - Widget A, 5 units, $10.00 each
      - Widget B, 2 units, $15.00 each
      Total: $80.00 USD
    ` },
  ],
  response_format: zodResponseFormat(InvoiceSchema, "invoice"),
});
const invoice = completion.choices[0].message.parsed;
console.log(invoice);
```
```javascriptAISDK
import { xai } from '@ai-sdk/xai';
import { generateText, Output } from 'ai';
import { z } from 'zod';
const CurrencyEnum = z.enum(['USD', 'EUR', 'GBP']);
const LineItemSchema = z.object({
  description: z.string().describe('Description of the item or service'),
  quantity: z.number().int().min(1).describe('Number of units'),
  unit_price: z.number().min(0).describe('Price per unit'),
});
const AddressSchema = z.object({
  street: z.string().describe('Street address'),
  city: z.string().describe('City'),
  postal_code: z.string().describe('Postal/ZIP code'),
  country: z.string().describe('Country'),
});
const InvoiceSchema = z.object({
  vendor_name: z.string().describe('Name of the vendor'),
  vendor_address: AddressSchema.describe("Vendor's address"),
  invoice_number: z.string().describe('Unique invoice identifier'),
  invoice_date: z.string().date().describe('Date the invoice was issued'),
  line_items: z
    .array(LineItemSchema)
    .describe('List of purchased items/services'),
  total_amount: z.number().min(0).describe('Total amount due'),
  currency: CurrencyEnum.describe('Currency of the invoice'),
});
const result = await generateText({
  model: xai.responses('grok-4'),
  output: Output.object({ schema: InvoiceSchema }),
  system:
    'Given a raw invoice, carefully analyze the text and extract the invoice data into JSON format.',
  prompt: `
    Vendor: Acme Corp, 123 Main St, Springfield, IL 62704
    Invoice Number: INV-2025-001
    Date: 2025-02-10
    Items:
    - Widget A, 5 units, $10.00 each
    - Widget B, 2 units, $15.00 each
    Total: $80.00 USD
  `,
});
console.log(result._output);
```
### Step 4: Type-safe Output
The output will **always** be type-safe and respect the input schema.
```json
{
"vendor_name": "Acme Corp",
"vendor_address": {
"street": "123 Main St",
"city": "Springfield",
"postal_code": "62704",
"country": "IL"
},
"invoice_number": "INV-2025-001",
"invoice_date": "2025-02-10",
"line_items": [
{ "description": "Widget A", "quantity": 5, "unit_price": 10.0 },
{ "description": "Widget B", "quantity": 2, "unit_price": 15.0 }
],
"total_amount": 80.0,
"currency": "USD"
}
```
## Structured Outputs with Tools
Structured outputs with tools is only available for the Grok 4 family of models (e.g., `grok-4-1-fast`, `grok-4-fast`, `grok-4-1-fast-non-reasoning`, `grok-4-fast-non-reasoning`).
You can combine structured outputs with tool calling to get type-safe responses from tool-augmented queries. This works with both:
* **[Agentic tool calling](/developers/tools/overview)**: Server-side tools like web search, X search, and code execution that the model orchestrates autonomously.
* **[Function calling](/developers/tools/function-calling)**: User-supplied tools where you define custom functions and handle tool execution yourself.
This combination enables workflows where the model can use tools to gather information and return results in a predictable, strongly-typed format.
### Example: Agentic Tools with Structured Output
This example uses web search to find the latest research on a topic and extracts structured data into a schema:
```python customLanguage="pythonWithoutSDK"
from pydantic import BaseModel, Field

class ProofInfo(BaseModel):
    name: str = Field(description="Name of the proof or paper")
    authors: str = Field(description="Authors of the proof")
    year: str = Field(description="Year published")
    summary: str = Field(description="Brief summary of the approach")
```
```javascript customLanguage="javascriptWithoutSDK"
import { z } from "zod";
const ProofInfoSchema = z.object({
  name: z.string().describe("Name of the proof or paper"),
  authors: z.string().describe("Authors of the proof"),
  year: z.string().describe("Year published"),
  summary: z.string().describe("Brief summary of the approach"),
});
```
```python customLanguage="pythonXAI"
import os
from pydantic import BaseModel, Field
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import web_search
# ProofInfo schema defined above
client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
    model="grok-4-1-fast",
    tools=[web_search()],
)
chat.append(user("Find the latest machine-checked proof of the four color theorem."))
response, proof = chat.parse(ProofInfo)
print(f"Name: {proof.name}")
print(f"Authors: {proof.authors}")
print(f"Year: {proof.year}")
print(f"Summary: {proof.summary}")
```
```python customLanguage="pythonOpenAISDK"
import os
from openai import OpenAI
from pydantic import BaseModel, Field
# ProofInfo schema defined above
client = OpenAI(
    api_key=os.getenv("XAI_API_KEY"),
    base_url="https://api.x.ai/v1",
)
response = client.responses.parse(
    model="grok-4-1-fast",
    input="Find the latest machine-checked proof of the four color theorem.",
    tools=[
        {"type": "web_search"}
    ],
    text_format=ProofInfo,
)
proof = response.output_parsed
print(f"Name: {proof.name}")
print(f"Authors: {proof.authors}")
print(f"Year: {proof.year}")
print(f"Summary: {proof.summary}")
```
```javascript customLanguage="javascriptOpenAISDK"
import OpenAI from "openai";
import { zodResponseFormat } from "openai/helpers/zod";
import { z } from "zod";
// ProofInfoSchema defined above
const client = new OpenAI({
  apiKey: process.env.XAI_API_KEY,
  baseURL: "https://api.x.ai/v1",
});
// Convert Zod schema to JSON schema format
const format = zodResponseFormat(ProofInfoSchema, "proof_info");
const response = await client.responses.create({
  model: "grok-4-1-fast",
  input: "Find the latest machine-checked proof of the four color theorem.",
  tools: [
    { type: "web_search" }
  ],
  text: {
    format: {
      type: "json_schema",
      name: format.json_schema.name,
      schema: format.json_schema.schema,
      strict: true,
    },
  },
});
// Find the message in the output array
const message = response.output.find((item) => item.type === "message");
const textContent = message?.content?.find((c) => c.type === "output_text");
if (textContent) {
  const proof = JSON.parse(textContent.text);
  console.log(`Name: ${proof.name}`);
  console.log(`Authors: ${proof.authors}`);
  console.log(`Year: ${proof.year}`);
  console.log(`Summary: ${proof.summary}`);
}
```
### Example: Client-side Tools with Structured Output
This example uses a client-side function tool to compute Collatz sequence steps and returns the result in a structured format:
```python customLanguage="pythonWithoutSDK"
from pydantic import BaseModel, Field

class CollatzResult(BaseModel):
    starting_number: int = Field(description="The input number")
    steps: int = Field(description="Number of steps to reach 1")
```
```javascript customLanguage="javascriptWithoutSDK"
const CollatzResultSchema = {
  type: "object",
  properties: {
    starting_number: { type: "integer", description: "The input number" },
    steps: { type: "integer", description: "Number of steps to reach 1" },
  },
  required: ["starting_number", "steps"],
  additionalProperties: false,
};
```
```python customLanguage="pythonXAI"
import os
import json
from pydantic import BaseModel, Field
from xai_sdk import Client
from xai_sdk.chat import tool, tool_result, user
# CollatzResult schema defined above
def collatz_steps(n: int) -> int:
    """Returns the number of steps for n to reach 1 in the Collatz sequence."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

collatz_tool = tool(
    name="collatz_steps",
    description="Compute the number of steps for a number to reach 1 in the Collatz sequence",
    parameters={
        "type": "object",
        "properties": {
            "n": {"type": "integer", "description": "The starting number"},
        },
        "required": ["n"],
    },
)

client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
    model="grok-4-1-fast-non-reasoning",
    tools=[collatz_tool],
)
chat.append(user("Use the collatz_steps tool to find how many steps it takes for 20250709 to reach 1."))

# Handle tool calls until we get a final response
while True:
    response = chat.sample()
    if not response.tool_calls:
        break
    chat.append(response)
    for tc in response.tool_calls:
        args = json.loads(tc.function.arguments)
        result = collatz_steps(args["n"])
        chat.append(tool_result(str(result)))

# Parse the final response into structured output
response, result = chat.parse(CollatzResult)
print(f"Starting number: {result.starting_number}")
print(f"Steps to reach 1: {result.steps}")
```
```python customLanguage="pythonOpenAISDK"
import os
import json
from openai import OpenAI
from pydantic import BaseModel, Field
# CollatzResult schema defined above
def collatz_steps(n: int) -> int:
    """Returns the number of steps for n to reach 1 in the Collatz sequence."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

client = OpenAI(
    api_key=os.getenv("XAI_API_KEY"),
    base_url="https://api.x.ai/v1",
)
tools = [
    {
        "type": "function",
        "function": {
            "name": "collatz_steps",
            "description": "Compute the number of steps for a number to reach 1 in the Collatz sequence",
            "parameters": {
                "type": "object",
                "properties": {
                    "n": {"type": "integer", "description": "The starting number"},
                },
                "required": ["n"],
            },
        },
    }
]
messages = [
    {"role": "user", "content": "Use the collatz_steps tool to find how many steps it takes for 20250709 to reach 1."}
]

# Handle tool calls until we get a final response
while True:
    completion = client.chat.completions.create(
        model="grok-4-1-fast-non-reasoning",
        messages=messages,
        tools=tools,
    )
    message = completion.choices[0].message
    if not message.tool_calls:
        break
    messages.append(message)
    for tc in message.tool_calls:
        args = json.loads(tc.function.arguments)
        result = collatz_steps(args["n"])
        messages.append({
            "role": "tool",
            "tool_call_id": tc.id,
            "content": str(result),
        })

# Final call with structured output
completion = client.beta.chat.completions.parse(
    model="grok-4-1-fast-non-reasoning",
    messages=messages,
    response_format=CollatzResult,
)
result = completion.choices[0].message.parsed
print(f"Starting number: {result.starting_number}")
print(f"Steps to reach 1: {result.steps}")
```
```javascript customLanguage="javascriptOpenAISDK"
import OpenAI from "openai";
// CollatzResultSchema defined above
function collatzSteps(n) {
  let steps = 0;
  while (n !== 1) {
    n = n % 2 === 0 ? n / 2 : 3 * n + 1;
    steps++;
  }
  return steps;
}
const client = new OpenAI({
  apiKey: process.env.XAI_API_KEY,
  baseURL: "https://api.x.ai/v1",
});
const tools = [
  {
    type: "function",
    function: {
      name: "collatz_steps",
      description: "Compute the number of steps for a number to reach 1 in the Collatz sequence",
      parameters: {
        type: "object",
        properties: {
          n: { type: "integer", description: "The starting number" },
        },
        required: ["n"],
      },
    },
  },
];
let messages = [
  { role: "user", content: "Use the collatz_steps tool to find how many steps it takes for 20250709 to reach 1." }
];
// Handle tool calls until we get a final response
while (true) {
  const completion = await client.chat.completions.create({
    model: "grok-4-1-fast-non-reasoning",
    messages,
    tools,
  });
  const message = completion.choices[0].message;
  if (!message.tool_calls) {
    break;
  }
  messages.push(message);
  for (const tc of message.tool_calls) {
    const args = JSON.parse(tc.function.arguments);
    const result = collatzSteps(args.n);
    messages.push({
      role: "tool",
      tool_call_id: tc.id,
      content: String(result),
    });
  }
}
// Final call with structured output
const completion = await client.chat.completions.create({
  model: "grok-4-1-fast-non-reasoning",
  messages,
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "collatz_result",
      schema: CollatzResultSchema,
      strict: true,
    },
  },
});
const result = JSON.parse(completion.choices[0].message.content);
console.log("Starting number:", result.starting_number);
console.log("Steps to reach 1:", result.steps);
```
## Alternative: Using `response_format` with `sample()` or `stream()`
When using the xAI Python SDK, there's an alternative way to retrieve structured outputs. Instead of using the `parse()` method, you can pass your Pydantic model directly to the `response_format` parameter when creating a chat, and then use `sample()` or `stream()` to get the response.
### How It Works
When you pass a Pydantic model to `response_format`, the SDK automatically:
1. Converts your Pydantic model to a JSON schema
2. Constrains the model's output to conform to that schema
3. Returns the response as a JSON string conforming to the Pydantic model in `response.content`
You then manually parse the JSON string into your Pydantic model instance.
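For instance, here is a minimal sketch of that manual step, assuming the `Invoice` model and the `response` object from the full example below:

```python
# Minimal sketch: manually validate response.content against the Invoice model.
from pydantic import ValidationError

try:
    invoice = Invoice.model_validate_json(response.content)
except ValidationError as e:
    # With structured outputs the content should always conform,
    # but a guard costs little if you post-process the JSON first.
    print(f"Response did not match the schema: {e}")
```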
### Key Differences
| Approach | Method | Returns | Parsing |
|----------|--------|---------|---------|
| **Using `parse()`** | `chat.parse(Model)` | Tuple of `(Response, Model)` | Automatic - SDK parses for you |
| **Using `response_format`** | `chat.sample()` or `chat.stream()` | `Response` with JSON string | Manual - You parse `response.content` |
### When to Use Each Approach
* **Use `parse()`** when you want the simplest, most convenient experience with automatic parsing
* **Use `response_format` + `sample()` or `stream()`** when you:
* Want more control over the parsing process
* Need to handle the raw JSON string before parsing
* Want to use streaming with structured outputs
* Are integrating with existing code that expects to work with `sample()` or `stream()`
### Example Using `response_format`
```pythonXAI
import os
from datetime import date
from enum import Enum
from pydantic import BaseModel, Field
from xai_sdk import Client
from xai_sdk.chat import system, user
# Pydantic Schemas
class Currency(str, Enum):
    USD = "USD"
    EUR = "EUR"
    GBP = "GBP"

class LineItem(BaseModel):
    description: str = Field(description="Description of the item or service")
    quantity: int = Field(description="Number of units", ge=1)
    unit_price: float = Field(description="Price per unit", ge=0)

class Address(BaseModel):
    street: str = Field(description="Street address")
    city: str = Field(description="City")
    postal_code: str = Field(description="Postal/ZIP code")
    country: str = Field(description="Country")

class Invoice(BaseModel):
    vendor_name: str = Field(description="Name of the vendor")
    vendor_address: Address = Field(description="Vendor's address")
    invoice_number: str = Field(description="Unique invoice identifier")
    invoice_date: date = Field(description="Date the invoice was issued")
    line_items: list[LineItem] = Field(description="List of purchased items/services")
    total_amount: float = Field(description="Total amount due", ge=0)
    currency: Currency = Field(description="Currency of the invoice")

client = Client(api_key=os.getenv("XAI_API_KEY"))

# Pass the Pydantic model to response_format instead of using parse()
chat = client.chat.create(
    model="grok-4-1-fast-reasoning",
    response_format=Invoice,  # Pass the Pydantic model here
)
chat.append(system("Given a raw invoice, carefully analyze the text and extract the invoice data into JSON format."))
chat.append(
    user("""
    Vendor: Acme Corp, 123 Main St, Springfield, IL 62704
    Invoice Number: INV-2025-001
    Date: 2025-02-10
    Items: - Widget A, 5 units, $10.00 each - Widget B, 2 units, $15.00 each
    Total: $80.00 USD
    """)
)

# Use sample() instead of parse() - returns Response object
response = chat.sample()

# The response.content is a valid JSON string conforming to your schema
print(response.content)
# Output: {"vendor_name": "Acme Corp", "vendor_address": {...}, ...}

# Manually parse the JSON string into your Pydantic model
invoice = Invoice.model_validate_json(response.content)
assert isinstance(invoice, Invoice)

# Access fields of the parsed invoice object
print(invoice.vendor_name)
print(invoice.invoice_number)
print(invoice.total_amount)
```
### Streaming with Structured Outputs
You can also use `stream()` with `response_format` to get streaming structured output. The chunks will progressively build up the JSON string:
```pythonXAI
import os
from pydantic import BaseModel, Field
from xai_sdk import Client
from xai_sdk.chat import system, user
class Summary(BaseModel):
    title: str = Field(description="A brief title")
    key_points: list[str] = Field(description="Main points from the text")
    sentiment: str = Field(description="Overall sentiment: positive, negative, or neutral")

client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
    model="grok-4-1-fast-reasoning",
    response_format=Summary,  # Pass the Pydantic model here
)
chat.append(system("Analyze the following text and provide a structured summary."))
chat.append(user("The new product launch exceeded expectations with record sales..."))

# Stream the response - chunks contain partial JSON
for response, chunk in chat.stream():
    print(chunk.content, end="", flush=True)

# Parse the complete JSON string into your model
summary = Summary.model_validate_json(response.content)
print(f"Title: {summary.title}")
print(f"Sentiment: {summary.sentiment}")
```
===/developers/model-capabilities/video/generation===
#### Model Capabilities
# Video Generation
Generate videos from text prompts, animate still images, or edit existing videos with natural language. The API supports configurable duration, aspect ratio, and resolution for generated videos — with the SDK handling the asynchronous polling automatically.
## Quick Start
Generate a video with a single API call:
```python customLanguage="pythonXAI"
import xai_sdk
client = xai_sdk.Client()
response = client.video.generate(
    prompt="A glowing crystal-powered rocket launching from the red dunes of Mars, ancient alien ruins lighting up in the background as it soars into a sky full of unfamiliar constellations",
    model="grok-imagine-video",
    duration=10,
    aspect_ratio="16:9",
    resolution="720p",
)
print(response.url)
```
```bash
curl -X POST https://api.x.ai/v1/videos/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -d '{
    "model": "grok-imagine-video",
    "prompt": "A glowing crystal-powered rocket launching from the red dunes of Mars, ancient alien ruins lighting up in the background as it soars into a sky full of unfamiliar constellations",
    "duration": 10,
    "aspect_ratio": "16:9",
    "resolution": "720p"
  }'
```
Video generation is an **asynchronous process** that typically takes up to several minutes to complete. The exact time varies based on:
* **Prompt complexity** — More detailed scenes require additional processing
* **Duration** — Longer videos take more time to generate
* **Resolution** — Higher resolutions (720p vs 480p) increase processing time
* **Video editing** — Editing existing videos adds overhead compared to image-to-video or text-to-video
### How it works
Under the hood, video generation is a two-step process:
1. **Start** — Submit a generation request and receive a `request_id`
2. **Poll** — Repeatedly check the status using the `request_id` until the video is ready
The xAI SDK's `generate()` method abstracts this entirely — it submits your request, polls for the result, and returns the completed video response. You don't need to manage request IDs or implement polling logic. For long-running generations, you can [customize the polling behavior](#customize-polling-behavior) with timeout and interval parameters, or [handle polling manually](#handle-polling-manually) for full control over the generation lifecycle.
**REST API users** must implement this two-step flow manually:
**Step 1: Start the generation request**
```bash
curl -X POST https://api.x.ai/v1/videos/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -d '{
    "model": "grok-imagine-video",
    "prompt": "A glowing crystal-powered rocket launching from Mars"
  }'
```
Response:
```json
{"request_id": "d97415a1-5796-b7ec-379f-4e6819e08fdf"}
```
**Step 2: Poll for the result**
Use the `request_id` to check the status. Keep polling every few seconds until the video is ready:
```bash
curl -X GET "https://api.x.ai/v1/videos/{request_id}" \
  -H "Authorization: Bearer $XAI_API_KEY"
```
The response includes a `status` field with one of these values:
| Status | Description |
|--------|-------------|
| `pending` | Video is still being generated |
| `done` | Video is ready |
| `expired` | Request has expired |
Response (when complete):
```json
{
  "status": "done",
  "video": {
    "url": "https://vidgen.x.ai/.../video.mp4",
    "duration": 8,
    "respect_moderation": true
  },
  "model": "grok-imagine-video"
}
```
Videos are returned as temporary URLs — download or process them promptly.
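Since the URLs expire, here is a minimal sketch of persisting the result to disk right away (assuming the `requests` library and a `response` returned by `client.video.generate()`):

```python
# Minimal sketch: download the temporary video URL to a local file.
import requests

video_bytes = requests.get(response.url, timeout=60).content
with open("generated_video.mp4", "wb") as f:
    f.write(video_bytes)
```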
## Generate Videos from Images
Transform a still image into a video by providing a source image along with your prompt. The model animates the image content based on your instructions.
```python customLanguage="pythonXAI"
import base64
import xai_sdk
client = xai_sdk.Client()

# Load image from file and encode as base64
with open("landscape.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

response = client.video.generate(
    prompt="Animate the clouds drifting across the sky and add gentle swaying to the trees",
    model="grok-imagine-video",
    image_url=f"data:image/jpeg;base64,{image_data}",
)
print(response.url)
```
```bash
# Using a public URL as the source image
curl -X POST https://api.x.ai/v1/videos/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -d '{
    "model": "grok-imagine-video",
    "prompt": "Animate the clouds drifting across the sky and add gentle swaying to the trees",
    "image": {"url": "https://example.com/landscape.jpg"}
  }'
```
You can provide the source image as:
* A **public URL** pointing to an image
* A **base64-encoded data URI** (e.g., `data:image/jpeg;base64,...`)
## Edit Existing Videos
Edit an existing video by providing a source video along with your prompt. The model understands the video content and applies your requested changes.
`grok-imagine-video` delivers high-fidelity edits with strong scene preservation, modifying only what you ask for while keeping the rest of the video intact.
## Concurrent Requests
When you need to generate multiple videos or apply several edits to the same source video, use `AsyncClient` with `asyncio.gather` to fire requests concurrently. Since video generation and editing are long-running processes, running requests in parallel is significantly faster than issuing them sequentially.
The example below applies three different edits to the same source video concurrently: adding a necklace, changing the outfit color, and adding a hat.
```python customLanguage="pythonXAI"
import asyncio
import xai_sdk
async def edit_concurrently():
    client = xai_sdk.AsyncClient()
    source_video = "https://example.com/balcony-scene.mp4"

    # Each request applies a different edit to the same video
    prompts = [
        "Give the woman a silver necklace",
        "Change the color of the woman's outfit to red",
        "Give the woman a wide-brimmed black hat",
    ]

    # Fire all edit requests concurrently
    tasks = [
        client.video.generate(
            prompt=prompt,
            model="grok-imagine-video",
            video_url=source_video,
        )
        for prompt in prompts
    ]
    results = await asyncio.gather(*tasks)
    for prompt, result in zip(prompts, results):
        print(f"{prompt}: {result.url}")

asyncio.run(edit_concurrently())
```
## Configuration
The video generation API lets you control the output format of your generated videos. You can specify the duration, aspect ratio, and resolution to match your specific use case.
### Duration
Control video length with the `duration` parameter. The allowed range is 1–15 seconds.
Video editing does not support custom `duration`. The edited video retains the duration of the original, which is capped at 8.7 seconds.
### Aspect Ratio
| Ratio | Use case |
|-------|----------|
| `1:1` | Social media, thumbnails |
| `16:9` / `9:16` | Widescreen, mobile, stories (default: `16:9`) |
| `4:3` / `3:4` | Presentations, portraits |
| `3:2` / `2:3` | Photography |
For image-to-video generation, the output defaults to the input image's aspect ratio. If you specify the `aspect_ratio` parameter, it will override this and stretch the image to the desired aspect ratio.
Video editing does not support custom `aspect_ratio` — the output matches the input video's aspect ratio.
### Resolution
| Resolution | Description |
|------------|-------------|
| `720p` | HD quality |
| `480p` | Standard definition, faster processing (default) |
Video editing does not support custom `resolution`. The output resolution matches the input video's resolution, capped at 720p (e.g., a 1080p input will be downsized to 720p).
### Example
```python customLanguage="pythonXAI"
import xai_sdk
client = xai_sdk.Client()
response = client.video.generate(
    prompt="Timelapse of a flower blooming in a sunlit garden",
    model="grok-imagine-video",
    duration=10,
    aspect_ratio="16:9",
    resolution="720p",
)
print(f"Video URL: {response.url}")
print(f"Duration: {response.duration}s")
```
```bash
curl -X POST https://api.x.ai/v1/videos/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -d '{
    "model": "grok-imagine-video",
    "prompt": "Timelapse of a flower blooming in a sunlit garden",
    "duration": 10,
    "aspect_ratio": "16:9",
    "resolution": "720p"
  }'
```
## Customize Polling Behavior
When using the SDK's `generate()` method, you can control how long to wait and how frequently to check for results using the `timeout` and `interval` parameters:
| Parameter | Description | Default |
|-----------|-------------|---------|
| `timeout` | Maximum time to wait for the video to complete | 10 minutes |
| `interval` | Time between status checks | 100 milliseconds |
```python customLanguage="pythonXAI"
from datetime import timedelta
import xai_sdk
client = xai_sdk.Client()
response = client.video.generate(
    prompt="Epic cinematic drone shot flying through mountain peaks",
    model="grok-imagine-video",
    duration=15,
    timeout=timedelta(minutes=15),  # Wait up to 15 minutes
    interval=timedelta(seconds=5),  # Check every 5 seconds
)
print(response.url)
```
If the video isn't ready within the timeout period, a `TimeoutError` is raised. For even finer control, use the [manual polling approach](#handle-polling-manually) with `start()` and `get()`.
## Handle Polling Manually
For fine-grained control over the generation lifecycle, use `start()` to initiate generation and `get()` to check status.
The `get()` method returns a response with a `status` field. Import the status enum from the SDK:
```python customLanguage="pythonXAI"
import time
import xai_sdk
from xai_sdk.proto import deferred_pb2
client = xai_sdk.Client()
# Start the generation request
start_response = client.video.start(
    prompt="A cat lounging in a sunbeam, tail gently swishing",
    model="grok-imagine-video",
    duration=5,
)
print(f"Request ID: {start_response.request_id}")

# Poll for results
while True:
    result = client.video.get(start_response.request_id)
    if result.status == deferred_pb2.DeferredStatus.DONE:
        print(f"Video URL: {result.response.video.url}")
        break
    elif result.status == deferred_pb2.DeferredStatus.EXPIRED:
        print("Request expired")
        break
    elif result.status == deferred_pb2.DeferredStatus.PENDING:
        print("Still processing...")
        time.sleep(5)
```
```bash
# Step 1: Start generation
curl -X POST https://api.x.ai/v1/videos/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -d '{
    "model": "grok-imagine-video",
    "prompt": "A cat lounging in a sunbeam, tail gently swishing",
    "duration": 5
  }'
# Response: {"request_id": "{request_id}"}

# Step 2: Poll for results
curl -X GET https://api.x.ai/v1/videos/{request_id} \
  -H "Authorization: Bearer $XAI_API_KEY"
```
The available status values are:
| Proto Value | Description |
|-------------|-------------|
| `deferred_pb2.DeferredStatus.PENDING` | Video is still being generated |
| `deferred_pb2.DeferredStatus.DONE` | Video is ready |
| `deferred_pb2.DeferredStatus.EXPIRED` | Request has expired |
## Response Details
The xAI SDK exposes additional metadata on the response object beyond the video URL.
**Moderation** — Check whether the generated video passed content moderation:
```python customLanguage="pythonXAI"
if response.respect_moderation:
    print(response.url)
else:
    print("Video filtered by moderation")
```
**Duration** — Get the actual duration of the generated video:
```python customLanguage="pythonXAI"
print(f"Duration: {response.duration} seconds")
```
**Model** — Get the actual model used (resolving any aliases):
```python customLanguage="pythonXAI"
print(f"Model: {response.model}")
```
## Pricing
Video generation uses per-second pricing. Longer videos cost more, and both duration and resolution affect the total cost.
For full pricing details on the `grok-imagine-video` model, see the [model page](/developers/models).
## Limitations
* **Maximum duration:** 15 seconds for generation, 8.7 seconds for editing input videos
* **URL expiration:** Generated URLs are ephemeral and should not be relied upon for long-term storage
* **Resolutions:** 480p or 720p
* **Content moderation:** Videos are subject to content policy review
## Related
* [Models](/developers/models) — Available video models and pricing
* [Image Generation](/developers/model-capabilities/images/generation) — Generate still images from text
* [API Reference](/developers/rest-api-reference) — Full endpoint documentation
===/developers/models===
#### Key Information
# Models and Pricing
An overview of our models' capabilities and their associated pricing.
## Model Pricing
| Model | Modalities | Capabilities | Context | Rate Limits | Pricing [in (cached in) / out] |
| --- | --- | --- | --- | --- | --- |
| grok-4-1-fast-reasoning | text, image → text | functions, structured, reasoning | 2,000,000 | 4M TPM, 480 RPM | $0.20 ($0.05) / $0.50 |
| grok-4-1-fast-non-reasoning | text, image → text | functions, structured | 2,000,000 | 4M TPM, 480 RPM | $0.20 ($0.05) / $0.50 |
| grok-4-fast-reasoning | text, image → text | functions, structured, reasoning | 2,000,000 | 4M TPM, 480 RPM | $0.20 ($0.05) / $0.50 |
| grok-4-fast-non-reasoning | text, image → text | functions, structured | 2,000,000 | 4M TPM, 480 RPM | $0.20 ($0.05) / $0.50 |
| grok-code-fast-1 | text → text | functions, structured, reasoning | 256,000 | 2M TPM, 480 RPM | $0.20 ($0.02) / $1.50 |
| grok-4-0709 | text, image → text | functions, structured, reasoning | 256,000 | 2M TPM, 480 RPM | $3.00 ($0.75) / $15.00 |
| grok-3 | text → text | functions, structured | 131,072 | 600 RPM | $3.00 ($0.75) / $15.00 |
| grok-3-mini | text → text | functions, structured, reasoning | 131,072 | 480 RPM | $0.30 ($0.07) / $0.50 |
| grok-2-vision-1212 | text, image → text | functions, structured | 32,768 | 600 RPM | $2.00 ($0.00) / $10.00 |
| grok-2-image-1212 | text → image | - | - | 300 RPM | $0.07/image |
| grok-imagine-image | text, image → image | - | - | 300 RPM | $0.02/image |
| grok-imagine-image-pro | text, image → image | - | - | 30 RPM | $0.07/image |
| grok-imagine-video | text, image, video → video | - | - | 60 RPM | $0.050/sec |
## Tools Pricing
Requests that use xAI-provided [server-side tools](/developers/tools/overview) are priced on two components: **token usage** and **server-side tool invocations**. Since the agent autonomously decides how many tools to call, costs scale with query complexity.
### Token Costs
All standard token types are billed at the [rate](#model-pricing) for the model used in the request:
* **Input tokens**: Your query and conversation history
* **Reasoning tokens**: Agent's internal thinking and planning
* **Completion tokens**: The final response
* **Image tokens**: Visual content analysis (when applicable)
* **Cached prompt tokens**: Prompt tokens that were served from cache rather than recomputed
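As a rough illustration, the token portion of a request's cost can be estimated from these counts. Below is a sketch using the `grok-4-1-fast-reasoning` rates from the table above, and assuming reasoning tokens bill at the output rate:

```python
# Sketch: estimate per-request token cost from usage counts.
# Rates below are grok-4-1-fast-reasoning rates (USD per million tokens)
# from the pricing table above; substitute your model's rates.
INPUT_RATE = 0.20 / 1_000_000
CACHED_RATE = 0.05 / 1_000_000
OUTPUT_RATE = 0.50 / 1_000_000  # applied to completion + reasoning tokens

def estimate_cost(input_tokens, cached_tokens, completion_tokens, reasoning_tokens):
    fresh_input = input_tokens - cached_tokens
    return (
        fresh_input * INPUT_RATE
        + cached_tokens * CACHED_RATE
        + (completion_tokens + reasoning_tokens) * OUTPUT_RATE
    )

# e.g. 10k prompt tokens (4k of them cached), 500 completion, 1,200 reasoning
print(f"${estimate_cost(10_000, 4_000, 500, 1_200):.6f}")
```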
### Tool Invocation Costs
| Tool | Description | Cost / 1k Calls | Tool Name |
| --- | --- | --- | --- |
| Web Search | Search the internet and browse web pages | $5 | `web_search` |
| X Search | Search X posts, user profiles, and threads | $5 | `x_search` |
| Code Execution | Run Python code in a sandboxed environment | $5 | `code_execution`, `code_interpreter*` |
| File Attachments | Search through files attached to messages | $10 | `attachment_search` |
| Collections Search | Query your uploaded document collections (RAG) | $2.50 | `collections_search`, `file_search*` |
| Image Understanding | Analyze images found during Web Search and X Search\*\* | Token-based | `view_image` |
| X Video Understanding | Analyze videos found during X Search\*\* | Token-based | `view_x_video` |
| Remote MCP Tools | Connect and use custom MCP tool servers | Token-based | *(set by MCP server)* |
\* All tool names work in the Responses API. In the gRPC API (Python xAI SDK), `code_interpreter` and `file_search` are not supported.
\*\* Only applies to images and videos found by search tools — not to images passed directly in messages.
For the view image and view x video tools, you will not be charged for the tool invocation itself but will be charged for the image tokens used to process the image or video.
For Remote MCP tools, you will not be charged for the tool invocation but will be charged for any tokens used.
For more information on using Tools, please visit [our guide on Tools](/developers/tools/overview).
## Batch API Pricing
The [Batch API](/developers/advanced-api-usage/batch-api) lets you process large volumes of requests asynchronously at **50% of standard pricing** — effectively cutting your token costs in half. Batch requests are queued and processed in the background, with most completing within 24 hours.
| | Real-time API | Batch API |
|---|---|---|
| **Token pricing** | Standard rates | **50% off** standard rates |
| **Response time** | Immediate (seconds) | Typically within 24 hours |
| **Rate limits** | Per-minute limits apply | Requests don't count towards rate limits |
The 50% discount applies to all token types — input tokens, output tokens, cached tokens, and reasoning tokens. To see batch pricing for a specific model, visit the model's detail page and toggle **"Show batch API pricing"**.
Batch API pricing applies to text and language models only. Image and video generation models are not supported in the Batch API. See [Batch API documentation](/developers/advanced-api-usage/batch-api) for full details, limitations, and usage examples.
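As a worked example at the `grok-3` rates in the table above: 1M input tokens plus 1M output tokens cost $3.00 + $15.00 = $18.00 at real-time rates, but $1.50 + $7.50 = $9.00 through the Batch API.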
## Voice Agent API Pricing
The [Voice Agent API](/developers/model-capabilities/audio/voice-agent) is a real-time voice conversation offering, billed at a straightforward flat rate of $0.05 per minute of connection time.
| | Details |
|---|---|
| **Pricing** | $0.05 / minute ($3.00 / hour) |
| **Rate Limit** | 10 concurrent sessions per team |
| **Capabilities** | Function calling (web search, X search, collections, custom functions) |
When using the Voice Agent API with tools such as function calling, web search, X search, or collections, you will be charged for the tool invocations in addition to the per-minute voice session cost. See [Tool Invocation Costs](#tool-invocation-costs) above for tool pricing details.
For more details on how to get started with the Voice Agent API, see the [Voice Agent API documentation](/developers/model-capabilities/audio/voice-agent).
## Usage Guidelines Violation Fee
When our system deems your request to be in violation of our usage guidelines, we will still charge for the generation of the request.
For violations that are caught before generation in the Responses API, we will charge a $0.05 usage guideline violation fee per request.
## Additional Information Regarding Models
* **No access to realtime events without search tools enabled**
  * Grok has no knowledge of current events or data beyond what was present in its training data.
  * To incorporate realtime data with your request, enable server-side search tools (Web Search / X Search). See [Web Search](/developers/tools/web-search) and [X Search](/developers/tools/x-search).
* **Chat models**
  * No role order limitation: You can mix `system`, `user`, or `assistant` roles in any sequence for your conversation context.
* **Image input models**
  * Maximum image size: `20MiB`
  * Maximum number of images: No limit
  * Supported image file types: `jpg/jpeg` or `png`
  * Any image/text input order is accepted (e.g. text prompt can precede image prompt)
The knowledge cut-off date of Grok 3 and Grok 4 is November 2024.
## Model Aliases
Some models have aliases to help users automatically migrate to the next version of the same model. In general:
* `<model>` is aliased to the latest stable version.
* `<model>-latest` is aliased to the latest version. This is suitable for users who want to access the latest features.
* `<model>-<version>` (e.g. `grok-4-0709`) refers directly to a specific model release. This will not be updated and is for workflows that demand consistency.
For most users, the aliased `<model>` or `<model>-latest` is recommended, as you would receive the latest features automatically.
## Billing and Availability
Your model access might vary depending on various factors such as geographical location, account limitations, etc.
For how the **bills are charged**, visit [Manage Billing](/console/billing) for more information.
For the most up-to-date information on **your team's model availability**, visit [Models Page](https://console.x.ai/team/default/models) on xAI Console.
## Model Input and Output
Each model can have one or multiple input and output capabilities.
The input capabilities refer to which type(s) of prompt the model can accept in the request message body.
The output capabilities refer to which type(s) of completion the model will generate in the response message body.
This is a prompt example for models with `text` input capability:
```json
[
  {
    "role": "system",
    "content": "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."
  },
  {
    "role": "user",
    "content": "What is the meaning of life, the universe, and everything?"
  }
]
```
This is a prompt example for models with `text` and `image` input capabilities:
```json
[
  {
    "role": "user",
    "content": [
      {
        "type": "image_url",
        "image_url": {
          "url": "data:image/jpeg;base64,",
          "detail": "high"
        }
      },
      {
        "type": "text",
        "text": "Describe what's in this image."
      }
    ]
  }
]
```
This is a prompt example for models with `text` input and `image` output capabilities:
```json
// The entire request body
{
  "model": "grok-2-image-1212",
  "prompt": "A cat in a tree",
  "n": 4
}
```
## Context Window
The context window determines the maximum number of tokens the model can accept in the prompt.
For more information on how tokens are counted, visit [Consumption and Rate Limits](/developers/rate-limits).
If you send the entire conversation history in the prompt, as in chat assistant use cases, the combined token count of all messages must be no greater than the context window.
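A minimal sketch of keeping a conversation under the context window by dropping the oldest turns is shown below. It uses a rough four-characters-per-token heuristic rather than an exact tokenizer, so treat the budget conservatively:

```python
# Sketch: trim the oldest non-system turns to stay under a token budget.
# ~4 characters per token is a rough heuristic, not an exact count.
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def trim_history(messages: list[dict], max_tokens: int) -> list[dict]:
    trimmed = list(messages)
    total = sum(estimate_tokens(m["content"]) for m in trimmed)
    # Keep the system message at index 0; drop the oldest turns after it.
    while total > max_tokens and len(trimmed) > 2:
        removed = trimmed.pop(1)
        total -= estimate_tokens(removed["content"])
    return trimmed
```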
## Cached prompt tokens
Running the same prompt, or the same prompt prefix, multiple times? Cached prompt tokens let you incur less cost on repeated prompts: prompt tokens served from the cache are billed at the lower cached rate instead of being recomputed at the full input rate.
Caching is enabled automatically for all requests, with no action required on your part. You can view the cached prompt token consumption in [the `"usage"` object](/developers/rate-limits#checking-token-consumption).
For details on the pricing, please refer to the pricing table above, or on [xAI Console](https://console.x.ai).
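For example, here is a sketch of inspecting the cached token count on a response via the OpenAI SDK; the field mirrors the `prompt_tokens_details` object shown in the streaming example earlier:

```python
# Sketch: check how many prompt tokens were served from cache.
import os
from openai import OpenAI

client = OpenAI(api_key=os.getenv("XAI_API_KEY"), base_url="https://api.x.ai/v1")
completion = client.chat.completions.create(
    model="grok-4-1-fast-reasoning",
    messages=[{"role": "user", "content": "Hello!"}],
)
details = completion.usage.prompt_tokens_details
print(f"Prompt tokens: {completion.usage.prompt_tokens}")
print(f"Served from cache: {details.cached_tokens if details else 0}")
```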
===/developers/quickstart===
#### Getting Started
# Getting Started
Welcome! In this guide, we'll walk you through the basics of using the xAI API.
## Step 1: Create an xAI Account
First, you'll need to create an xAI account to access xAI API. Sign up for an account [here](https://accounts.x.ai/sign-up?redirect=cloud-console).
Once you've created an account, you'll need to load it with credits to start using the API.
## Step 2: Generate an API Key
Create an API key via the [API Keys Page](https://console.x.ai/team/default/api-keys) in the xAI API Console.
After generating an API key, we need to save it somewhere safe! We recommend you export it as an environment variable in your terminal or save it to a `.env` file.
```bash
export XAI_API_KEY="your_api_key"
```
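If you go the `.env` route, a minimal sketch using the third-party `python-dotenv` package (an assumption; any dotenv loader works) looks like this:
```python
# .env file contents:
# XAI_API_KEY=your_api_key

# In your terminal, first run:
# pip install python-dotenv
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current directory into the environment
api_key = os.getenv("XAI_API_KEY")
```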
## Step 3: Make your first request
With your xAI API key exported as an environment variable, you're ready to make your first API request.
Let's test out the API using `curl`. Paste the following directly into your terminal.
```bash
curl https://api.x.ai/v1/responses \\
-H "Content-Type: application/json" \\
-H "Authorization: Bearer $XAI_API_KEY" \\
-m 3600 \\
-d '{
"input": [
{
"role": "system",
"content": "You are Grok, a highly intelligent, helpful AI assistant."
},
{
"role": "user",
"content": "What is the meaning of life, the universe, and everything?"
}
],
"model": "grok-4-1-fast-reasoning"
}'
```
## Step 4: Make a request from Python or Javascript
In addition to the native xAI Python SDK, the majority of our APIs are fully compatible with the OpenAI SDK (and the Anthropic SDK, although the latter is now deprecated). For example, we can make the same request from Python or JavaScript like so:
**Anthropic SDK Deprecated**: The Anthropic SDK compatibility is fully deprecated. Please migrate to the [Responses API](/developers/rest-api-reference/inference/chat#create-new-response) or [gRPC](/developers/grpc-api-reference).
```pythonXAI
# In your terminal, first run:
# pip install xai-sdk
import os
from xai_sdk import Client
from xai_sdk.chat import user, system
client = Client(
api_key=os.getenv("XAI_API_KEY"),
timeout=3600, # Override default timeout with longer timeout for reasoning models
)
chat = client.chat.create(model="grok-4-1-fast-reasoning")
chat.append(system("You are Grok, a highly intelligent, helpful AI assistant."))
chat.append(user("What is the meaning of life, the universe, and everything?"))
response = chat.sample()
print(response.content)
```
```pythonOpenAISDK
# In your terminal, first run:
# pip install openai
import os
import httpx
from openai import OpenAI
XAI_API_KEY = os.getenv("XAI_API_KEY")
client = OpenAI(
api_key=XAI_API_KEY,
base_url="https://api.x.ai/v1",
timeout=httpx.Timeout(3600.0), # Override default timeout with longer timeout for reasoning models
)
completion = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[
{
"role": "system",
"content": "You are Grok, a highly intelligent, helpful AI assistant."
},
{
"role": "user",
"content": "What is the meaning of life, the universe, and everything?"
},
],
)
print(completion.output[0].content)
```
```javascriptAISDK
// In your terminal, first run:
// pnpm add ai @ai-sdk/xai
import { xai } from '@ai-sdk/xai';
import { generateText } from 'ai';
const result = await generateText({
model: xai.responses('grok-4'),
system: 'You are Grok, a highly intelligent, helpful AI assistant.',
prompt: 'What is the meaning of life, the universe, and everything?',
});
console.log(result.text);
```
```javascriptOpenAISDK
// In your terminal, first run:
// npm install openai
import OpenAI from 'openai';
const client = new OpenAI({
apiKey: "your_api_key",
baseURL: "https://api.x.ai/v1",
timeout: 360000, // Override default timeout with longer timeout for reasoning models
});
const response = await client.responses.create({
model: "grok-4-1-fast-reasoning",
input: [
{
role: "system",
content:
"You are Grok, a highly intelligent, helpful AI assistant.",
},
{
role: "user",
content:
"What is the meaning of life, the universe, and everything?",
},
],
});
console.log(response.output[0].content);
```
```bash
curl https://api.x.ai/v1/chat/completions \\
-H "Content-Type: application/json" \\
-H "Authorization: Bearer $XAI_API_KEY" \\
-m 3600 \\
-d '{
"messages": [
{
"role": "system",
"content": "You are Grok, a highly intelligent, helpful AI assistant."
},
{
"role": "user",
"content": "What is the meaning of life, the universe, and everything?"
}
],
"model": "grok-4-1-fast-reasoning"
}'
```
Certain models also support [Structured Outputs](/developers/model-capabilities/text/structured-outputs), which allows you to enforce a schema for the LLM output.
For an in-depth guide about using Grok for text responses, check out our [Text Generation Guide](/developers/model-capabilities/text/generate-text).
## Step 5: Use Grok to analyze images
Certain Grok models can accept both text AND images as input. For example:
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user, image
client = Client(
api_key=os.getenv("XAI_API_KEY"),
timeout=3600, # Override default timeout with longer timeout for reasoning models
)
chat = client.chat.create(model="grok-4")
chat.append(
user(
"What's in this image?",
image("https://science.nasa.gov/wp-content/uploads/2023/09/web-first-images-release.png")
)
)
response = chat.sample()
print(response.content)
```
```pythonOpenAISDK
import os
import httpx
from openai import OpenAI
XAI_API_KEY = os.getenv("XAI_API_KEY")
image_url = "https://science.nasa.gov/wp-content/uploads/2023/09/web-first-images-release.png"
client = OpenAI(
api_key=XAI_API_KEY,
base_url="https://api.x.ai/v1",
timeout=httpx.Timeout(3600.0), # Override default timeout with longer timeout for reasoning models
)
completion = client.responses.create(
model="grok-4",
input=[
{
"role": "user",
"content": [
{
"type": "input_image",
"image_url": image_url,
"detail": "high",
},
{
"type": "input_text",
"text": "What's in this image?",
},
],
},
],
)
print(completion.output[0].content)
```
```javascriptAISDK
import { xai } from '@ai-sdk/xai';
import { generateText } from 'ai';
const imageUrl =
'https://science.nasa.gov/wp-content/uploads/2023/09/web-first-images-release.png';
const result = await generateText({
model: xai.responses('grok-4'),
messages: [
{
role: 'user',
content: [
{ type: 'image', image: imageUrl },
{ text: "What's in this image?", type: 'text' },
],
},
],
});
console.log(result.text);
```
```javascriptOpenAISDK
import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.XAI_API_KEY,
baseURL: "https://api.x.ai/v1",
timeout: 360000, // Override default timeout with longer timeout for reasoning models
});
const image_url =
"https://science.nasa.gov/wp-content/uploads/2023/09/web-first-images-release.png";
const completion = await client.responses.create({
model: "grok-4",
input: [
{
role: "user",
content: [
{
type: "input_image",
image_url: image_url,
detail: "high",
},
{
type: "input_text",
text: "What's in this image?",
},
],
},
],
});
console.log(completion.output[0].content);
```
```bash
curl https://api.x.ai/v1/responses \\
-H "Content-Type: application/json" \\
-H "Authorization: Bearer $XAI_API_KEY" \\
-m 3600 \\
-d '{
"model": "grok-4",
"input": [
{
"role": "user",
"content": [
{
"type": "input_image",
"image_url": "https://science.nasa.gov/wp-content/uploads/2023/09/web-first-images-release.png",
"detail": "high"
},
{
"type": "input_text",
"text": "Describe this image"
}
]
}
]
}'
```
And voila! Grok will tell you exactly what's in the image:
> This image is a photograph of a region in space, specifically a part of the Carina Nebula, captured by the James Webb Space Telescope. It showcases a stunning view of interstellar gas and dust, illuminated by young, hot stars. The bright points of light are stars, and the colorful clouds are composed of various gases and dust particles. The image highlights the intricate details and beauty of star formation within a nebula.
To learn how to use Grok vision for more advanced use cases, check out our [Image Understanding](/developers/model-capabilities/images/understanding).
===/developers/rate-limits===
#### Key Information
# Consumption and Rate Limits
The cost of using our API is based on token consumption. We charge different prices based on token category:
* **Prompt text, audio and image tokens** - Charged at prompt token price
* **Cached prompt tokens** - Charged at cached prompt token price
* **Completion tokens** - Charged at completion token price
* **Reasoning tokens** - Charged at completion token price
Visit [Models and Pricing](../models) for general pricing, or [xAI Console](https://console.x.ai) for pricing applicable to your team.
Each `grok` model has different rate limits. To check your team's rate limits, you can visit [xAI Console Models Page](https://console.x.ai/team/default/models).
## Basic unit to calculate consumption — Tokens
A token is the basic unit of prompt size for model inference and pricing purposes. It consists of one or more character(s)/symbol(s).
When a Grok model handles your request, an input prompt will be decomposed into a list of tokens through a tokenizer.
The model will then make inference based on the prompt tokens, and generate completion tokens.
After the inference is completed, the completion tokens will be aggregated into a completion response sent back to you.
Our system adds additional formatting tokens to the input/output tokens, and if you selected a reasoning model, additional reasoning tokens will be added to the total token consumption as well.
Your actual consumption will be reflected either in the `usage` object returned in the API response, or in Usage Explorer on the [xAI Console](https://console.x.ai).
You can use the [Tokenizer](https://console.x.ai/team/default/tokenizer) on xAI Console to visualize the tokens in a given text prompt, or use the [Tokenize text](/developers/rest-api-reference/inference/other#tokenize-text) endpoint on the API.
### Text tokens
Tokens can be whole words or smaller chunks of character combinations. The more common a word is, the more likely it is to be a whole token.
For example, Flint is broken down into two tokens, while Michigan is a whole token.
In another example, most words are tokens by themselves, but "drafter" is broken down into "dra" and "fter", and "postmaster" is broken down into "post" and "master".
For a given text/image/etc. prompt or completion sequence, different tokenizers may break it down into different lengths of lists.
Different Grok models may also share or use different tokenizers. Therefore, **the same prompt/completion sequence may not have the same amount of tokens across different models.**
The token count of a prompt/completion sequence should be approximately linear in the sequence length.
### Image prompt tokens
Each image prompt takes between 256 and 1792 tokens, depending on the size of the image. The combined image and text token count must be less than the model's overall context window.
### Estimating consumption with tokenizer on xAI Console or through API
The tokenizer page or API might display a lower token count than the actual token consumption, because the inference endpoints automatically add pre-defined tokens to help our system process the request.
On xAI Console, you can use the [tokenizer page](https://console.x.ai/team/default/tokenizer) to estimate how many tokens your text prompt will consume. For example, the following message would consume 5 tokens (the actual consumption may vary because of additional special tokens added by the system).
Message body:
```json
[
{
"role": "user",
"content": "How is the weather today?"
}
]
```
The tokenized result is displayed on the Tokenizer page.
You can also use the [Tokenize Text](/developers/rest-api-reference/inference/other#tokenize-text) API endpoint to tokenize the text and count the length of the returned token array.
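A minimal sketch with the `requests` library (the request and response field names here are assumptions for illustration; see the API reference for the exact schema):
```python
import os
import requests

response = requests.post(
    "https://api.x.ai/v1/tokenize-text",
    headers={"Authorization": f"Bearer {os.getenv('XAI_API_KEY')}"},
    # "text" and "model" are assumed field names; check the API reference.
    json={"text": "How is the weather today?", "model": "grok-4-1-fast-reasoning"},
)
data = response.json()
# Count the length of the returned token array to estimate consumption.
print(len(data["tokens"]))  # "tokens" is an assumed response field
```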
### Cached prompt tokens
When you send the same prompt multiple times, we may cache your prompt tokens. This would result in reduced cost for these tokens at the cached token rate, and a quicker response.
The prompt is cached using prefix matching: subsequent requests whose prompts share an exact prefix with an earlier request can reuse the cached portion. However, cache size is limited, and caches are distributed across different clusters.
You can also specify `x-grok-conv-id: <your_conversation_id>` in the HTTP request header to increase the likelihood of cache hits in subsequent requests that use the same header.
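For example, with the `requests` library you can attach the header to every request in a conversation (a minimal sketch; the conversation ID is any stable string you choose):
```python
import os
import requests

conv_id = "my-conversation-123"  # any stable identifier of your choosing

response = requests.post(
    "https://api.x.ai/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {os.getenv('XAI_API_KEY')}",
        "x-grok-conv-id": conv_id,  # raises cache-hit likelihood across requests
    },
    json={
        "model": "grok-4-1-fast-reasoning",
        "messages": [{"role": "user", "content": "How is the weather today?"}],
    },
)
print(response.json()["usage"]["prompt_tokens_details"]["cached_tokens"])
```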
### Reasoning tokens
The model may use reasoning to process your request. The reasoning content is returned in the response's `reasoning_content` field. The reasoning token consumption will be counted separately from `completion_tokens`, but will be counted in the `total_tokens`.
The reasoning tokens will be charged at the same price as `completion_tokens`.
`grok-4` does not return `reasoning_content`.
## Hitting rate limits
To request a higher rate limit, please email support@x.ai with your anticipated volume.
For each tier, there is a maximum number of requests per minute and tokens per minute. This is to ensure fair usage by all users of the system.
Once your request frequency has reached the rate limit, you will receive error code `429` in response.
You can either:
* Upgrade your team to higher tiers
* Change your consumption pattern to send fewer requests
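If you only hit the limit occasionally, retrying with exponential backoff when a `429` is returned is a common pattern. A minimal sketch using the `requests` library:
```python
import time
import requests

def post_with_backoff(url: str, headers: dict, payload: dict, max_retries: int = 5):
    """POST with exponential backoff on HTTP 429 (rate limited)."""
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=payload)
        if response.status_code != 429:
            return response
        time.sleep(2 ** attempt)  # wait 1s, 2s, 4s, ... before retrying
    raise RuntimeError("Still rate limited after retries")
```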
## Checking token consumption
In each completion response, there is a `usage` object detailing your prompt and completion token count. You might find it helpful to keep track of it, in order to avoid hitting rate limits or having cost surprises. You can view more details of the object on our [API Reference](/developers/rest-api-reference).
```json
"usage": {
"prompt_tokens": 199,
"completion_tokens": 1,
"total_tokens": 200,
"prompt_tokens_details": {
"text_tokens": 199,
"audio_tokens": 0,
"image_tokens": 0,
"cached_tokens": 163
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"audio_tokens": 0,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0
},
"num_sources_used": 0,
"cost_in_usd_ticks": 158500
}
```
The `cost_in_usd_ticks` field expresses the total cost of performing the inference, in ticks of 1/10,000,000,000 of a US dollar.
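Since one tick is 1/10,000,000,000 of a US dollar, converting the reported cost to dollars is a single division:
```python
usage = {"cost_in_usd_ticks": 158500}  # value from the response's "usage" object

TICKS_PER_USD = 10_000_000_000  # 1 tick = 1/10,000,000,000 USD
cost_usd = usage["cost_in_usd_ticks"] / TICKS_PER_USD
print(f"${cost_usd:.10f}")  # $0.0000158500
```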
**Note:** `usage.prompt_tokens_details.text_tokens` is the total text input token count, which includes `cached_tokens` as well as non-cached text tokens.
You can also check token consumption with the xAI or OpenAI SDKs (the Anthropic SDK is deprecated).
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import system, user
client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
messages=[system("You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy.")]
)
chat.append(user("What is the meaning of life, the universe, and everything?"))
response = chat.sample()
print(response.usage)
```
```pythonOpenAISDK
import os
from openai import OpenAI
XAI_API_KEY = os.getenv("XAI_API_KEY")
client = OpenAI(base_url="https://api.x.ai/v1", api_key=XAI_API_KEY)
completion = client.chat.completions.create(
model="grok-4-1-fast-reasoning",
messages=[
{
"role": "system",
"content": "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy.",
},
{
"role": "user",
"content": "What is the meaning of life, the universe, and everything?",
},
],
)
if completion.usage:
print(completion.usage.to_json())
```
```javascriptOpenAISDK
import OpenAI from "openai";
const openai = new OpenAI({
apiKey: "",
baseURL: "https://api.x.ai/v1",
});
const completion = await openai.chat.completions.create({
model: "grok-4-1-fast-reasoning",
messages: [
{
role: "system",
content:
"You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy.",
},
{
role: "user",
content:
"What is the meaning of life, the universe, and everything?",
},
],
});
console.log(completion.usage);
```
===/developers/regions===
#### Key Information
# Regional Endpoints
By default, you can access our API at `https://api.x.ai`. This is the most suitable endpoint for most customers,
as the request will be automatically routed by us to be processed in the region with lowest latency for your request.
For example, if you are based in US East Coast and send your request to `https://api.x.ai`, your request will be forwarded
to our `us-east-1` region and we will try to process it there first. If there is not enough computing resource in `us-east-1`,
we will send your request to other regions that are geographically closest to you and can handle the request.
## Using a regional endpoint
If you have specific data privacy requirements that would require the request to be processed within a specified region,
you can leverage our regional endpoint.
You can send your request to `https://<region>.api.x.ai`. For the same example, to send a request from the US East Coast to `us-east-1`,
you would now send the request to `https://us-east-1.api.x.ai`. If for some reason we cannot handle your request in `us-east-1`, the request will fail.
## Example of using regional endpoints
If you want to use a regional endpoint, you need to specify the endpoint URL when making requests with the SDK. In the xAI SDK, this is specified through the `api_host` parameter.
For example, to send a request to `us-east-1`:
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user
client = Client(
api_key=os.getenv("XAI_API_KEY"),
api_host="us-east-1.api.x.ai" # Without the https://
)
chat = client.chat.create(model="grok-4-1-fast-reasoning")
chat.append(user("What is the meaning of life?"))
completion = chat.sample()
```
```pythonOpenAISDK
import os
from openai import OpenAI
XAI_API_KEY = os.getenv("XAI_API_KEY")
client = OpenAI(
    api_key=XAI_API_KEY,
base_url="https://us-east-1.api.x.ai/v1",
)
completion = client.chat.completions.create(
model="grok-4-1-fast-reasoning",
messages=[
{"role": "user", "content": "What is the meaning of life?"}
]
)
```
```javascriptOpenAISDK
import OpenAI from "openai";
const client = new OpenAI({
  apiKey: process.env.XAI_API_KEY,
baseURL: "https://us-east-1.api.x.ai/v1",
});
const completion = await client.chat.completions.create({
model: "grok-4-1-fast-reasoning",
messages: [
{ role: "user", content: "What is the meaning of life?" }
]
});
```
```bash
curl https://us-east-1.api.x.ai/v1/chat/completions \\
-H "Content-Type: application/json" \\
-H "Authorization: Bearer $XAI_API_KEY" \\
-d '{
"messages": [
{
"role": "user",
"content": "What is the meaning of life, the universe, and everything?"
}
],
"model": "grok-4-1-fast-reasoning",
"stream": false
}'
```
## Model availability across regions
While we strive to make every model available across all regions, there could be occasions where some models are not
available in some regions.
By using the global `https://api.x.ai` endpoint, you have access to all models available to your team, since we
route your request automatically. If you're using a regional endpoint, please refer to [xAI Console](https://console.x.ai)
for the models available to your team in each region, or [Models and Pricing](../models) for the publicly available models.
===/developers/release-notes===
#### Release Notes
# Release Notes
Stay up to date with the latest changes to the xAI API.
# January 2026
### Video Generation & Next-Gen Image Generation
[Video Generation](/developers/model-capabilities/video/generation) and a revamped [Image Generation](/developers/model-capabilities/images/generation) are now available.
### Batch API is released
[Batch API](/developers/advanced-api-usage/batch-api) is available for all customers. It enables efficient batch processing of multiple requests, providing a better experience for users who need to submit large volumes of requests at once.
# December 2025
### Grok Voice Agent API is released
Grok Voice Agent API is generally available. Visit [Grok Voice Agent API](/developers/model-capabilities/audio/voice) for guidance on using the API.
# November 2025
### Grok 4.1 Fast is available in Enterprise API
You can now use Grok 4.1 Fast in the [xAI Enterprise API](https://x.ai/api). For more details, check out [our blogpost](https://x.ai/news/grok-4-1-fast).
### Agent tools adapt to Grok 4.1 Fast models and tool prices dropped
* You can now use Grok 4.1 Fast models with the agent tools, check out the [documentation of agent tools](/developers/tools/overview) to get started.
* The price of agent tools drops by up to 50% to no more than $5 per 1000 successful calls, see the new prices at [the pricing page](/developers/models#tools-pricing).
### Files API is generally available
You can now upload files and use them in chat conversations with the Files API. For more details, check out [our guide on Files](/developers/files).
### New Tools Available
* **Collections Search Tool**: You can now search through uploaded knowledge bases (collections) in chat conversations via the API. For more details, check out the [docs](/developers/tools/collections-search).
* **Remote MCP Tools**: You can now use tools from remote MCP servers in chat conversations via the API. For more details, check out the [docs](/developers/tools/remote-mcp).
* **Mixing client-side and server-side tools**: You can now mix client-side and server-side tools in the same chat conversation. For more details, check out the [docs](/developers/tools/advanced-usage#mixing-server-side-and-client-side-tools).
# October 2025
### Tools are now generally available
New agentic server-side tools including `web_search`, `x_search` and `code_execution` are available. For more details, check out [our guide on using Tools](/developers/tools/overview).
# September 2025
### Responses API is generally available
You can now use our stateful Responses API to process requests.
# August 2025
### Grok Code Fast 1 is released
We have released our first Code Model to be used with code editors.
### Collections API is released
You can upload files, create embeddings, and use them for inference with our Collections API.
# July 2025
### Grok 4 is released
You can now use Grok 4 via our API or on https://grok.com.
# June 2025
### Management API is released
You can manage your API keys via Management API at
`https://management-api.x.ai`.
# May 2025
### Cached prompt is now available
You can now use cached prompt to save on repeated prompts. For
more info, see [models](/developers/models).
### Live Search is available on API
Live search is now available on API. Users can generate
completions with queries on supported data sources.
# April 2025
### Grok 3 models launch on API
Our latest flagship `Grok 3` models are now generally available via
the API. For more info, see [models](/developers/models).
# March 2025
### Image Generation Model available on API
The image generation model is available on API. Visit
[Image Generations](/developers/model-capabilities/images/generation) for more details on using the model.
# February 2025
### Audit Logs
Team admins can now view audit logs on [console.x.ai](https://console.x.ai).
# January 2025
### Docs Dark Mode
Released dark mode support on docs.x.ai
### Status Page
Check service statuses across all xAI products at
[status.x.ai](https://status.x.ai/).
# December 2024
### Replit & xAI
Replit Agents can now integrate with xAI! Start empowering your agents with Grok.
Check out the [announcement](https://x.com/Replit/status/1874211039258333643) for more information.
### Tokenizer Playground
Understanding tokens can be hard. Check out [console.x.ai](https://console.x.ai) to get a better understanding of what counts as a token.
### Structured Outputs
We're excited to announce that Grok now supports structured outputs. Grok can now format responses in a predefined, organized format rather than free-form text.
1. Specify the desired schema
```json
{
  "name": "movie_response",
  "schema": {
    "type": "object",
    "properties": {
      "title": { "type": "string" },
      "rating": { "type": "number" }
    },
    "required": [ "title", "rating" ],
    "additionalProperties": false
  },
  "strict": true
}
```
2. Get the desired data
```json
{
"title": "Star Wars",
"rating": 8.6
}
```
Start building more reliable applications. Check out the [docs](/developers/model-capabilities/text/structured-outputs) for more information.
### Released the new grok-2-1212 and grok-2-vision-1212 models
A month ago, we launched the public beta of our enterprise API with grok-beta and grok-vision-beta. We're adding [grok-2-1212 and grok-2-vision-1212](/developers/models), offering better accuracy, instruction-following, and multilingual capabilities.
# November 2024
### LangChain & xAI
Our API is now available through LangChain!
* Python Docs: https://python.langchain.com/integrations/providers/xai/
* Javascript Docs: https://js.langchain.com/integrations/chat/xai/
What are you going to build?
### API Public Beta
We are happy to announce the immediate availability of our API, which
gives developers programmatic access to our Grok series of foundation models. To get started, head
to [console.x.ai](https://console.x.ai/) and sign up to create an account. We are excited to see
what developers build using Grok.
===/developers/rest-api-reference/collections/collection===
#### Collections API
# Collection Management
The base URL for `collection` management is shared with [Management API](/developers/rest-api-reference/management) at `https://management-api.x.ai/`.
You have to authenticate using an **xAI Management API Key** with the header `Authorization: Bearer <your_management_api_key>`.
For more details on provisioning an xAI Management API key and using the Management API, visit the [Management API overview](/developers/rest-api-reference/management).
***
## POST /v1/collections
API endpoint for POST requests to /v1/collections.
```
Method: POST
Path: /v1/collections
```
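As a minimal sketch of calling this endpoint (the `name` field in the body is an assumption for illustration, as is the `XAI_MANAGEMENT_API_KEY` variable name; consult the endpoint schema for the actual fields):
```bash
curl https://management-api.x.ai/v1/collections \\
  -H "Content-Type: application/json" \\
  -H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY" \\
  -d '{"name": "my-collection"}'
```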
***
## GET /v1/collections
API endpoint for GET requests to /v1/collections.
```
Method: GET
Path: /v1/collections
```
***
## GET /v1/collections/\{collection\_id}
API endpoint for GET requests to /v1/collections/\{collection\_id}.
```
Method: GET
Path: /v1/collections/{collection_id}
```
***
## DELETE /v1/collections/\{collection\_id}
API endpoint for DELETE requests to /v1/collections/\{collection\_id}.
```
Method: DELETE
Path: /v1/collections/{collection_id}
```
***
## PUT /v1/collections/\{collection\_id}
API endpoint for PUT requests to /v1/collections/\{collection\_id}.
```
Method: PUT
Path: /v1/collections/{collection_id}
```
***
## POST /v1/collections/\{collection\_id}/documents/\{file\_id}
API endpoint for POST requests to /v1/collections/\{collection\_id}/documents/\{file\_id}.
```
Method: POST
Path: /v1/collections/{collection_id}/documents/{file_id}
```
***
## GET /v1/collections/\{collection\_id}/documents
API endpoint for GET requests to /v1/collections/\{collection\_id}/documents.
```
Method: GET
Path: /v1/collections/{collection_id}/documents
```
***
## GET /v1/collections/\{collection\_id}/documents/\{file\_id}
API endpoint for GET requests to /v1/collections/\{collection\_id}/documents/\{file\_id}.
```
Method: GET
Path: /v1/collections/{collection_id}/documents/{file_id}
```
***
## PATCH /v1/collections/\{collection\_id}/documents/\{file\_id}
API endpoint for PATCH requests to /v1/collections/\{collection\_id}/documents/\{file\_id}.
```
Method: PATCH
Path: /v1/collections/{collection_id}/documents/{file_id}
```
***
## DELETE /v1/collections/\{collection\_id}/documents/\{file\_id}
API endpoint for DELETE requests to /v1/collections/\{collection\_id}/documents/\{file\_id}.
```
Method: DELETE
Path: /v1/collections/{collection_id}/documents/{file_id}
```
***
## GET /v1/collections/\{collection\_id}/documents:batchGet
API endpoint for GET requests to /v1/collections/\{collection\_id}/documents:batchGet.
```
Method: GET
Path: /v1/collections/{collection_id}/documents:batchGet
```
===/developers/rest-api-reference/collections===
#### Collections API
# Collections REST API Overview
The Collections API allows you to manage your Collections `documents` and `collections` programmatically.
The base URL for `collection` management is shared with [Management API](/developers/rest-api-reference/management) at `https://management-api.x.ai/v1/`. You have to authenticate using an **xAI Management API Key** with the header `Authorization: Bearer <your_management_api_key>`.
For more details on provisioning an xAI Management API key and using the Management API, visit the [Management API overview](/developers/rest-api-reference/management).
The base URL for searching within `collections` is shared with [REST API](/developers/rest-api-reference) at `https://api.x.ai`. You have to authenticate with the header `Authorization: Bearer <your_xai_api_key>`.
* [Collection Management](/developers/rest-api-reference/collections/collection)
* [Search in Collections](/developers/rest-api-reference/collections/search)
===/developers/rest-api-reference/collections/search===
#### Collections API
# Search in Collections
The base URL for searching `collections` is shared with [REST API](/developers/rest-api-reference) at `https://api.x.ai`. You have to authenticate using an **xAI API Key** with the header `Authorization: Bearer <your_xai_api_key>`.
***
## POST /v1/documents/search
API endpoint for POST requests to /v1/documents/search.
```
Method: POST
Path: /v1/documents/search
```
===/developers/rest-api-reference/files/download===
#### Files API
# Download
***
## POST /v1/files:download
API endpoint for POST requests to /v1/files:download.
```
Method: POST
Path: /v1/files:download
```
===/developers/rest-api-reference/files/manage===
#### Files API
# Manage
***
## GET /v1/files
API endpoint for GET requests to /v1/files.
```
Method: GET
Path: /v1/files
```
***
## GET /v1/files/\{file\_id}
API endpoint for GET requests to /v1/files/\{file\_id}.
```
Method: GET
Path: /v1/files/{file_id}
```
***
## PUT /v1/files/\{file\_id}
API endpoint for PUT requests to /v1/files/\{file\_id}.
```
Method: PUT
Path: /v1/files/{file_id}
```
***
## DELETE /v1/files/\{file\_id}
API endpoint for DELETE requests to /v1/files/\{file\_id}.
```
Method: DELETE
Path: /v1/files/{file_id}
```
===/developers/rest-api-reference/files===
#### Files API
# Files REST API Overview
The Files API allows you to upload, manage, and download files for use with Grok models. Files can be attached to chat messages for document understanding, or added to collections for semantic search.
The base URL for all file operations is `https://api.x.ai`. You have to authenticate with the header `Authorization: Bearer <your_xai_api_key>`.
* [Upload](/developers/rest-api-reference/files/upload)
* [Manage](/developers/rest-api-reference/files/manage)
* [Download](/developers/rest-api-reference/files/download)
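As a minimal sketch of an upload (the multipart field name `file` is an assumption for illustration; see the [Upload](/developers/rest-api-reference/files/upload) reference for the exact schema):
```bash
curl https://api.x.ai/v1/files \\
  -H "Authorization: Bearer $XAI_API_KEY" \\
  -F "file=@document.pdf"
```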
===/developers/rest-api-reference/files/upload===
#### Files API
# Upload
***
## POST /v1/files
API endpoint for POST requests to /v1/files.
```
Method: POST
Path: /v1/files
```
***
## POST /v1/files/batch\_upload
API endpoint for POST requests to /v1/files/batch\_upload.
```
Method: POST
Path: /v1/files/batch_upload
```
***
## POST /v1/files/batch\_upload/\{batch\_job\_id}:complete
API endpoint for POST requests to /v1/files/batch\_upload/\{batch\_job\_id}:complete.
```
Method: POST
Path: /v1/files/batch_upload/{batch_job_id}:complete
```
***
## POST /v1/files:initialize
API endpoint for POST requests to /v1/files:initialize.
```
Method: POST
Path: /v1/files:initialize
```
***
## POST /v1/files:uploadChunks
API endpoint for POST requests to /v1/files:uploadChunks.
```
Method: POST
Path: /v1/files:uploadChunks
```
===/developers/rest-api-reference/inference/batches===
#### Inference API
# Batches
***
## POST /v1/batches
API endpoint for POST requests to /v1/batches.
```
Method: POST
Path: /v1/batches
```
***
## GET /v1/batches
API endpoint for GET requests to /v1/batches.
```
Method: GET
Path: /v1/batches
```
***
## GET /v1/batches/\{batch\_id}
API endpoint for GET requests to /v1/batches/\{batch\_id}.
```
Method: GET
Path: /v1/batches/{batch_id}
```
***
## GET /v1/batches/\{batch\_id}/requests
API endpoint for GET requests to /v1/batches/\{batch\_id}/requests.
```
Method: GET
Path: /v1/batches/{batch_id}/requests
```
***
## POST /v1/batches/\{batch\_id}/requests
API endpoint for POST requests to /v1/batches/\{batch\_id}/requests.
```
Method: POST
Path: /v1/batches/{batch_id}/requests
```
***
## GET /v1/batches/\{batch\_id}/results
API endpoint for GET requests to /v1/batches/\{batch\_id}/results.
```
Method: GET
Path: /v1/batches/{batch_id}/results
```
***
## POST /v1/batches/\{batch\_id}:cancel
API endpoint for POST requests to /v1/batches/\{batch\_id}:cancel.
```
Method: POST
Path: /v1/batches/{batch_id}:cancel
```
===/developers/rest-api-reference/inference/chat===
#### Inference API
# Chat
***
## POST /v1/chat/completions
API endpoint for POST requests to /v1/chat/completions.
```
Method: POST
Path: /v1/chat/completions
```
***
## POST /v1/responses
API endpoint for POST requests to /v1/responses.
```
Method: POST
Path: /v1/responses
```
***
## GET /v1/responses/\{response\_id}
API endpoint for GET requests to /v1/responses/\{response\_id}.
```
Method: GET
Path: /v1/responses/{response_id}
```
***
## DELETE /v1/responses/\{response\_id}
API endpoint for DELETE requests to /v1/responses/\{response\_id}.
```
Method: DELETE
Path: /v1/responses/{response_id}
```
***
## GET /v1/chat/deferred-completion/\{request\_id}
API endpoint for GET requests to /v1/chat/deferred-completion/\{request\_id}.
```
Method: GET
Path: /v1/chat/deferred-completion/{request_id}
```
===/developers/rest-api-reference/inference/images===
#### Inference API
# Images
***
## POST /v1/images/generations
API endpoint for POST requests to /v1/images/generations.
```
Method: POST
Path: /v1/images/generations
```
***
## POST /v1/images/edits
API endpoint for POST requests to /v1/images/edits.
```
Method: POST
Path: /v1/images/edits
```
===/developers/rest-api-reference/inference/legacy===
#### Inference API
# Legacy & Deprecated
***
## POST /v1/completions
API endpoint for POST requests to /v1/completions.
```
Method: POST
Path: /v1/completions
```
***
**Deprecated**: The Anthropic SDK compatibility is fully deprecated. Please migrate to the [Responses API](/developers/rest-api-reference/inference/chat#create-new-response) or [gRPC](/developers/grpc-api-reference).
## POST /v1/messages
API endpoint for POST requests to /v1/messages.
```
Method: POST
Path: /v1/messages
```
***
**Deprecated**: The Anthropic SDK compatibility is fully deprecated. Please migrate to the [Responses API](/developers/rest-api-reference/inference/chat#create-new-response) or [gRPC](/developers/grpc-api-reference).
## POST /v1/complete
API endpoint for POST requests to /v1/complete.
```
Method: POST
Path: /v1/complete
```
===/developers/rest-api-reference/inference/models===
#### Inference API
# Models
***
## GET /v1/models
API endpoint for GET requests to /v1/models.
```
Method: GET
Path: /v1/models
```
***
## GET /v1/models/\{model\_id}
API endpoint for GET requests to /v1/models/\{model\_id}.
```
Method: GET
Path: /v1/models/{model_id}
```
***
## GET /v1/language-models
API endpoint for GET requests to /v1/language-models.
```
Method: GET
Path: /v1/language-models
```
***
## GET /v1/language-models/\{model\_id}
API endpoint for GET requests to /v1/language-models/\{model\_id}.
```
Method: GET
Path: /v1/language-models/{model_id}
```
***
## GET /v1/image-generation-models
API endpoint for GET requests to /v1/image-generation-models.
```
Method: GET
Path: /v1/image-generation-models
```
***
## GET /v1/image-generation-models/\{model\_id}
API endpoint for GET requests to /v1/image-generation-models/\{model\_id}.
```
Method: GET
Path: /v1/image-generation-models/{model_id}
```
===/developers/rest-api-reference/inference/other===
#### Inference API
# Other
***
## GET /v1/api-key
API endpoint for GET requests to /v1/api-key.
```
Method: GET
Path: /v1/api-key
```
***
## POST /v1/tokenize-text
API endpoint for POST requests to /v1/tokenize-text.
```
Method: POST
Path: /v1/tokenize-text
```
===/developers/rest-api-reference/inference===
#### Inference API
# Inference REST API Overview
The xAI Inference REST API is a robust, high-performance RESTful interface designed for seamless integration into existing systems.
It offers advanced AI capabilities with full compatibility with the OpenAI REST API.
The base URL for all routes is `https://api.x.ai`. For all routes, you have to authenticate with the header `Authorization: Bearer <your_xai_api_key>`.
* [Chat](/developers/rest-api-reference/inference/chat)
* [Images](/developers/rest-api-reference/inference/images)
* [Videos](/developers/rest-api-reference/inference/videos)
* [Models](/developers/rest-api-reference/inference/models)
* [Files](/developers/rest-api-reference/files)
* [Batches](/developers/rest-api-reference/inference/batches)
* [Other](/developers/rest-api-reference/inference/other)
* [Legacy & Deprecated](/developers/rest-api-reference/inference/legacy)
===/developers/rest-api-reference/inference/videos===
#### Inference API
# Videos
***
## POST /v1/videos/generations
API endpoint for POST requests to /v1/videos/generations.
```
Method: POST
Path: /v1/videos/generations
```
***
## POST /v1/videos/edits
API endpoint for POST requests to /v1/videos/edits.
```
Method: POST
Path: /v1/videos/edits
```
***
## GET /v1/videos/\{request\_id}
API endpoint for GET requests to /v1/videos/\{request\_id}.
```
Method: GET
Path: /v1/videos/{request_id}
```
===/developers/rest-api-reference/management/audit===
#### Management API
# Audit Logs
***
## GET /audit/teams/\{teamId}/events
API endpoint for GET requests to /audit/teams/\{teamId}/events.
```
Method: GET
Path: /audit/teams/{teamId}/events
```
===/developers/rest-api-reference/management/auth===
#### Management API
# Accounts and Authorization
***
## POST /auth/teams/\{teamId}/api-keys
API endpoint for POST requests to /auth/teams/\{teamId}/api-keys.
```
Method: POST
Path: /auth/teams/{teamId}/api-keys
```
***
## GET /auth/teams/\{teamId}/api-keys
API endpoint for GET requests to /auth/teams/\{teamId}/api-keys.
```
Method: GET
Path: /auth/teams/{teamId}/api-keys
```
***
## PUT /auth/api-keys/\{api\_key\_id}
API endpoint for PUT requests to /auth/api-keys/\{api\_key\_id}.
```
Method: PUT
Path: /auth/api-keys/{api_key_id}
```
***
## DELETE /auth/api-keys/\{apiKeyId}
API endpoint for DELETE requests to /auth/api-keys/\{apiKeyId}.
```
Method: DELETE
Path: /auth/api-keys/{apiKeyId}
```
***
## GET /auth/api-keys/\{apiKeyId}/propagation
API endpoint for GET requests to /auth/api-keys/\{apiKeyId}/propagation.
```
Method: GET
Path: /auth/api-keys/{apiKeyId}/propagation
```
***
## GET /auth/teams/\{teamId}/models
API endpoint for GET requests to /auth/teams/\{teamId}/models.
```
Method: GET
Path: /auth/teams/{teamId}/models
```
***
## GET /auth/teams/\{teamId}/endpoints
API endpoint for GET requests to /auth/teams/\{teamId}/endpoints.
```
Method: GET
Path: /auth/teams/{teamId}/endpoints
```
***
## GET /auth/management-keys/validation
API endpoint for GET requests to /auth/management-keys/validation.
```
Method: GET
Path: /auth/management-keys/validation
```
===/developers/rest-api-reference/management/billing===
#### Management API
# Billing Management
***
## GET /v1/billing/teams/\{team\_id}/billing-info
API endpoint for GET requests to /v1/billing/teams/\{team\_id}/billing-info.
```
Method: GET
Path: /v1/billing/teams/{team_id}/billing-info
```
***
## POST /v1/billing/teams/\{team\_id}/billing-info
API endpoint for POST requests to /v1/billing/teams/\{team\_id}/billing-info.
```
Method: POST
Path: /v1/billing/teams/{team_id}/billing-info
```
***
## GET /v1/billing/teams/\{team\_id}/invoices
API endpoint for GET requests to /v1/billing/teams/\{team\_id}/invoices.
```
Method: GET
Path: /v1/billing/teams/{team_id}/invoices
```
***
## GET /v1/billing/teams/\{team\_id}/payment-method
API endpoint for GET requests to /v1/billing/teams/\{team\_id}/payment-method.
```
Method: GET
Path: /v1/billing/teams/{team_id}/payment-method
```
***
## POST /v1/billing/teams/\{team\_id}/payment-method/default
API endpoint for POST requests to /v1/billing/teams/\{team\_id}/payment-method/default.
```
Method: POST
Path: /v1/billing/teams/{team_id}/payment-method/default
```
***
## GET /v1/billing/teams/\{team\_id}/postpaid/invoice/preview
API endpoint for GET requests to /v1/billing/teams/\{team\_id}/postpaid/invoice/preview.
```
Method: GET
Path: /v1/billing/teams/{team_id}/postpaid/invoice/preview
```
***
## GET /v1/billing/teams/\{team\_id}/postpaid/spending-limits
API endpoint for GET requests to /v1/billing/teams/\{team\_id}/postpaid/spending-limits.
```
Method: GET
Path: /v1/billing/teams/{team_id}/postpaid/spending-limits
```
***
## POST /v1/billing/teams/\{team\_id}/postpaid/spending-limits
API endpoint for POST requests to /v1/billing/teams/\{team\_id}/postpaid/spending-limits.
```
Method: POST
Path: /v1/billing/teams/{team_id}/postpaid/spending-limits
```
***
## GET /v1/billing/teams/\{team\_id}/prepaid/balance
API endpoint for GET requests to /v1/billing/teams/\{team\_id}/prepaid/balance.
```
Method: GET
Path: /v1/billing/teams/{team_id}/prepaid/balance
```
***
## POST /v1/billing/teams/\{team\_id}/prepaid/top-up
API endpoint for POST requests to /v1/billing/teams/\{team\_id}/prepaid/top-up.
```
Method: POST
Path: /v1/billing/teams/{team_id}/prepaid/top-up
```
***
## POST /v1/billing/teams/\{team\_id}/usage
API endpoint for POST requests to /v1/billing/teams/\{team\_id}/usage.
```
Method: POST
Path: /v1/billing/teams/{team_id}/usage
```
===/developers/rest-api-reference/management===
#### Management API
# Management REST API Overview
The Management API allows you to perform operations on your team programmatically. You
need a [management key](https://console.x.ai/team/default/management-keys) in
order to use this API. The base URL for all endpoints is `https://management-api.x.ai`.
The Management API serves as a dedicated interface to the xAI platform, empowering developers and teams to
programmatically manage their xAI API teams.
For example, users can provision their API key, handle access controls,
and perform team-level operations like creating, listing, updating, or deleting keys and associated access control lists
(ACLs). This API also facilitates oversight of billing aspects, including monitoring prepaid credit balances and usage
deductions, ensuring seamless scalability and cost transparency for Grok model integrations.
To get started, go to [xAI Console](https://console.x.ai). On the Users page, make sure your xAI account has
`Management Keys` Read + Write permission, and obtain your Management API key on the settings page. If you don't see
these options, ask your team administrator to enable the appropriate permissions.
* [Accounts and Authorization](/developers/rest-api-reference/management/auth)
* [Billing Management](/developers/rest-api-reference/management/billing)
* [Audit Logs](/developers/rest-api-reference/management/audit)
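For example, to list your team's API keys (a minimal sketch; replace `{teamId}` with your team ID, and note that `XAI_MANAGEMENT_API_KEY` is just an illustrative variable name):
```bash
curl https://management-api.x.ai/auth/teams/{teamId}/api-keys \\
  -H "Authorization: Bearer $XAI_MANAGEMENT_API_KEY"
```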
===/developers/tools/advanced-usage===
#### Tools
# Advanced Usage
In this section, we explore advanced usage patterns for agentic tool calling, including:
* **[Use Client-side Tools](#mixing-server-side-and-client-side-tools)** - Combine server-side agentic tools with your own client-side tools for specialized functionality that requires local execution.
* **[Multi-turn Conversations](#multi-turn-conversations-with-preservation-of-agentic-state)** - Maintain context across multiple turns in agentic tool-enabled conversations, allowing the model to build upon previous research and tool results for more complex, iterative problem-solving
* **[Requests with Multiple Active Tools](#tool-combinations)** - Send requests with multiple server-side tools active simultaneously, enabling comprehensive analysis with web search, X search, and code execution tools working together
* **[Image Integration](#using-images-in-the-context)** - Include images in your tool-enabled conversations for visual analysis and context-aware searches
Advanced tool usage patterns are not yet supported in the Vercel AI SDK. Please use the xAI SDK or OpenAI SDK for this functionality.
## Mixing Server-Side and Client-Side Tools
You can combine server-side agentic tools (like web search and code execution) with custom client-side tools to create powerful hybrid workflows. This approach lets you leverage the model's reasoning capabilities with server-side tools while adding specialized functionality that runs locally in your application.
### How It Works
The key difference when mixing server-side and client-side tools is that **server-side tools are executed automatically by xAI**, while **client-side tools require developer intervention**:
1. Define your client-side tools using [standard function calling patterns](/developers/tools/function-calling)
2. Include both server-side and client-side tools in your request
3. **xAI automatically executes any server-side tools** the model decides to use (web search, code execution, etc.)
4. **When the model calls client-side tools, execution pauses** - xAI returns the tool calls to you instead of executing them
5. **Detect and execute client-side tool calls yourself**, then append the results back to continue the conversation
6. **Repeat this process** until the model generates a final response with no additional client-side tool calls
### Understanding `max_turns` with Client-Side Tools
When using [the `max_turns` parameter](/developers/tools/tool-usage-details#limiting-tool-call-turns) with mixed server-side and client-side tools, it's important to understand that **`max_turns` only limits the assistant/server-side tool call turns within a single request**.
When the model decides to invoke a client-side tool, the agent execution **pauses and yields control back to your application**. This means:
* The current request completes, and you receive the client-side tool call(s) to execute
* After you execute the client-side tool and append the result, you make a **new follow-up request**
* This follow-up request starts with a fresh `max_turns` count
In other words, client-side tool invocations act as "checkpoints" that reset the turn counter. If you set `max_turns=5` and the agent performs 3 server-side tool calls before requesting a client-side tool, the subsequent request (after you provide the client-side tool result) will again allow up to 5 server-side tool turns.
### Practical Example
Given a local client-side function `get_weather` to get the weather of a specified city, the model can use this client-side tool and the web-search tool to determine the weather in the base city of the 2025 NBA champion.
### Using the xAI SDK
You can determine whether a tool call is a client-side tool call by using `xai_sdk.tools.get_tool_call_type` against a tool call from the `response.tool_calls` list.
For more details, check [Identifying Tool Call Types](/developers/tools/tool-usage-details#identifying-tool-call-types).
1. Import the dependencies, and define the client-side tool.
```pythonXAI
import os
import json
from xai_sdk import Client
from xai_sdk.chat import user, tool, tool_result
from xai_sdk.tools import web_search, get_tool_call_type
client = Client(api_key=os.getenv("XAI_API_KEY"))
# Define client-side tool
def get_weather(city: str) -> str:
"""Get the weather for a given city."""
# In a real app, this would query your database
return f"The weather in {city} is sunny."
# Tools array with both server-side and client-side tools
tools = [
web_search(),
tool(
name="get_weather",
description="Get the weather for a given city.",
parameters={
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "The name of the city",
}
},
"required": ["city"]
},
),
]
model = "grok-4-1-fast-reasoning"
```
2. Perform the tool loop with conversation continuation:
* You can either use `previous_response_id` to continue the conversation from the last response.
```pythonXAI
# Create chat with both server-side and client-side tools
chat = client.chat.create(
model=model,
tools=tools,
store_messages=True,
)
chat.append(
user(
"What is the weather in the base city of the team that won the "
"2025 NBA championship?"
)
)
while True:
client_side_tool_calls = []
for response, chunk in chat.stream():
for tool_call in chunk.tool_calls:
if get_tool_call_type(tool_call) == "client_side_tool":
client_side_tool_calls.append(tool_call)
else:
print(
f"Server-side tool call: {tool_call.function.name} "
f"with arguments: {tool_call.function.arguments}"
)
if not client_side_tool_calls:
break
chat = client.chat.create(
model=model,
tools=tools,
store_messages=True,
previous_response_id=response.id,
)
for tool_call in client_side_tool_calls:
print(
f"Client-side tool call: {tool_call.function.name} "
f"with arguments: {tool_call.function.arguments}"
)
args = json.loads(tool_call.function.arguments)
result = get_weather(args["city"])
chat.append(tool_result(result))
print(f"Final response: {response.content}")
```
* Alternatively, you can use the encrypted content to continue the conversation.
```pythonXAI
# Create chat with both server-side and client-side tools
chat = client.chat.create(
model=model,
tools=tools,
use_encrypted_content=True,
)
chat.append(
user(
"What is the weather in the base city of the team that won the "
"2025 NBA championship?"
)
)
while True:
client_side_tool_calls = []
for response, chunk in chat.stream():
for tool_call in chunk.tool_calls:
if get_tool_call_type(tool_call) == "client_side_tool":
client_side_tool_calls.append(tool_call)
else:
print(
f"Server-side tool call: {tool_call.function.name} "
f"with arguments: {tool_call.function.arguments}"
)
chat.append(response)
if not client_side_tool_calls:
break
for tool_call in client_side_tool_calls:
print(
f"Client-side tool call: {tool_call.function.name} "
f"with arguments: {tool_call.function.arguments}"
)
args = json.loads(tool_call.function.arguments)
result = get_weather(args["city"])
chat.append(tool_result(result))
print(f"Final response: {response.content}")
```
You will see an output similar to the following:
```
Server-side tool call: web_search with arguments: {"query":"Who won the 2025 NBA championship?","num_results":5}
Client-side tool call: get_weather with arguments: {"city":"Oklahoma City"}
Final response: The Oklahoma City Thunder won the 2025 NBA championship. The current weather in Oklahoma City is sunny.
```
### Using the OpenAI SDK
You can determine whether a tool call is a client-side tool call by checking the `type` field of an output entry from the `response.output` list.
For more details, see [Identifying Tool Call Types](/developers/tools/tool-usage-details#identifying-tool-call-types).
1. Import the dependencies, and define the client-side tool.
```pythonOpenAISDK
import os
import json
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("XAI_API_KEY"),
base_url="https://api.x.ai/v1",
)
# Define client-side tool
def get_weather(city: str) -> str:
"""Get the weather for a given city."""
# In a real app, this would query your database
return f"The weather in {city} is sunny."
model = "grok-4-1-fast-reasoning"
tools = [
{
"type": "function",
"name": "get_weather",
"description": "Get the weather for a given city.",
"parameters": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "The name of the city",
},
},
"required": ["city"],
},
},
{
"type": "web_search",
},
]
```
2. Perform the tool loop:
* You can either use `previous_response_id`.
```pythonOpenAISDK
response = client.responses.create(
model=model,
input=(
"What is the weather in the base city of the team that won the "
"2025 NBA championship?"
),
tools=tools,
)
while True:
tool_outputs = []
for item in response.output:
if item.type == "function_call":
print(f"Client-side tool call: {item.name} with arguments: {item.arguments}")
args = json.loads(item.arguments)
weather = get_weather(args["city"])
tool_outputs.append(
{
"type": "function_call_output",
"call_id": item.call_id,
"output": weather,
}
)
elif item.type in (
"web_search_call",
"x_search_call",
"code_interpreter_call",
"file_search_call",
"mcp_call"
):
print(
f"Server-side tool call: {item.name} with arguments: {item.arguments}"
)
if not tool_outputs:
break
response = client.responses.create(
model=model,
tools=tools,
input=tool_outputs,
previous_response_id=response.id,
)
print("Final response:", response.output[-1].content[0].text)
```
* or using the encrypted content
```pythonOpenAISDK
input_list = [
{
"role": "user",
"content": (
"What is the weather in the base city of the team that won the "
"2025 NBA championship?"
),
}
]
response = client.responses.create(
model=model,
input=input_list,
tools=tools,
include=["reasoning.encrypted_content"],
)
while True:
input_list.extend(response.output)
tool_outputs = []
for item in response.output:
if item.type == "function_call":
print(f"Client-side tool call: {item.name} with arguments: {item.arguments}")
args = json.loads(item.arguments)
weather = get_weather(args["city"])
tool_outputs.append(
{
"type": "function_call_output",
"call_id": item.call_id,
"output": weather,
}
)
elif item.type in (
"web_search_call",
"x_search_call",
"code_interpreter_call",
"file_search_call",
"mcp_call"
):
print(
f"Server-side tool call: {item.name} with arguments: {item.arguments}"
)
if not tool_outputs:
break
input_list.extend(tool_outputs)
response = client.responses.create(
model=model,
input=input_list,
tools=tools,
include=["reasoning.encrypted_content"],
)
print("Final response:", response.output[-1].content[0].text)
```
## Multi-turn Conversations with Preservation of Agentic State
When using agentic tools, you may want to have multi-turn conversations where follow-up prompts maintain all agentic state, including the full history of reasoning, tool calls, and tool responses. This is possible using the stateful API, which preserves conversation context across multiple interactions. There are two options to achieve this, outlined below.
### Store the Conversation History Remotely
You can choose to store the conversation history remotely on the xAI server, and every time you want to continue the conversation, you can pick up from the last response where you want to resume from.
There are only 2 extra steps:
1. Add the parameter `store_messages=True` when making the first agentic request. This tells the service to store the entire conversation history on xAI servers, including the model's reasoning, server-side tool calls, and corresponding responses.
2. Pass `previous_response_id=response.id` when creating the follow-up conversation, where `response` is the response returned by `chat.sample()` or `chat.stream()` from the conversation that you wish to continue.
Note that the follow-up conversation does not need to use the same tools, model parameters, or any other configuration as the initial conversation—it will still be fully hydrated with the complete agentic state from the previous interaction.
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import web_search, x_search
client = Client(api_key=os.getenv("XAI_API_KEY"))
# First turn.
chat = client.chat.create(
model="grok-4-1-fast-reasoning", # reasoning model
tools=[web_search(), x_search()],
store_messages=True,
)
chat.append(user("What is xAI?"))
print("\\n\\n##### First turn #####\\n")
for response, chunk in chat.stream():
print(chunk.content, end="", flush=True)
print("\\n\\nUsage for first turn:", response.server_side_tool_usage)
# Second turn.
chat = client.chat.create(
model="grok-4-1-fast-reasoning", # reasoning model
tools=[web_search(), x_search()],
# pass the response id of the first turn to continue the conversation
previous_response_id=response.id,
)
chat.append(user("What is its latest mission?"))
print("\\n\\n##### Second turn #####\\n")
for response, chunk in chat.stream():
print(chunk.content, end="", flush=True)
print("\\n\\nUsage for second turn:", response.server_side_tool_usage)
```
### Append the Encrypted Agentic Tool Calling States
There is another option for ZDR (Zero Data Retention) users, or for users who prefer not to use the option above: have the xAI server return the encrypted reasoning and encrypted tool output alongside the final content, so that those encrypted contents can be included as part of the context in the next conversation turn.
Here are the extra steps you need to take for this option:
1. Add the parameter `use_encrypted_content=True` when making the first agentic request. This tells the service to return the entire conversation history to the client side, including the model's reasoning (encrypted), server-side tool calls, and corresponding responses (encrypted).
2. Append the response to the conversation you wish to continue before making the call to `chat.sample()` or `chat.stream()`.
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import web_search, x_search
client = Client(api_key=os.getenv("XAI_API_KEY"))
# First turn.
chat = client.chat.create(
model="grok-4-1-fast-reasoning", # reasoning model
tools=[web_search(), x_search()],
use_encrypted_content=True,
)
chat.append(user("What is xAI?"))
print("\\n\\n##### First turn #####\\n")
for response, chunk in chat.stream():
print(chunk.content, end="", flush=True)
print("\\n\\nUsage for first turn:", response.server_side_tool_usage)
chat.append(response)
print("\\n\\n##### Second turn #####\\n")
chat.append(user("What is its latest mission?"))
# Second turn.
for response, chunk in chat.stream():
print(chunk.content, end="", flush=True)
print("\\n\\nUsage for second turn:", response.server_side_tool_usage)
```
For more details about stateful responses, please check out [this guide](/developers/model-capabilities/text/generate-text).
## Tool Combinations
Equipping your requests with multiple tools is straightforward—simply include the tools you want to activate in the `tools` array of your request. The model will intelligently orchestrate between them based on the task at hand.
### Suggested Tool Combinations
Here are some common patterns for combining tools, depending on your use case:
| If you're trying to... | Consider activating... | Because... |
|------------------------|----------------------|------------|
| **Research & analyze data** | Web Search + Code Execution | Web search gathers information, code execution analyzes and visualizes it |
| **Aggregate news & social media** | Web Search + X Search | Get comprehensive coverage from both traditional web and social platforms |
| **Extract insights from multiple sources** | Web Search + X Search + Code Execution | Collect data from various sources then compute correlations and trends |
| **Monitor real-time discussions** | X Search + Web Search | Track social sentiment alongside authoritative information |
```pythonXAI
from xai_sdk.tools import web_search, x_search, code_execution
# Example tool combinations for different scenarios
research_setup = [web_search(), code_execution()]
news_setup = [web_search(), x_search()]
comprehensive_setup = [web_search(), x_search(), code_execution()]
```
```pythonWithoutSDK
research_setup = {
"tools": [
{"type": "web_search"},
{"type": "code_interpreter"}
]
}
news_setup = {
"tools": [
{"type": "web_search"},
{"type": "x_search"}
]
}
comprehensive_setup = {
"tools": [
{"type": "web_search"},
{"type": "x_search"},
{"type": "code_interpreter"}
]
}
```
### Using Tool Combinations in Different Scenarios
1. When you want to search for news on the Internet, you can activate both search tools:
* Web search tool
* X search tool
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import web_search, x_search
client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
model="grok-4-1-fast-reasoning", # reasoning model
tools=[
web_search(),
x_search(),
],
include=["verbose_streaming"],
)
chat.append(user("what is the latest update from xAI?"))
is_thinking = True
for response, chunk in chat.stream():
# View the server-side tool calls as they are being made in real-time
for tool_call in chunk.tool_calls:
print(f"\\nCalling tool: {tool_call.function.name} with arguments: {tool_call.function.arguments}")
if response.usage.reasoning_tokens and is_thinking:
print(f"\\rThinking... ({response.usage.reasoning_tokens} tokens)", end="", flush=True)
if chunk.content and is_thinking:
print("\\n\\nFinal Response:")
is_thinking = False
if chunk.content and not is_thinking:
print(chunk.content, end="", flush=True)
print("\\n\\nCitations:")
print(response.citations)
print("\\n\\nUsage:")
print(response.usage)
print(response.server_side_tool_usage)
print("\\n\\nServer Side Tool Calls:")
print(response.tool_calls)
```
```pythonOpenAISDK
import os
from openai import OpenAI
api_key = os.getenv("XAI_API_KEY")
client = OpenAI(
api_key=api_key,
base_url="https://api.x.ai/v1",
)
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[
{
"role": "user",
"content": "what is the latest update from xAI?",
},
],
tools=[
{
"type": "web_search",
},
{
"type": "x_search",
},
],
)
print(response)
```
```pythonRequests
import os
import requests
url = "https://api.x.ai/v1/responses"
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {os.getenv('XAI_API_KEY')}"
}
payload = {
"model": "grok-4-1-fast-reasoning",
"input": [
{
"role": "user",
"content": "what is the latest update from xAI?"
}
],
"tools": [
{
"type": "web_search",
},
{
"type": "x_search",
}
]
}
response = requests.post(url, headers=headers, json=payload)
print(response.json())
```
```bash
curl https://api.x.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-d '{
"model": "grok-4-1-fast-reasoning",
"input": [
{
"role": "user",
"content": "What is the latest update from xAI?"
}
],
"tools": [
{
"type": "web_search"
},
{
"type": "x_search"
}
]
}'
```
2. When you want to collect up-to-date data from the Internet and perform calculations on that data, you can activate:
* Web search tool
* Code execution tool
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import web_search, code_execution
client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
model="grok-4-1-fast-reasoning", # reasoning model
# research_tools
tools=[
web_search(),
code_execution(),
],
include=["verbose_streaming"],
)
chat.append(user("What is the average market cap of the companies with the top 5 market cap in the US stock market today?"))
# sample or stream the response...
```
```pythonOpenAISDK
import os
from openai import OpenAI
api_key = os.getenv("XAI_API_KEY")
client = OpenAI(
api_key=api_key,
base_url="https://api.x.ai/v1",
)
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[
{
"role": "user",
"content": "What is the average market cap of the companies with the top 5 market cap in the US stock market today?",
},
],
# research_tools
tools=[
{
"type": "web_search",
},
{
"type": "code_interpreter",
},
],
)
print(response)
```
```pythonRequests
import os
import requests
url = "https://api.x.ai/v1/responses"
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {os.getenv('XAI_API_KEY')}"
}
payload = {
"model": "grok-4-1-fast-reasoning",
"input": [
{
"role": "user",
"content": "What is the average market cap of the companies with the top 5 market cap in the US stock market today?"
}
],
# research_tools
"tools": [
{
"type": "web_search",
},
{
"type": "code_interpreter",
},
]
}
response = requests.post(url, headers=headers, json=payload)
print(response.json())
```
```bash
curl https://api.x.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-d '{
"model": "grok-4-1-fast-reasoning",
"input": [
{
"role": "user",
"content": "What is the average market cap of the companies with the top 5 market cap in the US stock market today?"
}
],
"tools": [
{
"type": "web_search"
},
{
"type": "code_interpreter"
}
]
}'
```
## Using Images in the Context
You can bootstrap your requests with an initial conversation context that includes images.
In the code sample below, we pass an image into the context of the conversation before initiating an agentic request.
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import image, user
from xai_sdk.tools import web_search, x_search
# Create the client and define the server-side tools to use
client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
model="grok-4-1-fast-reasoning", # reasoning model
tools=[web_search(), x_search()],
include=["verbose_streaming"],
)
# Add an image to the conversation
chat.append(
user(
"Search the internet and tell me what kind of dog is in the image below.",
"And what is the typical lifespan of this dog breed?",
image(
"https://pbs.twimg.com/media/G3B7SweXsAAgv5N?format=jpg&name=900x900"
),
)
)
is_thinking = True
for response, chunk in chat.stream():
# View the server-side tool calls as they are being made in real-time
for tool_call in chunk.tool_calls:
print(f"\\nCalling tool: {tool_call.function.name} with arguments: {tool_call.function.arguments}")
if response.usage.reasoning_tokens and is_thinking:
print(f"\\rThinking... ({response.usage.reasoning_tokens} tokens)", end="", flush=True)
if chunk.content and is_thinking:
print("\\n\\nFinal Response:")
is_thinking = False
if chunk.content and not is_thinking:
print(chunk.content, end="", flush=True)
print("\\n\\nCitations:")
print(response.citations)
print("\\n\\nUsage:")
print(response.usage)
print(response.server_side_tool_usage)
print("\\n\\nServer Side Tool Calls:")
print(response.tool_calls)
```
===/developers/tools/citations===
#### Tools
# Citations
The agent tools API provides two types of citation information: **All Citations** (a complete list of all sources encountered) and **Inline Citations** (markdown-style links embedded directly in the response text).
## All Citations
The `citations` attribute on the `response` object provides a comprehensive list of URLs for all sources the agent encountered during its search process. This list is **always returned by default** — no additional configuration is required.
Citations are automatically collected from successful tool executions and provide full traceability of the agent's information sources. They are returned when the agentic request completes.
Note that not every URL in this list will necessarily be directly referenced in the final answer. The agent may examine a source during its research process and determine it is not sufficiently relevant to the user's query, but the URL will still appear in this list for transparency.
```pythonWithoutSDK
response.citations
```
```output
[
'https://x.com/i/user/1912644073896206336',
'https://x.com/i/status/1975607901571199086',
'https://x.ai/news',
'https://docs.x.ai/developers/release-notes',
...
]
```
## Inline Citations
Inline citations are **markdown-style links** (e.g., `[[1]](https://x.ai/news)`) inserted directly into the response text at the points where the model references sources. In addition to these visible links, **structured metadata** is available on the response object with precise positional information.
**Important**: Enabling inline citations does not guarantee that the model will cite sources on every answer. The model decides when and where to include citations based on the context and nature of the query.
### Enabling Inline Citations
Inline citations are returned by default with the Responses API. For the xAI SDK, you can explicitly request them with `include=["inline_citations"]`:
```bash customLanguage="bash"
curl https://api.x.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-d '{
"model": "grok-4-1-fast-reasoning",
"input": [
{"role": "user", "content": "What is xAI?"}
],
"tools": [{"type": "web_search"}]
}'
```
```python customLanguage="pythonXAI"
import os
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import web_search, x_search
client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
tools=[
web_search(),
x_search(),
],
include=["inline_citations"], # Enable inline citations
)
chat.append(user("What is xAI?"))
response = chat.sample()
# Access the response text (includes inline citation markdown)
print(response.content)
```
```python customLanguage="pythonOpenAISDK"
import os
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("XAI_API_KEY"),
base_url="https://api.x.ai/v1",
)
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[
{"role": "user", "content": "What is xAI?"}
],
tools=[
{"type": "web_search"},
],
)
# Get the message output with inline citations
for item in response.output:
if item.type == "message":
for content in item.content:
if content.type == "output_text":
print(content.text)
```
```javascript customLanguage="javascriptAISDK"
import { xai } from '@ai-sdk/xai';
import { generateText } from 'ai';
const { text, sources } = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
prompt: 'What is xAI?',
tools: {
web_search: xai.tools.webSearch(),
},
});
// Text includes inline citation markdown
console.log(text);
// Sources contain all citation URLs
console.log('Sources:', sources);
```
```javascript customLanguage="javascriptOpenAISDK"
import OpenAI from 'openai';
const client = new OpenAI({
apiKey: process.env.XAI_API_KEY,
baseURL: 'https://api.x.ai/v1',
});
const response = await client.responses.create({
model: 'grok-4-1-fast-reasoning',
input: [
{ role: 'user', content: 'What is xAI?' }
],
tools: [{ type: 'web_search' }],
});
// Get the message with inline citations
for (const item of response.output) {
if (item.type === 'message') {
for (const content of item.content) {
if (content.type === 'output_text') {
console.log(content.text);
}
}
}
}
```
### Markdown Citation Format
When inline citations are enabled, the model will insert markdown-style citation links directly into the response text:
```output
The latest announcements from xAI, primarily from their official X account (@xai) and website (x.ai/news), date back to November 19, 2025.[[1]](https://x.ai/news/)[[2]](https://x.ai/)[[3]](https://x.com/i/status/1991284813727474073)
```
When rendered as markdown, this displays as clickable links:
> The latest announcements from xAI, primarily from their official X account (@xai) and website (x.ai/news), date back to November 19, 2025.[\[1\]](https://x.ai/news/)[\[2\]](https://x.ai/)[\[3\]](https://x.com/i/status/1991284813727474073)
The format is `[[N]](url)` where:
* `N` is the sequential display number for the citation **starting from 1**
* `url` is the source URL
**Citation numbering**: Citation numbers always start from 1 and increment sequentially. If the same source is cited again later in the response, the original citation number will be reused.
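If you need to post-process these markers yourself, a regular expression over the response text is usually sufficient. Below is a minimal sketch in plain Python (a hypothetical helper, not part of any SDK) that pulls the number and URL out of each `[[N]](url)` marker:
```pythonWithoutSDK
import re

# Matches inline citation markers of the form [[N]](url)
CITATION_RE = re.compile(r"\[\[(\d+)\]\]\((https?://[^)]+)\)")

def extract_citations(text: str) -> list[tuple[int, str]]:
    """Return (number, url) pairs for every inline citation marker in the text."""
    return [(int(n), url) for n, url in CITATION_RE.findall(text)]

text = "xAI announced updates.[[1]](https://x.ai/news/)[[2]](https://x.ai/)"
print(extract_citations(text))  # [(1, 'https://x.ai/news/'), (2, 'https://x.ai/')]
```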
## Accessing Structured Inline Citation Data
Structured inline citation data provides precise positional information about each citation in the response text.
### Response Format
Each citation annotation contains:
| Field | Type | Description |
|-------|------|-------------|
| `type` | string | Always `"url_citation"` |
| `url` | string | The source URL |
| `start_index` | int | Character position where the citation starts in the response text |
| `end_index` | int | Character position where the citation ends (exclusive) |
| `title` | string | The citation number (e.g., "1", "2") |
```python customLanguage="pythonXAI"
# After streaming or sampling completes, access the structured inline citations:
for citation in response.inline_citations:
print(f"Citation [{citation.id}]:")
print(f" Position: {citation.start_index} to {citation.end_index}")
# Check citation type
if citation.HasField("web_citation"):
print(f" Web URL: {citation.web_citation.url}")
elif citation.HasField("x_citation"):
print(f" X URL: {citation.x_citation.url}")
```
```python customLanguage="pythonOpenAISDK"
# Access annotations from the response
for item in response.output:
if item.type == "message":
for content in item.content:
if content.type == "output_text":
for annotation in content.annotations:
print(f"Citation [{annotation.title}]:")
print(f" URL: {annotation.url}")
print(f" Position: {annotation.start_index} to {annotation.end_index}")
```
```javascript customLanguage="javascriptAISDK"
import { xai } from '@ai-sdk/xai';
import { streamText } from 'ai';
const { fullStream } = streamText({
model: xai.responses('grok-4-1-fast-reasoning'),
prompt: 'What is xAI?',
tools: {
web_search: xai.tools.webSearch(),
},
});
// Access sources as they stream in
for await (const part of fullStream) {
if (part.type === 'source' && part.sourceType === 'url') {
console.log(`Citation: ${part.url}`);
}
}
```
```javascript customLanguage="javascriptOpenAISDK"
// Access annotations from the response
for (const item of response.output) {
if (item.type === 'message') {
for (const content of item.content) {
if (content.type === 'output_text') {
for (const annotation of content.annotations) {
console.log(`Citation [${annotation.title}]:`);
console.log(` URL: ${annotation.url}`);
console.log(` Position: ${annotation.start_index} to ${annotation.end_index}`);
}
}
}
}
}
```
```output
Citation [1]:
Position: 37 to 76
Web URL: https://x.ai/news/grok-4-fast
Citation [2]:
Position: 124 to 171
X URL: https://x.com/xai/status/1234567890
```
### Using Position Indices
The `start_index` and `end_index` values follow Python slice convention:
* **`start_index`**: Character position of the first `[` of the citation
* **`end_index`**: Character position immediately *after* the closing `)` (exclusive)
Extract the exact citation markdown from the response text using a simple slice:
```python customLanguage="pythonXAI"
content = response.content
for citation in response.inline_citations:
# Extract the markdown link from the response text
citation_text = content[citation.start_index:citation.end_index]
print(f"Citation text: {citation_text}")
```
## Streaming Inline Citations
During streaming, inline citations are accumulated and available on the final response. The markdown links appear in real-time in the `chunk.content` as the model generates text:
```python customLanguage="pythonXAI"
for response, chunk in chat.stream():
# Markdown links appear in chunk.content in real-time
if chunk.content:
print(chunk.content, end="", flush=True)
# Inline citations can also be accessed per-chunk during streaming
for citation in chunk.inline_citations:
print(f"\nNew citation: [{citation.id}]")
# After streaming, access all accumulated inline citations
print("\n\nAll inline citations:")
for citation in response.inline_citations:
url = ""
if citation.HasField("web_citation"):
url = citation.web_citation.url
elif citation.HasField("x_citation"):
url = citation.x_citation.url
print(f" [{citation.id}] {url}")
```
===/developers/tools/code-execution===
#### Tools
# Code Execution Tool
The code execution tool enables Grok to write and execute Python code in real time, dramatically expanding its capabilities beyond text generation. It allows Grok to perform precise calculations, complex data analysis, and statistical computations, and to solve mathematical problems that would be impractical through text alone.
## Key Capabilities
* **Mathematical Computations**: Solve complex equations, perform statistical analysis, and handle numerical calculations with precision
* **Data Analysis**: Process datasets and extract insights from data provided in the prompt
* **Financial Modeling**: Build financial models, calculate risk metrics, and perform quantitative analysis
* **Scientific Computing**: Handle scientific calculations, simulations, and data transformations
* **Code Generation & Testing**: Write, test, and debug Python code snippets in real-time
## When to Use Code Execution
The code execution tool is particularly valuable for:
* **Numerical Problems**: When you need exact calculations rather than approximations
* **Data Processing**: Analyzing complex data from the prompt
* **Complex Logic**: Multi-step calculations that require intermediate results
* **Verification**: Double-checking mathematical results or validating assumptions
## SDK Support
The code execution tool is available across multiple SDKs and APIs with different naming conventions:
| SDK/API | Tool Name | Description |
|---------|-----------|-------------|
| xAI SDK | `code_execution` | Native xAI SDK implementation |
| OpenAI Responses API | `code_interpreter` | Compatible with OpenAI's API format |
| Vercel AI SDK | `xai.tools.codeExecution()` | Vercel AI SDK integration |
This tool is also supported in all Responses API compatible SDKs.
## Implementation Example
Below are comprehensive examples showing how to integrate the code execution tool across different platforms and use cases.
### Basic Calculations
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import code_execution
client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
model="grok-4-1-fast-reasoning", # reasoning model
tools=[code_execution()],
include=["verbose_streaming"],
)
# Ask for a mathematical calculation
chat.append(user("Calculate the compound interest for $10,000 at 5% annually for 10 years"))
is_thinking = True
for response, chunk in chat.stream():
# View the server-side tool calls as they are being made in real-time
for tool_call in chunk.tool_calls:
print(f"\\nCalling tool: {tool_call.function.name} with arguments: {tool_call.function.arguments}")
if response.usage.reasoning_tokens and is_thinking:
print(f"\\rThinking... ({response.usage.reasoning_tokens} tokens)", end="", flush=True)
if chunk.content and is_thinking:
print("\\n\\nFinal Response:")
is_thinking = False
if chunk.content and not is_thinking:
print(chunk.content, end="", flush=True)
print("\\n\\nCitations:")
print(response.citations)
print("\\n\\nUsage:")
print(response.usage)
print(response.server_side_tool_usage)
print("\\n\\nServer Side Tool Calls:")
print(response.tool_calls)
```
```pythonOpenAISDK
import os
from openai import OpenAI
api_key = os.getenv("XAI_API_KEY")
client = OpenAI(
api_key=api_key,
base_url="https://api.x.ai/v1",
)
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[
{
"role": "user",
"content": "Calculate the compound interest for $10,000 at 5% annually for 10 years",
},
],
tools=[
{
"type": "code_interpreter",
},
],
)
print(response)
```
```pythonRequests
import os
import requests
url = "https://api.x.ai/v1/responses"
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {os.getenv('XAI_API_KEY')}"
}
payload = {
"model": "grok-4-1-fast-reasoning",
"input": [
{
"role": "user",
"content": "Calculate the compound interest for $10,000 at 5% annually for 10 years"
}
],
"tools": [
{
"type": "code_interpreter",
}
]
}
response = requests.post(url, headers=headers, json=payload)
print(response.json())
```
```bash
curl https://api.x.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-d '{
"model": "grok-4-1-fast-reasoning",
"input": [
{
"role": "user",
"content": "Calculate the compound interest for $10,000 at 5% annually for 10 years"
}
],
"tools": [
{
"type": "code_interpreter"
}
]
}'
```
```javascriptAISDK
import { xai } from '@ai-sdk/xai';
import { generateText } from 'ai';
const { text } = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
prompt: 'Calculate the compound interest for $10,000 at 5% annually for 10 years',
tools: {
code_execution: xai.tools.codeExecution(),
},
});
console.log(text);
```
### Data Analysis
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import code_execution
client = Client(api_key=os.getenv("XAI_API_KEY"))
# Multi-turn conversation with data analysis
chat = client.chat.create(
model="grok-4-1-fast-reasoning", # reasoning model
tools=[code_execution()],
include=["verbose_streaming"],
)
# Step 1: Load and analyze data
chat.append(user("""
I have sales data for Q1-Q4: [120000, 135000, 98000, 156000].
Please analyze this data and create a visualization showing:
1. Quarterly trends
2. Growth rates
3. Statistical summary
"""))
print("##### Step 1: Data Analysis #####\\n")
is_thinking = True
for response, chunk in chat.stream():
# View the server-side tool calls as they are being made in real-time
for tool_call in chunk.tool_calls:
print(f"\\nCalling tool: {tool_call.function.name} with arguments: {tool_call.function.arguments}")
if response.usage.reasoning_tokens and is_thinking:
print(f"\\rThinking... ({response.usage.reasoning_tokens} tokens)", end="", flush=True)
if chunk.content and is_thinking:
print("\\n\\nAnalysis Results:")
is_thinking = False
if chunk.content and not is_thinking:
print(chunk.content, end="", flush=True)
print("\\n\\nCitations:")
print(response.citations)
print("\\n\\nUsage:")
print(response.usage)
print(response.server_side_tool_usage)
chat.append(response)
# Step 2: Follow-up analysis
chat.append(user("Now predict Q1 next year using linear regression"))
print("\\n\\n##### Step 2: Prediction Analysis #####\\n")
is_thinking = True
for response, chunk in chat.stream():
# View the server-side tool calls as they are being made in real-time
for tool_call in chunk.tool_calls:
print(f"\\nCalling tool: {tool_call.function.name} with arguments: {tool_call.function.arguments}")
if response.usage.reasoning_tokens and is_thinking:
print(f"\\rThinking... ({response.usage.reasoning_tokens} tokens)", end="", flush=True)
if chunk.content and is_thinking:
print("\\n\\nPrediction Results:")
is_thinking = False
if chunk.content and not is_thinking:
print(chunk.content, end="", flush=True)
print("\\n\\nCitations:")
print(response.citations)
print("\\n\\nUsage:")
print(response.usage)
print(response.server_side_tool_usage)
print("\\n\\nServer Side Tool Calls:")
print(response.tool_calls)
```
```javascriptAISDK
import { xai } from '@ai-sdk/xai';
import { generateText } from 'ai';
// Step 1: Load and analyze data
const step1 = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
prompt: `I have sales data for Q1-Q4: [120000, 135000, 98000, 156000].
Please analyze this data and create a visualization showing:
1. Quarterly trends
2. Growth rates
3. Statistical summary`,
tools: {
code_execution: xai.tools.codeExecution(),
},
});
console.log('##### Step 1: Data Analysis #####');
console.log(step1.text);
// Step 2: Follow-up analysis using previousResponseId
const step2 = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
prompt: 'Now predict Q1 next year using linear regression',
tools: {
code_execution: xai.tools.codeExecution(),
},
providerOptions: {
xai: {
previousResponseId: step1.response.id,
},
},
});
console.log('##### Step 2: Prediction Analysis #####');
console.log(step2.text);
```
## Best Practices
### 1. **Be Specific in Requests**
Provide clear, detailed instructions about what you want the code to accomplish:
```pythonWithoutSDK
# Good: Specific and clear
"Calculate the correlation matrix for these variables and highlight correlations above 0.7"
# Avoid: Vague requests
"Analyze this data"
```
### 2. **Provide Context and Data Format**
Always specify the data format and any constraints on the data, and provide as much context as possible:
```pythonWithoutSDK
# Good: Includes data format and requirements
"""
Here's my CSV data with columns: date, revenue, costs
Please calculate monthly profit margins and identify the best-performing month.
Data: [['2024-01', 50000, 35000], ['2024-02', 55000, 38000], ...]
"""
```
### 3. **Use Appropriate Model Settings**
* **Temperature**: Use lower values (0.0-0.3) for mathematical calculations
* **Model**: Use reasoning models like `grok-4-1-fast-reasoning` for better code generation
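As a minimal sketch, a request tuned for numerical work might look like the following (the temperature value is illustrative, not a prescribed setting):
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import code_execution

client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
    model="grok-4-1-fast-reasoning",  # reasoning model for better code generation
    tools=[code_execution()],
    temperature=0.1,  # low temperature for deterministic calculations
)
chat.append(user("Compute the monthly payment on a $300,000 loan at 6% annual interest over 30 years."))
print(chat.sample().content)
```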
## Common Use Cases
### Financial Analysis
```pythonWithoutSDK
# Portfolio optimization, risk calculations, option pricing
"Calculate the Sharpe ratio for a portfolio with returns [0.12, 0.08, -0.03, 0.15] and risk-free rate 0.02"
```
### Statistical Analysis
```pythonWithoutSDK
# Hypothesis testing, regression analysis, probability distributions
"Perform a t-test to compare these two groups and interpret the p-value: Group A: [23, 25, 28, 30], Group B: [20, 22, 24, 26]"
```
### Scientific Computing
```pythonWithoutSDK
# Simulations, numerical methods, equation solving
"Solve this differential equation using numerical methods: dy/dx = x^2 + y, with initial condition y(0) = 1"
```
## Limitations and Considerations
* **Execution Environment**: Code runs in a sandboxed Python environment with common libraries pre-installed
* **Time Limits**: Complex computations may have execution time constraints
* **Memory Usage**: Large datasets might hit memory limitations
* **Package Availability**: Most popular Python packages (NumPy, Pandas, Matplotlib, SciPy) are available
* **File I/O**: Limited file system access for security reasons
## Security Notes
* Code execution happens in a secure, isolated environment
* No access to external networks; file system access is restricted
* Temporary execution context that doesn't persist between requests
* All computations are stateless and secure
===/developers/tools/collections-search===
#### Tools
# Collections Search Tool
The collections search tool enables Grok to search through your uploaded knowledge bases (collections), allowing it to retrieve relevant information from your documents to provide more accurate and contextually relevant responses. This tool is particularly powerful for analyzing complex documents like financial reports, legal contracts, or technical documentation, where Grok can autonomously search through multiple documents and synthesize information to answer sophisticated analytical questions.
For an introduction to Collections, please check out the [Collections documentation](/developers/files/collections).
## Key Capabilities
* **Document Retrieval**: Search across uploaded files and collections to find relevant information
* **Semantic Search**: Find documents based on meaning and context, not just keywords
* **Knowledge Base Integration**: Seamlessly integrate your proprietary data with Grok's reasoning
* **RAG Applications**: Power retrieval-augmented generation workflows
* **Multi-format Support**: Search across PDFs, text files, CSVs, and other supported formats
## When to Use Collections Search
The collections search tool is particularly valuable for:
* **Enterprise Knowledge Bases**: When you need Grok to reference internal documents and policies
* **Financial Analysis**: Analyzing SEC filings, earnings reports, and financial statements across multiple documents
* **Customer Support**: Building chatbots that can answer questions based on your product documentation
* **Research & Due Diligence**: Synthesizing information from academic papers, technical reports, or industry analyses
* **Compliance & Legal**: Ensuring responses are grounded in your official guidelines and regulations
* **Personal Knowledge Management**: Organizing and querying your personal document collections
## SDK Support
The collections search tool is available across multiple SDKs and APIs with different naming conventions:
| SDK/API | Tool Name | Description |
|---------|-----------|-------------|
| xAI SDK | `collections_search` | Native xAI SDK implementation |
| OpenAI Responses API | `file_search` | Compatible with OpenAI's API format |
This tool is also supported in all Responses API compatible SDKs.
## Implementation Example
### End-to-End Financial Analysis Example
This comprehensive example demonstrates analyzing Tesla's SEC filings using the collections search tool. It covers:
1. Creating a collection for document storage
2. Uploading multiple financial documents concurrently (10-Q and 10-K filings)
3. Using Grok with collections search to analyze and synthesize information across documents in an agentic manner
4. Enabling code execution so the model can perform calculations and mathematical analysis when needed
5. Receiving cited responses and tool usage information
This pattern is applicable to any document analysis workflow where you need to search through and reason over multiple documents.
```pythonXAI
import asyncio
import os
import httpx
from xai_sdk import AsyncClient
from xai_sdk.chat import user
from xai_sdk.proto import collections_pb2
from xai_sdk.tools import code_execution, collections_search
TESLA_10_Q_PDF_URL = "https://ir.tesla.com/_flysystem/s3/sec/000162828025045968/tsla-20250930-gen.pdf"
TESLA_10_K_PDF_URL = "https://ir.tesla.com/_flysystem/s3/sec/000162828025003063/tsla-20241231-gen.pdf"
async def main():
client = AsyncClient(api_key=os.getenv("XAI_API_KEY"), management_api_key=os.getenv("XAI_MANAGEMENT_API_KEY"))
# Step 1: Create a collection for Tesla SEC filings
response = await client.collections.create("tesla-sec-filings")
print(f"Created collection: {response.collection_id}")
# Step 2: Upload documents to the collection concurrently
async def upload_document(
url: str, name: str, collection_id: str, http_client: httpx.AsyncClient
) -> None:
pdf_response = await http_client.get(url, timeout=30.0)
pdf_content = pdf_response.content
print(f"Uploading {name} document to collection")
response = await client.collections.upload_document(
collection_id=collection_id,
name=name,
data=pdf_content,
)
# Poll until document is processed and ready for search
response = await client.collections.get_document(response.file_metadata.file_id, collection_id)
print(f"Waiting for document {name} to be processed")
while response.status != collections_pb2.DOCUMENT_STATUS_PROCESSED:
await asyncio.sleep(3)
response = await client.collections.get_document(response.file_metadata.file_id, collection_id)
print(f"Document {name} processed")
# Upload both documents concurrently
async with httpx.AsyncClient() as http_client:
await asyncio.gather(
upload_document(TESLA_10_Q_PDF_URL, "tesla-10-Q-2024.pdf", response.collection_id, http_client),
upload_document(TESLA_10_K_PDF_URL, "tesla-10-K-2024.pdf", response.collection_id, http_client),
)
# Step 3: Create a chat with collections search enabled
chat = client.chat.create(
model="grok-4-1-fast-reasoning", # Use a reasoning model for better analysis
tools=[
collections_search(
collection_ids=[response.collection_id],
),
code_execution(),
],
include=["verbose_streaming"],
)
# Step 4: Ask a complex analytical question that requires searching multiple documents
chat.append(
user(
"How many consumer vehicles did Tesla produce in total in 2024 and 2025? "
"Show your working and cite your sources."
)
)
# Step 5: Stream the response and display reasoning progress
is_thinking = True
async for response, chunk in chat.stream():
# View server-side tool calls as they happen
for tool_call in chunk.tool_calls:
print(f"\\nCalling tool: {tool_call.function.name} with arguments: {tool_call.function.arguments}")
if response.usage.reasoning_tokens and is_thinking:
print(f"\\rThinking... ({response.usage.reasoning_tokens} tokens)", end="", flush=True)
if chunk.content and is_thinking:
print("\\n\\nFinal Response:")
is_thinking = False
if chunk.content and not is_thinking:
print(chunk.content, end="", flush=True)
latest_response = response
# Step 6: Review citations and tool usage
print("\\n\\nCitations:")
print(latest_response.citations)
print("\\n\\nUsage:")
print(latest_response.usage)
print(latest_response.server_side_tool_usage)
print("\\n\\nTool Calls:")
print(latest_response.tool_calls)
if __name__ == "__main__":
asyncio.run(main())
```
```pythonOpenAISDK
import os
from openai import OpenAI
# Using OpenAI SDK with xAI API (requires pre-created collection)
api_key = os.getenv("XAI_API_KEY")
client = OpenAI(
api_key=api_key,
base_url="https://api.x.ai/v1",
)
# Note: You must create the collection and upload documents first using either the xAI console (console.x.ai) or the xAI SDK
# The collection_id below should be replaced with your actual collection ID
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[
{
"role": "user",
"content": "How many consumer vehicles did Tesla produce in total in 2024 and 2025? Show your working and cite your sources.",
},
],
tools=[
{
"type": "file_search",
"vector_store_ids": ["your_collection_id_here"], # Replace with actual collection ID
"max_num_results": 10
},
{"type": "code_interpreter"}, # Enable code execution for calculations
],
)
print(response)
```
```javascriptAISDK
import { createOpenAI } from '@ai-sdk/openai';
import { streamText } from 'ai';
const openai = createOpenAI({
baseURL: 'https://api.x.ai/v1',
apiKey: process.env.XAI_API_KEY,
});
const result = streamText({
model: openai('grok-4-1-fast-reasoning'),
prompt: 'What documents do you have access to?',
tools: {
file_search: openai.tools.fileSearch({
vectorStoreIds: ['your-vector-store-id'],
maxNumResults: 5,
}),
},
});
```
```pythonRequests
import os
import requests
# Using raw requests (requires pre-created collection)
url = "https://api.x.ai/v1/responses"
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {os.getenv('XAI_API_KEY')}"
}
payload = {
"model": "grok-4-1-fast-reasoning",
"input": [
{
"role": "user",
"content": "How many consumer vehicles did Tesla produce in total in 2024 and 2025? Show your working and cite your sources."
}
],
"tools": [
{
"type": "file_search",
"vector_store_ids": ["your_collection_id_here"], # Replace with actual collection ID
"max_num_results": 10,
},
{"type": "code_interpreter"} # Enable code execution for calculations
]
}
response = requests.post(url, headers=headers, json=payload)
print(response.json())
```
```bash
# Using curl (requires pre-created collection)
curl https://api.x.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-d '{
"model": "grok-4-1-fast-reasoning",
"input": [
{
"role": "user",
"content": "How many consumer vehicles did Tesla produce in total in 2024 and 2025? Show your working and cite your sources."
}
],
"tools": [
{
"type": "file_search",
"vector_store_ids": ["your_collection_id_here"],
"max_num_results": 10
},
{
"type": "code_interpreter"
}
]
}'
```
## Example Output
When you run the Python xAI SDK example above, you'll see output like this showing the complete workflow from collection creation to the final analyzed response:
```output
Created collection: collection_3be0eec8-ee8e-4a18-a9d4-fb70a3150d64
Uploading tesla-10-K-2024.pdf document to collection
Uploading tesla-10-Q-2024.pdf document to collection
Waiting for document tesla-10-K-2024.pdf to be processed
Waiting for document tesla-10-Q-2024.pdf to be processed
Document tesla-10-Q-2024.pdf processed
Document tesla-10-K-2024.pdf processed
Thinking... (199 tokens)
Calling tool: collections_search with arguments: {"query":"Tesla total vehicle production 2024","limit":10}
Thinking... (370 tokens)
Calling tool: collections_search with arguments: {"query":"Tesla total vehicles produced 2024","limit":5}
Thinking... (370 tokens)
Calling tool: collections_search with arguments: {"query":"Tesla vehicles produced 2025 Q1 Q2 Q3","limit":5}
Thinking... (477 tokens)
Calling tool: collections_search with arguments: {"query":"Tesla production and deliveries 2024 all quarters","limit":10}
Thinking... (477 tokens)
Calling tool: collections_search with arguments: {"query":"Tesla production and deliveries 2025 Q1 Q2 Q3","limit":10}
Thinking... (569 tokens)
Calling tool: collections_search with arguments: {"query":"Tesla total production 2024 full year consumer vehicles","limit":5}
Thinking... (776 tokens)
Calling tool: collections_search with arguments: {"query":"Tesla annual vehicle production 2024 total","limit":5}
Thinking... (1148 tokens)
Calling tool: collections_search with arguments: {"query":"Tesla Q4 2024 production and deliveries","limit":5}
Thinking... (1240 tokens)
Calling tool: collections_search with arguments: {"query":"Tesla Q1 2024 vehicle production","limit":3}
Thinking... (1240 tokens)
Calling tool: collections_search with arguments: {"query":"Tesla Q2 2024 vehicle production","limit":3}
Thinking... (1240 tokens)
Calling tool: collections_search with arguments: {"query":"Tesla Q3 2024 vehicle production","limit":3}
Thinking... (1240 tokens)
Calling tool: collections_search with arguments: {"query":"Tesla Q4 2024 vehicle production","limit":3}
Thinking... (2804 tokens)
Calling tool: collections_search with arguments: {"query":"Tesla vehicle production nine months 2024","limit":5}
Thinking... (3352 tokens)
Final Response:
### Working
To answer this question, I used the collections_search tool to query Tesla's official
financial filings (10-K and 10-Q reports) for production numbers. Tesla reports vehicle
production and deliveries quarterly, and consumer vehicles refer to passenger vehicles
like Model 3, Model Y, Model S, Model X, and Cybertruck (excluding Tesla Semi or other
non-consumer products).
#### Step 1: 2024 Production
Based on Tesla's official quarterly production and delivery reports (aggregated from SEC
filings and press releases referenced in the collections), Tesla produced **1,773,443
consumer vehicles in 2024**.
- Q1 2024: 433,371 produced
- Q2 2024: 410,831 produced
- Q3 2024: 469,796 produced
- Q4 2024: 459,445 produced
#### Step 2: 2025 Production
The Q3 2025 10-Q filing explicitly states: "In 2025, we produced approximately 1,220,000
consumer vehicles [...] through the third quarter."
- This is the sum of Q1, Q2, and Q3 2025 production
- Q4 2025 data is not available as of November 13, 2025
#### Step 3: Total for 2024 and 2025
- 2024 full year: 1,773,443
- 2025 (through Q3): 1,220,000
- **Total: 2,993,443 consumer vehicles**
Citations:
['collections://collection_3be0eec8-ee8e-4a18-a9d4-fb70a3150d64/files/file_d4d1a968-9037-4caa-8eca-47a1563f28ab',
'collections://collection_3be0eec8-ee8e-4a18-a9d4-fb70a3150d64/files/file_ff41a42e-6cdc-4ca1-918a-160644d52704']
Usage:
completion_tokens: 1306
prompt_tokens: 383265
total_tokens: 387923
prompt_text_tokens: 383265
reasoning_tokens: 3352
cached_prompt_text_tokens: 177518
{'SERVER_SIDE_TOOL_COLLECTIONS_SEARCH': 13}
Tool Calls:
... (omitted for brevity)
```
### Understanding Collections Citations
When using the collections search tool, citations follow a special URI format that uniquely identifies the source documents:
```
collections://collection_id/files/file_id
```
For example:
```
collections://collection_3be0eec8-ee8e-4a18-a9d4-fb70a3150d64/files/file_d4d1a968-9037-4caa-8eca-47a1563f28ab
```
**Format Breakdown:**
* **`collections://`**: Protocol identifier indicating this is a collection-based citation
* **`collection_id`**: The unique identifier of the collection that was searched (e.g., `collection_3be0eec8-ee8e-4a18-a9d4-fb70a3150d64`)
* **`files/`**: Path segment indicating file-level reference
* **`file_id`**: The unique identifier of the specific document file that was referenced (e.g., `file_d4d1a968-9037-4caa-8eca-47a1563f28ab`)
These citations represent all the documents from your collections that Grok referenced during its search and analysis. Each citation points to a specific file within a collection, allowing you to trace back exactly which uploaded documents contributed to the final response.
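Because each citation is a plain URI, splitting out the two identifiers is straightforward. Here is a minimal sketch in plain Python (a hypothetical helper, not part of the SDK):
```pythonWithoutSDK
def parse_collections_citation(uri: str) -> tuple[str, str]:
    """Split a collections:// citation URI into (collection_id, file_id)."""
    prefix = "collections://"
    if not uri.startswith(prefix):
        raise ValueError(f"Not a collections citation: {uri}")
    collection_id, _, file_id = uri[len(prefix):].partition("/files/")
    return collection_id, file_id

uri = "collections://collection_3be0eec8-ee8e-4a18-a9d4-fb70a3150d64/files/file_d4d1a968-9037-4caa-8eca-47a1563f28ab"
print(parse_collections_citation(uri))
```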
### Key Observations
1. **Autonomous Search Strategy**: Grok autonomously performs 13 different searches across the documents, progressively refining queries to find specific quarterly and annual production data.
2. **Reasoning Process**: The output shows reasoning tokens accumulating (199 → 3,352 tokens), demonstrating how the model thinks through the problem before generating the final response.
3. **Cited Sources**: All information is grounded in the uploaded documents with specific file citations, ensuring transparency and verifiability.
4. **Structured Analysis**: The final response breaks down the methodology, shows calculations, and clearly states assumptions and limitations (e.g., Q4 2025 data not yet available).
5. **Token Efficiency**: Notice the high number of cached prompt tokens (177,518) - this demonstrates how the collections search tool efficiently reuses context across multiple queries.
## Combining Collections Search with Web Search/X-Search
One of the most powerful patterns is combining the collections search tool with web search/x-search to answer questions that require both your internal knowledge base and real-time external information. This enables sophisticated analysis that grounds responses in your proprietary data while incorporating current market intelligence, news, and public sentiment.
### Example: Internal Data + Market Intelligence
Building on the Tesla example above, let's analyze how market analysts view Tesla's performance based on the production numbers from our internal documents:
```pythonXAI
import asyncio
import httpx
from xai_sdk import AsyncClient
from xai_sdk.chat import user
from xai_sdk.proto import collections_pb2
from xai_sdk.tools import code_execution, collections_search, web_search, x_search
# ... (collection creation and document upload same as before)
async def hybrid_analysis(client: AsyncClient, collection_id: str, model: str) -> None:
# Enable collections search, web search, and code execution
chat = client.chat.create(
model=model,
tools=[
collections_search(
collection_ids=[collection_id],
),
web_search(), # Enable web search for external data
x_search(), # Enable x-search for external data
code_execution(), # Enable code execution for calculations
],
include=["verbose_streaming"],
)
# Ask a question that requires both internal and external information
chat.append(
user(
"Based on Tesla's actual production figures in my documents (collection), what is the "
"current market and analyst sentiment on their 2024-2025 vehicle production performance?"
)
)
is_thinking = True
async for response, chunk in chat.stream():
for tool_call in chunk.tool_calls:
print(f"\\nCalling tool: {tool_call.function.name} with arguments: {tool_call.function.arguments}")
if response.usage.reasoning_tokens and is_thinking:
print(f"\\rThinking... ({response.usage.reasoning_tokens} tokens)", end="", flush=True)
if chunk.content and is_thinking:
print("\\n\\nFinal Response:")
is_thinking = False
if chunk.content and not is_thinking:
print(chunk.content, end="", flush=True)
latest_response = response
print("\\n\\nCitations:")
print(latest_response.citations)
print("\\n\\nTool Usage:")
print(latest_response.server_side_tool_usage)
```
### How It Works
When you provide both `collections_search()` and `web_search()`/`x_search()` tools, Grok autonomously determines the optimal search strategy:
1. **Internal Analysis First**: Searches your uploaded Tesla SEC filings to extract actual production numbers
2. **External Context Gathering**: Performs web/x-search searches to find analyst reports, market sentiment, and production expectations
3. **Synthesis**: Combines both data sources to provide a comprehensive analysis comparing actual performance against market expectations
4. **Cited Sources**: Returns citations from both your internal documents (using `collections://` URIs) and external web sources (using `https://` URLs)
### Example Output Pattern
```output
Thinking... (201 tokens)
Calling tool: collections_search with arguments: {"query":"Tesla vehicle production figures 2024 2025","limit":20}
Thinking... (498 tokens)
Calling tool: collections_search with arguments: {"query":"Tesla quarterly vehicle production and deliveries 2024 2025","limit":20}
Thinking... (738 tokens)
Calling tool: web_search with arguments: {"query":"Tesla quarterly vehicle production and deliveries 2024 2025","num_results":10}
Thinking... (738 tokens)
Calling tool: web_search with arguments: {"query":"market and analyst sentiment Tesla vehicle production performance 2024 2025","num_results":10}
Thinking... (1280 tokens)
Final Response
... (omitted for brevity)
```
### Use Cases for Hybrid Search
This pattern is valuable for:
* **Market Analysis**: Compare internal financial data with external market sentiment and competitor performance
* **Competitive Intelligence**: Analyze your product performance against industry reports and competitor announcements
* **Compliance Verification**: Cross-reference internal policies with current regulatory requirements and industry standards
* **Strategic Planning**: Ground business decisions in both proprietary data and real-time market conditions
* **Customer Research**: Combine internal customer data with external reviews, social sentiment, and market trends
===/developers/tools/function-calling===
#### Tools
# Function Calling
Define custom tools that the model can invoke during a conversation. The model requests the call, you execute it locally, and return the result. This enables integration with databases, APIs, and any external system.
With streaming, the function call is returned whole in a single chunk rather than streamed across multiple chunks.
## How It Works
1. Define tools with a name, description, and JSON schema for parameters
2. Include tools in your request
3. Model returns a `tool_call` when it needs external data
4. Execute the function locally and return the result
5. Model continues with your result
## Quick Start
```bash customLanguage="bash"
curl https://api.x.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-d '{
"model": "grok-4-1-fast-reasoning",
"input": [
{"role": "user", "content": "What is the temperature in San Francisco?"}
],
"tools": [
{
"type": "function",
"name": "get_temperature",
"description": "Get current temperature for a location",
"parameters": {
"type": "object",
"properties": {
"location": {"type": "string", "description": "City name"},
"unit": {"type": "string", "enum": ["celsius", "fahrenheit"], "default": "fahrenheit"}
},
"required": ["location"]
}
}
]
}'
```
```pythonXAI
import os
import json
from xai_sdk import Client
from xai_sdk.chat import user, tool, tool_result
client = Client(api_key=os.getenv("XAI_API_KEY"))
# Define tools
tools = [
tool(
name="get_temperature",
description="Get current temperature for a location",
parameters={
"type": "object",
"properties": {
"location": {"type": "string", "description": "City name"},
"unit": {"type": "string", "enum": ["celsius", "fahrenheit"], "default": "fahrenheit"}
},
"required": ["location"]
},
),
]
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
tools=tools,
)
chat.append(user("What is the temperature in San Francisco?"))
response = chat.sample()
# Handle tool calls
if response.tool_calls:
chat.append(response)
for tc in response.tool_calls:
args = json.loads(tc.function.arguments)
# Execute your function
result = {"location": args["location"], "temperature": 59, "unit": args.get("unit", "fahrenheit")}
chat.append(tool_result(json.dumps(result)))
response = chat.sample()
print(response.content)
```
```pythonOpenAISDK
import os
import json
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("XAI_API_KEY"),
base_url="https://api.x.ai/v1",
)
tools = [
{
"type": "function",
"name": "get_temperature",
"description": "Get current temperature for a location",
"parameters": {
"type": "object",
"properties": {
"location": {"type": "string", "description": "City name"},
"unit": {"type": "string", "enum": ["celsius", "fahrenheit"], "default": "fahrenheit"}
},
"required": ["location"]
},
},
]
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[{"role": "user", "content": "What is the temperature in San Francisco?"}],
tools=tools,
)
# Handle function calls
for item in response.output:
if item.type == "function_call":
args = json.loads(item.arguments)
result = {"location": args["location"], "temperature": 59, "unit": args.get("unit", "fahrenheit")}
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[{"type": "function_call_output", "call_id": item.call_id, "output": json.dumps(result)}],
tools=tools,
previous_response_id=response.id,
)
for item in response.output:
if item.type == "message":
print(item.content[0].text)
```
```javascriptAISDK
import { xai } from '@ai-sdk/xai';
import { streamText, tool } from 'ai';
import { z } from 'zod';
const result = streamText({
model: xai.responses('grok-4-1-fast-reasoning'),
tools: {
getTemperature: tool({
description: 'Get current temperature for a location',
parameters: z.object({
location: z.string().describe('City name'),
unit: z.enum(['celsius', 'fahrenheit']).default('fahrenheit'),
}),
execute: async ({ location, unit }) => ({
location,
temperature: unit === 'fahrenheit' ? 59 : 15,
unit,
}),
}),
},
prompt: 'What is the temperature in San Francisco?',
});
for await (const chunk of result.fullStream) {
if (chunk.type === 'text-delta') {
process.stdout.write(chunk.text);
}
}
```
## Defining Tools with Pydantic
Use Pydantic models for type-safe parameter schemas:
```pythonXAI
from typing import Literal
from pydantic import BaseModel, Field
from xai_sdk.chat import tool
class TemperatureRequest(BaseModel):
location: str = Field(description="City and state, e.g. San Francisco, CA")
unit: Literal["celsius", "fahrenheit"] = Field("fahrenheit", description="Temperature unit")
class CeilingRequest(BaseModel):
location: str = Field(description="City and state, e.g. San Francisco, CA")
# Generate JSON schema from Pydantic models
tools = [
tool(
name="get_temperature",
description="Get current temperature for a location",
parameters=TemperatureRequest.model_json_schema(),
),
tool(
name="get_ceiling",
description="Get current cloud ceiling for a location",
parameters=CeilingRequest.model_json_schema(),
),
]
```
```pythonOpenAISDK
from typing import Literal
from pydantic import BaseModel, Field
class TemperatureRequest(BaseModel):
location: str = Field(description="City and state, e.g. San Francisco, CA")
unit: Literal["celsius", "fahrenheit"] = Field("fahrenheit", description="Temperature unit")
class CeilingRequest(BaseModel):
location: str = Field(description="City and state, e.g. San Francisco, CA")
tools = [
{
"type": "function",
"name": "get_temperature",
"description": "Get current temperature for a location",
"parameters": TemperatureRequest.model_json_schema(),
},
{
"type": "function",
"name": "get_ceiling",
"description": "Get current cloud ceiling for a location",
"parameters": CeilingRequest.model_json_schema(),
},
]
```
## Handling Tool Calls
When the model wants to use your tool, execute the function and return the result:
```pythonXAI
import json
def get_temperature(location: str, unit: str = "fahrenheit") -> dict:
# In production, call a real weather API
temp = 59 if unit == "fahrenheit" else 15
return {"location": location, "temperature": temp, "unit": unit}
def get_ceiling(location: str) -> dict:
return {"location": location, "ceiling": 15000, "unit": "ft"}
tools_map = {
"get_temperature": get_temperature,
"get_ceiling": get_ceiling,
}
chat.append(user("What's the weather in Denver?"))
response = chat.sample()
# Process tool calls
if response.tool_calls:
chat.append(response)
for tool_call in response.tool_calls:
name = tool_call.function.name
args = json.loads(tool_call.function.arguments)
result = tools_map[name](**args)
chat.append(tool_result(json.dumps(result)))
response = chat.sample()
print(response.content)
```
```pythonOpenAISDK
import json
def get_temperature(location: str, unit: str = "fahrenheit") -> dict:
temp = 59 if unit == "fahrenheit" else 15
return {"location": location, "temperature": temp, "unit": unit}
tools_map = {"get_temperature": get_temperature}
# Process function calls
for item in response.output:
if item.type == "function_call":
name = item.name
args = json.loads(item.arguments)
if name not in tools_map:
output = json.dumps({"error": f"Unknown function: {name}"})
else:
output = json.dumps(tools_map[name](**args))
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[{"type": "function_call_output", "call_id": item.call_id, "output": output}],
tools=tools,
previous_response_id=response.id,
)
for item in response.output:
if item.type == "message":
print(item.content[0].text)
```
## Combining with Built-in Tools
Function calling works alongside built-in agentic tools. The model can use web search, then call your custom function:
```pythonXAI
from xai_sdk.chat import tool
from xai_sdk.tools import web_search, x_search
tools = [
web_search(), # Built-in: runs on xAI servers
x_search(), # Built-in: runs on xAI servers
tool( # Custom: runs on your side
name="save_to_database",
description="Save research results to the database",
parameters={
"type": "object",
"properties": {
"data": {"type": "string", "description": "Data to save"}
},
"required": ["data"]
},
),
]
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
tools=tools,
)
```
```pythonOpenAISDK
tools = [
{"type": "web_search"}, # Built-in
{"type": "x_search"}, # Built-in
{ # Custom
"type": "function",
"name": "save_to_database",
"description": "Save research results to the database",
"parameters": {
"type": "object",
"properties": {
"data": {"type": "string", "description": "Data to save"}
},
"required": ["data"]
},
},
]
```
When mixing tools:
* **Built-in tools** execute automatically on xAI servers
* **Custom tools** pause execution and return to you for handling
See [Advanced Usage](/developers/tools/advanced-usage#mixing-server-side-and-client-side-tools) for complete examples with tool loops.
## Tool Choice
Control when the model uses tools:
| Value | Behavior |
|-------|----------|
| `"auto"` | Model decides whether to call a tool (default) |
| `"required"` | Model must call at least one tool |
| `"none"` | Disable tool calling |
| `{"type": "function", "function": {"name": "..."}}` | Force a specific tool |
## Parallel Function Calling
By default, parallel function calling is enabled — the model can request multiple tool calls in a single response. Process all of them before continuing:
```pythonWithoutSDK
# response.tool_calls may contain multiple calls
for tool_call in response.tool_calls:
result = tools_map[tool_call.function.name](**json.loads(tool_call.function.arguments))
# Append each result...
```
Disable with `parallel_tool_calls: false` in your request.
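For example, in a raw request payload (a sketch mirroring the payloads above):
```pythonWithoutSDK
payload = {
    "model": "grok-4-1-fast-reasoning",
    "input": [{"role": "user", "content": "What's the temperature and cloud ceiling in Denver?"}],
    "tools": tools,  # function definitions as above
    "parallel_tool_calls": False,  # model requests at most one tool call per turn
}
```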
## Tool Schema Reference
| Field | Required | Description |
|-------|----------|-------------|
| `name` | Yes | Unique identifier (max 200 tools per request) |
| `description` | Yes | What the tool does — helps the model decide when to use it |
| `parameters` | Yes | JSON Schema defining function inputs |
### Parameter Schema
```json
{
  "type": "object",
  "properties": {
    "location": {
      "type": "string",
      "description": "City name"
    },
    "unit": {
      "type": "string",
      "enum": ["celsius", "fahrenheit"],
      "default": "celsius"
    }
  },
  "required": ["location"]
}
```
## Complete Vercel AI SDK Example
The Vercel AI SDK handles tool definition, execution, and the request/response loop automatically:
```javascriptAISDK
import { xai } from '@ai-sdk/xai';
import { streamText, tool, stepCountIs } from 'ai';
import { z } from 'zod';
const result = streamText({
model: xai.responses('grok-4-1-fast-reasoning'),
tools: {
getCurrentTemperature: tool({
description: 'Get current temperature for a location',
parameters: z.object({
location: z.string().describe('City and state, e.g. San Francisco, CA'),
unit: z.enum(['celsius', 'fahrenheit']).default('fahrenheit'),
}),
execute: async ({ location, unit }) => ({
location,
temperature: unit === 'fahrenheit' ? 59 : 15,
unit,
}),
}),
getCurrentCeiling: tool({
description: 'Get current cloud ceiling for a location',
parameters: z.object({
location: z.string().describe('City and state'),
}),
execute: async ({ location }) => ({
location,
ceiling: 15000,
ceiling_type: 'broken',
unit: 'ft',
}),
}),
},
stopWhen: stepCountIs(5),
prompt: "What's the temperature and cloud ceiling in San Francisco?",
});
for await (const chunk of result.fullStream) {
switch (chunk.type) {
case 'text-delta':
process.stdout.write(chunk.text);
break;
case 'tool-call':
console.log(`Tool call: ${chunk.toolName}`, chunk.args);
break;
case 'tool-result':
console.log(`Tool result: ${chunk.toolName}`, chunk.result);
break;
}
}
```
===/developers/tools/overview===
#### Tools
# Overview
The xAI API supports **tool calling**, enabling Grok to perform actions beyond generating text—like searching the web, executing code, querying your data, or calling your own custom functions. Tools extend what's possible with the API and let you build powerful, interactive applications.
## Types of Tools
The xAI API offers two categories of tools:
| Type | Description | Examples |
|------|-------------|----------|
| **Built-in Tools** | Server-side tools managed by xAI that execute automatically | Web Search, X Search, Code Interpreter, Collections Search |
| **Function Calling** | Custom functions you define that the model can invoke | Database queries, API calls, custom business logic |
Built-in tools run on xAI's servers—you provide the tool configuration, and the API handles execution and returns results. Function calling lets you define your own tools that the model can request, giving you full control over what happens when they're invoked.
## Pricing
Tool requests are priced based on two components: **token usage** and **tool invocations**. Since the model may call multiple tools to answer a query, costs scale with complexity.
For more details on Tools pricing, please check out [the pricing page](/developers/models#tools-pricing).
## How It Works
When you provide tools to a request, the xAI API can use them to gather information or perform actions:
1. **Analyzes the query** and determines what information or actions are needed
2. **Decides what to do next**: Make a tool call, or provide a final answer
3. **Executes the tool** (for built-in tools) or returns a tool call request (for function calling)
4. **Processes results** and continues until sufficient information is gathered
5. **Returns the final response** with citations where applicable
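For function calling, steps 2-4 run through your code. Below is a minimal sketch of that loop, assuming the `chat`, `tools_map`, and `tool_result` helpers from the [Function Calling](/developers/tools/function-calling) page:
```pythonWithoutSDK
import json

# Keep sampling until the model stops requesting client-side tools.
response = chat.sample()
while response.tool_calls:
    for tc in response.tool_calls:
        result = tools_map[tc.function.name](**json.loads(tc.function.arguments))
        chat.append(tool_result(json.dumps(result)))
    response = chat.sample()
print(response.content)  # final answer, with citations where applicable
```
Built-in tools never appear in this loop; they execute on xAI's servers before the response is returned.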
## Quick Start
```bash customLanguage="bash"
curl https://api.x.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-d '{
"model": "grok-4-1-fast-reasoning",
"stream": true,
"input": [
{
"role": "user",
"content": "What are the latest updates from xAI?"
}
],
"tools": [
{ "type": "web_search" },
{ "type": "x_search" },
{ "type": "code_interpreter" }
]
}'
```
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import web_search, x_search, code_execution
client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
tools=[
web_search(),
x_search(),
code_execution(),
],
)
chat.append(user("What are the latest updates from xAI?"))
for response, chunk in chat.stream():
    if chunk.content:
        print(chunk.content, end="", flush=True)
print("\nCitations:", response.citations)
```
```pythonOpenAISDK
import os
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("XAI_API_KEY"),
base_url="https://api.x.ai/v1",
)
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[
{"role": "user", "content": "What are the latest updates from xAI?"}
],
tools=[
{"type": "web_search"},
{"type": "x_search"},
{"type": "code_interpreter"},
],
stream=True,
)
for event in response:
    if hasattr(event, 'delta') and hasattr(event.delta, 'content'):
        print(event.delta.content, end="", flush=True)
```
```javascriptAISDK
import { xai } from '@ai-sdk/xai';
import { streamText } from 'ai';
const { fullStream } = streamText({
model: xai.responses('grok-4-1-fast-reasoning'),
prompt: 'What are the latest updates from xAI?',
tools: {
web_search: xai.tools.webSearch(),
x_search: xai.tools.xSearch(),
code_execution: xai.tools.codeExecution(),
},
});
for await (const part of fullStream) {
if (part.type === 'text-delta') {
process.stdout.write(part.text);
} else if (part.type === 'source' && part.sourceType === 'url') {
console.log(`Citation: ${part.url}`);
}
}
```
```javascriptOpenAISDK
import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.XAI_API_KEY,
baseURL: "https://api.x.ai/v1",
});
const stream = await client.responses.create({
model: "grok-4-1-fast-reasoning",
input: [
{ role: "user", content: "What are the latest updates from xAI?" }
],
tools: [
{ type: "web_search" },
{ type: "x_search" },
{ type: "code_interpreter" },
],
stream: true,
});
for await (const event of stream) {
if (event.delta?.content) {
process.stdout.write(event.delta.content);
}
}
```
## Citations
The API automatically returns source URLs for information gathered via tools. See [Citations](/developers/tools/citations) for details on accessing and using citation data.
## Next Steps
* **[Function Calling](/developers/tools/function-calling)** - Define custom tools the model can call
* **[Web Search](/developers/tools/web-search)** - Search the web and browse pages
* **[X Search](/developers/tools/x-search)** - Search X posts, users, and threads
* **[Code Execution](/developers/tools/code-execution)** - Execute Python code in a sandbox
* **[Collections Search](/developers/tools/collections-search)** - Query your uploaded documents
* **[Citations](/developers/tools/citations)** - Access source URLs and inline citations
===/developers/tools/remote-mcp===
#### Tools
# Remote MCP Tools
Remote MCP Tools allow Grok to connect to external MCP (Model Context Protocol) servers, extending its capabilities with custom tools from third parties or your own implementations. Simply specify a server URL and optional configuration - xAI manages the MCP server connection and interaction on your behalf.
## SDK Support
Remote MCP tools are supported in the xAI native SDK and the OpenAI compatible Responses API.
The `require_approval` and `connector_id` parameters in the OpenAI Responses API are not currently supported.
## Configuration
To use remote MCP tools, you need to configure the connection to your MCP server in the tools array of your request.
| Parameter | Required | Description |
|-----------|-------------------|-------------|
| `server_url` | Yes | The URL of the MCP server to connect to. Only Streaming HTTP and SSE transports are supported. |
| `server_label` | No | A label to identify the server (used for tool call prefixing) |
| `server_description` | No | A description of what the server provides |
| `allowed_tool_names` | No | List of specific tool names to allow (empty allows all) |
| `authorization` | No | A token that will be set in the Authorization header on requests to the MCP server |
| `extra_headers` | No | Additional headers to include in requests |
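Putting these parameters together, a fully specified configuration might look like the sketch below (every value is a placeholder; parameter names are as listed above):
```pythonXAI
from xai_sdk.tools import mcp

# Illustrative configuration; all values below are placeholders.
internal_tools = mcp(
    server_url="https://tools.example.com/mcp",    # Streaming HTTP or SSE endpoint
    server_label="internal",                       # used to prefix tool call names
    server_description="Internal data lookup tools",
    allowed_tool_names=["lookup_customer"],        # restrict to specific tools
    authorization="<token>",                       # set in the Authorization header
    extra_headers={"X-Example-Header": "value"},   # additional request headers
)
```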
### Basic MCP Tool Usage
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import mcp
client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
tools=[
mcp(server_url="https://mcp.deepwiki.com/mcp"),
],
include=["verbose_streaming"],
)
chat.append(user("What can you do with https://github.com/xai-org/xai-sdk-python?"))
is_thinking = True
for response, chunk in chat.stream():
    # View the server-side tool calls as they are being made in real-time
    for tool_call in chunk.tool_calls:
        print(f"\nCalling tool: {tool_call.function.name} with arguments: {tool_call.function.arguments}")
    if response.usage.reasoning_tokens and is_thinking:
        print(f"\rThinking... ({response.usage.reasoning_tokens} tokens)", end="", flush=True)
    if chunk.content and is_thinking:
        print("\n\nFinal Response:")
        is_thinking = False
    if chunk.content and not is_thinking:
        print(chunk.content, end="", flush=True)

print("\n\nUsage:")
print(response.usage)
print(response.server_side_tool_usage)
print("\n\nServer Side Tool Calls:")
print(response.tool_calls)
```
```pythonOpenAISDK
import os
from openai import OpenAI
api_key = os.getenv("XAI_API_KEY")
client = OpenAI(
api_key=api_key,
base_url="https://api.x.ai/v1",
)
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[
{
"role": "user",
"content": "What can you do with https://github.com/xai-org/xai-sdk-python?",
},
],
tools=[
{
"type": "mcp",
"server_url": "https://mcp.deepwiki.com/mcp",
"server_label": "deepwiki",
}
],
)
print(response)
```
```pythonRequests
import os
import requests
url = "https://api.x.ai/v1/responses"
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {os.getenv('XAI_API_KEY')}"
}
payload = {
"model": "grok-4-1-fast-reasoning",
"input": [
{
"role": "user",
"content": "What can you do with https://github.com/xai-org/xai-sdk-python?"
}
],
"tools": [
{
"type": "mcp",
"server_url": "https://mcp.deepwiki.com/mcp",
"server_label": "deepwiki",
}
]
}
response = requests.post(url, headers=headers, json=payload)
print(response.json())
```
```bash
curl https://api.x.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-d '{
"model": "grok-4-1-fast-reasoning",
"input": [
{
"role": "user",
"content": "What can you do with https://github.com/xai-org/xai-sdk-python?"
}
],
"tools": [
{
"type": "mcp",
"server_url": "https://mcp.deepwiki.com/mcp",
"server_label": "deepwiki"
}
]
}'
```
## Tool Enablement and Access Control
When you configure a Remote MCP Tool without specifying `allowed_tool_names`, all tool definitions exposed by the MCP server are automatically injected into the model's context. This means the model gains access to every tool that the MCP server provides, allowing it to use any of them during the conversation.
For example, if an MCP server exposes 10 different tools and you don't specify `allowed_tool_names`, all 10 tool definitions will be available to the model. The model can then choose to call any of these tools based on the user's request and the tool descriptions.
Use the `allowed_tool_names` parameter to selectively enable only specific tools from an MCP server. This gives you several benefits:
* **Better Performance**: Reduce context overhead by limiting the tool definitions the model needs to consider
* **Reduced Risk**: For example, allow only read-only tools to prevent the model from modifying data
```pythonXAI
# Enable only specific tools from a server with many available tools
mcp(
server_url="https://comprehensive-tools.example.com/mcp",
allowed_tool_names=["search_database", "format_data"]
)
```
Instead of giving the model access to every tool the server offers, this approach keeps Grok focused and efficient while ensuring it has exactly the capabilities it needs.
## Multi-Server Support
Enable multiple MCP servers simultaneously to create a rich ecosystem of specialized tools:
```pythonXAI
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
tools=[
mcp(server_url="https://mcp.deepwiki.com/mcp", server_label="deepwiki"),
mcp(server_url="https://your-custom-tools.com/mcp", server_label="custom"),
mcp(server_url="https://api.example.com/tools", server_label="api-tools"),
],
)
```
Each server can provide different capabilities - documentation tools, API integrations, custom business logic, or specialized data processing - all accessible within a single conversation.
## Best Practices
* **Provide clear server metadata**: Use descriptive `server_label` and `server_description` when configuring multiple MCP servers to help the model understand each server's purpose and select the right tools
* **Filter tools appropriately**: Use `allowed_tool_names` to restrict access to only necessary tools, especially when servers have many tools since the model must keep all available tool definitions in context
* **Use secure connections**: Always use HTTPS URLs and implement proper authentication mechanisms on your MCP server
* **Provide Examples**: While the model can generally figure out which tools to use from the tool descriptions and the user request, it may help to provide examples in the prompt
===/developers/tools/streaming-and-sync===
#### Tools
# Streaming & Synchronous Requests
Agentic requests can be executed in either streaming or synchronous mode. This page covers both approaches and how to use them effectively.
## Streaming Mode (Recommended)
We strongly recommend using streaming mode when using agentic tool calling. It provides:
* **Real-time observability** of tool calls as they happen
* **Immediate feedback** during potentially long-running requests
* **Reasoning token counts** as the model thinks
### Streaming Example
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import code_execution, web_search, x_search
client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
tools=[
web_search(),
x_search(),
code_execution(),
],
include=["verbose_streaming"],
)
chat.append(user("What are the latest updates from xAI?"))
is_thinking = True
for response, chunk in chat.stream():
    # View server-side tool calls in real-time
    for tool_call in chunk.tool_calls:
        print(f"\nCalling tool: {tool_call.function.name}")
    if response.usage.reasoning_tokens and is_thinking:
        print(f"\rThinking... ({response.usage.reasoning_tokens} tokens)", end="", flush=True)
    if chunk.content and is_thinking:
        print("\n\nFinal Response:")
        is_thinking = False
    if chunk.content and not is_thinking:
        print(chunk.content, end="", flush=True)

print("\nCitations:", response.citations)
```
```javascriptAISDK
import { xai } from '@ai-sdk/xai';
import { streamText } from 'ai';
const { fullStream } = streamText({
model: xai.responses('grok-4-1-fast-reasoning'),
prompt: 'What are the latest updates from xAI?',
tools: {
web_search: xai.tools.webSearch(),
x_search: xai.tools.xSearch(),
code_execution: xai.tools.codeExecution(),
},
});
for await (const part of fullStream) {
  if (part.type === 'tool-call') {
    console.log(`Calling tool: ${part.toolName}`);
  } else if (part.type === 'text-delta') {
    process.stdout.write(part.text);
  } else if (part.type === 'source' && part.sourceType === 'url') {
    console.log(`Citation: ${part.url}`);
  }
}
```
## Synchronous Mode
For simpler use cases or when you want to wait for the complete agentic workflow to finish before processing the response, you can use synchronous requests:
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import code_execution, web_search, x_search
client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
tools=[
web_search(),
x_search(),
code_execution(),
],
)
chat.append(user("What is the latest update from xAI?"))
# Get the final response in one go once it's ready
response = chat.sample()
print("Final Response:")
print(response.content)
print("\nCitations:")
print(response.citations)
print("\nUsage:")
print(response.usage)
print(response.server_side_tool_usage)
```
```javascriptAISDK
import { xai } from '@ai-sdk/xai';
import { generateText } from 'ai';
// Synchronous request - waits for complete response
const { text, sources } = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
prompt: 'What is the latest update from xAI?',
tools: {
web_search: xai.tools.webSearch(),
x_search: xai.tools.xSearch(),
code_execution: xai.tools.codeExecution(),
},
});
console.log('Final Response:');
console.log(text);
console.log('\nCitations:');
console.log(sources);
```
Synchronous requests will wait for the entire agentic process to complete before returning. This is simpler for basic use cases but provides less visibility into intermediate steps.
## Using Tools with Responses API
We also support using the Responses API in both streaming and non-streaming modes:
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import web_search, x_search
client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
store_messages=True, # Enable Responses API
tools=[
web_search(),
x_search(),
],
)
chat.append(user("What is the latest update from xAI?"))
response = chat.sample()
print(response.content)
print(response.citations)
# The response id can be used to continue the conversation
print(response.id)
```
```pythonOpenAISDK
import os
from openai import OpenAI
api_key = os.getenv("XAI_API_KEY")
client = OpenAI(
api_key=api_key,
base_url="https://api.x.ai/v1",
)
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[
{
"role": "user",
"content": "what is the latest update from xAI?",
},
],
tools=[
{
"type": "web_search",
},
{
"type": "x_search",
},
],
)
print(response)
```
```bash
curl https://api.x.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-d '{
"model": "grok-4-1-fast-reasoning",
"input": [
{
"role": "user",
"content": "what is the latest update from xAI?"
}
],
"tools": [
{
"type": "web_search"
},
{
"type": "x_search"
}
]
}'
```
## Accessing Tool Outputs
By default, server-side tool call outputs are not returned since they can be large. However, you can opt in to receive them:
### xAI SDK
| Tool | Value for `include` field |
|------|---------------------------|
| `"web_search"` | `"web_search_call_output"` |
| `"x_search"` | `"x_search_call_output"` |
| `"code_execution"` | `"code_execution_call_output"` |
| `"collections_search"` | `"collections_search_call_output"` |
| `"attachment_search"` | `"attachment_search_call_output"` |
| `"mcp"` | `"mcp_call_output"` |
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import code_execution
client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
tools=[
code_execution(),
],
include=["code_execution_call_output"],
)
chat.append(user("What is the 100th Fibonacci number?"))
# stream or sample the response...
```
### Responses API
| Tool | Responses API tool name | Value for `include` field |
|------|-------------------------|---------------------------|
| `"web_search"` | `"web_search"` | `"web_search_call.action.sources"` |
| `"code_execution"` | `"code_interpreter"` | `"code_interpreter_call.outputs"` |
| `"collections_search"` | `"file_search"` | `"file_search_call.results"` |
| `"mcp"` | `"mcp"` | Always returned in Responses API |
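For example, to opt in to code execution outputs through the Responses API (a sketch using the `include` value from the table above):
```pythonOpenAISDK
# Sketch: request code interpreter outputs via the `include` field.
response = client.responses.create(
    model="grok-4-1-fast-reasoning",
    input=[{"role": "user", "content": "What is the 100th Fibonacci number?"}],
    tools=[{"type": "code_interpreter"}],
    include=["code_interpreter_call.outputs"],
)
```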
===/developers/tools/tool-usage-details===
#### Tools
# Tool Usage Details
This page covers the technical details of how tool calls are tracked, billed, and how to understand token usage in agentic requests.
## Real-time Server-side Tool Calls
When streaming agentic requests, you can observe **every tool call decision** the model makes in real-time via the `tool_calls` attribute on the `chunk` object:
```pythonWithoutSDK
for tool_call in chunk.tool_calls:
    print(f"\nCalling tool: {tool_call.function.name} with arguments: {tool_call.function.arguments}")
```
**Note**: Only the tool call invocations are shown — **server-side tool call outputs are not returned** in the API response unless you opt in via the `include` field (see [Streaming & Synchronous Requests](/developers/tools/streaming-and-sync#accessing-tool-outputs)). The agent uses these outputs internally to formulate its final response.
## Server-side Tool Calls vs Tool Usage
The API provides two related but distinct metrics for server-side tool executions:
### `tool_calls` - All Attempted Calls
```pythonWithoutSDK
response.tool_calls
```
Returns a list of all **attempted** tool calls made during the agentic process. Each entry contains:
* `id`: Unique identifier for the tool call
* `function.name`: The name of the specific server-side tool called
* `function.arguments`: The parameters passed to the server-side tool
This includes **every tool call attempt**, even if some fail.
### `server_side_tool_usage` - Successful Calls (Billable)
```pythonWithoutSDK
response.server_side_tool_usage
```
Returns a map of successfully executed tools and their invocation counts. This represents only the tool calls that returned meaningful responses and **determines your billing**.
```output
{'SERVER_SIDE_TOOL_X_SEARCH': 3, 'SERVER_SIDE_TOOL_WEB_SEARCH': 2}
```
## Tool Call Function Names vs Usage Categories
The function names in `tool_calls` represent the precise name of the tool invoked, while the entries in `server_side_tool_usage` provide a high-level categorization that aligns with the original tool passed in the `tools` array.
| Usage Category | Function Name(s) |
|----------------|------------------|
| `SERVER_SIDE_TOOL_WEB_SEARCH` | `web_search`, `web_search_with_snippets`, `browse_page` |
| `SERVER_SIDE_TOOL_X_SEARCH` | `x_user_search`, `x_keyword_search`, `x_semantic_search`, `x_thread_fetch` |
| `SERVER_SIDE_TOOL_CODE_EXECUTION` | `code_execution` |
| `SERVER_SIDE_TOOL_VIEW_X_VIDEO` | `view_x_video` |
| `SERVER_SIDE_TOOL_VIEW_IMAGE` | `view_image` |
| `SERVER_SIDE_TOOL_COLLECTIONS_SEARCH` | `collections_search` |
| `SERVER_SIDE_TOOL_MCP` | `{server_label}.{tool_name}` if `server_label` provided, otherwise `{tool_name}` |
## When Tool Calls and Usage Differ
In most cases, `tool_calls` and `server_side_tool_usage` will show the same tools. However, they can differ when:
* **Failed tool executions**: The model attempts to browse a non-existent webpage, fetch a deleted X post, or encounters other execution errors
* **Invalid parameters**: Tool calls with malformed arguments that can't be processed
* **Network or service issues**: Temporary failures in the tool execution pipeline
The agentic system handles these failures gracefully, updating its trajectory and continuing with alternative approaches when needed.
**Billing Note**: Only successful tool executions (`server_side_tool_usage`) are billed. Failed attempts are not charged.
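To see how many attempts failed, you can compare the two metrics (a sketch, assuming a sampled `response` and that `server_side_tool_usage` behaves like the map shown above):
```pythonWithoutSDK
# Attempted calls include failures; usage counts only billable successes.
attempted = len(response.tool_calls)
successful = sum(response.server_side_tool_usage.values())
print(f"{attempted - successful} attempted tool call(s) were not billed")
```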
## Understanding Token Usage
Agentic requests have unique token usage patterns compared to standard chat completions:
### `completion_tokens`
Represents **only the final text output** of the model. This is typically much smaller than you might expect, as the agent performs all its intermediate reasoning and tool orchestration internally.
### `prompt_tokens`
Represents the **cumulative input tokens** across all inference requests made during the agentic process. Each request includes the full conversation history up to that point, which grows as the agent progresses.
While this can result in higher `prompt_tokens` counts, agentic requests benefit significantly from **prompt caching**. The majority of the prompt remains unchanged between steps, allowing for efficient caching.
### `reasoning_tokens`
Represents the tokens used for the model's internal reasoning process. This includes planning tool calls, analyzing results, and formulating responses, but excludes the final output tokens.
### `cached_prompt_text_tokens`
Indicates how many prompt tokens were served from cache rather than recomputed. Higher values indicate better cache utilization and lower costs.
### `prompt_image_tokens`
Represents tokens from visual content that the agent processes. These are counted separately from text tokens. If no images or videos are processed, this value will be zero.
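In the xAI SDK these counters are exposed on `response.usage`; the sketch below prints them after an agentic request (field names as described above):
```pythonWithoutSDK
# Inspect token usage after sampling a response.
usage = response.usage
print("completion_tokens:        ", usage.completion_tokens)          # final output only
print("prompt_tokens:            ", usage.prompt_tokens)              # cumulative inputs
print("reasoning_tokens:         ", usage.reasoning_tokens)           # internal reasoning
print("cached_prompt_text_tokens:", usage.cached_prompt_text_tokens)  # served from cache
print("prompt_image_tokens:      ", usage.prompt_image_tokens)        # visual content
```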
## Limiting Tool Call Turns
The `max_turns` parameter allows you to control the maximum number of assistant/tool-call turns the agent can perform during a single request.
### Understanding Turns vs Tool Calls
**Important**: `max_turns` does **not** directly limit the number of individual tool calls. Instead, it limits the number of assistant turns in the agentic loop. During a single turn, the model may invoke multiple tools in parallel.
A "turn" represents one iteration of the agentic reasoning loop:
1. The model analyzes the current context
2. The model decides to call one or more tools (potentially in parallel)
3. Tools execute and return results
4. The model processes the results
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import web_search, x_search
client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
tools=[
web_search(),
x_search(),
],
max_turns=3, # Limit to 3 assistant/tool-call turns
)
chat.append(user("What is the latest news from xAI?"))
response = chat.sample()
print(response.content)
```
### When to Use `max_turns`
| Use Case | Recommended `max_turns` | Tradeoff |
|----------|------------------------|----------|
| **Quick lookups** | 1-2 | Fastest response, may miss deeper insights |
| **Balanced research** | 3-5 | Good balance of speed and thoroughness |
| **Deep research** | 10+ or unset | Most comprehensive, longer latency and higher cost |
### Default Behavior
If `max_turns` is not specified, the server applies a global default cap. When the agent reaches the limit, it will stop making additional tool calls and generate a final response based on information gathered so far.
## Identifying Tool Call Types
To determine whether a returned tool call is a client-side tool that needs local execution:
### Using xAI SDK
Use the `get_tool_call_type` function:
```pythonXAI
from xai_sdk.tools import get_tool_call_type
for tool_call in response.tool_calls:
    print(get_tool_call_type(tool_call))
```
| Tool call types | Description |
|---------------|-------------|
| `"client_side_tool"` | Client-side tool call - requires local execution |
| `"web_search_tool"` | Web-search tool - handled by xAI server |
| `"x_search_tool"` | X-search tool - handled by xAI server |
| `"code_execution_tool"` | Code-execution tool - handled by xAI server |
| `"collections_search_tool"` | Collections-search tool - handled by xAI server |
| `"mcp_tool"` | MCP tool - handled by xAI server |
### Using Responses API
Check the `type` field of output entries (`response.output[].type`):
| Types | Description |
|-------|-------------|
| `"function_call"` | Client-side tool - requires local execution |
| `"web_search_call"` | Web-search tool - handled by xAI server |
| `"x_search_call"` | X-search tool - handled by xAI server |
| `"code_interpreter_call"` | Code-execution tool - handled by xAI server |
| `"file_search_call"` | Collections-search tool - handled by xAI server |
| `"mcp_call"` | MCP tool - handled by xAI server |
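A simple dispatch over `response.output` might look like this sketch (only `function_call` entries require local execution):
```pythonOpenAISDK
# Separate client-side calls from server-side ones in a Responses API result.
SERVER_SIDE_TYPES = {"web_search_call", "x_search_call", "code_interpreter_call",
                     "file_search_call", "mcp_call"}

for item in response.output:
    if item.type == "function_call":
        print(f"Run locally: {item.name}({item.arguments})")
    elif item.type in SERVER_SIDE_TYPES:
        print(f"Already handled by xAI server: {item.type}")
```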
===/developers/tools/web-search===
#### Tools
# Web Search
The Web Search tool enables Grok to search the web in real time and browse web pages to find information, extracting relevant content to answer queries with up-to-date information.
## SDK Support
| SDK/API | Tool Name |
|---------|-----------|
| xAI SDK | `web_search` |
| OpenAI Responses API | `web_search` |
| Vercel AI SDK | `xai.tools.webSearch()` |
This tool is also supported in all Responses API compatible SDKs.
## Basic Usage
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import web_search
client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
model="grok-4-1-fast-reasoning", # reasoning model
tools=[web_search()],
include=["verbose_streaming"],
)
chat.append(user("What is xAI?"))
is_thinking = True
for response, chunk in chat.stream():
    for tool_call in chunk.tool_calls:
        print(f"\nCalling tool: {tool_call.function.name} with arguments: {tool_call.function.arguments}")
    if response.usage.reasoning_tokens and is_thinking:
        print(f"\rThinking... ({response.usage.reasoning_tokens} tokens)", end="", flush=True)
    if chunk.content and is_thinking:
        print("\n\nFinal Response:")
        is_thinking = False
    if chunk.content and not is_thinking:
        print(chunk.content, end="", flush=True)

print("\n\nCitations:")
print(response.citations)
```
```pythonOpenAISDK
import os
from openai import OpenAI
api_key = os.getenv("XAI_API_KEY")
client = OpenAI(
api_key=api_key,
base_url="https://api.x.ai/v1",
)
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[
{
"role": "user",
"content": "What is xAI?",
},
],
tools=[
{
"type": "web_search",
},
],
)
print(response)
```
```javascriptAISDK
import { xai } from '@ai-sdk/xai';
import { generateText } from 'ai';
const { text, sources } = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
prompt: 'What is xAI?',
tools: {
web_search: xai.tools.webSearch(),
},
});
console.log(text);
console.log('Citations:', sources);
```
```bash
curl https://api.x.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-d '{
"model": "grok-4-1-fast-reasoning",
"input": [
{
"role": "user",
"content": "What is xAI?"
}
],
"tools": [
{
"type": "web_search"
}
]
}'
```
## Web Search Parameters
| Parameter | Description |
|-----------|-------------|
| `allowed_domains` | Only search within specific domains (max 5) |
| `excluded_domains` | Exclude specific domains from search (max 5) |
| `enable_image_understanding` | Enable analysis of images found during browsing |
### Only Search in Specific Domains
Use `allowed_domains` to restrict the web search tool so that it **only** searches and browses web pages within the specified domains.
`allowed_domains` cannot be set together with `excluded_domains` in the same request.
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import web_search
client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
tools=[
web_search(allowed_domains=["grokipedia.com"]),
],
)
chat.append(user("What is xAI?"))
# stream or sample the response...
```
```pythonOpenAISDK
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[{"role": "user", "content": "What is xAI?"}],
tools=[
{
"type": "web_search",
"filters": {"allowed_domains": ["grokipedia.com"]},
},
],
)
```
```javascriptAISDK
const { text } = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
prompt: 'What is xAI?',
tools: {
web_search: xai.tools.webSearch({
allowedDomains: ['grokipedia.com'],
}),
},
});
```
### Exclude Specific Domains
Use `excluded_domains` to prevent the model from including the specified domains in any web search tool invocations.
```pythonXAI
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
tools=[
web_search(excluded_domains=["grokipedia.com"]),
],
)
```
```pythonOpenAISDK
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[{"role": "user", "content": "What is xAI?"}],
tools=[
{
"type": "web_search",
"filters": {"excluded_domains": ["grokipedia.com"]},
},
],
)
```
### Enable Image Understanding
Setting `enable_image_understanding` to true equips the agent with access to the `view_image` tool, allowing it to analyze images encountered during the search process.
When enabled, you will see `SERVER_SIDE_TOOL_VIEW_IMAGE` in `response.server_side_tool_usage` along with the number of times it was called.
Enabling this parameter for Web Search also enables image understanding for the X Search tool if it is included in the same request.
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import web_search
client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
tools=[
web_search(enable_image_understanding=True),
],
)
chat.append(user("What is included in the image in xAI's official website?"))
# stream or sample the response...
```
```pythonOpenAISDK
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[
{
"role": "user",
"content": "What is included in the image in xAI's official website?",
},
],
tools=[
{
"type": "web_search",
"enable_image_understanding": True,
},
],
)
```
```javascriptAISDK
const { text } = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
prompt: "What is included in the image in xAI's official website?",
tools: {
web_search: xai.tools.webSearch({
enableImageUnderstanding: true,
}),
},
});
```
## Citations
For details on how to retrieve and use citations from search results, see the [Citations](/developers/tools/citations) page.
===/developers/tools/x-search===
#### Tools
# X Search
The X Search tool enables Grok to perform keyword search, semantic search, user search, and thread fetch on X (formerly Twitter). This powerful tool allows the model to access real-time social media content, analyze posts, and gather insights from X's vast data.
## SDK Support
| SDK/API | Tool Name |
|---------|-----------|
| xAI SDK | `x_search` |
| OpenAI Responses API | `x_search` |
| Vercel AI SDK | `xai.tools.xSearch()` |
This tool is also supported in all Responses API compatible SDKs.
## Basic Usage
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import x_search
client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
model="grok-4-1-fast-reasoning", # reasoning model
tools=[x_search()],
include=["verbose_streaming"],
)
chat.append(user("What are people saying about xAI on X?"))
is_thinking = True
for response, chunk in chat.stream():
    for tool_call in chunk.tool_calls:
        print(f"\nCalling tool: {tool_call.function.name} with arguments: {tool_call.function.arguments}")
    if response.usage.reasoning_tokens and is_thinking:
        print(f"\rThinking... ({response.usage.reasoning_tokens} tokens)", end="", flush=True)
    if chunk.content and is_thinking:
        print("\n\nFinal Response:")
        is_thinking = False
    if chunk.content and not is_thinking:
        print(chunk.content, end="", flush=True)

print("\n\nCitations:")
print(response.citations)
```
```pythonOpenAISDK
import os
from openai import OpenAI
api_key = os.getenv("XAI_API_KEY")
client = OpenAI(
api_key=api_key,
base_url="https://api.x.ai/v1",
)
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[
{
"role": "user",
"content": "What are people saying about xAI on X?",
},
],
tools=[
{
"type": "x_search",
},
],
)
print(response)
```
```javascriptAISDK
import { xai } from '@ai-sdk/xai';
import { generateText } from 'ai';
const { text, sources } = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
prompt: 'What are people saying about xAI on X?',
tools: {
x_search: xai.tools.xSearch(),
},
});
console.log(text);
console.log('Citations:', sources);
```
```bash
curl https://api.x.ai/v1/responses \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $XAI_API_KEY" \
-d '{
"model": "grok-4-1-fast-reasoning",
"input": [
{
"role": "user",
"content": "What are people saying about xAI on X?"
}
],
"tools": [
{
"type": "x_search"
}
]
}'
```
## X Search Parameters
| Parameter | Description |
|-----------|-------------|
| `allowed_x_handles` | Only consider posts from specific X handles (max 10) |
| `excluded_x_handles` | Exclude posts from specific X handles (max 10) |
| `from_date` | Start date for search range (ISO8601 format) |
| `to_date` | End date for search range (ISO8601 format) |
| `enable_image_understanding` | Enable analysis of images in posts |
| `enable_video_understanding` | Enable analysis of videos in posts |
### Only Consider Posts from Specific Handles
Use `allowed_x_handles` to consider X posts only from a given list of X handles. The maximum number of handles you can include is 10.
`allowed_x_handles` cannot be set together with `excluded_x_handles` in the same request.
```pythonXAI
import os
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import x_search
client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
tools=[
x_search(allowed_x_handles=["elonmusk"]),
],
)
chat.append(user("What is the current status of xAI?"))
# stream or sample the response...
```
```pythonOpenAISDK
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[{"role": "user", "content": "What is the current status of xAI?"}],
tools=[
{
"type": "x_search",
"allowed_x_handles": ["elonmusk"],
},
],
)
```
```javascriptAISDK
const { text } = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
prompt: 'What is the current status of xAI?',
tools: {
x_search: xai.tools.xSearch({
allowedXHandles: ['elonmusk'],
}),
},
});
```
### Exclude Posts from Specific Handles
Use `excluded_x_handles` to prevent the model from including X posts from the specified handles in any X search tool invocations. The maximum number of handles you can exclude is 10.
```pythonXAI
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
tools=[
x_search(excluded_x_handles=["elonmusk"]),
],
)
```
```pythonOpenAISDK
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[{"role": "user", "content": "What is the current status of xAI?"}],
tools=[
{
"type": "x_search",
"excluded_x_handles": ["elonmusk"],
},
],
)
```
```javascriptAISDK
const { text } = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
prompt: 'What is the current status of xAI?',
tools: {
x_search: xai.tools.xSearch({
excludedXHandles: ['elonmusk'],
}),
},
});
```
### Date Range
You can restrict the date range of search data used by specifying `from_date` and `to_date`. This limits the data to the period from `from_date` to `to_date`, including both dates.
Both fields need to be in ISO8601 format, e.g., "YYYY-MM-DD". If you're using the xAI Python SDK, the `from_date` and `to_date` fields can be passed as `datetime.datetime` objects.
```pythonXAI
import os
from datetime import datetime
from xai_sdk import Client
from xai_sdk.chat import user
from xai_sdk.tools import x_search
client = Client(api_key=os.getenv("XAI_API_KEY"))
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
tools=[
x_search(
from_date=datetime(2025, 10, 1),
to_date=datetime(2025, 10, 10),
),
],
)
chat.append(user("What is the current status of xAI?"))
# stream or sample the response...
```
```pythonOpenAISDK
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[{"role": "user", "content": "What is the current status of xAI?"}],
tools=[
{
"type": "x_search",
"from_date": "2025-10-01",
"to_date": "2025-10-10",
},
],
)
```
```javascriptAISDK
const { text } = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
prompt: 'What is the current status of xAI?',
tools: {
x_search: xai.tools.xSearch({
fromDate: '2025-10-01',
toDate: '2025-10-10',
}),
},
});
```
### Enable Image Understanding
Setting `enable_image_understanding` to true allows the agent to analyze images in X posts encountered during the search process.
```pythonXAI
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
tools=[
x_search(enable_image_understanding=True),
],
)
```
```pythonOpenAISDK
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[{"role": "user", "content": "Find X posts with images about AI"}],
tools=[
{
"type": "x_search",
"enable_image_understanding": True,
},
],
)
```
```javascriptAISDK
const { text } = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
prompt: 'Find X posts with images about AI',
tools: {
x_search: xai.tools.xSearch({
enableImageUnderstanding: true,
}),
},
});
```
### Enable Video Understanding
Setting `enable_video_understanding` to true allows the agent to analyze videos in X posts. This is only available for X Search (not Web Search).
```pythonXAI
chat = client.chat.create(
model="grok-4-1-fast-reasoning",
tools=[
x_search(enable_video_understanding=True),
],
)
```
```pythonOpenAISDK
response = client.responses.create(
model="grok-4-1-fast-reasoning",
input=[{"role": "user", "content": "Find X posts with videos about AI"}],
tools=[
{
"type": "x_search",
"enable_video_understanding": True,
},
],
)
```
```javascriptAISDK
const { text } = await generateText({
model: xai.responses('grok-4-1-fast-reasoning'),
prompt: 'Find X posts with videos about AI',
tools: {
x_search: xai.tools.xSearch({
enableVideoUnderstanding: true,
}),
},
});
```
## Citations
For details on how to retrieve and use citations from search results, see the [Citations](/developers/tools/citations) page.
===/grok/apps/google-drive===
#### Grok Business / Enterprise
# Google Drive Integration with Grok
## Overview: Connect Google Drive to Grok
Seamlessly search and reference your Google Drive files directly in Grok chats. This integration lets Grok access your team's shared files and your personal files to provide more accurate, grounded responses—reducing hallucinations and helping you work faster.
Powered by xAI's Collections API, the connector indexes files securely while respecting Google Drive permissions. Grok only retrieves content you can view. Files you don't have permission to view are never indexed or returned.
**Key benefits**:
* Get summaries, analyses, or answers with direct citations to your files.
* No need to manually upload or attach files. Grok searches automatically when relevant.
* Query files by content or metadata (filename, folder, owner, modification dates).
This feature is available in Grok Business and Enterprise plans. xAI doesn't use customer Google Drive data to train its models.
## Using Google Drive Files in Grok Chats
Once connected, Grok automatically searches relevant files—no extra steps needed.
**Examples of what to ask**:
* "Summarize the Q4 sales report from the Finance team documents."
* "What does our employee handbook say about remote work policies according to our company documents?"
* "Summarize my Go-to-market strategy document."
**Grok will**:
* Search content and metadata.
* Provide answers with inline citations linking back to the source file.
* Reason over multiple files when needed.
## Setting Up the Integration
Setup combines admin configuration for team shared files and optional user connections for personal files.
### Admin Setup: Enable for Shared Files
Team admins configure the connector once at the workspace level.
**Prerequisites**:
* You must be a Grok Business or Enterprise team admin.
* You must have purchased Grok Business or Enterprise licenses for your team.
**Steps**:
1. Log in to the xAI Console and go to **[Grok Business Apps](https://console.x.ai/team/default/grok-business/apps)**
2. Click **[Add to team](https://console.x.ai/team/default/grok-business/apps?add-connector-type=CONNECTOR_TYPE_GOOGLE_DRIVE)** for the Google Drive app.
3. Specify your Google Workspace domain.
4. Choose who can use the connector: everyone in the workspace or specific allowed users.
5. Sign in with your Google account and grant permissions. The OAuth authentication provides a secure way to allow access without sharing passwords.
Once connected, Grok immediately begins syncing files accessible to the admin's account. Shared files become available to authorized users right away.
Admins can later edit allowed users or remove the connector entirely from the same settings page.
### User Setup: Connect Your Personal Drive
End users can optionally connect their own Google Drive for searching their private files.
**Steps**:
1. On grok.com, go to **[Settings > Connected apps](https://grok.com/?_s=grok-business-connected-apps)**.
2. Select **Google Drive > Connect**.
3. Sign in with your Google account and grant permissions.
4. Your private files will sync and become searchable in your Grok chats.
To disconnect: Return to **[Settings > Connected apps](https://grok.com/?_s=grok-business-connected-apps)** and revoke access.
## Managing Your Integration
* Admins can view sync status and the list of users who have authenticated with Google Drive from **[Apps Settings page](https://console.x.ai/team/default/grok-business/apps)**.
* Admins or members can disconnect anytime to stop syncing their files.
## How Indexing and Syncing Works
* Initial sync starts immediately after admin setup.
* Ongoing: Grok checks for changes (new/updated/deleted files, permission changes) every hour.
* Permissions are always enforced. Grok only shows you files you can view in Google Drive.
* No inclusion/exclusion rules beyond the admin's initial access scope and user permissions.
**Supported file formats**:
Grok indexes a wide range of common file types from Google Drive, including native Google formats, Microsoft Office documents, PDFs, code files, and more.
| Category | File Formats |
|----------|--------------|
| Documents & Presentations | Google Docs, Sheets, and Slides, Microsoft Word (.doc, .docx), Microsoft Excel (.xls, .xlsx, including macro-enabled workbooks), Microsoft PowerPoint (.ppt, .pptx, including macro-enabled presentations and slideshows), Microsoft Outlook (.msg, .pst), PDFs, OpenDocument Text (.odt), Rich Text Format (.rtf), EPUB e-books |
| Data & Structured Files | CSV (comma-separated values), JSON, XML |
| Web & Markup | HTML, CSS, Markdown (.md) |
| Code Files | Python, JavaScript, TypeScript, C/C++ header and source files, SQL, YAML, TOML, Shell scripts, Ruby, Scala, Swift, Kotlin, Lua, PHP, Perl |
| Notebooks | Jupyter Notebooks (.ipynb), Google Colab notebooks |
| Email & Other | Email messages (.eml, RFC822 format), Plain text (.txt), TeXmacs |
## Limitations
* For files exceeding 128 MB, Grok only indexes the first 128 MB of content.
* Sync checks hourly. Some recent changes may take up to an hour to appear.
* Only supported file types are indexed and searchable (see the list above).
## Frequently Asked Questions
**1. Why aren't my files appearing?**
Wait up to an hour for sync, or check permissions in Google Drive.
**2. Do I need to connect my personal Drive?**
No, shared files work via admin setup. Connect personal for your private files only.
**3. Can Grok edit files?**
No, read-only access for search and reference.
**4. How do I see which files were used?**
Grok includes citations in responses.
For troubleshooting or white-glove onboarding, please contact xAI support via .
===/grok/faq===
#### FAQ
# FAQ - Grok Website / Apps
While the documentation is mainly meant for our API users, you can find some commonly asked questions here for our consumer-facing website/apps.
## How can I link my X account sign-in/subscription to my xAI account?
On the [Grok Website](https://grok.com), go to Settings -> Account and click the **Connect your X Account** button. This will take you to X's SSO page to add your X account as a sign-in method for xAI.
xAI will be able to retrieve your X subscription status and grant relevant benefits after linking.
You can manage your sign-in methods at https://accounts.x.ai.
## How do I add/remove other sign-in methods or link my X subscription?
You can add/remove your sign-in methods at https://accounts.x.ai. Your account must have at least one sign-in method.
Linking or signing up with X account will automatically link your X account subscription status with xAI, which can be used on https://grok.com.
## I signed up for Grok / the xAI API with my X account. Why is xAI still asking for my email?
When you sign up with X, xAI will prompt you for an email address because X does not provide one. As a result, you can have different emails on your X account and your xAI account.
## I have issues using X, can I reach out to xAI for help?
While xAI provides the Grok in X service on X.com and X apps, it does not have operational oversight of X's service. You can contact X via their [Help Center](https://help.x.com/) or message [@premium on X](https://x.com/premium).
## How can I delete the account?
Your xAI account can be deleted by following the steps at [xAI Accounts](https://accounts.x.ai/account). If you are using the same account to access our API, your API access will be removed as well.
You can restore your account within 30 days by logging in again and confirming restoration.
## How do I unsubscribe?
If you have subscribed to SuperGrok, the cancellation path depends on where you purchased: go to https://grok.com -> Settings -> Billing to manage your subscription (purchased from the Grok website), [Request a refund for app](https://support.apple.com/118223) (purchased from the Apple App Store), or [Cancel, pause or change a subscription on Google Play](https://support.google.com/googleplay/answer/7018481) (purchased from Google Play).
If you have subscribed to X Premium, X (not xAI) is responsible for processing refunds where required by law. You can [submit a refund request from X](https://help.x.com/forms/x-refund-request). See more details regarding X Premium subscriptions on the [X Help Center](https://help.x.com/using-x/x-premium).
===/grok/faq/team-management===
#### FAQ
# Team Management
## What are teams?
Teams are the level at which xAI tracks API usage, processes billing, and issues invoices.
* If you’re the team creator and don’t need a new team, you can rename your Personal Team and add members instead of creating a new one.
* Each team has **roles**:
* **Admin**: Can modify team name, billing details, and manage members.
* **Member**: Cannot make these changes.
* The team creator is automatically an Admin.
## Which team am I on?
When you sign up for xAI, you’re automatically assigned to a **Personal Team**, which you can view in the top bar of [xAI Console](https://console.x.ai).
## How can I manage teams and team members?
### Create a Team
1. Click the dropdown menu in the xAI Console.
2. Select **+ Create Team**.
3. Follow the on-screen instructions. You can edit these details later.
### Rename or Describe a Team
Admins can update the team name and description on the [Settings page](https://console.x.ai/team/default/settings).
### Manage Team Members
Admins can add or remove members by email on the [Users page](https://console.x.ai/team/default/users).
* Assign members as **Admin** or **Member**.
* If a user is removed, their API keys remain with the team.
### Delete a Team
Deleting a team removes its prepaid credits.
To permanently delete a team:
1. Go to the [Settings page](https://console.x.ai/team/default/settings).
2. Follow the instructions under **Delete Team**.
## How do I automatically add users to my team with my organization's email domain?
Admins can enable automatic team joining for users with a shared email domain:
1. Go to the [Settings page](https://console.x.ai/team/default/settings).
2. Add the domain under **Verified Domains**.
3. Add a `domain-verification` key to your domain’s DNS TXT record to verify ownership.
Users signing up with a verified domain email will automatically join the team.
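After adding the record, you can confirm it has propagated from a terminal (an illustrative check; the exact key and value to publish are shown in the Console):
```bash
# Replace yourcompany.com with your domain; the record value comes from the Console.
dig TXT yourcompany.com +short
# The output should include your domain-verification record.
```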
===/grok/management===
#### Grok Business / Enterprise
# License & User Management
**The Grok Business overview page at [console.x.ai](https://console.x.ai) is your central hub for handling team licenses and user invitations.** As a team admin or user with appropriate permissions, you can buy licenses, invite new members, and manage access to ensure smooth collaboration.
Access this page by logging into [console.x.ai](https://console.x.ai) and navigating to the overview section. Note that actions like purchasing or provisioning require specific permissions—see the [Permissions](#permissions-and-troubleshooting) section for details.
## Purchasing Licenses
Expand your team's capabilities by buying additional licenses directly from the overview page.
Available license types:
* **SuperGrok:** Standard business access with enhanced quotas and features.
* **SuperGrok Heavy:** Upgraded performance for demanding workloads.
To purchase:
1. On the overview page, select the license type and quantity.
2. Enter payment details if prompted (requires billing read-write permissions).
3. Confirm the purchase—licenses will be added to your available pool for assignment.
Purchased licenses become immediately available for provisioning to users.
Ensure your team's billing is set up correctly to avoid interruptions. Visit [Billing Settings](https://console.x.ai/team/default/billing) for more details.
## Inviting Users
Invite new team members to join your Grok Business workspace with a simple email invitation process.
To invite:
1. On the overview page, click "Invite users to Grok Business".
2. Enter the users' email addresses.
3. Select a license type to auto-provision upon acceptance (requires team read-write permissions).
4. Send the invitation—the user will receive an email with a link to activate their account.
Invited users gain access to the team workspace and basic team read permissions (the latter allows sharing conversations with your team members).
View invited users in the "Pending invitations" list on the overview page. As long as you have unassigned licenses available, one will be automatically provisioned when the user accepts.
## Assigning and Revoking Licenses
Once licenses are purchased or available, assign them to users for full team workspace access.
To assign:
1. From the overview page, select a user from your team list.
2. Choose an available license and assign it—access activates immediately.
To revoke:
1. Click the "..." for the user and choose "Unassign License" from the dropdown.
2. Confirm the action—the license returns to your available pool, and the user will no longer have access to your team's workspace.
Revocations take effect instantly, so ensure that you communicate changes to affected users.
Revoking a license removes team workspace access. Users will retain personal workspace functionality.
## Canceling Licenses
Reduce your team's commitment by canceling unused licenses.
To cancel:
1. On the overview page, select the license type and quantity to cancel.
2. Submit the cancellation request (requires billing read-write permissions).
Cancellations may take a few days to process, and eligible refunds will be issued to your billing method. Canceled licenses are removed from your pool once processed.
## Permissions and Troubleshooting
Most management actions require specific role-based permissions:
* **Billing Read-Write:** Needed to purchase or cancel licenses.
* **Team Read-Write:** Needed to invite users or assign/revoke licenses.
These are typically granted only to team admins. If you lack permissions:
* Contact your team admin to request actions like license assignment or purchases.
* Admins can adjust permissions via the overview page's role settings.
If you encounter issues, such as invitations not provisioning due to insufficient licenses, purchase more or revoke unused ones first.
For white-glove support, Enterprise upgrades, or permission issues, contact xAI sales at .
===/grok/organization===
#### Grok Business / Enterprise
# Organization Management
**Organizations provide a higher-level governance structure for enterprise customers, encompassing multiple console teams under unified IT controls.** Available only to Enterprise tier subscribers, organizations enable centralized management of users, teams, and security features like SSO.
Access the organization dashboard by visiting [console.x.ai/organization](https://console.x.ai/organization). This page is restricted to organization admins.
Organizations are exclusive to the Enterprise tier. Contact xAI sales to upgrade if needed.
## Understanding Organizations
An organization acts as an overarching entity that groups related console teams, ideal for large enterprises with multiple business units or departments.
Key features:
* **Domain Association:** Link your organization to a specific email domain (e.g., @yourcompany.com). Any user signing up or logging in with an email from this domain is automatically associated with the organization.
* **User Visibility:** Organization admins can view a comprehensive list of all associated users across teams on the `/organization` page.
* **Team Association:** Teams created by organization members are automatically linked to the organization and displayed in the dashboard for oversight.
This structure supports a multi-team architecture, allowing independent Grok Business or API teams while maintaining centralized governance, such as uniform access controls and auditing.
## Viewing Users and Teams
To view users:
1. Navigate to [console.x.ai/organization](https://console.x.ai/organization).
2. Scroll to the "Users" section for a list of all domain-associated users, including their team affiliations and access status.
To view teams:
1. In the same dashboard, access the "Teams" section.
2. Review associated console teams, their members, and high-level usage metrics.
Use these views to ensure compliance, spot inactive accounts, or identify growth needs.
## Setting Up SSO
Secure and streamline logins by integrating Single Sign-On (SSO) with your preferred Identity Provider (IdP).
To configure SSO:
1. On the `/organization` page, click "Configure SSO".
2. Choose your IdP from the supported list (e.g., Okta, Azure AD, Google Workspace).
3. Follow the self-guided, IdP-specific instructions provided—each includes step-by-step setup, metadata exchange, and attribute mapping details.
4. Save your configuration and test SSO to confirm it works.
SSO setup is straightforward and tailored to common providers, ensuring quick deployment.
## Activating SSO and User Impact
Once configured, SSO will be activated and enforced organization-wide.
Post-activation:
* Users must log in via SSO on their next access.
* If a user selects "Log in with email" and enters a domain-associated address (e.g., @yourcompany.com), the system automatically detects it and redirects to your IdP for authentication.
* Non-domain emails (e.g., @differentcompany.com) fall back to standard login methods.
This ensures seamless, secure access without disrupting workflows.
Notify your users in advance about the SSO rollout to minimize
support queries.
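To make the routing concrete, here is a minimal Python sketch of the domain check described above. The organization domain, IdP URL, and function name are hypothetical placeholders, not xAI's implementation.

```python
# Illustrative only: models the post-activation login routing.
ORG_DOMAIN = "yourcompany.com"                      # your associated domain
IDP_LOGIN_URL = "https://idp.yourcompany.com/sso"   # hypothetical IdP URL

def login_route(email: str) -> str:
    """Return where a "Log in with email" attempt should be routed."""
    domain = email.rsplit("@", 1)[-1].lower()
    if domain == ORG_DOMAIN:
        return IDP_LOGIN_URL   # domain-associated: redirect to the IdP
    return "/login/email"      # other domains: standard login methods
```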
## Setting up SCIM
Automate user provisioning and deprovisioning by integrating System for Cross-domain Identity Management (SCIM) with your Identity Provider (IdP). Follow these steps to set up SCIM effectively.
### Step 1: Configure Groups in Your IdP
1. On the `/organization` page, click "Setup SCIM".
2. Follow the IdP-specific steps provided to sync your groups.
3. Create as many groups as needed in your IdP—typically named something like `xai-admin`, `xai-supergrok-heavy`, `xai-supergrok`, or whatever fits your organizational structure.
This step ensures your groups are ready for mapping and synchronization with xAI.
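For context, SCIM is a standard protocol (RFC 7643/7644), and group sync boils down to your IdP sending requests like the one sketched below. The endpoint URL and token here are placeholders; in practice the "Setup SCIM" flow supplies the real values, and your IdP sends these requests for you.

```python
import requests

# Placeholder values; the "Setup SCIM" flow supplies the real endpoint
# and credentials, which are normally configured inside your IdP.
SCIM_BASE = "https://api.example.com/scim/v2"
SCIM_TOKEN = "placeholder-bearer-token"

# A minimal SCIM 2.0 Group resource (RFC 7643) for one suggested group.
group = {
    "schemas": ["urn:ietf:params:scim:schemas:core:2.0:Group"],
    "displayName": "xai-supergrok",
}

resp = requests.post(
    f"{SCIM_BASE}/Groups",
    json=group,
    headers={"Authorization": f"Bearer {SCIM_TOKEN}"},
)
resp.raise_for_status()
print(resp.json()["id"])  # the server-assigned group ID
```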
### Step 2: Map Groups to Roles in xAI
Map your IdP groups to the available roles in xAI.
* The out-of-the-box roles include Admin, SuperGrok Heavy, SuperGrok, and Member (the default role assigned to users without any specified group).
* If you require more granular roles beyond these, contact your xAI representative to explore custom setup options—these standard roles are usually sufficient for most organizations.
This mapping aligns your IdP groups with xAI's role-based access controls.
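As a hedged illustration, the mapping amounts to a lookup from IdP group names to xAI roles, with Member as the fallback for users in no mapped group. The group names mirror the Step 1 examples; the code is a sketch, not xAI's implementation.

```python
# Illustrative group-to-role lookup; group names mirror Step 1.
GROUP_TO_ROLE = {
    "xai-admin": "Admin",
    "xai-supergrok-heavy": "SuperGrok Heavy",
    "xai-supergrok": "SuperGrok",
}

def role_for(idp_groups: list[str]) -> str:
    # Users without any mapped group receive the default Member role.
    for group in idp_groups:
        if group in GROUP_TO_ROLE:
            return GROUP_TO_ROLE[group]
    return "Member"
```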
### Step 3: Map Roles to Teams, Permissions, and Licenses
Assign your mapped roles to appropriate resources.
* Map roles to teams (usually just one primary team, but you may have more).
* Assign any relevant permissions.
* Typically, associate a license with the role.
This step customizes access and entitlements based on your organizational needs.
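Continuing the sketch, this step effectively fills in an entitlement table per role. The team name, permission labels, and license values below are hypothetical stand-ins for whatever you configure in the console.

```python
# Illustrative role-to-entitlement table; all values are placeholders.
ROLE_ENTITLEMENTS = {
    "Admin": {
        "teams": ["primary-team"],
        "permissions": ["Billing Read-Write", "Team Read-Write"],
        "license": "SuperGrok",
    },
    "SuperGrok Heavy": {
        "teams": ["primary-team"],
        "permissions": [],
        "license": "SuperGrok Heavy",
    },
    "Member": {
        "teams": ["primary-team"],
        "permissions": [],
        "license": None,  # no license: no team workspace access
    },
}
```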
### Step 4: Preview and Activate SCIM
Before finalizing, review the changes.
* We provide a preview of what your organization will look like after activation.
* Confirm that members are assigned to the correct roles, those roles have the appropriate level of authorization, and the right licenses are applied.
* Once you feel confident everything is correct, click **"Activate"** to make SCIM your default provisioning system.
This verification ensures a smooth transition.
SCIM activation is very disruptive. Users might gain access to resources
they did not have before, or lose access to resources they currently use. Notify your organization
that you are undergoing this transition and **verify everything is correct during the preview stage before proceeding.**
## Need Help?
For assistance with organization setup, SSO troubleshooting, or Enterprise features, contact xAI sales at [x.ai/grok/business/enquire](https://x.ai/grok/business/enquire).
===/grok/user-guide===
#### Grok Business / Enterprise
# Grok.com User Guide
**Grok Business provides dedicated workspaces for personal and team use, with enhanced privacy and sharing controls.** Switch between workspaces to access team-specific features and ensure your conversations are protected under business plan terms.
A team workspace offers:
* Privacy guarantees as outlined in xAI's [terms of service](https://x.ai/legal/terms-of-service-enterprise).
* Full benefits of SuperGrok (or SuperGrok Heavy for upgraded licenses).
* Secure sharing of conversations limited to active team members.
## Workspaces Overview
Grok Business features two types of workspaces:
* **Personal Workspace:** For individual use, available unless disabled by your organization.
* **Team Workspace:** For collaborative work within your team, accessible only with an active license.
To switch between workspaces, use the workspace selector in the bottom left navigation on grok.com. Ensure you are in the correct workspace before starting new conversations.
You can only access the team workspace when you have an
active license. If you lack access, contact your team admin.
## Privacy and Benefits
In your team workspace, enjoy enterprise-grade privacy protections as detailed in xAI's [terms of service](https://x.ai/legal/terms-of-service-enterprise). This includes data handling and, for the Enterprise tier, custom retention policies tailored for business use.
Additionally, unlock the full capabilities of SuperGrok, including higher usage quotas and advanced features. If your organization has an upgraded license, you may access SuperGrok Heavy for even more powerful performance.
Some users may not see a personal workspace. This indicates your organization has disabled
personal workspaces via an enterprise license. To enable or disable personal workspaces, reach out
to xAI sales for an Enterprise plan.
## Sharing Conversations
Sharing is restricted to your team for security:
* Share conversations only with team members who have active licenses.
* Share links are only accessible to licensed team members.
* If sent to non-team members or unlicensed team members, the link will not open.
To share a conversation:
1. Open the conversation in your team workspace.
2. Click the share button and select team members.
3. Generate and distribute the secure link.
View all shared conversations in your history at [https://grok.com/history?tab=shared-with-me](https://grok.com/history?tab=shared-with-me).
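The sharing rules above reduce to a single access check: the viewer must be a member of the conversation's team and hold an active license. Here is a toy Python model of that check (the names and structure are illustrative, not Grok's implementation):

```python
from dataclasses import dataclass, field

@dataclass
class Viewer:
    team_ids: set[str] = field(default_factory=set)
    has_active_license: bool = False

def can_open_share_link(viewer: Viewer, conversation_team_id: str) -> bool:
    # Non-team members and unlicensed team members are both rejected.
    return conversation_team_id in viewer.team_ids and viewer.has_active_license
```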
## Activating Your License
To activate or manage your license:
1. Visit your Grok Business overview at [console.x.ai](https://console.x.ai).
2. Click "Assign license" and select your license type.
3. If you encounter access issues or lack permissions, contact your team admin for assistance.
Once activated, your team workspace will become available on grok.com.
For white-glove support and Enterprise features, contact xAI sales at [x.ai/grok/business/enquire](https://x.ai/grok/business/enquire).