Advanced self-hosting topics

This guide covers advanced topics related to self-hosting.

Enable or disable telemetry

Braintrust can send the following types of telemetry from your self-hosted data plane to Braintrust’s control plane:

Type	Description
`status`	Health check information (enabled by default)
`metrics`	System metrics (CPU/memory) and Braintrust-specific metrics like indexing lag (enabled by default)
`usage`	Billing usage telemetry for aggregate usage metrics (enabled by default)
`memprof`	Memory profiling statistics and heap usage patterns
`logs`	Application logs
`traces`	Distributed tracing data

By default, status, metrics, and usage are enabled. You can change the defaults as follows:

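The setting lives in Terraform on AWS and in the Helm chart on GCP and Azure.

AWS: Add the monitoring_telemetry variable to your variables.tf file, and list the types of telemetry you want to send as a comma-separated value (the default below sends status, metrics, and usage; the validation condition lists every accepted type):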

variable "monitoring_telemetry" {
  description = <<-EOT
    The telemetry to send to Braintrust's control plane to monitor your deployment. Should be in the form of comma-separated values.

    Available options:
    - status: Health check information (default)
    - metrics: System metrics (CPU/memory) and Braintrust-specific metrics like indexing lag (default)
    - usage: Billing usage telemetry for aggregate usage metrics (default)
    - memprof: Memory profiling statistics and heap usage patterns
    - logs: Application logs
    - traces: Distributed tracing data
  EOT
  type        = string
  default     = "status,metrics,usage"

  validation {
    condition = var.monitoring_telemetry == "" || alltrue([
      for item in split(",", var.monitoring_telemetry) :
      contains(["metrics", "logs", "traces", "status", "memprof", "usage"], trimspace(item))
    ])
    error_message = "The monitoring_telemetry value must be a comma-separated list containing only: metrics, logs, traces, status, memprof, usage."
  }
}

GCP / Azure: Update the controlPlaneTelemetry setting in your Helm values.yaml file to include the types of telemetry you want to send:

# Global configs
global:
  orgName: "<your org name on Braintrust>"
  # When createNamespace is true, the namespace will be created and resources will be in global.namespace
  # When createNamespace is false, resources will use .Release.Namespace (the namespace specified during helm install/upgrade)
  createNamespace: false
  namespace: "braintrust"
  namespaceAnnotations: {}
  labels: {}
  controlPlaneTelemetry: "status,metrics,usage,logs,traces,memprof"
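
The allowed-value rule in the Terraform validation condition can also be checked before you deploy. The following is a minimal Python sketch, not part of Braintrust’s tooling: the function name validate_telemetry is hypothetical, and the allowed set is copied from the validation condition shown earlier.

```python
# Telemetry types accepted by the Terraform validation condition.
ALLOWED_TELEMETRY = {"status", "metrics", "usage", "memprof", "logs", "traces"}


def validate_telemetry(value: str) -> list[str]:
    """Return the telemetry types in a comma-separated string, or raise ValueError.

    An empty string is accepted, as in the Terraform condition.
    """
    if value == "":
        return []
    # Trim whitespace around each item, mirroring trimspace() in the condition.
    items = [item.strip() for item in value.split(",")]
    unknown = [item for item in items if item not in ALLOWED_TELEMETRY]
    if unknown:
        raise ValueError(f"unsupported telemetry types: {unknown}")
    return items


print(validate_telemetry("status, metrics,usage"))
```

The same check applies to the controlPlaneTelemetry string in values.yaml, assuming the Helm chart accepts the same set of types.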

Braintrust also has access to endpoints reporting metrics about the backfill and compaction status of Brainstore segments. This is metadata only, no customer data. To disable these endpoints, set the DISABLE_SYSADMIN_TELEMETRY environment variable to true.

If you disable telemetry, Braintrust’s ability to proactively monitor your deployment and diagnose issues will be significantly limited. Before disabling, consider the impact on support response times.

Secure sensitive customer data

Braintrust’s servers and employees do not need access to your data plane for it to operate. You can therefore protect it behind a firewall or VPN and isolate it from external access. When you use the Braintrust web application, your browser communicates directly with the data plane (via CORS), and the data does not flow through any intermediate system (the control plane or otherwise) before reaching your browser. With the default telemetry settings, the data plane sends status, metrics, and usage telemetry to the control plane, but it does not send logs, traces, or customer data.

Because of this architecture, our self-hosted customers do not generally list us as a subprocessor. As with any third-party software, you should establish appropriate controls to keep your deployment secure, and we’re happy to help you do so. The goal of the control plane and data plane split is to let you meet strict security and compliance requirements.

Grant browser permissions

If your data plane is deployed behind a VPN or on a private network (not accessible from the public internet), users will need to grant browser permissions to access it. When you enable the Data plane is on a private network setting in organization settings, the Braintrust UI will check for Chrome’s Local Network Access permission. When users access your Braintrust org for the first time and a private network data plane has been configured, Chrome will display a permission prompt. Users must click Allow to grant permission for the Braintrust UI to communicate with your data plane. If permissions are blocked, we will display a modal with the steps to correct the issue.

Local Network Access is a Chrome security feature that protects users from malicious websites accessing resources on their private networks. For more information, see Chrome’s Private Network Access documentation.

If you encounter connectivity issues, ensure that your browser is up to date, any corporate proxies or network policies allow browser access to the data plane, and CORS is properly configured on your data plane deployment (automatically handled by Braintrust Terraform modules).

Customize the webapp URL

The SDKs guide users to https://www.braintrust.dev (or the value of the BRAINTRUST_APP_URL variable) to view their experiments. In certain advanced configurations, you may want to reverse proxy SDK traffic to the BRAINTRUST_APP_URL while pointing users to a different URL. To do this, set the BRAINTRUST_APP_PUBLIC_URL environment variable to the URL of your webapp. By default, this variable takes the value of BRAINTRUST_APP_URL. It is used only for display, so the URL it points to does not need to be reachable from the SDK.

Constrain SDKs to the data plane

If you’re self-hosting the data plane, it may also be advantageous to constrain the SDKs to only communicate with your data plane. Normally, they communicate with the control plane to:

Get your data plane’s URL
Register and retrieve metadata (e.g. about experiments)
Print URLs to the webapp

The data plane can proxy the endpoints that the SDKs use to communicate with the control plane, allowing your SDKs to only communicate with the data plane directly. Set the BRAINTRUST_APP_URL environment variable to the URL of your data plane and BRAINTRUST_APP_PUBLIC_URL to “https://www.braintrust.dev” (or the URL of your webapp).
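
As a rough illustration, the two variables could be set in the process environment before the SDK is used. This sketch assumes the SDK reads them from the environment when it starts; the helper name sdk_env_for_data_plane and the example.internal URL are placeholders, not real Braintrust names.

```python
import os


def sdk_env_for_data_plane(
    data_plane_url: str,
    public_url: str = "https://www.braintrust.dev",
) -> dict[str, str]:
    """Build the environment variables that keep SDK traffic on the data plane."""
    return {
        # SDK requests go to the data plane, which proxies the control plane endpoints.
        "BRAINTRUST_APP_URL": data_plane_url,
        # Used only when printing links for users.
        "BRAINTRUST_APP_PUBLIC_URL": public_url,
    }


# Placeholder URL: replace with the address of your own data plane.
env = sdk_env_for_data_plane("https://braintrust-data.example.internal")
os.environ.update(env)
print(env)
```

In a deployed service you would more likely set these in the container or shell environment; the dictionary form is only meant to make the pairing explicit.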

Restrict URLs

In some cases, you may want to restrict the domains that the SDKs or API server can communicate with. If so, make sure your allowlist includes the following domains:

www.braintrust.dev
braintrust.dev

Configure rate limits

By default, the Braintrust API server rate-limits its outbound requests to external domains, such as the BRAINTRUST_APP_URL. This prevents the API server from unintentionally overloading an external domain, which might block the API server’s IP address in response. The default limit is 100 requests per minute per user auth token. The API server exposes the following variables to configure the rate limits:

OUTBOUND_RATE_LIMIT_MAX_REQUESTS: Configure the number of requests per time window. This can be set to 0 to disable rate limiting.
OUTBOUND_RATE_LIMIT_WINDOW_MINUTES: Configure the time window in minutes before the rate limit resets.
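
To show how the two settings interact, here is a small fixed-window counter keyed by auth token. It is only an illustration of the described behavior, not Braintrust’s implementation: the class name FixedWindowLimiter is hypothetical, and the one-minute default window is inferred from the "100 requests per minute" default above.

```python
import time
from typing import Optional


class FixedWindowLimiter:
    """Illustrative fixed-window rate limiter, tracked per auth token."""

    def __init__(self, max_requests: int = 100, window_minutes: float = 1.0):
        self.max_requests = max_requests
        self.window_seconds = window_minutes * 60
        # token -> (window start time, requests seen in this window)
        self._windows: dict[str, tuple[float, int]] = {}

    def allow(self, token: str, now: Optional[float] = None) -> bool:
        if self.max_requests == 0:
            return True  # 0 disables rate limiting
        now = time.monotonic() if now is None else now
        start, count = self._windows.get(token, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # window elapsed: reset the counter
        if count >= self.max_requests:
            return False
        self._windows[token] = (start, count + 1)
        return True


# Two requests per one-minute window; the third is rejected until the window resets.
limiter = FixedWindowLimiter(max_requests=2, window_minutes=1)
results = [limiter.allow("token-a", now=t) for t in (0, 1, 2, 61)]
print(results)  # [True, True, False, True]
```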

Configure HTTP keep-alive timeout

When the API server runs behind a load balancer, you may need to configure the HTTP keep-alive timeout to prevent connection resets. Load balancers typically have an idle timeout for connections, and if the API server’s keep-alive timeout is shorter than the load balancer’s timeout, the API server closes the connection while the load balancer still considers it open. When the load balancer tries to reuse that backend connection, it encounters a closed socket, resulting in connection reset errors and 502 responses. The API server exposes the following environment variable to configure the keep-alive timeout:

TS_API_KEEP_ALIVE_TIMEOUT_SECONDS: The HTTP keep-alive timeout in seconds. Default: 65

The default value of 65 seconds is designed to work with most load balancers, including AWS Application Load Balancer (which has a default idle timeout of 60 seconds). However, if your load balancer has a longer idle timeout, you should set this value to match or exceed your load balancer’s timeout. For example, if you have an AWS ALB configured with a 300-second idle timeout, set:

TS_API_KEEP_ALIVE_TIMEOUT_SECONDS=300
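
The rule above reduces to a single comparison, which can be handy in a deployment check. This is a sketch; the function name keep_alive_is_safe is hypothetical.

```python
def keep_alive_is_safe(keep_alive_seconds: int, lb_idle_timeout_seconds: int) -> bool:
    """True when the API server keeps idle connections open at least as long as the load balancer."""
    return keep_alive_seconds >= lb_idle_timeout_seconds


print(keep_alive_is_safe(65, 60))    # default keep-alive, 60-second idle timeout: True
print(keep_alive_is_safe(65, 300))   # default keep-alive, 300-second idle timeout: False
print(keep_alive_is_safe(300, 300))  # keep-alive raised to match: True
```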

Enable audit headers

When integrating with Braintrust, especially in environments where actions need to be attributed to specific users or for compliance purposes, you might want to enable audit headers. These headers provide additional metadata about the request and the resources it touched. To enable audit headers, include the x-bt-enable-audit: true header in your API request. When this header is present, the API response will include the following additional headers:

x-bt-audit-user-id: The ID of the user who made the request (based on the provided API key or impersonation).
x-bt-audit-user-email: The email of the user who made the request.
x-bt-audit-normalized-url: A normalized representation of the API endpoint path that was called. Path parameters like object IDs are replaced with placeholders (for example, /v1/project/[id]).
x-bt-audit-resources: A JSON-encoded, gzipped, and base64-encoded string containing a list of Braintrust resources (like projects, experiments, datasets, etc.) that were accessed or modified by the request. Each resource object includes its type, id, and name.

The x-bt-audit-resources header requires specific parsing due to its encoding. Here’s an example of how to parse it using the Python SDK:

import os

import braintrust
import requests

API_URL = "https://api.braintrust.dev/v1"
# Ensure BRAINTRUST_API_KEY is set in your environment.
headers = {
    "Authorization": "Bearer " + os.environ["BRAINTRUST_API_KEY"],
    "x-bt-enable-audit": "true",  # Enable audit headers
}

# Example: Create a project.
response = requests.post(f"{API_URL}/project", headers=headers, json={"name": "audit-test-project"})
response.raise_for_status()

project_data = response.json()
print(f"Project created: {project_data['name']} (ID: {project_data['id']})")

# Access and parse audit headers.
user_id = response.headers.get("x-bt-audit-user-id")
user_email = response.headers.get("x-bt-audit-user-email")
normalized_url = response.headers.get("x-bt-audit-normalized-url")
resources_header = response.headers.get("x-bt-audit-resources")

print(f"Audit User ID: {user_id}")
print(f"Audit User Email: {user_email}")
print(f"Normalized URL: {normalized_url}")

if resources_header:
    try:
        # Use the provided utility to parse the resources header.
        resources = braintrust.parse_audit_resources(resources_header)
        print("Accessed/Modified Resources:")
        for resource in resources:
            print(f"  - Type: {resource['type']}, ID: {resource['id']}, Name: {resource['name']}")
    except Exception as e:
        print(f"Error parsing resources header: {e}")
else:
    print("No resources header found.")

This feature is useful for building audit logs or understanding resource usage patterns within your applications that interact with the Braintrust API.
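
If you cannot use the SDK helper, the encoding described above (JSON, then gzip, then base64) can be reversed with the Python standard library. This is a sketch under the assumption that the header uses the standard base64 alphabet; the function name decode_audit_resources and the sample resource are made up for illustration.

```python
import base64
import gzip
import json


def decode_audit_resources(header_value: str) -> list:
    """Reverse the header encoding: base64 decode, gunzip, then parse JSON."""
    compressed = base64.b64decode(header_value)
    return json.loads(gzip.decompress(compressed))


# Round trip with a made-up resource list to show the expected shape.
sample = [{"type": "project", "id": "hypothetical-id", "name": "audit-test-project"}]
encoded = base64.b64encode(gzip.compress(json.dumps(sample).encode("utf-8"))).decode("ascii")
print(decode_audit_resources(encoded))
```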

Configure Brainstore fast readers

Fast readers are isolated Brainstore nodes dedicated to serving predictable UI queries (paginated viewers, span and trace lookups), preventing resource-intensive ad-hoc queries from making the UI unresponsive.

GCP and Azure: Fast readers are enabled by default starting in Helm chart v5.0.0. See the configuration reference below.
AWS: Fast readers are disabled by default. Set brainstore_fast_reader_instance_count in your Terraform configuration to enable them.

Upgrading to Helm chart v5.0.0 from an earlier version automatically creates fast reader nodes. By default, 2 fast reader nodes are created with the same resource profile as standard reader nodes (CPU: 16, memory: 32Gi). Verify that your cluster has capacity for these additional nodes before upgrading.

If you have custom brainstore.readinessProbe overrides pointing to /status, remove them before upgrading to Helm chart v5.0.0+. The /status readiness endpoint has a bug where it never recovers after a failure, which can permanently mark Brainstore nodes as not ready. Remove any brainstore.readinessProbe or brainstore.fastreader.readinessProbe customizations and rely on the chart defaults.

Configuration reference

Fast readers are configured under the brainstore.fastreader key in your values.yaml. If you have customized brainstore.reader settings, mirror those customizations to brainstore.fastreader.

Key	Default	Description
`brainstore.fastreader.name`	`brainstore-fastreader`	Name of the Deployment, Service, and ConfigMap
`brainstore.fastreader.replicas`	`2`	Number of fast reader pod replicas
`brainstore.fastreader.service.port`	`4000`	Service port
`brainstore.fastreader.service.type`	`ClusterIP`	Kubernetes service type
`brainstore.fastreader.resources.requests.cpu`	`16`	CPU request
`brainstore.fastreader.resources.requests.memory`	`32Gi`	Memory request
`brainstore.fastreader.resources.limits.cpu`	`16`	CPU limit
`brainstore.fastreader.resources.limits.memory`	`32Gi`	Memory limit
`brainstore.fastreader.objectStoreCacheMemoryLimit`	`1Gi`	Object store memory cache limit
`brainstore.fastreader.objectStoreCacheFileSize`	`1000Gi`	Object store file cache size
`brainstore.fastreader.cacheDir`	`/mnt/tmp/brainstore`	Local cache mount path
`brainstore.fastreader.volume.size`	`""`	Ephemeral storage size; required for Azure ACS
`brainstore.fastreader.extraEnvVars`	`[]`	Additional environment variables
`brainstore.fastreader.nodeSelector`	`{}`	Node selector for scheduling
`brainstore.fastreader.tolerations`	`[]`	Pod tolerations
`brainstore.fastreader.affinity`	`{}`	Pod affinity rules

Azure users must explicitly set brainstore.fastreader.volume.size when using Azure Container Storage (enableAzureContainerStorageDriver: true):

brainstore:
  fastreader:
    volume:
      size: "100Gi"

On GKE Autopilot, set brainstore.fastreader.volume.size to configure ephemeral storage requests. Match the resource profile of your standard reader nodes:

brainstore:
  fastreader:
    resources:
      requests:
        cpu: "16"
        memory: "32Gi"
      limits:
        cpu: "16"
        memory: "32Gi"
    objectStoreCacheFileSize: "900Gi"
    volume:
      size: "1000Gi"

Brainstore resource configuration

This section applies to GCP and Azure deployments using the Helm chart (v5.0.1+). AWS deployments manage Brainstore resources automatically.

Starting in Helm chart v5.0.1, the resources block for each Brainstore component (brainstore.reader, brainstore.writer, brainstore.fastreader) is passed through as-is to the Kubernetes pod spec. You can omit limits entirely, set them to {}, or supply any valid Kubernetes resource spec.

Omitting limits sets the pod QoS class to Burstable, which prevents CPU throttling and allows pods to use available node capacity. This can improve query performance on nodes with spare capacity, but increases the risk of resource contention if multiple pods compete for the same node.

# With limits (Guaranteed QoS — default)
brainstore:
  reader:
    resources:
      requests:
        cpu: "16"
        memory: "32Gi"
      limits:
        cpu: "16"
        memory: "32Gi"

# Without limits (Burstable QoS — allows burst CPU/memory)
brainstore:
  reader:
    resources:
      requests:
        cpu: "16"
        memory: "32Gi"
  writer:
    resources:
      requests:
        cpu: "4"
        memory: "16Gi"
  fastreader:
    resources:
      requests:
        cpu: "16"
        memory: "32Gi"
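
The QoS outcome of a given resources block can be reasoned about with a simplified version of the Kubernetes rules. The sketch below handles a single-container pod and compares quantities as plain strings (so "16" and "16000m" would not be treated as equal); the function name qos_class is hypothetical.

```python
def qos_class(resources: dict) -> str:
    """Classify a single-container pod from its CPU and memory requests and limits."""
    requests = resources.get("requests") or {}
    limits = resources.get("limits") or {}
    if not requests and not limits:
        return "BestEffort"
    # Guaranteed requires CPU and memory limits, each equal to its request.
    # A missing request defaults to the limit, as Kubernetes does.
    guaranteed = all(
        name in limits and requests.get(name, limits[name]) == limits[name]
        for name in ("cpu", "memory")
    )
    return "Guaranteed" if guaranteed else "Burstable"


with_limits = {
    "requests": {"cpu": "16", "memory": "32Gi"},
    "limits": {"cpu": "16", "memory": "32Gi"},
}
without_limits = {"requests": {"cpu": "16", "memory": "32Gi"}}

print(qos_class(with_limits))     # Guaranteed
print(qos_class(without_limits))  # Burstable
```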

Auto-derived Brainstore environment variables

As of Helm chart v5.1.0, BRAINSTORE_RESPONSE_CACHE_URI and BRAINSTORE_CODE_BUNDLE_URI are automatically populated from your objectStorage configuration and do not need to be set manually. The chart derives these values as follows:

Cloud	Source key	`BRAINSTORE_RESPONSE_CACHE_URI`	`BRAINSTORE_CODE_BUNDLE_URI`
AWS	`objectStorage.aws.responseBucket` / `objectStorage.aws.codeBundleBucket`	`s3://<responseBucket>/brainstore-cache`	`s3://<codeBundleBucket>`
Azure	`objectStorage.azure.responseContainer` / `objectStorage.azure.codeBundleContainer`	`az://<responseContainer>/brainstore-cache`	`az://<codeBundleContainer>`
GCP	`objectStorage.google.apiBucket`	`gs://<apiBucket>/brainstore-cache`	`gs://<apiBucket>`

If you previously configured these via extraEnvVars, remove those overrides after upgrading to v5.1.0 to avoid conflicts.
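
To check what the chart should produce for your configuration, the mapping in the table can be written out directly. This is a Python restatement of the table, not the chart’s template code; the function name derive_brainstore_uris and the bucket name in the example are placeholders.

```python
def derive_brainstore_uris(cloud: str, object_storage: dict) -> dict[str, str]:
    """Compute the two Brainstore URIs from an objectStorage-style mapping."""
    # cloud -> (URI scheme, response cache key, code bundle key), per the table above.
    sources = {
        "aws": ("s3", "responseBucket", "codeBundleBucket"),
        "azure": ("az", "responseContainer", "codeBundleContainer"),
        "google": ("gs", "apiBucket", "apiBucket"),
    }
    if cloud not in sources:
        raise ValueError(f"unknown cloud: {cloud}")
    scheme, response_key, bundle_key = sources[cloud]
    cfg = object_storage[cloud]
    return {
        "BRAINSTORE_RESPONSE_CACHE_URI": f"{scheme}://{cfg[response_key]}/brainstore-cache",
        "BRAINSTORE_CODE_BUNDLE_URI": f"{scheme}://{cfg[bundle_key]}",
    }


# Placeholder bucket name for illustration.
uris = derive_brainstore_uris("google", {"google": {"apiBucket": "my-api-bucket"}})
print(uris)
```

Comparing this output with any values you previously set through extraEnvVars is one way to confirm that removing the overrides will not change behavior.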

Data retention

Data retention requires a service token so the data plane can query object metadata and look up retention policies configured in your organization. Braintrust automatically provisions this token when you configure your self-hosted data plane URL in organization settings. It is created with read-only permissions on projects and stored securely in your data plane. To verify or refresh the token, go to Settings > Service tokens. If the token doesn’t exist, click Create. To rotate it, click Refresh — the data plane will start using the new token automatically.

Before configuring retention, review your cloud provider’s bucket retention policies. See Cloud provider retention policies for details.
