AI Gateway

An intelligent control plane for your AI applications

Connect to any model, dynamically route requests, and manage usage, billing, and logs from one unified gateway.

Start building for free View docs

Reduce Costs & Latency

Easily cache responses and reduce redundant API calls, leading to direct cost savings.

Improve Reliability with Dynamic Controls

Configure how and when model providers' APIs are called, based on specific attributes or fallbacks.
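To make the fallback idea concrete, here is a minimal client-side sketch. It is not how AI Gateway implements fallbacks internally; it only illustrates the pattern of trying providers in order. The `callWithFallback` helper and the `{ name, call }` provider shape are hypothetical names introduced for this example.

```javascript
// Illustrative only: try each provider in order and return the first success.
// `providers` is a list of { name, call }, where `call` is an async function
// that performs the actual request (hypothetical shape for this sketch).
async function callWithFallback(providers) {
  const errors = [];
  for (const { name, call } of providers) {
    try {
      const result = await call();
      return { provider: name, result };
    } catch (err) {
      // Remember why this provider failed, then move on to the next one.
      errors.push(`${name}: ${err.message}`);
    }
  }
  throw new Error(`All providers failed (${errors.join("; ")})`);
}
```

With a gateway in place, this ordering would typically live in the gateway configuration rather than in application code, so it can change without a redeploy.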

Add Observability

Get rich usage insights such as token counts, prompt performance, and usage patterns.

Dynamic Routing

Automatically route requests based on latency, cost, or availability. Adjust rules instantly from the dashboard or API — no redeploys, no downtime.
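As a rough illustration of routing on latency, cost, or availability, the sketch below picks a provider from a list of candidates. The `pickProvider` function and the `latencyMs`, `costPer1kTokens`, and `available` fields are assumptions made for this example; they are not AI Gateway's configuration schema.

```javascript
// Illustrative only: choose a provider by strategy ("latency" or "cost"),
// skipping any provider currently marked unavailable.
// All field names here are hypothetical.
function pickProvider(providers, strategy) {
  const available = providers.filter((p) => p.available);
  if (available.length === 0) return null;

  // Pick the metric to minimise; anything other than "cost" means latency.
  const key = strategy === "cost" ? "costPer1kTokens" : "latencyMs";
  return available.reduce((best, p) => (p[key] < best[key] ? p : best));
}
```

Keeping the strategy as data rather than code is what allows a rule to be changed from a dashboard or API without touching the application.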

Core Capabilities

Global Network Performance

Built on Cloudflare's infrastructure, AI Gateway provides low-latency, globally distributed access with automatic scalability and built-in security.

Caching

Reduces redundant API calls. Saves money and improves response times by automatically storing and reusing responses to frequent requests.
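Caching can also be controlled per request. The sketch below builds a request for an OpenAI chat completion sent through a gateway, setting a cache lifetime with the `cf-aig-cache-ttl` header or bypassing the cache with `cf-aig-skip-cache`. The header names and URL layout reflect our understanding of AI Gateway's documented behavior and should be checked against the current docs; `buildCachedRequest` and its parameters are hypothetical names for this example.

```javascript
// Assumed base URL for AI Gateway provider endpoints; verify against the docs.
const GATEWAY_BASE = "https://gateway.ai.cloudflare.com/v1";

// Build a fetch()-ready { url, init } pair for an OpenAI chat completion
// routed through a gateway. cacheTtlSeconds > 0 sets a cache lifetime,
// 0 skips the cache, and undefined leaves the gateway defaults in place.
function buildCachedRequest({ accountId, gatewayId, apiKey, body, cacheTtlSeconds }) {
  const headers = {
    "Content-Type": "application/json",
    Authorization: `Bearer ${apiKey}`,
  };
  if (cacheTtlSeconds === 0) {
    headers["cf-aig-skip-cache"] = "true";
  } else if (cacheTtlSeconds > 0) {
    headers["cf-aig-cache-ttl"] = String(cacheTtlSeconds);
  }
  return {
    url: `${GATEWAY_BASE}/${accountId}/${gatewayId}/openai/chat/completions`,
    init: { method: "POST", headers, body: JSON.stringify(body) },
  };
}
```

The returned pair can be passed straight to `fetch(url, init)`. Identical requests made within the TTL should then be answered from the gateway cache instead of reaching the provider.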

Built-in Observability

Logs, metrics, and usage analytics. Includes fallback routing, rate limiting, and safety guardrails to manage cost, behavior, and compliance across multiple providers.

Security controls and guardrails

Protect your AI applications from leaking or sending sensitive information, and from malicious traffic, without needing to configure or maintain anything extra.

Unified Billing

Manage all your costs with one simple bill and access every provider through a single API. Spend less time managing and more time shipping.

AI Gateway

Built for AI Application Control

You can use AI Gateway to:


Reduce latency and cost of AI apps by caching API responses

Optimize your AI application performance and reduce costs by intelligently caching responses from AI providers.

Monitor prompt performance, token counts, and behavior with usage analytics

Gain deep insights into your AI usage patterns, token consumption, and prompt performance across all providers.

Build custom dashboards and alerting systems directly from AI Gateway logs

Create comprehensive monitoring and alerting systems using AI Gateway's rich logging and metrics data.
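As one way to picture the dashboard and alerting use case, the sketch below rolls log entries up into per-model totals and flags models that exceed a token budget. The log fields (`model`, `tokensIn`, `tokensOut`, `cached`) and both function names are assumptions made for this example; they are not AI Gateway's actual log schema.

```javascript
// Illustrative only: aggregate hypothetical log entries into per-model totals.
function summarizeLogs(logs) {
  const byModel = {};
  for (const log of logs) {
    if (!byModel[log.model]) {
      byModel[log.model] = { requests: 0, tokens: 0, cacheHits: 0 };
    }
    const stats = byModel[log.model];
    stats.requests += 1;
    stats.tokens += log.tokensIn + log.tokensOut;
    if (log.cached) stats.cacheHits += 1;
  }
  return byModel;
}

// Return the models whose total token usage exceeds the given budget,
// which could then feed an alerting system.
function modelsOverBudget(summary, maxTokens) {
  return Object.keys(summary).filter((model) => summary[model].tokens > maxTokens);
}
```

The same summary object could drive a dashboard panel (requests and cache hits per model) as well as the budget alert.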

Control your AI infrastructure

Examples showing how to configure caching, routing, and monitoring for AI workloads.

// wrangler.jsonc
// Simple configuration
{
  "ai": {
    "binding": "AI"
  }
}

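// Pass requests through the Gateway from your Worker with Workers AI
// index.js
export default {
  async fetch(request, env) {
    // env.AI is the Workers AI binding declared in wrangler.jsonc
    const resp = await env.AI.run(
      "@cf/meta/llama-3.1-8b-instruct",
      {
        prompt: "tell me a joke",
      },
      {
        gateway: {
          id: "my-gateway", // the gateway this request is sent through
        },
      },
    );
    return Response.json(resp);
  },
};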

// Use with OpenAI SDK
// Runs inside a Worker handler, where `env` is available
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: "my api key", // defaults to process.env["OPENAI_API_KEY"]
  baseURL: await env.AI.gateway("my-gateway").getUrl("openai"),
});

Rightblogger

Without AI Gateway, it’s difficult to see which applications are driving the majority of the costs with the OpenAI API … We can choose to limit the number of requests used by certain tools to control costs.

Powerful primitives, seamlessly integrated

Built on systems powering 20% of the Internet, AI Gateway runs on the same infrastructure Cloudflare uses to build Cloudflare. Enterprise-grade reliability, security, and performance are standard.

Compute

Workers

Global serverless functions

Containers

Any language, anywhere

Sandboxes

Secure code execution

Durable Objects

Stateful compute

Browser Rendering

Automated browsers

Workflows

Process orchestration

Storage

Egress-free storage

Data Platform

Ingest, Catalog & Query

Hyperdrive

Global databases

Serverless SQL

Key-value speed

Queues

Message processing

Workers AI

Edge AI models

Agents

Build stateful AI agents

AI Gateway

AI observability

Vectorize

Vector database

AI Search

Instant retrieval

Media

Images

Image optimization

Stream

Video streaming

RealtimeKit

Live comms

TURN / SFU

Real-time infra

Network

DNS

Fast DNS

CDN

Faster delivery

WAF

App protection

Load Balancing

Zero downtime

Rate Limiting

Abuse prevention

Bot Mitigation

Block bots

Turnstile

Privacy-first bot checks

DDoS Protection

DDoS mitigation

SASE / Zero Trust

SASE

Unified zero trust platform

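Build without boundaries

Join thousands of developers who have used Cloudflare to eliminate infrastructure complexity and deploy globally. Start building for free, no credit card required.

Start building for free

View docs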