Prompt Caching

djangosdk supports native prompt caching for Anthropic and OpenAI, which can significantly reduce cost and latency on requests that share a long prompt prefix.

How It Works

PromptCacheMiddleware is applied automatically inside LiteLLMProvider when all of the following are true:

  • AI_SDK.CACHE.ENABLED is True

  • The provider is listed in AI_SDK.CACHE.PROVIDERS

  • Agent.enable_cache is True (the default)

The middleware adds the appropriate cache-control markers (for Anthropic, the cache_control field described below) to the system prompt and the first few user messages.

Configuration

AI_SDK = {
    "CACHE": {
        "ENABLED": True,
        "PROVIDERS": ["anthropic", "openai"],
    },
}

Disabling Cache for a Specific Agent
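Pass enable_cache=False when constructing the agent. The import path and the other constructor arguments below are assumptions; adjust them to your project:

```python
from djangosdk import Agent  # assumed import path

# This agent bypasses PromptCacheMiddleware regardless of the
# global AI_SDK["CACHE"] settings.
billing_agent = Agent(
    name="billing",
    enable_cache=False,
)
```

The global AI_SDK.CACHE settings are untouched; only this agent opts out.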

Monitoring Cache Performance

The SDK fires signals for cache hits and misses. Connect to them to collect metrics:

Cache token counts are also available on AgentResponse.usage:
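The field names below mirror Anthropic's usage reporting and are an assumption about the shape of AgentResponse.usage; a sketch of computing a cache hit rate from them:

```python
# Sketch of what AgentResponse.usage might contain; the cache_* field
# names follow Anthropic's API and are an assumption here.
usage = {
    "input_tokens": 40,                    # uncached input tokens
    "cache_creation_input_tokens": 0,      # tokens written to the cache
    "cache_read_input_tokens": 1200,       # tokens served from the cache
    "output_tokens": 350,
}

cached = usage["cache_read_input_tokens"]
total_input = (
    usage["input_tokens"]
    + usage["cache_creation_input_tokens"]
    + cached
)
hit_rate = cached / total_input
print(f"cache hit rate: {hit_rate:.0%}")
```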

Provider Notes

Anthropic — Cache prefixes are added using the cache_control field. Long system prompts and repeated context are automatically marked as cacheable.
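Roughly, the middleware produces system content like the following; the cache_control field and its "ephemeral" type are Anthropic's documented mechanism, while the surrounding prompt text is illustrative:

```python
# A system prompt block marked cacheable the way Anthropic expects:
# everything up to and including this block becomes the cache prefix.
system = [
    {
        "type": "text",
        "text": "You are a helpful support agent. <long shared context>",
        "cache_control": {"type": "ephemeral"},
    }
]
```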

OpenAI — Prompt caching is handled transparently by the OpenAI API when the enable_cache flag is set. No special parameters are needed.
