Token usage & caps — Relm Pro

Beyond the pro-forma credit system, Relm has a soft "fair use" cap on AI token usage to prevent runaway spend. For most users this is invisible — you'll never approach the cap. This page documents how it works in case you do.

What counts as token usage

Chat queries. Each turn of the AI chat panel uses tokens for input + output.
Deep Research runs. Each agent's reasoning consumes tokens.
AI Summary generation. Tokens for the summary's reasoning + output.
Per-unit research. As with AI Summary, tokens for the reasoning + output.

What doesn't count:

Pro-forma generation. That's accounted via credits, not tokens.
Document indexing (embedding) — tracked separately and not capped at the user level.
Static UI rendering — no AI involved.
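
As a rough illustration of the split above, the sketch below tallies capped token usage from a list of activity records. The record shape, the category labels, and the `capped_tokens` function are hypothetical and not part of Relm's product or API; only the counted/uncounted split comes from this page.

```python
from dataclasses import dataclass

# Hypothetical category labels; only the counted/uncounted split
# itself is taken from this page.
COUNTED = {"chat", "deep_research", "ai_summary", "per_unit_research"}
NOT_COUNTED = {"pro_forma", "document_indexing", "ui_rendering"}


@dataclass
class Activity:
    category: str
    input_tokens: int = 0
    output_tokens: int = 0


def capped_tokens(activities):
    """Sum input + output tokens for activities that count toward the cap."""
    total = 0
    for activity in activities:
        if activity.category in COUNTED:
            total += activity.input_tokens + activity.output_tokens
        elif activity.category not in NOT_COUNTED:
            # Fail loudly instead of silently ignoring an unexpected label.
            raise ValueError(f"unknown category: {activity.category}")
    return total
```

In this sketch a pro-forma record contributes nothing to the total no matter how many tokens it carries, which mirrors the rule that pro-formas are accounted via credits.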

Self-Serve cap

Self-Serve has a monthly token allowance that resets each cycle, set high enough that typical usage (heavy chat + several deep-research runs) never approaches it. If you do approach it, the UI shows a warning at 80% and 95% of the allowance.

If you exceed the allowance:

New chat / Deep Research / AI Summary actions are throttled or temporarily disabled.
Existing pro-formas, sections, and chat history remain readable.
The allowance resets at the next billing cycle.
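
The warning and throttling behavior described in this section can be sketched as a small state function. This is a minimal illustration, not Relm's implementation: the function name, the state labels, and the choice to treat usage of exactly 100% as still within the allowance are all assumptions. The 80% and 95% thresholds are the only values taken from this page.

```python
def usage_state(used_tokens: int, allowance: int) -> str:
    """Map token usage to a hypothetical UI state.

    Returns 'ok', 'warn_80', 'warn_95', or 'exceeded'.
    """
    if allowance <= 0:
        raise ValueError("allowance must be positive")
    # Assumption: only usage strictly above the allowance counts as exceeded.
    if used_tokens > allowance:
        return "exceeded"
    # Integer arithmetic avoids floating-point edge cases at the thresholds.
    if used_tokens * 100 >= allowance * 95:
        return "warn_95"
    if used_tokens * 100 >= allowance * 80:
        return "warn_80"
    return "ok"
```

For example, with a placeholder allowance of 1,000,000 tokens (the real allowance is not stated here), `usage_state(800_000, 1_000_000)` returns `"warn_80"` and `usage_state(1_000_001, 1_000_000)` returns `"exceeded"`.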

Enterprise

Enterprise contracts include a much higher allowance (or, for some contracts, no cap). Specific terms are in your contract.

If your team is approaching its allowance, your account contact gets notified at 80% — there's no surprise lockout.

Where to see usage

Settings → Billing → Usage shows your current month's token consumption against your allowance.
The dashboard shows a small chip when you're past 80%.

Why we have a cap at all

Two reasons:

Cost control — frontier-model tokens cost real money. Without a cap, a runaway script (or a bug) could consume orders of magnitude more than expected.
Quality — heavy users sometimes hit reasoning latency budgets. The cap is a soft signal that the workload is unusual and worth a conversation about Enterprise.

Adjusting

If you legitimately need more tokens than Self-Serve allows, the cleanest path is to upgrade to Enterprise. We don't sell token packs as a Self-Serve add-on.
