Understanding rate limiting
Rate limiting is a protective state that can temporarily restrict credit-consuming activity when a tenant, user role, model, or high-volume request reaches an applicable usage limit.
The most common cause is a tenant using up its monthly credit allowance. Limits can also depend on role-level credit controls, the model selected, or the type of request being run.
What happens when rate limiting is active?
Users may see reduced access to AI actions that consume credits. Depending on the situation, this can include:
Some models being unavailable or marked as restricted
Workflow, app, or agent runs failing with a usage or limit message
File actions, tool actions, or API requests being paused or rejected
A user being blocked by a role-level limit even when other users in the tenant still have access
The exact behavior can vary by feature and configuration. If a limit message looks unexpected, collect the details listed under "What should I send support?" below.
Tenant limits and role limits
Tenant credit limits apply to the tenant's shared monthly allowance. When the tenant allowance is reached, multiple users and AI features may be affected until the calendar-month allowance resets, the package is upgraded, or an extra credit pack is added.
Role-level limits apply to users assigned to a role with its own credit limit. A user can be limited by their role even if the tenant still has credits available. Users may see an in-product warning as they approach their monthly role limit, and a stronger message when the limit is reached.
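The relationship between the two limits can be sketched as a simple check. This is an illustration only, not how Hatz implements it: the function name, parameter names, and the idea that each limit is a single monthly credit number are assumptions made for the example.

```python
from typing import Optional


def limiting_scope(
    tenant_used: float,
    tenant_allowance: float,
    role_limit_used: float = 0.0,
    role_limit: Optional[float] = None,
) -> Optional[str]:
    """Return which limit (if any) would restrict a user in this simplified model.

    All names and values are hypothetical; real behavior can vary by
    feature and configuration.
    """
    # The tenant allowance is shared, so reaching it can affect every user.
    if tenant_used >= tenant_allowance:
        return "tenant"
    # A role limit applies only when the user's role has one configured,
    # and it can be reached while the tenant still has credits left.
    if role_limit is not None and role_limit_used >= role_limit:
        return "role"
    return None
```

For example, limiting_scope(400, 1000, 50, 50) returns "role": the tenant still has credits available, but usage counted against the role limit has reached the role's cap.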
How will users know?
Users may see in-product banners or messages, unavailable models, failed workflow runs, disabled chat submission, or API errors that reference usage limits or rate limiting.
Admins can review the dashboard, tenant management, billing-period usage, and users and roles views available to their role. Admins may also receive email notifications when tenant usage approaches or reaches the monthly allowance.
Availability of alerts and notifications can vary by tenant configuration, user role, and feature. Check the usage views to confirm the current state before assuming which limit was reached.
Can users still log in?
Users should generally still be able to sign in and use chat while rate limiting is active. Depending on tenant settings and model availability, they may be routed to a less expensive model or asked to choose an available model. Other credit-consuming functionality may be limited, including higher-cost models, Workshop actions, API requests, file or tool actions, workflows, and follow-up automation.
When does rate limiting stop?
Rate limiting can clear when the applicable calendar-month allowance resets, a role limit is changed, the tenant's package is upgraded, an extra credit pack is added, or the request no longer hits the active limit.
If a tenant is repeatedly hitting limits before the end of the billing cycle, review the tenant's usage patterns and package fit rather than treating the event as a one-time error.
How can admins reduce the chance of rate limiting?
Monitor tenant and user usage after onboarding new users, workflows, integrations, or API use cases.
Use Auto Mode when available so Hatz can route requests to an appropriate model for the task.
Use the model that matches the task. Smaller models are often a better fit for simple, high-volume tasks.
Disable higher-multiplier models for users or groups that do not need them.
Keep prompts, files, and workflow inputs focused on the information needed for the task.
Review workflows, apps, and agents that run multiple model calls or tool actions.
Use role-level credit limits when a group of users needs a defined monthly cap.
Review package upgrades or extra credit packs in Billing when normal usage consistently exceeds the current allowance.
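To see why model choice matters for high-volume tasks, the effect of a multiplier can be worked through with rough arithmetic. The sketch below assumes that credits scale linearly with a per-model multiplier; the formula, the base cost, and the multiplier values are hypothetical and are not taken from Hatz documentation.

```python
def estimated_credits(base_units: float, multiplier: float, runs: int = 1) -> float:
    """Rough estimate: base cost per run x model multiplier x number of runs.

    Assumes a simple linear model; actual credit accounting may differ.
    """
    return base_units * multiplier * runs


# Hypothetical comparison: the same 500-run task on two models.
small_model = estimated_credits(base_units=2, multiplier=1.0, runs=500)
large_model = estimated_credits(base_units=2, multiplier=5.0, runs=500)
```

With these made-up figures, the task costs 1,000 credits on the first model and 5,000 on the second, which is the kind of gap that adds up quickly for simple, repetitive work.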
How do I check credit usage?
Admins can review usage in the dashboard, tenant management, billing-period usage, and users and roles views available to their role. These views can help identify whether usage is coming from a tenant-wide pattern, a specific user, a workflow, API activity, or a recent rollout.
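If usage figures are copied out of these views for closer analysis, grouping them by user or workflow makes the largest sources easy to spot. The sketch below assumes the figures have already been gathered into a list of records; the record layout and field names are hypothetical and do not describe a Hatz export format.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def credits_by(records: List[Dict], key: str) -> List[Tuple[str, float]]:
    """Total the "credits" field per value of `key`, largest first."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        totals[record[key]] += record["credits"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample data, entered by hand for illustration.
sample = [
    {"user": "a@example.com", "workflow": "Ticket triage", "credits": 120.0},
    {"user": "b@example.com", "workflow": "Ticket triage", "credits": 300.0},
    {"user": "a@example.com", "workflow": "Report draft", "credits": 40.0},
]
top_users = credits_by(sample, "user")
top_workflows = credits_by(sample, "workflow")
```

Here top_workflows places "Ticket triage" first with 420 credits, which suggests a workflow-level pattern rather than a single heavy user.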
What if one user or workflow uses most of the credits?
Look at the model, model multiplier, files, prompt size, workflow steps, tool calls, and API activity tied to that usage. High usage can be legitimate for complex tasks, but it may also point to a workflow that should be simplified, requests that would be better served by Auto Mode, a higher-multiplier model that should be disabled for a group, or a role-level limit that should be adjusted.
What should I send support?
If you need help investigating rate limiting, send the tenant name, affected user, date and time, feature used, model selected, workflow/app/agent name, API endpoint if applicable, and the exact error message or screenshot.
If the issue is urgent, include whether the user expected escalation to a human, whether the request is affecting production work, and whether it is currently after hours for the tenant.
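Teams that file these requests often may find it useful to capture the checklist in a small structure and confirm nothing is missing before sending. The sketch below is one possible way to do that; the class name, field names, and the choice to treat only the API endpoint as optional are assumptions for illustration, not a Hatz support format.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class RateLimitReport:
    """Details to send support (hypothetical field names)."""
    tenant_name: Optional[str] = None
    affected_user: Optional[str] = None
    occurred_at: Optional[str] = None            # date and time of the event
    feature: Optional[str] = None
    model: Optional[str] = None
    workflow_app_or_agent: Optional[str] = None
    api_endpoint: Optional[str] = None           # only if applicable
    error_message: Optional[str] = None          # exact text or screenshot reference


# Fields that may legitimately be left empty in this sketch.
OPTIONAL_FIELDS = {"api_endpoint"}


def missing_fields(report: RateLimitReport) -> List[str]:
    """List required fields that are still empty, in checklist order."""
    return [
        f.name
        for f in fields(report)
        if f.name not in OPTIONAL_FIELDS and not getattr(report, f.name)
    ]
```

Calling missing_fields on a partly filled report returns the names still to be collected, so the request can be completed before it is sent.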
