Platform Usage Restrictions

1. API Access Limits

To ensure the stability, security, and fair use of resources on the BrillAI platform, we implement certain access limits on API requests, including rate limits and concurrency limits. Below is a detailed explanation of these two limiting strategies.

What is Rate Limiting?

Rate limiting restricts the number of requests a user can make to BrillAI platform services within a specified period.

Why Implement Rate Limiting?

Rate limiting is a common practice for APIs and is implemented for several reasons:

  • Ensuring Fair and Efficient Resource Use: Prevents individual users from making excessive requests that would degrade the experience of other users.

  • Preventing Request Overload: Helps manage overall load and avoid server performance issues caused by sudden spikes in requests, thereby improving service reliability.

  • Security Protection: Protects against malicious attacks that could overload the platform or cause service disruptions.

Preset Rate Limits by Service (Free Users Only)

Type                     Rate Limit Metric    Remark
Dialogue model (Chat)    RPM                  Current: RPM=6
Image model (Image)      RPM                  Current: RPM=6
Audio model (Voice)      RPM                  Current: RPM=6

BrillAI currently uses RPM (requests per minute) to measure the rate limit of each service. The initial rate limit for paid users is 3000 RPM.
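To stay under an RPM limit, a client can throttle itself before sending requests. The following is a minimal sketch, not part of any BrillAI SDK: the class name `RpmLimiter` and its methods are hypothetical, and it assumes the limit is enforced over a sliding 60-second window, which this page does not specify.

```python
import time
from collections import deque


class RpmLimiter:
    """Client-side limiter: allows at most `rpm` requests per 60 seconds.

    Assumption: a sliding 60-second window. The platform's actual
    enforcement window is not documented here.
    """

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        self.rpm = rpm
        self.clock = clock    # injectable for testing
        self.sleep = sleep    # injectable for testing
        self.sent = deque()   # timestamps of requests in the current window

    def wait_time(self):
        """Return the seconds to wait before another request may be sent."""
        now = self.clock()
        # Discard timestamps that are 60 seconds old or older.
        while self.sent and now - self.sent[0] >= 60.0:
            self.sent.popleft()
        if len(self.sent) < self.rpm:
            return 0.0
        return 60.0 - (now - self.sent[0])

    def acquire(self):
        """Block until a request slot is free, then record the request."""
        while (delay := self.wait_time()) > 0:
            self.sleep(delay)
        self.sent.append(self.clock())
```

A free-tier client would create `RpmLimiter(6)` once and call `acquire()` immediately before each API request; a paid-tier client would pass `3000` instead.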

What is Concurrency Limiting?

Concurrency limiting restricts the number of requests a user can have in progress at the same time. Its goal is to ensure that the platform handles requests efficiently and to avoid performance degradation or service unavailability caused by too many simultaneous requests.

Why Implement Concurrency Limiting?

Concurrency limiting is a common practice for APIs and is implemented for several reasons:

  • Optimizing System Performance: Concurrency limiting helps manage system resources to ensure that each request is processed in a timely manner, thus optimizing the overall performance of the platform.

  • Preventing Resource Exhaustion: Concurrency limiting prevents system resources from being overused, avoiding service interruptions due to resource depletion.

  • Improving Service Reliability: By controlling the number of concurrent requests, concurrency limiting reduces the likelihood of system failures caused by overload, thereby enhancing the stability and reliability of the service.

BrillAI measures concurrency limits as a concurrency level: the maximum number of requests a user may have in progress at once.

User Type             Free User    Paid User (Effective Oct. 16, 2024)
Initial Rate Limit    RPM=6        RPM=3000
Concurrency           1            1000

Change Notice: Starting from October 16, 2024, the initial rate limit for paid users will be increased from 18 RPM to 3000 RPM, and the concurrency will be upgraded from 3 to 1000.
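On the client side, a concurrency limit can be respected by capping the number of requests in flight. The sketch below is one possible approach using Python's standard thread pool; the function name `run_with_concurrency_cap` is hypothetical, and each task stands in for whatever call your application makes to the platform.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_concurrency_cap(tasks, max_concurrency):
    """Run zero-argument callables with at most `max_concurrency` in flight.

    Results are returned in the same order as `tasks`.
    """
    # A pool with N worker threads never executes more than N tasks at once,
    # so the pool size acts as the concurrency cap.
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]
```

With the limits in the table above, a free user would pass `max_concurrency=1` and a paid user could pass any value up to `1000`. Note that `future.result()` re-raises any exception thrown by a task, so callers may want to add their own error handling.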

2. File Upload Limits

BrillAI currently lets users upload their own image or audio files for use with models such as Image and Voice. The file size limit for uploads is 15 MB.
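Checking the file size locally before uploading avoids sending a request that would be rejected. This is a minimal sketch with hypothetical names (`MAX_UPLOAD_BYTES`, `check_upload_size`); it assumes "15 MB" means 15 × 1024 × 1024 bytes, which this page does not state, so adjust the constant if the platform counts in decimal megabytes.

```python
import os

# Assumption: 15 MB is interpreted as 15 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 15 * 1024 * 1024


def check_upload_size(path, limit=MAX_UPLOAD_BYTES):
    """Return the file size in bytes, or raise ValueError if it exceeds `limit`."""
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(
            f"{path} is {size} bytes, which exceeds the {limit}-byte upload limit"
        )
    return size
```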
