[BUG] Retry chroma-load upserts when rate limited #4485
Conversation
When upsert operations are rate-limited, the "block" of 100 records in the operation is never inserted. Therefore, the cardinality of the test data set can never grow beyond the point of the failed (rate-limited) upsert. Also, when starting a new workload, we need to reset the cardinality and heap for the data set so that it starts again from a cardinality of 0.
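The cardinality-reset half of the fix can be pictured with a small sketch; the struct and field names below are assumptions for illustration, not the actual chroma-load types.

```rust
use std::collections::BinaryHeap;

// Hypothetical stand-in for the workload's data-set state.
struct DataSetState {
    cardinality: u64,      // records successfully upserted so far
    heap: BinaryHeap<u64>, // tracked record keys
}

impl DataSetState {
    // Starting a new workload drops all prior state so the data set
    // grows again from a cardinality of 0.
    fn reset(&mut self) {
        self.cardinality = 0;
        self.heap.clear();
    }
}
```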
Reviewer Checklist
Please leverage this checklist to ensure your code review is thorough before approving:
- Testing, Bugs, Errors, Logs, Documentation
- System Compatibility
- Quality
Fix Rate-Limited Upserts in Chroma-Load with Retry Mechanism

This PR addresses a bug where upsert operations that are rate-limited would prevent the test data set from growing beyond the point of failure. The solution implements a retry mechanism with exponential backoff for rate-limited upsert operations, and ensures cardinality is reset when starting a new workload.

This summary was automatically generated by @propel-code-bot
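To make the mechanism concrete, here is a minimal sketch of the retry shape described above. It reuses the `collection` and `entries` names from the diff excerpts quoted below; the initial delay, the cap, and the error propagation are illustrative assumptions, not values taken from the PR.

```rust
use std::{cmp, time::Duration};

let mut delay = Duration::from_millis(100); // assumed starting delay
let max_delay = Duration::from_secs(10);    // assumed cap
loop {
    match collection.upsert(entries.clone(), None).await {
        Ok(_) => break,
        // The PR detects rate limiting by looking for "429" in the
        // error's Debug output (see the review note further down).
        Err(err) if format!("{err:?}").contains("429") => {
            tokio::time::sleep(delay).await;
            // Exponential backoff, capped at max_delay.
            delay = cmp::min(delay * 2, max_delay);
        }
        // Propagate anything that is not a rate limit.
        Err(err) => return Err(err.into()),
    }
}
```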
```rust
};
let result = collection.upsert(entries, None).await;
if let Err(err) = result {
    if format!("{err:?}").contains("429") {
```
[BestPractice] Using `format!("{err:?}").contains("429")` to check for rate limiting is fragile and depends on the specific error format. It would be more robust to check for specific error types or error codes. If the ChromaClient has a specific error type for rate limiting, consider matching on that instead.
```rust
let entries = CollectionEntries {
    ids: keys.clone(),
    metadatas: res.metadatas.clone(),
    documents: Some(documents.clone()),
    embeddings: Some(embeddings.clone()),
};
```
[PerformanceOptimization] The `clone()` operations inside the loop can lead to unnecessary memory allocations on each retry. Consider moving the creation of `entries` outside the loop and only cloning when necessary for retries.
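One possible shape for that hoist, combining the construction above with the retry loop (delay handling as in the earlier sketch). Whether the per-attempt `entries.clone()` is cheap depends on what `CollectionEntries` owns, so this illustrates the structure rather than a guaranteed win:

```rust
// Build entries once; the per-field clones happen a single time.
let entries = CollectionEntries {
    ids: keys.clone(),
    metadatas: res.metadatas.clone(),
    documents: Some(documents.clone()),
    embeddings: Some(embeddings.clone()),
};
loop {
    // Clone per attempt only because upsert takes ownership.
    match collection.upsert(entries.clone(), None).await {
        Ok(_) => break,
        Err(err) if format!("{err:?}").contains("429") => {
            tokio::time::sleep(delay).await;
            delay = std::cmp::min(delay * 2, max_delay);
        }
        Err(err) => return Err(err.into()),
    }
}
```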
```rust
tokio::time::sleep(delay).await;

// Exponential backoff with max delay
```
[BestPractice] Consider using `rand` to add jitter to your backoff delay to prevent thundering herd problems when multiple clients retry at the same time:

```diff
- // Exponential backoff with max delay
+ // Exponential backoff with jitter and max delay
+ let jitter = rand::random::<f32>() * delay.as_millis() as f32 * 0.1;
+ let jittered_delay = delay.checked_add(std::time::Duration::from_millis(jitter as u64)).unwrap_or(delay);
+ delay = std::cmp::min(delay * 2, max_delay);
```
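One way to package that suggestion is a small helper that returns both the jittered sleep and the next base delay; the 10% jitter factor follows the snippet above, and `rand` would be an added dependency:

```rust
use std::{cmp, time::Duration};

// Returns (how long to sleep now, next base delay).
fn backoff_step(delay: Duration, max_delay: Duration) -> (Duration, Duration) {
    let jitter_ms = rand::random::<f32>() * delay.as_millis() as f32 * 0.1;
    let sleep_for = delay + Duration::from_millis(jitter_ms as u64);
    let next = cmp::min(delay * 2, max_delay);
    (sleep_for, next)
}

// Usage inside the retry loop:
// let (sleep_for, next) = backoff_step(delay, max_delay);
// tokio::time::sleep(sleep_for).await;
// delay = next;
```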
Force-pushed from 38abc51 to 1e6a027.
Test plan
Tested locally in Tilt. Verified that the cardinality of the test data set continues to grow even after hitting rate-limit errors on previous upsert attempts. Also verified that the cardinality is reset to 0 when initializing a new workload.