-
Notifications
You must be signed in to change notification settings - Fork 1.6k
[CLN] Rename $matches to $regex #4506
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Reviewer ChecklistPlease leverage this checklist to ensure your code review is thorough before approving Testing, Bugs, Errors, Logs, Documentation
System Compatibility
Quality
|
This stack of pull requests is managed by Graphite. Learn more about stacking. |
Rename $matches to $regex for Improved Clarity and Optimized Performance This PR renames document operators from $matches/$not_matches to $regex/$not_regex for better clarity and semantic meaning. It also improves regex matching performance through optimized implementation using HashSet instead of RoaringBitmap, and adds proper validation for regex patterns using a literal expression check. Key Changes: Affected Areas: Potential Impact: Functionality: Functionality remains the same but with improved naming for better semantic understanding. API consumers will need to update from $matches to $regex in their queries. Performance: Significant performance improvement in regex matching by using HashSet instead of RoaringBitmap, which reduces overhead in handling matching positions. Security: Improved validation for regex patterns helps prevent overly permissive patterns that could cause performance issues. Scalability: Better performance in regex matching improves scalability for document search operations, especially for large collections. Review Focus: Testing Needed• Verify regex matching still works correctly with the renamed operators Code Quality Assessmentrust/types/src/regex/literal_expr.rs: Added a useful method to check for ngram literals with parametrized max width rust/worker/src/execution/operators/filter.rs: Improved filtering threshold logic to be proportional to collection size rust/index/src/fulltext/types.rs: Simplified implementation by avoiding unnecessary copying of position data Best PracticesNaming: Performance: Code Organization: Potential Issues• Breaking change for existing clients using $matches/$not_matches operators This summary was automatically generated by @propel-code-bot |
b5ba11c
to
f77fb3d
Compare
f2751c5
to
c89f762
Compare
f77fb3d
to
e157e69
Compare
c89f762
to
6902fa0
Compare
e157e69
to
75486e3
Compare
6902fa0
to
e13f8df
Compare
75486e3
to
a32ca6b
Compare
## Description of changes _Summarize the changes made by this PR._ - Improvements & Bug fixes - Renames `$matches` and `$not_matches` to `$regex` and `$not_regex` - New functionality - N/A ## Test plan _How are these changes tested?_ - [ ] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes _Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs section](https://github.com/chroma-core/chroma/tree/main/docs/docs.trychroma.com)?_
Description of changes
Summarize the changes made by this PR.
$matches
and$not_matches
to$regex
and$not_regex
Test plan
How are these changes tested?
pytest
for python,yarn test
for js,cargo test
for rustDocumentation Changes
Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?