Bug: unsigned uint8 misbehaves when building an index #595
Comments
@liquidcarbon hey! Try explicitly setting the preferred metric and internal representation type in the constructor of the index 🤗
Is it the same for types like f32 and f16?
Yes
Additional context: I have a large Parquet dataset with a vector column written as 1024-dim np.uint8 vectors, of which typically around 50-100 entries are non-zero. I was trying to build an index with usearch, and the search results didn't make sense. Then I noticed that only a few (under 10) non-zero values remained in the vectors stored in the index. Amazon Linux 2023.6.20241010; r7i-large instance, if this helps.
The reason for uint8 was to use feature counts; I have no intuition whether using counts is any better than using bits (seems to be the go-to method). But I figured one can always turn uint counts to bits, but not the other way around.
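The one-way nature of that conversion can be sketched with plain NumPy. The vector below is a made-up 8-dim example, not data from the issue; the same calls apply to 1024-dim vectors, which pack into 128 bytes.

```python
import numpy as np

# Hypothetical feature counts for a tiny 8-dim vector.
counts = np.array([0, 3, 0, 0, 1, 0, 200, 0], dtype=np.uint8)

# Presence mask: 1 wherever the feature occurred at least once.
bits = (counts > 0).astype(np.uint8)

# Pack 8 flags into each byte (most significant bit first).
packed = np.packbits(bits)

# Unpacking restores the mask, but the original counts are gone.
restored = np.unpackbits(packed)[: counts.size]

print(bits.tolist())    # [0, 1, 0, 0, 1, 0, 1, 0]
print(packed.tolist())  # [74]
```

Keeping the counts on disk and deriving bits on demand therefore preserves both options.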
Fun fact:
That's a good hint, @liquidcarbon! The
I'll take a look, but if the root cause is on the C side I must bow out :)
I'll take over the C patches, but having it covered with tests on the Python side will be a good starting point for me. Thanks, @liquidcarbon!
Describe the bug
Why do the index and distance calculations become all zeroes?
Steps to reproduce
Expected behavior
If you do this with DuckDB:
USearch version
2.17.7
Operating System
Amazon Linux
Hardware architecture
x86
Which interface are you using?
Python bindings
Contact Details
No response
Are you open to being tagged as a contributor?
.git history as a contributor
Is there an existing issue for this?
Code of Conduct