-
Notifications
You must be signed in to change notification settings - Fork 25.4k
Add a flat vector format for bfloat16 vector storage #132533
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: lucene_snapshot
Are you sure you want to change the base?
Conversation
bbq_flat float32:
bbq_flat bfloat16:
bbq_hnsw float32:
bbq_hnsw bfloat16:
|
From esbench runs, around a 5% performance drop, for a 40-50% reduction in disk usage |
public static float bFloat16ToFloat(short bf) { | ||
return Float.intBitsToFloat(bf << 16); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to worry about the short
being signed? Since we are just storing the bits, do we have to do a Short.toUnsignedInt
call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so, as its a left shift, so there's no sign-extension to worry about
import java.nio.ByteOrder; | ||
import java.nio.ShortBuffer; | ||
|
||
class BFloat16 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be a silly question mostly for my own knowledge. Why not use jdk.incubator.vector.Float16? Just because it's in incubator and we can pretty easily write / maintain our own?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is the IEEE 16-bit float, which is different to bfloat16. IEEE 16-bit has less range than 32-bit floats. BFloat16 has the same range as 32-bit, but less precision - https://en.wikipedia.org/wiki/Bfloat16_floating-point_format
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@john-wagster utilizing Float16
pollutes our type space with something that actually isn't supported. It also does boxing.
I think there will be space for panamavector when it comes to scoring between two bfloat16 arrays as right now this PR decodes to float, then compares them, which is not only an extra step, but also loses any potential speed improvements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this PR is here for the disk space and memory usage reductions. Passing round the bfloat16 values directly as short[]
is a much more involved piece of work, and you would still need to convert to 32-bit floats at some point as panama doesn't support working with 16-bit bfloat16s.
I wonder @thecoop if it would be better to add a Since you are adding a new flat vector format, that means we could add it to ALL our formats almost for "free". Then for diskbbq, hnsw_bbq, etc. when choosing the element_type, we can adjust the higher level format, which just calls the same readers/writers, but with a different flat vector format. |
I'm not sure how that would work with the lucene I have a separate PR that adds |
looks like merge times went way up too? around 50% more time in merge if I'm reading this right. |
We don't need to do anything with Lucene. Our element types don't directly translate. We can just tell Lucene whatever we want (maybe just float). our format will just know that |
@thecoop see what I did for |
yup, but it actually showed a decrease in merge time in esbench runs. Needs a closer check. |
I wonder if its a difference of incremental merge times vs forcemerge time. We may just be copying fewer bytes and the reduction in overall off-heap memory requirements might off-set the impact of requiring to decode the floats for regular merges. |
I've defined a new |
0fa437b
to
175cd80
Compare
…esql/40_unsupported_types/nested declared inline} elastic#126606
175cd80
to
f979e89
Compare
// write vector values | ||
long vectorDataOffset = vectorData.alignFilePointer(Float.BYTES); | ||
switch (fieldData.fieldInfo.getVectorEncoding()) { | ||
case BYTE -> writeByteVectors(fieldData); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since now we are creating this with element_type
, we should just throw if byte
is used when creating a field and remove all the byte support here on the writer & on the reader.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, quite right - done. This code obviously needs a proper pass-through after prototyping.
@thecoop I am planning to take a look at this on behalf of the Kibana Security team. Could you point me to changes that are relevant for our team, or anywhere specific that you'd like us to focus on? |
This was done as part of spacetime, and it's still WIP. I'm not sure why you were automatically pinged by the bots for this PR. |
@thecoop Ah, ok. I am going to to remove us for now. If you want us to take a look if/when it is ready for review/merge let us know. |
Add a bfloat16 flat vector storage format for bbq indices. This needs to be selected by specifying
element_type: bfloat16
index option, currently enabled forbbq_flat
andbbq_hnsw
. This only changes the bytes stored on disk, vectors are still processed in-memory asfloat[]
.