Skip to content

libexpr: Reduce the size of Value down to 16 bytes (on 64 bit systems) #13407

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

xokdvium
Copy link
Contributor

Motivation

This shaves off a very significant (~20% percent reduction in maximum heap size and ~17% in total bytes) amount of memory used for evaluation as well as reduces the GC-managed heap. See the results below. The fraction of collected memory is not affected.

Memory usage comparison

Below are the comparisons for NIX_SHOW_STATS=1 nix-env --query --available --out-path --file ../nixpkgs --eval-system x86_64-linux > /dev/null

Before

{
  ....
  "gc": {
    "cycles": 14,
    "heapSize": 14818668544,
    "totalBytes": 27457473440
  },
  ....
  "sizes": {
    "Attr": 16,
    "Bindings": 16,
    "Env": 8,
    "Value": 24
  },
  ....
  "values": {
    "bytes": 7345779024,
    "number": 306074126
  }
}

After

{
  ....
  "gc": {
    "cycles": 14,
    "heapSize": 11916210176,
    "totalBytes": 22553098160
  },
  ....
  "sizes": {
    "Attr": 16,
    "Bindings": 16,
    "Env": 8,
    "Value": 16
  },
  ....
  "values": {
    "bytes": 4897186016,
    "number": 306074126
  }
}

So the saving are around 2.45 GiB on a full nixpkgs eval.

Previously the union discriminator (InternalType) was
stored as a separate field in the Value, which takes up
whole 8 bytes due to padding needed for member alignment.
This effectively wasted 7 whole bytes of memory. Instead
of doing that InternalType is instead packed into pointer
alignment niches. As it turns out, there's more than enough
unused bits there for the bit packing to be effective.

See the doxygen comment in the ValueStorage specialization
for more details.

This does not add any performance overhead, even though
we now consistently assert the InternalType in all getters.

This can also be made atomic with a double width compare-and-swap
instruction on x86_64 (CMPXCHG16B instruction) for parallel evaluation.

Context

This is the best we can do with the current data model to address #8621.
The improvement is very significant and should give more leeway to the ever growing nixpkgs.


Add 👍 to pull requests you find important.

The Nix maintainer team uses a GitHub project board to schedule and track reviews.

@github-actions github-actions bot added new-cli Relating to the "nix" command c api Nix as a C library with a stable interface labels Jun 28, 2025
@xokdvium xokdvium changed the title libexpr: Reduce the size of Value down to 16 bytes libexpr: Reduce the size of Value down to 16 bytes (on 64 bit systems) Jun 28, 2025
Copy link
Member

@Mic92 Mic92 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool stuff. Have only went over briefly. But I am interested in a review from @NaN-git

xokdvium added 3 commits June 28, 2025 15:30
The following commits will touch this file significantly, so
it's better to get the formatting out of the way first.
@@ -50,11 +50,11 @@ void printAmbiguous(
break;
}
case nList:
if (seen && v.listSize() && !seen->insert(v.listElems()).second)
if (seen && v.listSize() && !seen->insert(&v /* FIXME: What to do, pointer to elements is not stable. */).second)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previosly this code used a pointer to the listElements (either stored inline or in a big list on the heap). Since now inline values are tagged we have to copy them to the listView, so the pointer to elements would point to the stack.

For now I've changed this to a Value * instead (that is actually the same pointer, since the smallList is the first element in a Value structure). This should work fine, but maybe I'm missing something. @roberth do you think this is ok to do?

If there are no objections I'll remove the FIXME.

xokdvium added 3 commits June 28, 2025 16:18
…alueStorage`

This factors out most of the value representation into a mixin class.
`finishValue` is now gone for good and replaced with a simple template
function `setStorage` which derives the type information/disriminator from
the type of the argument. Likewise, reading of the value goes through function
template `getStorage`.

An empty type `Null` is introduced to make the bijection InternalType <-> C++ type
complete.
This shaves off a very significand amount of memory used
for evaluation as well as reduces the GC-managed heap.

Previously the union discriminator (InternalType) was
stored as a separate field in the Value, which takes up
whole 8 bytes due to padding needed for member alignment.
This effectively wasted 7 whole bytes of memory. Instead
of doing that InternalType is instead packed into pointer
alignment niches. As it turns out, there's more than enough
unused bits there for the bit packing to be effective.

See the doxygen comment in the ValueStorage specialization
for more details.

This does not add any performance overhead, even though
we now consistently assert the InternalType in all getters.

This can also be made atomic with a double width compare-and-swap
instruction on x86_64 (CMPXCHG16B instruction) for parallel evaluation.
@xokdvium
Copy link
Contributor Author

Also fixed i686-linux builds. Needed some SFINAE trickery to make the compiler not instantiate ValueStorage member functions that would be ill-formed on a non-64 bit system.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c api Nix as a C library with a stable interface new-cli Relating to the "nix" command
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants