Maximum number of nodes exceeded in constant #120173

juntyr · 2024-01-20T20:05:57Z

In

rust/compiler/rustc_const_eval/src/const_eval/mod.rs

Line 24 in 1828461

const VALTREE_MAX_NODES: usize = 100000;

the VALTREE_MAX_NODES is hardcoded. I managed to hit this limit with the rough following setup, producing the "Maximum number of nodes exceeded in constant" error:

https://docs.rs/const-type-layout/latest/const_type_layout/fn.serialise_type_graph.html is a const fn generating a byte array that describes the layout of a type
I define the following helper trait to check whether two byte arrays are the same:

pub trait Same<const A: &'static [u8], const B: &'static [u8]> {}

impl<const AB: &'static [u8]> Same<AB, AB> for () {}

pub const fn check<const A: &'static [u8], const B: &'static [u8]>() where (): Same<A, B> {}

I define the following const to assert that the produced const byte array matches a byte string (doing so in a const fn also works but taking a while and crashes cargo check in my GitHub CI with exit code 143)

const _: () = check::<&{serialise_type_graph::<MyVeryComplexType>()}, b"my-very-long-byte-string">();

Is there a way to rewrite my code to not this limit as easily, would it be possible to make it configurable with an unstable attribute similar to #![recursion_limit], or is there an even better way to check that two byte arrays are the same at compile time (where one is built from a very complex constant)?

Thank you so much for your help!

The text was updated successfully, but these errors were encountered:

oli-obk · 2024-01-20T22:30:45Z

3. doing so in a const fn also works but taking a while and crashes cargo check in my GitHub CI with exit code 143)

Do you still have the code somewhere for that const fn? Even for 10k bytes it shouldn't take overly long

juntyr · 2024-01-20T22:43:41Z

I tried to minimise the error and got down to the following playground:

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=8bb81ea05705676e7b5b58be1b48c9e7

It seems that just having a const byte array of a large size is enough to trigger the error - it seems that every element of the array is counted as a node? Amusingly, changing the length to a slightly smaller value makes the compiler on the playground crash, and changing it to a small value makes everything compile without any errors or warnings about comparing byte slices.

juntyr · 2024-01-20T22:49:38Z

3. doing so in a const fn also works but taking a while and crashes cargo check in my GitHub CI with exit code 143)

Do you still have the code somewhere for that const fn? Even for 10k bytes it shouldn't take overly long

Yes - here's the complete module that does the equality check. One single comparison probably only takes on the order of a minute, but I am somehow crazy enough to do that 48 times for type checking ... so even small changes drastically change my compilation time.

use const_type_layout::{serialise_type_graph, serialised_type_graph_len, TypeGraphLayout};

#[derive(PartialEq, Eq, core::marker::ConstParamTy)]
pub enum HostAndDeviceKernelSignatureTypeLayout {
    Match,
    Mismatch,
}

pub struct Assert<const MATCH: HostAndDeviceKernelSignatureTypeLayout>;

#[must_use]
pub const fn check<T: TypeGraphLayout>(
    device: &'static [u8],
) -> HostAndDeviceKernelSignatureTypeLayout
where
    [u8; serialised_type_graph_len::<T>()]:,
{
    const SIGNATURE_ERROR_MESSAGE: &[u8] = b"ERROR in this PTX compilation";

    // Short-circuit to avoid extra errors when PTX compilation fails
    if equals(device, SIGNATURE_ERROR_MESSAGE) {
        return HostAndDeviceKernelSignatureTypeLayout::Match;
    }

    let host = serialise_type_graph::<T>();

    if equals(device, &host) {
        HostAndDeviceKernelSignatureTypeLayout::Match
    } else {
        HostAndDeviceKernelSignatureTypeLayout::Mismatch
    }
}

const fn equals(device: &[u8], host: &[u8]) -> bool {
    if host.len() != device.len() {
        return false;
    }

    let mut i = 0;

    while i < host.len() {
        if host[i] != device[i] {
            return false;
        }

        i += 1;
    }

    true
}

For context, this code is run to check that my type layout description, serialised into a byte array, is the same when compiling a CUDA kernel as when compiling the host code.

juntyr · 2024-01-20T22:52:34Z

From what I can guesstimate right now, constants can be quite large if I mainly materialise them inside const fns that just take a long while to run. Forcing the compiler to do a byte array comparison for me, e.g. by using the const generic trait trick or using match against a literal seems to run a lot faster but runs against the valtree size limit.

oli-obk · 2024-01-21T11:14:30Z

One single comparison probably only takes on the order of a minute,

How large is the specific device string that you are testing here?

oli-obk · 2024-01-21T11:17:25Z

Amusingly, changing the length to a slightly smaller value makes the compiler on the playground crash,

Yea i expect our limit to be too large. We should probably try to lower it.

The issue is that you'll end up with symbol names that contain all bytes of the constant, encoded in an inefficient way. While we could probably special case byte strings (I think we special case strings already), the root issue persists: LLVM and the linker do not like large symbol names

Though in this specific case it looks like a compiler panic or potentially a stack overflow

juntyr · 2024-01-21T15:57:33Z

The issue is that you'll end up with symbol names that contain all bytes of the constant, encoded in an inefficient way. While we could probably special case byte strings (I think we special case strings already), the root issue persists: LLVM and the linker do not like large symbol names

Hmmm, why is the const given a symbol name that includes all bytes? In my use case, all materialisations of the long byte strings happen either as variables inside const fns or parameters to const fns that feed into const generic params (as in const _: () = check::<&{serialise_type_graph::<MyVeryComplexType>()}, b"my-very-long-byte-string">();) inside trait function impl bodies. As far as I know, none of these constants is exposed as a pub const.

I tested a few more variations of the issue. If I use

match serialise_type_graph::<MyVeryComplexType>() {
    b"my-very-long-byte-string" => ...,
    _ => ...,
}

inside a const, cargo check inside CI crashes very soon after generating all byte strings inside a proc macro.

If I instead use core::intrinsics::compare_bytes (which I picked since it seems like const eval does the comparison on the memory itself, unlike my byte-loop that needs to be interpreted), cargo check spends another 30min doing something before crashing with exit code 143.

So perhaps I am indeed reaching some memory limit, just at different speeds with different const bytestring comparison codepaths.

oli-obk · 2024-01-22T12:39:37Z

Hmmm, why is the const given a symbol name that includes all bytes? In my use case, all materialisations of the long byte strings happen either as variables inside const fns or parameters to const fns that feed into const generic params (as in const _: () = check::<&{serialise_type_graph::<MyVeryComplexType>()}, b"my-very-long-byte-string">();) inside trait function impl bodies. As far as I know, none of these constants is exposed as a pub const.

yea in this case that won't happen, only if you end up with the constant in a signature that actually goes to codegen. We don't want to have code that is only allowed in const eval, but not outside of it, so we use the same limits for it. If you ran the const's body in fn main instead, then the serialize_type_graph would get monomorphized and sent to codegen

due to #119214 runtime code is in fact generated for your constants, but that's a bug obviously

oli-obk · 2024-01-22T14:47:52Z

I did some debugging: Looks like we're recursing very very deeply in compute_exhaustiveness_and_usefulness:

┐rustc_pattern_analysis::usefulness::compute_match_usefulness scrut_ty=&ReErased [u8], scrut_validity=ValidOnly
├─┐rustc_pattern_analysis::usefulness::compute_exhaustiveness_and_usefulness matrix=
│ + &[0..1, 0..1, 0..1, 0..1, 0..1, 0..1, 0..1, 0..1, 0..1] +
│ + _ : &ReErased [u8]                                      +
│ | ✓                                                       | // column validity
│ 
│ ├─  0ms DEBUG rustc_pattern_analysis::usefulness ty: &ReErased [u8]
│ ├─┐rustc_pattern_analysis::rustc::ctors_for_ty ty=&ReErased [u8]
│ │ ├─  0ms TRACE rustc_middle::ty::visit has_type_flags, self=&ReErased [u8], flags=TypeFlags(HAS_ERROR), res=false
│ │ ├─  0ms DEBUG rustc_pattern_analysis::rustc return=Ok(Ref)
│ ├─┘
│ ├─┐rustc_pattern_analysis::constructor::split 
│ │ ├─  0ms DEBUG rustc_pattern_analysis::constructor return=SplitConstructorSet { present: [Ref], missing: [], missing_empty: [] }
│ ├─┘
│ ├─  0ms DEBUG rustc_pattern_analysis::usefulness specialize(Ref)
│ ├─┐rustc_pattern_analysis::rustc::ctor_sub_tys ctor=Ref, ty=&ReErased [u8]
│ ├─┘
│ ├─┐rustc_pattern_analysis::usefulness::compute_exhaustiveness_and_usefulness matrix=
│ │ + [0..1, 0..1, 0..1, 0..1, 0..1, 0..1, 0..1, 0..1, 0..1] +
│ │ + _                                                      +
│ │ | ?                                                      | // column validity
│ │ 
│ │ ├─  0ms DEBUG rustc_pattern_analysis::usefulness ty: [u8]
│ │ ├─┐rustc_pattern_analysis::rustc::ctors_for_ty ty=[u8]
│ │ │ ├─  0ms TRACE rustc_middle::ty::visit has_type_flags, self=[u8], flags=TypeFlags(HAS_ERROR), res=false
│ │ │ ├─┐rustc_middle::ty::inhabitedness::inhabited_predicate self=u8
│ │ │ │ ├─  0ms TRACE rustc_middle::ty::visit has_type_flags, self=u8, flags=TypeFlags(HAS_TY_INFER | HAS_RE_INFER | HAS_CT_INFER), res=false
│ │ │ │ ├─  0ms DEBUG rustc_middle::ty::inhabitedness return=True
│ │ │ ├─┘
│ │ │ ├─┐rustc_middle::ty::inhabitedness::inhabited_predicate::apply_inner self=True, eval_stack=[]
│ │ │ │ ├─  0ms DEBUG rustc_middle::ty::inhabitedness::inhabited_predicate return=Ok(true)
│ │ │ ├─┘
│ │ │ ├─  0ms DEBUG rustc_pattern_analysis::rustc return=Ok(Slice { array_len: None, subtype_is_empty: false })
│ │ ├─┘
│ │ ├─┐rustc_pattern_analysis::constructor::split 
│ │ │ ├─  0ms DEBUG rustc_pattern_analysis::constructor return=SplitConstructorSet { present: [Slice(Slice { array_len: None, kind: FixedLen(9) })], missing: [Slice(Slice { array_len: None, kind: FixedLen(0) }), Slice(Slice { array_len: None, kind: FixedLen(1) }), Slice(Slice { array_len: None, kind: FixedLen(2) }), Slice(Slice { array_len: None, kind: FixedLen(3) }), Slice(Slice { array_len: None, kind: FixedLen(4) }), Slice(Slice { array_len: None, kind: FixedLen(5) }), Slice(Slice { array_len: None, kind: FixedLen(6) }), Slice(Slice { array_len: None, kind: FixedLen(7) }), Slice(Slice { array_len: None, kind: FixedLen(8) }), Slice(Slice { array_len: None, kind: VarLen(10, 0) })], missing_empty: [] }
│ │ ├─┘
│ │ ├─  0ms DEBUG rustc_pattern_analysis::usefulness specialize(Slice(Slice { array_len: None, kind: FixedLen(9) }))
│ │ ├─┐rustc_pattern_analysis::rustc::ctor_sub_tys ctor=Slice(Slice { array_len: None, kind: FixedLen(9) }), ty=[u8]
│ │ ├─┘
│ │ ├─┐rustc_pattern_analysis::usefulness::compute_exhaustiveness_and_usefulness matrix=
│ │ │ + 0..1 + 0..1 + 0..1 + 0..1 + 0..1 + 0..1 + 0..1 + 0..1 + 0..1 +
│ │ │ + _    + _    + _    + _    + _    + _    + _    + _    + _    +
│ │ │ | ?    | ?    | ?    | ?    | ?    | ?    | ?    | ?    | ?    | // column validity
│ │ │ 
│ │ │ ├─  0ms DEBUG rustc_pattern_analysis::usefulness ty: u8
│ │ │ ├─┐rustc_pattern_analysis::rustc::ctors_for_ty ty=u8
│ │ │ │ ├─  0ms TRACE rustc_middle::ty::visit has_type_flags, self=u8, flags=TypeFlags(HAS_ERROR), res=false
│ │ │ │ ├─  0ms DEBUG rustc_pattern_analysis::rustc return=Ok(Integers { range_1: 0..256, range_2: None })
│ │ │ ├─┘
│ │ │ ├─┐rustc_pattern_analysis::constructor::split 
│ │ │ │ ├─  0ms DEBUG rustc_pattern_analysis::constructor return=SplitConstructorSet { present: [IntRange(0..1)], missing: [IntRange(1..256)], missing_empty: [] }
│ │ │ ├─┘
│ │ │ ├─  0ms DEBUG rustc_pattern_analysis::usefulness specialize(IntRange(0..1))
│ │ │ ├─┐rustc_pattern_analysis::rustc::ctor_sub_tys ctor=IntRange(0..1), ty=u8
│ │ │ ├─┘
│ │ │ ├─┐rustc_pattern_analysis::usefulness::compute_exhaustiveness_and_usefulness matrix=
│ │ │ │ + 0..1 + 0..1 + 0..1 + 0..1 + 0..1 + 0..1 + 0..1 + 0..1 +
│ │ │ │ + _    + _    + _    + _    + _    + _    + _    + _    +
│ │ │ │ | ?    | ?    | ?    | ?    | ?    | ?    | ?    | ?    | // column validity
│ │ │ │ 
│ │ │ │ ├─  0ms DEBUG rustc_pattern_analysis::usefulness ty: u8
│ │ │ │ ├─┐rustc_pattern_analysis::rustc::ctors_for_ty ty=u8
│ │ │ │ │ ├─  0ms TRACE rustc_middle::ty::visit has_type_flags, self=u8, flags=TypeFlags(HAS_ERROR), res=false
│ │ │ │ │ ├─  0ms DEBUG rustc_pattern_analysis::rustc return=Ok(Integers { range_1: 0..256, range_2: None })
│ │ │ │ ├─┘
│ │ │ │ ├─┐rustc_pattern_analysis::constructor::split 
│ │ │ │ │ ├─  0ms DEBUG rustc_pattern_analysis::constructor return=SplitConstructorSet { present: [IntRange(0..1)], missing: [IntRange(1..256)], missing_empty: [] }
│ │ │ │ ├─┘
│ │ │ │ ├─  0ms DEBUG rustc_pattern_analysis::usefulness specialize(IntRange(0..1))
│ │ │ │ ├─┐rustc_pattern_analysis::rustc::ctor_sub_tys ctor=IntRange(0..1), ty=u8
│ │ │ │ ├─┘
│ │ │ │ ├─┐rustc_pattern_analysis::usefulness::compute_exhaustiveness_and_usefulness matrix=

oli-obk · 2024-01-22T14:54:10Z

cc @Nadrieril do you have some idea why ctor specialization would give us a single case per element in a slice? Each recursion shrinks the passed matrix by one element.

Nadrieril · 2024-01-22T15:07:59Z

I presume this is the match-checking of the matches!(a, A) from above? The constant A expands to a pattern that's literally [0, 0, 0, 0, 0, ....], which we then check for reachability and exhaustiveness. There are currently no shortcuts that would figure out "yeah that's reachable and there's a wildcard so it's exhaustive" unfortunately, so we do the normal checking which goes field-by-field.

Nadrieril · 2024-01-22T15:10:34Z

Note that this is different from what we'd to with a variable-length slice pattern like [0, .., 0]. There we only have to check 3 fields. In your case though there are 1000 explicit fields and I don't know of a good way to detect that we can skip through them.

juntyr · 2024-01-30T14:36:56Z

One single comparison probably only takes on the order of a minute,

How large is the specific device string that you are testing here?

Sorry for the late reply - the byte strings are all between 100kB and 200kB long. I trying out some domain-specific compression right now (much of this length comes from repeated instances of core::any::type_names, so I'm testing some simple de-duplication, though this almost triples my const eval time in producing the byte string) to see if that is enough to get my code to compile again without making rustc error or hang or crash.

juntyr · 2024-01-30T14:38:24Z

Note that this is different from what we'd to with a variable-length slice pattern like [0, .., 0]. There we only have to check 3 fields. In your case though there are 1000 explicit fields and I don't know of a good way to detect that we can skip through them.

Thank you for that explanation - I hadn't thought of that but it makes total sense now that a match statement needs to be checked for reachability and exhaustiveness.

rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Jan 20, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Maximum number of nodes exceeded in constant #120173

Maximum number of nodes exceeded in constant #120173

juntyr commented Jan 20, 2024 •

edited

Loading

oli-obk commented Jan 20, 2024

Uh oh!

juntyr commented Jan 20, 2024

Uh oh!

juntyr commented Jan 20, 2024

Uh oh!

juntyr commented Jan 20, 2024

Uh oh!

oli-obk commented Jan 21, 2024

Uh oh!

oli-obk commented Jan 21, 2024 •

edited

Loading

Uh oh!

juntyr commented Jan 21, 2024

Uh oh!

oli-obk commented Jan 22, 2024

Uh oh!

oli-obk commented Jan 22, 2024 •

edited

Loading

Uh oh!

oli-obk commented Jan 22, 2024 •

edited

Loading

Uh oh!

Nadrieril commented Jan 22, 2024

Uh oh!

Nadrieril commented Jan 22, 2024 •

edited

Loading

Uh oh!

juntyr commented Jan 30, 2024

Uh oh!

juntyr commented Jan 30, 2024

Uh oh!

Maximum number of nodes exceeded in constant #120173

Maximum number of nodes exceeded in constant #120173

Comments

juntyr commented Jan 20, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

oli-obk commented Jan 20, 2024

Uh oh!

juntyr commented Jan 20, 2024

Uh oh!

juntyr commented Jan 20, 2024

Uh oh!

juntyr commented Jan 20, 2024

Uh oh!

oli-obk commented Jan 21, 2024

Uh oh!

oli-obk commented Jan 21, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

juntyr commented Jan 21, 2024

Uh oh!

oli-obk commented Jan 22, 2024

Uh oh!

oli-obk commented Jan 22, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

oli-obk commented Jan 22, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Nadrieril commented Jan 22, 2024

Uh oh!

Nadrieril commented Jan 22, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

juntyr commented Jan 30, 2024

Uh oh!

juntyr commented Jan 30, 2024

Uh oh!

juntyr commented Jan 20, 2024 •

edited

Loading

oli-obk commented Jan 21, 2024 •

edited

Loading

oli-obk commented Jan 22, 2024 •

edited

Loading

oli-obk commented Jan 22, 2024 •

edited

Loading

Nadrieril commented Jan 22, 2024 •

edited

Loading