Skip to content

LangRef: storing poison in memory is UB #141339

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

RalfJung
Copy link
Contributor

Based on discussion with @nikic. Cc @nunoplopes.

I'm not happy that storing poison would be UB -- there are some potential Rust features we cannot implement due to this -- but if that's how things currently are, it's better to document them.

This also means that reg2mem is technically unsound since not all SSA values can actually be stored in memory; that pass should add freeze to avoid storing poison. Should I file an issue for that? (Unfortunately I won't be able to try and fix this myself.)

@llvmbot
Copy link
Member

llvmbot commented May 24, 2025

@llvm/pr-subscribers-llvm-ir

Author: Ralf Jung (RalfJung)

Changes

Based on discussion with @nikic. Cc @nunoplopes.

I'm not happy that storing poison would be UB -- there are some potential Rust features we cannot implement due to this -- but if that's how things currently are, it's better to document them.

This also means that reg2mem is technically unsound since not all SSA values can actually be stored in memory; that pass should add freeze to avoid storing poison. Should I file an issue for that? (Unfortunately I won't be able to try and fix this myself.)


Full diff: https://github.com/llvm/llvm-project/pull/141339.diff

1 Files Affected:

  • (modified) llvm/docs/LangRef.rst (+6)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index ad0755e1531df..242471cf7f5a4 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -11433,6 +11433,12 @@ If ``<value>`` is of aggregate type, padding is filled with
 :ref:`undef <undefvalues>`.
 If ``<pointer>`` is not a well-defined value, the behavior is undefined.
 
+If ``<value>`` is poison, then behavior currently is effectively undefined due
+to the fact that many LLVM passes assume they can do load widening, and having
+poison in memory would "infect" the entire widened load. This can hopefully be
+alleviated in the future, but until then, frontends should refrain from
+generating code that stores poison values in memory.
+
 Example:
 """"""""
 

@nunoplopes
Copy link
Member

I'm totally against this. Widening in IR is a problem, yes, but this is not the way to go. Load widening can be solved by using vector loads or struct loads (if the new size is not a multiple of the old one).

There are implications about making stores UB, such as making movement of stores and function calls even harder.

@RalfJung
Copy link
Contributor Author

I don't like it either -- I just think if that's the reality right now, we should at least write it down. Currently there is no way for frontends to know that they are not supposed to put poison into memory; I was extremely surprised when I first heard about this.

I guess another way to frame this would be to file an issue saying that putting poison in memory can lead to surprising issues, and referencing this in the LangRef as a known issue without making it look like intended behavior? However, from what @nikic told me, this is apparently quite hard to fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants