-
Notifications
You must be signed in to change notification settings - Fork 1.9k
libspl: nice assertion output #16140
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
197165e
to
8be04d4
Compare
Looking into the FreeBSD 13 build failure at the moment. Possibly a bug in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very cool. It'll be nice to have more useful debug information along with the ASSERT.
It looks like the build failure on FreeBSD just needs to be sorted out.
8be04d4
to
f9cba1b
Compare
The "not found" path is attempting to clear SOMELIB_CFLAGS and SOMELIB_LIBS by resetting them in AC_SUBST(). However, the second arg to AC_SUBST is expanded in autoconf with `m4_ifvaln([$2], [[$1]=$2])`, which is defined as "if the first arg is non-empty". The m4 "empty" construction is [], therefore, the existing AC_SUBST calls never modify the variables at all. The effect of this is that leftovers from the library test can leak out. At least, if a library header is found in the first stage, but the library itself is not, -lsomelib is added to SOMELIB_LIBS and further tests done. If that library is not found, SOMELIB_LIBS will not be cleared. For most of our library tests this hasn't been a problem, as they're either always found properly via pkg-config or set directly, or the calling test immediately aborts configure. For an optional dependency however, an apparent "partial" result where the header is found but no corresponding library causes link errors later. I think a complete fix should probably not be setting SOMELIB_xxx until the final result is known, but for now, adjusting the AC_SUBST calls to explictly set the empty shell string (which is not "empty" to m4) at least restores the intent. Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/
Arrange for the thread/task name to be set when new threads are created. This makes them visible in the process table etc. pthread_setname_np() is generally available in glibc, musl and FreeBSD, so no test is required. Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/
Makes it much easier to see what thing complained. Getting thread id, program name and thread name vary wildly between Linux and FreeBSD, so those are set up in macros. pthread_getname_np() did not appear in musl until very recently, but the same info has always been available via prctl(PR_GET_NAME), so we use that instead. Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/
If multiple threads trip an assertion at the same moment (quite common), they can be printing at the same time, and their output gets messy. This adds a simple lock around the whole thing, to prevent a second task printing assert output before the first has finished. Additionally, if libspl_assert_ok is not set, abort() is called without dropping the lock, so that any other asserting tasks will be killed before starting any output, rather than only getting part-way through. This is a tradeoff; it's assumed that multiple threads asserting at the same moment are likely the same fault in different instances of a thread, and so there won't be any more useful information from the other tasks anyway. Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/
Adds a check for the backtrace() function. If available, uses it to show a stack backtrace in the assertion output. Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/
libunwind seems to do a better job of resolving a symbols than backtrace(), and is also useful on platforms that don't have backtrace() (eg musl). If it's available, use it. Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/
f9cba1b
to
4705367
Compare
Last push should take care of it, lets see. The problem was that FreeBSD has @behlendorf 28b110d fixes this. The commit message explains more. I'm 50/50 on a separate PR; it's not really related to this PR, and it could have a wider impact, but it doesn't come up in any normal build that I can imagine, so I'm not sure if more testing/review is really likely/possible. If you'd prefer separate, let me know! |
Thanks for running that down. Let's go ahead and include that change as-is with this PR, with can always follow up with additional cleanup/improvements in another PR. |
Arrange for the thread/task name to be set when new threads are created. This makes them visible in the process table etc. pthread_setname_np() is generally available in glibc, musl and FreeBSD, so no test is required. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16140
Makes it much easier to see what thing complained. Getting thread id, program name and thread name vary wildly between Linux and FreeBSD, so those are set up in macros. pthread_getname_np() did not appear in musl until very recently, but the same info has always been available via prctl(PR_GET_NAME), so we use that instead. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16140
If multiple threads trip an assertion at the same moment (quite common), they can be printing at the same time, and their output gets messy. This adds a simple lock around the whole thing, to prevent a second task printing assert output before the first has finished. Additionally, if libspl_assert_ok is not set, abort() is called without dropping the lock, so that any other asserting tasks will be killed before starting any output, rather than only getting part-way through. This is a tradeoff; it's assumed that multiple threads asserting at the same moment are likely the same fault in different instances of a thread, and so there won't be any more useful information from the other tasks anyway. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16140
Adds a check for the backtrace() function. If available, uses it to show a stack backtrace in the assertion output. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16140
libunwind seems to do a better job of resolving a symbols than backtrace(), and is also useful on platforms that don't have backtrace() (eg musl). If it's available, use it. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16140
The "not found" path is attempting to clear SOMELIB_CFLAGS and SOMELIB_LIBS by resetting them in AC_SUBST(). However, the second arg to AC_SUBST is expanded in autoconf with `m4_ifvaln([$2], [[$1]=$2])`, which is defined as "if the first arg is non-empty". The m4 "empty" construction is [], therefore, the existing AC_SUBST calls never modify the variables at all. The effect of this is that leftovers from the library test can leak out. At least, if a library header is found in the first stage, but the library itself is not, -lsomelib is added to SOMELIB_LIBS and further tests done. If that library is not found, SOMELIB_LIBS will not be cleared. For most of our library tests this hasn't been a problem, as they're either always found properly via pkg-config or set directly, or the calling test immediately aborts configure. For an optional dependency however, an apparent "partial" result where the header is found but no corresponding library causes link errors later. I think a complete fix should probably not be setting SOMELIB_xxx until the final result is known, but for now, adjusting the AC_SUBST calls to explictly set the empty shell string (which is not "empty" to m4) at least restores the intent. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140 (cherry picked from commit 7ac00d3)
Arrange for the thread/task name to be set when new threads are created. This makes them visible in the process table etc. pthread_setname_np() is generally available in glibc, musl and FreeBSD, so no test is required. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140 (cherry picked from commit 4429ad9)
Makes it much easier to see what thing complained. Getting thread id, program name and thread name vary wildly between Linux and FreeBSD, so those are set up in macros. pthread_getname_np() did not appear in musl until very recently, but the same info has always been available via prctl(PR_GET_NAME), so we use that instead. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140 (cherry picked from commit 3948002)
If multiple threads trip an assertion at the same moment (quite common), they can be printing at the same time, and their output gets messy. This adds a simple lock around the whole thing, to prevent a second task printing assert output before the first has finished. Additionally, if libspl_assert_ok is not set, abort() is called without dropping the lock, so that any other asserting tasks will be killed before starting any output, rather than only getting part-way through. This is a tradeoff; it's assumed that multiple threads asserting at the same moment are likely the same fault in different instances of a thread, and so there won't be any more useful information from the other tasks anyway. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140 (cherry picked from commit dec697a)
Adds a check for the backtrace() function. If available, uses it to show a stack backtrace in the assertion output. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140 (cherry picked from commit 2152c40)
libunwind seems to do a better job of resolving a symbols than backtrace(), and is also useful on platforms that don't have backtrace() (eg musl). If it's available, use it. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140 (cherry picked from commit 051460b)
The "not found" path is attempting to clear SOMELIB_CFLAGS and SOMELIB_LIBS by resetting them in AC_SUBST(). However, the second arg to AC_SUBST is expanded in autoconf with `m4_ifvaln([$2], [[$1]=$2])`, which is defined as "if the first arg is non-empty". The m4 "empty" construction is [], therefore, the existing AC_SUBST calls never modify the variables at all. The effect of this is that leftovers from the library test can leak out. At least, if a library header is found in the first stage, but the library itself is not, -lsomelib is added to SOMELIB_LIBS and further tests done. If that library is not found, SOMELIB_LIBS will not be cleared. For most of our library tests this hasn't been a problem, as they're either always found properly via pkg-config or set directly, or the calling test immediately aborts configure. For an optional dependency however, an apparent "partial" result where the header is found but no corresponding library causes link errors later. I think a complete fix should probably not be setting SOMELIB_xxx until the final result is known, but for now, adjusting the AC_SUBST calls to explictly set the empty shell string (which is not "empty" to m4) at least restores the intent. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140 (cherry picked from commit 7ac00d3)
Arrange for the thread/task name to be set when new threads are created. This makes them visible in the process table etc. pthread_setname_np() is generally available in glibc, musl and FreeBSD, so no test is required. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140 (cherry picked from commit 4429ad9)
Makes it much easier to see what thing complained. Getting thread id, program name and thread name vary wildly between Linux and FreeBSD, so those are set up in macros. pthread_getname_np() did not appear in musl until very recently, but the same info has always been available via prctl(PR_GET_NAME), so we use that instead. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140 (cherry picked from commit 3948002)
If multiple threads trip an assertion at the same moment (quite common), they can be printing at the same time, and their output gets messy. This adds a simple lock around the whole thing, to prevent a second task printing assert output before the first has finished. Additionally, if libspl_assert_ok is not set, abort() is called without dropping the lock, so that any other asserting tasks will be killed before starting any output, rather than only getting part-way through. This is a tradeoff; it's assumed that multiple threads asserting at the same moment are likely the same fault in different instances of a thread, and so there won't be any more useful information from the other tasks anyway. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140 (cherry picked from commit dec697a)
Adds a check for the backtrace() function. If available, uses it to show a stack backtrace in the assertion output. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140 (cherry picked from commit 2152c40)
libunwind seems to do a better job of resolving a symbols than backtrace(), and is also useful on platforms that don't have backtrace() (eg musl). If it's available, use it. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140 (cherry picked from commit 051460b)
The "not found" path is attempting to clear SOMELIB_CFLAGS and SOMELIB_LIBS by resetting them in AC_SUBST(). However, the second arg to AC_SUBST is expanded in autoconf with `m4_ifvaln([$2], [[$1]=$2])`, which is defined as "if the first arg is non-empty". The m4 "empty" construction is [], therefore, the existing AC_SUBST calls never modify the variables at all. The effect of this is that leftovers from the library test can leak out. At least, if a library header is found in the first stage, but the library itself is not, -lsomelib is added to SOMELIB_LIBS and further tests done. If that library is not found, SOMELIB_LIBS will not be cleared. For most of our library tests this hasn't been a problem, as they're either always found properly via pkg-config or set directly, or the calling test immediately aborts configure. For an optional dependency however, an apparent "partial" result where the header is found but no corresponding library causes link errors later. I think a complete fix should probably not be setting SOMELIB_xxx until the final result is known, but for now, adjusting the AC_SUBST calls to explictly set the empty shell string (which is not "empty" to m4) at least restores the intent. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140 (cherry picked from commit 7ac00d3)
Arrange for the thread/task name to be set when new threads are created. This makes them visible in the process table etc. pthread_setname_np() is generally available in glibc, musl and FreeBSD, so no test is required. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140 (cherry picked from commit 4429ad9)
Makes it much easier to see what thing complained. Getting thread id, program name and thread name vary wildly between Linux and FreeBSD, so those are set up in macros. pthread_getname_np() did not appear in musl until very recently, but the same info has always been available via prctl(PR_GET_NAME), so we use that instead. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140 (cherry picked from commit 3948002)
If multiple threads trip an assertion at the same moment (quite common), they can be printing at the same time, and their output gets messy. This adds a simple lock around the whole thing, to prevent a second task printing assert output before the first has finished. Additionally, if libspl_assert_ok is not set, abort() is called without dropping the lock, so that any other asserting tasks will be killed before starting any output, rather than only getting part-way through. This is a tradeoff; it's assumed that multiple threads asserting at the same moment are likely the same fault in different instances of a thread, and so there won't be any more useful information from the other tasks anyway. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140 (cherry picked from commit dec697a)
Adds a check for the backtrace() function. If available, uses it to show a stack backtrace in the assertion output. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140 (cherry picked from commit 2152c40)
libunwind seems to do a better job of resolving a symbols than backtrace(), and is also useful on platforms that don't have backtrace() (eg musl). If it's available, use it. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140 (cherry picked from commit 051460b)
The "not found" path is attempting to clear SOMELIB_CFLAGS and SOMELIB_LIBS by resetting them in AC_SUBST(). However, the second arg to AC_SUBST is expanded in autoconf with `m4_ifvaln([$2], [[$1]=$2])`, which is defined as "if the first arg is non-empty". The m4 "empty" construction is [], therefore, the existing AC_SUBST calls never modify the variables at all. The effect of this is that leftovers from the library test can leak out. At least, if a library header is found in the first stage, but the library itself is not, -lsomelib is added to SOMELIB_LIBS and further tests done. If that library is not found, SOMELIB_LIBS will not be cleared. For most of our library tests this hasn't been a problem, as they're either always found properly via pkg-config or set directly, or the calling test immediately aborts configure. For an optional dependency however, an apparent "partial" result where the header is found but no corresponding library causes link errors later. I think a complete fix should probably not be setting SOMELIB_xxx until the final result is known, but for now, adjusting the AC_SUBST calls to explictly set the empty shell string (which is not "empty" to m4) at least restores the intent. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140
Makes it much easier to see what thing complained. Getting thread id, program name and thread name vary wildly between Linux and FreeBSD, so those are set up in macros. pthread_getname_np() did not appear in musl until very recently, but the same info has always been available via prctl(PR_GET_NAME), so we use that instead. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140
If multiple threads trip an assertion at the same moment (quite common), they can be printing at the same time, and their output gets messy. This adds a simple lock around the whole thing, to prevent a second task printing assert output before the first has finished. Additionally, if libspl_assert_ok is not set, abort() is called without dropping the lock, so that any other asserting tasks will be killed before starting any output, rather than only getting part-way through. This is a tradeoff; it's assumed that multiple threads asserting at the same moment are likely the same fault in different instances of a thread, and so there won't be any more useful information from the other tasks anyway. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140
Adds a check for the backtrace() function. If available, uses it to show a stack backtrace in the assertion output. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140
libunwind seems to do a better job of resolving a symbols than backtrace(), and is also useful on platforms that don't have backtrace() (eg musl). If it's available, use it. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140
The "not found" path is attempting to clear SOMELIB_CFLAGS and SOMELIB_LIBS by resetting them in AC_SUBST(). However, the second arg to AC_SUBST is expanded in autoconf with `m4_ifvaln([$2], [[$1]=$2])`, which is defined as "if the first arg is non-empty". The m4 "empty" construction is [], therefore, the existing AC_SUBST calls never modify the variables at all. The effect of this is that leftovers from the library test can leak out. At least, if a library header is found in the first stage, but the library itself is not, -lsomelib is added to SOMELIB_LIBS and further tests done. If that library is not found, SOMELIB_LIBS will not be cleared. For most of our library tests this hasn't been a problem, as they're either always found properly via pkg-config or set directly, or the calling test immediately aborts configure. For an optional dependency however, an apparent "partial" result where the header is found but no corresponding library causes link errors later. I think a complete fix should probably not be setting SOMELIB_xxx until the final result is known, but for now, adjusting the AC_SUBST calls to explictly set the empty shell string (which is not "empty" to m4) at least restores the intent. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140
Arrange for the thread/task name to be set when new threads are created. This makes them visible in the process table etc. pthread_setname_np() is generally available in glibc, musl and FreeBSD, so no test is required. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140
Makes it much easier to see what thing complained. Getting thread id, program name and thread name vary wildly between Linux and FreeBSD, so those are set up in macros. pthread_getname_np() did not appear in musl until very recently, but the same info has always been available via prctl(PR_GET_NAME), so we use that instead. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140
If multiple threads trip an assertion at the same moment (quite common), they can be printing at the same time, and their output gets messy. This adds a simple lock around the whole thing, to prevent a second task printing assert output before the first has finished. Additionally, if libspl_assert_ok is not set, abort() is called without dropping the lock, so that any other asserting tasks will be killed before starting any output, rather than only getting part-way through. This is a tradeoff; it's assumed that multiple threads asserting at the same moment are likely the same fault in different instances of a thread, and so there won't be any more useful information from the other tasks anyway. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140
Adds a check for the backtrace() function. If available, uses it to show a stack backtrace in the assertion output. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140
libunwind seems to do a better job of resolving a symbols than backtrace(), and is also useful on platforms that don't have backtrace() (eg musl). If it's available, use it. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes openzfs#16140
Motivation and Context
The output when a userspace SPL assertion fires is pretty anemic, which makes it harder than it needs to be to understand where to look when
zdb
orztest
blows up. I finally got annoyed enough to do something about it.Description
See commit messages for detail. In short:
thread names
Just make its easier to see everything:
pstree:
top

htop

assertion output
Neatly presents the location and content of the assertion, process & thread name & id and, where possible, stack backtrace. Below shows examples on various platforms (with
ASSERT0("oh no")
added to top ofarc_evict_cb_check()
, a convenient thread that starts early):Typical output on Linux + glibc (using glibc's
backtrace()
):glibc's unwinder is not always the most accurate, and often can't resolve some addresses. So if libunwind/libunwind is detected, it will be used instead (disabled with
--without-libunwind
). Using libunwind 1.6.2 (Debian 12), it looks like:Newer libunwind can also inspect the ELF object the symbol lives in. With 1.8.2, it looks like:
If neither
backtrace()
nor libunwind is available, the output will just omit the call trace. On Void 20240314 using musl (which doesn't havebacktrace()
), it's:Meanwhile, FreeBSD 14 does have
backtrace()
, but in libexecinfo instead (a port of this is sometimes used with musl). We find that and use it just fine:Of course, libunwind is nice there. With devel/unwind 20240211:
How Has This Been Tested?
See above. At least, compile checked against recent glibc, musl, FreeBSD, with multiple versions of libunwind. And of course, this is assertion response, so if nothing blows up then nothing sees it.
Types of changes
Checklist:
Signed-off-by
.