Description
System information
Type | Version/Name |
---|---|
Distribution Name | debian |
Distribution Version | testing |
Linux Kernel | 5.4.88-1 |
Architecture | amd64 |
ZFS Version | 2.0.3-8 |
SPL Version | n/a |
Describe the problem you're observing
When printing a string that ends in an invalid UTF-8 sequence, nvlist_print_json_string goes into an infinite loop because mbrtowc will return a negative result when passed a &mbr that has detected an error, without looking at input or the other parameters.
It's quite possible that mbrtowc behaves differently in other implementations; I'm using glibc.
The conditional below is true when the result is negative because size_t is unsigned (maybe it should be ssize_t?), meaning it stays in the loop whenever sz != 0:
size_t sz;
while ((sz = mbrtowc(&c, input, MB_CUR_MAX, &mbr)) > 0) {
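To illustrate why the condition above cannot see an error: mbrtowc reports failure as (size_t)-1 (invalid sequence) or (size_t)-2 (incomplete sequence), and both compare greater than zero. A minimal sketch, where naive_keep_looping and mbrtowc_failed are hypothetical names used only for illustration and are not part of libnvpair:

```c
#include <stdbool.h>
#include <stddef.h>

/* The same test the loop condition performs on mbrtowc()'s result. */
bool
naive_keep_looping(size_t sz)
{
	return (sz > 0);
}

/*
 * An explicit check for the two error values mbrtowc() can return.
 * Converted to size_t, -1 and -2 become the two largest size_t values.
 */
bool
mbrtowc_failed(size_t sz)
{
	return (sz == (size_t)-1 || sz == (size_t)-2);
}
```

With these, naive_keep_looping((size_t)-1) is true, so a loop guarded only by "> 0" keeps running after an error, while mbrtowc_failed((size_t)-1) lets the caller stop.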
I'm not sure how this should be fixed: either changing the type of sz or ensuring that input stays before the end of the input string seems like a decent candidate for a solution.
I would have expected that the function should either fail, or print a truncated version of the string.
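As a rough sketch of the "truncate" option, assuming a NUL-terminated input string: the loop below checks both error values explicitly and never asks mbrtowc to read past the bytes that remain. valid_prefix_len is a hypothetical helper written for this report, not the actual libnvpair code, and a real fix might prefer to return an error instead.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not part of libnvpair): returns how many leading
 * bytes of `input` mbrtowc() can convert in the current locale, stopping
 * at the first invalid or incomplete sequence instead of looping forever.
 */
size_t
valid_prefix_len(const char *input)
{
	mbstate_t mbr;
	wchar_t c;
	size_t left = strlen(input);	/* bytes not yet consumed */
	size_t done = 0;		/* bytes converted so far */

	memset(&mbr, 0, sizeof (mbr));
	while (left > 0) {
		size_t sz = mbrtowc(&c, input + done, left, &mbr);

		/* Explicit error checks; "sz > 0" alone would miss these. */
		if (sz == (size_t)-1 || sz == (size_t)-2)
			break;
		/* Not expected, since `left` excludes the terminator. */
		if (sz == 0)
			break;
		done += sz;
		left -= sz;
	}
	return (done);
}
```

A caller could then print only the first valid_prefix_len(input) bytes, or treat a result shorter than strlen(input) as a failure. The result depends on the active locale, as mbrtowc itself does.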
Describe how to reproduce the problem
hang.hex:
000100000000000001000000200000000300000001000000090000006162
000000000000418000000900000000000000
This is the nvlist encoding of
{
"ab": "A\x80"
}
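For readers without xxd, the hex dump can also be decoded with a few lines of C. This is only a sketch, roughly equivalent to what xxd -ps -r does for this input; hex_val and hex_decode are hypothetical helpers, not part of any library mentioned here.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if `ch` is not a hex digit. */
static int
hex_val(int ch)
{
	if (ch >= '0' && ch <= '9')
		return (ch - '0');
	ch = tolower(ch);
	if (ch >= 'a' && ch <= 'f')
		return (ch - 'a' + 10);
	return (-1);
}

/*
 * Hypothetical helper: decode a plain hex dump into `out`, skipping
 * whitespace. Returns the number of bytes written, or 0 if the input
 * is malformed or `out` (of capacity `cap`) is too small.
 */
size_t
hex_decode(const char *hex, unsigned char *out, size_t cap)
{
	size_t n = 0;

	while (*hex != '\0') {
		int hi, lo;

		if (isspace((unsigned char)*hex)) {
			hex++;
			continue;
		}
		hi = hex_val((unsigned char)hex[0]);
		if (hi < 0)
			return (0);
		/* hex[1] may be the terminator; hex_val() rejects it. */
		lo = hex_val((unsigned char)hex[1]);
		if (lo < 0 || n == cap)
			return (0);
		out[n++] = (unsigned char)((hi << 4) | lo);
		hex += 2;
	}
	return (n);
}
```

Applied to the two lines of hang.hex, this yields 48 bytes, with the name bytes 61 62 at offset 28 and the offending value bytes 41 80 00 at offset 36.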
trigger.c:
#define _POSIX_SOURCE 1
#include <stdlib.h>
#include <assert.h>
#include <stdio.h>
#include <stdint.h>
#include <stdarg.h>
#include <strings.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
typedef unsigned int uint_t;
typedef int boolean_t;
typedef unsigned char uchar_t;
typedef int hrtime_t; /* TODO wrong */
#include <libzfs/sys/nvpair.h>
#include <libzfs/libnvpair.h>
int
main(int argc, char **argv)
{
	size_t nv_sz = 0;
	FILE *fd = NULL;
	struct stat statbuf;

	if (argc != 2)
		return (10);
	fd = fopen(argv[1], "r");
	if (!fd)
		return (1);
	if (fstat(fileno(fd), &statbuf))
		return (2);
	nv_sz = statbuf.st_size;
	char *outbuf = malloc(nv_sz);
	if (fread(outbuf, nv_sz, 1, fd) != 1)
		return (3);
	nvlist_t *check = fnvlist_unpack(outbuf, nv_sz);
	// nvlist_print(stdout, check);
	nvlist_print_json(stdout, check);
	fflush(stdout);
	return (0);
}
xxd -ps -r hang.hex > hang.bin
gcc -std=c18 -Wall -pie -I/usr/include/libzfs/ trigger.c -lnvpair -luutil -o trigger.exe
./trigger.exe hang.bin
On Linux with glibc, this outputs:
{"ab":"AAAAAAA .... infinite A's
Here is ldd ./trigger.exe:
linux-vdso.so.1 (0x00007ffdfb444000)
libnvpair.so.3 => /lib/x86_64-linux-gnu/libnvpair.so.3 (0x000079b9258c9000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x000079b925704000)
/lib64/ld-linux-x86-64.so.2 (0x000079b925904000)
Background
I ran into this when using afl-fuzz to look for cases where my own implementation of libnvlist disagrees with the upstream implementation, using the JSON output to compare outputs from decoding.
The function is not part of the standard Solaris interface, but I was unable to find much documentation on the expected semantics in the serialization format, so I thought this would be an easy way to find inconsistencies.
I don't rely on nvlist_print_json() for anything important, and don't know if anybody else does, but I guess it would be neat if this behaviour could be changed.