Style: checked integer arithmetic? #126

sffc · 2020-06-13T18:08:04Z

ICU has had numerous bugs involving integer overflow wrapping around and causing crashes when indexing into arrays, etc. Rust integer types have functions like checked_add that allow you to perform failure logic if an integer overflow would occur. I was wondering whether we should adopt a best practice to avoid "raw" integer arithmetic and always use the "checked" versions.

Should we make an exception for + on two usizes, especially if one of the usizes is fixed (like i += 1)?

The text was updated successfully, but these errors were encountered:

Manishearth · 2020-06-15T01:16:07Z

Rust panics on overflow in debug mode, so overflow is not as common a bug in well-tested rust code in my experience. It's typical practice to use regular integer types and test in debug mode.

We should definitely use checked methods in code where overflow may occur based on user input.

sffc · 2020-06-17T01:46:50Z

Should we use checked arithmetic when doing something like pulling items from a user-provided iterator? The iterator could in theory be longer than usize, for example.

Manishearth · 2020-06-17T15:24:41Z

When does the iterator length come up when pulling from an iterator? You just pull. It's only when you collect into a vector that it matters, and iirc .collect() handles this

sffc · 2020-06-17T17:30:22Z

It came up for me a couple days ago when I was pulling items from an int iterator and wanted to keep a usize tally of the number of zeros versus non-zeros.

zbraniecki · 2020-06-17T17:36:32Z

enumerate?

Manishearth · 2020-06-17T19:43:07Z

Ah yeah in that case you should.

sffc · 2020-06-17T20:00:39Z

Ah yeah in that case you should.

OK. So in that case, going back to my original question: what do you think about requiring that all arithmetic is explicit about which type of arithmetic you want? Rust standard library gives you several choices:

checked_add: returns None if overflow occurred
overflowing_add: returns the wrapped number and a boolean if overflow occurred
saturating_add: stops at the max integer instead of overflowing
wrapping_add: returns the wrapped number without a boolean

I assume wrapping_add is the default if you use the + operator. Is that correct? If so, is calling the wrapping_add function a zero-cost abstraction over the + operator?

Manishearth · 2020-06-18T04:47:49Z

what do you think about requiring that all arithmetic is explicit about which type of arithmetic you want?

No, this is highly unusual in Rust. I think we should do this for select cases where user-provided numbers come into play.

sffc · 2020-06-18T04:54:57Z

Do we rely on human code review to catch arithmetic that could cause an unchecked overflow?

It seems that the Rust style is generally to enforce correctness via static analysis of code. I would expect that there would be, e.g., a Clippy warning that we could enable that forbids the use of an unchecked + operator. Why is this different?

Manishearth · 2020-06-18T04:58:25Z

Do we rely on human code review to catch arithmetic that could cause an unchecked overflow?

Typically debug assertions in tests catch these. Fuzzing helps.

Why is this different?

It tends to be that most cases of doing addition are going to be okay, so the tradeoff is between a cognitive overload with low marginal payoff. Nor does this have the complexity-compounding issues that memory safety has.

EvanJP · 2020-06-18T18:07:15Z

Using checked_add everywhere might add code to the point of worry, so I am in favor of having human code reviews/occasional usage.

rust-lang/rust#9469 (From 2013) brings up the difference in generated code when using + vs. checked_add

sffc · 2020-06-19T08:01:42Z

Additional data on the code size of checked_add: I wrote the following functions:

#[no_mangle]
pub fn checked_add(a: i32) -> i32 {
	match a.checked_add(100) {
		Some(v) => v,
		None => 0,
	}
}

#[no_mangle]
pub fn saturating_add(a: i32) -> i32 {
	a.saturating_add(100)
}

#[no_mangle]
pub fn wrapping_add(a: i32) -> i32 {
	a.wrapping_add(100)
}

#[no_mangle]
pub fn default_add(a: i32) -> i32 {
	a + 100
}

Here is the WASM output:

  (func $checked_add (type 1) (param i32) (result i32)
    (local i32)
    i32.const 0
    local.get 0
    i32.const 100
    i32.add
    local.tee 1
    local.get 0
    i32.const -1
    i32.gt_s
    local.tee 0
    local.get 0
    local.get 1
    i32.const -1
    i32.gt_s
    i32.ne
    i32.and
    select)
  (func $saturating_add (type 1) (param i32) (result i32)
    (local i32)
    i32.const 2147483647
    i32.const -2147483648
    local.get 0
    i32.const 100
    i32.add
    local.tee 1
    i32.const 0
    i32.lt_s
    select
    local.get 1
    local.get 0
    i32.const -1
    i32.gt_s
    local.tee 0
    local.get 0
    local.get 1
    i32.const -1
    i32.gt_s
    i32.ne
    i32.and
    select)
  (func $wrapping_add (type 1) (param i32) (result i32)
    local.get 0
    i32.const 100
    i32.add)
  (func $default_add (type 1) (param i32) (result i32)
    local.get 0
    i32.const 100
    i32.add)

Bytes of WASM:

saturating_add: 49
checked_add: 33
wrapping_add: 10
default_add: 10

So, the safe methods are bigger, but not as big as they were in 2013.

To reproduce, see sffc/rust-wasm-i18n@f31c30f

sffc · 2020-06-19T08:33:55Z

My attempt at decompiling checked_add by hand:

pub fn checked_add(a: i32) -> i32 {
  let b = a + 100;
  let c1 = (a > -1);
  let c2 = (b > -1);
  if c1 && c1 != c2 {
    // overflow
    return 0;
  } else {
    // no overflow
    return b;
  }
}

I've convinced myself that this is correct as long as I am adding a positive constant.

If you don't tell the compiler that the thing you are adding is guaranteed to be positive, the code gets a bit longer, from 33 bytes to 44 bytes:

  (func $checked_add (type 1) (param i32 i32) (result i32)
    (local i32)
    i32.const 0
    local.get 0
    local.get 1
    i32.add
    local.tee 2
    local.get 0
    i32.const -1
    i32.gt_s
    local.tee 0
    local.get 1
    i32.const -1
    i32.gt_s
    i32.eq
    local.get 0
    local.get 2
    i32.const -1
    i32.gt_s
    i32.ne
    i32.and
    select)

In any case, I think this is a good enough reason to not recommend using checked_add everywhere.

hsivonen · 2020-06-23T09:43:52Z

We should require checked math for the functions that take input size and compute the worst-case output size. While safe-only Rust code would catch the issue of a too short buffer (due to overflow) at the time of trying to index into the buffer, when the application code is C or C++, an unexpectedly short buffer could be a problem for operations performed on the application side.

sffc added this to the 2020 Q2 milestone Jun 17, 2020

sffc added the discuss Discuss at a future ICU4X-SC meeting label Jun 17, 2020

sffc removed the discuss Discuss at a future ICU4X-SC meeting label Jun 18, 2020

sffc self-assigned this Jun 18, 2020

sffc added C-meta Component: Relating to ICU4X as a whole T-docs-tests Type: Code change outside core library labels Jun 18, 2020

sffc mentioned this issue Jun 25, 2020

Style guide: Add error test coverage, integer overflow, and no_std. #154

Merged

sffc closed this as completed in #154 Jun 30, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Style: checked integer arithmetic? #126

Style: checked integer arithmetic? #126

sffc commented Jun 13, 2020

Manishearth commented Jun 15, 2020

Uh oh!

sffc commented Jun 17, 2020

Uh oh!

Manishearth commented Jun 17, 2020

Uh oh!

sffc commented Jun 17, 2020

Uh oh!

zbraniecki commented Jun 17, 2020

Uh oh!

Manishearth commented Jun 17, 2020

Uh oh!

sffc commented Jun 17, 2020

Uh oh!

Manishearth commented Jun 18, 2020

Uh oh!

sffc commented Jun 18, 2020

Uh oh!

Manishearth commented Jun 18, 2020

Uh oh!

EvanJP commented Jun 18, 2020

Uh oh!

sffc commented Jun 19, 2020 •

edited

Loading

Uh oh!

sffc commented Jun 19, 2020

Uh oh!

hsivonen commented Jun 23, 2020

Uh oh!

Style: checked integer arithmetic? #126

Style: checked integer arithmetic? #126

Comments

sffc commented Jun 13, 2020

Manishearth commented Jun 15, 2020

Uh oh!

sffc commented Jun 17, 2020

Uh oh!

Manishearth commented Jun 17, 2020

Uh oh!

sffc commented Jun 17, 2020

Uh oh!

zbraniecki commented Jun 17, 2020

Uh oh!

Manishearth commented Jun 17, 2020

Uh oh!

sffc commented Jun 17, 2020

Uh oh!

Manishearth commented Jun 18, 2020

Uh oh!

sffc commented Jun 18, 2020

Uh oh!

Manishearth commented Jun 18, 2020

Uh oh!

EvanJP commented Jun 18, 2020

Uh oh!

sffc commented Jun 19, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sffc commented Jun 19, 2020

Uh oh!

hsivonen commented Jun 23, 2020

Uh oh!

sffc commented Jun 19, 2020 •

edited

Loading