It's impossible to reliably upload a complete file #2975

Open
@viliam-durina

We're trying to upload a file to GCS, and make sure the file is only published if the upload succeeds. Consider this code:

BlobInfo blobInfo = ...;
ByteBuffer buf1, buf2;
WriteChannel writer = storage.writer(blobInfo, Storage.BlobWriteOption.crc32cMatch());
try {
  writer.write(buf1);
  writer.write(buf2); // network goes down before this call
} finally {
  writer.close(); // network is back up before this call
}

The write of buf1 succeeds, but the network goes down before buf2 is written. As expected, an IOException is thrown (both my buffers are 32 MB). Execution proceeds to the finally clause and calls close(). By this point the network is up again, so the close() call finishes the upload and the file is published with partial contents!

However, the problem is more general. Any error can occur while writing the file (e.g. an I/O error while reading the file to upload). In that case, too, the close() call will publish a partial file.

I have tested moving the close() call into the try block, so that on failure the WriteChannel is abandoned without being closed. This seems to have the desired effect, but since close() is inherited from AutoCloseable, skipping it is very unintuitive, and some code quality tools will suggest or force calling close().
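The workaround described above can be sketched roughly as follows. This is a minimal illustration, not library code: `CloseOnSuccessOnly` and `writeAll` are hypothetical names, and the helper is written against the standard `java.nio.channels.WritableByteChannel` interface (which `WriteChannel` extends) so that it can be tried without a GCS connection. It does not address whether an abandoned channel leaks resources.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;
import java.util.List;

// Hypothetical helper illustrating the "close only on success" workaround.
final class CloseOnSuccessOnly {

    private CloseOnSuccessOnly() {}

    // Writes every buffer fully, then closes the channel.
    // If any write throws, the exception propagates and close() is never
    // reached, so the channel is abandoned instead of being finalized.
    static void writeAll(WritableByteChannel channel, List<ByteBuffer> buffers)
            throws IOException {
        for (ByteBuffer buf : buffers) {
            while (buf.hasRemaining()) {
                channel.write(buf);
            }
        }
        // Reached only when all writes succeeded.
        channel.close();
    }
}
```

With the code from this issue, the call would look something like `CloseOnSuccessOnly.writeAll(writer, List.of(buf1, buf2))`. The drawback is the one already noted: the channel is intentionally never closed on the failure path, which static analysis tools tend to flag.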

One possible backward-compatible fix would be to add an abort() method that cleans up the resources without finishing the upload, and after which close() is a no-op. That method could be called automatically for internal exceptions.
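To make the proposed semantics concrete, here is a rough user-side sketch of what such behaviour could look like as a wrapper. `AbortableChannel` and its `abort()` method are hypothetical and are not part of google-cloud-storage. The wrapper can only skip the delegate's close(); unlike a real library-level abort(), it cannot release the delegate's internal resources or cancel the server-side upload session.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.WritableByteChannel;

// Hypothetical wrapper: once aborted, close() no longer finalizes the delegate.
final class AbortableChannel implements WritableByteChannel {
    private final WritableByteChannel delegate;
    private boolean aborted;
    private boolean closed;

    AbortableChannel(WritableByteChannel delegate) {
        this.delegate = delegate;
    }

    @Override
    public int write(ByteBuffer src) throws IOException {
        if (aborted || closed) {
            throw new ClosedChannelException();
        }
        try {
            return delegate.write(src);
        } catch (IOException | RuntimeException e) {
            // A failed write aborts automatically, so a later close() is a no-op.
            aborted = true;
            throw e;
        }
    }

    // Marks the upload as abandoned; the delegate is never closed afterwards.
    void abort() {
        aborted = true;
    }

    @Override
    public boolean isOpen() {
        return !aborted && !closed;
    }

    @Override
    public void close() throws IOException {
        if (aborted || closed) {
            return; // no-op after abort() or a previous close()
        }
        closed = true;
        delegate.close();
    }
}
```

Under these assumptions the wrapper can be used in a try-with-resources statement or with close() in a finally clause: a failed write() prevents publication on its own, while errors that originate outside the channel (such as a failure reading the source file) still require an explicit abort() call in a catch block before rethrowing.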

I'm using version 2.50.0 on Java 21.

Labels: api: storage (issues related to the googleapis/java-storage API)