enhance: optimize Storage File Format #26

tinswzy · 2025-05-22T07:45:32Z

Currently, Woodpecker writes logs sequentially to files to ensure high throughput and persistence guarantees. However, the current file format either does not use compression or, when compression is enabled, can no longer serve partial reads efficiently. This limits performance tuning and restricts future extensibility (e.g., range-based recovery, fast seeking).

The goal of this issue is to design and implement a new storage file format that supports both compression and efficient partial reads.

Goals:

Support compression: Reduce storage footprint and improve I/O efficiency.
Enable partial reads: Allow reading specific data blocks without decompressing the entire file.
Preserve sequential write performance: Writing must remain high-throughput and append-friendly.
Ensure extensibility: Format should be designed to support future metadata like checksums, versioning, and block indices.

Technical Suggestions:

Use block-based compression (e.g., compress every N records as a block).
Record metadata for each block: offset, length, compression type.
Add a lightweight block index section to enable fast seeks.
Consider compression algorithms like Snappy or Zstd for a balance of speed and ratio.
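
To make the suggestions above concrete, here is a minimal sketch of one possible layout, not Woodpecker's actual format. Everything in it is an assumption for illustration: the `WPBK` magic bytes, the field layout of the index entries and footer, the `BlockWriter` and `read_record` names, and the use of Python's standard `zlib` as a stand-in for Snappy or Zstd. Blocks are appended sequentially, and the block index plus a small footer are written only on close, so writes stay append-only.

```python
import io
import struct
import zlib

MAGIC = b"WPBK"   # hypothetical magic bytes, not part of any real format
VERSION = 1       # hypothetical format version
CODEC_ZLIB = 1    # stand-in codec id; Snappy/Zstd would get their own ids

# Index entry: block offset, compressed length, first record id,
# record count, codec id, CRC32 of the compressed block.
ENTRY = struct.Struct("<QIQIBI")
# Footer: index offset, number of index entries, version, magic.
FOOTER = struct.Struct("<QIH4s")


class BlockWriter:
    """Appends records, compressing every N records as one block."""

    def __init__(self, out, records_per_block=4):
        self.out = out
        self.n = records_per_block
        self.pending = []
        self.index = []
        self.next_id = 0

    def append(self, record: bytes):
        self.pending.append(record)
        if len(self.pending) >= self.n:
            self._flush()

    def _flush(self):
        if not self.pending:
            return
        # Length-prefix each record so the block can be split after decompression.
        raw = b"".join(struct.pack("<I", len(r)) + r for r in self.pending)
        comp = zlib.compress(raw)
        offset = self.out.tell()
        self.out.write(comp)
        self.index.append((offset, len(comp), self.next_id,
                           len(self.pending), CODEC_ZLIB, zlib.crc32(comp)))
        self.next_id += len(self.pending)
        self.pending = []

    def close(self):
        # Flush the last partial block, then append the index and footer.
        self._flush()
        index_offset = self.out.tell()
        for entry in self.index:
            self.out.write(ENTRY.pack(*entry))
        self.out.write(FOOTER.pack(index_offset, len(self.index), VERSION, MAGIC))


def read_record(f, record_id):
    """Reads one record by id, decompressing only the block that holds it."""
    f.seek(-FOOTER.size, io.SEEK_END)
    index_offset, count, version, magic = FOOTER.unpack(f.read(FOOTER.size))
    if magic != MAGIC or version != VERSION:
        raise ValueError("unrecognized file format")
    f.seek(index_offset)
    for _ in range(count):
        offset, length, first_id, n, codec, crc = ENTRY.unpack(f.read(ENTRY.size))
        if first_id <= record_id < first_id + n:
            break
    else:
        raise KeyError(record_id)
    if codec != CODEC_ZLIB:
        raise ValueError("unsupported codec")
    f.seek(offset)
    comp = f.read(length)
    if zlib.crc32(comp) != crc:
        raise ValueError("block checksum mismatch")
    raw = zlib.decompress(comp)
    # Walk the length prefixes up to the requested record within the block.
    pos = 0
    for _ in range(record_id - first_id):
        (size,) = struct.unpack_from("<I", raw, pos)
        pos += 4 + size
    (size,) = struct.unpack_from("<I", raw, pos)
    return raw[pos + 4:pos + 4 + size]
```

With a writer over an `io.BytesIO()` buffer and ten appended records, `read_record(buf, 7)` touches only the footer, the index, and the second block. The linear index scan keeps the sketch short; a real implementation would likely binary-search the index or cache it in memory, and would need a recovery path for files that were never closed and so have no footer.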

Any better solution is also welcome.

tinswzy added the enhancement label on May 22, 2025
