enhance: optimize Storage File Format #26

Open
tinswzy opened this issue May 22, 2025 · 0 comments
Labels
enhancement New feature or request

Comments

@tinswzy
Collaborator

tinswzy commented May 22, 2025

Currently, Woodpecker writes logs sequentially to files to ensure high throughput and persistence guarantees. However, the current file format either does not use compression or, when compression is enabled, can no longer serve partial reads efficiently. This limits performance tuning and restricts future extensibility (e.g., range-based recovery, fast seeking).

The goal of this issue is to design and implement a new storage file format that supports both compression and efficient partial reads.
Goals:

  • Support compression: Reduce storage footprint and improve I/O efficiency.
  • Enable partial reads: Allow reading specific data blocks without decompressing the entire file.
  • Preserve sequential write performance: Writing must remain high-throughput and append-friendly.
  • Ensure extensibility: Format should be designed to support future metadata like checksums, versioning, and block indices.

Technical Suggestions:

  • Use block-based compression (e.g., compress every N records as a block).
  • Record metadata for each block: offset, length, compression type.
  • Add a lightweight block index section to enable fast seeks.
  • Consider compression algorithms like Snappy or Zstd for a balance of speed and ratio.
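
To make the suggestions above concrete, here is a minimal sketch of a block-compressed file with a trailing block index. It is only an illustration under stated assumptions, not Woodpecker's actual format: the layout (blocks, then index, then footer), the field widths, the `WPK1` magic, and the names `write_log` and `read_record` are all hypothetical, and `zlib` from the Python standard library stands in for Snappy or Zstd.

```python
import io
import struct
import zlib

# Hypothetical layout (an assumption, not Woodpecker's real format):
#   [block 0][block 1]...[index entries][footer]
# Index entry: block offset, compressed length, first record id,
#              record count, codec id.
# Footer:      index offset, number of index entries, magic bytes.
INDEX_ENTRY = struct.Struct("<QIQIB")
FOOTER = struct.Struct("<QI4s")
MAGIC = b"WPK1"
CODEC_NONE, CODEC_ZLIB = 0, 1


def _encode_records(records):
    # Length-prefix each record so a block can be split apart after decompression.
    return b"".join(struct.pack("<I", len(r)) + r for r in records)


def _decode_records(payload):
    out, pos = [], 0
    while pos < len(payload):
        (n,) = struct.unpack_from("<I", payload, pos)
        pos += 4
        out.append(payload[pos:pos + n])
        pos += n
    return out


def write_log(stream, records, block_size=4, codec=CODEC_ZLIB):
    """Append records block by block, then write the index and footer."""
    index = []
    for first in range(0, len(records), block_size):
        chunk = records[first:first + block_size]
        raw = _encode_records(chunk)
        data = zlib.compress(raw) if codec == CODEC_ZLIB else raw
        # Remember where this block starts so a reader can seek straight to it.
        index.append((stream.tell(), len(data), first, len(chunk), codec))
        stream.write(data)
    index_offset = stream.tell()
    for entry in index:
        stream.write(INDEX_ENTRY.pack(*entry))
    stream.write(FOOTER.pack(index_offset, len(index), MAGIC))


def read_record(stream, record_id):
    """Fetch one record by decompressing only the block that contains it."""
    stream.seek(-FOOTER.size, io.SEEK_END)
    index_offset, count, magic = FOOTER.unpack(stream.read(FOOTER.size))
    if magic != MAGIC:
        raise ValueError("not a block-indexed log file")
    stream.seek(index_offset)
    # Linear scan for brevity; a sorted index would also allow binary search.
    for _ in range(count):
        entry = stream.read(INDEX_ENTRY.size)
        offset, length, first, n, codec = INDEX_ENTRY.unpack(entry)
        if first <= record_id < first + n:
            stream.seek(offset)
            data = stream.read(length)
            raw = zlib.decompress(data) if codec == CODEC_ZLIB else data
            return _decode_records(raw)[record_id - first]
    raise KeyError(record_id)
```

Usage: `buf = io.BytesIO(); write_log(buf, records); read_record(buf, 7)` touches only the footer, the index, and one block. Blocks are written strictly in append order, so sequential write throughput should be preserved. One open design question with a trailing index is recovery after a crash before the footer is written; per-block headers or a periodically flushed index would be possible ways to address that.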

Alternative designs that meet these goals are also welcome.

@tinswzy tinswzy added the enhancement New feature or request label May 22, 2025