
High CPU usage due to repetitive batch build/compression in re-insertion #1457

Open
@byte-sourcerer

Description


Observed

When re-inserting into ClickHouse after a failed send, we observed high CPU usage attributed to (*batch).Append and (*batch).Send. The current design couples the construction and compression of a batch with its transmission, so every resend attempt re-runs construction and compression, putting significant strain on the CPU.

Even if we can call batch.Send(..) multiple times, the context.Context is shared across attempts, which prevents us from controlling the timeout of each individual attempt.

Solution

Decouple the two currently linked steps: building (and compressing) a batch, and sending it.

Expected behaviour

The end goal is to enable re-transmission of a batch without repeating the construction and compression work.

Code example

// Current pattern (retry, ctx, query, and rows are illustrative): every retry
// re-prepares the batch and re-runs Append, so rows are rebuilt and recompressed.
retry(func() error {
    batch, err := conn.PrepareBatch(ctx, query)
    if err != nil {
        return err
    }
    for _, row := range rows {
        if err := batch.Append(row...); err != nil {
            return err
        }
    }
    return batch.Send() // construct + compress + send all happen here
})

Details

Profiling flamegraph:

(flamegraph screenshot: CleanShot 2024-12-20 at 18 57 35@2x-2)

Environment

  • clickhouse-go version: v2.13.0
  • Interface: ClickHouse API / database/sql compatible driver
  • Go version: 1.22.4
  • Operating system:
  • ClickHouse version:
  • Is it a ClickHouse Cloud? No
  • ClickHouse Server non-default settings, if any: No
  • CREATE TABLE statements for tables involved: No
  • Sample data for all these tables, use clickhouse-obfuscator if necessary
