Description
Observed
When retrying inserts into ClickHouse, we noticed heightened CPU usage attributed to (*batch).Append and (*batch).Send. The current design couples the construction and compression of a batch with its transmission. Consequently, every attempt to resend a batch requires building and compressing it again, which puts significant load on the CPU.
Even if we could call batch.Send(..) multiple times, the context.Context is shared across calls, which prevents us from controlling the timeout of each attempt.
Solution
Decouple the two linked processes: the formation of a batch and its delivery.
Expected behaviour
The end goal is to enable the re-transmission of batches without the need for repetitive construction and compression operations.
Code example
retry(func() {
    // Every retry rebuilds and recompresses the whole batch.
    batch := conn.PrepareBatch(...)
    for ... {
        batch.Append(...)
    }
    batch.Send()
})
Details
Profiling flamegraph:
Environment
- clickhouse-go version: v2.13.0
- Interface: ClickHouse API / database/sql compatible driver
- Go version: 1.22.4
- Operating system:
- ClickHouse version:
- Is it a ClickHouse Cloud? No
- ClickHouse Server non-default settings, if any: No
- CREATE TABLE statements for tables involved: No
- Sample data for all these tables, use clickhouse-obfuscator if necessary