Skip to content

Fix Nested Array #171

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 24 commits into from
May 4, 2022
Merged

Fix Nested Array #171

merged 24 commits into from
May 4, 2022

Conversation

Enmk
Copy link
Contributor

@Enmk Enmk commented Apr 23, 2022

Fixes

  • Fixed ColumnArray::Slice() and ColumnArray::GetAsColumn() that worked poorly with 2D-arrays (Array(Array(X)))
    Previous implementation was using Append, rather than AppendAsColumn, flattening nested arrays.

New APIs

  • Column::CloneEmpty - idiomatic way to make an empty clone of a column instead of c->Slice(0, 0)
  • ColumnArray::GetAsColumnTyped<T> - syntax sugar to simplify accessing array values.
  • ColumnArray<T> template that simplifies and optimizes both reading and appending data to Array.
    • For more optimal access to the Array's data, both ColumnArrayT<X>::operator[] and ColumnArrayT<X>::At() provide a wrapper rather than a full-blown column. So the underlying column is not copied, but rather accessed via read-only "view".
    • ColumnArrayT<X>::Append() overloads work for anything iteratable. Users don't have to wrap values into ColumnX anymore.
    • For convenience, values obtained from both ColumnArrayT<X>::operator[] and ColumnArrayT<X>::At() provide type-safe access to the typed data.
    • Existing ColumnArray instance can be type-safe wrapped into appropriate ColumnArrayT<NestedColumnType> so users can benefit from features above for columns obtained from Block.

Examples

  • Array(UInt64):
auto array = std::make_shared<clickhouse::ColumnArrayT<ColumnUInt64>>();
array->Append({0, 1, 2});

EXPECT_EQ(0u, array->At(0).At(0)); // 0
EXPECT_EQ(1u, (*array)[0][1]);     // 1
  • Array(FixedString(6)):
using namespace std::literals;
auto array = std::make_shared<clickhouse::ColumnArrayT<ColumnFixedString>>(6); // argument is passed to ColumnFixedString constructor
array->Append({"hello", "world"});

// Additional \0 since strings are padded from right with zeros in FixedString(6).
EXPECT_EQ("hello\0"sv, array->At(0).At(0));

auto row = array->At(0);
EXPECT_EQ("hello\0"sv, row.At(0));
EXPECT_EQ(6u, row[0].length());
EXPECT_EQ("hello", row[0].substr(0, 5));

EXPECT_EQ("world\0"sv, (*array)[0][1]);
  • Array(Array(UInt64))
// Nested 2D-arrays are supported too:
auto array = std::make_shared<clickhouse::ColumnArrayT<clickhouse::ColumnArrayT<ColumnUInt64>>>();
array->Append(std::vector<std::vector<unsigned int>>{{0}, {1, 1}, {2, 2, 2}});

EXPECT_EQ(0u, array->At(0).At(0).At(0)); // 0
EXPECT_EQ(1u, (*array)[0][1][1]);        // 1

See unit-tests for more examples.

Closes: #43

@Enmk Enmk changed the title Optimize array access Fix Array Apr 24, 2022
@Enmk Enmk changed the title Fix Array Fix Nested Array Apr 26, 2022
Copy link
Contributor

@arthurpassos arthurpassos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still need to go over some details and the tests. Submeting partial review so we can work in parallel.

@Enmk Enmk force-pushed the optimize_array_access branch from 4fa4c9e to 17ddbdb Compare May 4, 2022 11:33
Enmk added 23 commits May 4, 2022 14:52
Known limitations:
- Doesn't work with nested arrays.
More tests
Stub for wrapping ColumnArray into ColumnArrayT
Added new and updated existing tests
utils.h(214,1): error C2872: 'details': ambiguous symbol
Using it at appropriate places instead of Slice(0, 0), since later may cause issues in some context (like empty arrays)
Array(Array(Array(UInt64)))
Added tests for wrapping 2D Array
- Fixed ColumnArray::Slice
- Added tests for ColumnArray::GetAsColumn, ColumnArray::Slice
- Fixed construction of ColumnArrayT, and updated ColumnArrayT::Wrap
- Renamed and rearranged some other Array-related tests
@Enmk Enmk force-pushed the optimize_array_access branch from 17ddbdb to 6708b82 Compare May 4, 2022 13:08
@Enmk Enmk merged commit 211f115 into ClickHouse:master May 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Selecting elements of nested array
2 participants