
python DeltaTable.schema().to_pyarrow() does not support data type void #1947

Open
@leo-schick

Environment

pip deltalake==0.14.0

Environment:

  • Cloud provider: /
  • OS: Linux

Bug

What happened:
I wanted to read the schema of a Delta table as a pyarrow schema with:

from deltalake import DeltaTable

file_uri = "..."
deltaTable = DeltaTable(file_uri, storage_options=...)
pyarrow_schema = deltaTable.schema().to_pyarrow()

It failed with the following error:

Traceback (most recent call last):
[...]
  File "my_python_file.py", line 63, in __
    pyarrow_schema = deltaTable.schema().to_pyarrow()
Exception: Schema error: Invalid data type for Arrow: void

What you expected to happen:
pyarrow has a null data type (pyarrow.null()). For a Delta column of type void, I would expect to_pyarrow() to return that column with the null data type instead of raising an exception.
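Until this is handled, it may help to know which columns trigger the error. The sketch below walks a Delta schema in its JSON form and lists every field typed `void`. It only uses the standard library. `SCHEMA_JSON` is a made-up example shaped like the Delta schema serialization, and `find_void_columns` is a hypothetical helper, not part of deltalake. For a real table, the JSON would presumably come from `deltaTable.schema().to_json()`, assuming that method is available in the installed version.

```python
import json
from typing import List

# Hypothetical schema, shaped like the JSON serialization of a Delta schema.
SCHEMA_JSON = """
{"type": "struct", "fields": [
  {"name": "id", "type": "long", "nullable": false, "metadata": {}},
  {"name": "comment", "type": "void", "nullable": true, "metadata": {}},
  {"name": "payload", "nullable": true, "metadata": {}, "type":
    {"type": "struct", "fields": [
      {"name": "extra", "type": "void", "nullable": true, "metadata": {}}
    ]}}
]}
"""


def find_void_columns(schema_json: str) -> List[str]:
    """Return dotted paths of all fields whose Delta type is 'void'."""
    found: List[str] = []

    def walk(dtype, path: str) -> None:
        if dtype == "void":
            # Primitive types are plain strings in the schema JSON.
            found.append(path)
        elif isinstance(dtype, dict):
            kind = dtype.get("type")
            if kind == "struct":
                for field in dtype["fields"]:
                    child = f"{path}.{field['name']}" if path else field["name"]
                    walk(field["type"], child)
            elif kind == "array":
                walk(dtype["elementType"], f"{path}.element")
            elif kind == "map":
                walk(dtype["keyType"], f"{path}.key")
                walk(dtype["valueType"], f"{path}.value")

    walk(json.loads(schema_json), "")
    return found


if __name__ == "__main__":
    print(find_void_columns(SCHEMA_JSON))
```

This only locates the offending columns. It does not convert them, so the mapping of void to pyarrow's null type would still need to happen inside the library.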

How to reproduce it:

More details:


Labels: bug, good first issue, help wanted
