[BUG] JSON reader fails to parse files with empty rows #5712

@vuule

Description

The following test fails in both cases:

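import numpy as np
import pandas as pd
import pytest

import cudf


@pytest.mark.parametrize(
    "buffer",
    [
        "[ ]\n[ ]",  # two empty array rows
        "{ }\n{ }",  # two empty object rows
    ],
)
def test_json_empty(buffer):
    cu_df = cudf.read_json(buffer, lines=True)
    pd_df = pd.read_json(buffer, lines=True)

    # Both readers should agree on the column dtypes of the result.
    np.testing.assert_array_equal(pd_df.dtypes, cu_df.dtypes)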

With array rows ("[ ]"), the reader creates a table with an int8 column instead of an empty table.
With object rows ("{ }"), the read fails with a CUDA error.
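
To illustrate why an empty table is the expected result, here is a minimal sketch that uses only Python's standard json module rather than the cuDF or pandas parsers. The helper name field_count_per_row is hypothetical and introduced just for this illustration. It parses each JSON Lines row independently and counts the fields it holds.

```python
import json


def field_count_per_row(buffer):
    """Parse each non-blank JSON Lines row and return its number of fields.

    Hypothetical helper for illustration; not part of cuDF or pandas.
    """
    return [
        len(json.loads(line))
        for line in buffer.splitlines()
        if line.strip()
    ]


# The two inputs from the failing test above.
for buffer in ["[ ]\n[ ]", "{ }\n{ }"]:
    print(repr(buffer), field_count_per_row(buffer))
```

Every row in both inputs holds zero fields, so there is nothing to infer a column from; a reader that emits an int8 column has invented one.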

Metadata

    Labels

    0 - Backlog: In queue waiting for assignment
    Python: Affects Python cuDF API.
    Spark: Functionality that helps Spark RAPIDS
    bug: Something isn't working
    cuIO: cuIO issue
    libcudf: Affects libcudf (C++/CUDA) code.