
Simplify csv reader/writer #97

@ghost

Description

The stumbling block for many users is creating the proper SQL table structure before the CSV data can be loaded.

If the CSV file contains a header, then the database table creation can be based on a simple analysis of the header line and a sample of the remaining rows. Consider a simple wrapper around the SQL CREATE + COPY (INTO/TO) action.

monetdbe.read_csv(csv_file, sep=',', quotechar='"', null='')

The parallel MonetDB CSV reader should make a real difference here compared to plain Python and pandas.
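A minimal sketch of what such a wrapper could look like, using only the DB-API style connect()/cursor()/execute() calls monetdbe already exposes; the read_csv helper itself, its naive type inference, and the COPY OFFSET/DELIMITERS details are illustrative assumptions, not existing API:

import csv
import os
import monetdbe

def read_csv(con, path, table, sep=',', quotechar='"', null='', sample=100):
    """Derive CREATE TABLE from the CSV header plus a row sample, then bulk-load via COPY INTO."""
    with open(path, newline='') as f:
        reader = csv.reader(f, delimiter=sep, quotechar=quotechar)
        header = next(reader)
        rows = [row for row, _ in zip(reader, range(sample))]

    def castable(cast, value):
        try:
            cast(value)
            return True
        except ValueError:
            return False

    def infer(col):
        # Guess a column type from the sampled values; fall back to TEXT.
        values = [row[col] for row in rows if row[col] != null]
        for cast, sqltype in ((int, 'BIGINT'), (float, 'DOUBLE')):
            if values and all(castable(cast, v) for v in values):
                return sqltype
        return 'TEXT'

    columns = ', '.join(f'"{name}" {infer(i)}' for i, name in enumerate(header))
    cur = con.cursor()
    cur.execute(f'CREATE TABLE "{table}" ({columns})')
    # OFFSET 2 skips the header line; the parallel CSV loader handles the rest.
    cur.execute(
        f"COPY OFFSET 2 INTO \"{table}\" FROM '{os.path.abspath(path)}' "
        f"USING DELIMITERS '{sep}', E'\\n', '{quotechar}' NULL AS '{null}'"
    )
    con.commit()

con = monetdbe.connect(':memory:')
read_csv(con, 'data/files/example.csv', 'example')

Only the header analysis happens in Python on a small sample; the load itself is delegated entirely to the server-side COPY INTO, which is where the parallel reader pays off.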

It could also be aligned with the pandas.read_csv() interface
import pandas as pd

data = pd.read_csv(
    "data/files/complex_data_example.tsv",    # relative python path to subdirectory
    sep='\t',                                 # Tab-separated value file
    quotechar="'",                            # single quote allowed as quote character
    dtype={"salary": int},                    # Parse the salary column as an integer
    usecols=['name', 'birth_date', 'salary'], # Only load the three columns specified
    parse_dates=['birth_date'],               # Interpret the birth_date column as a date
    skiprows=10,                              # Skip the first 10 rows of the file
    na_values=['.', '??'],                    # Take any '.' or '??' values as NA
)
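For monetdbe the aligned call could then look roughly like this; every keyword below simply mirrors the pandas one above and is hypothetical until implemented:

monetdbe.read_csv(
    "data/files/complex_data_example.tsv",
    sep='\t',
    quotechar="'",
    usecols=['name', 'birth_date', 'salary'],
    parse_dates=['birth_date'],
    skiprows=10,
    na_values=['.', '??'],
)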

The complementary action is monetdbe.write_csv().
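A correspondingly small sketch for the export direction, again assuming only the existing cursor()/execute() calls; the write_csv name and the wrapping of MonetDB's COPY SELECT ... INTO <file> statement are illustrative:

import os

def write_csv(con, table, path, sep=',', quotechar='"', null=''):
    """Dump an existing table to a CSV file via COPY SELECT ... INTO <file>."""
    cur = con.cursor()
    cur.execute(
        f"COPY SELECT * FROM \"{table}\" INTO '{os.path.abspath(path)}' "
        f"USING DELIMITERS '{sep}', E'\\n', '{quotechar}' NULL AS '{null}'"
    )

write_csv(con, 'example', 'example_out.csv')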

Labels

enhancement (New feature or request)