Default na values list doc is missing the empty string #10700

Closed
@frlnx

Recreating my original post from here: http://stackoverflow.com/questions/26659941/what-are-the-default-na-values-when-pandas-loads-data/31705571#31705571

This documentation http://pandas.pydata.org/pandas-docs/stable/io.html#na-values states:

The default NaN recognized values are ['-1.#IND', '1.#QNAN', '1.#IND', '-1.#QNAN', '#N/A','N/A', 'NA', '#NA', 'NULL', 'NaN', '-NaN', 'nan', '-nan'].

However, this list is incomplete: the empty string is also treated as NaN by default.

If the list were complete, the following two pieces of code would produce the same result.

Using the actual default values:

import pandas as pd
from StringIO import StringIO

sio = StringIO()
sio.write('"foo","bar"\n"1",""\n"NA","4"')
sio.seek(0)
pd.read_csv(sio, sep=",", quotechar='"')
   foo  bar
0    1  NaN
1  NaN    4

Using the documented default values, copied and passed explicitly:

sio = StringIO()
sio.write('"foo","bar"\n"1",""\n"NA","4"')
sio.seek(0)
pd.read_csv(sio, sep=",", quotechar='"',
            keep_default_na=False,
            na_values=['-1.#IND', '1.#QNAN', '1.#IND',
                       '-1.#QNAN', '#N/A', 'N/A', '#NA', 'NA',
                       'NULL', 'NaN', '-NaN', 'nan', '-nan'])

  foo bar
0   1    
1 NaN   4

Pandas version:

pd.__version__
'0.15.2'
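
For reference, here is a minimal sketch of the same comparison with a third case that adds the empty string to the documented list. It assumes a recent pandas running on Python 3 (hence `io.StringIO`), which is not the 0.15.2 setup reported above; the names `CSV`, `DOCUMENTED` and `bar_is_na` are made up for this illustration. If the empty string is the missing entry, the third call should mark the same `bar` cells as missing as the default call does.

```python
from io import StringIO

import pandas as pd

# Same data as in the report: one empty field and one "NA" field.
CSV = '"foo","bar"\n"1",""\n"NA","4"'

# The list exactly as quoted from the documentation (no empty string).
DOCUMENTED = ['-1.#IND', '1.#QNAN', '1.#IND', '-1.#QNAN', '#N/A', 'N/A',
              'NA', '#NA', 'NULL', 'NaN', '-NaN', 'nan', '-nan']


def bar_is_na(**kwargs):
    """Read CSV with the given options and report which 'bar' cells are missing."""
    frame = pd.read_csv(StringIO(CSV), sep=",", quotechar='"', **kwargs)
    return frame["bar"].isna().tolist()


# 1. Built-in defaults.
default_mask = bar_is_na()

# 2. Only the documented list, with the built-in defaults switched off.
documented_mask = bar_is_na(keep_default_na=False, na_values=DOCUMENTED)

# 3. Documented list plus the empty string.
with_empty_mask = bar_is_na(keep_default_na=False, na_values=DOCUMENTED + [''])

print("default:         ", default_mask)
print("documented only: ", documented_mask)
print("documented + '': ", with_empty_mask)
```

Only the `bar` column is compared, since that is where the empty field sits; the `"NA"` entry in `foo` is covered by the documented list in every case.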
