gh-67230: add quoting rules to csv module #29469

Merged
merged 20 commits into from
Apr 12, 2023
Changes from 14 commits
20 changes: 19 additions & 1 deletion Doc/library/csv.rst
@@ -327,7 +327,7 @@ The :mod:`csv` module defines the following constants:

Instructs :class:`writer` objects to quote all non-numeric fields.

-Instructs the reader to convert all non-quoted fields to type *float*.
+Instructs :class:`reader` to convert all non-quoted fields to type *float*.


.. data:: QUOTE_NONE
@@ -339,6 +339,24 @@ The :mod:`csv` module defines the following constants:

Instructs :class:`reader` to perform no special processing of quote characters.

.. data:: QUOTE_NOTNULL

Instructs :class:`writer` objects to quote all fields which are not
``None``. This is similar to QUOTE_ALL, except that if a
field value is ``None`` an empty (unquoted) string is written.

Instructs :class:`reader` to interpret an empty (unquoted) field as None, and
otherwise behave as QUOTE_ALL.

.. data:: QUOTE_STRINGS

Instructs :class:`writer` objects to always place quotes around fields
which are strings. This is similar to QUOTE_NONNUMERIC, except that if a
field value is ``None`` an empty (unquoted) string is written.

Instructs :class:`reader` to interpret an empty (unquoted) string as None, and
otherwise behave as QUOTE_NONNUMERIC.

The :mod:`csv` module defines the following exception:
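
The writer-side behaviour documented in this hunk can be checked with a short script. This is a minimal sketch, not part of the PR: `render_row` is a hypothetical helper, and the new constants are looked up with `hasattr` because they exist only in Python builds that include this change. The sample row is the one used in the PR's `test_write_quoting` additions.

```python
import csv
import io


def render_row(row, quoting):
    """Write one row with the given quoting style; return it without the terminator."""
    buf = io.StringIO()
    csv.writer(buf, quoting=quoting, lineterminator="\n").writerow(row)
    return buf.getvalue().rstrip("\n")


row = ["a", "", None, 1]

# Baseline: QUOTE_ALL quotes every field, so None is written as "".
print(render_row(row, csv.QUOTE_ALL))

# The new constants only exist in builds that include this change,
# so guard the lookup instead of assuming they are present.
if hasattr(csv, "QUOTE_STRINGS") and hasattr(csv, "QUOTE_NOTNULL"):
    print(render_row(row, csv.QUOTE_STRINGS))  # strings quoted; None and 1 left bare
    print(render_row(row, csv.QUOTE_NOTNULL))  # every field except None quoted
else:
    print("this Python build predates QUOTE_STRINGS / QUOTE_NOTNULL")
```

On a build with the change applied, the last two rows should match the strings asserted in the PR's tests: `"a","",,1` for `QUOTE_STRINGS` and `"a","",,"1"` for `QUOTE_NOTNULL`.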


2 changes: 2 additions & 0 deletions Lib/csv.py
@@ -9,12 +9,14 @@
unregister_dialect, get_dialect, list_dialects, \
field_size_limit, \
QUOTE_MINIMAL, QUOTE_ALL, QUOTE_NONNUMERIC, QUOTE_NONE, \
QUOTE_STRINGS, QUOTE_NOTNULL, \
__doc__
from _csv import Dialect as _Dialect

from io import StringIO

__all__ = ["QUOTE_MINIMAL", "QUOTE_ALL", "QUOTE_NONNUMERIC", "QUOTE_NONE",
"QUOTE_STRINGS", "QUOTE_NOTNULL",
"Error", "Dialect", "__doc__", "excel", "excel_tab",
"field_size_limit", "reader", "writer",
"register_dialect", "get_dialect", "list_dialects", "Sniffer",
4 changes: 4 additions & 0 deletions Lib/test/test_csv.py
@@ -187,6 +187,10 @@ def test_write_quoting(self):
quoting = csv.QUOTE_ALL)
self._write_test(['a\nb',1], '"a\nb","1"',
quoting = csv.QUOTE_ALL)
self._write_test(['a','',None,1], '"a","",,1',
quoting = csv.QUOTE_STRINGS)
self._write_test(['a','',None,1], '"a","",,"1"',
quoting = csv.QUOTE_NOTNULL)

def test_write_escape(self):
self._write_test(['a',1,'p,q'], 'a,1,"p,q"',
@@ -0,0 +1,2 @@
Add QUOTE_STRINGS and QUOTE_NOTNULL to the suite of csv module quoting
styles.
16 changes: 15 additions & 1 deletion Modules/_csv.c
@@ -82,7 +82,8 @@ typedef enum {
} ParserState;

typedef enum {
-QUOTE_MINIMAL, QUOTE_ALL, QUOTE_NONNUMERIC, QUOTE_NONE
+QUOTE_MINIMAL, QUOTE_ALL, QUOTE_NONNUMERIC, QUOTE_NONE,
+QUOTE_STRINGS, QUOTE_NOTNULL
} QuoteStyle;

typedef struct {
@@ -95,6 +96,8 @@ static const StyleDesc quote_styles[] = {
{ QUOTE_ALL, "QUOTE_ALL" },
{ QUOTE_NONNUMERIC, "QUOTE_NONNUMERIC" },
{ QUOTE_NONE, "QUOTE_NONE" },
{ QUOTE_STRINGS, "QUOTE_STRINGS" },
{ QUOTE_NOTNULL, "QUOTE_NOTNULL" },
{ 0 }
};

@@ -1264,6 +1267,12 @@ csv_writerow(WriterObj *self, PyObject *seq)
case QUOTE_ALL:
quoted = 1;
break;
case QUOTE_STRINGS:
quoted = PyUnicode_Check(field);
break;
case QUOTE_NOTNULL:
quoted = field != Py_None;
break;
default:
quoted = 0;
break;
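
The branches visible in this hunk can be modelled in Python to see which fields get the initial `quoted` flag. This is an illustrative sketch only: `QuoteStyle` and `forced_quote` are hypothetical names, the `QUOTE_NONNUMERIC` case is not visible in the hunk and is deliberately left unmodelled, and the real writer may still quote a field later if it contains special characters.

```python
from enum import IntEnum


class QuoteStyle(IntEnum):
    # Same order as the C enum in the diff, so the values run 0..5.
    QUOTE_MINIMAL = 0
    QUOTE_ALL = 1
    QUOTE_NONNUMERIC = 2
    QUOTE_NONE = 3
    QUOTE_STRINGS = 4
    QUOTE_NOTNULL = 5


def forced_quote(field, quoting):
    """Model of the `quoted` flag for the switch cases shown in the hunk."""
    if quoting == QuoteStyle.QUOTE_NONNUMERIC:
        # This case lies above the visible hunk, so it is not guessed at here.
        raise NotImplementedError("QUOTE_NONNUMERIC case is not shown in the hunk")
    if quoting == QuoteStyle.QUOTE_ALL:
        return True
    if quoting == QuoteStyle.QUOTE_STRINGS:
        return isinstance(field, str)  # stands in for PyUnicode_Check(field)
    if quoting == QuoteStyle.QUOTE_NOTNULL:
        return field is not None  # stands in for field != Py_None
    return False  # default branch: QUOTE_MINIMAL and QUOTE_NONE


row = ["a", "", None, 1]
for style in (QuoteStyle.QUOTE_STRINGS, QuoteStyle.QUOTE_NOTNULL):
    print(style.name, [forced_quote(field, style) for field in row])
```

For the row `['a', '', None, 1]` the model flags the two strings under `QUOTE_STRINGS` and everything except `None` under `QUOTE_NOTNULL`, which lines up with the expected strings in the PR's `test_write_quoting` additions.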
@@ -1659,6 +1668,11 @@ PyDoc_STRVAR(csv_module_doc,
" csv.QUOTE_NONNUMERIC means that quotes are always placed around\n"
" fields which do not parse as integers or floating point\n"
" numbers.\n"
" csv.QUOTE_STRINGS means that quotes are always placed around\n"
" fields which are strings. Note that the Python value None\n"
" is not a string.\n"
" csv.QUOTE_NOTNULL means that quotes are only placed around fields\n"
" that are not the Python value None.\n"
" csv.QUOTE_NONE means that quotes are never placed around fields.\n"
" * escapechar - specifies a one-character string used to escape\n"
" the delimiter when quoting is set to QUOTE_NONE.\n"