Merged
Changes from 2 commits
1 change: 1 addition & 0 deletions doc/source/reference/style.rst
@@ -52,6 +52,7 @@ Builtin styles
Styler.highlight_max
Styler.highlight_min
Styler.highlight_null
Styler.format_null
Styler.background_gradient
Styler.bar

16 changes: 16 additions & 0 deletions doc/source/user_guide/style.ipynb
@@ -492,6 +492,22 @@
"df.style.highlight_max(axis=0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can format the text displayed for missing values by `.format_null`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df.style.highlight_max(axis=0).format_null(na_rep='-')"
]
},
{
"cell_type": "markdown",
"metadata": {},
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.0.0.rst
@@ -110,6 +110,7 @@ Other enhancements
- :meth:`DataFrame.to_json` now accepts an ``indent`` integer argument to enable pretty printing of JSON output (:issue:`12004`)
- :meth:`read_stata` can read Stata 119 dta files. (:issue:`28250`)
- Added ``encoding`` argument to :func:`DataFrame.to_html` for non-ascii text (:issue:`28663`)
- :meth:`Styler.format_null` is now added into the built-in functions to help formatting missing values (:issue:`28358`)
Contributor

Can you add this to the user guide as well?

Contributor

I'd prefer the name format_nans, to be consistent with fillna, hasnans, etc.

@jreback?

Contributor Author

Sure, I'll be glad to!


Build Changes
^^^^^^^^^^^^^
19 changes: 19 additions & 0 deletions pandas/io/formats/style.py
@@ -930,6 +930,25 @@ def hide_columns(self, subset):
# A collection of "builtin" styles
# -----------------------------------------------------------------------

    def format_null(self, na_rep="-"):
        """
        Format the text displayed for missing values.

        .. versionadded:: 1.0.0

        Parameters
        ----------
        na_rep : str

        Returns
        -------
        self : Styler
        """
        self.format(
Contributor

This looks like it will overwrite the formatting of a previously applied formatter for non-NA values. Something like

df.style.format("hi-{}".format).format_null()

Is that the case?
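
The concern can be illustrated without pandas. The sketch below is a toy model, not the Styler implementation: `ToyStyler`, `default_display`, and `is_missing` are hypothetical names, and the per-column dictionary only loosely mirrors `_display_funcs`. Under those assumptions, it shows how a `format_null` built on the default formatter would discard a formatter applied earlier.

```python
import math


def is_missing(value):
    """Hypothetical scalar-only stand-in for pd.isna."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def default_display(value):
    """Hypothetical default formatter: plain str()."""
    return str(value)


class ToyStyler:
    """Toy model holding one display function per column (an assumption)."""

    def __init__(self, columns):
        self.display_funcs = {col: default_display for col in columns}

    def format(self, formatter):
        # Replace the display function of every column.
        for col in self.display_funcs:
            self.display_funcs[col] = formatter
        return self

    def format_null(self, na_rep="-"):
        # Mirrors the diff: always falls back to the *default* formatter,
        # so any formatter set earlier is dropped.
        return self.format(
            lambda x: na_rep if is_missing(x) else default_display(x)
        )

    def render(self, col, value):
        return self.display_funcs[col](value)


styler = ToyStyler(["A"]).format("hi-{}".format).format_null()
print(styler.render("A", 1.0))           # non-NA value: the "hi-" prefix is gone
print(styler.render("A", float("nan")))  # NA value: rendered as na_rep
```

In this model the second call wins outright, which is the behavior the question above asks about.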

Contributor Author

Thanks @TomAugspurger, I like the name format_na! And yes, I intended to make an overwriting implementation. Actually, I considered the set_* approach as you did, but it seems confusing for this case:

df.style.format('{:.2%}').set_na_format('-') # got 'nan%' instead of '-'

I have a new idea: how about an interface like this?

.format_na('-', subset=['col1','col2'])
.format('{:.2%}', na_rep='-', subset=['col3','col4'])

And the docstring for format_na could be rephrased to:

Format the text display value using default formatter but represent nan as `na_rep`.
For more advanced formatting, use Styler.format() with your custom formatter.

Contributor

I may not have been clear about my concern. It's fine that na_format overwrites the formatting for NA values. I'm concerned that it overwrites the formatting for non-NA values. In my .format("hi-{}".format).format_na('NA') example, the NA values should be formatted as 'NA' and the non-NA values should be formatted as hi-<value>. But I suspect that right now the non-NA formatting is lost (though perhaps it's not).
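
The behavior described here (NA values become 'NA', everything else keeps its existing format) can be sketched as a wrapper around whatever formatter is already in place. This is an illustration only: `with_na_rep` and `is_missing` are hypothetical helpers, not pandas functions, and `is_missing` handles scalars only.

```python
import math


def is_missing(value):
    """Hypothetical scalar-only stand-in for pd.isna."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def with_na_rep(formatter, na_rep):
    """Wrap `formatter` so that missing values render as `na_rep`."""

    def wrapped(value):
        # Only NA values are intercepted; all others reach the
        # original formatter unchanged.
        return na_rep if is_missing(value) else formatter(value)

    return wrapped


display = with_na_rep("hi-{}".format, "NA")
print(display(1.5))           # non-NA value keeps the "hi-" prefix
print(display(float("nan")))  # NA value is replaced by na_rep
```

Composing rather than replacing is what keeps the earlier formatter alive in this sketch.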

Contributor

Not sure about adding an na_rep to the .format function... That's probably fine. I think it'd still be useful for users to have a way to control the default NA formatting at the table level.

But if we add an na_rep to format, then we wouldn't need a new format_na method, right?

Contributor Author

@immaxchen, Oct 22, 2019


Sounds good. So there would be a setting at the table level, self.na_rep,
and def format(self, formatter, subset=None): becomes
def format(self, formatter=None, subset=None, na_rep=None):
Then we drop .format_na('-') and use .format(na_rep='-') instead, right?

Contributor

I think that sounds correct. I'm not sure what the default should be, but probably just None (no special formatting for NA values).
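
As a rough sketch of where this thread lands, the toy class below models a table-level `na_rep` plus a `format(formatter=None, subset=None, na_rep=None)` signature. It is not the pandas implementation: `ToyStyler`, `render`, and `is_missing` are hypothetical, columns are plain dictionary keys, and the default of `None` is taken to mean "no special NA formatting" as suggested above.

```python
import math


def is_missing(value):
    """Hypothetical scalar-only stand-in for pd.isna."""
    return value is None or (isinstance(value, float) and math.isnan(value))


class ToyStyler:
    """Toy model of the proposal; not the pandas Styler."""

    def __init__(self, columns, na_rep=None):
        # Table-level default; None means no special NA formatting.
        self.na_rep = na_rep
        self.display_funcs = dict.fromkeys(columns, str)
        self.format()  # install default display honoring self.na_rep

    def format(self, formatter=None, subset=None, na_rep=None):
        if formatter is None:
            formatter = str  # fall back to the default display
        elif isinstance(formatter, str):
            formatter = formatter.format  # accept format strings
        if na_rep is None:
            na_rep = self.na_rep  # inherit the table-level setting

        def display(value):
            if na_rep is not None and is_missing(value):
                return na_rep
            return formatter(value)

        columns = list(self.display_funcs) if subset is None else subset
        for col in columns:
            self.display_funcs[col] = display
        return self

    def render(self, col, value):
        return self.display_funcs[col](value)


nan = float("nan")
styler = (
    ToyStyler(["col1", "col3"])
    .format(na_rep="-", subset=["col1"])
    .format("{:.2%}", na_rep="-", subset=["col3"])
)
print(styler.render("col1", nan))  # NA in a default-formatted column
print(styler.render("col3", 0.5))  # non-NA value keeps the percent format
print(styler.render("col3", nan))  # NA no longer passes through '{:.2%}'
```

Because the NA check runs before the formatter, the `'nan%'` result mentioned earlier in the thread does not occur in this model.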

            lambda x: na_rep if pd.isna(x) else self._display_funcs.default_factory()(x)
        )
        return self

    @staticmethod
    def _highlight_null(v, null_color):
        return (
8 changes: 8 additions & 0 deletions pandas/tests/io/formats/test_style.py
@@ -990,6 +990,14 @@ def test_bar_bad_align_raises(self):
        with pytest.raises(ValueError):
            df.style.bar(align="poorly", color=["#d65f5f", "#5fba7d"])

    def test_format_null(self, na_rep="-"):
        # GH 28358
        df = pd.DataFrame({"A": [0, np.nan]})
        ctx = df.style.format_null()._translate()
        result = ctx["body"][1][1]["display_value"]
        expected = "-"
        assert result == expected

    def test_highlight_null(self, null_color="red"):
        df = pd.DataFrame({"A": [0, np.nan]})
        result = df.style.highlight_null()._compute().ctx