KeyError: 'date' on synthetic data #281

@marrrcin

Hi,
I'm exploring your library and I've stumbled across an error when working with my data.

Popmon version: 1.4.5
Error:

 in <lambda>(plot)
    157             # filter out potential empty plots
    158             plots = [e for e in plots if len(e)]
--> 159             plots = sorted(plots, key=lambda plot: plot["date"])
    160 
    161             # basic checks for histograms

KeyError: 'date'
Full stack trace: ⬇️
KeyError                                  Traceback (most recent call last)
[<ipython-input-39-c55c117796f8>](https://localhost:8080/#) in <cell line: 1>()
----> 1 report = popmon.df_stability_report(
      2     df,
      3     time_axis="time",
      4     time_width="1w",
      5 )

7 frames
[/usr/local/lib/python3.10/dist-packages/popmon/pipeline/report.py](https://localhost:8080/#) in df_stability_report(df, settings, time_width, time_offset, var_dtype, reference, split, **kwargs)
    196 
    197     # generate data stability report
--> 198     return stability_report(
    199         hists=hists,
    200         settings=settings,

[/usr/local/lib/python3.10/dist-packages/popmon/pipeline/report.py](https://localhost:8080/#) in stability_report(hists, settings, reference, **kwargs)
     73     # execute reporting pipeline
     74     pipeline = get_report_pipeline_class(settings.reference_type, reference)(**cfg)
---> 75     result = pipeline.transform(datastore)
     76 
     77     stability_report_result = StabilityReport(datastore=result)

[/usr/local/lib/python3.10/dist-packages/popmon/base/pipeline.py](https://localhost:8080/#) in transform(self, datastore)
     65         for module in self.modules:
     66             self.logger.debug(f"transform {module.__class__.__name__}")
---> 67             datastore = module.transform(datastore)
     68         return datastore
     69 

[/usr/local/lib/python3.10/dist-packages/popmon/pipeline/report_pipelines.py](https://localhost:8080/#) in transform(self, datastore)
    255     def transform(self, datastore):
    256         self.logger.info(f'Generating report "{self.store_key}".')
--> 257         return super().transform(datastore)

[/usr/local/lib/python3.10/dist-packages/popmon/base/pipeline.py](https://localhost:8080/#) in transform(self, datastore)
     65         for module in self.modules:
     66             self.logger.debug(f"transform {module.__class__.__name__}")
---> 67             datastore = module.transform(datastore)
     68         return datastore
     69 

[/usr/local/lib/python3.10/dist-packages/popmon/base/module.py](https://localhost:8080/#) in _transform(self, datastore)
     49 
     50         # transformation
---> 51         outputs = func(self, *list(inputs.values()))
     52 
     53         # transform returns None if no update needs to be made

[/usr/local/lib/python3.10/dist-packages/popmon/visualization/histogram_section.py](https://localhost:8080/#) in transform(self, data_obj, sections)
    157             # filter out potential empty plots
    158             plots = [e for e in plots if len(e)]
--> 159             plots = sorted(plots, key=lambda plot: plot["date"])
    160 
    161             # basic checks for histograms

[/usr/local/lib/python3.10/dist-packages/popmon/visualization/histogram_section.py](https://localhost:8080/#) in <lambda>(plot)
    157             # filter out potential empty plots
    158             plots = [e for e in plots if len(e)]
--> 159             plots = sorted(plots, key=lambda plot: plot["date"])
    160 
    161             # basic checks for histograms

KeyError: 'date'
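
The failing statement can be illustrated in isolation. This is only a minimal sketch, not popmon's actual data: it assumes `plots` is a list of dicts and that one non-empty entry lacks a `"date"` key (the contents of `plots` below are made up for illustration). Under that assumption, the `len(e)` filter removes only empty entries, so a non-empty dict without `"date"` still reaches `sorted` and raises the same `KeyError`.

```python
# Hypothetical stand-in for the `plots` list in histogram_section.py;
# the real entries produced by popmon may look different.
plots = [
    {"date": "2023-01-08", "plot": "<img>"},
    {},                 # empty entry, removed by the filter below
    {"plot": "<img>"},  # non-empty, but has no "date" key
]

# The same two statements that appear in the traceback.
plots = [e for e in plots if len(e)]

try:
    plots = sorted(plots, key=lambda plot: plot["date"])
    error = None
except KeyError as exc:
    error = exc

print(repr(error))  # KeyError('date')
```

Whether this is what actually happens inside popmon for the data in the notebook is not confirmed here; the sketch only shows that the filter on the preceding line does not rule the error out.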

Reproduction steps:
https://colab.research.google.com/drive/1N59kn7C9LN6W9AJkfz9SougiZoOMM0bn?usp=sharing

Additional information:
I'm using a function to generate synthetic data (see the Colab notebook). With a shorter range, e.g. 200 days, the code works fine, but beyond some threshold I haven't pinned down (around 360 days) it breaks.
I've also tried changing the time_width parameter: sometimes it works with 2w, sometimes with 1d, but I haven't found a pattern.

Also note that it happens both for self-referencing data and for data with a reference set (see the second part of the Colab notebook).
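
One way to look for a pattern would be to sweep the number of days and `time_width` systematically and record which combinations fail. The helper below is a rough sketch of that idea. `sweep` and `build_report` are hypothetical names, not part of popmon: `build_report(days, time_width)` is assumed to be a user-supplied callable that generates the synthetic data for `days` days and calls `popmon.df_stability_report` with the given `time_width`.

```python
from itertools import product
from typing import Callable, Dict, Iterable, Tuple


def sweep(
    build_report: Callable[[int, str], object],
    day_counts: Iterable[int],
    time_widths: Iterable[str],
) -> Dict[Tuple[int, str], str]:
    """Call build_report for every (days, time_width) pair and record the outcome.

    build_report is a hypothetical, user-supplied callable; it is expected
    to raise KeyError for the combinations that trigger the bug.
    """
    outcomes: Dict[Tuple[int, str], str] = {}
    for days, width in product(day_counts, time_widths):
        try:
            build_report(days, width)
            outcomes[(days, width)] = "ok"
        except KeyError as exc:
            # Only KeyError is caught, so unrelated failures still surface.
            outcomes[(days, width)] = f"KeyError: {exc}"
    return outcomes
```

Calling it with, for example, `day_counts=range(200, 400, 20)` and `time_widths=["1d", "1w", "2w"]` would give a table of outcomes to compare against the generated date range.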

Expected result:
The monitoring report is generated without errors.
