PyAutoPlot is an open-source Python library designed to make dataset analysis much easier by generating helpful detailed plots using matplotlib
. It automatically generates appropriate plots based on the dataset you feed it.
Bug Fixes and Robustness:
- Corrected calculation of missing values in
_generate_analysis
. - Added error handling for statistical functions (
skewness
,kurtosis
,autocorrelation
) to prevent crashes with empty or zero-variance data, returning NaN instead. - Added warnings for columns not classified by
_detect_column_types
. - Included input validation in the
plot()
method to check for valid column names and required arguments, raising ValueErrors for invalid input.
Performance Enhancements:
- Modified
auto_plot()
to display and close plot figures section by section (or individually for looped plots like pie/line charts). This significantly reduces memory consumption when generating many plots. - Added a warning in the
auto_plot()
docstring regarding potential performance issues of pairwise scatter plots with many numeric columns.
Code Cleanup:
- Refined static method definitions (
_calculate_skewness
,_calculate_kurtosis
, etc.) by removing unnecessaryself
arguments and ensuring calls are updated.
Testing:
- Replaced the existing
test.py
with a new script that:- Creates a
test_output
directory for plot outputs. - Uses
energy_consumption_dataset.csv
(with a dummy fallback). - Demonstrates
AutoPlot
initialization. - Runs
auto_plot()
with default and custom configurations. - Runs manual
plot()
for scatter, distribution, boxplot, and bar types. - Shows usage of the
customize()
method. - Includes a test with a small, inline dataset.
- Saves all generated plots for visual inspection.
- Creates a
- Added package dependencies to PyAutoPlot:
matplotlib>=3.0.0
,pandas>=1.0.0
, andnumpy>=1.18.0
.
AutoPlot
class, along withauto_plot
andplot
functions.auto_plot
automatically generates suitable plots based on values in your dataset.
Important
The AutoPlot
object needs to be initialized with a CSV dataset before any plots can be generated using either plot
or auto_plot
.
You can install PyAutoPlot using pip:
pip install pyautoplot
PyAutoPlot supports the following Python versions:
- Python 3.6
- Python 3.7
- Python 3.8
- Python 3.9
- Python 3.10
- Python 3.11/Later (Preferred)
Please ensure that you have one of these Python versions installed before using PyAutoPlot. PyAutoPlot may not work as expected on lower versions of Python than the supported.
- Automatic Plotting: PyAutoPlot automatically generates appropriate plots based on values present in your dataset. Currently, PyAutoPlot supports the following types of data:
numeric
,categorical
, andtime-series
. - Customization: Due to matplotlib's high customizability, you can create and use custom themes for PyAutoPlot, just pass in your theme as a dictionary of RCParams (See
plot.rcParams.keys()
for a list of valid parameters) or choose from our predefined themes. You can also pass in additional parameters to the function thatmatplotlib
can recognize (e.g.color='#5d17eb'
).
from pyautoplot import AutoPlot
# Initialize with a CSV file
plotter = AutoPlot("path/to/dataset.csv")
# Automatically analyze and plot
plotter.auto_plot(output_file='test', theme="dark", color='orange', excludes=['detailed_analysis'])
Note
PyAutoPlot may not work well with certain datasets. If you find any issues please open an issue.
from pyautoplot import AutoPlot
# Define your custom theme
custom_theme = {
"axes.facecolor": "#ffffff",
"axes.edgecolor": "#000000",
"axes.labelcolor": "#000000",
"figure.facecolor": "#ffffff",
"grid.color": "#dddddd",
"text.color": "#000000",
"xtick.color": "#000000",
"ytick.color": "#000000",
"legend.frameon": True,
}
# Initialize with a CSV file
plotter = AutoPlot("path/to/dataset.csv")
# Automatically analyze and plot using the custom theme
plotter.auto_plot(output_file='test', theme=custom_theme, color='orange', excludes=['detailed_analysis'])
Contributions are welcome! If you encounter any issues, have suggestions, or want to contribute to PyAutoPlot, please open an issue or submit a pull request on GitHub.
PyAutoPlot is released under the terms of the MIT License (Modified). Please see the LICENSE file for the full text.
Modified License Clause
The modified license clause grants users the permission to make derivative works based on the PyAutoPlot software. However, it requires any substantial changes to the software to be clearly distinguished from the original work and distributed under a different name.