Top 5 Automated EDA Libraries

1. D-Tale

D-Tale is a web client for visualizing pandas objects. It grew out of a SAS-to-Python conversion and combines a Flask back-end with a React front-end to give you an easy way to view and analyze pandas data structures. It integrates with IPython notebooks and Python/IPython terminals. It currently supports the pandas objects DataFrame, Series, MultiIndex, DatetimeIndex, and RangeIndex.

pip install dtale
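
To make the workflow concrete, here is a minimal sketch of opening a DataFrame in D-Tale. The helper names `sample_frame` and `open_in_dtale` and the data are made up for illustration; the only D-Tale calls used are `dtale.show` and the returned instance's `open_browser`.

```python
import pandas as pd


def sample_frame() -> pd.DataFrame:
    """Small made-up dataset used only for this illustration."""
    return pd.DataFrame(
        {
            "city": ["Oslo", "Lima", "Pune", "Oslo"],
            "temp_c": [4.5, 19.0, 31.2, 6.1],
            "rainy": [True, False, False, True],
        }
    )


def open_in_dtale(df: pd.DataFrame):
    """Start a D-Tale session for df and return the instance handle."""
    # Imported lazily so this file still loads where dtale is not installed.
    import dtale

    instance = dtale.show(df)  # starts the Flask back-end in the background
    instance.open_browser()    # opens the grid view in the default browser
    return instance
```

Calling `open_in_dtale(sample_frame())` from a terminal or notebook should bring up the interactive grid. In a notebook, evaluating the returned instance in a cell typically renders the grid inline instead.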

2. Autoviz

AutoViz automatically visualizes a dataset of any size with a single line of code.

pip install autoviz==0.0.6
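
The "single line" is the `AutoViz` call on an `AutoViz_Class` object. The sketch below follows the call shape shown in the project's README and assumes the pinned 0.0.6 release accepts the same arguments, which may not hold exactly. `sample_sales` and `autoviz_charts` are hypothetical helper names, and the data is invented.

```python
import pandas as pd


def sample_sales() -> pd.DataFrame:
    """Made-up numbers, only here so the example has something to plot."""
    return pd.DataFrame(
        {
            "units": [12, 7, 30, 18, 25, 9],
            "price": [9.99, 14.5, 4.25, 7.0, 5.5, 12.0],
            "region": ["N", "S", "N", "E", "E", "S"],
        }
    )


def autoviz_charts(df: pd.DataFrame, target: str = ""):
    """Run AutoViz on an in-memory DataFrame; target may be left empty."""
    # Lazy import: the rest of this file works without autoviz installed.
    from autoviz.AutoViz_Class import AutoViz_Class

    av = AutoViz_Class()
    # An empty filename plus dfte=df asks AutoViz to use the DataFrame
    # rather than read a file from disk (assumed from the README usage).
    return av.AutoViz("", depVar=target, dfte=df)
```

`autoviz_charts(sample_sales(), target="units")` would then draw the charts AutoViz selects for that target. To read from disk instead, pass a CSV path as the first argument.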

3. Dataprep

DataPrep lets you prepare your data using a single library with a few lines of code. Collect data from common data sources (through dataprep.connector). Do your exploratory data analysis (through dataprep.eda). Clean and standardize data (through dataprep.clean).

pip install dataprep
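
As a rough illustration of the EDA part, the sketch below builds a full report with `dataprep.eda.create_report` and saves it. The helper names and the sample data are invented for this example, and the exact location of the saved file can depend on the installed DataPrep version.

```python
import pandas as pd


def sample_with_gaps() -> pd.DataFrame:
    """Made-up data with missing values, so the report has gaps to flag."""
    return pd.DataFrame(
        {
            "height_cm": [170.0, None, 182.5, 165.0, None],
            "country": ["NO", "PE", None, "IN", "NO"],
        }
    )


def dataprep_report(df: pd.DataFrame, name: str = "dataprep_report"):
    """Build a DataPrep EDA report for df and save it as an HTML file."""
    # Lazy import keeps this file importable without dataprep installed.
    from dataprep.eda import create_report

    report = create_report(df)  # overview, per-column stats, missing values
    report.save(name)           # writes an HTML file named after `name`
    return report
```

For a quicker look than a full report, `dataprep.eda` also provides `plot(df)` for an overview and `plot(df, "height_cm")` for a single column.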

4. Pandas-profiling

Generates profile reports from a pandas DataFrame. The pandas df.describe() function is great but a little basic for serious exploratory data analysis. pandas_profiling extends the pandas DataFrame with df.profile_report() for quick data analysis.

pip install pandas-profiling
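
Here is a small sketch of producing a profile report and writing it to disk. `sample_orders` and `write_profile` are illustrative names and the data is invented; `ProfileReport` and `to_file` are the library's own.

```python
import pandas as pd


def sample_orders() -> pd.DataFrame:
    """Tiny made-up table; real use would load your own data."""
    return pd.DataFrame(
        {
            "amount": [10, 20, 30, 40],
            "channel": ["web", "store", "web", "web"],
        }
    )


def write_profile(df: pd.DataFrame, path: str = "profile.html") -> str:
    """Generate a profile report for df and write it to an HTML file."""
    # Lazy import so the file loads even without pandas-profiling installed.
    from pandas_profiling import ProfileReport

    profile = ProfileReport(df, title="Orders profile")
    profile.to_file(path)  # a .html suffix yields a standalone HTML report
    return path
```

The same report is available through the accessor as `df.profile_report()`. Newer releases of the project are published as ydata-profiling, where the import becomes `from ydata_profiling import ProfileReport`.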

5. Sweetviz

Sweetviz is an open-source Python library that generates beautiful, high-density visualizations to kickstart EDA (Exploratory Data Analysis) with just two lines of code. The output is a fully self-contained HTML application. The system is built around quickly visualizing target values and comparing datasets. Its goal is to speed up analysis of target characteristics, training vs. testing data, and other such data characterization tasks.

pip install sweetviz
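
The training vs. testing comparison mentioned above might look like the sketch below. The split helper, the sample data, and the function names are assumptions made for illustration; the Sweetviz calls are `sv.compare` and `show_html`.

```python
import pandas as pd


def sample_customers() -> pd.DataFrame:
    """Eight made-up rows with a boolean target column."""
    return pd.DataFrame(
        {
            "age": [23, 35, 41, 29, 52, 38, 47, 31],
            "churned": [False, True, False, False, True, False, True, False],
        }
    )


def split_frame(df: pd.DataFrame, train_fraction: float = 0.75):
    """Split df by row position into (train, test); no shuffling."""
    cut = int(len(df) * train_fraction)
    return df.iloc[:cut], df.iloc[cut:]


def compare_report(train, test, target: str,
                   path: str = "sweetviz_compare.html") -> str:
    """Write a Sweetviz HTML report comparing train and test on target."""
    # Lazy import so the helpers above work without sweetviz installed.
    import sweetviz as sv

    report = sv.compare([train, "Training"], [test, "Testing"],
                        target_feat=target)
    report.show_html(path, open_browser=False)  # just write the file
    return path
```

With `train, test = split_frame(sample_customers())`, calling `compare_report(train, test, "churned")` writes the side-by-side report. For a single dataset, the two-line version is `sv.analyze(df)` followed by `show_html()`.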