A Python package for automated univariate and bivariate data analysis and visualization to streamline machine learning workflows.
visualize_ML is a Python package that automates univariate and bivariate data analysis and visualization for machine learning tasks. It helps users explore datasets, understand variable distributions, and identify relationships between features through automated plotting and statistical testing, streamlining the initial steps of a machine learning workflow.
It is aimed at data scientists, machine learning practitioners, and analysts who need to perform exploratory data analysis and feature selection quickly, especially those working with structured datasets in Python.
It consolidates multiple visualization and statistical analysis tasks into a simple, unified interface, saving time compared to manually coding plots and tests with libraries like matplotlib and scikit-learn individually.
Python package for consolidated and extensive univariate and bivariate data analysis and visualization, catering to both categorical and continuous datasets.
Consolidates univariate and bivariate analysis into simple functions like explore.plot and relation.plot, saving time compared to manually coding plots with matplotlib and scikit-learn.
Includes correlation coefficients, Chi-square, and ANOVA tests directly in visualizations, providing immediate statistical insights into variable relationships without extra coding.
Allows adjustment of parameters such as PLOT_COLUMNS_SIZE, bin_size, and padding for tailored visual output, as detailed in the function documentation.
Automatically filters out NaN and non-numeric values during plotting, ensuring analysis is performed on clean, numeric data without requiring separate preprocessing steps.
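To make the statistical side of these points concrete, the following is a minimal sketch of the kind of work being automated: coercing a column to numeric values, dropping NaN entries, and choosing between a correlation coefficient, a Chi-square test, and ANOVA depending on which columns are categorical. It uses pandas and SciPy directly and does not call visualize_ML. The helper names `clean_numeric` and `relation_stats`, and the sample data, are hypothetical and are not taken from the package.

```python
import pandas as pd
from scipy import stats


def clean_numeric(series: pd.Series) -> pd.Series:
    """Coerce values to numbers and drop entries that are missing or unparseable."""
    numeric = pd.to_numeric(series, errors="coerce")  # non-numeric values become NaN
    return numeric.dropna()


def relation_stats(df: pd.DataFrame, feature: str, target: str, categorical: list) -> dict:
    """Hypothetical helper: pick a test based on which columns are categorical."""
    feat_cat = feature in categorical
    targ_cat = target in categorical

    if feat_cat and targ_cat:
        # Both categorical: Chi-square test on the contingency table.
        table = pd.crosstab(df[feature], df[target])
        stat, p_value, _, _ = stats.chi2_contingency(table)
        return {"test": "chi-square", "statistic": float(stat), "p_value": float(p_value)}

    if feat_cat != targ_cat:
        # One categorical, one continuous: one-way ANOVA across the groups.
        cat_col, num_col = (feature, target) if feat_cat else (target, feature)
        values = clean_numeric(df[num_col])
        labels = df.loc[values.index, cat_col]
        groups = [group.to_numpy() for _, group in values.groupby(labels)]
        stat, p_value = stats.f_oneway(*groups)
        return {"test": "anova", "statistic": float(stat), "p_value": float(p_value)}

    # Both continuous: Pearson correlation on rows where both values are usable.
    x = pd.to_numeric(df[feature], errors="coerce")
    y = pd.to_numeric(df[target], errors="coerce")
    pair = pd.concat([x, y], axis=1).dropna()
    stat, p_value = stats.pearsonr(pair.iloc[:, 0], pair.iloc[:, 1])
    return {"test": "pearson", "statistic": float(stat), "p_value": float(p_value)}


# Illustrative data only; "age" deliberately contains a string and a missing value.
df = pd.DataFrame({
    "age": [22, 35, "n/a", 41, 29, None, 52, 38],
    "income": [30, 52, 48, 61, 40, 45, 80, 55],
    "segment": ["a", "b", "a", "b", "a", "b", "b", "a"],
    "churn": ["yes", "no", "no", "no", "yes", "yes", "no", "yes"],
})
categorical_columns = ["segment", "churn"]

print(relation_stats(df, "age", "income", categorical_columns))
print(relation_stats(df, "segment", "income", categorical_columns))
print(relation_stats(df, "segment", "churn", categorical_columns))
```

Writing this dispatch by hand for every pair of columns is the repetitive step that a consolidated interface is meant to remove; how the package itself selects and displays the tests may differ from this sketch.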
Only supports pandas DataFrames as input, with no current support for numpy arrays or other formats, as acknowledged in the 'Tasks To Do' section, restricting data compatibility.
Relies solely on matplotlib for static plots, lacking interactive features or integration with modern web-based tools like Plotly or Bokeh, which limits exploratory flexibility.
Focused on basic EDA and feature selection; it lacks advanced statistical methods, model visualization, and real-time data handling, so it is insufficient for more complex analytical needs.
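Given the DataFrame-only limitation noted above, data held in a NumPy array would need to be wrapped before it could be passed to the package. A minimal sketch of that conversion follows; the helper `to_dataframe` and the generated `feature_i` column names are hypothetical, and the subsequent call into visualize_ML is left out because its exact signature is not confirmed here.

```python
import numpy as np
import pandas as pd


def to_dataframe(array, column_names=None) -> pd.DataFrame:
    """Hypothetical helper: wrap a 1-D or 2-D array in a DataFrame with named columns."""
    arr = np.asarray(array)
    if arr.ndim == 1:
        arr = arr.reshape(-1, 1)  # treat a flat array as a single column
    if arr.ndim != 2:
        raise ValueError("expected a 1-D or 2-D array")
    if column_names is None:
        # Placeholder names, since arrays carry no column labels of their own.
        column_names = [f"feature_{i}" for i in range(arr.shape[1])]
    if len(column_names) != arr.shape[1]:
        raise ValueError("number of column names must match number of columns")
    return pd.DataFrame(arr, columns=list(column_names))


features = np.array([[5.1, 3.5], [4.9, 3.0], [6.2, 2.9]])
frame = to_dataframe(features, ["sepal_length", "sepal_width"])
print(frame.dtypes)
```

Naming the columns explicitly matters here because a column-oriented interface typically identifies target and categorical variables by name.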