Complete function reference for InsightfulPy v0.2.0.
Import pattern for all examples:
import pandas as pd
import insightfulpy as ipy
df = pd.read_csv('data.csv')

- Helper Functions
- Basic Analysis
- Statistical Functions
- Data Quality
- Visualization
- Individual Analysis
- Dataset Comparison
Display help information with function categories.
ipy.help()

List all available functions organized by category.
ipy.list_all()

Show quick start examples.
ipy.quick_start()

Show practical usage examples.
ipy.examples()

Display dataset structure with column details.
ipy.columns_info('Sales Data', df)

General analysis for numerical and categorical columns.
ipy.analyze_data(df)

Statistical summary for numerical columns. Returns DataFrame with count, mean, std, min, quartiles, max, mode, range, IQR, variance, skewness, kurtosis, and Shapiro-Wilk test.
summary = ipy.num_summary(df)

Statistical summary for categorical columns. Returns DataFrame with count, unique values, top category, frequency, and percentage.
summary = ipy.cat_summary(df)

Summary statistics grouped by categorical variable. Returns TableOne object.
summary = ipy.grouped_summary(df, groupby='category')

Calculate statistical measures for a series. Returns dictionary with count, mean, trimmed mean, MAD, std, min, quartiles, max, mode, range, IQR, variance, skewness, and kurtosis.
stats = ipy.calc_stats(df['price'])

Calculate skewness and kurtosis for numerical columns.
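For readers who want to see what these two shape measures are, here is a minimal plain-Python sketch based on population central moments. The function name `skewness_kurtosis` is hypothetical, and the choice of estimator is an assumption: this reference does not say whether InsightfulPy uses this form or a bias-corrected one, so small numeric differences from the library's output are possible.

```python
import statistics


def skewness_kurtosis(values):
    """Return (skewness, excess kurtosis) from population central moments.

    Assumption: moments are divided by n (no bias correction); the
    library may use a different estimator.
    """
    n = len(values)
    mean = statistics.fmean(values)
    # Second, third, and fourth central moments.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return skew, excess_kurtosis


symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 2, 10]

sym_skew, sym_kurt = skewness_kurtosis(symmetric)
tail_skew, tail_kurt = skewness_kurtosis(right_tailed)
print(sym_skew, sym_kurt)    # symmetric data: skewness is 0
print(tail_skew, tail_kurt)  # long right tail: skewness is positive
```

Positive skewness indicates a longer right tail; excess kurtosis above zero indicates heavier tails than a normal distribution.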
dist_shape = ipy.calculate_skewness_kurtosis(df)

Calculate trimmed mean using IQR method (excludes outliers beyond 1.5*IQR).
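The trimming rule described above can be sketched in plain Python. This is an illustration of the idea, not the library's implementation: the name `iqr_trimmed_mean_sketch` and the parameter `k` are hypothetical, and the quartile interpolation method is an assumption.

```python
import statistics


def iqr_trimmed_mean_sketch(values, k=1.5):
    """Mean of the values lying within [Q1 - k*IQR, Q3 + k*IQR]."""
    # Assumption: quartiles use linear interpolation between data points
    # (method="inclusive"), the same rule pandas applies by default.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    kept = [x for x in values if lower <= x <= upper]
    return statistics.fmean(kept)


prices = [10, 12, 11, 13, 12, 11, 100]
plain_mean = statistics.fmean(prices)
trimmed_mean = iqr_trimmed_mean_sketch(prices)
print(plain_mean)    # pulled upward by the single value 100
print(trimmed_mean)  # 11.5, computed without the value 100
```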
trimmed = ipy.iqr_trimmed_mean(df['price'])

Calculate Median Absolute Deviation.
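As a reminder of the definition, the median absolute deviation is the median of the absolute distances from the median. The sketch below uses a hypothetical function name and returns the unscaled value; some libraries multiply by a consistency constant (about 1.4826) to make it comparable to a standard deviation, and this reference does not say which convention InsightfulPy follows.

```python
import statistics


def median_abs_deviation(values):
    """Unscaled MAD: median(|x - median(x)|)."""
    center = statistics.median(values)
    return statistics.median(abs(x - center) for x in values)


prices = [10, 12, 11, 13, 12, 11, 100]
mad_value = median_abs_deviation(prices)
print(mad_value)                 # 1
print(statistics.stdev(prices))  # much larger, dominated by the value 100
```

The comparison with the standard deviation shows why MAD is often preferred as a spread measure when a column contains extreme values.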
deviation = ipy.mad(df['price'])

Detect missing and infinite values. If both missing and inf are False, checks both.
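The distinction between missing and infinite values can be illustrated on a plain list. This is only a sketch of the two checks; the helper name `count_missing_and_inf` is hypothetical, and treating both `None` and NaN as missing is an assumption modeled on how pandas treats them.

```python
import math


def count_missing_and_inf(column):
    """Count missing (None or NaN) and infinite entries in a list."""
    missing = 0
    infinite = 0
    for value in column:
        if value is None:
            missing += 1
        elif isinstance(value, float) and math.isnan(value):
            missing += 1
        elif isinstance(value, float) and math.isinf(value):
            infinite += 1
    return {"missing": missing, "infinite": infinite}


readings = [1.5, None, float("nan"), float("inf"), -float("inf"), 2.0]
counts = count_missing_and_inf(readings)
print(counts)  # {'missing': 2, 'infinite': 2}
```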
ipy.missing_inf_values(df)                 # Both
ipy.missing_inf_values(df, missing=True)   # Only missing
ipy.missing_inf_values(df, df_table=True)  # Return DataFrame

Detect outliers using IQR method. Returns DataFrame with Q1, Q3, IQR, bounds, outlier count and values.
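The quantities listed above (Q1, Q3, IQR, bounds, outlier count and values) can be computed for a single column as follows. This is a rough sketch of the standard IQR rule with the conventional 1.5 multiplier, not the library's code; the function name, the dictionary keys, and the quartile interpolation method are all assumptions.

```python
import statistics


def iqr_outlier_report(values, k=1.5):
    """Summarize IQR-based outliers for one numeric column."""
    # Assumption: linear-interpolation quartiles (method="inclusive").
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = [x for x in values if x < lower or x > upper]
    return {
        "Q1": q1,
        "Q3": q3,
        "IQR": iqr,
        "lower_bound": lower,
        "upper_bound": upper,
        "outlier_count": len(outliers),
        "outliers": outliers,
    }


quantities = [4, 5, 5, 6, 6, 7, 7, 8, 40]
report = iqr_outlier_report(quantities)
print(report["lower_bound"], report["upper_bound"])  # 2.0 10.0
print(report["outliers"])                            # [40]
```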
outliers = ipy.detect_outliers(df)
outliers = ipy.detect_outliers(df, max_display=20)

Detect columns with mixed data types.
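A column has mixed data types when its values are not all of one Python type, which commonly happens after loading loosely formatted CSV files. The sketch below shows one way to check this on a plain dictionary of lists; the function name is hypothetical, and ignoring `None` entries is an assumption rather than documented library behavior.

```python
def mixed_type_columns(table):
    """Map column name to its type names, for columns with more than one type."""
    result = {}
    for name, values in table.items():
        # Assumption: None marks a missing value and does not count as a type.
        kinds = sorted({type(v).__name__ for v in values if v is not None})
        if len(kinds) > 1:
            result[name] = kinds
    return result


table = {
    "order_id": [1001, 1002, "1003"],  # an ID stored as text slipped in
    "price": [9.99, 12.5, 7.0],
}
mixed = mixed_type_columns(table)
print(mixed)  # {'order_id': ['int', 'str']}
```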
ipy.detect_mixed_data_types(df)

Identify categorical columns with high cardinality (default threshold: 20 unique values).
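Cardinality here means the number of distinct values in a column. A minimal sketch of the threshold check follows; the function name is hypothetical, and whether the library compares with "greater than" or "greater than or equal to" is not stated in this reference, so the strict comparison below is an assumption.

```python
def high_cardinality_columns(table, threshold=20):
    """Return {column: unique_count} for columns exceeding the threshold."""
    result = {}
    for name, values in table.items():
        unique_count = len(set(values))
        # Assumption: strictly greater than the threshold counts as high.
        if unique_count > threshold:
            result[name] = unique_count
    return result


table = {
    "customer_id": [f"C{i}" for i in range(50)],  # 50 distinct values
    "region": ["north", "south"] * 25,            # 2 distinct values
}
flagged = high_cardinality_columns(table)
print(flagged)  # {'customer_id': 50}
```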
high_card = ipy.cat_high_cardinality(df, threshold=100)

Batch Functions: Many visualization functions support batch processing. Call without batch_num to see available batches, then specify batch_num to plot. See User Guide - Working with Batches for workflow details.
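The two-step workflow described above (list the batches first, then pick one) can be pictured with a small plain-Python sketch. The helper names and the batch size are hypothetical, and 1-based batch numbering is an assumption inferred from the `batch_num=1` examples in this reference.

```python
def split_into_batches(columns, batch_size):
    """Split a list of column names into consecutive groups."""
    return [columns[i:i + batch_size] for i in range(0, len(columns), batch_size)]


def select_batch(batches, batch_num=None):
    """Without batch_num, list all batches; with it, return that batch."""
    if batch_num is None:
        # Assumption: batches are numbered starting at 1.
        return {number: cols for number, cols in enumerate(batches, start=1)}
    return batches[batch_num - 1]


columns = ["price", "quantity", "discount", "tax", "weight"]
batches = split_into_batches(columns, batch_size=2)

overview = select_batch(batches)            # step 1: see what is available
chosen = select_batch(batches, batch_num=2) # step 2: pick one batch to plot
print(overview)
print(chosen)  # ['discount', 'tax']
```

Splitting wide datasets this way keeps each figure to a readable number of subplots.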
Visualize missing data patterns using matrix and bar charts.
ipy.show_missing(df)

Create box plots for all numerical columns.
ipy.plot_boxplots(df)

Display KDE plots in batches.
batches = ipy.kde_batches(df)
ipy.kde_batches(df, batch_num=1)

Display box plots in batches.
ipy.box_plot_batches(df, batch_num=1)

Display QQ plots in batches.
ipy.qq_plot_batches(df, batch_num=1)

Display bar charts for categorical columns in batches.
ipy.cat_bar_batches(df, batch_num=1)
ipy.cat_bar_batches(df, batch_num=1, show_percentage=True, high_cardinality_limit=20)

Display pie charts for categorical columns in batches.
ipy.cat_pie_chart_batches(df, batch_num=1)
ipy.cat_pie_chart_batches(df, batch_num=1, high_cardinality_limit=10)

Create scatter plots for numerical column pairs in batches.
pairs = ipy.num_vs_num_scatterplot_pair_batch(df)
ipy.num_vs_num_scatterplot_pair_batch(df, pair_num=0, batch_num=1)
ipy.num_vs_num_scatterplot_pair_batch(df, pair_num=0, batch_num=1, hue_column='category')

Create heatmaps for categorical column pairs in batches.
pairs = ipy.cat_vs_cat_pair_batch(df)
ipy.cat_vs_cat_pair_batch(df, pair_num=0, batch_num=1)
ipy.cat_vs_cat_pair_batch(df, pair_num=0, batch_num=1, high_cardinality_limit=15)

Create box and violin plots for numerical vs categorical pairs in batches.
pairs = ipy.num_vs_cat_box_violin_pair_batch(df)
ipy.num_vs_cat_box_violin_pair_batch(df, pair_num=0, batch_num=1)

Analyze and visualize individual numerical column with histogram, KDE, and box plot.
ipy.num_analysis_and_plot(df, 'price')
ipy.num_analysis_and_plot(df, 'price', target='category')
ipy.num_analysis_and_plot(df, 'price', visualize=True, return_df=True)

Analyze and visualize individual categorical column with bar charts.
ipy.cat_analyze_and_plot(df, 'category')
ipy.cat_analyze_and_plot(df, 'category', target='status')

Compare columns across multiple DataFrames. Returns base profile and linked profiles.
dfs = {'sales': df1, 'inventory': df2, 'orders': df3}
base, linked = ipy.compare_df_columns('sales', dfs)

Identify common columns across multiple DataFrames.
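Finding common columns amounts to recording, for each column name, which tables contain it and keeping the names that appear more than once. The sketch below works on plain lists of column names; the function name and the sample schemas are hypothetical, and the output shape is an assumption, not the library's return format.

```python
from collections import defaultdict


def shared_columns(schemas):
    """Map each column name found in two or more tables to those tables."""
    owners = defaultdict(list)
    for table_name, columns in schemas.items():
        for column in columns:
            owners[column].append(table_name)
    return {col: tables for col, tables in owners.items() if len(tables) > 1}


schemas = {
    "sales": ["order_id", "product_id", "amount"],
    "inventory": ["product_id", "stock"],
    "orders": ["order_id", "customer_id"],
}
links = shared_columns(schemas)
print(links)
# {'order_id': ['sales', 'orders'], 'product_id': ['sales', 'inventory']}
```

Columns that show up in several tables are natural candidates for join keys.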
dfs = {'sales': df1, 'inventory': df2}
ipy.linked_key(dfs)

Display linked columns from base DataFrame.
ipy.display_key_columns('sales', dfs)

Identify rows with outliers in multiple columns.
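One plausible reading of "outliers in multiple columns" is a row whose values fall outside the IQR bounds in at least two of the listed columns. The sketch below implements that reading on a dictionary of lists. The function names, the `min_columns` parameter, and the use of the IQR rule are all assumptions; this reference does not specify the library's exact criterion.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Lower and upper IQR fences for one column."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def rows_with_multiple_outliers(table, columns, min_columns=2):
    """Return (row_index, outlying_columns) for rows flagged in >= min_columns."""
    bounds = {col: iqr_bounds(table[col]) for col in columns}
    n_rows = len(table[columns[0]])
    flagged = []
    for i in range(n_rows):
        hits = [
            col for col in columns
            if not bounds[col][0] <= table[col][i] <= bounds[col][1]
        ]
        if len(hits) >= min_columns:
            flagged.append((i, hits))
    return flagged


table = {
    "price":    [10, 11, 12, 11, 10, 12, 11, 95],
    "quantity": [1, 2, 2, 3, 2, 3, 40, 50],
}
both = rows_with_multiple_outliers(table, ["price", "quantity"])
any_one = rows_with_multiple_outliers(table, ["price", "quantity"], min_columns=1)
print(both)     # [(7, ['price', 'quantity'])]
print(any_one)  # [(6, ['quantity']), (7, ['price', 'quantity'])]
```

Row 6 is extreme only in quantity, so it is reported only when a single outlying column is enough.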
outliers = ipy.interconnected_outliers(df, ['price', 'quantity', 'discount'])

Analysis for categorical columns with optional missing value separation.
summary = ipy.comp_cat_analysis(df)
missing, non_missing = ipy.comp_cat_analysis(df, missing_df=True)

Analysis for numerical columns with optional missing/outlier separation. Includes normality tests (Shapiro-Wilk for n<=5000, Kolmogorov-Smirnov for n>5000).
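The sample-size rule stated above can be written as a one-line selection, which makes the boundary case explicit: a column with exactly 5000 values still gets Shapiro-Wilk. The function name and the `cutoff` parameter are hypothetical; the sketch only picks a test name and does not run either test.

```python
def choose_normality_test(n, cutoff=5000):
    """Pick the normality test name based on the number of observations."""
    return "Shapiro-Wilk" if n <= cutoff else "Kolmogorov-Smirnov"


small = choose_normality_test(1200)
boundary = choose_normality_test(5000)
large = choose_normality_test(5001)
print(small, boundary, large)
# Shapiro-Wilk Shapiro-Wilk Kolmogorov-Smirnov
```

A cutoff of 5000 is a common choice because Shapiro-Wilk p-values are generally considered less reliable for larger samples.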
summary = ipy.comp_num_analysis(df)
outlier, non_outlier = ipy.comp_num_analysis(df, outlier_df=True)

- User Guide - Usage examples and workflows
- Configuration - Constants reference
Version: 0.2.0 | Status: Beta | Python: 3.8-3.12
Copyright 2025 dhaneshbb | License: MIT | Homepage: https://github.com/dhaneshbb/insightfulpy