descriptive package

Submodules

descriptive.pyeda module

descriptive.pyeda.display_column_types(data)[source]

Separate numerical columns and categorical columns.

Parameters:

data – variable

Returns:

list of numerical and categorical feature names
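As a rough illustration of the numeric/categorical split described above, the following dependency-free sketch classifies the columns of a small column-oriented table. It is not the package's implementation (which presumably relies on pandas dtypes); the `split_column_types` helper and the sample table are hypothetical.

```python
from numbers import Number


def split_column_types(table):
    """Return (numeric_names, categorical_names) for a dict of column lists.

    Hypothetical helper: a column counts as numeric when every non-missing
    value is a number (booleans are excluded on purpose).
    """
    numeric, categorical = [], []
    for name, values in table.items():
        present = [v for v in values if v is not None]
        is_numeric = bool(present) and all(
            isinstance(v, Number) and not isinstance(v, bool) for v in present
        )
        (numeric if is_numeric else categorical).append(name)
    return numeric, categorical


# Sample data invented for this example; None marks a missing value.
table = {
    "age": [31, 45, None, 22],
    "income": [52000.0, 61000.5, 48000.0, 39000.0],
    "city": ["Riyadh", "Jeddah", "Riyadh", None],
}
numeric_cols, categorical_cols = split_column_types(table)
print(numeric_cols)      # ['age', 'income']
print(categorical_cols)  # ['city']
```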

descriptive.pyeda.display_dataset_detail(data)[source]

Print dataset details.

Parameters:

data – variable

Returns:

dataset details

descriptive.pyeda.display_dataset_info(data) → None[source]

Print dataset info.

Parameters:

data – variable

Returns:

total rows and columns

descriptive.pyeda.display_describe_data(data) → None[source]

Calculate basic statistics for the whole data.

Parameters:

data – variable

Returns:

None; the output of describe() is printed

descriptive.pyeda.display_summary_data(data)[source]

Print a summary table of the columns, showing for each one:

  • Number of unique values.

  • Null values.

  • Null percentage.

  • Data type.

Parameters:

data – variable

Returns:

Summarize columns
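The four summary fields listed above could be computed along the following lines. This is a minimal stdlib sketch under stated assumptions, not the package's code: `summarize_columns` is a hypothetical helper, the table is a plain dict of column lists, and `None` stands in for a null value.

```python
def summarize_columns(table):
    """Build one summary row per column of a dict of column lists."""
    summary = []
    for name, values in table.items():
        present = [v for v in values if v is not None]
        nulls = len(values) - len(present)
        summary.append({
            "column": name,
            "unique": len(set(present)),            # distinct non-null values
            "nulls": nulls,                         # count of missing values
            "null_pct": round(100 * nulls / len(values), 2) if values else 0.0,
            "dtype": type(present[0]).__name__ if present else "unknown",
        })
    return summary


# Sample data invented for this example.
table = {
    "age": [31, 45, None, 22],
    "city": ["Riyadh", "Jeddah", "Riyadh", None],
}
rows = summarize_columns(table)
for row in rows:
    print(row)
```

Each row reports the same quantities the summary table names: unique count, null count, null percentage, and data type.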

descriptive.pyeda.import_dataset(file_name: str)[source]

Read a CSV data file.

Parameters:

file_name – string containing the CSV file name

Returns:

pandas DataFrame

descriptive.pyeda.read_dataset(data)[source]

Reading data from a variable.

Parameters:

data – variable

Returns:

data

descriptive.pyeda.save_data_to_csv_file(data, filename: str)[source]

Save data to a csv file.

Parameters:
  • data – variable

  • filename – string

Returns:

updated csv data file
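The save and load steps documented above amount to a CSV round trip. The sketch below shows that round trip with Python's standard `csv` module only; the package itself returns a pandas DataFrame, so this is an illustration of the file format handling rather than of its implementation. The file name `example.csv` and the sample rows are hypothetical.

```python
import csv
import os
import tempfile

# Sample records invented for this example; CSV stores every value as text.
rows = [
    {"city": "Riyadh", "sales": "120"},
    {"city": "Jeddah", "sales": "95"},
]

# A temporary directory keeps the example self-cleaning.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.csv")

    # Save: write a header row followed by one line per record.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["city", "sales"])
        writer.writeheader()
        writer.writerows(rows)

    # Load: read the file back into a list of dictionaries.
    with open(path, newline="", encoding="utf-8") as handle:
        loaded = [dict(record) for record in csv.DictReader(handle)]

print(loaded == rows)  # True
```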

descriptive.pyeda.select_categorical_variables(data) → list[source]

Selecting categorical variables.

Parameters:

data – variable

Returns:

all categorical features in a dataset

descriptive.pyeda.select_numeric_variables(data) → list[source]

Selecting numerical variables.

Parameters:

data – variable

Returns:

all numeric features in a dataset

descriptive.pyeda.show_values(axs, orient='v', space=0.01)[source]

descriptive.pyeda.vis_advanced_stack_bar(data, first_categorical: str, second_categorical: str, third_categorical: str, title: str = 'Add Chart Title', subtitle: str = 'explain ur data viz by subtitle')[source]

Visualize percentage relationship using three categorical variables.

Add new title and subtitle content as a string.

Parameters:
  • data – variable

  • first_categorical – str

  • second_categorical – str

  • third_categorical – str

  • title – str

  • subtitle – str

Returns:

stacked bar plot

descriptive.pyeda.vis_heatmap(data)[source]

Visualize the correlation between multiple numeric columns.

Parameters:

data – variable

Returns:

heatmap
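A correlation heatmap colors the cells of a pairwise correlation matrix. The sketch below computes such a matrix with the Pearson coefficient using only the standard library. Pearson is an assumption here, since the documentation does not say which correlation method the function uses, and the `pearson` and `correlation_matrix` helpers are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(table):
    """Pairwise correlations for a dict of numeric column lists."""
    names = list(table)
    return {a: {b: pearson(table[a], table[b]) for b in names} for a in names}


# Sample data invented for this example.
table = {
    "hours": [1, 2, 3, 4, 5],
    "score": [2, 4, 6, 8, 10],   # rises exactly with hours
    "errors": [10, 8, 6, 4, 2],  # falls exactly with hours
}
matrix = correlation_matrix(table)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

Values near 1 or -1 would appear as the strongest colors in the heatmap, and values near 0 as the weakest.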

descriptive.pyeda.vis_highest_percentage_datapoints(data, first_categorical_col: str, second_categorical_col: str, title: str = 'add chart title', subtitle: str = 'explain ur data viz by subtitle ')[source]

Visualize the highest percentage of data point values for a categorical variable, grouped by a second categorical variable.

Add new title and subtitle content as a string.

Parameters:
  • data – variable

  • first_categorical_col – str

  • second_categorical_col – str

  • title – str

  • subtitle – str

Returns:

count plot

descriptive.pyeda.vis_lowest_percentage_datapoints(data, first_categorical: str, second_categorical: str, title: str = 'add chart title', subtitle: str = 'explain ur data viz by subtitle ')[source]

Visualize the lowest percentage of data point values for a categorical variable, grouped by a second categorical variable.

Add new title and subtitle content as a string.

Parameters:
  • data – variable

  • first_categorical – str

  • second_categorical – str

  • title – str

  • subtitle – str

Returns:

count plot

descriptive.pyeda.vis_the_highest_label_pie_chart(data, categorical_col: str, numerical_col: str, title: str = 'add chart title', subtitle: str = 'explain ur data viz by subtitle')[source]

Visualize the highest label in a pie chart.

Add new title and subtitle content as a string.

Parameters:
  • data – variable

  • categorical_col – str

  • numerical_col – str

  • title – str

  • subtitle – str

Returns:

pie chart

descriptive.pyeda.vis_top_highest_average(data_frame, categorical_column: list, numerical_column: str, avg_numbers: list, title: str = 'add chart title', subtitle: str = 'Explain ur data viz by subtitle ')[source]

Visualize the highest average values.

Pass avg_numbers to determine which bars are colored.

Add new title and subtitle content as a string.

Parameters:
  • data_frame – variable

  • categorical_column – list

  • numerical_column – str

  • avg_numbers – list

  • title – str

  • subtitle – str

Returns:

bar chart

descriptive.pyeda.vis_top_ten_values(data, first_column: str, by_second_column: str, color_bar: list, title: str = 'explain ur data viz by subtitle', subtitle: str = 'explain ur data viz by subtitle')[source]

Visualize the top ten values. Pass color_bar as a list of integers to determine which bars are colored.

Add new title and subtitle content as a string.

Parameters:
  • data – Variable

  • first_column – str

  • by_second_column – str

  • color_bar – list

  • title – str

  • subtitle – str

Returns:

Bar plot

descriptive.pyeda.visualize_advanced_bar_plot(data, categorical_col: str, numerical_col: str, second_categorical_col: str, title: str = 'add chart title', subtitle: str = 'Explain ur data viz by subtitle ')[source]

Summarize two categorical columns by a numerical column.

Add new title and subtitle content as a string.

Parameters:
  • data – variable

  • categorical_col – str

  • numerical_col – str

  • second_categorical_col – str

  • title – str

  • subtitle – str

Returns:

bar chart

descriptive.pyeda.visualize_advanced_kde(data, first_numeric: str, second_numeric: str, categorical_col: str, title: str = 'Add Chart Title', subtitle: str = 'Explain ur data viz by subtitle')[source]

Visualize the kernel density estimate of two numeric columns by a categorical column.

Parameters:
  • data – variable

  • first_numeric – str

  • second_numeric – str

  • categorical_col – str

  • title – str

  • subtitle – str

Returns:

KDE Plot

descriptive.pyeda.visualize_advanced_scatter_plot(data, first_numeric: str, second_numeric: str, categorical_col: str, title: str = 'Add Chart Title', subtitle: str = 'Explain ur data viz by subtitle')[source]

Visualize the relationship between two numeric variables, using a third categorical variable to set the color of the data points.

Add new title and subtitle content as a string.

Parameters:
  • data – Data frame variable

  • first_numeric – str

  • second_numeric – str

  • categorical_col – str

  • title – str

  • subtitle – str

Returns:

Advanced scatter plot

descriptive.pyeda.visualize_basic_bar_plot(data, numerical_col: str, categorical_col: str, title: str = 'Add chart title ', subtitle: str = 'Explain ur data viz by subtitle ')[source]

Visualize the mean of a numeric column by the categories of a categorical column.

Add new Title and subtitle content to describe your chart.

Parameters:
  • data – variable

  • numerical_col – str

  • categorical_col – str

  • title – str

  • subtitle – str

Returns:

Bar chart
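The bar heights in such a chart are per-category means. A minimal stdlib sketch of that aggregation follows; the `mean_by_category` helper and the sample records are hypothetical and only illustrate the computation, not how the package draws the chart.

```python
from collections import defaultdict


def mean_by_category(records, numerical_col, categorical_col):
    """Average numerical_col within each distinct value of categorical_col."""
    groups = defaultdict(list)
    for record in records:
        groups[record[categorical_col]].append(record[numerical_col])
    return {category: sum(vals) / len(vals) for category, vals in groups.items()}


# Sample data invented for this example.
records = [
    {"department": "sales", "salary": 4000},
    {"department": "sales", "salary": 6000},
    {"department": "it", "salary": 7000},
    {"department": "it", "salary": 9000},
    {"department": "hr", "salary": 5000},
]
bar_heights = mean_by_category(records, "salary", "department")
print(bar_heights)  # {'sales': 5000.0, 'it': 8000.0, 'hr': 5000.0}
```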

descriptive.pyeda.visualize_basic_kde(data, first_numeric: str, title: str = 'Add Chart Title', subtitle: str = 'Explain ur data viz by subtitle')[source]

Visualize the kernel density estimate of a numeric column.

Parameters:
  • data – variable

  • first_numeric – str

  • title – str

  • subtitle – str

Returns:

KDE Plot

descriptive.pyeda.visualize_basic_scatter_plot(data, first_column: str, second_column: str, title: str = 'Add Chart Title', subtitle: str = 'Explain ur data viz by subtitle')[source]

Visualize the relationship between two numeric columns.

Add new title and subtitle content as a string.

Parameters:
  • data – variable

  • first_column – str

  • second_column – str

  • title – str

  • subtitle – str

Returns:

Basic scatter plot

descriptive.pyeda.visualize_boxplot(data, numeric_column: str, categorical_column: str, title: str = 'Add Chart Title', subtitle: str = 'Explain ur data viz by subtitle')[source]

Visualize the distribution of a numeric column by the categories of a categorical column.

Add new title and subtitle content as a string.

Parameters:
  • data – variable

  • numeric_column – str

  • categorical_column – str

  • title – str

  • subtitle – str

Returns:

Box Plot

descriptive.pyeda.visualize_causation(data, first_numeric: str, second_numeric: str, category_col: str, title: str = 'add chart title', subtitle: str = 'explain ur data viz by subtitle ')[source]

Visualize the relationship between two numeric columns by a categorical column.

Add new title and subtitle content as a string.

Parameters:
  • data – variable

  • first_numeric – str

  • second_numeric – str

  • category_col – str

  • title – str

  • subtitle – str

Returns:

regression plot

descriptive.pyeda.visualize_countplot(data, first_categorical_col: str, second_categorical_col: str, title: str = 'add chart title', subtitle: str = 'explain ur data viz by subtitle ')[source]

Visualize the count of a categorical column by the categories of another categorical column.

Add new title and subtitle content as a string.

Parameters:
  • data – variable

  • first_categorical_col – str

  • second_categorical_col – str

  • title – str

  • subtitle – str

Returns:

count plot

descriptive.pyeda.visualize_distribution_of_categorical_col(data_frame, column_name: str)[source]

Visualize the distribution of a categorical column.

Parameters:
  • data_frame – variable

  • column_name – str

Returns:

histogram and pie charts

descriptive.pyeda.visualize_distribution_of_numeric_col(data_frame, column_name: str, bins: int) → None[source]

Visualize the distribution of a numeric column.

Parameters:
  • data_frame – variable

  • column_name – str

  • bins – int

Returns:

Histogram, boxplot, q-q plot, skewness and kurtosis values
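For readers unfamiliar with the skewness and kurtosis values mentioned above, the sketch below computes both from central moments. It uses the population (biased) definitions and reports excess kurtosis; the package may well use bias-corrected estimators instead, so its numbers can differ slightly. The `skewness_and_kurtosis` helper is hypothetical.

```python
def skewness_and_kurtosis(values):
    """Return (skewness, excess kurtosis) using population central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skew, excess_kurtosis


# A symmetric sample has zero skewness.
symmetric_skew, symmetric_kurt = skewness_and_kurtosis([1, 2, 3, 4, 5])
print(round(symmetric_skew, 4), round(symmetric_kurt, 4))

# A sample with a long right tail has positive skewness.
right_skew, _ = skewness_and_kurtosis([1, 1, 1, 2, 10])
print(right_skew > 0)  # True
```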

descriptive.pyeda.visualize_kde(data, first_numeric: str, categorical_col: str, title: str = 'Add Chart Title', subtitle: str = 'Explain ur data viz by subtitle')[source]

Visualize the kernel density estimate of a numeric column per categorical column.

Parameters:
  • data – variable

  • first_numeric – str

  • categorical_col – str

  • title – str

  • subtitle – str

Returns:

KDE Plot

descriptive.pyeda.visualize_line_plot(data, first_numeric: str, second_numeric: str, categorical_col: str, title: str = 'Add Chart Title')[source]

Visualize the relationship between two numeric columns and the categories of a categorical column.

Add title to chart as a string.

Parameters:
  • data – variable

  • first_numeric – str

  • second_numeric – str

  • categorical_col – str

  • title – str

Returns:

lm plot

descriptive.pyeda.visualize_linear_regression(data, first_numeric: str, second_numeric: str, title: str = 'add chart title', subtitle: str = 'explain ur data viz by subtitle ')[source]

Visualize the relationship between two numeric columns.

Add new title and subtitle content as a string.

Parameters:
  • data – variable

  • first_numeric – str

  • second_numeric – str

  • title – str

  • subtitle – str

Returns:

regression plot

descriptive.pyeda.visualize_multi_numeric_columns_avg(data, numerical_col: list, categorical_col: str, title: str = 'Add chart title ', subtitle: str = 'Explain ur data viz by subtitle ')[source]

Visualize the mean of multiple numeric columns by the categories of a categorical column. Add new title and subtitle content to describe your chart.

Parameters:
  • data – variable

  • numerical_col – list

  • categorical_col – str

  • title – str

  • subtitle – str

Returns:

point plot chart

descriptive.pyeda.visualize_pair_plot(data, numerical_columns: list, by_categorical_col: str)[source]

Visualize the relationship between multiple numeric columns by a categorical column.

Parameters:
  • data – variable

  • numerical_columns – list

  • by_categorical_col – str

Returns:

pair plot

descriptive.pyeda.visualize_pie_chart(data, categorical_col: str, numerical_col: str, title: str = 'Add Chart Title', subtitle: str = 'explain ur data viz by subtitle')[source]

Visualize a pie chart.

Add new title and subtitle content as a string.

Parameters:
  • data – variable

  • categorical_col – str

  • numerical_col – str

  • title – str

  • subtitle – str

Returns:

pie chart

descriptive.pyeda.visualize_point_plot(data, numerical_col: str, categorical_col: str, title: str = 'Add Chart Title', subtitle: str = 'Explain ur data viz by subtitle ')[source]

Visualize the mean of a numerical column by the categories of a categorical column.

Add new title and subtitle content as a string.

Parameters:
  • data – variable

  • numerical_col – str

  • categorical_col – str

  • title – str

  • subtitle – str

Returns:

point plot

descriptive.pyeda.visualize_stack_bar(data, first_categorical: str, second_categorical: str, title: str = 'Add Chart Title', subtitle: str = 'explain ur data viz by subtitle')[source]

Visualize percentage relationship using two categorical variables.

Add new title and subtitle content as a string.

Parameters:
  • data – Data frame variable

  • first_categorical – str

  • second_categorical – str

  • title – str

  • subtitle – str

Returns:

stacked bar plot
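The segments of a percentage stacked bar come from a normalized cross-tabulation of the two categorical columns. The sketch below assumes the percentages are taken within each value of the first categorical column (so every bar sums to 100); the documentation does not state the normalization direction, and the `stacked_percentages` helper and sample records are hypothetical.

```python
from collections import Counter, defaultdict


def stacked_percentages(records, first_categorical, second_categorical):
    """Percentage share of each second-category value within each first-category value."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record[first_categorical]][record[second_categorical]] += 1
    return {
        outer: {
            inner: 100 * n / sum(inner_counts.values())
            for inner, n in inner_counts.items()
        }
        for outer, inner_counts in counts.items()
    }


# Sample data invented for this example.
records = [
    {"plan": "basic", "churned": "yes"},
    {"plan": "basic", "churned": "no"},
    {"plan": "basic", "churned": "no"},
    {"plan": "basic", "churned": "no"},
    {"plan": "premium", "churned": "yes"},
    {"plan": "premium", "churned": "no"},
]
shares = stacked_percentages(records, "plan", "churned")
print(shares["basic"])    # {'yes': 25.0, 'no': 75.0}
print(shares["premium"])  # {'yes': 50.0, 'no': 50.0}
```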

descriptive.pyeda.visualize_time_relationship(data, date_colum: str, numerical_column: str, filter_by: str, title: str = 'add chart title', subtitle: str = 'explain ur data viz by subtitle ')[source]

Visualize the sum of all data points of a continuous variable that share the same date value, filtered by:

  • Year

  • Month

  • Day

Add new title and subtitle content as a string.

Parameters:
  • data – variable

  • date_colum – str

  • numerical_column – str

  • filter_by – str

  • title – str

  • subtitle – str

Returns:

Line Chart
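The line chart plots one total per date bucket. The sketch below shows one plausible reading of that aggregation using only the standard library. It assumes that `filter_by` accepts the strings "Year", "Month", and "Day" exactly as listed above and that "Month" and "Day" pool values across years; neither detail is confirmed by the documentation, and the `sum_by_period` helper is hypothetical.

```python
from collections import defaultdict
from datetime import date


def sum_by_period(records, date_col, numerical_col, filter_by):
    """Sum numerical_col per year, month, or day component of date_col."""
    # Assumed spellings of the filter values; the real function may differ.
    key_for = {
        "Year": lambda d: d.year,
        "Month": lambda d: d.month,
        "Day": lambda d: d.day,
    }[filter_by]
    totals = defaultdict(float)
    for record in records:
        totals[key_for(record[date_col])] += record[numerical_col]
    return dict(sorted(totals.items()))


# Sample data invented for this example.
records = [
    {"order_date": date(2022, 1, 15), "amount": 100},
    {"order_date": date(2022, 3, 2), "amount": 50},
    {"order_date": date(2023, 1, 20), "amount": 25},
]
by_year = sum_by_period(records, "order_date", "amount", "Year")
by_month = sum_by_period(records, "order_date", "amount", "Month")
print(by_year)   # {2022: 150.0, 2023: 25.0}
print(by_month)  # {1: 125.0, 3: 50.0}
```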

descriptive.pyeda.visualize_time_relationship_by_categorical_variable(data, date_colum: str, numerical_col: str, categorical_colum: str, filter_by: str, title: str = 'add chart title', subtitle: str = 'explain ur data viz by subtitle ')[source]

Visualize the sum of a continuous column over a date column, split by a categorical column, filtered by:

  • Year

  • Month

  • Day

Add new title and subtitle content as a string.

Parameters:
  • data – variable

  • date_colum – str

  • numerical_col – str

  • filter_by – str

  • title – str

  • categorical_colum – str

  • subtitle – str

Returns:

Line Chart

Module contents