py_simple_report package

Submodules

py_simple_report.main module

py_simple_report.main.barplot_multi_binaries_with_strat(df_sum, vis_var, percentage=False, legend=True)

Plot multiple binary items simultaneously.

Return type: None

py_simple_report.main.create_one_question_data_container(var_name, str_items, desc, missing='missing')

Create question data container object.

Parameters

var_name (str) – question variable name.
str_items (str) – a string that maps every numerical value to its label. Ex. ‘1=very good,2=moderately good,3=moderately bad,4=bad’
desc (str) – a question description.
missing (str) – np.nan is replaced with this value.

Return type

QuestionDataContainer

py_simple_report.main.crosstab_cate_barplot(tab, qdc, vis_var, percentage=False, legend=True, transpose=False)

Plot cross-tabulated data as a bar plot.

Return type: None

py_simple_report.main.crosstab_cate_stacked_barplot(tab, qdc, vis_var, percentage=False, legend=True)

Plot cross-tabulated data as a stacked bar plot.

Return type: None

py_simple_report.main.crosstab_data(df, qdc, qdc_strf, percentage=True, skip_miss=False, crosstab_kwgs=None)

Cross-tabulate data given the original dataframe, qdc and qdc_strf. The percentage and skip_miss parameters control the steps below.

1. Cross-tabulate a specific column with a column for stratification.
2. Reorder the index according to “order” in QuestionDataContainer.
3. Skip missing values or not.
4. Adjust to percentages (out of 100%) or not.

Parameters: crosstab_kwgs – a dictionary passed to pd.crosstab. The “percentage” parameter edits this dictionary.
Return type: DataFrame
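
The four steps can be illustrated without pandas. The sketch below is not the library's implementation: `crosstab_sketch` and its arguments are hypothetical names, the input is a list of (answer, stratum) pairs rather than a DataFrame, and it assumes percentages are taken within each stratum, which the documentation above does not state.

```python
from collections import Counter


def crosstab_sketch(pairs, row_order, col_order, missing="missing",
                    percentage=True, skip_miss=False):
    """Hypothetical stand-in for the four steps; pairs are (answer, stratum)."""
    # Step 1: count every (answer, stratum) combination; None marks a missing answer.
    counts = Counter((missing if a is None else a, s) for a, s in pairs)
    # Steps 2 and 3: apply the requested row order, optionally dropping the missing row.
    labels = [r for r in row_order if not (skip_miss and r == missing)]
    table = {r: {c: counts.get((r, c), 0) for c in col_order} for r in labels}
    # Step 4 (assumption): percentages are computed within each stratum (column).
    if percentage:
        for c in col_order:
            total = sum(table[r][c] for r in labels)
            for r in labels:
                table[r][c] = 100 * table[r][c] / total if total else 0.0
    return table


pairs = [("yes", "A"), ("no", "A"), ("yes", "B"), (None, "B"), ("yes", "B")]
counts = crosstab_sketch(pairs, ["yes", "no", "missing"], ["A", "B"], percentage=False)
shares = crosstab_sketch(pairs, ["yes", "no", "missing"], ["A", "B"], skip_miss=True)
```

With skip_miss=True the missing row is removed before the totals are computed, so each stratum still sums to 100%.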

py_simple_report.main.heatmap_crosstab_from_df(df, qdc_row, qdc_col, normalize=None, fontsize=8, title=None, xlabel=None, ylabel=None, save_fig_path=None, skip_all=False, show=True, vis_var=None)

Create crosstab heatmap from dataframe.

Parameters

df (DataFrame) – dataframe.
qdc_row (QuestionDataContainer) – a qdc for a row direction.
qdc_col (QuestionDataContainer) – a qdc for a column direction.
normalize (Optional[str]) – passed to pd.crosstab.
title (Optional[str]) – title.
xlabel (Optional[str]) – xlabel.
ylabel (Optional[str]) – ylabel.
save_fig_path (Optional[str]) – to save file path.
skip_all (bool) – If True, and normalize takes “index”, “columns” or All, margins=True is passed to
show (bool) – If False, figure is not plotted.
vis_var (Optional[VisVariables]) – vis_var also contains title, xlabel, and ylabel; when it is given, vis_var takes priority over the parameters above.

Note

If vis_var is None, the following code is run.

>>> vis_var = vs.VisVariables(xrotation=90, figsize=(4, 3), cmap_name="haline", annotate=True)
>>> if normalize is not None:
...     vis_var.title = f"{qdc_row.var_name},{qdc_col.var_name},per"
...     vis_var.annotate_fmt = ".1f"
... else:
...     vis_var.title = f"{qdc_row.var_name},{qdc_col.var_name},cnt"
...     vis_var.annotate_fmt = ".0f"

Return type: None

py_simple_report.main.obtain_multi_binaries_items_with_strat(df, q_var_names, qdcs_dic, qdc_strf, percentage=True, fetch_value=1, crosstab_kwgs=None)

Calculate the percentage of “yes” answers for multiple binary question items.

Parameters

q_var_names – multiple binary question items for crosstabulation.
fetch_value – the value that flags “yes”.

Return type: DataFrame
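
As a rough illustration of the calculation, the sketch below computes the share of “yes” answers per item and per stratum in plain Python. It is not the library's code: `percent_yes`, `records` and `strat_key` are hypothetical names, and it assumes the denominator is every respondent in the stratum, since the handling of missing answers is not documented here.

```python
def percent_yes(records, item_names, strat_key, fetch_value=1):
    """Hypothetical sketch: percentage of fetch_value per item within each stratum."""
    strata = sorted({r[strat_key] for r in records})
    result = {}
    for item in item_names:
        result[item] = {}
        for s in strata:
            # All respondents in this stratum (assumed denominator).
            group = [r for r in records if r[strat_key] == s]
            # Respondents whose answer equals the "yes" flag.
            hits = sum(1 for r in group if r[item] == fetch_value)
            result[item][s] = 100 * hits / len(group)
    return result


records = [
    {"sex": "F", "q1": 1, "q2": 0},
    {"sex": "F", "q1": 0, "q2": 0},
    {"sex": "M", "q1": 1, "q2": 1},
    {"sex": "M", "q1": 1, "q2": 0},
]
table = percent_yes(records, ["q1", "q2"], "sex")
```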

py_simple_report.main.one_cate_bar_data(df, qdc, percentage=False, order=None, skip_miss=False)

Obtain summarized data for a bar plot.

1. Take value_counts of a specific column.
2. Reorder the index according to the “order” parameter.
3. Skip missing values or not.
4. Adjust to percentages (out of 100%) or not.

Return type: Series
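
The summarization steps can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's implementation: `summarize` is a hypothetical name, a plain list stands in for the DataFrame column, and None stands in for np.nan.

```python
from collections import Counter


def summarize(values, order, missing="missing", percentage=False, skip_miss=False):
    """Hypothetical sketch of the four summarization steps."""
    # Step 1: count each answer, labelling None as the missing category.
    counts = Counter(missing if v is None else v for v in values)
    # Step 2: reorder according to `order`, filling absent labels with 0.
    table = {label: counts.get(label, 0) for label in order}
    # Step 3: optionally drop the missing category.
    if skip_miss:
        table.pop(missing, None)
    # Step 4: optionally convert counts to percentages of the remaining total.
    if percentage:
        total = sum(table.values())
        table = {k: (100 * v / total if total else 0.0) for k, v in table.items()}
    return table


answers = ["good", "bad", None, "good", "good"]
order = ["good", "bad", "missing"]
counts = summarize(answers, order)
shares = summarize(answers, order, percentage=True, skip_miss=True)
```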

py_simple_report.main.one_cate_bar_plot(tab, qdc, vis_var, percentage=False)

Plot a bar plot for one categorical variable.

Return type: None

py_simple_report.main.output_crosstab_cate_barplot(df, qdc, qdc_strf, skip_miss=False, vis_var=None, save_fig_path=None, save_num_path=None, percentage=True, include_all=True, show=True, decimal=2, stacked=True, transpose=False)

Output cross-tabulated data as number/percentage and a figure.

Parameters

df (DataFrame) – DataFrame used for calculation.
qdc (QuestionDataContainer) – Can include missing.
qdc_strf (QuestionDataContainer) – For stratification. This qdc cannot include missing values.
skip_miss (bool) – If True, missing values in rows are ignored and percentages are calculated without them.
vis_var (Optional[VisVariables]) – For control of visualization. See vs.VisVariables for more detail.
save_fig_path (Optional[str]) – Path for saving fig.
save_num_path (Optional[str]) – Path for saving numbers/percentages.
percentage (bool) – If True, the figure shows percentages.
include_all (bool) – If True, margins=True is set in pd.crosstab for both numbers and percentages.
show (Union[bool, str]) – Takes True, False, “number”, “figure”.
decimal (int) – Round to “decimal”th place when exporting a percentage.
transpose (bool) – If True, transpose the dataframe.

Return type

None

py_simple_report.main.output_multi_binaries_with_strat(df, q_var_names, qdcs_dic, qdc_strf, vis_var=None, save_fig_path=None, save_num_path=None, show=True, percentage=True, transpose=False, decimal=2)

Output cross-tabulated data as number/percentage and a figure.

Parameters

df (DataFrame) –
q_var_names (List[str]) –
qdcs_dic (Dict[str, QuestionDataContainer]) –
qdc_strf (QuestionDataContainer) –
vis_var (Optional[VisVariables]) –
save_fig_path (Optional[str]) –
save_num_path (Optional[str]) –
percentage (bool) – If True, the figure shows percentages.
show (bool) – Takes True, False, “number”, “figure”.
transpose (bool) – If True, the x and y axes are swapped when plotting.
decimal (int) – Round to “decimal”th place when exporting a percentage.

Return type

None

py_simple_report.main.question_data_containers_from_dataframe(df_var, col_var_name, col_item, col_desc, missing='missing')

QuestionDataContainers for each question are constructed from a variable table given as a dataframe.

Parameters

df_var (DataFrame) – a dataframe of a variable table.
col_var_name (str) – the name of the column holding variable names.
col_item (str) – the name of the column holding the corresponding items.
col_desc (str) – the name of the column holding question descriptions.

Return type

Dict[str, QuestionDataContainer]

py_simple_report.utils module

class py_simple_report.utils.PathOutput(path)

Bases: object

py_simple_report.utils.delete_All_from_index_column(tab)

Delete “All” row and column from dataframe.

Return type: DataFrame

py_simple_report.utils.delete_and_create_csv(save_num_path)

Delete and create csv file.

Return type: None

py_simple_report.utils.imputate_reorder_table(table_, cols_, rows_, fill_value=0, allow_except=True)

Fill and reorder a dataframe created by “.pivot_table” or “.crosstab”.

Parameters

table_ – dataframe that contains cross-tabulated data.
cols_ – columns are reordered by this list.
rows_ – rows are reordered by this list.
fill_value (int) – values to be filled.
allow_except (bool) – If True, columns and rows not listed in “cols_” and “rows_” are kept.

Return type

DataFrame

Returns

Reordered table.
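
To make the reordering and filling behaviour concrete, here is a dependency-free sketch on a nested dict of the form {row: {column: value}}. It is only an illustration: `reorder_table` is a hypothetical name, and placing unlisted rows and columns after the listed ones when allow_except is True is an assumption, since the documentation only says that they remain.

```python
def reorder_table(table, cols, rows, fill_value=0, allow_except=True):
    """Hypothetical sketch: reorder a {row: {col: value}} table, filling gaps."""
    # Collect the columns that actually occur, in first-seen order.
    seen_cols = []
    for row in table.values():
        for c in row:
            if c not in seen_cols:
                seen_cols.append(c)
    # Labels not requested are kept only if allow_except is True (appended last: assumption).
    extra_rows = [r for r in table if r not in rows] if allow_except else []
    extra_cols = [c for c in seen_cols if c not in cols] if allow_except else []
    final_rows = list(rows) + extra_rows
    final_cols = list(cols) + extra_cols
    # Build the result, using fill_value wherever a cell does not exist.
    return {
        r: {c: table.get(r, {}).get(c, fill_value) for c in final_cols}
        for r in final_rows
    }


table = {"no": {"B": 2, "A": 1}, "other": {"A": 5}}
kept = reorder_table(table, cols=["A", "B"], rows=["yes", "no"])
strict = reorder_table(table, cols=["A", "B"], rows=["yes", "no"], allow_except=False)
```

Here the “yes” row does not exist in the input, so it is created and filled with fill_value.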

py_simple_report.utils.item_str2dict(s, missing=None)

Convert the string in an items column into a dictionary. Keys that can be converted to float are converted to float.

Parameters

s (str) – string describing the mapping between numerical categories and string categories.
missing (Optional[str]) – label for missing values. If a string is specified, it is used as the display name for missing values.

Return type

dict

Returns

Keys are numerical categories. Values are string categories.
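
A minimal sketch of this conversion, assuming items are separated by commas and each item has the form key=label, as in the example string given for create_one_question_data_container. `parse_items` is a hypothetical stand-in rather than the library's implementation, and it does not model the missing parameter.

```python
def parse_items(s):
    """Hypothetical sketch: parse 'key=label,key=label' into a dictionary."""
    mapping = {}
    for pair in s.split(","):
        # Split each item at the first "=" into its key and label.
        key, _, label = pair.partition("=")
        key = key.strip()
        try:
            key = float(key)  # numeric codes become floats
        except ValueError:
            pass  # non-numeric keys stay as strings
        mapping[key] = label.strip()
    return mapping


items = parse_items("1=very good,2=moderately good,3=moderately bad,4=bad")
```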

py_simple_report.utils.save_number_to_data(tab, save_num_path, title='', decimal=None)

Add the number to data.

Return type: None

py_simple_report.variables module

class py_simple_report.variables.QuestionDataContainer(var_name=None, desc=None, title=None, missing=None, dic=None, order=None)

Bases: object

show()

class py_simple_report.variables.VisVariables(figsize=(5, 3), dpi=150, xrotation=None, yrotation=None, xlabel=None, ylabel=None, xlabelsize=None, ylabelsize=None, xticksize=None, yticksize=None, xlim=None, ylim=None, title=None, titley=1.05, main_kwgs=None, save_fig_path=None, cmap_type='cmocean', cmap_name='balance', colors=None, show=True, annotate=True, annotate_fontsize=None, annotate_fmt='.1f', annotate_cutoff=10, label_count='Count', label_cont='Percentage (%)')

Bases: object
