py_simple_report package

Submodules

py_simple_report.main module

py_simple_report.main.barplot_multi_binaries_with_strat(df_sum, vis_var, percentage=False, legend=True)

Plot multiple binary items simultaneously.

Return type

None

py_simple_report.main.create_one_question_data_container(var_name, str_items, desc, missing='missing')

Create question data container object.

Parameters
  • var_name (str) – question variable name.

  • str_items (str) – a string that maps every numerical value to its label. Ex. ‘1=very good,2=moderately good,3=moderately bad,4=bad’

  • desc (str) – a question description.

  • missing (str) – np.nan is replaced with this value.

Return type

QuestionDataContainer

py_simple_report.main.crosstab_cate_barplot(tab, qdc, vis_var, percentage=False, legend=True, transpose=False)

Plot cross-tabulated data as a bar plot.

Return type

None

py_simple_report.main.crosstab_cate_stacked_barplot(tab, qdc, vis_var, percentage=False, legend=True)

Plot cross-tabulated data as a stacked bar plot.

Return type

None

py_simple_report.main.crosstab_data(df, qdc, qdc_strf, percentage=True, skip_miss=False, crosstab_kwgs=None)

Cross-tabulate data from the original dataframe, qdc, and qdc_strf. The percentage and skip_miss parameters control the output.

  1. Cross-tabulate a specific column against the stratification column.

  2. Reorder the index according to “order” in QuestionDataContainer.

  3. Skip missing values or not.

  4. Rescale to percentages that sum to 100%, or not.

Parameters

  • crosstab_kwgs – a dictionary passed to pd.crosstab. The “percentage” parameter edits this dictionary.

Return type

DataFrame
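
The four steps above can be sketched with plain pandas. This is a minimal illustration and not the package's implementation: the function name crosstab_sketch, the sample data, and the choice to rescale each stratum (column) to 100% are assumptions made for the example.

```python
import pandas as pd

# Hypothetical survey data: "q1" is the question, "sex" is the stratification column.
df = pd.DataFrame({
    "q1": ["good", "bad", "good", "missing", "bad", "good"],
    "sex": ["F", "F", "M", "M", "F", "M"],
})
# Stands in for the "order" held by a QuestionDataContainer.
order = ["good", "bad", "missing"]


def crosstab_sketch(df, col, strat_col, order, percentage=True,
                    skip_miss=False, missing="missing"):
    # 1. Cross-tabulate the question column against the stratification column.
    tab = pd.crosstab(df[col], df[strat_col])
    # 2. Reorder the index; categories absent from the data get a count of 0.
    tab = tab.reindex(order, fill_value=0)
    # 3. Optionally drop the row for missing answers.
    if skip_miss:
        tab = tab.drop(index=missing, errors="ignore")
    # 4. Optionally rescale each stratum (column) so that it sums to 100
    #    (assumed direction of normalization).
    if percentage:
        tab = tab / tab.sum(axis=0) * 100
    return tab


tab = crosstab_sketch(df, "q1", "sex", order, percentage=True, skip_miss=True)
print(tab)
```

Dropping the missing row before rescaling means the percentages are computed over answered rows only, which matches the description of skip_miss elsewhere in this reference.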

py_simple_report.main.heatmap_crosstab_from_df(df, qdc_row, qdc_col, normalize=None, fontsize=8, title=None, xlabel=None, ylabel=None, save_fig_path=None, skip_all=False, show=True, vis_var=None)

Create crosstab heatmap from dataframe.

Parameters
  • df (DataFrame) – dataframe.

  • qdc_row (QuestionDataContainer) – a qdc for a row direction.

  • qdc_col (QuestionDataContainer) – a qdc for a column direction.

  • normalize (Optional[str]) – passed to pd.crosstab.

  • title (Optional[str]) – title.

  • xlabel (Optional[str]) – xlabel.

  • ylabel (Optional[str]) – ylabel.

  • save_fig_path (Optional[str]) – to save file path.

  • skip_all (bool) – If True, and normalize takes “index”, “columns” or All, margins=True is passed to

  • show (bool) – If False, figure is not plotted.

  • vis_var (Optional[VisVariables]) – vis_var also holds title, xlabel, and ylabel. When vis_var is given, it takes priority over the title, xlabel, and ylabel parameters above.

Note

If vis_var is None, the following code is run.

>>> vis_var = vs.VisVariables(xrotation=90, figsize=(4, 3), cmap_name="haline", annotate=True)
>>> if normalize is not None:
...     vis_var.title = f"{qdc_row.var_name},{qdc_col.var_name},per"
...     vis_var.annotate_fmt = ".1f"
... else:
...     vis_var.title = f"{qdc_row.var_name},{qdc_col.var_name},cnt"
...     vis_var.annotate_fmt = ".0f"

Return type

None

py_simple_report.main.obtain_multi_binaries_items_with_strat(df, q_var_names, qdcs_dic, qdc_strf, percentage=True, fetch_value=1, crosstab_kwgs=None)

Calculate percentage of yes for multiple binary question items.

Parameters
  • q_var_names – multiple binary question items for cross-tabulation.

  • fetch_value – the value that flags “yes”.

Return type

DataFrame
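
As a rough illustration of "percentage of yes" per stratum, the following sketch uses plain pandas rather than the package itself. The sample data, the variable names, and the output layout (items as rows, strata as columns) are assumptions.

```python
import pandas as pd

# Hypothetical data: two binary items (1 = yes, 0 = no) and a stratification column.
df = pd.DataFrame({
    "smoke": [1, 0, 1, 1],
    "drink": [0, 0, 1, 1],
    "sex": ["F", "F", "M", "M"],
})
q_var_names = ["smoke", "drink"]
fetch_value = 1  # the value that flags "yes"

# Flag the rows equal to fetch_value, average the flags within each stratum,
# and scale to a percentage. Transposing puts items in rows (assumed layout).
flags = df[q_var_names].eq(fetch_value)
pct_yes = flags.groupby(df["sex"]).mean().T * 100
print(pct_yes)
```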

py_simple_report.main.one_cate_bar_data(df, qdc, percentage=False, order=None, skip_miss=False)

Obtain summarized data for a bar plot.

  1. Take value_counts of a specific column.

  2. Reorder the index according to the “order” parameter.

  3. Skip missing values or not.

  4. Rescale to percentages that sum to 100%, or not.

Return type

Series
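
The summary steps for a single categorical variable can be sketched as follows. This is an illustration in plain pandas under stated assumptions, not the package's code: one_cate_sketch, the sample answers, and the "missing" label are made up for the example.

```python
import pandas as pd

# Hypothetical answers to one question; "moderate" never occurs in the data.
s = pd.Series(["good", "bad", "good", "missing", "good"], name="q1")
order = ["good", "moderate", "bad", "missing"]


def one_cate_sketch(s, order, percentage=False, skip_miss=False, missing="missing"):
    # 1. Count each category.
    counts = s.value_counts()
    # 2. Reorder; categories that never occur get a count of 0.
    counts = counts.reindex(order, fill_value=0)
    # 3. Optionally drop the entry for missing answers.
    if skip_miss:
        counts = counts.drop(missing, errors="ignore")
    # 4. Optionally rescale so that the values sum to 100.
    if percentage:
        counts = counts / counts.sum() * 100
    return counts


counts = one_cate_sketch(s, order, percentage=True, skip_miss=True)
print(counts)
```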

py_simple_report.main.one_cate_bar_plot(tab, qdc, vis_var, percentage=False)

Plot a bar plot for one categorical variable.

Return type

None

py_simple_report.main.output_crosstab_cate_barplot(df, qdc, qdc_strf, skip_miss=False, vis_var=None, save_fig_path=None, save_num_path=None, percentage=True, include_all=True, show=True, decimal=2, stacked=True, transpose=False)

Output cross-tabulated data as number/percentage and a figure.

Parameters
  • df (DataFrame) – DataFrame used for calculation.

  • qdc (QuestionDataContainer) – Can include missing.

  • qdc_strf (QuestionDataContainer) – Used for stratification. This qdc cannot include missing values.

  • skip_miss (bool) – If True, rows for missing values are ignored and percentages are calculated without them.

  • vis_var (Optional[VisVariables]) – For control of visualization. See vs.VisVariables for more detail.

  • save_fig_path (Optional[str]) – Path for saving fig.

  • save_num_path (Optional[str]) – Path for saving numbers/percentages.

  • percentage (bool) – If True, a figure is created as a percentage style.

  • include_all (bool) – If True, margins=True is passed to pd.crosstab for both counts and percentages.

  • show (Union[bool, str]) – Takes True, False, “number”, “figure”.

  • decimal (int) – Number of decimal places to round to when exporting percentages.

  • transpose (bool) – If True, the dataframe is transposed.

Return type

None

py_simple_report.main.output_multi_binaries_with_strat(df, q_var_names, qdcs_dic, qdc_strf, vis_var=None, save_fig_path=None, save_num_path=None, show=True, percentage=True, transpose=False, decimal=2)

Output cross-tabulated data as number/percentage and a figure.

Parameters
  • df (DataFrame) –

  • q_var_names (List[str]) –

  • qdcs_dic (Dict[str, QuestionDataContainer]) –

  • qdc_strf (QuestionDataContainer) –

  • vis_var (Optional[VisVariables]) –

  • save_fig_path (Optional[str]) –

  • save_num_path (Optional[str]) –

  • percentage (bool) – If True, a figure is created as a percentage style.

  • show (bool) – Takes True, False, “number”, “figure”.

  • transpose (bool) – If True, the x and y axes are swapped when plotting.

  • decimal (int) – Number of decimal places to round to when exporting percentages.

Return type

None

py_simple_report.main.question_data_containers_from_dataframe(df_var, col_var_name, col_item, col_desc, missing='missing')

Construct a QuestionDataContainer for each question from a dataframe that holds the variable table.

Parameters
  • df_var (DataFrame) – a dataframe of a variable table.

  • col_var_name (str) – name of the column that holds variable names.

  • col_item (str) – name of the column that holds the corresponding items.

  • col_desc (str) – name of the column that holds question descriptions.

Return type

Dict[str, QuestionDataContainer]

py_simple_report.utils module

class py_simple_report.utils.PathOutput(path)

Bases: object

py_simple_report.utils.delete_All_from_index_column(tab)

Delete “All” row and column from dataframe.

Return type

DataFrame
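
For context, the “All” row and column are what pd.crosstab adds when margins=True. A minimal sketch of removing them with plain pandas follows. It illustrates the idea and is not the package's implementation; the sample data is hypothetical.

```python
import pandas as pd

# Hypothetical data for a small cross tabulation.
df = pd.DataFrame({"q1": ["a", "b", "a"], "sex": ["F", "M", "M"]})

# margins=True appends an "All" row and an "All" column holding the totals.
tab = pd.crosstab(df["q1"], df["sex"], margins=True)

# errors="ignore" keeps the call safe when the table has no "All" labels.
trimmed = tab.drop(index="All", columns="All", errors="ignore")
print(trimmed)
```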

py_simple_report.utils.delete_and_create_csv(save_num_path)

Delete and create csv file.

Return type

None

py_simple_report.utils.imputate_reorder_table(table_, cols_, rows_, fill_value=0, allow_except=True)

Insert missing rows and columns into, and reorder, a dataframe created by “.pivot_table” or “.crosstab”.

Parameters
  • table_ – dataframe that contains cross-tabulated data.

  • cols_ – columns are reordered by this list.

  • rows_ – rows are reordered by this list.

  • fill_value (int) – values to be filled.

  • allow_except (bool) – If True, cols and rows that are not contained in “cols_” and “rows_” remain.

Return type

DataFrame

Returns

Reordered table.
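
The fill-and-reorder behaviour can be approximated with DataFrame.reindex. The sketch below is an assumption about the general idea, not the package's code. Unlike allow_except=True, reindex drops labels that are not in the given lists.

```python
import pandas as pd

# Hypothetical cross-tabulated counts; the "moderate" row is absent.
tab = pd.DataFrame({"M": [2, 1], "F": [0, 3]}, index=["bad", "good"])
rows = ["good", "moderate", "bad"]
cols = ["F", "M"]

# Reorder both axes and fill categories that never occurred with 0.
reordered = tab.reindex(index=rows, columns=cols, fill_value=0)
print(reordered)
```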

py_simple_report.utils.item_str2dict(s, missing=None)

Convert a string from the items column into a dictionary. Keys that can be converted to float are converted.

Parameters
  • s (str) – string containing information of match patterns between numerical categories and string categories.

  • missing (Optional[str]) – label for missing values. If a string is specified, it is used as the display name for missing values.

Return type

dict

Returns

Keys are numerical categories. Values are string categories.
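
A minimal sketch of this kind of parsing, assuming the ‘1=very good,2=moderately good’ format shown for str_items. The function parse_items is hypothetical and leaves out the handling of missing, since the way missing values are keyed is not described here.

```python
def parse_items(s):
    """Parse '1=very good,2=bad' into {1.0: 'very good', 2.0: 'bad'}."""
    result = {}
    for pair in s.split(","):
        # Split each "key=label" pair at the first "=".
        key, _, label = pair.partition("=")
        key, label = key.strip(), label.strip()
        try:
            key = float(key)  # numeric keys become floats
        except ValueError:
            pass  # non-numeric keys stay as strings
        result[key] = label
    return result


items = parse_items("1=very good,2=moderately good,3=moderately bad,4=bad")
print(items)
```

Float keys line up with pandas columns that contain NaN, because such columns are stored as float.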

py_simple_report.utils.save_number_to_data(tab, save_num_path, title='', decimal=None)

Add the number to data.

Return type

None

py_simple_report.variables module

class py_simple_report.variables.QuestionDataContainer(var_name=None, desc=None, title=None, missing=None, dic=None, order=None)

Bases: object

show()

class py_simple_report.variables.VisVariables(figsize=(5, 3), dpi=150, xrotation=None, yrotation=None, xlabel=None, ylabel=None, xlabelsize=None, ylabelsize=None, xticksize=None, yticksize=None, xlim=None, ylim=None, title=None, titley=1.05, main_kwgs=None, save_fig_path=None, cmap_type='cmocean', cmap_name='balance', colors=None, show=True, annotate=True, annotate_fontsize=None, annotate_fmt='.1f', annotate_cutoff=10, label_count='Count', label_cont='Percentage (%)')

Bases: object

Module contents