py_simple_report package
Submodules
py_simple_report.main module
- py_simple_report.main.barplot_multi_binaries_with_strat(df_sum, vis_var, percentage=False, legend=True)
Plot multiple binary items simultaneously.
- Return type
None
- py_simple_report.main.create_one_question_data_container(var_name, str_items, desc, missing='missing')
Create a question data container object.
- Parameters
var_name (str) – question variable name.
str_items (str) – a string that maps every numerical value to its label. Ex. ‘1=very good,2=moderately good,3=moderately bad,4=bad’
desc (str) – a question description.
missing (str) – np.nan is replaced with this value.
- Return type
- py_simple_report.main.crosstab_cate_barplot(tab, qdc, vis_var, percentage=False, legend=True, transpose=False)
Plot cross-tabulated data as a bar plot.
- Return type
None
- py_simple_report.main.crosstab_cate_stacked_barplot(tab, qdc, vis_var, percentage=False, legend=True)
Plot cross-tabulated data as a stacked bar plot.
- Return type
None
- py_simple_report.main.crosstab_data(df, qdc, qdc_strf, percentage=True, skip_miss=False, crosstab_kwgs=None)
Cross-tabulate data given the original dataframe, qdc, and qdc_strf. The percentage and skip_miss parameters control the output.
1. Cross-tabulate a specific column with a column for stratification.
2. Reorder the index according to “order” in QuestionDataContainer.
3. Skip missing values or not.
4. Adjust percentages to 100% or not.
- Parameters
crosstab_kwgs – a dictionary passed to pd.crosstab. The “percentage” parameter edits this dictionary.
- Return type
DataFrame
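The steps listed for crosstab_data can be sketched with plain pandas. This is a minimal illustration under stated assumptions, not the package's implementation: the helper name crosstab_sketch, the column names q1 and sex, and the category order are hypothetical, and percentages are assumed to be computed within each stratum (column).

```python
import pandas as pd

def crosstab_sketch(df, col, strat_col, order, missing="missing",
                    percentage=True, skip_miss=False):
    """Hypothetical helper mirroring the steps described for crosstab_data."""
    # 1. Cross-tabulate the question column against the stratification column.
    tab = pd.crosstab(df[col], df[strat_col])
    # 2. Reorder the rows; categories absent from the data become zero rows.
    tab = tab.reindex(index=order, fill_value=0)
    # 3. Optionally drop the row that represents missing answers.
    if skip_miss:
        tab = tab.drop(index=missing, errors="ignore")
    # 4. Optionally rescale each column so that it sums to 100 (assumption).
    if percentage:
        tab = tab / tab.sum(axis=0) * 100
    return tab

df = pd.DataFrame({
    "q1": ["good", "bad", "good", "missing", "good", "bad"],
    "sex": ["F", "F", "M", "M", "F", "M"],
})
tab = crosstab_sketch(df, "q1", "sex",
                      order=["good", "bad", "missing"], skip_miss=True)
print(tab)
```

Dropping the missing row before rescaling means the remaining categories sum to 100 within each stratum, which matches the description of skip_miss elsewhere in this reference.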
- py_simple_report.main.heatmap_crosstab_from_df(df, qdc_row, qdc_col, normalize=None, fontsize=8, title=None, xlabel=None, ylabel=None, save_fig_path=None, skip_all=False, show=True, vis_var=None)
Create a crosstab heatmap from a dataframe.
- Parameters
df (DataFrame) – dataframe.
qdc_row (QuestionDataContainer) – a qdc for the row direction.
qdc_col (QuestionDataContainer) – a qdc for the column direction.
normalize (Optional[str]) – passed to pd.crosstab.
title (Optional[str]) – title.
xlabel (Optional[str]) – xlabel.
ylabel (Optional[str]) – ylabel.
save_fig_path (Optional[str]) – path for saving the figure.
skip_all (bool) – If True, and normalize takes “index”, “columns” or All, margins=True is passed to
show (bool) – If False, figure is not plotted.
vis_var (Optional[VisVariables]) – Although title, xlabel, and ylabel are contained in vis_var, vis_var is given priority to the above parameters.
Note
If vis_var is None, the following code is run.
>>> vis_var = vs.VisVariables(
...     xrotation=90, figsize=(4,3), cmap_name="haline", annotate=True)
>>> if normalize is not None:
...     vis_var.title = f"{qdc_row.var_name},{qdc_col.var_name},per"
...     vis_var.annotate_fmt = ".1f"
... else:
...     vis_var.title = f"{qdc_row.var_name},{qdc_col.var_name},cnt"
...     vis_var.annotate_fmt = ".0f"
- Return type
None
- py_simple_report.main.obtain_multi_binaries_items_with_strat(df, q_var_names, qdcs_dic, qdc_strf, percentage=True, fetch_value=1, crosstab_kwgs=None)
Calculate the percentage of “yes” answers for multiple binary question items.
- Parameters
q_var_names – multiple binary question items for cross-tabulation.
fetch_value – the value that flags “yes”.
- Return type
DataFrame
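The calculation described for obtain_multi_binaries_items_with_strat, the share of “yes” answers per stratum for several binary items, can be sketched as follows. The column names sex, q_a, and q_b are hypothetical, and this plain-pandas version is an assumption about the idea, not the package's code.

```python
import pandas as pd

df = pd.DataFrame({
    "sex": ["F", "F", "M", "M"],   # stratification column (hypothetical)
    "q_a": [1, 0, 1, 1],           # binary items, 1 means "yes"
    "q_b": [0, 0, 1, 0],
})
q_var_names = ["q_a", "q_b"]
fetch_value = 1

# Flag the answers equal to fetch_value, then average the flags per stratum.
flags = df[q_var_names].eq(fetch_value)
pct_yes = flags.groupby(df["sex"]).mean() * 100
print(pct_yes)
```

Each row of pct_yes is one stratum and each column is one binary item, so the result can be passed directly to a grouped bar plot.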
- py_simple_report.main.one_cate_bar_data(df, qdc, percentage=False, order=None, skip_miss=False)
Obtain summarized data for a bar plot.
1. Compute value_counts of a specific column.
2. Reorder the index according to the “order” parameter.
3. Skip missing values or not.
4. Adjust percentages to 100% or not.
- Return type
Series
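The summary produced by one_cate_bar_data can be approximated with a few pandas calls. The sample answers and the category order below are hypothetical, and the sketch only illustrates the listed steps rather than reproducing the package's code.

```python
import pandas as pd

answers = pd.Series(["good", "bad", "good", "missing", "good"])
order = ["good", "fair", "bad", "missing"]   # "fair" never occurs in the data

counts = answers.value_counts()               # 1. count each category
counts = counts.reindex(order, fill_value=0)  # 2. reorder, unseen categories become 0
counts = counts.drop("missing")               # 3. skip the missing category
percent = counts / counts.sum() * 100         # 4. rescale so the values sum to 100
print(percent)
```

Reindexing before dropping keeps categories with zero answers in the output, so every bar plot built from the same order has the same bars.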
- py_simple_report.main.one_cate_bar_plot(tab, qdc, vis_var, percentage=False)
Plot a bar plot for one categorical variable.
- Return type
None
- py_simple_report.main.output_crosstab_cate_barplot(df, qdc, qdc_strf, skip_miss=False, vis_var=None, save_fig_path=None, save_num_path=None, percentage=True, include_all=True, show=True, decimal=2, stacked=True, transpose=False)
Output cross-tabulated data as numbers/percentages and a figure.
- Parameters
df (DataFrame) – DataFrame used for calculation.
qdc (QuestionDataContainer) – Can include missing.
qdc_strf (QuestionDataContainer) – For stratification. This qdc cannot include missing.
skip_miss (bool) – If True, missing rows are ignored and percentages are calculated without them.
vis_var (Optional[VisVariables]) – For control of visualization. See vs.VisVariables for more detail.
save_fig_path (Optional[str]) – Path for saving the figure.
save_num_path (Optional[str]) – Path for saving numbers/percentages.
percentage (bool) – If True, the figure is drawn in percentage style.
include_all (bool) – If True, margins in pd.crosstab is set to True for numbers and percentages.
show (Union[bool,str]) – Takes True, False, “number”, or “figure”.
decimal (int) – Number of decimal places to round to when exporting a percentage.
transpose (bool) – transpose the dataframe.
- Return type
None
- py_simple_report.main.output_multi_binaries_with_strat(df, q_var_names, qdcs_dic, qdc_strf, vis_var=None, save_fig_path=None, save_num_path=None, show=True, percentage=True, transpose=False, decimal=2)
Output cross-tabulated data as numbers/percentages and a figure.
- Parameters
df (DataFrame) –
q_var_names (List[str]) –
qdcs_dic (Dict[str,QuestionDataContainer]) –
qdc_strf (QuestionDataContainer) –
vis_var (Optional[VisVariables]) –
save_fig_path (Optional[str]) –
save_num_path (Optional[str]) –
percentage (bool) – If True, the figure is drawn in percentage style.
show (bool) – Takes True, False, “number”, or “figure”.
transpose (bool) – If True, the x and y axes are swapped when plotting.
decimal (int) – Number of decimal places to round to when exporting a percentage.
- Return type
None
- py_simple_report.main.question_data_containers_from_dataframe(df_var, col_var_name, col_item, col_desc, missing='missing')
QuestionDataContainers for each question are constructed from a variable table given as a dataframe.
- Parameters
df_var (DataFrame) – a dataframe of a variable table.
col_var_name (str) – name of the column holding variable names.
col_item (str) – name of the column holding the corresponding items.
col_desc (str) – name of the column holding question descriptions.
- Return type
Dict[str,QuestionDataContainer]
py_simple_report.utils module
- class py_simple_report.utils.PathOutput(path)
Bases: object
- py_simple_report.utils.delete_All_from_index_column(tab)
Delete “All” row and column from dataframe.
- Return type
DataFrame
- py_simple_report.utils.delete_and_create_csv(save_num_path)
Delete and create a csv file.
- Return type
None
- py_simple_report.utils.imputate_reorder_table(table_, cols_, rows_, fill_value=0, allow_except=True)
Insert missing rows and columns into, and reorder, a dataframe that is created from “.pivot_table” or “.crosstab”.
- Parameters
- Return type
DataFrame
- Returns
Reordered table.
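Inserting absent categories and reordering a crosstab, as imputate_reorder_table is described to do, can be illustrated with DataFrame.reindex. The labels below are hypothetical, and this sketch does not cover the allow_except parameter, whose behaviour is not documented here.

```python
import pandas as pd

table = pd.crosstab(
    pd.Series(["bad", "good", "good"], name="q1"),
    pd.Series(["M", "F", "M"], name="sex"),
)
rows = ["good", "fair", "bad"]   # "fair" never occurs in the data
cols = ["F", "M"]

# reindex reorders existing labels and inserts absent ones filled with 0.
reordered = table.reindex(index=rows, columns=cols, fill_value=0)
print(reordered)
```

Filling with 0 rather than NaN keeps the table usable for counts and for later percentage calculations.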
- py_simple_report.utils.item_str2dict(s, missing=None)
Convert the string in an items column into a dictionary. Keys that can be converted to float are converted to float.
- Parameters
s (str) – a string that maps numerical categories to string categories.
missing (Optional[str]) – item for missing. If a string is specified, it is used as the display name for missing.
- Return type
dict
- Returns
Keys are numerical categories. Values are string categories.
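A minimal sketch of the conversion described for item_str2dict, assuming the “key=label” pairs are separated by commas as in the example string given for create_one_question_data_container. The function name parse_items is hypothetical, and the sketch leaves out the missing parameter because its exact handling is not specified here.

```python
def parse_items(s):
    """Hypothetical stand-in: turn '1=a,2=b' into {1.0: 'a', 2.0: 'b'}."""
    result = {}
    for pair in s.split(","):
        # Split each pair on the first "=" only, so labels may contain "=".
        key, _, label = pair.partition("=")
        key = key.strip()
        try:
            key = float(key)   # numeric keys become floats
        except ValueError:
            pass               # non-numeric keys stay as strings
        result[key] = label.strip()
    return result

items = parse_items("1=very good,2=moderately good,3=moderately bad,4=bad")
print(items)
```

Float keys are convenient here because a pandas column that contains np.nan is stored as float, so the dictionary keys match the column values directly.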
- py_simple_report.utils.save_number_to_data(tab, save_num_path, title='', decimal=None)
Add the number to data.
- Return type
None
py_simple_report.variables module
- class py_simple_report.variables.QuestionDataContainer(var_name=None, desc=None, title=None, missing=None, dic=None, order=None)
Bases: object
- show()
- class py_simple_report.variables.VisVariables(figsize=(5, 3), dpi=150, xrotation=None, yrotation=None, xlabel=None, ylabel=None, xlabelsize=None, ylabelsize=None, xticksize=None, yticksize=None, xlim=None, ylim=None, title=None, titley=1.05, main_kwgs=None, save_fig_path=None, cmap_type='cmocean', cmap_name='balance', colors=None, show=True, annotate=True, annotate_fontsize=None, annotate_fmt='.1f', annotate_cutoff=10, label_count='Count', label_cont='Percentage (%)')
Bases: object