Appendix J — chapter3.analysis.visualization

chapter3.analysis.visualization

Provides functions to visualize the obtained results.

Functions

Name	Description
plot_comparison	Generates a figure containing a plot for each data source and model showing the accuracy/F1-score evolution with respect to the amount of training data (i.e., n).
plot_evolution	Generates a figure containing a plot for each data source showing the accuracy/F1-score evolution with respect to the amount of training data (i.e., n).
plot_pairwise_comparision	Generates a figure to represent the pairwise tests generated by `libs.chapter3.statistical_tests.pairwise_n_comparison`.
plot_visual_comparison	Generates a figure to visually summarize statistical group comparisons. For each group, plots a symbol representing the best-performing item (`focus_on`).
plot_visual_ties	Generates a figure to indicate the ties in the group comparisons. For each item, indicates how many times it has tied with other items.

plot_comparison

chapter3.analysis.visualization.plot_comparison(reports, models, sources, filters, sources_print)

Generates a figure containing a plot for each data source and model showing the accuracy/F1-score evolution with respect to the amount of training data (i.e., n).

Parameters

Name	Type	Description	Default
`reports`	`pandas.DataFrame`	Model reports.	required
`models`	list[`libs.chapter3.model.Models`]	List with the models to include in the figure.	required
`sources`	list[`libs.chapter3.model.Source`]	List with the data sources to include in the figure.	required
`filters`	`libs.chapter3.model.Filter`	Filter to apply to the model reports.	required
`sources_print`	dict	Mapping between a Source and a string representation.	required

Returns

Type	Description
`plotly.Figure`	Interactive Plotly figure.
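
How the `sources`, `models`, and `sources_print` arguments relate can be sketched without the library itself. The snippet below is only an illustration: the `Source` and `Models` enums and their members are hypothetical stand-ins for `libs.chapter3.model.Source` and `libs.chapter3.model.Models`, and `panel_titles` is a made-up helper, not part of the documented API. It enumerates one panel per data source and model and labels each source through the mapping.

```python
from enum import Enum
from itertools import product


class Source(Enum):  # hypothetical stand-in for libs.chapter3.model.Source
    SOURCE_A = "a"
    SOURCE_B = "b"


class Models(Enum):  # hypothetical stand-in for libs.chapter3.model.Models
    MODEL_X = "x"
    MODEL_Y = "y"


def panel_titles(sources, models, sources_print):
    """Return one title per (source, model) panel, sources varying slowest."""
    return [
        f"{sources_print[source]} / {model.name}"
        for source, model in product(sources, models)
    ]


# Mapping between each Source and the string shown in the figure.
sources_print = {Source.SOURCE_A: "Source A", Source.SOURCE_B: "Source B"}
titles = panel_titles(list(Source), list(Models), sources_print)
print(titles)
```

Keeping the display strings in a separate mapping lets the same enum members be reused across figures while the printed labels change.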

plot_evolution

chapter3.analysis.visualization.plot_evolution(reports, sources, filters, fig_titles, filters_secondary=None)

Generates a figure containing a plot for each data source showing the accuracy/F1-score evolution with respect to the amount of training data (i.e., n).

Parameters

Name	Type	Description	Default
`reports`	`pandas.DataFrame`	Model reports.	required
`sources`	list[`libs.chapter3.model.Source`]	List with the data sources to include in the figure.	required
`filters`	`libs.chapter3.model.Filter`	Filter to apply to the model reports.	required
`fig_titles`	list[str]	Title to use for the plot of each data source.	required
`filters_secondary`	`libs.chapter3.model.Filter`	Filter to apply to the model reports, plotting the result in the secondary axis.	`None`

Returns

Type	Description
`plotly.Figure`	Interactive Plotly figure.
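
The quantity drawn for each data source, a metric as a function of n, can be illustrated as a plain aggregation. This is a minimal sketch under stated assumptions: the record keys `source`, `n`, and `accuracy` are hypothetical (the column layout of the reports DataFrame is not described here), the values are made up, and `metric_evolution` is not a function of this module.

```python
from collections import defaultdict
from statistics import mean


def metric_evolution(records, metric):
    """Average `metric` for every (source, n) pair, ordered by n per source."""
    grouped = defaultdict(list)
    for record in records:
        grouped[(record["source"], record["n"])].append(record[metric])

    evolution = defaultdict(list)
    for (source, n), values in sorted(grouped.items()):
        evolution[source].append((n, mean(values)))
    return dict(evolution)


# Made-up values, for illustration only.
records = [
    {"source": "a", "n": 1, "accuracy": 0.5},
    {"source": "a", "n": 1, "accuracy": 0.75},
    {"source": "a", "n": 2, "accuracy": 0.75},
    {"source": "b", "n": 1, "accuracy": 0.25},
]
evolution = metric_evolution(records, "accuracy")
```

Each list of `(n, mean)` pairs corresponds to one line in the plot of its data source.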

plot_pairwise_comparision

chapter3.analysis.visualization.plot_pairwise_comparision(reports, sources, filters, sources_print, alternative='two-sided', stars=False, parametric=False)

Generates a figure to represent the pairwise tests generated by `libs.chapter3.statistical_tests.pairwise_n_comparison`.

Parameters

Name	Type	Description	Default
`reports`	`pandas.DataFrame`	Model reports.	required
`sources`	list[`libs.chapter3.model.Source`]	List with the data sources to include in the figure.	required
`filters`	`libs.chapter3.model.Filter`	Filter to apply to the model reports.	required
`sources_print`	dict	Mapping between a Source and a string representation.	required
`alternative`	str	Hypothesis to test. One of: ‘two-sided’, ‘less’ or ‘greater’.	`'two-sided'`
`stars`	boolean	Replace p-values under 0.05 with stars: `*` when 0.01 < p-value < 0.05; `**` when 0.001 < p-value < 0.01; `***` when p-value < 0.001.	`False`
`parametric`	boolean	Compute parametric or non-parametric tests.	`False`

Returns

Type	Description
`plotly.Figure`	Interactive Plotly figure.
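
The effect of the `stars` option can be sketched as a small mapping from p-values to strings. This is an illustration, not the module's code: `p_value_to_stars` is a hypothetical helper, the one/two/three star convention is assumed, and so are the handling of values that fall exactly on a threshold and the three-decimal formatting of non-significant p-values.

```python
def p_value_to_stars(p_value):
    """Map a p-value to a star string; format it as text when p >= 0.05."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    # Not significant at the 0.05 level: keep the numeric value.
    return f"{p_value:.3f}"


labels = [p_value_to_stars(p) for p in (0.03, 0.005, 0.0001, 0.2)]
```

Checking the thresholds from the smallest upward means each p-value receives only the strongest marker it qualifies for.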

plot_visual_comparison

chapter3.analysis.visualization.plot_visual_comparison(best_items, significance_results, focus_on, groups)

Generates a figure to visually summarize statistical group comparisons. For each group, plots a symbol representing the best-performing item (`focus_on`). If the item is statistically the best-performing one (i.e., it does not tie with any other item), the symbol is filled. Otherwise, the symbol contains a number indicating how many items the best-performing item ties with.

Parameters

Name	Type	Description	Default
`best_items`	dict	Best item from a performance comparison. See: `libs.chapter3.model.obtain_best_items`.	required
`significance_results`	`pandas.DataFrame`	DataFrame containing the number of best significant data sources/models for each combination of number of training subjects and models/data sources.	required
`focus_on`	list[str]	Items being compared. List items are one of `libs.chapter3.model.Source` or `libs.chapter3.model.Model`.	required
`groups`	list[str]	Groups within which the `focus_on` items are compared. List items are one of `libs.chapter3.model.Source` or `libs.chapter3.model.Model`.	required

Returns

Type	Description
`plotly.Figure`	Interactive Plotly figure.
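
The filled-versus-numbered rule described above can be expressed as a short decision function. The sketch assumes that, for each group, the best item and the list of items it ties with are already known. The structures of `best_items` and `significance_results` are not documented here, so `marker_for_group`, its inputs, and the returned dictionary are all hypothetical.

```python
def marker_for_group(best_item, tied_with):
    """Describe the symbol drawn for one group.

    The symbol is filled when the best item ties with nothing; otherwise it
    is left open and labeled with the number of items it ties with.
    """
    n_ties = len(tied_with)
    return {
        "item": best_item,
        "filled": n_ties == 0,
        "label": "" if n_ties == 0 else str(n_ties),
    }


clear_winner = marker_for_group("item_a", [])
tied_winner = marker_for_group("item_a", ["item_b", "item_c"])
```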

plot_visual_ties

chapter3.analysis.visualization.plot_visual_ties(best_items, significance_results, focus_on, groups)

Generates a figure to indicate the ties in the group comparisons. For each item, indicates how many times it has tied with other items. Complementary figure to `libs.chapter3.visualization.plot_visual_comparison`.

Parameters

Name	Type	Description	Default
`best_items`	dict	Best item from a performance comparison. See: `libs.chapter3.model.obtain_best_items`.	required
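`significance_results`	`pandas.DataFrame`	DataFrame containing the number of best significant data sources/models for each combination of number of training subjects and models/data sources.	required
`focus_on`	list[str]	Items being compared. List items are one of `libs.chapter3.model.Source` or `libs.chapter3.model.Model`.	required
`groups`	list[str]	Groups within which the `focus_on` items are compared. List items are one of `libs.chapter3.model.Source` or `libs.chapter3.model.Model`.	required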

Returns

Type	Description
`plotly.Figure`	Interactive Plotly figure.
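
One possible reading of "how many times an item has tied" is the number of groups in which the item was part of a tie. The sketch below counts ties under that assumption. The input format (a mapping from a group name to the items tied for best in that group) and the helper `count_ties` are hypothetical and are not taken from the module.

```python
from collections import Counter


def count_ties(tied_items_per_group):
    """Count, for every item, the groups in which it was part of a tie."""
    counts = Counter()
    for tied_items in tied_items_per_group.values():
        if len(tied_items) > 1:  # a tie needs at least two items
            counts.update(set(tied_items))
    return counts


# Made-up groups and items, for illustration only.
tied_items_per_group = {
    "group_1": ["item_a", "item_b"],
    "group_2": ["item_a"],  # clear winner, no tie
    "group_3": ["item_a", "item_b", "item_c"],
}
tie_counts = count_ties(tied_items_per_group)
```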