Appendix J — chapter3.analysis.visualization

chapter3.analysis.visualization

Provides functions to visualize the obtained results.

Functions

Name Description
plot_comparison Generates a figure containing a plot for each data source and model showing the accuracy/F1-score evolution with respect to the amount of training data (i.e., n).
plot_evolution Generates a figure containing a plot for each data source showing the accuracy/F1-score evolution with respect to the amount of training data (i.e., n).
plot_pairwise_comparision Generates a figure to represent the pairwise tests generated by libs.chapter3.statistical_tests.pairwise_n_comparison.
plot_visual_comparison Generates a figure to visually summarize statistical group comparisons. For each group, plots a symbol representing the best-performing item (focus_on).
plot_visual_ties Generates a figure to indicate the ties in the group comparisons. For each item, indicates how many times it has tied with other items.

plot_comparison

chapter3.analysis.visualization.plot_comparison(reports, models, sources, filters, sources_print)

Generates a figure containing a plot for each data source and model showing the accuracy/F1-score evolution with respect to the amount of training data (i.e., n).

Parameters

Name Type Description Default
reports pandas.DataFrame Model reports. required
models list[libs.chapter3.model.Models] List with the models to include in the figure. required
sources list[libs.chapter3.model.Source] List with the data sources to include in the figure. required
filters libs.chapter3.model.Filter Filter to apply to the model reports. required
sources_print dict Mapping between a Source and a string representation. required

Returns

Type Description
plotly.Figure Interactive Plotly figure.

plot_evolution

chapter3.analysis.visualization.plot_evolution(reports, sources, filters, fig_titles, filters_secondary=None)

Generates a figure containing a plot for each data source showing the accuracy/F1-score evolution with respect to the amount of training data (i.e., n).

Parameters

Name Type Description Default
reports pandas.DataFrame Model reports. required
sources list[libs.chapter3.model.Source] List with the data sources to include in the figure. required
filters libs.chapter3.model.Filter Filter to apply to the model reports. required
fig_titles list[str] Title to use for the plot of each data source. required
filters_secondary libs.chapter3.model.Filter Filter to apply to the model reports, plotting the result in the secondary axis. None

Returns

Type Description
plotly.Figure Interactive Plotly figure.
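
The "evolution with respect to n" that this function plots can be pictured as one series per data source, with one averaged score per value of n. The following is only a sketch of that aggregation step, not the library's implementation: the record layout (keys "source", "n" and "accuracy"), the source names and the helper evolution_series are hypothetical, and plain dictionaries stand in for the model reports DataFrame.

```python
from collections import defaultdict
from statistics import mean


def evolution_series(records, metric="accuracy"):
    """Return {source: [(n, mean metric), ...]} with points sorted by n.

    `records` is a hypothetical stand-in for the model reports: a list of
    dicts with the keys "source", "n" and the chosen metric.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for row in records:
        grouped[row["source"]][row["n"]].append(row[metric])
    return {
        source: [(n, mean(values)) for n, values in sorted(by_n.items())]
        for source, by_n in grouped.items()
    }


# Hypothetical reports: two sources, scores for n = 1 and n = 2.
records = [
    {"source": "source_a", "n": 1, "accuracy": 0.5},
    {"source": "source_a", "n": 1, "accuracy": 0.75},
    {"source": "source_a", "n": 2, "accuracy": 0.875},
    {"source": "source_b", "n": 1, "accuracy": 0.25},
]
series = evolution_series(records)
```

Each entry of series would then correspond to one line in the plot of its data source.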

plot_pairwise_comparision

chapter3.analysis.visualization.plot_pairwise_comparision(reports, sources, filters, sources_print, alternative='two-sided', stars=False, parametric=False)

Generates a figure to represent the pairwise tests generated by libs.chapter3.statistical_tests.pairwise_n_comparison.

Parameters

Name Type Description Default
reports pandas.DataFrame Model reports. required
sources list[libs.chapter3.model.Source] List with the data sources to include in the figure. required
filters libs.chapter3.model.Filter Filter to apply to the model reports. required
sources_print dict Mapping between a Source and a string representation. required
alternative str Hypothesis to test. One of: ‘two-sided’, ‘less’ or ‘greater’. 'two-sided'
stars boolean Replace p-values under 0.05 with stars: '*' when 0.01 < p-value < 0.05; '**' when 0.001 < p-value < 0.01; '***' when p-value < 0.001. False
parametric boolean Compute parametric or non-parametric tests. False

Returns

Type Description
plotly.Figure Interactive Plotly figure.
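
The effect of the stars parameter can be illustrated with a small mapping from p-values to significance markers. This is a minimal sketch under stated assumptions, not the library's code: the helper p_value_to_stars is hypothetical, the markers are assumed to be the conventional one, two and three asterisks, the boundary values 0.01 and 0.001 (which the description leaves unspecified) are assigned to the less significant tier, and non-significant p-values are assumed to be shown as rounded numbers.

```python
def p_value_to_stars(p_value):
    """Map a p-value to a significance marker (hypothetical helper)."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    # Assumption: p-values of 0.05 or more are kept as numbers.
    return f"{p_value:.3f}"


labels = [p_value_to_stars(p) for p in (0.0005, 0.005, 0.03, 0.2)]
```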

plot_visual_comparison

chapter3.analysis.visualization.plot_visual_comparison(best_items, significance_results, focus_on, groups)

Generates a figure to visually summarize statistical group comparisons. For each group, plots a symbol representing the best-performing item (focus_on). If the item is the statistically best-performing item (i.e., it does not tie with any other item), the symbol is filled. Otherwise, the symbol contains a number indicating how many items the best-performing item ties with.

Parameters

Name Type Description Default
best_items dict Best item from a performance comparison. See: libs.chapter3.model.obtain_best_items. required
significance_results pd.DataFrame DataFrame containing the number of best significant data sources/models for each combination of number of training subjects and models/data sources. required
focus_on list[str] Items being compared. List items are one of libs.chapter3.model.Source or libs.chapter3.model.Model. required
groups list[str] Each group where focus_on items are compared. List items are one of libs.chapter3.model.Source or libs.chapter3.model.Model. required

Returns

Type Description
plotly.Figure Interactive Plotly figure.
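
The filled-versus-numbered symbol rule described for this function can be sketched as a small decision helper. The function marker_for_best_item and the returned dictionary layout are hypothetical and only illustrate the rule; how the library actually encodes its symbols is not documented here.

```python
def marker_for_best_item(n_ties):
    """Describe the symbol drawn for the best-performing item of a group.

    `n_ties` is the number of other items the best item ties with.
    """
    if n_ties < 0:
        raise ValueError("n_ties must be non-negative")
    if n_ties == 0:
        # Statistically best item: filled symbol, no label.
        return {"filled": True, "label": ""}
    # Tied with other items: hollow symbol labelled with the tie count.
    return {"filled": False, "label": str(n_ties)}


clear_winner = marker_for_best_item(0)
tied_winner = marker_for_best_item(2)
```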

plot_visual_ties

chapter3.analysis.visualization.plot_visual_ties(best_items, significance_results, focus_on, groups)

Generates a figure to indicate the ties in the group comparisons. For each item, indicates how many times it has tied with other items. Complementary figure to libs.chapter3.visualization.plot_visual_comparison.

Parameters

Name Type Description Default
best_items dict Best item from a performance comparison. See: libs.chapter3.model.obtain_best_items. required
significance_results pd.DataFrame DataFrame containing the number of best significant data sources/models for each combination of number of training subjects and models/data sources. required
focus_on list[str] Items being compared. List items are one of libs.chapter3.model.Source or libs.chapter3.model.Model. required
groups list[str] Each group where focus_on items are compared. List items are one of libs.chapter3.model.Source or libs.chapter3.model.Model. required

Returns

Type Description
plotly.Figure Interactive Plotly figure.
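
The per-item tie count shown in this figure can be thought of as a tally over groups. The sketch below assumes one possible reading, in which an item's count is the number of groups where it shares the best result with at least one other item. The input mapping, the item and group names, and the helper count_ties are all hypothetical and do not reflect the structure of best_items or significance_results.

```python
from collections import Counter


def count_ties(tied_items_by_group):
    """Count, per item, the groups in which it tied with another item.

    `tied_items_by_group` is a hypothetical mapping from a group name to
    the list of items that share the best result in that group.
    """
    counts = Counter()
    for tied_items in tied_items_by_group.values():
        if len(tied_items) > 1:  # a single best item is not a tie
            counts.update(tied_items)
    return counts


ties = count_ties({
    "group_1": ["item_a", "item_b"],
    "group_2": ["item_a"],
    "group_3": ["item_a", "item_c"],
})
```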