We compute the variation of information (VI) between the partitions given by new_classification and old_classification. The VI is also computed between old_classification and each of a set of random partitions, obtained by reshuffling the original labels in old_classification. These values form a distribution of VI for random partitions. Finally, comparing the observed VI with this distribution gives an empirical p value for the VI of the unsupervised cluster analysis.
vi_comparison(old_classification, new_classification, number_iter)
Character vector. First column of the data frame returned by the function clustering_angular_distance (first element of the output).
Character vector. Second column of the data frame returned by the function clustering_angular_distance (first element of the output).
Integer value. Number of random partitions to generate (each obtained by reshuffling the labels in old_classification).
Empirical p value.
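The procedure described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper names vi_sketch and empirical_p_sketch are hypothetical, the natural logarithm is assumed for the VI, and the p value is assumed to be the fraction of reshuffled partitions whose VI is less than or equal to the observed one (a small VI meaning similar partitions). The actual function may use a different convention.

```r
# Variation of information between two label vectors (natural log assumed).
vi_sketch <- function(a, b) {
  stopifnot(length(a) == length(b))
  joint <- table(a, b) / length(a)            # joint distribution p(i, j)
  cond_a <- sweep(joint, 1, rowSums(joint), "/")  # p(i, j) / p(i)
  cond_b <- sweep(joint, 2, colSums(joint), "/")  # p(i, j) / q(j)
  nz <- joint > 0                             # skip empty cells (0 * log 0 = 0)
  -sum(joint[nz] * (log(cond_a[nz]) + log(cond_b[nz])))
}

# Empirical p value from reshuffled labels (direction of the test is an assumption).
empirical_p_sketch <- function(old_classification, new_classification, number_iter) {
  observed <- vi_sketch(old_classification, new_classification)
  # Each random partition is a permutation of the labels in old_classification.
  random_vi <- replicate(
    number_iter,
    vi_sketch(old_classification, sample(old_classification))
  )
  mean(random_vi <= observed)
}

# Toy usage with made-up labels.
set.seed(1)
old_labels <- rep(c("A", "B", "C"), each = 20)
new_labels <- old_labels
new_labels[c(1, 21, 41)] <- c("B", "C", "A")  # perturb three assignments
p_value <- empirical_p_sketch(old_labels, new_labels, number_iter = 100)
```

With these assumptions, two identical partitions have a VI of zero, and a small p value indicates that new_classification is closer to old_classification than reshuffled labels typically are.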