WEClustering
Preprocessed data
2000 documents
Elbow for topics discovery
WEClustering_words_emb_clustering.PNG
Elbow on final clustering
WEClustering_CD_matrix_clustering.PNG
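The two elbow plots above are not accompanied by the code that produced them. As a minimal sketch of the idea, assuming plain k-means with Euclidean distance, the following computes the within-cluster sum of squares (inertia) for several values of k on toy data; the "elbow" is the k after which the inertia stops dropping sharply. The function name `kmeans_inertia`, the farthest-point initialisation, and the toy points are illustrative assumptions, not part of the original pipeline.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def kmeans_inertia(points, k, iters=50):
    """Run a small deterministic k-means and return the final inertia."""
    # Farthest-point initialisation (an assumption made here for determinism).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centre.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each centre to the mean of its group (keep it if empty).
        new_centers = [centroid(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    # Inertia: sum of squared distances to the nearest centre.
    return sum(min(math.dist(p, c) for c in centers) ** 2 for p in points)

# Toy data with three well-separated groups (hypothetical, for illustration only).
toy_points = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0), (0, 10), (0, 11), (1, 10)]
inertias = {k: kmeans_inertia(toy_points, k) for k in range(1, 5)}
for k, value in inertias.items():
    print(k, round(value, 3))
```

On this toy data the inertia falls steeply up to k = 3 and only marginally afterwards, which is the pattern one looks for in an elbow plot.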
Silhouette
WEClustering_silhouette.PNG
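For reference, the silhouette coefficient of a sample is (b - a) / max(a, b), where a is its mean distance to the other members of its own cluster and b is the smallest mean distance to the members of any other cluster. The sketch below computes the mean silhouette in plain Python, assuming Euclidean distance; the distance metric and the way the plot above was generated are not stated in this report, and `silhouette_score` here is an illustrative helper rather than the original code.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient (Euclidean); needs at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of another cluster
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Toy example (hypothetical data): a good and a poor labelling of four points.
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(toy_points, [0, 0, 1, 1])
poor = silhouette_score(toy_points, [0, 1, 0, 1])
print(round(good, 3), round(poor, 3))
```

Values close to 1 indicate compact, well-separated clusters; values near 0 or below indicate overlapping or misassigned samples.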
Inter Cluster Distance
WEClustering_interdistance.PNG
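One simple way to quantify inter cluster distance is the Euclidean distance between cluster centroids. The sketch below assumes that definition; the plot above may have been produced with a different tool or distance, and `inter_cluster_distances` is a hypothetical helper name.

```python
import math
from itertools import combinations

def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def inter_cluster_distances(clusters):
    """Map each pair of cluster ids to the distance between their centroids."""
    cents = {cid: centroid(pts) for cid, pts in clusters.items()}
    return {
        (a, b): math.dist(cents[a], cents[b])
        for a, b in combinations(sorted(cents), 2)
    }

# Toy clusters (hypothetical data) with centroids (0, 1), (3, 1) and (0, 5).
toy_clusters = {0: [(0, 0), (0, 2)], 1: [(3, 1), (3, 1)], 2: [(0, 5)]}
distances = inter_cluster_distances(toy_clusters)
print(distances)
```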
Intra Cluster Distance
Intra cluster distances for topic 0:
Complete Diameter Distance: 1999.0
Average Diameter Distance: 698.2844360902255
Centroid Diameter Distance: 3982.728611467601
Intra cluster distances for topic 1:
Complete Diameter Distance: 1986.0
Average Diameter Distance: 674.2332056331949
Centroid Diameter Distance: 3582.031089475701
Intra cluster distances for topic 2:
Complete Diameter Distance: 1993.0
Average Diameter Distance: 621.6203680981595
Centroid Diameter Distance: 4146.472106029553
Intra cluster distances for topic 3:
Complete Diameter Distance: 1995.0
Average Diameter Distance: 700.0921209551321
Centroid Diameter Distance: 4152.436644353412
Intra cluster distances for topic 4:
Complete Diameter Distance: 1978.0
Average Diameter Distance: 618.4894146706732
Centroid Diameter Distance: 4139.655301913347
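The report does not say how the three intra cluster distances were computed. The sketch below uses the textbook definitions as an assumption: complete diameter is the largest pairwise distance within a cluster, average diameter is the mean pairwise distance, and centroid diameter is twice the mean distance of the members to the cluster centroid. The figures above may come from a different variant or distance measure, so this is meant only to make the three terms concrete.

```python
import math
from itertools import combinations

def complete_diameter(points):
    """Largest distance between any two members of the cluster."""
    return max(math.dist(p, q) for p, q in combinations(points, 2))

def average_diameter(points):
    """Mean distance over all distinct pairs of members."""
    pairs = list(combinations(points, 2))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)

def centroid_diameter(points):
    """Twice the mean distance of the members to the cluster centroid."""
    n = len(points)
    center = tuple(sum(c) / n for c in zip(*points))
    return 2 * sum(math.dist(p, center) for p in points) / n

# Toy cluster (hypothetical data): the corners of a 2 x 2 square.
square = [(0, 0), (2, 0), (0, 2), (2, 2)]
print(complete_diameter(square), average_diameter(square), centroid_diameter(square))
```

All three functions expect a cluster with at least two members.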
Calinski-Harabasz
Calinski-Harabasz score (higher is better): 1672.112786093521
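The Calinski-Harabasz score is the ratio of between-cluster dispersion to within-cluster dispersion, each divided by its degrees of freedom: [B / (k - 1)] / [W / (n - k)] for n samples and k clusters. The following plain-Python sketch of that formula assumes squared Euclidean distances; it is an illustration, not the code that produced the score above.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def calinski_harabasz(points, labels):
    """Between/within dispersion ratio; needs 2 <= k < n."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    n, k = len(points), len(clusters)
    overall = centroid(points)
    between = 0.0
    within = 0.0
    for members in clusters.values():
        center = centroid(members)
        # B: cluster size times squared distance of its centroid to the global centroid
        between += len(members) * math.dist(center, overall) ** 2
        # W: squared distances of the members to their own centroid
        within += sum(math.dist(p, center) ** 2 for p in members)
    return (between / (k - 1)) / (within / (n - k))

# Toy example (hypothetical data): B = 100, W = 4, n = 4, k = 2.
ch = calinski_harabasz([(0, 0), (0, 2), (10, 0), (10, 2)], [0, 0, 1, 1])
print(ch)
```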
Davies-Bouldin
Davies-Bouldin score (closer to 0 is better): 0.9258612661868135
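The Davies-Bouldin score averages, over all clusters, the worst-case ratio (s_i + s_j) / d(c_i, c_j), where s_i is the mean distance of the members of cluster i to its centroid c_i. A minimal sketch under the assumption of Euclidean distance follows; `davies_bouldin` is an illustrative helper, not the original implementation.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def davies_bouldin(points, labels):
    """Mean over clusters of the largest similarity ratio to another cluster."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    cents = {lab: centroid(m) for lab, m in clusters.items()}
    # s_i: mean distance of the members to their own centroid
    scatter = {
        lab: sum(math.dist(p, cents[lab]) for p in m) / len(m)
        for lab, m in clusters.items()
    }
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in clusters
            if j != i
        )
        for i in clusters
    ]
    return sum(worst) / len(worst)

# Toy example (hypothetical data): scatter 1 per cluster, centroids 10 apart.
db = davies_bouldin([(0, 0), (0, 2), (10, 0), (10, 2)], [0, 0, 1, 1])
print(db)
```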
Topic diversity
{0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0}
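The report gives one diversity value per topic but does not define it. One plausible reading, assumed here, is the share of a topic's top words that do not appear among the top words of any other topic, so that 1.0 means the topic shares no top word with the others. The helper `topic_diversity` and the toy word lists are hypothetical.

```python
def topic_diversity(topics):
    """Per-topic share of top words not found in any other topic's top words."""
    result = {}
    for tid, words in topics.items():
        # Collect the top words of all other topics.
        others = set()
        for other_id, other_words in topics.items():
            if other_id != tid:
                others.update(other_words)
        own = set(words)
        result[tid] = len(own - others) / len(own)
    return result

# Toy topics (hypothetical data): topics 0 and 1 share the word "dog".
toy_topics = {0: ["cat", "dog"], 1: ["dog", "car"], 2: ["tax", "law"]}
diversity = topic_diversity(toy_topics)
print(diversity)
```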
Topic coherence
c_npmi: 0.07004621457256555
c_uci: -0.023082386170603632
c_umass: -2.857567623681361
c_npmi for each topic: [0.06158716302875173, 0.009856810035351493, 0.031588865350902655, 0.14569834344979796, 0.1014998909980239]
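c_uci for each topic: [0.06158716302875173, 0.009856810035351493, 0.031588865350902655, 0.14569834344979796, 0.1014998909980239] (identical to the c_npmi list and inconsistent with the aggregate c_uci of -0.023082386170603632)
c_umass for each topic: [0.06158716302875173, 0.009856810035351493, 0.031588865350902655, 0.14569834344979796, 0.1014998909980239] (identical to the c_npmi list and inconsistent with the aggregate c_umass of -2.857567623681361)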
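The aggregate c_npmi reported above equals the arithmetic mean of the per-topic c_npmi list. To make the measure itself concrete, the sketch below computes NPMI coherence for one topic from document-level co-occurrence: for each pair of top words, NPMI = log(P(wi, wj) / (P(wi) P(wj))) / (-log P(wi, wj)), averaged over all pairs. Using whole documents as the co-occurrence window is an assumption made for brevity (coherence toolkits often use a sliding window instead), and `npmi_coherence` and the toy corpus are hypothetical.

```python
import math
from itertools import combinations

def npmi_coherence(top_words, documents):
    """Mean NPMI over all pairs of top words, using document co-occurrence."""
    doc_sets = [set(doc) for doc in documents]
    n = len(doc_sets)

    def prob(*words):
        # Fraction of documents containing all the given words.
        return sum(all(w in d for w in words) for d in doc_sets) / n

    scores = []
    for wi, wj in combinations(top_words, 2):
        p_ij = prob(wi, wj)
        if p_ij == 0:
            scores.append(-1.0)  # words never co-occur: NPMI tends to -1
        elif p_ij == 1:
            scores.append(0.0)   # both words in every document: PMI is 0
        else:
            pmi = math.log(p_ij / (prob(wi) * prob(wj)))
            scores.append(pmi / -math.log(p_ij))
    return sum(scores) / len(scores)

# Toy corpus (hypothetical data).
toy_docs = [["cat", "dog"], ["cat", "dog"], ["car", "road"], ["car", "road"]]
coherent = npmi_coherence(["cat", "dog"], toy_docs)
incoherent = npmi_coherence(["cat", "car"], toy_docs)

# Per-topic c_npmi values as reported above, and their mean.
reported_per_topic = [0.06158716302875173, 0.009856810035351493,
                      0.031588865350902655, 0.14569834344979796,
                      0.1014998909980239]
mean_npmi = sum(reported_per_topic) / len(reported_per_topic)
print(coherent, incoherent, mean_npmi)
```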
Distribution of number of samples
400, 431, 326, 389, 454
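As a quick consistency check, the five cluster sizes above add up to the 2000 documents stated at the top of the report. The short sketch below reproduces that check and derives each cluster's share and a simple largest-to-smallest imbalance ratio; the variable names are illustrative.

```python
sizes = [400, 431, 326, 389, 454]  # cluster sizes as listed above

total = sum(sizes)
shares = [size / total for size in sizes]
imbalance = max(sizes) / min(sizes)  # ratio of largest to smallest cluster

print(total)
print([round(share, 4) for share in shares])
print(round(imbalance, 3))
```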