Hierarchical clustering is an unsupervised framework that organizes observations according to pairwise similarity relationships. In this study, an agglomerative hierarchical approach is combined with Gower dissimilarity to accommodate mixed-type customer data. To address data quality issues such as missing values and outliers, Multiple Imputation by Chained Equations (MICE) and Winsorization are incorporated into the preprocessing pipeline. To assess cluster stability and identify the optimal number of clusters, we use silhouette analysis, the Davies–Bouldin Index (DBI), the Proportion of Ambiguous Clustering (PAC), and a subsampling-based consensus clustering framework. A hierarchical tree derived from the consensus matrix is then used to assess the robustness of the segmentation structure. The resulting clusters are further evaluated against baseline algorithms for mixed-type data, including Partitioning Around Medoids (PAM) based on Gower dissimilarity and the K-prototypes method, and through statistical tests that confirm significant behavioral differences between the identified segments. From an application standpoint, these results provide a data-driven basis for customer targeting by identifying distinct behavioral patterns, thereby supporting more effective engagement strategies and optimized resource allocation.
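To make the core idea concrete, the following is a minimal pure-Python sketch of Gower dissimilarity followed by agglomerative merging. It is an illustration under stated assumptions, not the authors' implementation: the toy customer records, the function names `gower_matrix` and `agglomerate`, and the choice of average linkage are all hypothetical, and the preprocessing (MICE, Winsorization) and validation steps described in the abstract are omitted.

```python
from itertools import combinations

def gower_matrix(rows, numeric_cols):
    """Pairwise Gower dissimilarity for mixed-type records (assumes no missing values)."""
    n, p = len(rows), len(rows[0])
    # Range of each numeric column, used to scale absolute differences into [0, 1].
    ranges = {}
    for c in numeric_cols:
        vals = [r[c] for r in rows]
        ranges[c] = max(vals) - min(vals)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        total = 0.0
        for c in range(p):
            if c in ranges:
                # Numeric feature: range-scaled absolute difference.
                total += abs(rows[i][c] - rows[j][c]) / ranges[c] if ranges[c] else 0.0
            else:
                # Categorical feature: simple mismatch indicator.
                total += 0.0 if rows[i][c] == rows[j][c] else 1.0
        d[i][j] = d[j][i] = total / p  # unweighted mean over all features
    return d

def agglomerate(d, k):
    """Merge clusters bottom-up until k remain (average linkage is an assumption here)."""
    clusters = [[i] for i in range(len(d))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            pairs = len(clusters[a]) * len(clusters[b])
            link = sum(d[i][j] for i in clusters[a] for j in clusters[b]) / pairs
            if best is None or link < best[0]:
                best = (link, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Hypothetical customers: (age, annual spend, preferred channel).
rows = [
    (25, 200.0, "web"),
    (27, 220.0, "web"),
    (61, 900.0, "store"),
    (58, 950.0, "store"),
]
d = gower_matrix(rows, numeric_cols=[0, 1])
clusters = agglomerate(d, k=2)
print(clusters)
```

On this toy input the two young web customers and the two older in-store customers end up in separate clusters, because both the scaled numeric differences and the categorical mismatch are small within each pair and large across pairs.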
Nooshin Marefat
Purificación Galindo‐Villardón
Purificación Vicente-Galindo
Mathematics
Universidad de Salamanca
Escuela Superior Politecnica del Litoral
Universidad Estatal de Milagro
Marefat et al. (Mon,) studied this question.
www.synapsesocial.com/papers/69df2b85e4eeef8a2a6b077b — DOI: https://doi.org/10.3390/math14081294