Abstract

Large-scale network data can pose computational challenges, be expensive to acquire, and compromise the privacy of individuals in social networks. We show that the locations and scales of latent space cluster models can be inferred from aggregate network data alone, i.e., the number of connections between groups. We develop a likelihood approximation and, taking a Bayesian perspective, an efficient approach to draw samples from the posterior. We demonstrate this modeling approach using synthetic data and apply it to two real-world datasets: friendships between students from the Add Health study and face-to-face contact patterns in eight European countries. The method eliminates the need for node-level connection data, reduces disclosure risk for individuals, and simplifies data sharing. It also offers performance advantages over node-level latent space models because the computational cost scales with the number of clusters rather than the number of nodes.
Till Hoffmann (Mon,) studied this question.
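
To make "aggregate network data" concrete, the sketch below collapses a small undirected network into counts of connections between groups. It illustrates only the data format the abstract describes, not the inference method itself. The function name `aggregate_connections`, the toy network, and the group labels are hypothetical and do not come from the paper.

```python
import numpy as np


def aggregate_connections(adjacency, groups, num_groups):
    """Count connections within and between groups of an undirected network.

    Hypothetical helper for illustration; assumes a symmetric 0/1 adjacency
    matrix without self-loops and integer group labels in [0, num_groups).
    """
    adjacency = np.asarray(adjacency)
    groups = np.asarray(groups)
    # One-hot membership matrix: membership[i, g] == 1 if node i is in group g.
    membership = np.eye(num_groups, dtype=int)[groups]
    counts = membership.T @ adjacency @ membership
    # Within-group edges are counted twice (once as i-j, once as j-i).
    counts[np.diag_indices(num_groups)] //= 2
    return counts


# Toy network with five nodes in two groups (made-up data).
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
groups = [0, 0, 1, 1, 1]

adjacency = np.zeros((5, 5), dtype=int)
for i, j in edges:
    adjacency[i, j] = adjacency[j, i] = 1

counts = aggregate_connections(adjacency, groups, num_groups=2)
print(counts)
```

The resulting matrix has one row and one column per group, so its size depends on the number of groups rather than the number of nodes, and it no longer records which individuals are connected.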