An outstanding open problem is whether collective social phenomena occurring over short timescales can systematically reduce cultural heterogeneity in the long run, and whether offline and online human interactions contribute differently to the process. Theoretical models suggest that short-term collective behavior and long-term cultural diversity are mutually exclusive, since they require very different levels of social influence. Social influence, in turn, depends jointly on two factors: the topology of the underlying social network and the overlap between individuals in multidimensional cultural space. However, while the empirical properties of social networks are intensively studied, little is known about the large-scale organization of real societies in cultural space, so models necessarily rely on random input specifications. Here we use a large dataset to perform a high-dimensional analysis of the scientific beliefs of thousands of Europeans. We find that interopinion correlations determine a nontrivial ultrametric hierarchy of individuals in cultural space. When empirical data are used as inputs in models, ultrametricity has strong and counterintuitive effects. On short timescales, it facilitates a symmetry-breaking phase transition that triggers coordinated social behavior. On long timescales, it suppresses cultural convergence by confining it to disjoint groups. Moreover, ultrametricity implies that these results are surprisingly robust to modifications of the dynamical rules considered. Thus the empirical distribution of individuals in cultural space appears to systematically optimize the coexistence of short-term collective behavior and long-term cultural diversity, which can be realized simultaneously for the same moderate level of mutual influence in a diverse range of online and offline settings.
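
To make the notion of ultrametricity concrete, the following minimal Python sketch checks the strong triangle inequality d(i,k) ≤ max(d(i,j), d(j,k)) on a matrix of pairwise distances. For a symmetric distance this is equivalent to requiring that the two largest distances in every triple be equal. The normalized Hamming distance between opinion vectors and the function names (`hamming_distance`, `distance_matrix`, `is_ultrametric`) are illustrative assumptions, not the measure or procedure used in this study.

```python
from itertools import combinations


def hamming_distance(a, b):
    """Fraction of positions at which two equal-length opinion vectors differ.

    Assumption: an illustrative stand-in for a cultural distance,
    not the measure used in the study.
    """
    if len(a) != len(b):
        raise ValueError("opinion vectors must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(vectors):
    """Symmetric matrix of pairwise distances between opinion vectors."""
    n = len(vectors)
    return [[hamming_distance(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]


def is_ultrametric(dist, tol=1e-9):
    """Return True if every triple satisfies the strong triangle inequality.

    For a symmetric distance, this holds exactly when the two largest
    distances within each triple coincide (up to the tolerance `tol`).
    """
    n = len(dist)
    for i, j, k in combinations(range(n), 3):
        _, second, largest = sorted((dist[i][j], dist[j][k], dist[i][k]))
        if largest - second > tol:
            return False
    return True


# Toy distance matrices (hypothetical values, not empirical data).
tree_like = [[0, 1, 3],
             [1, 0, 3],
             [3, 3, 0]]
generic = [[0, 1, 2],
           [1, 0, 3],
           [2, 3, 0]]

# Toy opinion vectors (hypothetical): binary answers to four questions.
opinions = [(0, 0, 0, 0), (0, 0, 0, 1), (1, 1, 1, 0)]
opinion_distances = distance_matrix(opinions)
```

In this sketch, `is_ultrametric(tree_like)` holds because the two largest distances in the single triple are equal, whereas `generic` and the distances derived from `opinions` violate the condition. Real data would satisfy ultrametricity only approximately, so a tolerance or a comparison against a randomized null model would be needed in practice.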