Statisticians have celebrated a lot recently. 2013 marked the 300th anniversary of Jacob Bernoulli's Ars Conjectandi, which used probability theory to explore the properties of statistics as more observations were taken. It was also the 250th anniversary of Thomas Bayes' essay on how humans can sequentially learn from experience, steadily updating their beliefs as more data become available (1). And it was the International Year of Statistics (2). Now that the bunting has been taken down, it is a good time to take stock of recent developments in statistical science and examine its role in the age of Big Data. Much enthusiasm for statistics hangs on the ever-increasing availability of large data sets, particularly when something has to be ranked or classified. These situations arise, for example, when deciding which book to recommend, working out where your arm is when practicing golf swings in front of a games console, or (if you're a security agency) deciding whose private e-mail to read first. Purely data-based approaches, under the title of machine-learning, have been highly successful in speech recognition, real-time interpretation of moving images, and online translation.
The future lies in uncertainty. D. J. Spiegelhalter
Collective decisions in animal groups emerge from the actions of individuals who are unlikely to have global information. Comparative assessment of options can be valuable in decision-making. Ant colonies are excellent collective decision-makers, for example when selecting a new nest-site.
We show that in the non-linear regime of the optimal velocity model, there is an emergent quantity that gives the extremum headways in the cluster formation, as well as the coexistence curve separating the absolutely stable phase from the metastable phase. This emergent quantity is independent of the density of the traffic lane, and determines the width of the transition region from the minimum headways (or clusters) to the maximum headways (or anti-clusters). The width also gives an intrinsic scale that controls the strength of interaction between multiple clusters. This leads to non-trivial cluster statistics from random initial perturbations, and the statistics also depend on the density of the traffic lane. We conjecture these aspects are universal features for various different car-following models.
Cluster Statistics and Universal Aspects of the Optimal Velocity Model in the Non-Linear Regime B Yang, X Xu, Z.F. Pang, C Monterola
Context: Ernst von Glasersfeld introduced radical constructivism in 1974 as a new interpretation of Jean Piaget’s constructivism to give new meanings to the notions of knowledge, communication, and reality. He also claimed that RC would affect traditional theories of education. Problem: After 40 years it has become necessary to review and evaluate von Glasersfeld’s claim. Also, has RC been successful in taking the “social turn” in educational research, or is it unable to go beyond “private worlds”? Method: We provide an overview of contributed articles that were written with the aim of showing whether RC has an impact on educational research, and we discuss three core issues: Can RC account for inter-individual aspects? Is RC a theory of learning? And should Piaget be regarded as a radical constructivist? Results: We argue that the contributed papers demonstrate the efficiency of the application of RC to educational research and practice. Our argumentation also shows that in RC it would be misleading to claim a dichotomy between cognition and social interaction (rather, social constructivism is a radical constructivism), that RC does not contain a theory of mathematics learning any more or less than it contains a theory of mathematics teaching, and that Piaget should not be considered a mere trivial constructivist. Implications: Still one of the most challenging influences on educational research and practice, RC is ready to embark on many further questions, including its relationship with other constructivist paradigms, and to make progress in the social dimension.
Sensory neuroprostheses show great potential for alleviating major sensory deficits. It is not known, however, whether such devices can augment the subject’s normal perceptual range. Here we show that adult rats can learn to perceive otherwise invisible infrared light through a neuroprosthesis that couples the output of a head-mounted infrared sensor to their somatosensory cortex (S1) via intracortical microstimulation. Rats readily learn to use this new information source, and generate active exploratory strategies to discriminate among infrared signals in their environment. S1 neurons in these infrared-perceiving rats respond to both whisker deflection and intracortical microstimulation, suggesting that the infrared representation does not displace the original tactile representation. Hence, sensory cortical prostheses, in addition to restoring normal neurological functions, may serve to expand natural perceptual capabilities in mammals.
Perceiving invisible light through a somatosensory cortical prosthesis • Eric E. Thomson, Rafael Carra & Miguel A.L. Nicolelis
Understanding the assembly of ecosystems to estimate the number of species at different spatial scales is a challenging problem. Until now, maximum entropy approaches have lacked the important feature of considering space in an explicit manner. We propose a spatially explicit maximum entropy model suitable to describe spatial patterns such as the species area relationship and the endemic area relationship. Starting from the minimal information extracted from presence/absence data, we compare the behavior of two models considering the occurrence or lack thereof of each species and information on spatial correlations. Our approach uses the information at shorter spatial scales to infer the spatial organization at larger ones. We also hypothesize a possible ecological interpretation of the effective interaction we use to characterize spatial clustering. (http://arxiv.org/abs/1407.2425)
Overexploitation of renewable resources today has a high cost on the welfare of future generations. Unlike in other public goods games, however, future generations cannot reciprocate actions made today. What mechanisms can maintain cooperation with the future? To answer this question, we devise a new experimental paradigm, the ‘Intergenerational Goods Game’. A line-up of successive groups (generations) can each either extract a resource to exhaustion or leave something for the next group. Exhausting the resource maximizes the payoff for the present generation, but leaves all future generations empty-handed. Here we show that the resource is almost always destroyed if extraction decisions are made individually. This failure to cooperate with the future is driven primarily by a minority of individuals who extract far more than what is sustainable. In contrast, when extractions are democratically decided by vote, the resource is consistently sustained. Voting is effective for two reasons. First, it allows a majority of cooperators to restrain defectors. Second, it reassures conditional cooperators that their efforts are not futile. Voting, however, only promotes sustainability if it is binding for all involved. Our results have implications for policy interventions designed to sustain intergenerational public goods.
Cooperating with the future Oliver P. Hauser, David G. Rand, Alexander Peysakhovich & Martin A. Nowak
Core percolation is a fundamental structural transition in complex networks related to a wide range of important problems. Recent advances have provided us with an analytical framework of core percolation in uncorrelated random networks with arbitrary degree distributions. Here we apply these tools to the analysis of network controllability. We confirm analytically that the emergence of the bifurcation in control coincides with the formation of the core and that the structure of the core determines the control mode of the network. We also derive the analytical expression related to the controllability robustness by extending the deduction in core percolation. These findings help us better understand the interesting interplay between the structural and dynamical properties of complex networks.
Connecting Core Percolation and Controllability of Complex Networks • Tao Jia & Márton Pósfai
GitHub is the most popular repository for open source code. It has more than 3.5 million users, as the company declared in April 2013, and more than 10 million repositories, as of December 2013. It has a publicly accessible API and, since March 2012, it also publishes a stream of all the events occurring on public projects. Interactions among GitHub users are of a complex nature and take place in different forms. Developers create and fork repositories, push code, approve code pushed by others, bookmark their favorite projects and follow other developers to keep track of their activities. In this paper we present a characterization of GitHub, as both a social network and a collaborative platform. To the best of our knowledge, this is the first quantitative study about the interactions happening on GitHub. We analyze the logs from the service over 18 months (between March 11, 2012 and September 11, 2013), describing 183.54 million events and we obtain information about 2.19 million users and 5.68 million repositories, both growing linearly in time. We show that the distributions of the number of contributors per project, watchers per project and followers per user show a power-law-like shape. We analyze social ties and repository-mediated collaboration patterns, and we observe a remarkably low level of reciprocity of the social connections. We also measure the activity of each user in terms of authored events and we observe that very active users do not necessarily have a large number of followers. Finally, we provide a geographic characterization of the centers of activity and we investigate how distance influences collaboration.
Coding Together at Scale: GitHub as a Collaborative Social Network Antonio Lima, Luca Rossi, Mirco Musolesi
Predicting a transition point in behavioral data should take into account the complexity of the signal being influenced by contextual factors. In this paper, we propose to analyze changes in the embedding dimension as contextual information indicating an approaching transition point, a method called OPtimal Embedding tRANsition Detection (OPERAND). Three texts were processed and translated to time-series of emotional polarity. It was found that changes in the embedding dimension preceded transition points in the data. These preliminary results encourage further research into changes in the embedding dimension as generic markers of an approaching transition point.
While numerous changes in human lifestyle constitute modern life, our diet has been gaining attention as a potential contributor to the increase in immune-mediated diseases. The Western diet is characterized by an overconsumption and reduced variety of refined sugars, salt, and saturated fat. Herein our objective is to detail the mechanisms for the Western diet’s impact on immune function. The manuscript reviews the impacts and mechanisms of harm for our over-indulgence in sugar, salt, and fat, as well as the data outlining the impacts of artificial sweeteners, gluten, and genetically modified foods; attention is given to revealing where the literature on the immune impacts of macronutrients is limited to either animal or in vitro models versus where human trials exist. Detailed attention is given to the dietary impact on the gut microbiome and the mechanisms by which our poor dietary choices are encoded into our gut, our genes, and are passed to our offspring. While today’s modern diet may provide beneficial protection from micro- and macronutrient deficiencies, our overabundance of calories and the macronutrients that compose our diet may all lead to increased inflammation, reduced control of infection, increased rates of cancer, and increased risk for allergic and auto-inflammatory disease.
Fast food fever: reviewing the impacts of the Western diet on immunity Ian A Myles
To account for the dissipative mechanisms found in nature, non-conservative elements have been incorporated in the energy redistribution rules of sandpiles and similar models of hazard phenomena. In this work, we found that incorporating non-conservation in the form of spatially-distributed sink sites affects both the external driving and internal cascade mechanisms of the sandpile. Increasing sink densities result in the loss of critical behavior, as evidenced by the gradual evolution of the avalanche size distribution from power-law (correlated) to exponential (random). For low density cases, we found no optimal configuration that will minimize the risk of producing large avalanches. Our model is inspired by analogs in natural avalanche systems, where non-conservative elements have an inherent spatial distribution.
Loss of criticality in the avalanche statistics of sandpiles with dissipative sites Antonino A. Paguirigan Jr., Christopher P. Monterola, Rene C. Batac
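The sandpile abstract above describes sink sites that remove grains from the pile. The following is a minimal Python sketch of that idea, not the authors' model: it assumes a square lattice with the usual toppling threshold of 4, open boundaries, and sink sites that absorb every grain they receive. The function name `run_sandpile` and all parameter names are hypothetical.

```python
import random

def run_sandpile(size=20, sink_density=0.0, grains=2000, threshold=4, seed=0):
    """Drop grains one at a time and return the avalanche size per grain.

    Assumed rules (illustrative only): a site topples when it holds at
    least `threshold` grains, sending one grain to each of its four
    neighbours; grains sent off the lattice or onto a sink site are lost.
    """
    rng = random.Random(seed)
    # Sink sites are placed independently with probability sink_density.
    sinks = {(r, c) for r in range(size) for c in range(size)
             if rng.random() < sink_density}
    grid = [[0] * size for _ in range(size)]
    sizes = []
    for _ in range(grains):
        r, c = rng.randrange(size), rng.randrange(size)
        if (r, c) in sinks:
            sizes.append(0)  # the driving grain itself is absorbed
            continue
        grid[r][c] += 1
        topplings = 0
        unstable = [(r, c)] if grid[r][c] >= threshold else []
        while unstable:
            i, j = unstable.pop()
            if grid[i][j] < threshold:
                continue  # already relaxed by an earlier pass
            grid[i][j] -= threshold
            topplings += 1
            if grid[i][j] >= threshold:
                unstable.append((i, j))
            for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                inside = 0 <= ni < size and 0 <= nj < size
                if inside and (ni, nj) not in sinks:
                    grid[ni][nj] += 1
                    if grid[ni][nj] >= threshold:
                        unstable.append((ni, nj))
        sizes.append(topplings)
    return sizes
```

Comparing histograms of the returned sizes for several values of `sink_density` is one way to look for the change in the avalanche size distribution that the abstract reports. Note that in this sketch sinks affect both the driving (dropped grains can be lost) and the cascades (toppled grains can be lost).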
More than any other species, humans form social ties to individuals who are neither kin nor mates, and these ties tend to be with similar people. Here, we show that this similarity extends to genotypes. Across the whole genome, friends’ genotypes at the single nucleotide polymorphism level tend to be positively correlated (homophilic). In fact, the increase in similarity relative to strangers is at the level of fourth cousins. However, certain genotypes are also negatively correlated (heterophilic) in friends. And the degree of correlation in genotypes can be used to create a “friendship score” that predicts the existence of friendship ties in a hold-out sample. A focused gene-set analysis indicates that some of the overall correlation in genotypes can be explained by specific systems; for example, an olfactory gene set is homophilic and an immune system gene set is heterophilic, suggesting that these systems may play a role in the formation or maintenance of friendship ties. Friends may be a kind of “functional kin.” Finally, homophilic genotypes exhibit significantly higher measures of positive selection, suggesting that, on average, they may yield a synergistic fitness advantage that has been helping to drive recent human evolution.
Friendship and natural selection Nicholas A. Christakis and James H. Fowler
Cooperating animals frequently show closely coordinated behaviours organized by a continuous flow of information between interacting partners. Such real-time coaction is not captured by the iterated prisoner’s dilemma and other discrete-time reciprocal cooperation games, which inherently feature a delay in information exchange. Here, we study the evolution of cooperation when individuals can dynamically respond to each other’s actions.
The rise of smartphones and web services made possible the large-scale collection of personal metadata. Information about individuals' location, phone call logs, or web-searches is collected and used intensively by organizations and big data researchers. Metadata has however yet to realize its full potential. Privacy and legal concerns, as well as the lack of technical solutions for personal metadata management, are preventing metadata from being shared and reconciled under the control of the individual. This lack of access and control is furthermore fueling growing concerns, as it prevents individuals from understanding and managing the risks associated with the collection and use of their data. Our contribution is two-fold: (1) we describe openPDS, a personal metadata management framework that allows individuals to collect, store, and give fine-grained access to their metadata to third parties. It has been implemented in two field studies; (2) we introduce and analyze SafeAnswers, a new and practical way of protecting the privacy of metadata at an individual level. SafeAnswers turns a hard anonymization problem into a more tractable security one. It allows services to ask questions whose answers are calculated against the metadata instead of trying to anonymize individuals' metadata. The dimensionality of the data shared with the services is reduced from high-dimensional metadata to low-dimensional answers that are less likely to be re-identifiable and to contain sensitive information. These answers can then be directly shared individually or in aggregate. openPDS and SafeAnswers provide a new way of dynamically protecting personal metadata, thereby supporting the creation of smart data-driven services and data science research.
A mother’s diet leaves a lasting impact on the health of her descendants. Mice that are starved during pregnancy give birth to pups that later develop diabetes, and whose offspring are also at risk of the disease. Now a new study provides fresh evidence for the controversial idea that chemical or ‘epigenetic’ alterations to the genome (which influence gene activity, but not the DNA sequence) can transmit the effects of environmental exposures across multiple generations.
A number of social-ecological systems exhibit complex behavior associated with nonlinearities, bifurcations, and interaction with stochastic drivers. These systems are often prone to abrupt and unexpected instabilities and state shifts that emerge as a discontinuous response to gradual changes in environmental drivers. Predicting such behaviors is crucial to the prevention of or preparation for unwanted regime shifts. Recent research in ecology has investigated early warning signs that anticipate the divergence of univariate ecosystem dynamics from a stable attractor. To date, leading indicators of instability in systems with multiple interacting components have remained poorly investigated. This is a major limitation in the understanding of the dynamics of complex social-ecological networks. Here, we develop a theoretical framework to demonstrate that rising variance—measured, for example, by the maximum element of the covariance matrix of the network—is an effective leading indicator of network instability. We show that its reliability and robustness depend more on the sign of the interactions within the network than the network structure or noise intensity. Mutualistic, scale free and small world networks are less stable than their antagonistic or random counterparts but their instability is more reliably predicted by this leading indicator. These results provide new advances in multidimensional early warning analysis and offer a framework to evaluate the resilience of social-ecological networks.
Early Warning Signs in Social-Ecological Networks.
PLoS ONE 9(7): e101851. doi:10.1371/journal.pone.0101851 (2014)
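The leading indicator named in the abstract above, the maximum element of the covariance matrix of the network, can be tracked over a sliding window of observations. The sketch below is only an illustration under assumptions: the window width, the data layout (one list of node values per time step), and the function names `max_covariance` and `rolling_indicator` are hypothetical and not taken from the paper.

```python
def max_covariance(window):
    """Largest element of the sample covariance matrix of a window.

    `window` is a list of observations; each observation is a list
    holding one value per network node.
    """
    n = len(window)
    k = len(window[0])
    means = [sum(row[j] for row in window) / n for j in range(k)]
    best = float("-inf")
    for a in range(k):
        for b in range(a, k):  # the matrix is symmetric
            cov = sum((row[a] - means[a]) * (row[b] - means[b])
                      for row in window) / (n - 1)
            best = max(best, cov)
    return best

def rolling_indicator(series, width):
    """Indicator value for every full window of `width` time steps."""
    return [max_covariance(series[t - width:t])
            for t in range(width, len(series) + 1)]

# Toy series whose fluctuations grow over time.
series = [[0, 0], [1, 0], [0, 0], [3, 1], [-3, -1]]
indicator = rolling_indicator(series, width=3)
```

A sustained rise in `indicator` is the kind of signal the abstract proposes as an early warning. How wide the window should be and how a rise is tested for significance are choices this sketch leaves open.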
This article introduces a special issue of Complexity dedicated to the increasingly important element of complexity science that engages with social policy. We introduce and frame an emerging research agenda that seeks to enhance social policy by working at the interface between the social sciences and the physical sciences (including mathematics and computer science), and term this research area the “social science interface” by analogy with research at the life sciences interface. We locate and exemplify the contribution of complexity science at this new interface before summarizing the contributions collected in this special issue and identifying some common themes that run through them.
Complexity at the social science interface Nigel Gilbert and Seth Bullock
In one important way, the recipient of a heart transplant ignores its new organ: Its nervous system usually doesn’t rewire to communicate with it. The 40,000 neurons controlling a heart operate so perfectly, and are so self-contained, that a heart can be cut out of one body, placed into another, and continue to function perfectly, even in the absence of external control, for a decade or more. This seems necessary: The parts of our nervous system managing our most essential functions behave like a Swiss watch, precisely timed and impervious to perturbations. Chaotic behavior has been throttled out.
Or has it? Two simple pendulums that swing with perfect regularity can, when yoked together, move in a chaotic trajectory. Given that the billions of neurons in our brain are each like a pendulum, oscillating back and forth between resting and firing, and connected to 10,000 other neurons, isn’t chaos in our nervous system unavoidable?
Oscillating diurnal rhythms of gene transcription, metabolic activity, and behavior are found in all three domains of life. However, diel cycles in naturally occurring heterotrophic bacteria and archaea have rarely been observed. Here, we report time-resolved whole-genome transcriptome profiles of multiple, naturally occurring oceanic bacterial populations sampled in situ over 3 days. As anticipated, the cyanobacterial transcriptome exhibited pronounced diel periodicity. Unexpectedly, several different heterotrophic bacterioplankton groups also displayed diel cycling in many of their gene transcripts. Furthermore, diel oscillations in different heterotrophic bacterial groups suggested population-specific timing of peak transcript expression in a variety of metabolic gene suites. These staggered multispecies waves of diel gene transcription may influence both the tempo and the mode of matter and energy transformation in the sea.
Multispecies diel transcriptional oscillations in open ocean heterotrophic bacterial assemblages Elizabeth A. Ottesen, et al.
The spatial dissemination of a directly transmitted infectious disease in a population is driven by population movements from one region to another allowing mixing and importation. Public health policy and planning may thus be more accurate if reliable descriptions of population movements can be considered in the epidemic evaluations. Next to census data, generally available in developed countries, alternative solutions can be found to describe population movements where official data is missing. These include mobility models, such as the radiation model, and the analysis of mobile phone activity records providing individual geo-temporal information. Here we explore to what extent mobility proxies, such as mobile phone data or mobility models, can effectively be used in epidemic models for influenza-like-illnesses and how they compare to official census data. By focusing on three European countries, we find that phone data matches the commuting patterns reported by census well but tends to overestimate the number of commuters, leading to a faster diffusion of simulated epidemics. The order of infection of newly infected locations is however well preserved, whereas the pattern of epidemic invasion is captured with higher accuracy by the radiation model for centrally seeded epidemics and by phone proxy for peripherally seeded epidemics.
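For readers unfamiliar with the radiation model mentioned above, here is a minimal Python sketch of its commonly stated form, in which the expected flux from location i to location j depends on the origin population m_i, the destination population n_j, and the population s_ij living within distance r_ij of the origin (excluding both endpoints). The function names and the `(x, y, population)` data layout are assumptions for illustration; the abstract does not say how the authors implemented the model.

```python
from math import hypot

def intervening_population(locations, i, j):
    """Population within distance r_ij of location i, excluding i and j.

    `locations` is a list of (x, y, population) tuples.
    """
    xi, yi, _ = locations[i]
    xj, yj, _ = locations[j]
    r = hypot(xj - xi, yj - yi)
    return sum(pop for k, (x, y, pop) in enumerate(locations)
               if k not in (i, j) and hypot(x - xi, y - yi) <= r)

def radiation_flux(commuters_i, m_i, n_j, s_ij):
    """Expected commuters from i to j, given the total leaving i."""
    return commuters_i * (m_i * n_j) / ((m_i + s_ij) * (m_i + n_j + s_ij))

# Four toy locations on a line: (x, y, population).
locations = [(0, 0, 10), (2, 0, 10), (1, 0, 5), (5, 0, 7)]
s_01 = intervening_population(locations, 0, 1)
flux_01 = radiation_flux(100, locations[0][2], locations[1][2], s_01)
```

The model has no free parameters beyond the populations and the number of commuters leaving each origin, which is why it can serve as a mobility proxy where commuting data are missing.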
Group-level cognitive states are widely observed in human social systems, but their discussion is often ruled out a priori in quantitative approaches. In this paper, we show how reference to the irreducible mental states and psychological dynamics of a group is necessary to make sense of large scale social phenomena. We introduce the problem of mental boundaries by reference to a classic problem in the evolution of cooperation. We then provide an explicit quantitative example drawn from ongoing work on cooperation and conflict among Wikipedia editors. We show the limitations of methodological individualism, and the substantial benefits that come from being able to refer to collective intentions and attributions of cognitive states of the form "what the group believes" and "what the group values".
Mitigating traffic congestion on urban roads, with paramount importance in urban development and reduction of energy consumption and air pollution, depends on our ability to foresee road usage and traffic conditions pertaining to the collective behavior of drivers, raising a significant question: to what degree is road traffic predictable in urban areas? Here we rely on the precise records of daily vehicle mobility based on GPS positioning devices installed in taxis to uncover the potential daily predictability of urban traffic patterns. Using the mapping from the degree of congestion on roads into a time series of symbols and measuring its entropy, we find a relatively high daily predictability of traffic conditions despite the absence of any a priori knowledge of drivers' origins and destinations and quite different travel patterns between weekdays and weekends. Moreover, we find a counterintuitive dependence of the predictability on travel speed: the road segment associated with intermediate average travel speed is the most difficult to predict. We also explore the possibility of recovering the traffic condition of an inaccessible segment from its adjacent segments with respect to limited observability. The highly predictable traffic patterns, in spite of the heterogeneity of drivers' behaviors and the variability of their origins and destinations, enable the development of accurate predictive models for eventually devising practical strategies to mitigate urban road congestion.
Predictability of road traffic and congestion in urban areas Jingyuan Wang, Yu Mao, Jing Li, Chao Li, Zhang Xiong, Wen-Xu Wang
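The approach in the abstract above, mapping congestion levels to a time series of symbols and measuring its entropy, can be illustrated with a short Python sketch. Two assumptions are made here: congestion is given as a value between 0 and 1, and the entropy is the plain Shannon entropy of symbol frequencies, which ignores the temporal order of the symbols. The abstract does not specify the entropy estimator the authors used, and the names `symbolize` and `shannon_entropy` are hypothetical.

```python
from collections import Counter
from math import log2

def symbolize(congestion, n_levels=4):
    """Map congestion values in [0, 1] to integer symbols 0..n_levels-1."""
    return [min(int(x * n_levels), n_levels - 1) for x in congestion]

def shannon_entropy(symbols):
    """Shannon entropy, in bits, of the symbol frequency distribution."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in counts.values())

# Toy congestion readings for one road segment over a day.
readings = [0.1, 0.2, 0.8, 0.9, 0.85, 0.3, 0.1, 0.15]
entropy_bits = shannon_entropy(symbolize(readings))
```

Lower entropy means the segment spends most of its time in a few congestion states and is, in this crude sense, easier to predict. An estimator that accounts for temporal correlations would generally give a lower entropy than this frequency-based value.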
The exponential growth in world population is feeding a steadily increasing global need for arable farmland, a resource that is already in high demand. This trend has led to increased farming on subprime arid and semi-arid lands, where limited availability of water and a host of environmental stresses often severely reduce crop productivity. The conventional approach to mitigating the abiotic stresses associated with arid climes is to breed for stress-tolerant cultivars, a time- and labor-intensive venture that often neglects the complex ecological context of the soil environment in which the crop is grown. In recent years, studies have attempted to identify microbial symbionts capable of conferring the same stress-tolerance to their plant hosts, and new developments in genomic technologies have greatly facilitated such research. Here, we highlight many of the advantages of these symbiont-based approaches and argue in favor of the broader recognition of crop species as ecological niches for a diverse community of microorganisms that function in concert with their plant hosts and each other to thrive under fluctuating environmental conditions.