Computational creativity is an emerging branch of artificial intelligence that places computers in the center of the creative process. Broadly, creativity involves a generative step to produce many ideas and a selective step to determine the ones that are the best. Many previous attempts at computational creativity, however, have not been able to achieve a valid selective step. This work shows how bringing data sources from the creative domain and from hedonic psychophysics together with big data analytics techniques can overcome this shortcoming to yield a system that can produce novel and high-quality creative artifacts. Our data-driven approach is demonstrated through a computational creativity system for culinary recipes and menus we developed and deployed, which can operate either autonomously or semi-autonomously with human interaction. We also comment on the volume, velocity, variety, and veracity of data in computational creativity.
Abstract: While many recently proposed methods aim to detect network communities in large datasets, such as those generated by social media and telecommunications services, most evaluation (i.e. benchmarking) of this research is based on small, hand-curated datasets. We argue that these two types of networks differ so significantly that, by evaluating algorithms solely on the smaller networks, we know little about how well they perform on the larger datasets. Recent work addresses this problem by introducing social network datasets annotated with meta-data that is believed to approximately indicate a ‘ground truth’ set of network communities. While such efforts are a step in the right direction, we find this meta-data problematic for two reasons. First, in practice, the groups contained in such meta-data may only be a subset of a network's communities. Second, while it is often reasonable to assume that meta-data is related to network communities in some way, we must be cautious about assuming that these groups correspond closely to network communities. Here, we consider these difficulties and propose an evaluation scheme based on a classification task that is tailored to deal with them.
Some communities have agreed to share online — geneticists, for example, post DNA sequences at the GenBank repository, and astronomers are accustomed to accessing images of galaxies and stars from, say, the Sloan Digital Sky Survey, a telescope that has observed some 500 million objects — but these remain the exception, not the rule. Historically, scientists have objected to sharing for many reasons: it is a lot of work; until recently, good databases did not exist; grant funders were not pushing for sharing; it has been difficult to agree on standards for formatting data and the contextual information called metadata; and there is no agreed way to assign credit for data. But the barriers are disappearing, in part because journals and funding agencies worldwide are encouraging scientists to make their data public.
Last week, civil libertarians cried foul when press reports revealed that, in its efforts to ferret out terrorists, the U.S. National Security Agency (NSA) is collecting cell phone records and Internet data from companies such as Verizon, Facebook, and Skype. Some argued that the federal government is spying on its own citizens. From the nature of the data, scientists say it's clear that NSA is performing network analysis, a type of science that aims to identify social groups from the connections among people. And NSA is hardly the only organization doing such work, researchers say. Private companies are already tracing people's social circles.
Network Science at Center of Surveillance Dispute Adrian Cho
In financial markets, participants locally optimize their profit which can result in a globally unstable state leading to a catastrophic change. The largest crash in the past decades is the bankruptcy of Lehman Brothers which was followed by a trust-based crisis between banks due to high-risk trading in complex products. We introduce information dissipation length (IDL) as a leading indicator of global instability of dynamical systems based on the transmission of Shannon information, and apply it to the time series of USD and EUR interest rate swaps (IRS). We find in both markets that the IDL steadily increases toward the bankruptcy, then peaks at the time of bankruptcy, and decreases afterwards. Previously introduced indicators such as ‘critical slowing down’ do not provide a clear leading indicator. Our results suggest that the IDL may be used as an early-warning signal for critical transitions even in the absence of a predictive model.
Information dissipation as an early-warning signal for the Lehman Brothers collapse in financial time series Rick Quax, Drona Kandhai & Peter M. A. Sloot Scientific Reports 3, Article number: 1898 http://dx.doi.org/10.1038/srep01898
We introduce a network-based index analyzing excess scientific production and consumption to perform a comprehensive global analysis of scholarly knowledge production and diffusion on the level of continents, countries, and cities.
Today’s strongly connected, global networks have produced highly interdependent systems that we do not understand and cannot control well. These systems are vulnerable to failure at all scales, posing serious threats to society, even when external shocks are absent. As the complexity and interaction strengths in our networked world increase, man-made systems can become unstable, creating uncontrollable situations even when decision-makers are well-skilled, have all data and technology at their disposal, and do their best. To make these systems manageable, a fundamental redesign is needed. A ‘Global Systems Science’ might create the required knowledge and paradigm shift in thinking.
A few years ago, Hawking was asked what he thought of the common opinion that the twentieth century was that of physics and the twenty-first century would be that of biology. Hawking replied that in his opinion the twenty-first century would be the “century of complexity”. That remark probably holds more useful advice for contemporary students than they realize, since it points to at least two skills that are going to be essential for new college grads in the age of complexity: statistics and data visualization.
We analyze the entire publication database of the American Physical Society generating longitudinal (50 years) citation networks geolocalized at the level of single urban areas. We define the knowledge diffusion proxy, and scientific production ranking algorithms to capture the spatio-temporal dynamics of Physics knowledge worldwide. By using the knowledge diffusion proxy we identify the key cities in the production and consumption of knowledge in Physics as a function of time. The results from the scientific production ranking algorithm allow us to characterize the top cities for scholarly research in Physics. Although we focus on a single dataset concerning a specific field, the methodology presented here opens the path to comparative studies of the dynamics of knowledge across disciplines and research areas.
Characterizing scientific production and consumption in Physics
We introduce an automated method for the bottom-up reconstruction of the cognitive evolution of science, based on big data drawn from digital libraries and modeled as lineage relationships between scientific fields. We refer to these dynamic structures as phylomemetic networks or phylomemies, by analogy with biological evolution, and we show that they exhibit strong regularities, with clearly identifiable phylomemetic patterns. Some structural properties of the scientific fields, in particular their density, are defined independently of the phylomemy reconstruction yet are clearly correlated with their status and their fate in the phylomemy (such as their age or their short-term survival). Within the framework of a quantitative epistemology, this approach raises the question of the predictability of the evolution of science, and sketches a prototypical life cycle of scientific fields: an increase in their cohesion after their emergence, then the renewal of their conceptual background through branching or merging events, and finally decay when their density becomes too low.
Effective point-of-use devices for providing safe drinking water are urgently needed to reduce the global burden of waterborne disease. Here we show that plant xylem from the sapwood of coniferous trees (a readily available, inexpensive, biodegradable, and disposable material) can remove bacteria from water by simple pressure-driven filtration. Approximately 3 cm³ of sapwood can filter water at the rate of several liters per day, sufficient to meet the clean drinking water needs of one person. The results demonstrate the potential of plant xylem to address the need for pathogen-free drinking water in developing countries and resource-limited settings.
Many complex networks show signs of modular structure, uncovered by community detection. Although many methods succeed in revealing various partitions, it remains difficult to detect at what scale some partition is significant.
Models for the topology or dynamics of various networks abound, but until now, there has been no single universal framework for complex networks that can separate factors contributing to the topology and dynamics of networks.
Recognizing direct relationships between variables connected in a network is a pervasive problem in biological, social and information sciences as correlation-based networks contain numerous indirect relationships. Here we present a general method for inferring direct effects from an observed correlation matrix containing both direct and indirect effects. We formulate the problem as the inverse of network convolution, and introduce an algorithm that removes the combined effect of all indirect paths of arbitrary length in a closed-form solution by exploiting eigen-decomposition and infinite-series sums. We demonstrate the effectiveness of our approach in several network applications: distinguishing direct targets in gene expression regulatory networks; recognizing directly interacting amino-acid residues for protein structure prediction from sequence alignments; and distinguishing strong collaborations in co-authorship social networks using connectivity information alone. In addition to its theoretical impact as a foundational graph theoretic tool, our results suggest network deconvolution is widely applicable for computing direct dependencies in network science across diverse disciplines.
Network deconvolution as a general method to distinguish direct dependencies in networks Soheil Feizi, Daniel Marbach, Muriel Médard & Manolis Kellis
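The closed-form step described in this abstract can be illustrated with a minimal sketch. It assumes the observed matrix is symmetric and is modeled as the sum of all powers of the direct matrix, G_obs = G_dir + G_dir^2 + ..., so that each eigenvalue maps as lambda_dir = lambda_obs / (1 + lambda_obs). The function name `deconvolve` and the toy three-node chain are hypothetical, and any rescaling or thresholding of real data is omitted; this is not the authors' implementation.

```python
import numpy as np

def deconvolve(g_obs):
    """Invert G_obs = G_dir + G_dir^2 + ... for a symmetric G_obs.

    Summing the series gives G_obs = G_dir (I - G_dir)^-1, so in the
    eigenbasis each eigenvalue maps as lam_dir = lam_obs / (1 + lam_obs).
    """
    g_obs = (g_obs + g_obs.T) / 2.0          # guard against round-off asymmetry
    eigvals, eigvecs = np.linalg.eigh(g_obs)
    dir_vals = eigvals / (1.0 + eigvals)
    return (eigvecs * dir_vals) @ eigvecs.T  # U diag(dir_vals) U^T

# Toy example (assumed data): a chain 0 - 1 - 2 with no direct 0 - 2 link.
g_dir = np.array([[0.0, 0.4, 0.0],
                  [0.4, 0.0, 0.4],
                  [0.0, 0.4, 0.0]])
# Observed matrix: direct effects plus all indirect paths of any length.
g_obs = g_dir @ np.linalg.inv(np.eye(3) - g_dir)
recovered = deconvolve(g_obs)
```

In this toy case `g_obs[0, 2]` is positive even though nodes 0 and 2 are not directly linked, and `recovered` matches `g_dir` up to floating-point error, which is the sense in which indirect effects are removed.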
Food webs are networks of feeding interactions among species. Although parasites comprise a large proportion of species diversity, they have generally been underrepresented in food web data and analyses. Previous analyses of the few datasets that contain parasites have indicated that their inclusion alters network structure. However, it is unclear whether those alterations were a result of unique roles that parasites play, or resulted from the changes in diversity and complexity that would happen when any type of species is added to a food web. In this study, we analyzed many aspects of the network structure of seven highly resolved coastal estuary or marine food webs with parasites. In most cases, we found that including parasites in the analysis results in generic changes to food web structure that would be expected with increased diversity and complexity. However, in terms of specific patterns of links in the food web (“motifs”) and the breadth and contiguity of feeding niches, parasites do appear to alter structure in ways that result from unique traits—in particular, their close physical intimacy with their hosts, their complex life cycles, and their small body sizes. Thus, this study disentangles unique from generic effects of parasites on food web organization, providing better understanding of similarities and differences between parasites and free-living species in their roles as consumers and resources.
Dunne JA, Lafferty KD, Dobson AP, Hechinger RF, Kuris AM, et al. (2013) Parasites Affect Food Web Structure Primarily through Increased Diversity and Complexity. PLoS Biol 11(6): e1001579. http://dx.doi.org/10.1371/journal.pbio.1001579
Biologists are joining the big-data club. With the advent of high-throughput genomics, life scientists are starting to grapple with massive data sets, encountering challenges with handling, processing and moving information that were once the domain of astronomers and high-energy physicists.
Nationwide hackathon this weekend encourages coders to use publicly available data to tackle problems ranging from poverty to poultry handling.
Eric L Berlow's insight:
For Intel, the hacking event is the latest in a series of initiatives designed to draw attention to big data's impact on society. WeTheData, a research program the company conducted last year in collaboration with Vibrant Data Labs, a collective of scientists, artists, and designers, posed technical challenges that focused on the democratization of digital information.
The WeTheData research made it clear that data literacy is a fundamental principle that must be in place for a new "data society" to emerge, Barnett noted. Events like the National Day of Civic Hacking are designed to move that process forward by making hard-to-get (and often inaccessible) data sets available to everyone.
Consensus in multi-agent systems can be used efficiently for large-scale optimization problems. The connectivity structure of the consensus network affects convergence to the optimum solution, with random structures performing better than heterogeneous networks.
Large-scale global optimization through consensus of opinions over complex networks Omid Askari Sichani and Mahdi Jalili
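To show how network structure enters a consensus process, here is a minimal sketch of plain averaging consensus, the primitive that consensus-based optimizers build on. It is not the optimization algorithm of the paper: the update rule, the four-agent ring, and the names `consensus_step`, `ring`, and `opinions` are assumptions chosen for illustration.

```python
import numpy as np

def consensus_step(values, adjacency, step_size):
    """One round: each agent moves toward the values held by its neighbours.

    Equivalent to x <- x - step_size * L x, where L is the graph Laplacian.
    """
    degrees = adjacency.sum(axis=1)
    laplacian = np.diag(degrees) - adjacency
    return values - step_size * (laplacian @ values)

# Hypothetical undirected ring of four agents.
ring = np.array([[0, 1, 0, 1],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [1, 0, 1, 0]], dtype=float)
opinions = np.array([1.0, 5.0, 3.0, 7.0])

state = opinions.copy()
for _ in range(50):
    # A step size below 1 / (max degree) keeps the iteration stable here.
    state = consensus_step(state, ring, step_size=0.25)
```

On a connected undirected graph this iteration preserves the sum of the values and drives every agent toward their average; how fast it does so depends on the Laplacian spectrum, which is where the choice of topology matters.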
It is conventional in labor economics to treat all workers who are seeking new jobs as belonging to a labor pool, and all firms that have job vacancies as an employer pool, and then match workers to jobs. Here we develop a new approach to study labor and firm dynamics. By combining the emerging science of networks with newly available employment micro-data, comprehensive at the level of whole countries, we are able to broadly characterize the process through which workers move between firms. Specifically, treating each firm in an economy as a node in a graph, we draw an edge between two firms if a worker has migrated between them, possibly with a spell of unemployment in between. An economy's overall graph of firm-worker interactions is an object we call the labor flow network (LFN). This is the first study that characterizes an LFN for an entire economy. We explore the properties of this network, including its topology, its community structure, and its relationship to economic variables. It is shown that LFNs can be useful in identifying firms with high growth potential. We relate LFNs to other notions of high-performance firms. Specifically, it is shown that fewer than 10% of firms account for nearly 90% of all employment growth. We conclude with a model in which empirically salient LFNs emerge from the interaction of heterogeneous adaptive agents in a decentralized labor market.
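The edge rule described in this abstract can be sketched in a few lines. The input format (a chronological list of employers per worker, with unemployment spells simply left out), the undirected edge weights, and the names `labor_flow_network` and `histories` are assumptions for illustration, not the authors' data model.

```python
from collections import Counter

def labor_flow_network(job_histories):
    """Count worker moves between each pair of firms (undirected edges)."""
    edges = Counter()
    for firms in job_histories.values():
        # Consecutive employers in a worker's history define one move.
        for previous, current in zip(firms, firms[1:]):
            if previous != current:
                edges[frozenset((previous, current))] += 1
    return edges

# Hypothetical micro-data: worker id -> employers in chronological order.
histories = {
    "w1": ["A", "B", "C"],
    "w2": ["B", "A"],
    "w3": ["C"],
}
lfn = labor_flow_network(histories)
```

Here firms A and B are joined by an edge of weight 2, B and C by an edge of weight 1, and the worker who never changes firm contributes no edge.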
A large-scale residential-location model of the Greater London region is being developed in which all stages of the model-building process—from data input, analysis through calibration to prediction—are rapid to execute and accessible in a visual and immediate fashion. The model is structured to distribute trips across competing modes of transport from employment to population locations. It is cast in an entropy-maximising framework which has been extended to measure actual components of energy—travel costs, free energy, and unusable energy (entropy itself)—and these provide indicators for examining future scenarios based on changing the costs of travel in the metro region. Although the model is comparatively static, we interpret its predictions in terms of fast and slow processes—‘fast’ relating to changes in transport modes, and ‘slow’ relating to changes in location. After developing and explaining the model using appropriate visual analytics, a scenario in which road-travel costs double is tested: this shows that mode switching is considerably more significant than shifts in location—which are minimal.
Batty M, 2013, "Visually-Driven Urban Simulation: exploring fast and slow change in residential location" Environment and Planning A 45(3) 532 – 552
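As a rough illustration of the kind of entropy-maximising model described above, the sketch below uses the textbook origin-constrained form with a mode dimension: trips from each employment zone are shared across destinations and modes in proportion to attraction times exp(-beta * cost). This is an assumed standard formulation, not the paper's calibrated model, and the zone sizes, costs, `beta`, and function name are all hypothetical.

```python
import numpy as np

def distribute_trips(employment, attraction, cost, beta):
    """Origin-constrained entropy-maximising trip distribution with modes.

    employment: (n_origins,) trips leaving each employment zone
    attraction: (n_dests,) attractiveness of each residential zone
    cost:       (n_modes, n_origins, n_dests) travel costs
    """
    weights = attraction[None, None, :] * np.exp(-beta * cost)
    per_origin = weights.sum(axis=(0, 2), keepdims=True)
    return employment[None, :, None] * weights / per_origin

employment = np.array([100.0, 50.0])
attraction = np.array([30.0, 70.0])
cost = np.array([
    [[1.0, 2.0], [2.0, 1.0]],   # mode 0: road (assumed costs)
    [[1.5, 2.5], [2.5, 1.5]],   # mode 1: rail (assumed costs)
])

base = distribute_trips(employment, attraction, cost, beta=0.5)

# Scenario: road-travel costs double, everything else unchanged.
doubled = cost.copy()
doubled[0] *= 2.0
scenario = distribute_trips(employment, attraction, doubled, beta=0.5)

road_share_base = base[0].sum() / base.sum()
road_share_scenario = scenario[0].sum() / scenario.sum()
```

In both runs the trips leaving each origin still sum to its employment total, and the road share falls in the doubled-cost scenario. Comparing mode shares and destination shares in this way is one simple reading of the fast (mode) versus slow (location) distinction.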
Physics, and physicists, have had much to contribute to economics and finance. Now the science of complex networks offers a way forward for understanding and managing the complex financial networks of the world's markets.