Data Science
Curating data science
Curated by Hiquda
Rescooped by Hiquda from Data hacking

Beaker Notebook - The data scientist's lab notebook

Beaker is a code notebook that allows you to analyze, visualize, and document data using multiple programming languages. Beaker's plugin-based polyglot architecture enables you to seamlessly switch between languages in your documents and add support for your favorite languages that we've missed.


Via Claudia Mihai
Rescooped by Hiquda from Data visualization

A Striking New Way to Visualize Mobility

This tool shows how far you can travel in 10 minutes in ghostly splotches and tendrils.

For quick info on routes and travel times, there's always Google Maps. But for a traveler wanting more of a beauteous, immersive experience, try Isoscope, a mapping tool that plots possible journeys in what looks like glowing-blue ectoplasm.

Isoscope is not a service one would use to get from point A to B. It's more of a way to explore mobility in areas where travel conditions change hour by hour. First, set the map to zoom in on any place in the world and click it to site your imaginary traveler. A ghostly, translucent presence will then form, all splotches and tendrils. This cerulean shape is actually 24 different layers representing all the hours of the day. Track the mouse over the hours at the map's bottom, and the shape's outline will expand and contract to show how far you can get in a preselected two-to-ten minute car trip.


Via Claudia Mihai
Scooped by Hiquda

The Digital Observatory for Protected Areas | DOPA

Supporting GEO’s Biodiversity Observation Network (GEO-BON), the Digital Observatory for Protected Areas (DOPA) is conceived as a set of distributed Critical Biodiversity Informatics Infrastructures (databases, web modeling services, broadcasting services, ...) combined with interoperable web services. It gives a wide variety of end users, including park managers, decision-makers and researchers, the means to assess, monitor and possibly forecast the state of, and pressure on, protected areas at the global scale.

DOPA has three main objectives:

1) Provide best available material (data, indicators, models) agreed on by contributing institutions which can serve for establishing baselines for research & reporting;

2) Provide free web-based tools (databases, portals, modeling services) designed to generate the best available material, but also to support research, decision making and capacity-building activities for conservation;

3) Provide an interoperable and, as much as possible, open source framework to allow institutions to get their own means to assess, monitor and forecast the state and pressure of protected areas and help these to further engage with the organizations hosting critical biodiversity informatics infrastructures.

Developed collaboratively by major institutions active in the field of biodiversity conservation (UNEP-WCMC, BirdLife International, GBIF, IUCN, ...), DOPA is designed to encourage a multi-scale, cross-disciplinary approach to biodiversity without exposing users to the excessive risks that come from mixing data from undocumented sources and/or with undocumented uncertainties.

Data: DOPA Explorer Beta is a first web-based assessment tool in which information on 9,000 protected areas, covering almost 90% of the global protected surface, has been processed automatically to generate a set of indicators on ecosystems, climate, phenology, species, ecosystem services and pressures. DOPA Explorer can thus help identify the protected areas with the most unique ecosystems and the species with the lowest level of protection, and assess the pressures they are exposed to because of human development.

Access DOPA Explorer (optimized for Firefox) at: http://ehabitat-wps.jrc.ec.europa.eu/dopa_explorer/

Scooped by Hiquda

Project Tycho - Data for health

After four years of data digitization and processing, the Project Tycho™ Web site provides open access to newly digitized and integrated data from the entire 125-year history of United States weekly nationally notifiable disease surveillance, dating back to 1888. These data can now be used by scientists, decision makers, investors, and the general public for any purpose. The aim of Project Tycho™ is to advance the availability and use of public health data for science and decision making in public health, leading to better programs and more efficient control of diseases.


The Project Tycho™ data are organized as counts. A count is defined as the number of cases or deaths due to a disease in a specific location and time period. A count is equivalent to a data point. During the 125-year period of weekly disease reporting, the types of reports have changed regularly, leading to different types of data counts across time. This makes the integration and standardization of these data a complex task. Currently, available data are categorized in three levels based on the type of counts included. Level 1 data include different types of counts that have been standardized into a common format for a specific analysis published recently in the NEJM. Level 2 data include only counts that have been reported in a common format, e.g. diseases reported for a one-week period and without disease subcategories. These data can be used immediately for analysis and cover a wide range of diseases and locations, but this level does not include data that have not yet been standardized. Level 3 data include all the different types of counts ever reported. Although this is the most complete data set, the large number of different counts requires extensive standardization and various judgment calls before the data can be used for analysis.
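
To make the definition of a count concrete, here is a minimal Python sketch of one way such a record could be represented, with a filter for the Level 2 style criteria described above (a one-week period, no disease subcategory). The `Count` class, its field names, the `is_level2_style` helper and all sample values are hypothetical illustrations, not Project Tycho's actual schema or data, and the sketch assumes a week is stored as an inclusive seven-day date range.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Count:
    """One data point: cases or deaths for a disease, place and period."""
    disease: str
    location: str
    start: date                 # first day of the reporting period
    end: date                   # last day of the reporting period (inclusive)
    event: str                  # "cases" or "deaths"
    number: int
    subcategory: Optional[str] = None


def is_level2_style(count: Count) -> bool:
    """True if the count spans exactly one week and has no subcategory."""
    one_week = (count.end - count.start).days == 6   # inclusive 7-day span
    return one_week and count.subcategory is None


# Made-up sample records, for illustration only.
counts = [
    Count("MEASLES", "NY", date(1930, 1, 5), date(1930, 1, 11), "cases", 120),
    Count("MEASLES", "NY", date(1930, 1, 1), date(1930, 1, 31), "cases", 500),
    Count("TYPHOID FEVER", "PA", date(1930, 1, 5), date(1930, 1, 11),
          "deaths", 3, subcategory="paratyphoid"),
]

level2 = [c for c in counts if is_level2_style(c)]
print(level2)
```

Only the first record passes the filter: the second covers a whole month and the third carries a subcategory, which is the kind of heterogeneity that makes the complete Level 3 data harder to use directly.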

Scooped by Hiquda

JASPAR 2014: an extensively expanded and updated open-access database of transcription factor binding profiles

JASPAR (http://jaspar.genereg.net) is the largest open-access database of matrix-based nucleotide profiles describing the binding preference of transcription factors from multiple species. The fifth major release greatly expands the heart of JASPAR—the JASPAR CORE subcollection, which contains curated, non-redundant profiles—with 135 new curated profiles (74 in vertebrates, 8 in Drosophila melanogaster, 10 in Caenorhabditis elegans and 43 in Arabidopsis thaliana; a 30% increase in total) and 43 older updated profiles (36 in vertebrates, 3 in D. melanogaster and 4 in A. thaliana; a 9% update in total). The new and updated profiles are mainly derived from published chromatin immunoprecipitation-seq experimental datasets. In addition, the web interface has been enhanced with advanced capabilities in browsing, searching and subsetting. Finally, the new JASPAR release is accompanied by a new BioPython package, a new R tool package and a new R/Bioconductor data package to facilitate access for both manual and automated methods.
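
For readers unfamiliar with matrix-based profiles, the following Python sketch shows the general idea: a position frequency matrix records how often each nucleotide was observed at each position of a binding site, and a common way to use it is to convert it to log-odds weights and score candidate sites. The matrix below is a toy example, not a real JASPAR profile, and the function names, pseudocount and uniform background are assumptions chosen for illustration. This is not the code of the JASPAR tools or of the packages mentioned in the abstract.

```python
import math

BASES = "ACGT"

# Toy position frequency matrix (NOT a real JASPAR profile):
# observed counts of each base at four positions of a binding site.
pfm = {
    "A": [8, 0, 1, 0],
    "C": [1, 0, 0, 9],
    "G": [0, 10, 0, 1],
    "T": [1, 0, 9, 0],
}


def to_pwm(pfm, pseudocount=0.5, background=0.25):
    """Convert counts to log2-odds weights against a uniform background."""
    width = len(pfm["A"])
    pwm = {base: [] for base in BASES}
    for i in range(width):
        total = sum(pfm[b][i] for b in BASES) + 4 * pseudocount
        for b in BASES:
            probability = (pfm[b][i] + pseudocount) / total
            pwm[b].append(math.log2(probability / background))
    return pwm


def score(pwm, site):
    """Sum the per-position weights for a site as wide as the matrix."""
    return sum(pwm[base][i] for i, base in enumerate(site))


def best_hit(pwm, sequence):
    """Return (score, offset) of the highest-scoring window in sequence."""
    width = len(pwm["A"])
    windows = [
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(windows)


pwm = to_pwm(pfm)
top_score, offset = best_hit(pwm, "TTAGTCAA")
print(offset, round(top_score, 2))
```

The pseudocount keeps unobserved bases from producing a logarithm of zero; its value changes the scores but not the basic approach.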

Scooped by Hiquda

Data from: Microbial life in a fjord: metagenomic analysis of a microbial mat in Chilean Patagonia

The current study describes the taxonomic and functional composition of metagenomic sequences obtained from a filamentous microbial mat isolated from the Comau fjord, located in the northernmost part of Chilean Patagonia. The taxonomic composition of the microbial community showed a high proportion of members of the Gammaproteobacteria, including a high number of sequences that were recruited to the genomes of Moritella marina MP-1 and Colwellia psychrerythraea 34H, suggesting the presence of populations related to these two psychrophilic bacterial species. Functional analysis of the community indicated a high proportion of genes coding for the transport and metabolism of amino acids, as well as for energy production. Among the energy production functions, we found protein-coding genes for sulfate and nitrate reduction, both processes associated with Gammaproteobacteria-related sequences. This report provides the first examination of the taxonomic composition and genetic diversity associated with these conspicuous microbial mat communities and provides a framework for future microbial studies in the Comau fjord.

Rescooped by Hiquda from Data hacking

Cytoscape.js: an open-source graph theory library

Cytoscape.js is an open-source graph theory library written in JavaScript. You can use Cytoscape.js for graph analysis and visualisation.

Cytoscape.js allows you to easily display and manipulate rich, interactive graphs. Because Cytoscape.js allows the user to interact with the graph and the library allows the client to hook into user events, Cytoscape.js is easily integrated into your webapp, especially since Cytoscape.js supports both desktop browsers, like Chrome, and mobile browsers, like on the iPad. Cytoscape.js includes all the gestures you would expect out-of-the-box, including pinch-to-zoom, box selection, panning, et cetera.

Cytoscape.js also has graph analysis in mind: The library contains a slew of useful functions in graph theory. You can use Cytoscape.js headlessly on Node.js to do graph analysis in the terminal or on a web server.

Cytoscape.js is an open-source project, and anyone is free to contribute. For more information, refer to the GitHub README.

The library was developed at the Donnelly Centre at the University of Toronto. It is the successor of Cytoscape Web.

 


Via Claudia Mihai
Rescooped by Hiquda from Research Tools

Computer system automatically solves word problems

Researchers in MIT's Computer Science and Artificial Intelligence Laboratory, working with colleagues at the University of Washington, have developed a new computer system that can automatically solve the type of word problems common in introductory algebra classes.

In the near term, the work could lead to educational tools that identify errors in students' reasoning or evaluate the difficulty of word problems. But it may also point toward systems that can solve more complicated problems in geometry, physics, and finance—problems whose solutions don't appear in the back of the teacher's edition of a textbook.


Via Alin Velea
Scooped by Hiquda

Independent Evolution of Leaf and Root Traits within and among Temperate Grassland Plant Communities

In this study, we used data from temperate grassland plant communities in Alberta, Canada to test two longstanding hypotheses in ecology: 1) that there has been correlated evolution of the leaves and roots of plants due to selection for an integrated whole-plant resource uptake strategy, and 2) that trait diversity in ecological communities is generated by adaptations to the conditions in different habitats. We tested the first hypothesis using phylogenetic comparative methods to look for evidence of correlated evolution of suites of leaf and root functional traits in these grasslands. There were consistent evolutionary correlations among traits related to plant resource uptake strategies within leaf tissues, and within root tissues. In contrast, there were inconsistent correlations between the traits of leaves and the traits of roots, suggesting different evolutionary pressures on the above- and belowground components of plant morphology. To test the second hypothesis, we evaluated the relative importance of two components of trait diversity: within-community variation (species trait values relative to co-occurring species; α traits) and among-community variation (the average trait value in communities where species occur; β traits). Trait diversity was mostly explained by variation among co-occurring species, not by variation among communities. Additionally, there was a phylogenetic signal in the within-community trait values of species relative to co-occurring taxa, but not in their habitat associations or among-community trait variation. These results suggest that sorting of pre-existing trait variation into local communities can explain the leaf and root trait diversity in these grasslands.
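
The split of a species' trait value into α and β components can be sketched in a few lines of Python. This is a simplified, unweighted reading of the definitions given in the abstract (β as the average community trait value over the plots where a species occurs, α as the species' own value relative to that), and the study's actual method may differ, for example by weighting species by abundance. The species names, plot names, trait values and the `alpha_beta` function are all hypothetical.

```python
from statistics import mean

# Hypothetical trait values (e.g. a leaf trait) and plot membership.
traits = {"sp_a": 10.0, "sp_b": 20.0, "sp_c": 30.0}
plots = {
    "plot_1": ["sp_a", "sp_b"],
    "plot_2": ["sp_b", "sp_c"],
}


def alpha_beta(traits, plots):
    """Return {species: (alpha, beta)} using unweighted plot means."""
    # Mean trait value of the species present in each plot.
    plot_means = {
        plot: mean(traits[s] for s in members)
        for plot, members in plots.items()
    }
    result = {}
    for species, value in traits.items():
        occupied = [
            plot_means[plot]
            for plot, members in plots.items()
            if species in members
        ]
        if not occupied:        # species recorded in no plot: skip it
            continue
        beta = mean(occupied)   # among-community component
        alpha = value - beta    # within-community component
        result[species] = (alpha, beta)
    return result


components = alpha_beta(traits, plots)
print(components)
```

With these made-up numbers, `sp_a` sits below the average of the plot it occupies (negative α) while `sp_c` sits above (positive α), so the two components together recover each species' original trait value.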


Data source: http://figshare.com/articles/Alberta_grassland_plant_data/861980

Scooped by Hiquda

Integrated Cancer Drug Discovery Platform

canSAR is a free, public, cancer-focused knowledgebase.
It brings together biological, chemical, pharmacological and disease data, distills them and makes them accessible to cancer research scientists from all disciplines to support translational research and drug discovery.

Scooped by Hiquda

Data from: Size and accumulation of fuel reserves at stopover predict nocturnal restlessness in a migratory bird

An Excel sheet containing data collected on wild-caught migrating northern wheatears on Helgoland, Germany. Zugunruhe (nocturnal migratory restlessness) is expressed as the number of 15-min periods in a 10-hr night during which a bird showed at least five activity counts.
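
The Zugunruhe measure described here is simple to compute: a 10-hour night holds 40 periods of 15 minutes, and the score is the number of those periods with at least five activity counts. The Python sketch below assumes the activity data are available as one integer per 15-minute period; the function name and the sample night are made up for illustration and do not come from the data set.

```python
PERIODS_PER_NIGHT = 40   # 10 hours x 4 periods of 15 minutes each


def zugunruhe_score(activity_per_period, threshold=5):
    """Count the 15-min periods with at least `threshold` activity counts."""
    if len(activity_per_period) != PERIODS_PER_NIGHT:
        raise ValueError("expected one value per 15-min period (40 in total)")
    return sum(1 for count in activity_per_period if count >= threshold)


# Made-up night: 30 quiet periods followed by 10 periods of mixed activity.
night = [0] * 30 + [5, 12, 4, 7, 0, 0, 9, 3, 6, 1]
print(zugunruhe_score(night))
```

Because the threshold is inclusive, a period with exactly five activity counts is scored as restless, matching the "at least five" wording above.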
