The Big Data open source tools landscape is growing rapidly. Check it out here.
Many open source tools for Big Data already exist. The figure below shows the most important ones. In the near future we will describe each tool in more detail. For now, you can click the logos to open the respective websites.
As part of the “Open Government Data Switzerland” project, the Swiss Federal Archives and their project partners are operating a central pilot portal providing access to open data from the Swiss authorities (“open government data” or OGD). The pilot portal was launched on 16 September 2013 and is expected to remain online until the end of 2014. The authorities involved in the project are supplying some of their already accessible data for use in the pilot portal. These include a wide variety of data records, such as Swiss municipal boundaries, population statistics, up-to-date camera images of weather in Switzerland, historical documents and a directory of Swiss literature.
Planning for Big Data is a new book that helps you understand what big data is, why it matters, and where to get started.
" thanks to an open source project called Hadoop, commodity Linux hardware and cloud computing, this power is in reach for everyone. A data revolution is sweeping business, government and science, with consequences as far reaching and long lasting as the web itself. "
Our aim with Strata is to help you understand what big data is, why it matters, and where to get started. In the wake of the recent conference, we’re delighted to announce the publication of our “Planning for Big Data” book. Available as a free download, the book contains the best insights from O’Reilly Radar authors over the past three months, including myself, Alistair Croll, Julie Steele and Mike Loukides.
JeDEM - eJournal of eDemocracy and Open Government
Vol 5, No 2 (2013)
This issue presents the best papers from CeDEM13 (Conference for E-Democracy and Open Government 2013), namely those that received the highest scores from the reviewers. The papers in this issue have been extended and/or updated.
Scientific Research Papers
- Online parliamentary election campaigns in Scotland: a decade of research Graeme Baxter, Rita Marcella 107-127
- The Five Stars Movement in the Italian Political Scenario. A Case for Cybercratic Centralism? Rosanna De Rosa 128-140
The emergence of new organizational designs based on the metaphors of open, living systems will become a powerful, corrective force. Permeability and increased stakeholder engagement are part of something very big that’s just over the horizon. You might even say it is the next phase in the evolution of democracy; one that will extend its reach beyond the public realm and into the private sector. When it comes, it will be good for society, good for the planet and good for business.
Today we’re introducing a pilot project we’re calling Twitter Data Grants, through which we’ll give a handful of research institutions access to our public and historical data.
With more than 500 million Tweets a day, Twitter has an expansive set of data from which we can glean insights and learn about a variety of topics, from health-related information such as when and where the flu may hit to global events like ringing in the new year. To date, it has been challenging for researchers outside the company who are tackling big questions to collaborate with us to access our public, historical data. Our Data Grants program aims to change that by connecting research institutions and academics with the data they need.
If you’d like to participate, submit a proposal here no later than March 15th. For this initial pilot, we’ll select a small number of proposals to receive free datasets. We can do this thanks to Gnip, one of our certified data reseller partners. They are working with us to give selected institutions free and easy access to Twitter datasets. In addition to the data, we will also be offering opportunities for the selected institutions to collaborate with Twitter engineers and researchers.
A conversation about data-driven innovation is possible now because new technologies have made it easier and cheaper to collect, store, analyze, use, and disseminate data. But while the potential for vastly more data-driven innovation exists, many organizations have been slow to adopt these technologies. Policymakers around the world should do more to spur data-driven innovation in both the public and private sectors, including by supporting the development of human capital, encouraging the advancement of innovative technology, and promoting the availability of data itself for use and reuse.
Big-data researchers have the option to stop doing their research once they have the right result. In options language: The researcher gets the “upside” and truth gets the “downside.” It makes him antifragile, that is, capable of benefiting from complexity and uncertainty — and at the expense of others.
But beyond that, big data means anyone can find fake statistical relationships, since the spurious rises to the surface. This is because in large data sets, large deviations are vastly more attributable to variance (or noise) than to information (or signal). It’s a property of sampling: In real life there is no cherry-picking, but on the researcher’s computer, there is. Large deviations are likely to be bogus.
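The point about spurious relationships can be illustrated with a small simulation. This is a minimal sketch, not taken from the quoted article: the function names, the sample size of 30 and the seed are assumptions chosen for illustration. Every series is pure noise, so any correlation found is bogus by construction, yet the strongest one gets larger as more candidate variables are searched.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def max_spurious_correlation(n_variables, n_samples=30, seed=0):
    """Strongest |correlation| between a noise target and n noise variables.

    All series are independent Gaussian noise (an assumption of this sketch),
    so there is no real relationship to find.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_variables):
        candidate = [rng.gauss(0, 1) for _ in range(n_samples)]
        best = max(best, abs(pearson(target, candidate)))
    return best


if __name__ == "__main__":
    # Searching more variables can only raise the best "finding".
    for n in (10, 100, 1000):
        print(n, round(max_spurious_correlation(n), 3))
```

Because the same seed is reused, each larger search contains the smaller one, so the reported maximum never decreases as the number of variables grows. That is the cherry-picking effect in miniature.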
Graph databases store data in graph structures: nodes (entities), edges (relationships between pairs of nodes) and properties attached to either. They provide index-free adjacency, meaning that every element holds a direct link to its neighbouring elements, so no index lookups are needed to follow a relationship. For highly associative data sets, graph databases are often faster than relational databases, because traversals follow these direct links instead of performing join operations, which also helps them handle large data sets.
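Index-free adjacency can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not the storage format or API of any particular graph database: the `Node` and `Edge` classes, the `KNOWS` label and the helper functions are hypothetical names. Each node keeps direct references to its outgoing edges, so a traversal follows object links and never consults a global index or performs a join.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    name: str
    properties: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # outgoing Edge objects


@dataclass(eq=False)
class Edge:
    label: str
    target: Node
    properties: dict = field(default_factory=dict)


def connect(source, label, target, **properties):
    """Attach a directed, labelled edge directly to the source node."""
    source.edges.append(Edge(label, target, properties))


def neighbours(node, label):
    """Follow the node's own edge list; no index lookup is involved."""
    return [edge.target for edge in node.edges if edge.label == label]


def friends_of_friends(person):
    """Two-hop traversal, the kind of query that needs joins in SQL."""
    result = []
    for friend in neighbours(person, "KNOWS"):
        for candidate in neighbours(friend, "KNOWS"):
            if candidate is not person and candidate not in result:
                result.append(candidate)
    return result


alice, bob, carol = Node("alice"), Node("bob"), Node("carol")
connect(alice, "KNOWS", bob, since=2012)
connect(bob, "KNOWS", alice, since=2012)
connect(bob, "KNOWS", carol, since=2013)

print([p.name for p in friends_of_friends(alice)])
```

The cost of `friends_of_friends` depends only on the edges attached to the nodes visited, not on how many other nodes exist, which is the property the paragraph above attributes to graph databases.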
Serendipity is a faceted search engine based on Semantic Web technologies. An important feature, Serendipity POIs (Points of Interest), allows users to visualize OCW repositories from a dataset based on Linked Data technologies.
Serendipity is sponsored by the research group GICAC from the Universidad Politécnica de Madrid (GICAC-UPM) and the Universidad Técnica Particular de Loja (UTPL) in collaboration with the OCW Institutions. This project aims to improve the searchability and discoverability of open educational content, which will enhance the ability for learners and educators to find and use OCW courses.
INTRODUCING STREAMTOOLS: A GRAPHICAL TOOL FOR WORKING WITH STREAMS OF DATA
New and open source from the New York Times R&D Lab.
We see a moment coming when the collection of endless streams of data is commonplace. As this transition accelerates it is becoming increasingly apparent that our existing toolset for dealing with streams of data is lacking. Over the last 20 years we have invested heavily in tools that deal with tabulated data, from Excel, MySQL, and MATLAB to Hadoop, R, and Python+Numpy. These tools, when faced with a stream of never-ending data, fall short and diminish our creative potential.
In response to this shortfall we have created streamtools—a new, open source project by the New York Times R&D Lab which provides a general purpose, graphical tool for dealing with streams of data. It offers a vocabulary of operations that can be connected together to create live data processing systems without the need for programming or complicated infrastructure. These systems are assembled using a visual interface that affords both immediate understanding and live manipulation of the system.
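The idea of connecting a vocabulary of operations into a live processing system can be pictured with plain Python generators. This is only a conceptual sketch: streamtools itself is a graphical tool, and the block names below (`filter_block`, `map_block`, `running_count`) are hypothetical and are not part of its interface. Each block consumes one item at a time, so the pipeline works on a stream that never ends.

```python
from itertools import count, islice


def filter_block(stream, predicate):
    """Pass along only the items for which predicate(item) is true."""
    return (item for item in stream if predicate(item))


def map_block(stream, fn):
    """Transform each item as it flows through."""
    return (fn(item) for item in stream)


def running_count(stream):
    """Emit the number of items seen so far, once per incoming item."""
    seen = 0
    for _ in stream:
        seen += 1
        yield seen


# An endless source: 0, 1, 2, ... (stands in for a live data feed).
numbers = count()

# Connect the blocks: keep even numbers, then square them.
pipeline = map_block(filter_block(numbers, lambda n: n % 2 == 0),
                     lambda n: n * n)

# Nothing is computed until items are pulled, so taking three is safe.
first_three = list(islice(pipeline, 3))
print(first_three)  # [0, 4, 16]
```

Unlike a tabular tool, nothing here ever loads the whole data set: each block holds at most one item plus whatever small state it needs, such as the counter in `running_count`.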
UNDP’s new strategic knowledge management (KM) framework operationalizes the KM objectives of the Strategic Plan 2014-2017, drawing from lessons from the last Knowledge Strategy 2009-2011 implementation as well as feedback from staff, clients and formal evaluations. Within this strategy UNDP will focus its KM work on organizational learning on what does and does not work in UNDP’s areas of development work, collecting, analysing and using evidence and lessons from a global and country perspective, and from external and internal experience. UNDP’s KM thereby covers both external KM for and with partners and clients as well as internal KM to support the organization’s operational efficiency.
The KM framework builds upon past KM successes (such as Communities of Practice, the Teamworks platform, public knowledge mobilization like the Rio+20 Dialogues, MY World, the Post-2015 Consultations and the Civil 20 Dialogues, as well as numerous regional initiatives). However, it also addresses a range of KM challenges, in particular with respect to organizational learning and knowledge capture (including knowledge products), knowledge networking, measurement and incentives, openness and public engagement, as well as talent management.
Do you work with surveys, demographic information, evaluation data, test scores, or observation data? Are you interested in making the data you collect more useful by organizing it, analyzing it, and applying it in different ways?
This self-paced, online course is intended for anyone who wants to learn more about how to structure, visualize, and manipulate data. This includes students, educators, researchers, journalists, and small business owners.
The course is available from March 18 - April 4, 2014 with support from peers and Google content experts.
DATA - Do you have any idea of your value on social networks? How much do you think your Facebook account is worth? That is the question one might ask after the staggering sums committed by the Web giants. Facebook's acquisition of the messaging service WhatsApp: 16 billion dollars.
With WhatsApp, Facebook will gain a large mass of data on consumer usage around the world. That information also has value, notes IHS Technology. WhatsApp's strong presence in key emerging markets, such as Brazil and India, will give Facebook information about countries where it had not yet developed monetization. For the analysts at Cantor Fitzgerald, "the valuation seems reasonable on a price-per-user basis".
Data brokers don't make it easy to see the data they hold about you. Here's what you can do to opt out.
Data brokers have been around forever, selling mailing lists to companies that send junk mail. But in today’s data-saturated economy, data brokers know more information than ever about us, with sometimes disturbing results.
The first spreadsheet below is a list of data brokers who will give you copies of your data. (You can scroll around inside the box below, and you can also download your own copy of the spreadsheet, here.) The second is the list of data brokers from whom I sought to opt out, with the ones that allowed opt-outs highlighted. (Download that one here.)
Societal institutions at all levels have begun experimenting with innovative practices to collaborate and co-create solutions with citizens. However, we still have a limited conceptual understanding of what opening governance through institutional innovation means and what its impacts may be. Nor is there a systemic effort to capture current practices and what is known on […]
In essence, the Observatory functions as a “knowledge broker,” acting as an intermediary between the increased need for knowledge and the growing body of information on the practice and impact of opening governance. Toward that end, the Observatory seeks to curate information from a wide variety of sources, structure this content in an intelligent manner, and present it in accessible ways.
The GovLab Index
Inspired by Harper’s index, GovLab Research will issue on a regular basis a compilation of statistics that reflect the trends, attitudes, behaviors, and environmental settings related to the Open Government and Government 3.0 movements. Click here to contribute to the new GovLab Index by suggesting additional stats and numerical summaries. View all of our Index posts here.
- The GovLab Index: Designing for Behavior Change - Please find below the latest installment in The GovLab Index series, inspired by the Harper’s Index. “The GovLab Index: Designing for Behavior Change” explores the recent application of psychology and behavioral economics towards solving social issues and shaping public policy and programs. Previous installments include The Networked Public, Measuring Impact with Evidence, Open Data, The […]
- The GovLab Index: The Networked Public, (Updated and Expanded) - Please find below the latest installment in The GovLab Index series, inspired by the Harper’s Index. “The GovLab Index: The Networked Public — January 2014” provides an update on our previous The Networked Public installment, and highlights global trends in internet use, social media, and mobile […]
- The GovLab Index: Open Data, (Updated and Expanded) - Please find below the latest installment in The GovLab Index series, inspired by Harper’s Index. “The GovLab Index: Open Data — December 2013” provides an update on our previous Open Data installment, and highlights global trends in Open Data and the release of public sector information. Previous […]