Public Datasets - Open Data -
Rescooped by luiy from BIG data, Data Mining, Predictive Modeling, Visualization

Bringing Open Source Predictive Modeling to Real-Time Data ...

KijiMR enables developers to build predictive models directly on data collected and accessible in real-time, paving the way for more scalable data-driven applications. As a complete project, Kiji allows developers to build big ...


Via AnalyticsInnovations
Scooped by luiy

Open Data Communities | Open Access to Local Data

luiy's insight:
Open Access to Local Data

This site is the Department for Communities and Local Government's first step towards more open, accessible and re-usable data.

 

It provides a selection of statistics on Local Government finance, housing and homelessness, wellbeing, deprivation, and the department's business plan, as well as supporting geographical data.

 

All of the data is available as fully browsable and queryable Linked Data, and is free to re-use under the Open Government Licence.

Rescooped by luiy from Humanities and their Algorithmic Revolution

Open City - Civic apps built with open data


Via Pierre Levy
Pierre Levy's curator insight, March 17, 2013 1:56 PM

Open City is a group of volunteers who create apps with open data to improve transparency and citizens' understanding of our government.

Scooped by luiy

Creative Commons Announces “School of Open” with Courses to Focus on Digital Openness

luiy's insight:

Just in time to celebrate Open Education Week, here comes a new initiative, the School of Open, a learning environment focused on increasing our understanding of “openness” and the benefits it brings to creativity and education in the digital age.

Developed by the collaborative education platform Peer to Peer University (P2PU) with organizational support from Creative Commons, the School of Open aims to spread understanding of the power of this brave new world through free online classes.

We hear about it all the time: Universal access to research, education and culture—all good things, without a doubt—made possible by things like open source software, open educational resources and the like.

But what are these various communities and what do they mean? How can we all learn more and get involved?

School of Open has rolled the conversation back to square one so that understanding the basics is easy. Through a list of new courses created by users and experts, people can learn more about what “openness” means and how to apply it. There are stand-alone courses on copyright, writing for Wikipedia, the collaborative environment of open science, and the process behind making open video.

These free courses start March 18 (sign up by clicking the “start course” button by Sunday, March 17):

Copyright 4 Educators (US)
Copyright 4 Educators (AUS)
Creative Commons for K-12 Educators
Writing Wikipedia Articles: The Basics and Beyond

These free courses are open for you to take at any time:

Get a CC license. Put it on your website
Open Science: An Introduction
Open data for GLAMs
Intro to Openness in Education
A Look at Open Video
Contributing to Wikimedia Commons
Open Detective

The approach at P2PU encourages people to work together, assess one another’s work, and provide constructive feedback. It’s a great place to learn how to design your own course, because the design process is broken down step-by-step, and course content is vetted by users and P2PU staff. The tutorial shows you how the process works.

Scooped by luiy

Google Research Releases Wikilinks Corpus With 40M Mentions And 3M Entities | TechCrunch

luiy's insight:

Google Research just launched its Wikilinks corpus, a massive new data set for developers and researchers that could make it easier to add smart disambiguation and cross-referencing to their applications. The data could, for example, make it easier to find out if two web sites are talking about the same person or concept, Google says. In total, the corpus features 40 million disambiguated mentions found within 10 million web pages. This, Google notes, makes it “over 100 times bigger than the next largest corpus,” which features fewer than 100,000 mentions.

 

For Google, of course, disambiguation is something that is a core feature of the Knowledge Graph project, which allows you to tell Google whether you are looking for links related to the planet, car or chemical element when you search for ‘mercury,’ for example. It takes a large corpus like this one and the ability to understand what each web page is really about to make this happen.
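To make the idea of "disambiguated mentions" concrete, here is a minimal sketch of how such records could be used to check whether two web pages talk about the same entity. The record layout (page URL, mention text, entity label), the example.com URLs and the entity labels are all assumptions made up for illustration; they are not the actual Wikilinks file format.

```python
from collections import defaultdict

# Hypothetical mention records: (page URL, mention text, disambiguated entity).
# These rows are invented for illustration and are not taken from the corpus.
mentions = [
    ("http://example.com/a", "Mercury", "Mercury_(planet)"),
    ("http://example.com/a", "the Red Planet", "Mars"),
    ("http://example.com/b", "mercury", "Mercury_(element)"),
    ("http://example.com/c", "planet Mercury", "Mercury_(planet)"),
]


def entities_by_page(records):
    """Group the disambiguated entities by the page that mentions them."""
    index = defaultdict(set)
    for page, _text, entity in records:
        index[page].add(entity)
    return index


def shared_entities(index, page_a, page_b):
    """Return the entities that both pages mention."""
    return index.get(page_a, set()) & index.get(page_b, set())


index = entities_by_page(mentions)
print(shared_entities(index, "http://example.com/a", "http://example.com/c"))
print(shared_entities(index, "http://example.com/a", "http://example.com/b"))
```

Pages a and b both contain the word "mercury", but only a and c share an entity once the mentions are disambiguated, which is the distinction the article describes.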

Scooped by luiy

About the ODI | Open Data Institute

luiy's insight:
About the ODI

The Open Data Institute will catalyse an open data culture that has economic, environmental and social benefits. It will unlock supply, generate demand, create and disseminate knowledge to address local and global issues.

We will convene world-class experts to collaborate, incubate, nurture and mentor new ideas, and promote innovation. We will enable anyone to learn and engage with open data, and empower our teams to help others through professional coaching and mentoring.

Scooped by luiy

Open Data for Africa - African Development Bank

luiy's insight:

The Open Data for Africa platform is a response from the African Development Bank Group (AfDB) aimed at boosting access to quality data necessary for managing and monitoring development results in African countries, including the Millennium Development Goals. It responds to a number of important global and regional initiatives to increase the availability of data on Africa. It will foster evidence-based decision-making, public accountability, and good governance. The initiative forms part of the worldwide effort to strengthen statistical capacity, articulated in the Busan Action Plan for Statistics (BAPS), which was endorsed by the international community at the High-Level Forum on Aid Effectiveness, which took place in Busan, Korea, between 28 November and 1 December 2011.

Scooped by luiy

P2PU | Open data for GLAMs

luiy's insight:

Open up your institution's data

 

This challenge is for professionals in cultural institutions who are interested in opening up their data as open culture data.

 

The course will guide you through the different steps towards open data and provide you with extensive background information on how to handle copyright and other possible issues.

 

The different steps will force you to think about different aspects of your data that could lead to a more efficient data infrastructure and a coherent data policy with great internal benefits for your institution. 

Scooped by luiy

New York State Unveils New Open Data Portal - techPresident

luiy's insight:

New York Governor Andrew Cuomo launched a new open data portal Monday, Open.ny.gov, following through on a promise made in his State of the State speech in January. The site will feature data from every New York State agency, and tie in localities from all over the state.

Scooped by luiy

G-8 International Conference on Open Data for Agriculture | e-Agriculture


Open access to ag knowledge 'vital' http://t.co/WCe3Egd4sf #agriculture #foodsecurity

luiy's insight:

Open access to publicly funded agriculturally-relevant data is critical to increasing global food security. It is being used by innovators and entrepreneurs around the world to accelerate development, whether it be tracking election transparency in Kenya or providing essential information to rural farmers in Uganda.

The G-8 conference will convene policy makers, thought leaders, food security stakeholders, and data experts to discuss the role of public, agriculturally-relevant data in increasing food security and to build a strategy to spur innovation by making agriculture data more accessible. As part of the conference, selected applicants will be invited to showcase innovative uses of open data for food security in either a Lightning Presentation (a 3-5 minute, image-rich presentation on the first day of the conference) or in the Exhibit Hall (an image-rich exhibit on display throughout the two-day conference).

The G-8 is inviting innovators to apply to present ideas that demonstrate how open data can be unleashed to increase food security at the G-8 International Conference on Open Data for Agriculture on April 29-30, 2013 in Washington, D.C.

For more information on the conference and to submit your application, please visit the conference website or email G8AGOPENDATA@osec.usda.gov.

Discuss the upcoming conference and the role for open data in promoting agriculture and food security on Twitter! #OpenAgData

Scooped by luiy

Why NIHR Clinical Research Network is creating an open data platform

Richard Corbridge, chief information officer, explains why NIHR Clinical Research Network is creating an open data platform. He says the idea is to have a transparent data set that everyone can have access to.
Scooped by luiy

From “Data Science” to Information Visualization (1/2): What Is a Data Scientist? « InternetActu.net

InternetActu.net is a news site devoted to the issues raised by the internet, the innovative uses it enables, and the research that follows from it.
Scooped by luiy

Health data: How open data can be used to examine the NHS - Journalism.co.uk

"There are some drugs that the NHS wants to understand the patterns of usage of, and we've used those files to turn the data into interactive maps.
luiy's insight:

The dangers in interpreting data

The researchers wanted to highlight the story of the potential savings to journalists. But in briefing journalists on the story, they took particular care over how the data was released, as simply mapping spending would have shown the most densely populated areas of the country.
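The pitfall mentioned here, that a map of raw spending mostly reproduces a map of population, can be shown with a small sketch. The area names and figures below are invented purely for illustration and are not NHS data, and dividing by population is only one assumed way to normalise; the researchers' actual method is not described in this excerpt.

```python
# Hypothetical figures, invented for illustration only (not NHS data).
areas = {
    "Metro": {"population": 1_000_000, "spend": 2_000_000.0},
    "Town": {"population": 100_000, "spend": 400_000.0},
    "Village": {"population": 10_000, "spend": 50_000.0},
}


def rank_by_total(data):
    """Order areas by raw spending, highest first."""
    return sorted(data, key=lambda name: data[name]["spend"], reverse=True)


def spend_per_head(data):
    """Spending divided by population for each area."""
    return {name: row["spend"] / row["population"] for name, row in data.items()}


def rank_per_head(data):
    """Order areas by spending per head, highest first."""
    per_head = spend_per_head(data)
    return sorted(per_head, key=per_head.get, reverse=True)


total_ranking = rank_by_total(areas)
per_head_ranking = rank_per_head(areas)
print(total_ranking)
print(per_head_ranking)
```

With these made-up numbers the raw ranking simply follows population size, while the per-head ranking is reversed, which is why the choice of measure matters before any map is drawn.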

"There's an easy and sensational story of 'this doctor has a high proportion [of prescribing expensive drugs], so they are a very bad person'. That's not necessarily true but would be easy to pick up and run with," Bennett said.

"We were careful about the level on which we released data and the way in which we visualised it, which tried to encourage people to think about wider patterns and systems and try and discourage into digging into an individual's behaviour.

"PCTs and health organisations might want to do that but we didn't want to create a tabloid story around it, we wanted it to be a genuine discussion about 'how do we manage the NHS well?'

"We think it's a great institution and it's about giving it positive support and transparency rather than accusing people of things."

The planning around the controlled release worked, and the story was reported by the Financial Times, the Economist and the Daily Mail.

Scooped by luiy

Google BigQuery Ratchets Up Evolution of New-Age Data Analysis | Wired Enterprise | Wired.com

luiy's insight:

Google was sitting on two massive collections of data describing its App Engine, a web service where software developers can build and deploy online applications.

 

One data set described the way people used the service, and it spanned 2 terabytes of information, or roughly 2,000 gigabytes. The second showed how these customers were billed for using the service, and this was about 10 gigabytes. Google wanted to examine the relationship between these two enormous collections of information, so it shuttled both into a service it calls BigQuery. With BigQuery, the company merged the data in about 60 seconds, according to Google man Ju-kay Kwek, and it could then zero in on the results for each individual App Engine user.

 

When you’re dealing with such large data sets, 60 seconds is pretty darn quick. And this didn’t require any specialized programming. Google was using standard tools built into BigQuery, and as the company announced late last week, these tools are now available to the world at large.

 

The tools mimic the sort of rapid queries that have long been possible on ordinary databases via the Structured Query Language, or SQL. The difference is that Google is doing this on such large amounts of data. The latest incarnation of Google BigQuery is yet another example of the way today’s “Big Data” tools, which are designed to process mega amounts of information, are evolving to behave more and more like traditional databases.
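The kind of merge described above, relating a usage data set to a billing data set per user, is an ordinary SQL join. The sketch below illustrates it with Python's built-in sqlite3 module as a small stand-in; it does not use BigQuery, and the table names, column names and rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; this is not BigQuery.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE usage_log (user_id TEXT, requests INTEGER);
    CREATE TABLE billing (user_id TEXT, amount_usd REAL);
    INSERT INTO usage_log VALUES ('alice', 120), ('alice', 80), ('bob', 50);
    INSERT INTO billing VALUES ('alice', 10.0), ('bob', 2.5);
    """
)

# Join the two (hypothetical) tables on user_id and total the usage per user.
query = """
    SELECT u.user_id, SUM(u.requests) AS total_requests, b.amount_usd
    FROM usage_log AS u
    JOIN billing AS b ON b.user_id = u.user_id
    GROUP BY u.user_id, b.amount_usd
    ORDER BY u.user_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The query text is what carries over conceptually: the article's point is that this style of declarative query now runs against terabyte-scale data rather than a small local table.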

 

In October, Silicon Valley startup Cloudera uncloaked a tool called Impala that’s designed to run rapid queries on massive data sets, and this month, tech giant EMC followed with a similar tool. Based on an internal Google software platform called Dremel, BigQuery predates both these tools, and Google continues to fine-tune it.

Rescooped by luiy from The World of Open

Open Culture Data: Opening GLAM Data Bottom-up | MW2013: Museums and the Web 2013

RT @johanoomen: our #MW2013 paper 'Open Culture Data: Opening GLAM Data Bottom-up' = available online http://t.co/j7gU906MQA #glamwiki #europeana #museweb

Via cafonso
Scooped by luiy

French Data Protection Authority Launches Consultation on Open Data : : Privacy and Information Security Law Blog

Rescooped by luiy from Communication in the digital era

#Tip: Watch these free data journalism tutorials

kdmcBerkeley is running free tutorials in spreadsheet basics, data visualisations and mapping

Via Andrea Naranjo
Rescooped by luiy from Open Government Daily

A Look at Utah's Future in Open Data

Open data policies can come in different shapes, sizes, and strengths. The most common and idealized form aims to mandate or direct energy toward open data specifically (reflected in the recent wave of municipal referendums).

Via Ivan Begtin
Scooped by luiy

Journal of Biomedical Semantics | Abstract | Building Linked Open Data towards integration of biomedical scientific literature with DBpedia

There is a growing need for efficient and integrated access to databases provided by diverse institutions. Using a linked data design pattern allows the diverse data on the Internet to be linked effectively and accessed efficiently by computers.
Scooped by luiy

Association of National Mapping Land Registry and Cadastral Agencies | EuroGeographics

Scooped by luiy

Pan-European open data available online from EuroGeographics - DirectionsMag.com (press release)

From today (8 March 2013), the 1:1 million scale topographic dataset, EuroGlobalMap will be available free of charge for any use under a new open data...
Scooped by luiy

5 of the Best Free and Open Source Data Mining Software | TechSource

luiy's insight:

For those of you who are looking for data mining tools, here are five of the best open-source data mining packages that you can get for free:


Orange
Orange is a component-based data mining and machine learning software suite that features a friendly yet powerful, fast and versatile visual programming front-end for explorative data analysis and visualization, and Python bindings and libraries for scripting. It contains a complete set of components for data preprocessing, feature scoring and filtering, modeling, model evaluation, and exploration techniques. It is written in C++ and Python, and its graphical user interface is based on the cross-platform Qt framework.


RapidMiner
RapidMiner, formerly called YALE (Yet Another Learning Environment), is an environment for machine learning and data mining experiments that is utilized for both research and real-world data mining tasks. It enables experiments to be made up of a huge number of arbitrarily nestable operators, which are detailed in XML files and are made with the graphical user interface of RapidMiner. RapidMiner provides more than 500 operators for all main machine learning procedures, and it also combines learning schemes and attribute evaluators of the Weka learning environment. It is available as a stand-alone tool for data analysis and as a data-mining engine that can be integrated into your own products.


Weka
Written in Java, Weka (Waikato Environment for Knowledge Analysis) is a well-known suite of machine learning software that supports several typical data mining tasks, particularly data preprocessing, clustering, classification, regression, visualization, and feature selection. Its techniques are based on the hypothesis that the data is available as a single flat file or relation, where each data point is labeled by a fixed number of attributes. Weka provides access to SQL databases utilizing Java Database Connectivity and can process the result returned by a database query. Its main user interface is the Explorer, but the same functionality can be accessed from the command line or through the component-based Knowledge Flow interface.


jHepWork
Designed for scientists, engineers and students, jHepWork is a free and open-source data-analysis framework created as an attempt to build a data-analysis environment from open-source packages with a comprehensible user interface, and to offer a tool competitive with commercial programs. It is specially made for interactive scientific plots in 2D and 3D and contains numerical scientific libraries implemented in Java for mathematical functions, random numbers, and other data mining algorithms. jHepWork is based on the high-level programming language Jython, but Java coding can also be used to call jHepWork numerical and graphical libraries.


KNIME
KNIME (Konstanz Information Miner) is a user-friendly, intelligible, and comprehensive open-source data integration, processing, analysis, and exploration platform. It gives users the ability to visually create data flows or pipelines, selectively execute some or all analysis steps, and later study the results, models, and interactive views. KNIME is written in Java, is based on Eclipse, and uses Eclipse's extension mechanism to support plugins that provide additional functionality. Through plugins, users can add modules for text, image, and time series processing, and integrate various other open source projects, such as the R programming language, Weka, the Chemistry Development Kit, and LibSVM.

Scooped by luiy

Github Communities - Visualization

luiy's insight:

Map of the 2,770 most prominent accounts in the GitHub community.

The size of each node is a function of the user's number of followers. The thickness of each link represents the number of projects forked between two users. The layout is produced by the ForceAtlas layout algorithm.
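The visual encoding described above can be sketched in a few lines. Everything here is an assumption for illustration: the account names and counts are invented, and the square-root scaling is one common choice, since the description only says that node size is "a function of" the follower count. The ForceAtlas layout itself is not implemented in this sketch.

```python
import math
from collections import Counter

# Hypothetical accounts and follower counts (invented for illustration).
followers = {"ana": 400, "ben": 100, "cy": 25}

# Hypothetical fork events: one pair per project forked between two users.
fork_events = [("ana", "ben"), ("ben", "ana"), ("ana", "cy")]


def node_sizes(follower_counts, scale=1.0):
    """Node size as a function of followers (square root is an assumed choice)."""
    return {user: scale * math.sqrt(count) for user, count in follower_counts.items()}


def edge_weights(events):
    """Link thickness: number of forked projects between each unordered pair of users."""
    return Counter(tuple(sorted(pair)) for pair in events)


sizes = node_sizes(followers)
weights = edge_weights(fork_events)
print(sizes)
print(dict(weights))
```

The resulting sizes and weights are the inputs a graph tool would take before running a force-directed layout such as ForceAtlas.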
