Data Management Thread
Scooped by Irina Radchenko
onto Data Management Thread
How the Data Sausage Gets Made - Learning - Source: An OpenNews project

Let me start with a caution. This subject, both the food issues and the code issues, might make you queasy. Food safety is an issue of critical importance. In the U.S., food safety is long on data and short on ways to make the data usable. Every few months, we get another multi-state outbreak that reminds us of the safety problems in our food supply and how significant they are. Sadly, these problems are largely inevitable; to keep food costs as low as we expect them to be, companies cut corners or import more food from other countries with laxer food-safety laws. Meanwhile, federal regulatory agencies are unable to adequately police an increasingly complex food supply chain. Many people think about food poisoning in terms of meat. There is a reason for this: in 1993, there was a severe outbreak of food poisoning at 173 Jack-in-the-Box restaurants, caused by a relatively novel strain of the E. coli bacteria (O157:H7). It hospitalized 171 victims and killed 4 people, 3 of whom were small children. Since then, we’ve come to expect regular problems with ground beef. But meat accounts for only 22% of food poisoning outbreaks; in the past few years, there have been several major outbreaks stemming from cantaloupes, spinach, sprouts, and peanut butter.

Data Management Thread
All around the data: processing, storage, management, curation, etc
Scooped by Irina Radchenko
Common Probability Distributions: The Data Scientist's Crib Sheet - Cloudera Engineering Blog
Data scientists have hundreds of probability distributions from which to choose. Where to start?
Data science, whatever it may be, remains a big deal. “A data scientist is better at statistics than any software engineer,” you may overhear a pundit say, at your local tech get-togethers and hackathons. The applied mathematicians have their revenge, because statistics hasn’t been this talked-about since the roaring 20s. They have their own legitimizing Venn diagram of which people don’t make fun.
Rescooped by Irina Radchenko from Analytics, Big Data, and Data Science
90+ Active Blogs on Analytics, Big Data, Data Mining, Data Science, Machine Learning http://t.co/NfPPMukiBm

Via Gregory Piatetsky
Scooped by Irina Radchenko
Big Data Problems Solved Fast On An Open Source Platform
RapidMiner produces analytics without coding, and makes the process and results comprehensible to business mortals without advanced math degrees.
Jakes Rawlinson's curator insight, September 15, 2015 2:56 PM

I still have to try RapidMiner, but this might be a solution for us professional mortals who can't code or don't know programming. Comments?

Scooped by Irina Radchenko
It’s All in the Data – My, My, Hey, Hey
Welcome to my new column on the new pages of TDAN.com. This quarterly column will address data from a personal perspective and is written to get you to think about the importance of data in our daily lives. I cannot tell you how often I tell people, after lengthy discussions about seemingly unconnected subjects, that […]
Rescooped by Irina Radchenko from Data Nerd's Corner
Data + Science = Sexy
I remember the days of Bezdek, fuzzy c-means clustering. My humble team developed algorithms to classify landmines in Angola. We spent a lot of time looking at the data, matrices and vectors before selecting a random sample group. Principal component analysis was another popular method to compress the data to decrease the cost of algorithms. It was not too long ago that I wrote my dissertation on it in 2010.

Via Carla Gentry CSPO
Carla Gentry CSPO's curator insight, February 26, 2015 7:40 AM

Those were the days when MATLAB crashed over and over and had problems with averaging and filtering. We all needed to validate what we were doing. I wonder whether we still need to validate what we are doing with data and try to learn from the nature of the data. Or have we moved a step further, to where all datasets are treated the same? Can we trust commercial products and press a button to puke out graphs and histograms? Is that why data science became so sexy?

Scooped by Irina Radchenko
Tim O'Reilly Explains the Internet of Things

Tim O’Reilly has been at the cutting edge of the Internet since it went commercial. In fact, he helped take it there: In August 1993 he released the Global Network Navigator, a web page containing information, catalogs and a marketplace, which may have been the first site with advertising.

Scooped by Irina Radchenko
Announcing MongoDB 3.0 | MongoDB

Today we are announcing MongoDB 3.0. This release marks the beginning of a new phase in which we build on an increasingly mature foundation to deliver a database so powerful, flexible, and easy to manage that it can be the new DBMS standard for any team, in any industry.

Scooped by Irina Radchenko
Facebook open sources AI tools, possibly turbo charges deep learning | ZDNet
The plain English takeaway is that faster training of neural networks will now be widely available via open source project Torch.
Scooped by Irina Radchenko
Anyone Can Now Use IBM's Watson To Crunch Data For Free
You probably know IBM's Watson platform best from its winning performance on Jeopardy. But the supercomputer is more than just a mechanism for IBM to publicly shame smart people. It's arguably the most powerful natural-language supercomputer in the world, and thanks to a new public beta, its number-crunching abilities are open to all.
Rescooped by Irina Radchenko from The Programmable City
Why the sharing economy needs the Internet of Things | Gigaom
Organizations ranging from Zipcar to bike-sharing programs rely on remote unlocking and return of assets. Sharing economy companies that rely on individual assets need to do the same to compete on experience and drive member loyalty.

Via Rob Kitchin
Scooped by Irina Radchenko
10 Online Big Data Courses
The explosion of hype around the term big data ushered in a rabid desire in companies big and small to get their hands on employees with a data science skill set. For evidence, you need look no furthe
Rescooped by Irina Radchenko from Data Science
Not only CRAN downloads and Shiny … but also .. rCharts | PremierSoccerStats
I have been meaning for some time to get stuck into the rCharts package, which provides an interface to many JavaScript graphic libraries. These offer rich charting capabilities with interactivity and a great deal of customization.

Via M. Edward (Ed) Borasky, Data Scientist Insights
Scooped by Irina Radchenko
Data Science
Organize anything, together. Trello is a collaboration tool that organizes your projects into boards. In one glance, know what's being worked on, who's working on what, and where something is in a process.
Rescooped by Irina Radchenko from Big Data Analysis in the Clouds
OpenText pitches big data service, first in a planned series for analytics
Its built-in columnar database is up to 1,000 times faster than traditional relational databases, the company claims

Via Pierre Levy
Scooped by Irina Radchenko
Data analysis: Create a cloud commons

Major funding agencies should ensure that large biological data sets are stored in cloud services to enable easy access and fast analysis, say Lincoln D. Stein and colleagues.

Scooped by Irina Radchenko
Data Analytics Dominates Enterprises' Spending Plans For 2015

Companies will spend an average of $7.4M on data-related initiatives over the next twelve months, with enterprises investing $13.8M and small and medium businesses (SMBs) investing $1.6M.

Scooped by Irina Radchenko
Text Analytics Tutorials

What is text analytics, and how can it be beneficial to my business, skillset, or predictive models? If you’ve searched out this website, it is likely that you are here to learn the how of text analytics. In this case, we will primarily address the how with IBM/SPSS Modeler. But we will also answer the question of why. The why will be answered several times over in the use case section, but in overview the broader question is “What is text analytics?”

Scooped by Irina Radchenko
Big Data: Telling The Story Of Falling Oil Prices
Figure 1: Network diagram for the period before the fall of oil prices.
Big Data is indeed disrupting our industries, and to a data scientist, the best way to prove that and show it to people is to play around with some data! The fall of oil prices is one of the most prominent topics in our world nowadays, so it's a matter of curiosity for any data enthusiast to see what Big Data can tell us about the oil market's scene. We analyzed hundreds of thousands of news articles mentioning oil and gas discussions before and after the fall of oil prices in a six-month time frame. We used the G
Scooped by Irina Radchenko
Data Mining Reveals A Global Link Between Corruption and Wealth | MIT Technology Review
Social scientists have never understood why some countries are more corrupt than others. But the first study that links corruption with wealth could help change that.
Rescooped by Irina Radchenko from Digital Preservation (Russia)
How one of the world’s largest archives is managing the move from parchment to pixels
From the Domesday Book to modern government papers, the National Archives' collection of more than 11m historical government and public records is one of the world’s largest. It includes paper and parchment…

Via Ivan Begtin
Scooped by Irina Radchenko
Global Smart Cities Market to Reach US$1.56 Trillion by 2020 | Grid Opt Smart Grid content from TDWorld
The global smart city market will be valued at US$1.565 trillion in 2020.
Scooped by Irina Radchenko
Internet of Things network launches in 10 UK cities

Arqiva has announced the first 10 cities in its nationwide Internet of Things (IoT) network.

 

The company, best known as the operator of the UK's terrestrial television transmitter infrastructure, has switched on IoT transceivers in Birmingham, Bristol, Edinburgh, Glasgow, Leeds, Leicester, Liverpool, Manchester, London (starting in Greenwich) and Sheffield.

 

Scooped by Irina Radchenko
Big Picture: Google Visualization Research
Fàtima Galan's curator insight, December 2, 2014 3:05 AM

"how information visualization can make complex data accessible, useful, and even fun"