EEDSP
Digital Signal Processing, Data Analytics, Big Data, HPC, Deep Learning, GPGPU, Distributed and Parallel Computing
Curated by Shiwon Cho

R to Latex packages: Coverage

There are now quite a few R packages to turn cross-tables and fitted models into nicely formatted latex. In a previous post I showed how to use one of them to display regression tables on the fly.

PostgreSQL advances to rank 4 in DB-Engines Ranking

Despite losing a few score points since last month, PostgreSQL managed to move up one rank to become #4 in our ranking. It was only at rank #6 half a year ago. An impressive achievement in a short period of time.


Why Big Data Projects Fail

Too many focus on volume, variety, and velocity, when value is the priority with big data. That's why many projects fail, writes Teradata's Stephen Brobst.

Visualizing neural networks from the nnet package

Neural networks have received a lot of attention for their abilities to ‘learn’ relationships among variables.

Facebook kisses DRAM goodbye, builds memcached for flash

Facebook has developed a new data cache called McDipper that’s essentially memcached rewritten to run on flash memory instead of DRAM, thus saving money while still delivering higher performance than disk.

Pig Eye for the SQL Guy - Hortonworks

Cat Miller is an engineer at Mortar Data, a Hadoop-as-a-service provider, and creator of mortar, an open source framework for data.

Pig is similar enough to SQL to be familiar, but divergent enough to be disorienting to newcomers. The goal of this guide is to ease the friction in adding Pig to an existing SQL skillset.


How to install sar, sadf, mpstat, iostat, pidstat and sa tools on CentOS / Fedora / RHEL

The following command can be used to install sar, sadf, mpstat, iostat, pidstat and sa tools on RPM based systems like CentOS, Fedora, RHEL (Red Hat Enterprise Linux):

DDS Programming using Modern C++

The resurgence of C++ is spreading across many industries. International computer system standards that target C++ for application portability are quickly adopting modern C++. At the Object Management Gro...

Slides from "Big Data Real Time Predictive Analytics"

At Tuesday's Data Driven Business Day at the Strata conference I gave my talk, Real-time Big Data Predictive Analytics: From Deployment to Production.

Storm and Hadoop: Convergence of Big-Data and Low-Latency Processing · YDN Blog


At Yahoo!, Hadoop plays a central role in providing personalized experiences for our users and creating value for our advertisers. To serve Yahoo!’s emerging business needs, the Cloud Engineering Group is working on a next generation platform that enables the convergence of big-data and low-latency processing.


C++ REST SDK (codename "Casablanca") - Home

This library is a Microsoft effort to support cloud-based client-server communication in native code using a modern asynchronous C++ API design.


New ways to Hadoop with R

Today, there are two main ways to use Hadoop with R and big data: 1. Use the open-source rmr package to write map-reduce tasks in R (running within the Hadoop cluster - great for data distillation!) 2.

reports 0.1.2 Released

I’m very pleased to announce the release of reports: an R package to assist in the workflow of writing academic articles and other reports.

A thinking programer's blog: HBase Benchmarking for multi-threaded environment

how HBase writes perform in multi-threaded environments. To do so, I wrote a Java program that writes records to an HBase table and can be configured to write N records using M threads.
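
The shape of such a benchmark (N records split across M threads, timed end to end) can be sketched as follows. This is a minimal Python illustration, not the Java program from the linked post: `InMemoryTable` is a hypothetical stand-in for an HBase table, and `run_write_benchmark` and the row-key format are assumptions made for the example. Because the stand-in is an in-memory dictionary, the timings say nothing about real HBase throughput.

```python
import threading
import time


class InMemoryTable:
    """Hypothetical stand-in for an HBase table; put() only stores the row."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def put(self, row_key, value):
        # A real client would send the write to a region server here.
        with self._lock:
            self._rows[row_key] = value

    def __len__(self):
        with self._lock:
            return len(self._rows)


def run_write_benchmark(table, num_records, num_threads):
    """Write num_records rows using num_threads threads; return elapsed seconds."""

    def worker(thread_id):
        # Thread t writes records t, t + M, t + 2M, ... so every record
        # is written exactly once and the load is spread evenly.
        for i in range(thread_id, num_records, num_threads):
            table.put(f"row-{i:08d}", f"value-{i}")

    threads = [
        threading.Thread(target=worker, args=(t,)) for t in range(num_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    table = InMemoryTable()
    elapsed = run_write_benchmark(table, num_records=10_000, num_threads=8)
    print(f"wrote {len(table)} rows in {elapsed:.3f} s")
```

Repeating the run with the same N and different values of M gives the comparison the teaser describes; with a real client, the `put` method is where the network call would go.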


MapR Apache Hadoop MapReduce Software Download | MapR Technologies


Hadoop users were excited to see the real-time Hadoop analytics demonstration at the Strata Conference in Santa Clara.  By streaming the #strataconf twitter hashtag directly into a cluster during the conference, MapR displayed two real-time tag clouds showing a word bubble with the most frequently used words in conference tweets and a user name cloud of top tweeters.  Watching the information change proved mesmerizing for some.
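
The counting behind two such clouds (most frequent words, most active users) can be sketched in a few lines. This is an illustrative Python sketch only, not MapR's implementation: the `TagCloudCounter` class, its method names, and the tokenizing rule are all assumptions, and the tweets are fed in as plain (user, text) pairs rather than read from a live stream.

```python
import re
from collections import Counter

# Assumed tokenizer: words, optionally prefixed by '#' or '@'.
WORD_RE = re.compile(r"[#@]?\w+")


class TagCloudCounter:
    """Keeps running word and user counts as tweets arrive one at a time."""

    def __init__(self):
        self.words = Counter()
        self.users = Counter()

    def add(self, user, text):
        # One tweet updates both clouds: the author count and the word counts.
        self.users[user] += 1
        self.words.update(tok.lower() for tok in WORD_RE.findall(text))

    def top_words(self, n):
        return self.words.most_common(n)

    def top_users(self, n):
        return self.users.most_common(n)


if __name__ == "__main__":
    cloud = TagCloudCounter()
    cloud.add("alice", "Hadoop is fast #strataconf")
    cloud.add("bob", "hadoop demo #strataconf")
    cloud.add("alice", "real-time Hadoop")
    print(cloud.top_words(2))
    print(cloud.top_users(1))
```

Because the counters are updated incrementally, the top-n lists can be re-read after every tweet, which is what lets a display change as the stream comes in.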

 

The history of Hadoop: From 4 nodes to the future of data

In the first of our four-part multi-media series on Hadoop, the people who helped build Hadoop talk about its birth, its promise and the challenges in moving it from webscale to just large-scale.

Near Real-time Processing Over Hadoop and HBase | Cerner Engineering Health


These significant differences mean different processing infrastructures. Nathan Marz described this well in his How to Beat the CAP Theorem post. The result is a system that uses complementary technologies: stream-based processing with Storm and batch processing with Hadoop.

Interestingly, HBase sits at a juncture between realtime and batch processing models. It offers aspects of batch processing; computation can be moved to the data via direct MapReduce support. It also supports realtime patterns with random access and fas


FDTD Algorithm Optimization on Intel® Xeon Phi™ coprocessor | Intel® Developer Zone


PARALUTION - The Library for Iterative Sparse Methods on CPU and GPU


PARALUTION is a library for sparse iterative methods with special focus on multi-core and accelerator technology such as GPUs. In particular, it incorporates fine-grained parallel preconditioners designed to exploit modern multi-/many-core devices. Based on C++, it provides a generic and flexible design and interface which allow seamless integration with other scientific software packages. The library is open source and released under GPL.
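
For readers unfamiliar with the terms, a preconditioned sparse iterative method can be sketched in plain Python. The example below is a conjugate gradient solver with a Jacobi (diagonal) preconditioner, chosen because it is the simplest preconditioner whose work is independent per entry. It does not use or imitate PARALUTION's C++ API: the function names and the list-of-dicts matrix layout are assumptions made for illustration, and the matrix is assumed to be symmetric positive definite.

```python
def spmv(rows, x):
    """Multiply a sparse matrix (list of {column: value} dicts) by a vector."""
    return [sum(v * x[j] for j, v in row.items()) for row in rows]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def jacobi_pcg(rows, b, tol=1e-10, max_iter=1000):
    """Solve A x = b (A symmetric positive definite) with Jacobi-preconditioned CG."""
    n = len(b)
    # Jacobi preconditioner: the inverse of the diagonal of A.
    inv_diag = [1.0 / rows[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)  # residual b - A x for the initial guess x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        Ap = spmv(rows, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        # Applying the preconditioner is an independent multiply per entry,
        # which is the kind of step that maps well onto many cores.
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


if __name__ == "__main__":
    # 1-D Laplacian (tridiagonal 2, -1) on 5 points.
    n = 5
    A = [{i: 2.0} for i in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = -1.0
        A[i + 1][i] = -1.0
    b = spmv(A, [1.0, 2.0, 3.0, 4.0, 5.0])
    print(jacobi_pcg(A, b))
```

Every step of the loop is a sparse matrix-vector product, a dot product, or an element-wise vector update, which is why methods of this family are a natural fit for the multi-core and GPU back ends the description mentions.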


How to make a scientific result disappear

Nathan Danneman (a co-author and one of my graduate students from Emory) recently sent me a New Yorker article from 2010 about the “decline effect,” the tendency for initially promising scientific results to get smaller upon replication.

Big Data at Torbit: Custom MapReduce-like System

Tylor Arndt on Torbit’s “build-your-own-MapReduce”: The final system begins with a web service against which client systems interface. To ensure resiliency, an instance of the web service runs on each cluster host.

LG Economic Research Institute: Social analytics has taken its first steps, and its value shows only as far as its limits are understood.

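Social media is evolving from a tool to be used into an object of analysis. Social data is drawing attention as a new research subject that can complement conventional artificial experimental settings and structured surveys, not only because of its sheer volume but also because it is expressed voluntarily and can be obtained in real time.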
