EEDSP
Digital Signal Processing, Data Analytics, Big Data, HPC, Deep Learning, GPGPU, Distributed and Parallel Computing
Curated by Shiwon Cho
Scooped by Shiwon Cho

App Indexing for Google Search — Google Developers

App Indexing helps you drive usage of your app through Google. Deep links to your app appear in Google Search results on Android so users can get to your native mobile experience quickly and easily.


Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing

Mesa is a highly scalable analytic data warehousing system that stores critical measurement data related to Google's Internet advertising business. Mesa is designed to satisfy a complex and challenging set of user and systems requirements, including near real-time data ingestion and queryability, as well as high availability, reliability, fault tolerance, and scalability for large data and query volumes. Specifically, Mesa handles petabytes of data, processes millions of row updates per second, and serves billions of queries that fetch trillions of rows per day. Mesa is geo-replicated across multiple datacenters and provides consistent and repeatable query answers at low latency, even when an entire datacenter fails. This paper presents the Mesa system and reports the performance and scale that it achieves.


Hadoop YARN Installation: The definitive guide

This article guides you in the installation of the new generation Hadoop based on YARN. It is based on the most recent version of Hadoop at the time of this writing (2.2.0) and includes HDFS, YARN and MapReduce configurations for both single-node and cluster environments.

The lab that created Spark wants to speed up everything, including cures for cancer

AMPLab, the University of California, Berkeley, research group responsible for making Spark a household name in big data, has a lot more tricks up its sleeve. They range from databases to machine learning, and even include tools that could help treat cancer.


The Future of Apache Ambari

It’s been a busy year for Apache Ambari. Keeping up with the rapid innovation in the open community certainly is exciting. We’ve already seen six releases this year to maintain a steady drumbeat of new features and usability guardrails. We have also seen some exciting announcements of new folks jumping into the Ambari community.
With all these releases and community activities, let’s take a break to talk about how the broader Hadoop community is affecting Ambari and how this is influencing what you will see from Ambari in the future.
Take a Look Around
To talk about the future of Ambari, we have to recognize what is happening outside of Ambari in the Hadoop community. We have to talk about Apache Hadoop YARN.
YARN is the operating system for data processing, making it possible to bring multiple workloads and processing engines to the data stored in Apache Hadoop 2.…

Linus Torvalds' Workspace Is Nothing Like I Imagined (Video) - OMG! Ubuntu!

If you've ever found yourself wondering what sort of workspace environment the creator of Linux works from, it's your lucky day.

Emerging Trends in Big Data Technologies

Big Data technologies have been getting a lot of attention over the last few years. There are several trends and innovations happening in this space. InfoQ would like to learn what new trends in Big Data you are currently using or planning on using in the future.

Scientists Question the Big Price Tags of Big Data

The National Ecological Observatory Network, funded by Congress for $434 million, will equip 106 U.S. sites with sensors to gather ecological data all day, every day, for 30 years after it goes operational in 2017. The Human Brain Project, supported by $1.6 billion from the European Union, intends to create a supercomputer simulation of a working human brain, including all 86 billion neurons and 100 trillion synapses. The International Cancer Genome Consortium, 74 research teams across 17 countries spending an estimated $1 billion, is compiling 25,000 tumor genome sequences from 50 types of cancers.


Building big data storage

Too many people think that big data is all about the software. Storage plays a huge role in big data success – here’s what you need to know to get started.

PredictionIO Open Source Machine Learning Server

PredictionIO is an open source machine learning server for software developers to create predictive features, such as personalization, recommendation and content discovery.


Kyle Lutz: Boost.Compute v0.3 Released

Boost.Compute is a header-only C++ library for GPGPU and parallel-computing based on OpenCL. It is available on GitHub and instructions for getting started can be found in the documentation.


Google shows off Mesa, a super-fast data warehouse that runs across data centers

Google has published a paper about its latest big data system, a globally distributed data warehouse called Mesa that can ingest millions of rows in minutes and even survive a data center failure.

Accelerate R Applications with CUDA

In this article, I will introduce the computation model of R with GPU acceleration, focusing on three topics:

accelerating R computations using CUDA libraries; calling your own parallel algorithms written in CUDA C/C++ or CUDA Fortran from R; and profiling GPU-accelerated R applications using the CUDA Profiler.


A first step toward more global email

 In 2012, an organization called the Internet Engineering Task Force (IETF) created a new email standard that supports addresses with non-Latin and accented Latin characters (e.g. 武@メール.グーグル). In order for this standard to become a reality, every email provider and every website that asks you for your email address must adopt it. That’s obviously a tough hill to climb. The technology is there, but someone has to take the first step.


Tumblr: Hashing Your Way to Handling 23,000 Blog Requests per Second - High Scalability

At Tumblr, blogs (or Tumblelogs) are one of our most highly trafficked faces on the internet. One of the most convenient aspects of tumblelogs is their highly cacheable nature, which is fantastic because of the high views/post ratio the Tumblr network offers our users. That said, it's not entirely trivial to scale out the perimeter proxy tier, let alone the caching tier, necessary for serving all of those requests.


Hadoop Cluster – Architecture and Core Components

Hadoop Architecture and Core Components - This article explains the architecture of Hadoop and its core components.

The Twelve-Factor App

A methodology for building modern, scalable, maintainable software-as-a-service apps. The twelve-factor methodology can be applied to apps written in any programming language, and which use any combination of backing services (database, queue, memory cache, etc).


New in CDH 5.1: Hue’s Improved Search App

Hue 3.6 (now packaged in CDH 5.1) has brought the second version of the Search App up to even higher standards. The user experience has been greatly improved, as the app now provides a very easy way to build custom dashboards and visualizations.


GraphLab Create

The software, GraphLab Create, simplifies big data analysis by combining all phases of the prototype-to-production process, allowing a single data scientist to do the job of many, according to the creators. The company says that there is a current shortage of data scientists, who have to derive value from a company's data by integrating a range of highly complicated, disparate tools and datasets. By using machine learning, GraphLab Create simplifies this task.


RECURSIVE FUNCTIONS OF SYMBOLIC EXPRESSIONS AND THEIR COMPUTATION BY MACHINE (Part I) (12-May-1998)

The original Lisp paper. McCarthy 1960


Powering big data at Pinterest

Big data plays a big role at Pinterest. With more than 30 billion Pins in the system, we’re building the most comprehensive collection of interests online. One of the challenges associated with building a personalized discovery engine is scaling our data infrastructure to traverse the interest graph to extract context and intent for each Pin.

Dataconomy's curator insight, July 28, 2014 1:01 PM

An interesting look at the big data approach at Pinterest. 


CoreOS

CoreOS is a new Linux distribution that has been rearchitected to provide features needed to run modern infrastructure stacks. The strategies and architectures that influence CoreOS allow companies like Google, Facebook and Twitter to run their services at scale with high resilience.


Mozilla Advances JPEG Encoding with mozjpeg 2.0

We’re pleased to announce the release of mozjpeg 2.0. Early this year, we explained that we started this project to provide a production-quality JPEG encoder ...