EEDSP
Scooped by Shiwon Cho
onto EEDSP

Using R — Packaging a C library in 15 minutes

Yes, this post condenses 50+ hours of learning into a 15-minute tutorial. Read ’em and weep. (That is, you read while I weep.) OK. For the last week I’ve been learning how to call C code as documented in …
EEDSP
Digital Signal Processing, Data Analytics, Big Data, HPC, Deep Learning, GPGPU
Curated by Shiwon Cho

PageSpeed Tools - Make the Web Faster — Google Developers

Fast and optimized pages lead to higher visitor engagement, retention, and conversions. The PageSpeed family of tools is designed to help you optimize the performance of your website. PageSpeed Insights products will help you identify performance best practices that can be applied to your site, and PageSpeed optimization tools can help you automate the process.

 

Apache Storm Graduates to a Top-Level Project

Apache Storm is a distributed, fault tolerant, and scalable platform for processing streaming data, supporting real-time analytics and machine learning.

On September 17, the Apache Software Foundation (ASF) voted to graduate Apache Storm to a top-level project (TLP). This is a major step forward for the project and reflects the momentum built by a broad community of developers from not only Hortonworks, but also Yahoo!, Alibaba, Twitter, Microsoft and many other companies.

 

http://storm.apache.org

 


Deep Learning Sentiment Analysis for Movie Reviews using Neo4j - Neo4j Graph Database

Kenny Bastani talks through using Neo4j for Deep Learning Sentiment Analysis for Movie Reviews.

Sentiment analysis uses natural language processing to extract features of a text that relate to subjective information found in source materials.


Jim Goldsmith's curator insight, September 22, 10:30 AM

Very interesting topic. From the article: "A movie review website allows users to submit reviews describing what they either liked or disliked about a particular movie. Being able to mine these reviews and generate valuable meta data that describes its content provides an opportunity to understand the general sentiment around that movie in a democratized way. That’s a pretty cool thing if you think about it. Using machine learning we can democratize subjectivity about anything in the world. We can make an objective analysis of subjective content, giving us the ability to better understand trends around products and services that we can use to make better decisions as consumers." Read on...


Google's First Quantum Computer Will Build on D-Wave's Approach - IEEE Spectrum

Google's first quantum computer may represent a more stabilized version of D-Wave's specialized machines.

OpenCL™ 2.0 is here! Download the Release 2 of Intel® SDK for OpenCL™ Applications 2014

Dear Developers,

We are happy to announce the availability of our latest and most advanced SDK for OpenCL. Release 2 of Intel® SDK for OpenCL™ Applications 2014 is the industry’s first SDK to provide an OpenCL 2.0 development environment with the new Intel® Core™ M Processors.

This major advance in graphics programmability and accessibility will help you make greater use of the graphics engine to deliver new experiences on Intel-based platforms.

New with SDK 2014 Release 2:

Web Data | Gaston Sanchez

Website of Gaston Sanchez, statistical programmer and analytics engineer. 

The Web is a vast source of data and information. But the amount of content and its exponential growth surpasses our capacity to handle it analytically. We hardly know what to do with it, not to mention how to make sense of it. Most of us are in the same quest for our holy grails: gaining insight and learning from data. However, in order to start crunching numbers, building models, and training machines to learn, we must first have access to the data itself.

 

10 FREE Resources to Learn Statistics | Marketing Distillery

Learning statistics is an important step toward becoming a data scientist. It is not an easy subject, but easy-to-digest resources are available on the internet.

VisuAlgo - visualising data structures and algorithms through animation

visualising data structures and algorithms through animation


Hardware Initiative at Quantum Artificial Intelligence Lab

The Quantum Artificial Intelligence team at Google is launching a hardware initiative to design and build new quantum information processors based on superconducting electronics.


Using a Graph Database for Deep Learning Text Classification

Graphify gives you a mechanism to train natural language parsing models that extract features of a text using deep learning and a graph database.

The Feynman Lectures on Physics

Caltech and The Feynman Lectures Website are pleased to present this online edition of The Feynman Lectures on Physics. Now, anyone with internet access and a web browser can enjoy reading a high-quality, up-to-date copy of Feynman's legendary lectures.


Part 2: The Cloud Does Equal High performance - High Scalability -

This is a guest post by Anshu Prateek, Tech Lead, DevOps at Aerospike and Rajkumar Iyer, Member of the Technical Staff at Aerospike.

In our first post we busted the myth that cloud != high performance and outlined the steps to 1 Million TPS (100% reads in RAM) on 1 Amazon EC2 instance for just $1.68/hr. In this post we evaluate the performance of 4 Amazon instances when running a 4 node Aerospike cluster in RAM with 5 different read/write workloads and show that the r3.2xlarge instance delivers the best price/performance.
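
To make the price/performance claim concrete, the arithmetic behind it can be sketched as follows. This is only an illustration using the two figures quoted above (1 million TPS, $1.68/hr); the function name is made up for this example, and the sketch assumes the quoted rate is sustained for the full hour.

```python
def transactions_per_dollar(tps: float, dollars_per_hour: float) -> float:
    """Transactions served per dollar of instance time at a sustained rate.

    Hypothetical helper for illustration; not taken from the original post.
    """
    seconds_per_hour = 3600
    return tps * seconds_per_hour / dollars_per_hour


# Figures quoted in the post: 1 million TPS on one instance at $1.68/hr.
per_dollar = transactions_per_dollar(1_000_000, 1.68)
print(f"{per_dollar:.3e} transactions per dollar")
```

The same function can be applied to each instance type and workload to rank them, which is roughly the kind of comparison the post describes.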


The Future of Apache Storm: Secure, Highly-Available, Multi-Tenant

YARN and Apache Storm: A Powerful Combination

YARN changed the game for all data access engines in Apache Hadoop. As part of Hadoop 2, YARN took the resource management capabilities that were in MapReduce and packaged them for use by new engines. Now Apache Storm is one of those data-processing engines that can run alongside …

How Twitter Uses Redis to Scale - 105TB RAM, 39MM QPS, 10,000+ Instances  - High Scalability -

Yao Yue has worked on Twitter’s Cache team since 2010. She recently gave a really great talk: Scaling Redis at Twitter. It’s about Redis of course, but it's not just about Redis.


43 Bigdata Platforms and Bigdata Analytics Software -

Bigdata Platforms and Bigdata Analytics Software: 41+ Bigdata Platforms and Bigdata Analytics Software including IBM Bigdata Analytics, HP Bigdata, SAP Bigdata Analytics, Microsoft Bigdata, Oracle Bigdata Analytics, Teradata Bigdata Analytics, SAS Big data, Dell Bigdata Analytics, Palantir Bigdata, Pivotal Bigdata, Google BigQuery, Pentaho Big Data Analytics, Amazon Web Service, Cloudera Enterprise Bigdata, Hortonworks Data Platform,

Data Science 101: SparkR - Interactive R Programs at Scale - insideBIGDATA


SparkR is an open source R package developed at U.C. Berkeley AMPLab that allows data scientists to analyze large data sets and interactively run jobs on them from the R shell.


NLUlite – An NLP Database

Programming book reviews, programming tutorials, programming news, C#, Ruby, Python, C, C++, PHP, Visual Basic, Computer book reviews, computer history, programming history, joomla, theory, spreadsheets and more.

CausalImpact: A new open-source package for estimating causal effects in time series

Google open sourced a new package for the R statistical computing software that’s designed to help users infer whether a particular action really did cause subsequent activity. Google has been using the tool, called CausalImpact, to measure AdWords campaigns but it has broader appeal.


https://gigaom.com/2014/09/11/google-has-open-sourced-a-tool-for-inferring-cause-from-correlations/



R & hadoop: Install Hadoop 2.5 in ubuntu 14.04 as well as RHadoop.

This article describes the step-by-step approach to install Hadoop/YARN 2.4.0 and R


Linux Performance Tools at LinuxCon North America 2014

Linux Performance tools


Home — Memory Management Reference 4.0 documentation

This is a resource for programmers and computer scientists interested in memory management and garbage collection.



How-to: Translate from MapReduce to Apache Spark

Venerable MapReduce has been Apache Hadoop’s workhorse computation paradigm since its inception. It is ideal for the kinds of work for which Hadoop was originally designed: large-scale log processing and batch-oriented ETL (extract-transform-load) operations.

As Hadoop’s usage has broadened, it has become clear that MapReduce is not the best framework for all computations. Hadoop has made room for alternative architectures by extracting resource management into its own first-class component, YARN. And so, projects like Impala have been able to use new, specialized non-MapReduce architectures to add interactive SQL capability to the platform, for example.

Today, Apache Spark is another such alternative, and is said by many to succeed MapReduce as Hadoop’s general-purpose computation paradigm. But if MapReduce has been so useful, how can it suddenly be replaced? After all, there is still plenty of ETL-like work to be done on Hadoop, even if the platform now has other real-time capabilities as well.



Building Lambda Architecture with Spark Streaming

The versatility of Apache Spark’s API for both batch/ETL and streaming workloads brings the promise of lambda architecture to the real world.

Few things help you concentrate like a last-minute change to a major project.

One time, after working with a customer for three weeks to design and implement a proof-of-concept data ingest pipeline, the customer’s chief architect told us:

“You know, I really like the design – I like how data is validated on arrival. I like how we store the raw data to allow for exploratory analysis while giving the business analysts pre-computed aggregates for faster response times. I like how we automatically handle data that arrives late and changes to the data structure or algorithms.”

“But,” he continued, “I really wish there was a real-time component here. There is a one-hour delay between the point when data is collected until it’s available in our dashboards. I understand that this is to improve efficiency and protect us from unclean data. But for some of our use cases, being able to react immediately to new data is more important than being 100% certain of data validity.”


Apache Storm Design Pattern—Micro Batching

As part of a blog series on Apache Storm design patterns, this first post explores three options for micro-batching using Apache Storm's core APIs.
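
For readers unfamiliar with the term, the general idea of micro-batching can be sketched in a few lines of plain Python. This is a generic illustration, not Storm's API and not necessarily any of the three options the post covers; the `MicroBatcher` class and its method names are hypothetical.

```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


class MicroBatcher(Generic[T]):
    """Buffers items and hands them downstream in small batches.

    Hypothetical class for illustration only; it does not use Storm.
    """

    def __init__(self, max_size: int, flush: Callable[[List[T]], None]) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._max_size = max_size
        self._flush = flush
        self._buffer: List[T] = []

    def add(self, item: T) -> None:
        """Buffer one item and flush as soon as the batch is full."""
        self._buffer.append(item)
        if len(self._buffer) >= self._max_size:
            self.tick()

    def tick(self) -> None:
        """Flush whatever is buffered, even if the batch is not full.

        Intended to be called periodically so a partial batch does not
        wait indefinitely for more items.
        """
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._flush(batch)


# Usage: collect the emitted batches in a list.
batches: List[List[int]] = []
batcher = MicroBatcher(max_size=3, flush=batches.append)
for n in range(7):
    batcher.add(n)
batcher.tick()  # flush the final partial batch
print(batches)
```

Flushing on both a size threshold and a periodic tick is a common trade-off: the size limit bounds memory and amortizes per-batch cost, while the tick bounds latency when traffic is light.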

1 Aerospike server X 1 Amazon EC2 instance = 1 Million TPS for just $1.68/hour - High Scalability -

This is a guest post by Anshu Prateek, Tech Lead, DevOps at Aerospike and Rajkumar Iyer, Member of the Technical Staff at Aerospike.

Cloud infrastructure services like Amazon EC2 have proven their worth with wild success. The ease of scaling up resources, spinning them up as and when needed, and paying by unit of time has unleashed developer creativity, but virtualized environments are not widely considered the place to run high-performance applications and databases.
