EEDSP
Digital Signal Processing, Data Analytics, Big Data, HPC, Deep Learning, GPGPU, Distributed and Parallel Computing
Curated by Shiwon Cho

GitHub Special: Data Scientists to Follow & Best Tutorials on GitHub

GitHub hosts some of the best collections of data science resources. This article lists them, along with people to follow on GitHub.

You’re Not a Data Scientist

Many of my friends, colleagues and contacts have started calling themselves Data Scientists. A number of resumes have cr…

Tachyon Overview - Tachyon 0.6.4 Documentation

Tachyon is a memory-centric distributed storage system enabling reliable data sharing at memory-speed across cluster frameworks, such as Spark and MapReduce. It achieves high performance by leveraging lineage information and using memory aggressively. Tachyon caches working set files in memory, thereby avoiding going to disk to load datasets that are frequently read. This enables different jobs/queries and frameworks to access cached files at memory speed.

New in the Wolfram Language: WikipediaData—Wolfram Blog

An integrated Wikipedia service has been added to the latest version of the Wolfram Language. Just feed in content for text processing and visualization.

Top 30 Predictive Analytics Software API

Top 30 Predictive Analytics Software API, including Google Prediction API, BigML, Microsoft Azure Machine Learning, Swift API, Datagami, GraphLab, Data Science Studio, Apigee Insights, Openscoring.io, Intuitics, Anomaly Detective, Zementis, Predixion, Datumbox Machine Learning Framework, PredictionIO, Logical Glue, Ersatz, H2O, Yottamine, Lattice, InsideView, AgilOne, Futurelytics, Fliptop, RelateIQ, Lumiata, Versium LifeData, Indico, INRIX

Lambda Architecture: Design Simpler, Resilient, Maintainable and Scalable Big Data Solutions

Lambda Architecture proposes a simpler, elegant paradigm designed to store and process large amounts of data. In this article, author Daniel Jebaraj presents the motivation behind the Lambda Architecture and reviews its structure with the help of a sample Java application.

Yandex technologies. MatrixNet

The job of a search engine is, first and foremost, to provide answers to users’ queries. In response to each query, a search engine returns links to web pages it finds in its index, a database of web pages known to this particular search engine. Thus, an answer to the user’s query comes in the form of search results: a list of hyperlinks to web pages whose content matches this query.

How to become a data scientist in 8 easy steps: the infographic

This post was written by the team behind DataCamp, the online interactive learning platform for data science. After being dubbed “sexiest job of the 21st Century” by Harvard Business Review, data scientists have stirred the interest of the general public. Many people are intrigued by this job, namely because the name has an interesting […]

Top 29 Predictive Analytics Software API

Top 29 Predictive Analytics Software API, including Google Prediction API, BigML, Microsoft Azure Machine Learning, Swift API, Datagami, GraphLab, Data Science Studio, Apigee Insights, Openscoring.io, Intuitics, Anomaly Detective, Zementis, Predixion, Datumbox Machine Learning Framework, PredictionIO, Logical Glue, Ersatz, H2O, Yottamine, Lattice, InsideView, AgilOne, Futurelytics, Fliptop, RelateIQ, Lumiata, Versium LifeData, Indico, INRIX

Gaia Mission versus Star Trek: The data challenge

Gathering precise data on more than a billion stars and other phenomena in our galaxy requires some intense data collection equipment and processing tools.

Part 2: The Cloud Does Equal High performance - High Scalability

This is a guest post by Anshu Prateek, Tech Lead, DevOps at Aerospike and Rajkumar Iyer, Member of the Technical Staff at Aerospike.

In our first post we busted the myth that cloud != high performance and outlined the steps to 1 Million TPS (100% reads in RAM) on 1 Amazon EC2 instance for just $1.68/hr. In this post we evaluate the performance of 4 Amazon instances when running a 4 node Aerospike cluster in RAM with 5 different read/write workloads and show that the r3.2xlarge instance delivers the best price/performance.

Trends in Distributed System Processing Models (From MapReduce to Borg)

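In this post, I examine the Borg paper published by Google from the perspective of processing models for large-scale distributed systems. Put simply, recent distributed systems for cloud environments, Borg included, reflect an important trend that amounts to a paradigm shift.
From that processing-model perspective, I first review recent trends in distributed systems for cloud environments, and then use Borg to consider where distributed systems are currently heading.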

How Stephen Wolfram’s image-recognition tool performs against 5 alternatives

To get a feel for the power of the new Wolfram technology, I put it up against other image-recognition systems.

Grappling with the growth of scientific data - Scientific Computing World

Metadata is key to mastering the volumes of data in science and engineering, argues Bob Murphy, and tools are available to make it easier

Putting Apache Kafka To Use: A Practical Guide to Building a Stream Data Platform (Part 1)

These days you hear a lot about "stream processing", "event data", and "real-time", often related to technologies like Kafka, Storm, Samza, or Spark's Streaming module. Though there is a lot of exc...

Start of a new era: Apache HBase 1.0 : Apache HBase

The Apache HBase community has released Apache HBase 1.0.0. Seven years in the making, it marks a major milestone in the Apache HBase project’s development, offers some exciting features and new APIs without sacrificing stability, and is both on-wire and on-disk compatible with HBase 0.98.x.

The Big vs of Big Data - Predictive Analytics Today

The Big vs of Big Data: Turning information overload to sales.

Top 14 Big Data Books of 2014

2014 has been a huge year in big data, and in big data publishing. Viktor Mayer-Schoenberger and Kenneth Cukier re-published and added an extra chapter to their bestselling Big Data; Nate Silver graced t

Zeppelin

A web-based notebook that enables interactive data analytics. 
You can make beautiful data-driven, interactive and collaborative documents with SQL, Scala and more.

Announcing Azure Stream Analytics for real-time event processing | Microsoft Azure Blog

At TechEd Europe 2014, Microsoft announced the preview of Azure Stream Analytics. Stream Analytics is a real-time event processing engine that helps uncover insights from devices, sensors, infrastructure, applications, and data. With out-of-the-box integration to Event Hubs, the combined solution can both ingest millions of events and run analytics to better understand patterns, power a dashboard, detect anomalies, and kick off an action while data is being streamed in real time.

Machine-Learning Maestro Michael Jordan on the Delusions of Big Data and Other Huge Engineering Efforts - IEEE Spectrum

Big-data boondoggles and brain-inspired chips are just two of the things we’re really getting wrong

43 Bigdata Platforms and Bigdata Analytics Software

41+ Bigdata Platforms and Bigdata Analytics Software, including IBM Bigdata Analytics, HP Bigdata, SAP Bigdata Analytics, Microsoft Bigdata, Oracle Bigdata Analytics, Teradata Bigdata Analytics, SAS Big data, Dell Bigdata Analytics, Palantir Bigdata, Pivotal Bigdata, Google BigQuery, Pentaho Big Data Analytics, Amazon Web Service, Cloudera Enterprise Bigdata, Hortonworks Data Platform,