Today, we announced an exciting set of joint initiatives with Microsoft, including:
Extending Docker to Windows with Docker Engine for Windows Server,
Microsoft’s support of Docker’s open orchestration APIs,
Integration of Docker Hub with Microsoft Azure, and
Collaboration on the multi-Docker container model, including support for applications consisting of both Linux and Windows Docker containers.
I’d like to provide some context for this announcement, and why we are so excited.
During photosynthesis, plants convert only about 10% of the light they receive from the sun into usable hydrogen to fuel the reaction. Last summer, a group of researchers broke the world record for laboratory efficiency, reaching 44.7% with a new cell; the ultimate goal is 50%.
Bigdata Platforms and Bigdata Analytics Software: 41+ Bigdata Platforms and Bigdata Analytics Software, including IBM Bigdata Analytics, HP Bigdata, SAP Bigdata Analytics, Microsoft Bigdata, Oracle Bigdata Analytics, Teradata Bigdata Analytics, SAS Big data, Dell Bigdata Analytics, Palantir Bigdata, Pivotal Bigdata, Google BigQuery, Pentaho Big Data Analytics, Amazon Web Service, Cloudera Enterprise Bigdata, Hortonworks Data Platform,
Google has open-sourced a new package for the R statistical computing software that is designed to help users infer whether a particular action really did cause subsequent activity. Google has been using the tool, called CausalImpact, to measure AdWords campaigns, but it has broader appeal.
Venerable MapReduce has been Apache Hadoop’s workhorse computation paradigm since its inception. It is ideal for the kinds of work for which Hadoop was originally designed: large-scale log processing and batch-oriented ETL (extract-transform-load) operations.
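To make the paradigm concrete, here is a minimal single-process sketch of the map, shuffle, and reduce phases applied to a toy log-processing job. It is only an illustration of the idea, not Hadoop's API: the function names, the log format (HTTP status code as the last field), and the sample lines are all assumptions made for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Emit (status_code, 1) for one log line; skip lines that do not parse."""
    fields = line.split()
    # Assumed format: "<method> <path> <status>", status code last.
    if len(fields) >= 2 and fields[-1].isdigit():
        yield fields[-1], 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse the values collected for one key into a single count."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical sample input for the sketch.
LOG_LINES = [
    "GET /index.html 200",
    "GET /missing 404",
    "POST /login 200",
    "malformed",
]

if __name__ == "__main__":
    print(run_job(LOG_LINES))
```

In a real cluster the map and reduce functions run in parallel on many machines and the shuffle moves data across the network; the batch-oriented shape of the job is the same.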
As Hadoop’s usage has broadened, it has become clear that MapReduce is not the best framework for all computations. Hadoop has made room for alternative architectures by extracting resource management into its own first-class component, YARN. As a result, projects such as Impala have been able to use new, specialized non-MapReduce architectures to add capabilities such as interactive SQL to the platform.
Today, Apache Spark is another such alternative, and many see it as the successor to MapReduce as Hadoop’s general-purpose computation paradigm. But if MapReduce has been so useful, how can it suddenly be replaced? After all, there is still plenty of ETL-like work to be done on Hadoop, even if the platform now has other real-time capabilities as well.
The versatility of Apache Spark’s API for both batch/ETL and streaming workloads brings the promise of lambda architecture to the real world.
Few things help you concentrate like a last-minute change to a major project.
One time, after working with a customer for three weeks to design and implement a proof-of-concept data ingest pipeline, the customer’s chief architect told us:
“You know, I really like the design. I like how data is validated on arrival. I like how we store the raw data to allow for exploratory analysis while giving the business analysts pre-computed aggregates for faster response times. I like how we automatically handle data that arrives late and changes to the data structure or algorithms.”
“But,” he continued, “I really wish there was a real-time component here. There is a one-hour delay between the point when data is collected and the point when it’s available in our dashboards. I understand that this is to improve efficiency and protect us from unclean data. But for some of our use cases, being able to react immediately to new data is more important than being 100% certain of data validity.”
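The trade-off described here, validated batch aggregates versus immediate but unverified results, is the one a lambda architecture is meant to address: a batch layer periodically recomputes trusted views from the raw data, a speed layer counts new events as they arrive, and queries merge the two. The following is a minimal in-memory sketch of that idea, not the pipeline from the story; the class name, the event shape (a dict with a "page" key), and the validation rule are all hypothetical.

```python
from collections import Counter


def is_valid(event):
    """Hypothetical validation rule: the event must name a non-empty page."""
    page = event.get("page")
    return isinstance(page, str) and page != ""


class LambdaView:
    """Page-view counts served from a batch view plus a real-time view."""

    def __init__(self):
        self.raw_events = []         # immutable master dataset (append-only)
        self.batch_view = Counter()  # validated, recomputed periodically
        self.speed_view = Counter()  # unvalidated, updated per event

    def ingest(self, event):
        """Store the raw event and count it immediately, without validation."""
        self.raw_events.append(event)
        self.speed_view[event.get("page", "")] += 1

    def run_batch(self):
        """Recompute the batch view from all raw data and reset the speed view.

        Simplification: assumes no events arrive while the batch job runs.
        """
        self.batch_view = Counter(
            e["page"] for e in self.raw_events if is_valid(e)
        )
        self.speed_view.clear()

    def query(self, page):
        """Merge the validated batch count with the fresh real-time count."""
        return self.batch_view[page] + self.speed_view[page]


if __name__ == "__main__":
    view = LambdaView()
    view.ingest({"page": "/home"})
    view.ingest({"page": ""})      # invalid, but visible until the next batch
    print(view.query("/home"), view.query(""))
    view.run_batch()               # invalid event is dropped from the view
    print(view.query("/home"), view.query(""))
```

The design choice to note is that the speed layer is allowed to be temporarily wrong: each batch run replaces its contribution with validated counts, so errors from unclean data do not accumulate.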
Fast and optimized pages lead to higher visitor engagement, retention, and conversions. The PageSpeed family of tools is designed to help you optimize the performance of your website. PageSpeed Insights products will help you identify performance best practices that can be applied to your site, and PageSpeed optimization tools can help you automate the process.
Apache Storm is a distributed, fault-tolerant, and scalable platform for processing streaming data, supporting real-time analytics and machine learning.
On September 17, the Apache Software Foundation (ASF) voted to graduate Apache Storm to a top-level project (TLP). This is a major step forward for the project and reflects the momentum built by a broad community of developers from not only Hortonworks, but also Yahoo!, Alibaba, Twitter, Microsoft and many other companies.
We are happy to announce the availability of our latest and most advanced SDK for OpenCL. Release 2 of Intel® SDK for OpenCL 2014 is the industry’s first SDK to provide an OpenCL 2.0 development environment with the new Intel® Core™ M Processors.
This major advance in graphics programmability and accessibility will help you make greater use of the graphics engine to deliver new experiences on Intel-based platforms.
Website of Gaston Sanchez, statistical programmer and analytics engineer.
The Web is a vast source of data and information. But the amount of content and its exponential growth surpass our capacity to handle it analytically. We hardly know what to do with it, let alone how to make sense of it. Most of us are on the same quest for our holy grails: gaining insight and learning from data. However, in order to start crunching numbers, building models, and training machines to learn, we must first have access to the data itself.