C++14 is the informal name for the 2014 revision of the C++ ISO/IEC standard, formally "International Standard ISO/IEC 14882:2014(E) Programming Language C++". C++14 is intended to be a small extension of C++11, featuring mainly bug fixes and small improvements.
Chinese tech company Baidu has yet to make its popular search engine and other web services available in English. But consider yourself warned: Baidu could someday wind up becoming a favorite among consumers.
The developers of Apache Spark have given thoughtful consideration to Python as a language of choice for data analysis. They have developed the PySpark API for working with RDDs in Python, and they also support using the powerful IPython shell instead of the built-in Python REPL.
This site describes the rationale for the Wyvern programming language targeted at potential users of the language. It will grow to include a specification for Wyvern as well. The Wyvern programming language is a new language designed to smooth the use of internal DSLs in the form of type-specific languages.
Coprocessors and a neural network supercomputer may follow.
A team of scientists at Cornell University and IBM Research have gotten together to design a chip that's fundamentally different: an asynchronous collection of thousands of small processing cores, each capable of the erratic spikes of activity and complicated connections that are typical of neural behavior. When hosting a neural network, the chip is remarkably power efficient. And the researchers say their architecture can scale arbitrarily large, raising the prospect of a neural network supercomputer.
In 2012, an organization called the Internet Engineering Task Force (IETF) created a new email standard that supports addresses with non-Latin and accented Latin characters (e.g. 武＠メール.グーグル). In order for this standard to become a reality, every email provider and every website that asks you for your email address must adopt it. That’s obviously a tough hill to climb. The technology is there, but someone has to take the first step.
At Tumblr, blogs (or tumblelogs) are among our most highly trafficked faces on the internet. One of the most convenient aspects of tumblelogs is their highly cacheable nature, which is fantastic because of the high views-per-post ratio the Tumblr network offers our users. That said, it's not entirely trivial to scale out the perimeter proxy tier, let alone the caching tier, necessary for serving all of those requests.
In this post I'll show how we developed a fairly simple architecture based on HAProxy, PHP, Redis and MySQL that seamlessly handles approximately 1 billion requests every week. I'll also note possible ways of scaling it out further and point out uncommon patterns that are specific to this project.
We make sense of the world around us by turning data into information. For years, researchers in fields such as machine learning (ML), data mining, databases, information retrieval, natural language processing, and speech recognition have steadily improved their techniques for revealing the information lying within otherwise opaque datasets. But computer science is now on the verge of a new era in data analysis because of several recent developments, including the rise of the warehouse-scale computer (WSC), the massive explosion in online data, the increasing diversity and time-sensitivity of queries, and the advent of crowdsourcing. Together these trends, often referred to collectively as Big Data, have the potential to usher in a new era in data analysis, but realizing this opportunity requires us to confront several significant scientific challenges:
Continuous data streams are ubiquitous and represent such a high volume of data that they cannot be stored to disk, yet it is often crucial for them to be analyzed in real-time. Stream processing is a programming paradigm that processes these immediately, and enables continuous analytics. Our objective is to make it easier for analysts, with little programming experience, to develop continuous analytics applications directly. We propose enhancing a spreadsheet, a pervasive tool, to obtain a programming platform for stream processing. We present the design and implementation of an enhanced spreadsheet that enables visualizing live streams, live programming to compute new streams, and exporting computations to be run on a server where they can be shared with other users, and persisted beyond the life of the spreadsheet. We formalize our core language, and present case studies that cover a range of stream processing applications.
Khronos released OpenCL SPIR 1.2 as a provisional specification and kept it provisional for a protracted period to solicit feedback on this first version of the standard. Khronos finalized OpenCL SPIR 1.2 in early 2014 and has since been working on building up its developer and user bases for SPIR.
Monitoring migrations is not an easy task. While in today's economy, survey data about economic confidence or public opinion are collected on a daily basis, that is not the case for migration statistics, which come from Censuses, population registers, and, occasionally, ad-hoc surveys--and are often outdated and inconsistent across countries. [...]
Atomic Scala (Sample Chapters) by Bruce Eckel and Dianne Marsh
This should be your first Scala book, not your last. We show you enough to become familiar and comfortable with the language – competent, but not expert. You’ll write useful Scala code, but you won’t necessarily be able to read all the Scala code you encounter.
Mesa is a highly scalable analytic data warehousing system that stores critical measurement data related to Google's Internet advertising business. Mesa is designed to satisfy a complex and challenging set of user and systems requirements, including near real-time data ingestion and queryability, as well as high availability, reliability, fault tolerance, and scalability for large data and query volumes. Specifically, Mesa handles petabytes of data, processes millions of row updates per second, and serves billions of queries that fetch trillions of rows per day. Mesa is geo-replicated across multiple datacenters and provides consistent and repeatable query answers at low latency, even when an entire datacenter fails. This paper presents the Mesa system and reports the performance and scale that it achieves.
This article guides you in the installation of the new generation Hadoop based on YARN. It is based on the most recent version of Hadoop at the time of this writing (2.2.0) and includes HDFS, YARN and MapReduce configurations for both single-node and cluster environments.
AMPLab, the University of California, Berkeley, research group responsible for making Spark a household name in big data, has a lot more tricks up its sleeve. They range from databases to machine learning, and even include tools that could help treat cancer.