Intel Xeon Phi coprocessors allow symmetric heterogeneous clustering models, in which MPI processes are run fully on coprocessors, as opposed to offload-based clustering. These symmetric models are attractive, because they allow effortless porting of CPU-based applications to clusters with manycore computing accelerators.
Edison is Intel's smallest computer yet, and is intended for use in small, flexible electronics that can be worn around the body. The computer has Intel's extremely low-power Quark processor, and Bluetooth and Wi-Fi wireless connectivity to communicate with other devices.
CMake is a cross-platform, open-source build system. Cross-compilation requires a special “toolchain” file that defines all the tools (compiler, linker, libraries, etc.) needed to build an application. To invoke CMake with the toolchain file, use the following command-line option:
Along with the rise of General Purpose computing on Graphics Processing Units (GPGPU), GPUs themselves are evolving rapidly from fixed-function rasterization engines to more general processors. Today, discrete GPUs are typically connected to the CPU via the PCI Express* (PCIe) bus, which significantly limits the data transfer rate between the devices. Explicit boundaries between memory spaces and hierarchies, together with high-latency synchronization between devices, result in quite a coarse-grained level of abstraction. Most OpenCL workloads today target the GPU only, leaving the CPU mainly to handle scheduling, file and network I/O, and other “host” types of orchestration. In this approach, the cost of PCIe transfers can be prohibitive when tasks are small and the transfer time is not amortized well by the GPU's execution speed.
This document is a design and coding guide for developing high-performance OpenCL applications for the Intel® Xeon Phi™ coprocessor. It takes you from the Intel Xeon Phi coprocessor architecture and microarchitecture through the key OpenCL constructs, and shows you how to use them efficiently to make the best use of the coprocessor hardware. Since exploiting hardware parallelism is essential for high-performance applications, we will show you how to improve the parallelism of your OpenCL application on the Intel Xeon Phi coprocessor. With this knowledge, you will be ready to design and program your application to perform best on the Intel Xeon Phi coprocessor through OpenCL.
This document assists the user in optimizing applications on the Intel® Xeon Phi™ Coprocessor. It is intended for use with the Intel® VTune™ Amplifier XE performance profiler. It gives an architectural overview and details about which events and metrics to use to analyze performance, along with tuning suggestions.
This whitepaper is a transcription of George Chrysos’ presentation at the Hot Chips conference held in September 2012, covering details about the Intel® Xeon Phi™ coprocessor. Note that the results quoted in this paper were measured in development labs at Intel Corporation on prototype hardware and systems.
This document is designed to help users get started writing code and running Message Passing Interface (MPI) applications (using the Intel® MPI library) on a development platform that includes the Intel® Xeon Phi™ Coprocessor.
The new Intel Xeon processor E7 v2 product family is designed to make data more valuable for your business through in-memory computing – one of the more recent advances in data management and analytic solutions, which stores the entire data set in main memory rather than traditional hard disk storage. In-memory database and analytics solutions enable significant performance gains in analyzing complex and diverse datasets. We’re talking about analysis in seconds or minutes rather than hours or days. This is how you get to real-time insight.
The OpenMP (Open Multi-Processing) specification is a standard for a set of compiler directives, library routines, and environment variables that can be used to specify shared memory parallelism in Fortran and C/C++ programs.
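As a minimal sketch of the compiler-directive part of the specification, the C++ function below computes a dot product with a work-sharing loop and a reduction clause. The function name `parallel_dot` is hypothetical and chosen only for illustration; it does not come from any of the documents summarized here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: dot product of two equally sized vectors.
// The pragma asks the compiler to split the loop iterations across threads
// and to combine each thread's partial sum into `total`.
// If OpenMP is not enabled at compile time, the pragma is ignored and the
// loop simply runs serially.
double parallel_dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());  // precondition: same length
    const long n = static_cast<long>(a.size());
    double total = 0.0;
    #pragma omp parallel for reduction(+ : total)
    for (long i = 0; i < n; ++i) {
        const std::size_t k = static_cast<std::size_t>(i);
        total += a[k] * b[k];
    }
    return total;
}
```

With GCC, or with a Clang build that supports OpenMP, the directive is typically enabled with the `-fopenmp` flag. Because floating-point addition is not associative, a threaded run may differ from a serial run in the last bits for general inputs.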
This project implements OpenMP support in the Clang C language family front-end for the LLVM compiler. The current scope of the project is to support the OpenMP 3.1 specification.
The flow graph feature available in Intel® Threading Building Blocks (Intel® TBB) allows users to easily create both dependence graphs and reactive, message-passing graphs that execute on top of Intel TBB tasks. Users programmatically create nodes and edges that express the computations performed by their application and the dependencies between these computations.
Intel® SDK for OpenCL* Applications 2013 is a comprehensive software development environment for OpenCL applications on the 3rd and the future 4th Generation Intel® Core™ processors, which support OpenCL 1.2 on Windows 7* and Windows 8* operating systems.
As with most computing systems, the Intel® Many Integrated Core Architecture programming model can be divided into two categories: application programming and system programming.
In this guide, application programming refers to developing user applications or codes using either the Intel® Composer XE 2013 or third-party software development tools. These tools typically provide a development environment that includes compilers, libraries, and assorted other tools.
The Intel® Composer XE 2013 for Linux* includes the Intel® Debugger, which provides several approaches to analyzing and tracking down coding issues in heterogeneous applications that run on a host system and create offload processes on the Intel® Xeon Phi™ Coprocessor. In addition, the Intel® Debugger can debug code that runs only on the coprocessor.
Monte Carlo methods use statistical computing to solve complex scientific computing problems. They use random numbers to simulate the uncertainty of a problem's inputs and rely on a computer to sample the parameters repeatedly, solving problems for which a deterministic result is otherwise impossible to obtain. The method was pioneered by nuclear physicists involved in the Manhattan Project in the late 1940s. It is named after the biggest casino in the principality of Monaco.
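The repeated-sampling idea can be illustrated with a textbook example that is not taken from the original article: estimating pi by drawing random points in the unit square and counting how many land inside the quarter circle of radius 1. The function name `estimate_pi` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical helper: Monte Carlo estimate of pi.
// Each sample is a point (x, y) drawn uniformly from [0, 1) x [0, 1).
// The fraction of points with x^2 + y^2 <= 1 approximates pi / 4.
double estimate_pi(std::uint64_t samples, std::uint32_t seed) {
    assert(samples > 0);  // at least one sample is needed for a ratio
    std::mt19937_64 rng(seed);  // fixed seed keeps a run repeatable
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    std::uint64_t inside = 0;
    for (std::uint64_t i = 0; i < samples; ++i) {
        const double x = unit(rng);
        const double y = unit(rng);
        if (x * x + y * y <= 1.0) {
            ++inside;
        }
    }
    return 4.0 * static_cast<double>(inside) / static_cast<double>(samples);
}
```

A call such as `estimate_pi(1000000, 42)` returns a value close to pi. The statistical error shrinks roughly with the inverse square root of the number of samples, which is why such workloads benefit from hardware that can process many samples in parallel.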