opencl, opengl, webcl, webgl
Scooped by Mikael Bourges-Sevenier

GPU Pro Tip: Fast Dynamic Indexing of Private Arrays in CUDA

Sometimes you need to use small per-thread arrays in your GPU kernels. The performance of accessing elements in these arrays can vary depending on a number of factors. In this post I'll cover sever...

How OpenCL Could Open the Gates for FPGAs

"The silver bullet in HLS is the ability to take a sequential description that has been written in C and then find this parallelism, the concurrencies, without the user having to think. That was a ...

Tutorial on the OpenCL 2.0 Generic Address Space - TechEnablement

The OpenCL 2.0 generic address space makes writing OpenCL apps easier by removing the requirement to decorate every pointer with the address space it points to.

Analysis and Modeling of the Timing Behavior of GPU Architectures

Graphics processing units (GPUs) offer massive parallelism. For some years now, GPUs have also been usable for more general-purpose applications; a wide variety of applications can be accelerated ef...

Fast Subgraph Matching on Large Graphs using Graphics Processors

Subgraph matching is the task of finding all matches of a query graph in a large data graph, which is known as an NP-complete problem. Many algorithms are proposed to solve this problem using CPUs....

Is the CPU slowly turning into a GPU? - Blog - StreamComputing


Comparison of OpenCL performance on different platforms using VexCL and Blaze

This technical report provides performance numbers for several benchmark problems running on several different hardware platforms. The goal of this report is twofold. First, it helps us better unde...

Unlocking Bandwidth for GPUs in CC-NUMA Systems

Historically, GPU-based HPC applications have had a substantial memory bandwidth advantage over CPU-based workloads due to using GDDR rather than DDR memory. However, past GPUs required a restricte...

Persistent Mapped Buffers in OpenGL - CodeProject

Summary of techniques for streaming data from CPU to GPU in OpenGL, focusing on a new method called persistent mapped buffers. Author: Bartlomiej Filipek; Updated: 3 Feb 2015; Section: OpenGL; Chapter: Multimedia...

Exploiting Concurrency Patterns with Heterogeneous Task and Data Parallelism

Parallel programming of an application requires not only domain knowledge of the application, but also programming environment support and in-depth awareness of the target architecture. Often, all ...

Locality-aware parallel block-sparse matrix-matrix multiplication using the Chunks and Tasks programming model

We present a library for parallel block-sparse matrix-matrix multiplication on distributed memory clusters. The library is based on the Chunks and Tasks programming model [Parallel Comput. 40, 328 ...

Characterizing and Enhancing Global Memory Data Coalescing on GPUs

Effective parallel programming for GPUs requires careful attention to several factors, including ensuring coalesced access of data from global memory. There is a need for tools that can provide fee...

A Real-time GPU Implementation of the SIFT Algorithm for Large-Scale Video Analysis Tasks

The SIFT algorithm is one of the most popular feature extraction methods and is therefore widely used in all sorts of video analysis tasks, such as instance search and duplicate/near-duplicate detection. W...

Accelerating Bioinformatics with NVBIO

NVBIO is an open-source C++ template library of high performance parallel algorithms and containers designed by NVIDIA to accelerate sequence analysis and bioinformatics applications. NVBIO has a t...

Patterns and Rewrite Rules for Systematic Code Generation (From High-Level Functional Patterns to High-Performance OpenCL Code)

Computing systems have become increasingly complex with the emergence of heterogeneous hardware combining multicore CPUs and GPUs. These parallel systems exhibit tremendous computational power at t...

GPU-accelerated HMM for Speech Recognition

Speech recognition is used in a wide range of applications and devices such as mobile phones, in-car entertainment systems and web-based services. Hidden Markov Models (HMMs) are one of the most pop...

Speech Recognition on Modern Graphic Processing Units

Speech recognition running on graphics processing units (GPUs) has shown promising performance improvements, with speedups ranging from 2x to 10x compared to execution on CPUs. GPUs have continued to introduce ...

Intel Posts OpenCL 2.0 QuickSort Tutorial (Compare to TE CUDA Version) - TechEnablement

Intel Engineer Robert Ioffe has posted an OpenCL QuickSort tutorial that utilizes nested parallelism and Workgroup-scan functions. In particular, the tutorial shows how to use the OpenCL™ 2.0 enqueue_kernel functions that queue kernels from the device...

Video: How to Build a Cheap Supercomputer

In this video, Rasim Muratovic shows you how to build a cheap supercomputer using Raspberry Pi devices. In related news, the $35 Raspberry Pi 2 is out with a faster processor and twice the mem...

Cryptography on Graphics Processing Unit: A Survey

Cryptography is the practice of protecting information by transforming it into an unreadable format called ciphertext; only those who possess a secret key can decipher the message into plaintext...

Using OpenCL 1.2 with 1.1 devices - Blog - StreamComputing


Scaling Recurrent Neural Network Language Models

This paper investigates the scaling properties of Recurrent Neural Network Language Models (RNNLMs). We discuss how to train very large RNNs on GPUs and address the questions of how RNNLMs scale wi...

Pointer Analysis for Semi-Automatic Code Parallelizers

Code parallelizers are employed these days to reduce the effort needed to manually parallelize sequential code. But they are ineffective when it comes to handling programming constructs like poi...

Reliable Initialization of GPU-enabled Parallel Stochastic Simulations Using Mersenne Twister for Graphics Processors

Parallel stochastic simulations tend to exploit more and more computing power, and they are now also developed for General-Purpose Graphics Processing Units (GP-GPUs). Consequently, they need reliable ...