opencl, opengl, w...
Scooped by Mikael Bourges-Sevenier
onto opencl, opengl, webcl, webgl
Scoop.it!

uBench: Performance Impact of CUDA Block Geometry | hgpu.org

Benchmarking, Computer science, CUDA, nVidia, nVidia GeForce GTX 480, nVidia GeForce GTX 680, Performance

Different Optimization Strategies and Performance Evaluation of Reduction on Multicore CUDA Architecture

The objective of this paper is to apply different optimization strategies on a multicore GPU architecture. For performance evaluation we have used a parallel reduction algorithm. GPU on-chip shared ...

On Vectorization of Deep Convolutional Neural Networks for Vision Tasks

We recently have witnessed many ground-breaking results in machine learning and computer vision, generated by using deep convolutional neural networks (CNN). While the success mainly stems from the...

Tangram: a High-level Language for Performance Portable Code Synthesis

We propose Tangram, a general-purpose high-level language that achieves high performance across architectures. In Tangram, a program is written by synthesizing elemental pieces of code snippets, ca...

Can Portability Improve Performance? An Empirical Study of Parallel Graph Analytics

Due to increasingly large datasets, graph analytics - traversals, all-pairs shortest path computations, centrality measures, etc. - are becoming the focus of high-performance computing (HPC). Becau...

GPU Pro Tip: CUDA 7 Streams Simplify Concurrency

Heterogeneous computing is about efficiently using all processors in the system, including CPUs and GPUs. To do this, applications must execute functions concurrently on multiple processors. CUDA A...

Global finite element matrix construction based on a CPU-GPU implementation

The finite element method (FEM) has several computational steps to numerically solve a particular problem, to which many efforts have been directed to accelerate the solution stage of the linear sy...

GPU concurrency: Weak behaviours and programming assumptions

Concurrency is pervasive and perplexing, particularly on graphics processing units (GPUs). Current specifications of languages and hardware are inconclusive; thus programmers often rely on folklore...

GPU-Quicksort in OpenCL 2.0: Nested Parallelism and Work-Group Scan Functions - CodeProject

This tutorial shows how to use two powerful features of OpenCL™ 2.0: enqueue_kernel functions that allow you to enqueue kernels from the device and work_group_scan_exclusive_add and work_group_scan_inclusive_add; Author: Android on Intel; Updated:...
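For readers unfamiliar with the two scan flavours named above, the following is a minimal sequential sketch in Python of what an inclusive and an exclusive add-scan compute over the values held by a work-group. It is not OpenCL code and is not taken from the linked tutorial; the function names `scan_inclusive_add` and `scan_exclusive_add` are hypothetical stand-ins for the semantics of `work_group_scan_inclusive_add` and `work_group_scan_exclusive_add`.

```python
def scan_inclusive_add(values):
    """Element i of the result is the sum of values[0..i]."""
    out = []
    running = 0
    for v in values:
        running += v
        out.append(running)
    return out


def scan_exclusive_add(values):
    """Element i of the result is the sum of values[0..i-1]; element 0 is 0."""
    out = []
    running = 0
    for v in values:
        out.append(running)
        running += v
    return out


# Hypothetical example: 1 marks an element that belongs in the "left" partition.
flags = [1, 0, 1, 1, 0]
# The exclusive scan gives each flagged element its output slot.
slots = scan_exclusive_add(flags)
```

An exclusive scan over 0/1 flags is a common way to compute output positions when compacting or partitioning data in parallel, which is one reason scans show up in GPU sorting code.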

Hybrid Multicore Algorithms for Some Semi-Numerical Applications and Graphs

The computing industry has undergone several paradigm shifts in the last few decades. Fueled by the need of faster computing, larger data and real time processing needs parallel computing has emerg...

Harnessing Aspect Oriented Programming on GPU: Application to Warp-Level Parallelism (WLP)

Stochastic simulations involve multiple replications in order to build confidence intervals for their results, and Designs Of Experiments (DOEs) to explore their parameters set. In this paper, we p...

Parallel Implementation of the Finite Element Method on Graphics Processors for the Solution of Incompressible Flows

In recent years clock speeds and memory bandwidths of Graphics Processing Units (GPUs) increased dramatically compared to CPUs. Also GPU vendors developed and freely released new programming tools ...

Parallel Algorithms for Counting Problems on Graphs Using Graphics Processing Units

The availability of Graphics Processing Units (GPUs) with multicore architecture has enabled parallel computations using extensive multi-threading. Recent advancements in computer hardware have le...

OpenCL Implementation of LiDAR Data Processing

When designing a safety system, the faster the response time, the greater the reflexes of the system to hazards. As more commercial interest in autonomous and assisted vehicles grows, the number on...

maxDNN: An Efficient Convolution Kernel for Deep Learning with Maxwell GPUs

This paper describes maxDNN, a computationally efficient convolution kernel for deep learning with the NVIDIA Maxwell GPU. maxDNN reaches 96.3% computational efficiency on typical deep learning net...

GPU computing architecture for irregular parallelism

Many applications with regular parallelism have been shown to benefit from using Graphics Processing Units (GPUs). However, employing GPUs for applications with irregular parallelism tends to be a ...

Gunrock: A High-Performance Graph Processing Library on the GPU

For large-scale graph analytics on the GPU, the irregularity of data access and control flow and the complexity of programming GPUs have been two significant challenges for developing a programmabl...

Taming the complexities of the C11 and OpenCL memory models

We study how the C11 memory model can be simplified and how it can be extended. Our first contribution is to propose a mild strengthening of the model that enables the rules pertaining to sequentia...

An investigation of fast real-time GPU-based image blur algorithms - CodeProject

In this blog post I'm going to start exploring the topic of blur filters. Author: Android on Intel; Updated: 21 Jan 2015; Section: Product Showcase; Chapter: Third Party Products...

A Novel Implementation of QuickHull Algorithm on the GPU

We present a novel GPU-accelerated implementation of the QuickHull algorithm for calculating convex hulls of planar point sets. We also describe a practical solution to demonstrate how to efficient...

Reproducible and Accurate Matrix Multiplication for GPU Accelerators

Due to non-associativity of floating-point operations and dynamic scheduling on parallel architectures, getting a bitwise reproducible floating-point result for multiple executions of the same code...
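The non-associativity mentioned above is easy to observe on any IEEE 754 machine. The short Python sketch below shows the same three numbers summed with two different groupings, and the same list summed in two different orders. It uses the standard library's `math.fsum` only as an illustration of an order-independent (correctly rounded) sum; this is an assumption chosen for the example and is not the method of the linked paper.

```python
import math

a, b, c = 0.1, 0.2, 0.3

# Same operands, different grouping: the rounded results differ.
left = (a + b) + c
right = a + (b + c)
grouping_matters = left != right

# Same values, different order, as a dynamic scheduler might produce.
values = [1e16, 1.0, -1e16, 1.0]
forward = sum(values)
backward = sum(reversed(values))

# math.fsum tracks exact partial sums, so its result does not depend on order.
exact_forward = math.fsum(values)
exact_backward = math.fsum(reversed(values))
```

When a parallel reduction combines partial sums in an order chosen at run time, it is effectively picking one of these groupings at random, which is why bitwise reproducibility needs extra care.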

Run length encoding in OpenCV [w/code] | More Than Technical

Code snippet for RLE calculation on OpenCV Mats
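As a reminder of what run-length encoding does, here is a minimal sketch in plain Python over a flat list of pixel values. It is not the linked snippet and does not use OpenCV; the names `rle_encode` and `rle_decode` are hypothetical, and a real image would first have to be flattened row by row.

```python
from itertools import groupby


def rle_encode(pixels):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


# Hypothetical single row of a binary mask.
row = [0, 0, 255, 255, 255, 0]
encoded = rle_encode(row)
```

The encoding pays off on masks and other images with long constant runs; on noisy data the pair list can be longer than the input.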

Accelerating mahout on heterogeneous clusters using HadoopCL

MapReduce is a programming model capable of processing massive data in parallel across hundreds of computing nodes in a cluster. It hides many of the complicated details of parallel computing and p...

Kriging Interpolation Exhibits Strong Scaling Across GPUs - TechEnablement

Geostatistical interpolation (Kriging) is very useful, and its computationally intensive code can be parallelized across multiple devices for 200x speedups.

GPU Processing for UAS-Based LFM-CW Stripmap SAR

Unmanned air systems (UAS) provide an excellent platform for synthetic aperture radar (SAR), enabling surveillance and research over areas too difficult, dangerous, or costly to reach using manned ...