opencl, opengl, w...
Scooped by Mikael Bourges-Sevenier

Performance Comparison of GPU, DSP and FPGA implementations of image processing and computer vision algorithms in embedded systems

The objective of this thesis is to compare the suitability of FPGAs, GPUs and DSPs for digital image processing applications. Normalized cross-correlation is used as a benchmark, because this algor...

High-Performance GPGPU Programming with OCaml

We present an OCaml GPGPU library with a DSL embedded into OCaml to express GPGPU kernels. The level of performance achieved is measured through different examples. We also discuss the use of GPGPU...

Dandelion: a Compiler and Runtime for Heterogeneous Systems

Computer systems increasingly rely on heterogeneity to achieve greater performance, scalability and energy efficiency. Because heterogeneous systems typically comprise multiple execution contexts w...

Coupling a Generalized DEM and an SPH Models Under a Heterogeneous Massively Parallel Framework

The interaction of flows and solid objects is a recurring problem in several engineering disciplines. The objective of this work is to present a fully coupled model, based on the fundamental conser...

A Parallel Intermediate Representation for Embedded Languages

This thesis presents a parallel intermediate representation for embedded languages called PIRE, and its incorporation into the Feldspar language. The original Feldspar backend translates the parall...

GALAMOST: GPU-accelerated large-scale molecular simulation toolkit

A new molecular simulation toolkit composed of some lately developed force fields and specified models is presented to study the self-assembly, phase transition, and other properties of polymeric s...

Performance Analysis of a Large Memory Application on Multiple Architectures

The Graph500 Breadth-First Search benchmark has emerged as a well-documented PGAS-style application that both scales to large data set sizes and has documented implementations on multiple platforms...

Parametric GPU Code Generation for Affine Loop Programs

Partitioning a parallel computation into finitely sized chunks for effective mapping onto a parallel machine is a critical concern for source-to-source compilation. In the context of OpenCL and CUD...

Cudagrind: A Valgrind Extension for CUDA

Valgrind, and specifically the included tool Memcheck, offers an easy and reliable way for checking the correctness of memory operations in programs. This works in an unintrusive way where Valgrind...

Facial Expression Recognition - Review

Expression recognition (happy, sad, disgust, surprise, angry, fear expressions) is an application of advanced object detection, pattern recognition and classification tasks. Facial expression recogniti...

Compiler Optimizations for SIMD/GPU/Multicore Architectures

In modern computer architectures, both SIMD (single-instruction multiple-data) instruction set extensions and GPUs can be used to accelerate the general purpose applications. In addition, the multi...

3D Non-Local Means denoising via multi-GPU

The Non-Local Means (NLM) algorithm is widely considered a state-of-the-art denoising filter in many research fields. High computational complexity led to implementations on Graphic Processor Unit (...

Characterizing the Challenges and Evaluating the Efficacy of a CUDA-to-OpenCL Translator

The proliferation of heterogeneous computing systems has led to increased interest in parallel architectures and their associated programming models. One of the most promising models for heterogene...

Automatic run-time mapping of polyhedral computations to heterogeneous devices with memory-size restrictions

Tools that aim to automatically map parallel computations to heterogeneous and hierarchical systems try to divide the whole computation in parts with computational loads adjusted to the capabilitie...

High performance sequence mining using pairwise statistical significance

With the deluge of sequence data resulting from next-generation sequencing, there comes a need to leverage large-scale biological sequence data. Therefore, the role of high performance c...

Direct deconvolution of radio synthesis images using L1 minimisation

We introduce an algorithm for the deconvolution of radio synthesis images that accounts for the non-coplanar-baseline effect, allows multiscale reconstruction onto arbitrarily positioned pixel grid...

[Phoronix] Intel Cilk Plus Multi-Threading Support Going Into GCC

Phoronix is the leading technology website for Linux hardware reviews, open-source news, Linux benchmarks, open-source benchmarks, distribution screenshots, interviews, and computer hardware tests.

Parallel and Distributed Implementations of Multiple and Two-Dimensional Pattern Matching Algorithms

String matching is a fundamental problem in the area of scientific computing. When two different one-dimensional strings are taken as an input, the so called "input string" and the so cal...

GPU Based Generation and Real-Time Rendering of Semi-Procedural Terrain Using Features

Generation and real-time rendering of terrain is a complex and multifaceted problem. Besides the obvious trade-offs between performance and quality, many different generation and rendering solution...

Advanced Optimization Techniques for Sparse Grids on Modern Heterogeneous Systems

GPU based heterogeneous systems provide a peak performance in the order of TFlop/s and an advantageous ratio between performance and energy consumption. However, reaching high performance on GPUs i...

Parallel Computing Using GPU for Efficient Traffic Simulation

Parallel Computing can be made possible using the multiple cores of the Graphics Processing Unit (GPU) thanks to the modern programmable GPU models. This allows the use of parallel computing techni...

Performance Portability Evaluation for OpenACC on Intel Knights Corner and Nvidia Kepler

OpenACC is a programming standard designed to simplify heterogeneous parallel programming by using directives. Since OpenACC can generate OpenCL and CUDA code, meanwhile running OpenCL on Intel Kni...

GPU-accelerated triangle-triangle intersection tester algorithm

The goal of the project is to develop a triangle-triangle collision algorithm. A reference triangle is given as well as a variably-sized array of many other triangles. The algorithm must check if o...