Friday, October 23, 2009

Conjugate Gradient and OpenCL

I've just finished a conjugate gradient implementation for OpenCL. It does not perform well yet, but I'm working on fixing the bugs and optimizing the code.

Here is a PDF that gives an overview of the subject.
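For readers who have not met the method, here is a minimal CPU reference sketch of conjugate gradient in plain Python. It is not the OpenCL code mentioned above, just the textbook iteration for a symmetric positive-definite system A x = b. The function names (dot, matvec, conjugate_gradient) and the tol and max_iter parameters are my own choices for illustration. The dot products, the matrix-vector product and the vector updates are the parts one would typically turn into GPU kernels.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Dense matrix-vector product; A is a list of rows."""
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive-definite A, starting from x = 0."""
    n = len(b)
    if max_iter is None:
        max_iter = 10 * n
    x = [0.0] * n
    r = list(b)          # residual b - A x, which equals b when x = 0
    p = list(r)          # first search direction is the residual
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:      # residual norm small enough: done
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)  # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old       # makes the next direction A-conjugate
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print(conjugate_gradient(A, b))
```

A small reference like this is also handy for checking the output of a GPU version on tiny systems before worrying about speed.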

Friday, October 16, 2009

Nvidia's Next Generation: Fermi - key architectural highlights

Third Generation Streaming Multiprocessor (SM)
  • 32 CUDA cores per SM, 4x over GT200
  • 8x the peak double precision floating point performance over GT200
  • Dual Warp Scheduler simultaneously schedules and dispatches instructions from two independent warps
  • 64 KB of RAM with a configurable partitioning of shared memory and L1 cache

Second Generation Parallel Thread Execution ISA
  • Unified Address Space with Full C++ Support
  • Optimized for OpenCL and DirectCompute
  • Full IEEE 754-2008 32-bit and 64-bit precision
  • Full 32-bit integer path with 64-bit extensions
  • Memory access instructions to support transition to 64-bit addressing
  • Improved Performance through Predication

Improved Memory Subsystem
  • NVIDIA Parallel DataCache™ hierarchy with Configurable L1 and Unified L2 Caches
  • First GPU with ECC memory support
  • Greatly improved atomic memory operation performance

NVIDIA GigaThread™ Engine
  • 10x faster application context switching
  • Concurrent kernel execution
  • Out of Order thread block execution
  • Dual overlapped memory transfer engines
More information: http://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf

Thursday, October 15, 2009

ATI Stream Software Development Kit (SDK) v2.0 Beta Program

What’s New in v2.0-beta4

  • First beta release of ATI Stream SDK with OpenCL™ GPU support.
  • ATI Stream SDK v2.0 OpenCL™ is certified OpenCL™ 1.0 conformant by Khronos.
  • Added Microsoft® Windows® 7 support.
  • Added native Microsoft® Windows® 64-bit support.
  • Float comparisons in kernels no longer produce a runtime error.
  • Various other issues from previous v2.0 beta releases have been resolved.
More information: http://developer.amd.com/GPU/ATISTREAMSDKBETAPROGRAM/Pages/default.aspx

Thursday, October 8, 2009

OpenCL BLAS - Makefile for MacOS

Thanks to Mario Rometsch for contributing a MacOS version of the OpenCL BLAS Makefile. You can download it from SourceForge.

OpenCL BLAS Makefile for MacOS