Right now, the following BLAS level 1 functions are available:
sDOT :: single precision dot product or scalar product (dot<-x^T y)
sNRM2 :: single precision vector 2-norm
sSCAL :: single precision product of vector by scalar (x<-ax)
sAXPY :: single precision AXPY (y<-ax+y)
You can download the OpenCL code, which was tested on an NVIDIA Tesla C870 with GPU Computing SDK 2.3:
SourceForge Project
Please join in with your contributions!
Update: OpenCL BLAS is now a discontinued project.