PyOpenCL: Arrays

Setup code

In [1]:
import pyopencl as cl
import numpy as np
import numpy.linalg as la
In [2]:
a = np.random.rand(1024, 1024).astype(np.float32)
In [3]:
ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

Creating arrays

This notebook demonstrates working with PyOpenCL's arrays, which provide a friendlier (and more numpy-like) face on OpenCL's buffers. This is the module where they live:

In [4]:
import pyopencl.array

Now transfer the host array a to a device array.

In [5]:
a_dev = cl.array.to_device(queue, a)

The device array works like a numpy array: it has a shape, a dtype, and strides.

In [6]:
a_dev.shape
Out[6]:
(1024, 1024)
In [7]:
a_dev.dtype
Out[7]:
dtype('float32')
In [8]:
a_dev.strides
Out[8]:
(4096, 4)
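The strides are byte offsets, so they follow from the dtype and the shape. A minimal host-side sketch using plain numpy (no device needed), assuming a C-contiguous float32 array of the same shape as above:

```python
import numpy as np

# Same shape and dtype as the array in this notebook (contents do not matter here).
a = np.zeros((1024, 1024), dtype=np.float32)

itemsize = a.dtype.itemsize                   # 4 bytes per float32
expected = (a.shape[1] * itemsize, itemsize)  # step one row, step one column

print(a.strides, expected)
```

Stepping to the next row skips 1024 entries of 4 bytes each, hence 4096; stepping to the next column skips a single entry.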

Working with arrays

Goal: double all entries.

In [9]:
twice_a_dev = 2*a_dev

Easy to turn back into a numpy array.

In [10]:
twice_a = twice_a_dev.get()

Check!

In [12]:
# The norm of the difference is zero if the device result matches 2*a.
print(la.norm(twice_a - 2*a))
0.0

Can just print the array, too.

In [13]:
print(twice_a_dev)
[[ 0.45063514  1.25913811  0.10170967 ...,  1.38052452  0.45464128
   1.52395189]
 [ 1.65396667  1.72338498  0.80199856 ...,  0.64524269  1.00564325
   0.13862719]
 [ 1.36213672  0.0432039   1.37479746 ...,  0.5632273   1.13196242
   1.91092789]
 ..., 
 [ 1.36576688  1.31865883  0.89605123 ...,  0.63803053  1.20221436
   0.63805646]
 [ 1.50485742  0.4254683   1.32611811 ...,  1.25427306  0.31774497
   0.70451695]
 [ 1.26437032  1.26890802  1.6362865  ...,  1.93974745  0.34652343
   1.80210614]]

Easy to evaluate arbitrary (elementwise) expressions.

In [14]:
import pyopencl.clmath
In [15]:
cl.clmath.sin(a_dev)**2 - (1./a_dev) + 5
Out[15]:
array([[  0.61173439,   3.75829315, -14.66122818, ...,   3.95671225,
          0.65171814,   4.16420889],
       [  4.33232307,   4.41549158,   2.65859413, ...,   2.0009141 ,
          3.24345064,  -9.42238712],
       [  3.92814422, -41.29164886,   3.94786716, ...,   1.52626753,
          3.52071476,   4.62019348],
       ..., 
       [  3.93382311,   3.85857034,   2.95563579, ...,   1.96371865,
          3.65625668,   1.9638536 ],
       [  4.13802481,   0.34387445,   3.87071657, ...,   3.74981856,
         -1.26932764,   2.28021336],
       [  3.7673583 ,   3.77517986,   4.31044197, ...,   4.64925671,
         -0.7418952 ,   4.50481367]], dtype=float32)

Low-level Access

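You can still do everything manually, though: the array's data attribute is the underlying OpenCL buffer, which can be passed to a hand-written kernel.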

In [16]:
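prg = cl.Program(ctx, """
    __kernel void twice(__global float *a)
    {
      // One work-item per array entry, on a 2D grid.
      int gid0 = get_global_id(0);
      int gid1 = get_global_id(1);
      // Flat offset into the buffer; 1024 is the row length.
      int i = gid1 * 1024 + gid0;
      a[i] = 2*a[i];
    }
    """).build()
twice = prg.twice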
In [17]:
twice(queue, a_dev.shape, None, a_dev.data)
Out[17]:
<pyopencl.cffi_cl.Event at 0x7f156e9a8550>
In [19]:
print(la.norm(a_dev.get() - 2*a), la.norm(a))
0.0 591.081
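The kernel addresses the buffer with a flat offset of the form row * row_length + column. A small numpy sketch of that arithmetic, assuming C (row-major) storage; the names nrows, ncols, row, and col are illustrative and not part of the notebook:

```python
import numpy as np

nrows, ncols = 3, 5
a = np.arange(nrows * ncols, dtype=np.float32).reshape(nrows, ncols)

flat = a.ravel()          # the same entries, laid out row after row
row, col = 2, 3
i = row * ncols + col     # flat offset of entry (row, col)

print(i, flat[i], a[row, col])
```

The multiplier is the row length, which is why the kernel needs that number from somewhere.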

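But the hardcoded 1024 is inelegant, and it breaks for any other array width. So fix that by passing the row length to the kernel as an argument.

(This also requires setting the kernel's scalar argument dtypes.)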