Practice problem: Dimension-Independent Finite Difference Kernel

A particular type of problem that is often tricky to address with a single codebase is handling varying dimensionality, i.e. for example handling 1D, 2D and 3D cases in a single code. In this problem, we will practice that for a simple finite difference code that applies a second-order centered finite difference operator:

$$ f''(x) \approx \frac{f(x+h) - 2 f(x) + f(x-h)}{h^{2}} $$

along each axis, summing the results. This implements an $n$-dimensional Laplacian ($\triangle$) or div-grad operator.

To keep things simple, we will not worry about boundary conditions. Also, to keep things simple, we will assume that we have exactly 20 data points in each direction, and we will assume the grid spacing $h$ is 1.

In [26]:
from mako.template import Template
import pyopencl as cl
import pyopencl.array
import pyopencl.clrandom
import numpy as np
In [27]:
tpl = Template("""
    kernel void fdiff(global float *out, global float * ary)
    {
        out[...] = ...
    }
    """, strict_undefined=True)
In [36]:
# Solution



tpl = Template("""

    kernel void fdiff(global float *out, global float * ary)

    {

        int ibase = 

            <% stride = 1 %>

            %for iax in range(dim):  

                + (get_global_id(${iax}) + 1)*${stride}

                <% stride *= 20 %>

            %endfor

            ; 

        

        out[ibase] = -2*${dim}*ary[ibase]

            <% stride = 1 %>

            %for iax in range(dim):  

                + ary[ibase - ${stride}] + ary[ibase + ${stride}]

                <% stride *= 20 %>

            %endfor

            ;

    }

    """, strict_undefined=True)
In [37]:
dim = 3
code = tpl.render(dim=dim)
print(code)

cl_context = cl.create_some_context()
queue = cl.CommandQueue(cl_context)

prg = cl.Program(cl_context, code).build()
knl = prg.fdiff

a = cl.clrandom.rand(queue, (20,)*dim, dtype=np.float32)
out = cl.array.empty_like(a)

knl(queue, (18,)*dim, None, out.data, a.data)
queue.finish()
    kernel void fdiff(global float *out, global float * ary)
    {
        int ibase = 
            
                + (get_global_id(0) + 1)*1
                
                + (get_global_id(1) + 1)*20
                
                + (get_global_id(2) + 1)*400
                
            ; 
        
        out[ibase] = -2*3*ary[ibase]
            
                + ary[ibase - 1] + ary[ibase + 1]
                
                + ary[ibase - 20] + ary[ibase + 20]
                
                + ary[ibase - 400] + ary[ibase + 400]
                
            ;
    }
    
In [ ]: