Thursday, 3 January 2013

Let's Do Some Digital Signal Processing (Part 1) - Dot Product

Whenever I am asked to program a new DSP processor, or even write DSP algorithms on general purpose processors, (there isn't much difference these days, from a programming perspective) I always start with the basics. I install the development tools and the simulator then start hacking two simple algorithms :
Dot Product
One-Pole Filter
The nice thing about these functions is that they are really simple but are great for exercising the optimization capabilities of the C compiler. For this blog we'll look at the dot product, we'll come to the one-pole filter later.
The dot product performs the following mathematical operation on two data sets :
In otherwise, it successively multiplies each of the values in one array by each of the values in the other array, as follows :
Here is the code we will compile. This code should run on pretty much any processor with a C compiler.


#include <stdio.h>

#define COUNT 48 /* The filter size must be modulo 8 */
#define USE_ARRAY_INDEX 1 /* Set to '1' to use array indexing or '0' to use pointers */

float a[] = {
1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0,
11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 20.0,
21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 29.0, 30.0,
31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 39.0, 40.0,
41.0, 42.0, 43.0, 44.0, 45.0, 46.0, 47.0, 48.0
};
float b[] = {
41.0, 42.0, 43.0, 44.0, 45.0, 46.0, 47.0, 48.0, 49.0, 50.0,
51.0, 52.0, 53.0, 54.0, 55.0, 56.0, 57.0, 58.0, 59.0, 60.0,
61.0, 62.0, 63.0, 64.0, 65.0, 66.0, 67.0, 68.0, 69.0, 70.0,
71.0, 72.0, 73.0, 74.0, 75.0, 76.0, 77.0, 78.0, 79.0, 80.0,
81.0, 82.0, 83.0, 84.0, 85.0, 86.0, 87.0, 88.0
};

float dotp (const float *, const float *, const int count);

void main (void)

{
float sum;
sum = dotp (a, b, COUNT);
printf ("Dot product = %f\n", sum);
}


float dotp (const float *x, const float *y, const int count)
{
int i;
float sum = 0.0;
for (i = 0; i < count; i++)
{
#if USE_ARRAY_INDEX
sum += x[i] * y[i];
#else
sum += *x++ * *y++;
#endif
}
return sum;
}

If you try to compile this code on an embedded device then the compiler might not support stdio so you will have to remove the #include and the printf lines.

You will notice that within the code there is some conditional compilation (#if USE_ARRAY_INDEX) that selects whether to use arrays and indexes or pointers. When I was taught to program C, as an undergraduate, I was told unambiguously "always use pointers because they are quicker than arrays and indices" this isn't always the case and I might look into why this is the case in a future blog.

For a simple function like the dot-product it is safe to assume that most modern compilers will optimize either form of this C code as efficiently as if we had written it in assembly code but this function is the starting point for creating more complex functions.

So if most compilers will give the same level of optimization with both pointers and arrays/indices then why do I include both ? I have found that it is a good habit to get into because when working with more complex functions it becomes increasingly difficult to predict which method will give the best results. On modern VLIW/SIMD machines I have often been surprised to find that one method may give near 100% optimized code but the other is very slow. So for very little effort on my behalf I always try both methods and see which works best.

If all goes well, when you compile and run this example then you should see the following output :
Dot product = 85064.000000

We will look at how to benchmark algorithms in a later blog.

No comments:

Post a Comment