The Numerix-DSP Blog

Saturday, 17 January 2015

Let's Do Some More Digital Signal Processing - Infinite Impulse Response (IIR) Filters

Following on from my earlier blog : Let's Do Some Digital Signal Processing (Part 2) - One Pole Filter I thought it was about time that I took a look at Infinite Impulse Response (IIR) filters.

IIR filters give much higher performance than Finite Impulse Response filters because they use feedback as well as feedforward in order to implement the filter response.

This code shows how to implement an IIR biquad filter structure.

#include <stdio.h>

#define COUNT 8 /* Number of samples to process */

#define FILTER_STAGES 1 /* Single 2nd ordrer biquad */

// Fs = 44.1 kHz, Fc = 1 kHz LPF
double LPFCoefficientArray [] =
{
3.43448050764769180000e-003, 6.86896101529538370000e-003, 3.43448050764769180000e-003,
-1.89857508921973460000e+000, 9.12313011250325380000e-001
};

float impulseSrc[] = {
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0
};
float stepSrc[] = {
1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0
};
float Dst[] = {
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0
};

double FilterState [2 * FILTER_STAGES];

void iir_init (double *pState,
const int NumberOfStages);

double iir_filter (const double Source,
double *pState,
const double *pCoeffs,
const int NumberOfStages);

void main (void)

{
int i;

iir_init (FilterState, FILTER_STAGES); /* Initialize the filter */
for (i = 0; i < COUNT; i++)
Dst[i] = iir_filter (impulseSrc[i], FilterState, LPFCoefficientArray, FILTER_STAGES);
printf ("One Pole Impulse = %f, %f, %f, %f, %f, %f, %f, %f\n", Dst[0], Dst[1], Dst[2], Dst[3], Dst[4], Dst[5], Dst[6], Dst[7]);

iir_init (FilterState, FILTER_STAGES); /* Initialize the filter */
for (i = 0; i < COUNT; i++)
Dst[i] = iir_filter (stepSrc[i], FilterState, LPFCoefficientArray, FILTER_STAGES);
printf ("One Pole Step = %f, %f, %f, %f, %f, %f, %f, %f\n", Dst[0], Dst[1], Dst[2], Dst[3], Dst[4], Dst[5], Dst[6], Dst[7]);

}

void iir_init (double *pState,
const int NumberOfStages)

{
int i;

/* Initialise the filter state array to 0 */
for (i = 0; i < (NumberOfStages * 2); i++)
{
*pState++ = 0.0;
}

} /* End of iir_init () */

double iir_filter (const double Source,
double *pState,
const double *pCoeffs,
const int NumberOfStages)

{
register double FeedbackSumOfProducts;
register int i;
register double TempInputData;

TempInputData = Source;

for (i = 0; i < NumberOfStages; i++)
{
FeedbackSumOfProducts = TempInputData - (*(pCoeffs+3) * *pState) -
(*(pCoeffs+4) * *(pState+1)); /* Feedback */
TempInputData = (*pCoeffs * FeedbackSumOfProducts) +
(*(pCoeffs + 1) * *pState) +
(*(pCoeffs + 2) * *(pState+1)); /* Feedforward and save result for next time round */

*(pState+1) = *pState; /* Move delayed samples */
*pState = FeedbackSumOfProducts;

pState += 2; /* Increment array pointers */
pCoeffs += 5;
}

return (TempInputData); /* Save output */

} /* End of iir_filter () */

The result of running this code is :
One Pole Impulse = 0.003434, 0.013390, 0.025722, 0.036620, 0.046059, 0.054038, 0.060575, 0.065706
One Pole Step = 0.003434, 0.016824, 0.042546, 0.079167, 0.125226, 0.179264, 0.239839, 0.305545

The filter implemented in this example is a low-pass filter with 1 kHz cut-off frequency and a sampling rate of 44.1 kHz (CD frequency). This can be observed by modifying the above code to generate a longer impulse response (e.g. 1024) and then passing this data to a Fast Fourier Transform (FFT).

If you have found this solution useful then please do hit the Google (+1) button so that others may be able to find it as well.

Numerix-DSP Libraries : http://www.numerix-dsp.com/eval/

Wednesday, 7 January 2015

Timing C Code Running On the XMOS xCORE Architecture

I recently had to show a customer how long a piece of code took to execute on the XMOS xCORE.

The nice thing about the xCORE is that it includes hardware timers that can be directly accessed from the application.

In order to be able to benchmark the code I knew there would be a small overhead to call the timers from the top level C code so my first task was to calculate this.

Having calculated the overhead I was now free to benchmark the actual code then remove the overhead to print the correct time.

As an aside, one of the great benefits of the XMOS architecture is how debug_printf's are implemented. Essentially the output of a printf can be routed over the inter-processor communications link called xCONNECT, in real-time. This allows debug_printf's to be located directly inside real-time code, unlike other microcontrollers where locating a printf inside an interrupt service routine is guaranteed to cause a run-time exception (AKA crash). Note, the code below uses regular printfs so that it can be compiled to run on any CPU, with suitable #defines/#ifs around the XMOS benchmarking code.

Here is the benchmarking code showing the performance of a simple piece of C code, with inline assembly:

#include <stdio.h>
#include "xbenchmark.h"

int main (void)
{

int start_time, end_time, overhead_time;

overhead_time = xbench_init();

start_time = xbench_get_time();
asm ("nop");
asm ("nop");
end_time = xbench_get_time();

printf ("Elapsed time = %d cycles\n", end_time - start_time - overhead_time);

return (0);
}

Here is the xbenchmark.h file:

// XMOS Benchmarking Functions

int xbench_get_time(void);

int xbench_init(void);

Here is the xbenchmark.xc file:

// XMOS Benchmarking Functions

// Include files

#include <xs1.h>

timer t;

int xbench_get_time(void)

{

int time;

t :> time;

return(time);

}

int xbench_init(void)

{

int start_time, end_time;

start_time = xbench_get_time();

end_time = xbench_get_time();

return(end_time - start_time);

}

Here is the command for compiling the C code:

xcc -O3 app.c xbenchmark.xc -target=XCORE-200-EXPLORER -o app.xe

And here is the output:

Elapsed time = 2 cycles

Phew, two nops takes two cycles :-)

Notes

XMOS use the incredible LLVM compiler, which I am sure will one day take over from GCC. It is almost never necessary to revert to assembly code but if you need to then you can see from the above example just how easy it is to use.

In order to read and write to the XMOS hardware (such as the timer, as shown above) it is necessary to use XC, rather than C, as shown above. The XC functions are documented in the XMOS Programming Guide : https://www.xmos.com/download/public/XMOS-Programming-Guide-(documentation)(E).pdf.

If you have found this solution useful then please do hit the Google (+1) button so that others may be able to find it as well.

Numerix-DSP Libraries : http://www.numerix-dsp.com/eval/

DSP Tech Brief : C Code To Implement The Q Format Number System

I recently posted an "Introduction To The Q Format Number System" : http://realgonegeek.blogspot.co.uk/2015/01/introduction-to-q-format-number-system.html.

This afternoon I was driving back from a meeting and thought I should put some code together to demonstrate the operations. Here it is (you will notice that I have used a somewhat more useful word length than 8 bits for this piece of code) :

#include <stdio.h>

double QFormatIntegerToFloat (const int x,
const int n)

{
return (((double)x) / ((double)(1 << n)));
}

int FloatToQFormatInteger (const double x,
const int m,
const int n)

{
int i;
int y;
int mask = 0;

y = (int)(x * ((double)(1 << n))); // Generate the integer number in m.n format

for (i = 0; i < (m+n); i++) // Calculate the mask to ensure we return the correct number of valid bits
{
mask = (mask << 1) + 1;
}

return (y & mask);
}

int MpyQFormatInteger (const int x,
const int y,
const int n)

{
long long tmp;
tmp = ((long long)x * (long long)y) >> n;
return ((int)tmp);
}

int main()
{
int a, b, c, d, e;

printf ("%lf = 0x%x\n", 2.0 , FloatToQFormatInteger (2.0 , 8, 24));
printf ("%lf = 0x%x\n", 1.5 , FloatToQFormatInteger (1.5 , 8, 24));
printf ("%lf = 0x%x\n", 0.5 , FloatToQFormatInteger (0.5 , 8, 24));
printf ("%lf = 0x%x\n", 0.25, FloatToQFormatInteger (0.25, 8, 24));
printf ("%lf = 0x%x\n", 0.75, FloatToQFormatInteger (0.75, 8, 24));
printf ("\n");

a = FloatToQFormatInteger (0.75, 8, 24);
b = FloatToQFormatInteger (2.0 , 8, 24);
c = MpyQFormatInteger (a, b, 24);

printf ("Result = 0x%x : %lf\n", c, QFormatIntegerToFloat (c , 24));

d = FloatToQFormatInteger (0.5 , 8, 24);
e = MpyQFormatInteger (c, d, 24);

printf ("Result = 0x%x : %lf\n", e, QFormatIntegerToFloat (e , 24));

return (0);
}

Here is the output :

2.000000 = 0x2000000
1.500000 = 0x1800000
0.500000 = 0x800000
0.250000 = 0x400000
0.750000 = 0xc00000

Result = 0x1800000 : 1.500000
Result = 0xc00000 : 0.750000

If you have found this solution useful then please do hit the Google (+1) button so that others may be able to find it as well.

Numerix-DSP Libraries : http://www.numerix-dsp.com/eval/

Sunday, 4 January 2015

New DSP Software Updates

Prompted by the release of wxWidgets v3.x I have used the holidays as an opportunity to update and test a load of my software libraries and utilities.
They can be downloaded from here :

SigLib (v8.53) DSP Library : http://www.numerix-dsp.com/siglib.html. Includes some new functions for Q format integer support and it has been tested on Mac OS X and Raspberry Pi.

The following example programs and libraries have also been updated and these can be downloaded from http://www.numerix-dsp.com/files/ :

System Analyzer- V2.10 - A simple System Analyzer that uses wxWidgets V3.0.2 (A free portable GUI library), NGL, PortAudio (A free portable Audio I/O library) and can optionally use SigLib. This application includes the full GUI source code.

Numerix Host Library - V3.00 - A library of host I/O routines, including full source code. Functions include console and File I/O, including .WAV files. These functions can be used on any OS (UNIX, Windows etc.) and also on any other embedded processor or DSP that supports console and file I/O via the debug environment. These functions can also be used with Gnuplot/C to run any of the SigLib example programs.

Numerix Graphics Library - V2.60 - A simple library of graphics routines for the wxWidgets V3.0.x based GUI applications. Full source code is included.

If you have found this solution useful then please do hit the Google (+1) button so that others may be able to find it as well.

Numerix-DSP Libraries : http://www.numerix-dsp.com/eval/

Friday, 2 January 2015

DSP Tech Brief : Introduction To The Q Format Number System

One area of Digital Signal Processing that causes sleepless nights for Design Engineers is when they first encounter fixed point maths. Issues such as overflow and underflow,. that are automatically resolved when using floating point numbers, can add an extra layer of complexity when implementing fixed point algorithms.
Unfortunately in embedded system design it is typically not possible to avoid fixed point issues because floating point silicon requires many more transistors, particularly to implement the multiplier, and hence is more expensive and slower than equivalent fixed point implementations.
There are many different ways to represent numbers using integers including signed 2's Complement, Signed Offset Binary etc. but one of the most useful is the Q Format.
Q format numbers split the fixed point word into two components - the integer (m bits) and fractional (n bits) parts to give a number represented in the format m.n. This makes the word length of the integer equal to m+n bits.
The Q format allows the algorithm to be implemented using fixed point operations but requires the designer to manage the data both into and out of the algorithm.

A regular integer number creates overflow during the multiplication and addition operations.
Some processors provide a partial workaround for this by allowing optional use of fractional numbers for multiplication, in which case underflow occurs but this still doesn't avoid overflow with addition operations.

8 bit integer multiplication example

Treating fixed point numbers as integers means that we have the possibility of overflow.
The maximum 8 bit signed number is 127 (0x7f)
127 x 127 = 16129 which is a 16 bit number and has overflowed our 8 bit word length

8 bit fractional multiplication example

Treating fixed point numbers as fractions means that we have the possibility of underflow but not overflow.
The maximum 8 bit signed number is 127/128 = 0.9921875 (0x7f)
0.9921875 x 0.9921875 = 0.98443603515625 which has not overflowed our 8 bit word length but the lsbs will underflow to give us 126/128 = 0.984375.

8 bit (2.6) Q format example

We have 2 bits for the integer (m) and 6 bits for the fractional (n) components.
The value of 1.0 is represented by 0b01000000 (0x40)
If we now represent the following numbers in 2.6 Q format :
2.0 - 0b10000000 (0x80)
1.5 - 0b01100000 (0x60)
0.5 - 0b00100000 (0x20)
0.25 - 0b00010000 (0x10)
0.75 - 0b00110000 (0x30)

We can now perform some simple maths :
0.75 x 2 = 0x30 x 0x80 = 0x1800 (0b1100000000000)
Note the word growth, which means we need a longer word length for the partial result. We now need to scale this back to give us our 8 bit result (or we could keep this extended format if this is part of a multiply accumulate (MAC) instruction).
Scaling (right shifting) by the number of fractional bits (6) gives us 0b1100000 (0x60) or 1.5, which is the result we would expect.
We could now perform the same multiply operation with :
1.5 x 0.5 = 0x60 x 0x20 = 0xC00 (0b110000000000)
And again, if we perform the required scaling by 6 bits we get 0b110000 (0x30) = 0.75
We can see from this example that it is possible to handle numbers with both integer and fractional components using the Q format.

FIR Filtering
When implementing a regular FIR filter where |h(n)| <= 1.0 then the Q format number system allows for word growth in the accumulator.
The worst case word growth allows 2^N MAC operations before overflow, where N is the number of guard bits in the accumulator word.
Using Q format numbers allows m integer bits to be used for the guard bits hence in the above example we can implement 2^2 = 4 MAC operations and guarantee no overflow.
In more realistic FIR filters where only the central coefficients tend towards the maximum it is possible to implement larger filters before experiencing overflow but these scenarios should always be tested thoroughly.

IIR Filtering
Unlike FIR filters, where |h(n)| is typically <= 1.0, in IIR filters the coefficient magnitude can often be > 1.0. Here are the coefficients of a very simple 2nd order Low Pass Butterworth Filter :
float CoefficientArray [] =
{
9.57767909330728970000e-002, 1.61076720320250940000e-001, 9.57767909330728970000e-002,
-1.26453884117041130000e+000, 6.17169143356808060000e-001
};
Implementing a filter like this in a fixed point DSP is clearly much more easy when using Q format numbers than either integer or fractional fixed point numbers.

As an example of where you might use Q format numbers is in high quality (>= 24 bit) audio. Using a 32 bit word length allows several options for the values of m and n, for example :
32 bit word using m.n = 8.24 gives
8 integer bits for word growth and 24 fractional bits
32 bit word using m.n = 5.27 gives
5 integer bits for word growth and 27 fractional bits for fractional bit growth
In these examples it is common to use a 64 bit result / accumulator register then this partial result is converted back to 32 bit Q format for further processing stages.

Please see this blog entry for C code for implementing the Q format number system : http://realgonegeek.blogspot.co.uk/2015/01/c-code-to-implement-q-format-number.html.

Further fixed point references are available at :
https://www.xmos.com/support/appnotes
https://github.com/xcore?query=DSP
http://www.numerix-dsp.com/appsnotes/

This topic is covered in much more detail on the University of Oxford DSP course : http://realgonegeek.blogspot.co.uk/2014/12/the-next-round-of-university-of-oxford.html.

If you have found this solution useful then please do hit the Google (+1) button so that others may be able to find it as well.

Numerix-DSP Libraries : http://www.numerix-dsp.com/eval/

Monday, 29 December 2014

Upgraded Version of Digital Filter Plus Supports Q Format Fixed Point Numbers And Apple Mac OS X

The Digital Filter Plus filter design package has now been upgraded to support Q format fixed point number systems upto 64 bits long.
Digital Filter Plus is now also compiled for Apple Mac OS X.
Digital Filter Plus can be downloaded from here : http://www.numerix-dsp.com/dfplus/.
A free version for evaluation and non-commercial applications can be downloaded from here : http://www.numerix-dsp.com/free/.

If you have found this solution useful then please do hit the Google (+1) button so that others may be able to find it as well.

Numerix-DSP Libraries : http://www.numerix-dsp.com/eval/

Saturday, 25 October 2014

Building wxWidgets for Windows

I'm a big fan of wxWidgets (http://wxwidgets.org/) and have used it since its earliest days.

With the release of v3.x I have thought I would update many of the programs I have written over the years to be compatible with this version. The update process has been remarkably painless.

Here is how I build wxWidgets and applications to run under it.

Download wxWidgets-3.0.2.zip from : http://sourceforge.net/projects/wxwindows/files/3.0.2/
Extract to : C:\wxWidgets-3.0.2

Download Bakefile v0.2.9 from : https://github.com/vslavik/bakefile/releases

In : C:\wxWidgets-3.0.2\build\msw\
nmake -f makefile.vc BUILD=release clean
nmake -f makefile.vc RUNTIME_LIBS=static BUILD=release

I almost always use "RUNTIME_LIBS=static" because it saves having to include dlls with the application.

Set the WXWIN environment variable :
WXWIN=C:\wxWidgets-3.0.2

Open a command window in : C:\wxWidgets-3.0.2\samples\ - e.g. animate (C:\wxWidgets-3.0.2\samples\animate\)
nmake -f makefile.vc RUNTIME_LIBS=static BUILD=release
cd vc_mswud
anitest

Occasionally I will use the following for debugging but it is not very often because I still haven't found a better solution than printf (or fprintf for logging to a file) :
nmake -f makefile.vc RUNTIME_LIBS=static BUILD=debug

If you have found this solution useful then please do hit the Google (+1) button so that others may be able to find it as well.

Numerix-DSP Libraries : http://www.numerix-dsp.com/eval/