Wednesday, 2 September 2020

What To Consider When Integrating DSP Algorithms

 Here is a little article I wrote for Electronics Weekly, with my good friend Dunstan Power from ByteSnap Design.

Click on the image below to view the full article.

 #signalprocessing #dsp #machinelearning

Electronics Weekly DSP Article


Thursday, 27 August 2020

Answers To Questions From The Data Science Festival Lunch & Learn "The Frequency Domain And How It Can Be Used To Aid Artificial Intelligence"

Thank you very much for attending the Lunch & Learn.
The presentation can be downloaded from here: https://www.numerix-dsp.com/ai.
The recording is available on the Data Science Festival YouTube channel : https://www.youtube.com/datasciencefestival.
Direct link: https://www.youtube.com/watch?v=6XBM0_G7iwk.

Q. Can you give a definition of 'DSP'?
A. Yes. DSP is the digital processing of real-time signals. These are usually 1D signals such as voice, acoustics, radar, wireless etc. Real-world analog signals are converted to digital signals using an Analog to Digital Converter (ADC).

Q. Is the ordergram output you presented earlier a type of Fast Fourier Transform computation, i.e. from the Spatial Domain to the Frequency Domain?
A. No, the ordergram is a method of ensuring that the fundamental frequency (and all harmonics) are in a fixed "location" regardless of the rotational speed of the machine.
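As a rough illustration of why orders stay in a fixed location, an order is simply a frequency divided by the shaft rotation rate. The function name and parameters below are hypothetical and not taken from the project code.

```c
#include <assert.h>

/* Convert a frequency in Hz to an order, i.e. a multiple of the shaft
   rotation rate. Hypothetical helper, for illustration only. */
double frequency_to_order(double frequency_hz, double shaft_rpm)
{
    assert(shaft_rpm > 0.0);
    double shaft_hz = shaft_rpm / 60.0;     /* revolutions per second */
    return frequency_hz / shaft_hz;
}
```

A 100 Hz component at 3000 RPM and a 200 Hz component at 6000 RPM both map to order 2, so the same harmonic lands in the same place at either speed.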

Q. Web search results for 'ordergram' are sparse. Please may you recommend further reading, or is there an alternative term one may search for online?
A. You're right, a search for that term doesn't return many results. The ordergram is the result of Order Analysis and MathWorks have an excellent summary : https://uk.mathworks.com/help/signal/examples/order-analysis-of-a-vibration-signal.html.

Q. On the multi-layer backpropagation slide, why did you choose a hidden layer length of 25? Is that significant?
A. I actually started by over-specifying the hyper-parameters (input length and hidden layer length). My original choice was 256 (input) and 128 (hidden) - let's call this 256/128, but note the input length is half the FFT length.
I reduced both until the performance degraded. At 64/32 there was no noticeable degradation but at 64/16 there was a clear drop-off.
I wrote a script that iterated over a range of hidden layer lengths, from 10 to 63 (input length - 1), and ran this overnight.
It was a case of diminishing returns: above 25 (roughly half the number of inputs) there was very little benefit in making the hidden layer any longer.

Q. Can a recording of this talk be circulated please? Very interesting, but I have to step away for another meeting. Good luck to all.
A. Yes, you can relive the whole experience :-) using the link above.

Q. Could we see the reading recommendations list again please?
A. Sure, you can download the complete presentation using the link above.

Q. How did you get your labelled data?
A. It was sampled using a calibrated microphone : https://www.minidsp.com/products/acoustic-measurement/umik-1.
Each recording was stored in a file with a unique identifying name, that was used to track the performance and the results.
Although the simulation, training and real-time prediction code was written in C, the test framework was written in Python. The benefit of that was that the test framework could choose whether to use the simulation or real-time code for verification and regression test purposes.

Q. I missed the introduction, can you please quickly explain why we should use the frequency domain instead of raw data?
A. The frequency domain is a method for extracting key features such as resonant frequencies. This is more run-time efficient than relying on the Neural Network to extract these features.

Q. Are there any other functional transforms you like to try and apply when experimenting, other than the Fourier transform?
A. Yes, absolutely. I mentioned the Mel Frequency Cepstrum in the presentation and I think this would be worthy of further evaluation. Comparing the FFT to MFCCs is a trade-off between frequency resolution and MIPS.

Q. Did you do a comparison between using the time-domain signal versus the frequency-domain signal as inputs to your convolutional Neural Net classifier, to see the benefit of transforming the signal into the frequency domain?
A. Nothing beyond having a play with Endolith's code, that I referenced in the presentation. This would be an interesting piece of research.

Q. The learning happens on a different machine and only the model is deployed on the edge right?
A. Absolutely correct. The model generated by the training process is stored in a C array that is linked into the real-time code at compile time.
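A minimal sketch of what that can look like is shown below. Everything in it is hypothetical: the layer sizes, the weight values, the absence of bias terms and the ReLU activation are assumptions chosen to keep the example short, not details of the actual model.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_INPUTS  3
#define NUM_HIDDEN  2
#define NUM_OUTPUTS 2

/* Hypothetical weights; in a real build these arrays would be generated
   by the training tool and compiled into the firmware image. */
static const float hidden_weights[NUM_HIDDEN][NUM_INPUTS] = {
    { 1.0f, 0.0f, -1.0f },
    { 0.5f, 0.5f,  0.5f },
};
static const float output_weights[NUM_OUTPUTS][NUM_HIDDEN] = {
    { 1.0f, -1.0f },
    { 0.0f,  2.0f },
};

/* Assumed activation function (ReLU), chosen for brevity */
static float relu(float x)
{
    return x > 0.0f ? x : 0.0f;
}

/* Single hidden layer forward pass: inputs -> hidden -> output activations */
void predict(const float *inputs, float *activations)
{
    assert(inputs != NULL && activations != NULL);
    float hidden[NUM_HIDDEN];

    for (size_t h = 0; h < NUM_HIDDEN; h++) {
        float sum = 0.0f;
        for (size_t i = 0; i < NUM_INPUTS; i++) {
            sum += hidden_weights[h][i] * inputs[i];
        }
        hidden[h] = relu(sum);
    }
    for (size_t o = 0; o < NUM_OUTPUTS; o++) {
        float sum = 0.0f;
        for (size_t h = 0; h < NUM_HIDDEN; h++) {
            sum += output_weights[o][h] * hidden[h];
        }
        activations[o] = sum;
    }
}
```

Because the weights are const data, they can sit in flash on a microcontroller and no file system or model loader is needed at run time.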

Q. Given the number of pre-processing steps on the signal before it gets to the ML network, can one measure how much modifying these steps would 'break' the network's predictions? e.g. modifying the Hamming window, or sample frequency. I'm curious from a software/production risk perspective.
A. Yes, one could do this and it would be another excellent piece of research. I think the presented code would be a minimum but would benefit from additional algorithms such as zero crossing counting, peak detection etc.

Q. Why only a single hidden layer?
A. I was suspicious at first too but this was found to give excellent performance. From my earlier AI work with images, I believe that the reason only one hidden layer is necessary is that the DSP pre-processing extracts the features that the neural network finds easy to detect.

Q. What would be the benefit from additional hidden layers?
A. I suspect there would be a benefit if the signals were highly correlated but it would be a trade-off. My gut feeling from the existing project development is that there would be more to be gained by using a larger FFT or an even larger FFT followed by MFCC.

Q. What classes (labels) are used in the output layer?
A. The output is the activation level for each category, with each category being referenced to the filename for labelling. Following the activation level output, there is a simple comparator that detects which category has the highest energy.
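A comparator of this kind can be sketched in a few lines. The function name is hypothetical, and returning the first index on a tie is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the category with the highest activation level.
   If two categories tie, the lower index wins. */
size_t highest_activation(const float *activations, size_t num_categories)
{
    assert(activations != NULL && num_categories > 0);
    size_t best = 0;

    for (size_t i = 1; i < num_categories; i++) {
        if (activations[i] > activations[best]) {
            best = i;
        }
    }
    return best;
}
```

The returned index can then be mapped back to the label derived from the recording's filename.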

Q. Do you know a good code and data for hands on work in the frequency domain for AI?
A. This is a great question. The main benefit would be using the Frequency Domain for implementing the convolutional kernels.
Like most things in DSP, it is a trade-off. Larger convolutions gain more from using the frequency domain (less MIPS than the time domain) but there is more latency.
I will be presenting a paper on Frequency Domain Signal Processing, at the DSP Online Conference on the 24th September: https://www.dsponlineconference.com/session/Frequency_Domain_Signal_Processing.

Q. What would an alternative of classifying the same data using analog circuits look like? Feasible? Less or more expensive in terms of hardware resources vs digital?
A. I have heard about analog classifiers but I have no experience with them so I can't answer this directly except to share my experience with analog vs. digital signal processing. The key point is that DSP devices benefit from Moore's Law but analog devices don't. So every year DSPs get higher performance and lower cost. From a signal point of view, the biggest challenge to analog is noise.

Q. As a turbine ages, how rapidly does its spectrogram change? How often would one need to re-train the model?
A. The spectrogram does indeed change so the model needs to be trained on all different vibration modes, and ages, of the engine so that the classifier can detect all of the different variations with a single model.

Q. How different are the spectrograms from one turbine vs another?  i.e. is the model “portable” / applicable to different devices or device specific?
A. It is very engine model specific so it depends on the number of fans, the number of blades in each fan, the architecture of the turbine (radial, axial, by-pass etc).

Q. Did you try the Mel Cepstrum, instead of the Fast Fourier Transform ?
A. Yes, I performed a short evaluation of the Mel Frequency Cepstral Coefficients (MFCCs).

For those who are not familiar with the Mel Cepstrum, this uses the FFT to calculate the spectrum then it generates a logarithmically reduced set of frequency coefficients. The benefit is that the AI algorithm then requires less MIPS, memory, power consumption etc. MFCCs are great for speech recognition and similar applications.

It is a trade-off between frequency resolution and recognition performance so a Mel cepstrum would require a larger FFT at the input and possibly a similar number of MFCCs on the output as the pure FFT solution.
I think there is potential here so it is definitely something I hope to research further in the future.

Q. Why did you program the app in C and not use a standard API such as TensorFlow Lite?
A. This was driven by the inferencing / prediction function. The primary goal was to use whatever accelerators are available on the target CPU to optimize the convolution operations and minimize the MIPS and power consumption.
For training, I could have used TensorFlow to build the model and just used the C code for deployment. But for any specific Neural Network system the training function is only a few minor changes from the prediction function, so once I'd written the predictor I just carried on and wrote the training function.

Q. How many categories can this technique support ?
A. The project was tested with four categories; however, this could easily be extended.

From my experience, the number of categories supported depends on three main things:

  1.     How similar the signals are that need to be detected (their cross-correlation)
  2.     How much training data is available - more data would mean a greater ability to differentiate similar signals
  3.     The frame length - longer frame lengths would help differentiate more categories


For more details about Numerix AI Consultancy services, please see here: https://www.numerix-dsp.com/ai.


Monday, 24 August 2020

Data Science Festival Lunch & Learn – The Frequency Domain And How It Can Be Used To Aid Artificial Intelligence

Really pleased to have been invited to present at the Data Science Festival Lunch and Learn, this Thursday 27th.
The topic is "The Frequency Domain And How It Can Be Used To Aid Artificial Intelligence".
I will be presenting a general introduction and a high level overview of a project that I've been working on over the last few months.
Further details and registration are here : https://www.datasciencefestival.com/event/dsf-lunch-learn-the-frequency-domain-and-how-it-can-be-used-to-aid-artificial-intelligence/.


 

Friday, 24 July 2020

The 28th Annual Running Of The University Of Oxford Digital Signal Processing Course Will Now Be Held Online.

As part of the University Of Oxford Summer Engineering Program for Industry, the 28th running of the Digital Signal Processing course is moving online.

The course will be run over a period of 6 weeks between Monday 19th October and Sunday 29th November 2020.

Based on the classroom course, Digital Signal Processing (Theory and Application), this online course consists of weekly live online tutorials and also includes a software lab that can be run remotely. We'll include all the same material, many of the existing labs and all the interaction of the regular course.


Online tutorials are delivered via live video once each week and practical exercises are set to allow you to practice the theory during the week. 
You will also have access to the course VLE (virtual learning environment) to communicate with other students and to view and download course materials, and tutor support is available throughout.
Code examples will be provided although no specific coding experience is required. 
The live tutorials will be on Wednesday each week from 13:00 - 14:30 and 15:00 - 16:30 (GMT) with a 30-minute break in between.
You should allow for 10 - 15 hours study time per week in addition to the weekly lessons and tutorials.
After completing the course, you should be able to understand the workings of the algorithms we explore in the course and how they can solve specific signal processing problems.


Thursday, 18 June 2020

Developing Artificial Intelligence Applications using Python and TensorFlow course at the University of Oxford

I've been broadening my AI knowledge over the last few weeks by attending the "Developing Artificial Intelligence Applications using Python and TensorFlow" course at the University of Oxford, Department for Continuing Education.

https://www.conted.ox.ac.uk/courses/developing-artificial-intelligence-applications-using-python-and-tensorflow.

The Course Director is the very knowledgeable Ajit Jaokar.

If any of you want to learn more about AI, I can't recommend the course highly enough.

I'm sure Ajit will be running more courses because this one was oversubscribed, so sign up ASAP.

#AI #artificialintelligence #machinelearning #universityofoxford #datascience

Sunday, 24 May 2020

A Simple Google Assistant On The Raspberry Pi

Here's a fun lockdown project that explains how to integrate a Google assistant on a Raspberry Pi : https://github.com/Numerix-DSP/GoogleAssistant .


Saturday, 16 May 2020

Timing DSP Code Running On ARM Cortex Architecture

A recent project required porting some DSP algorithms to the NXP LPC55S6x ARM Cortex-M33 based microcontroller.
It was necessary to benchmark the algorithms so I wrote the following code that utilizes the Cycle Count Register, which is part of the ARM Cortex-M Data Watchpoint and Trace (DWT) unit.

The code below includes macros for accessing the DWT and also calculates the overhead of calling the functions to read the timer register, before using the same functions to time some code.
This code has been compiled and tested on the NXP LPCXpresso55S69 Development Board but should run on any ARM device that includes the DWT module.

The benchmarking macros and functions are stored in benchmark_code.h:

// benchmark_code.h
// Macros and functions for benchmarking code

#ifndef BENCHMARK_CODE_H
#define BENCHMARK_CODE_H

#include <stdint.h>                                                             // uint32_t

// Timers
// DWT (Data Watchpoint and Trace) registers; these only exist on ARM Cortex cores that include a DWT unit
#define KIN1_DWT_CONTROL            (*((volatile uint32_t*)0xE0001000))         // DWT Control register
#define KIN1_DWT_CYCCNTENA_BIT      (1UL<<0)                                    // CYCCNTENA bit in DWT_CONTROL register
#define KIN1_DWT_CYCCNT             (*((volatile uint32_t*)0xE0001004))         // DWT Cycle Counter register
#define KIN1_DEMCR                  (*((volatile uint32_t*)0xE000EDFC))         // DEMCR: Debug Exception and Monitor Control Register
#define KIN1_TRCENA_BIT             (1UL<<24)                                   // Trace enable bit in DEMCR register

#define KIN1_InitCycleCounter()     KIN1_DEMCR |= KIN1_TRCENA_BIT               // Set TRCENA in DEMCR to enable the trace and debug block
#define KIN1_ResetCycleCounter()    KIN1_DWT_CYCCNT = 0                         // Reset cycle counter
#define KIN1_EnableCycleCounter()   KIN1_DWT_CONTROL |= KIN1_DWT_CYCCNTENA_BIT  // Enable cycle counter
#define KIN1_DisableCycleCounter()  KIN1_DWT_CONTROL &= ~KIN1_DWT_CYCCNTENA_BIT // Disable cycle counter
#define KIN1_GetCycleCounter()      KIN1_DWT_CYCCNT                             // Read cycle counter register

// static inline so that the header can be included from more than one source file
static inline void benchmark_init_cycle_counter (void)
{
    KIN1_InitCycleCounter();                                // enable DWT hardware
    KIN1_ResetCycleCounter();                               // reset cycle counter
    KIN1_EnableCycleCounter();                              // start counting
}

#define benchmark_get_cycle_counter KIN1_GetCycleCounter

#endif /* BENCHMARK_CODE_H */


This code can be used in the following manner:


#include <stdint.h>
#include "fsl_debug_console.h"
#include "benchmark_code.h"

int main(void)
{
    uint32_t start_time, end_time, overhead_time;   // number of cycles

    benchmark_init_cycle_counter();                 // Initialize benchmark cycle counter

    // Measure the overhead of reading the cycle counter
    start_time = benchmark_get_cycle_counter();     // get cycle counter
    __asm volatile ("nop");
    end_time = benchmark_get_cycle_counter();       // get cycle counter
    overhead_time = end_time - start_time;

    // Time the code under test
    start_time = benchmark_get_cycle_counter();     // get cycle counter
    PRINTF("Put your code to be benchmarked here ...\r\n");
    end_time = benchmark_get_cycle_counter();       // get cycle counter
    PRINTF("Elapsed time = %u cycles\r\n", (unsigned int)(end_time - start_time - overhead_time));

    return(0);
}

Notes
There appears to be a +/- 1 cycle jitter on the results of any code timing instance. I have not got to the bottom of exactly why but, regardless of the root cause, this is very accurate and definitely suitable for the vast majority of applications.
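Two small helpers can make the cycle counts easier to work with. This is a sketch rather than part of the original code: the function names are hypothetical and the core clock frequency must be supplied by the caller. The 32-bit counter wraps around, but unsigned subtraction still gives the right answer provided the counter has wrapped no more than once between the two readings.

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed cycles between two counter readings. Unsigned arithmetic is
   modulo 2^32, so a single wrap of the counter is handled correctly. */
uint32_t elapsed_cycles(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Convert a cycle count to microseconds for a given core clock in Hz */
double cycles_to_microseconds(uint32_t cycles, uint32_t core_clock_hz)
{
    assert(core_clock_hz > 0);
    return (double)cycles * 1.0e6 / (double)core_clock_hz;
}
```

For example, with a hypothetical 150 MHz core clock, 150 cycles corresponds to 1 microsecond, and the counter wraps roughly every 28.6 seconds.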

If you have found this solution useful then please do hit the Google (+1) button so that others may be able to find it as well.
Numerix-DSP Libraries : http://www.numerix-dsp.com/eval/