BertAnt wrote:the sensor would recognize what note present in the audio instantly
From what I've gathered so far (mostly just from looking at the results of what I've been able to build), that is not really so easy.
The fastest method I've found to date is counting the zero crossings - certainly no use for anything polyphonic, but detection reliably takes one cycle of the input waveform.
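To illustrate the zero-crossing idea, here is a minimal Python sketch, assuming a clean monophonic input with no DC offset. The function name, the linear interpolation between samples and the 440 Hz test tone are all choices made for this example, not something taken from the schematics discussed in this thread.

```python
import math

def zero_crossing_frequency(samples, sample_rate):
    """Estimate frequency from the spacing of rising zero crossings.

    Returns None if fewer than two rising crossings are found, i.e. if
    less than one full cycle has been seen.
    """
    crossings = []
    for i in range(1, len(samples)):
        a, b = samples[i - 1], samples[i]
        if a < 0.0 <= b:
            # Linear interpolation gives a sub-sample crossing position.
            crossings.append(i - 1 + (-a) / (b - a))
    if len(crossings) < 2:
        return None
    # Average period (in samples) between consecutive rising crossings.
    period = (crossings[-1] - crossings[0]) / (len(crossings) - 1)
    return sample_rate / period

SAMPLE_RATE = 44100
# Hypothetical test input: a pure 440 Hz sine with an arbitrary start phase.
tone = [math.sin(2 * math.pi * 440.0 * n / SAMPLE_RATE + 0.3) for n in range(400)]
estimate = zero_crossing_frequency(tone, SAMPLE_RATE)
print(estimate)
```

Two rising crossings are the minimum needed, which is where the "one cycle" figure comes from. Anything with strong harmonics, noise or more than one note will add extra crossings and break the estimate.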
FFT, Goertzel, and the examples I posted before all essentially do the same thing - correlation with sine/cosine functions - and their precision is always a function of the size of the window: the frequency resolution is roughly the reciprocal of the window length. For sure, the peak of the response will be at the desired frequency, but another note close by and lower in volume could easily be masked if the window is too short.
The way I think of it is like detuning oscillators until you can hear that lovely slow beating as they drift in and out of phase - now imagine listening to a very short burst of that sound, just a few cycles.
How sure could you be that there are two frequencies and not just one? You might even catch the point of cancellation and hear almost nothing!
The beating between the notes is what gives you the extra information telling you that there are two tones very close together - and for that, you need to hear a longer section of the waveform.
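The effect of window length can be sketched in a few lines of Python. This is only an illustration under assumed values: two equal-volume sine tones a semitone apart (440 Hz and 466.16 Hz), a plain rectangular window, and helper names (`correlate`, `dip_ratio`) invented for the example. It compares the sine/cosine correlation midway between the two notes with the correlation at the notes themselves.

```python
import math

def correlate(samples, sample_rate, freq):
    """Magnitude of the correlation of `samples` with a sine/cosine pair at `freq`."""
    step = 2.0 * math.pi * freq / sample_rate
    re = sum(s * math.cos(step * n) for n, s in enumerate(samples))
    im = sum(s * math.sin(step * n) for n, s in enumerate(samples))
    return math.hypot(re, im) / len(samples)

def two_tones(f1, f2, sample_rate, length):
    """Sum of two equal-amplitude sines, both starting at zero phase."""
    return [
        math.sin(2 * math.pi * f1 * n / sample_rate)
        + math.sin(2 * math.pi * f2 * n / sample_rate)
        for n in range(length)
    ]

def dip_ratio(f1, f2, sample_rate, length):
    """Response midway between the tones divided by the larger response at the tones.

    Close to 1 means one fuzzy lump; close to 0 means a clear gap
    between two separate peaks.
    """
    signal = two_tones(f1, f2, sample_rate, length)
    mid = correlate(signal, sample_rate, (f1 + f2) / 2.0)
    peak = max(correlate(signal, sample_rate, f1),
               correlate(signal, sample_rate, f2))
    return mid / peak

SAMPLE_RATE = 44100
short_ratio = dip_ratio(440.0, 466.16, SAMPLE_RATE, 441)    # 10 ms, about 4 cycles
long_ratio = dip_ratio(440.0, 466.16, SAMPLE_RATE, 8820)    # 200 ms, several beats
print(short_ratio, long_ratio)
```

With the 10 ms window the ratio should come out close to 1, so the two notes look like a single lump; with the 200 ms window it should drop well below that, because the window now spans several cycles of the beating. A real detector would normally apply a tapered window as well, which this sketch leaves out.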
If you have a look at the 'waterfall plot' schematic that I noted earlier, it has an input for changing the averaging time, and it makes a huge difference to the shape of the plot - a setting of one approximates to just a couple of cycles (if I got my maths right) and all you see then are big round fuzzy lumps; multiply the time by five, and you can start to pick out individual notes and their harmonics. At twenty you can discern notes within chords, but the display is looking decidedly sluggish by this point!
The correlations I've used and the Goertzel algorithm are pretty much just different forms of the same thing in principle, and the responses I got with Goertzel were very much the same. (I ditched it in the end; the need to 'bin' the results every N samples oddly made it less CPU efficient - SSE assembly makes arbitrary-length loops a little tricky.)
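As a rough check that the two really are the same thing, here is a small Python sketch of the textbook Goertzel recurrence next to a direct sine/cosine correlation over the same block. It is not the SSE code mentioned above; the function names, the 1024-sample block and the 440 Hz test tone are assumptions made for the example.

```python
import math

def goertzel_magnitude(samples, sample_rate, freq):
    """Goertzel recurrence over one block; returns the magnitude at `freq`."""
    w = 2.0 * math.pi * freq / sample_rate
    coeff = 2.0 * math.cos(w)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2 = s1
        s1 = s0
    # The result is only read out here, at the end of the block. For
    # continuous use the state has to be read and reset every N samples.
    power = s1 * s1 + s2 * s2 - coeff * s1 * s2
    return math.sqrt(max(power, 0.0))

def direct_magnitude(samples, sample_rate, freq):
    """Plain correlation with cosine and sine at `freq` over the same block."""
    w = 2.0 * math.pi * freq / sample_rate
    re = sum(x * math.cos(w * n) for n, x in enumerate(samples))
    im = sum(x * math.sin(w * n) for n, x in enumerate(samples))
    return math.hypot(re, im)

SAMPLE_RATE = 44100
block = [math.sin(2 * math.pi * 440.0 * n / SAMPLE_RATE) for n in range(1024)]

g_on = goertzel_magnitude(block, SAMPLE_RATE, 440.0)
d_on = direct_magnitude(block, SAMPLE_RATE, 440.0)
g_off = goertzel_magnitude(block, SAMPLE_RATE, 880.0)
print(g_on, d_on, g_off)
```

The two magnitudes at 440 Hz should agree to within floating-point rounding, and the response an octave away should be far smaller. The block length plays exactly the same role as the window length in the correlation version, so the resolution limit is the same.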
From what I've read about quality 'Guitar to MIDI' converters, they attain a fast response by analysing the picking transient of the string (a variation of physical modelling, I assume) - but I never saw any detail in the public domain about the algorithms, and of course it would be of no use for anything other than plucked strings.