# A more specific question regarding global, segmental, and frequency weighted SNR...?

Suppose we have a speech signal with global energy 100, in the band 0-4 kHz, and the signal duration is 4 seconds. If we introduce white noise with global energy 100, from second 1 until 1.2 (i.e. 200 ms), what will be the global, segmental, and frequency weighted segmental SNR? Or suppose we instead introduce noise for all 4 seconds of the signal duration, but this time the noise sits in the frequency band 1000-1200 Hz with a global energy of 100. What then will be the quantities above?

Glad that first one helped. BTW, there is a wealth of information on the web that could have helped answer your 1st question had I known you were specifically dealing with human speech. There's been a lot of work done in that area! You might want to try a search on Google using the expressions

"signal to noise" speech weighted global segment

Unfortunately, a lot of the research -- like I'm finding on many topics these days -- is "pay for play". You either pay up, come up with a university reciprocal deal, or do without!

OK -- let's try to tackle this one...

First, we have a 4 kHz bandwidth and a speech signal with an arbitrary power of 100 over a 4 second period. (You wrote "energy", but I'm treating the 100 as a power level, i.e. energy per second.)

Your white noise is assumed to be truly "white" (equal across the spectrum of 0~4 kHz), also with a power of 100 but with a duration of only 200 ms.

We can handle the global S/N easily since we know the total for each. Power times duration gives energy, so let's call the total speech energy 400 (100 x 4 seconds) and the total noise energy 20 (100 x 0.2 seconds). Computing the S/N for that is easy ... 400 / 20 = 20 (or 20:1, or about 13 dB, however you prefer it).
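If it helps to see the arithmetic spelled out, here is a minimal sketch of that global calculation. The function name `global_snr` and its parameters are just made up for illustration, and it assumes the "100" figures are power levels that hold steady while each source is active.

```python
import math

def global_snr(signal_power, signal_duration_s, noise_power, noise_duration_s):
    """Ratio of total signal energy to total noise energy (linear, not dB)."""
    signal_energy = signal_power * signal_duration_s  # power x time
    noise_energy = noise_power * noise_duration_s
    return signal_energy / noise_energy

# Case 1: speech for 4 s, white noise burst for 0.2 s, both at power 100
ratio = global_snr(100, 4.0, 100, 0.2)
ratio_db = 10 * math.log10(ratio)  # same ratio expressed in decibels
print(f"global SNR = {ratio:.1f}:1 ({ratio_db:.2f} dB)")
```

Passing 4.0 for both durations gives the ratio for your second case, where the noise runs the full length of the signal.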

However, the segmental SNR is a bit trickier since we haven't defined a segment. If a segment happened to cover exactly the period from 1 to 1.2 seconds, its S/N would be 1:1 (0 dB), since you have as much noise as you do signal there. For all other segments the value would be infinite, since there would be no noise within them.
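In practice, segmental SNR is usually reported as the average of the per-frame SNRs in dB, with each frame clamped to a finite range so the noise-free frames don't blow up the average. Here is a rough sketch of that idea. It assumes 200 ms frames lined up with the noise burst, a constant speech power of 100 in every frame (real speech won't be that steady), and clamping limits of -10 dB and 35 dB. Those limits are a common convention, but they are my assumption here and not something from your question; the function name is hypothetical too.

```python
import math

def segmental_snr_db(signal_power, noise_powers, floor_db=-10.0, ceil_db=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [floor_db, ceil_db]."""
    frame_snrs = []
    for noise_power in noise_powers:
        if noise_power == 0:
            snr_db = math.inf  # noise-free frame: unbounded before clamping
        else:
            snr_db = 10 * math.log10(signal_power / noise_power)
        frame_snrs.append(min(max(snr_db, floor_db), ceil_db))
    return sum(frame_snrs) / len(frame_snrs)

# 4 s split into twenty 200 ms frames; noise only in frame index 5 (1.0 to 1.2 s)
noise_per_frame = [0.0] * 20
noise_per_frame[5] = 100.0
seg_snr = segmental_snr_db(100.0, noise_per_frame)
print(f"segmental SNR = {seg_snr:.2f} dB")
```

With these assumptions, nineteen frames sit at the 35 dB ceiling and one frame sits at 0 dB, so the result depends almost entirely on the clamping limit and the frame length you pick. That is exactly why the segment definition has to be pinned down first.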

Your bigger problem is the frequency weighted SNR, because we don't know from your description how the weighting is defined. The curve could be anything: exponential from 0 to 4 kHz, or flat from 0 to 2 kHz and linear from 2 kHz to 4 kHz, or something else entirely. Further, while we can assume the noise, being white, has equal power at all frequencies, we don't know anything about the signal (voice) and how its power is distributed across the 4 kHz spectrum. If the voice has more power at the higher frequency end, but our frequency weighting favors sound at the lower end, that extra voice power isn't going to improve the SNR much. Do you understand what I meant yesterday about needing the exact curve (the formula for it, actually) for the frequency weighting?

OK -- on to the 2nd question:

You now have a different answer to the global value. You have as much noise power as signal power for the full duration of the signal, and the SNR becomes 1 (or 1:1). That's an easy one.

All segments are equal (again treating the speech power as steady over time), so no matter where you measured, you'd still be at 1:1.

This time, you don't have white noise. It's concentrated in the 1 kHz ~ 1.2 kHz region. But once again, we need to know what your weighting curve and your signal's spectrum look like to come up with a frequency weighted value.

--------------

Think of it this way to make this freq weighting business easier...

Let's say that we want a frequency weighting whose formula is (arbitrarily) P = ((F^2) * S) where "P" is the *weighted* power, "F" is our frequency in Hz, and "S" is our original input signal power at that frequency.

What this would cause is a significant increase in the value we place on signals as the frequency increases (we've multiplied by the SQUARE of the frequency). We would be assigning MUCH more weight to the voice signal's components near 4 kHz than to those near 0 Hz.

Now, let's assume your 2nd situation where your noise is all stuffed into the 1 kHz to 1.2 kHz range. It won't get as much benefit from our "curve" as voice signals will that are higher in frequency because we've *heavily* weighted the curve towards the upper end. If the voice has a lot of energy in the upper range of your 0~4 kHz spectrum, the noise won't look as "big" by comparison, and you'll get a different SNR number. Knowing the exact curve is critical to defining the result.
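To put a number on it, here is a small sketch using that arbitrary F^2 weighting. It leans on a big assumption: the voice power of 100 is spread perfectly flat across 0~4 kHz, which real speech is not. The noise power of 100 is spread flat across 1 kHz to 1.2 kHz. The helper `weighted_power` and the 1 Hz bin width are hypothetical choices for illustration. Note also that this weights the total powers; it is not the standard frequency weighted segmental SNR measure, which averages per-band SNRs frame by frame.

```python
def weighted_power(total_power, f_lo, f_hi, weight, step_hz=1.0):
    """Spread total_power evenly over [f_lo, f_hi) and sum weight(F) * power per bin."""
    n_bins = int(round((f_hi - f_lo) / step_hz))
    power_per_bin = total_power / n_bins
    # Evaluate the weighting at the center frequency of each bin
    return sum(weight(f_lo + (i + 0.5) * step_hz) * power_per_bin
               for i in range(n_bins))

def square_weight(f_hz):
    """The arbitrary weighting from above: P = (F^2) * S."""
    return f_hz ** 2

speech = weighted_power(100.0, 0.0, 4000.0, square_weight)    # flat spectrum assumed
noise = weighted_power(100.0, 1000.0, 1200.0, square_weight)  # narrowband noise
weighted_snr = speech / noise
unweighted_snr = 100.0 / 100.0
print(f"unweighted SNR = {unweighted_snr:.2f}:1, weighted SNR = {weighted_snr:.2f}:1")
```

Under those assumptions the unweighted ratio is 1:1 while the weighted ratio comes out at roughly 4.4:1, purely because the curve favors the upper end where the noise isn't. Swap in a different curve or a different voice spectrum and the number changes again.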

Does that make sense?
