Sampling rate vs output question


Greetings,

 

I'm having trouble getting my head around a question about sampling rates and how a signal can be output without losing phase information. I have put this question to speaker companies and electrical engineers, and nobody has been able to give me a solid answer. So, embarrassingly, I am turning to the public SNA forums.

 

So the question is this, how do my speakers render all the phase information in my high-res file if they have a lower frequency response?

 

For example, I want to play a high-res file at 96khz, but my speakers only go up to 25khz in frequency (so equivalent to a 50khz sampling rate). I do this because I want to avoid reconstruction aliasing in the analog signal after it passes through my NOS DAC, and also to preserve the minimum perceivable phase information in my analog output, which is about 10 microseconds (with a femtosecond clock, this is easily doable). But given the limitations of my loudspeakers, is the phase information above 25khz lost anyway?

 

To me it seems like manufacturers care about the audible frequency range (20-20) but do not consider our ability to detect phase information that is far finer. Or do I have this all wrong?




7 minutes ago, furtherpale said:

96khz is the sampling rate, not the frequency output.

Did you read my original post? I know the basics; as I indicated in my post, the sampling rate is double the highest frequency it can represent. That wasn't my question.

Edited by Standards

1 hour ago, Standards said:

So the question is this, how do my speakers render all the phase information in my high-res file if they have a lower frequency response?

They don't. They can only render whatever frequencies they're capable of rendering. I'm trying to understand exactly what it is you're worried about, as they do not alter the phase of higher-frequency information; it is usually just dramatically reduced in amplitude, more so the higher the frequency. Whether rendering ultrasonic frequencies at all alters human perception of sound is itself a point of great contention, though, and that's up to you to decide.


15 minutes ago, Ittaku said:

They don't. They can only render whatever frequencies they're capable of rendering. I'm trying to understand exactly what it is you're worried about, as they do not alter the phase of higher-frequency information; it is usually just dramatically reduced in amplitude, more so the higher the frequency. Whether rendering ultrasonic frequencies at all alters human perception of sound is itself a point of great contention, though, and that's up to you to decide.

Thanks @Ittaku, that's the first concrete answer I've ever got, and it makes sense to me. Is it also possible that phase information is conveyed differently in loudspeakers since they operate in the analog domain? That is, an activation (what's the right word here?) of the driver diaphragm may not line up with a single sample on the digital side, but could possibly occur on a sample above 25khz just because, at that precise time, the analog signal passed to the driver contains that specific phase (if that makes sense).

 

I'm not talking about ultrasonic frequencies, I'm talking about the audio timing. I agree we cannot hear over 20khz, but we can detect phase information in the microseconds according to Barry Leshowitz. I'm not saying whether he's right or wrong, but if we can, then what phase information are our speakers rendering or attenuating?

Edited by Standards



1 minute ago, Standards said:

Thanks @Ittaku, that's the first concrete answer I've ever got, and it makes sense to me. Is it also possible that phase information is conveyed differently in loudspeakers since they operate in the analog domain? That is, an activation (what's the right word here?) of the driver diaphragm may not line up with a single sample on the digital side, but could possibly occur on a sample above 25khz just because, at that precise time, the analog signal passed to the driver contains that specific phase (if that makes sense).

 

I'm not talking about ultrasonic frequencies, I'm talking about the audio timing. I agree we cannot hear over 20khz, but we can detect phase information in the microseconds according to Barry Leshowitz. I'm not saying whether we can hear phase information down to 10 microseconds, but if we can, then what phase information are our speakers rendering or attenuating?

The phase information is all there in the electrical signal. Drivers do not substantially change phase information when rendering it, except near their resonant frequency or through the effects of a crossover. Tweeters especially tend to have almost no phase effect above the crossover frequency, so they maintain phase fairly accurately. Note that crossovers change phase by a minimum of 90 degrees at crossover points, and alternate drivers are often connected completely out of phase. Psychoacoustic testing has shown we're not sensitive to even monstrous phase changes across the frequency response, except for whether the lowest frequencies are rendered in or out of phase - and even then it's not "better" in phase, we can just tell they're different somehow. Worrying about phase changes with speaker drivers is probably meaningless. Lining up phase between different drivers over the crossover points is very important, but the absolute phase appears not to be.


3 minutes ago, Ittaku said:

The phase information is all there in the electrical signal. Drivers do not substantially change phase information when rendering it, except near their resonant frequency or through the effects of a crossover. Tweeters especially tend to have almost no phase effect above the crossover frequency, so they maintain phase fairly accurately. Note that crossovers change phase by a minimum of 90 degrees at crossover points, and alternate drivers are often connected completely out of phase. Psychoacoustic testing has shown we're not sensitive to even monstrous phase changes across the frequency response, except for whether the lowest frequencies are rendered in or out of phase - and even then it's not "better" in phase, we can just tell they're different somehow. Worrying about phase changes with speaker drivers is probably meaningless. Lining up phase between different drivers over the crossover points is very important, but the absolute phase appears not to be.

It seems I asked the question poorly. I'm mainly interested in the drop-off of phase information above 20khz, tweeter only, crossovers are irrelevant.

 

I think you've answered enough, and I've processed what you've said and I understand it better now. Thank you.


8 hours ago, Standards said:

So the question is this, how do my speakers render all the phase information in my high-res file if they have a lower frequency response?

The short answer is "they don't".

 

Your speaker is a band-pass filter.... and so it distorts the audio waveform.

You can add filters to the audio to undo that, and so get a flat phase response (even though it still has a band-pass frequency response) ... but this is not at all straightforward to do - and you will see lots of people over-simplifying the problem/solution.

 

 

In fact, if you correct the phase of a speaker incorrectly, it will make it worse..... even though the measurement you get will look fantastic (and this is the problem in a nutshell: the measurement you based the correction on was misleading).

 

Quote

is the phase information above 25khz lost anyway?

The frequency components above 25khz are lost, and the phase is distorted.
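To put rough numbers on the "speaker as a filter" point, here is a minimal sketch that treats the tweeter's top-end roll-off as a simple first-order low-pass with its corner at 25khz. That model is purely an assumption for illustration (a real driver is higher-order and has resonances), and `first_order_lowpass` and `CORNER_HZ` are made-up names. It shows that such a filter already shifts phase well below the corner, not only above it.

```python
import math

def first_order_lowpass(f_hz, corner_hz):
    """Return (gain_db, phase_deg) of H(f) = 1 / (1 + j*f/corner)."""
    ratio = f_hz / corner_hz
    gain = 1.0 / math.sqrt(1.0 + ratio * ratio)
    phase_deg = -math.degrees(math.atan(ratio))
    return 20.0 * math.log10(gain), phase_deg

# Assumed roll-off point, taken from the 25khz figure in the thread.
CORNER_HZ = 25_000.0

for f in (1_000.0, 10_000.0, 20_000.0, 25_000.0, 48_000.0):
    gain_db, phase_deg = first_order_lowpass(f, CORNER_HZ)
    print(f"{f / 1000:5.1f} khz: {gain_db:6.2f} dB, {phase_deg:7.2f} deg")
```

In this toy model the response is about 3 dB down with a 45 degree lag at the corner itself, and the phase lag is already noticeable at 10khz.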

 

As for "timing" (ie. where part of the signal begins/ends), you do not need high sampling rates to capture these.   Even 44.1khz is fine.

 

However it may be easier to preserve them (ie. not blur them once you've captured them) when using high sampling rates.

Quote

To me it seems like manufacturers care about the audible frequency range (20-20) but do not consider our ability to detect phase information that is far finer. Or do I have this all wrong?

It depends on exactly what you mean by "phase".

 

You seem to mean where exactly a part of the signal (eg. a short click) occurs in time ..... as opposed to whether signal components at 20Hz, 200Hz, 2khz and 20khz are all in time with each other.

 

To get the click to occur at just the precise time.....  You do not need high sampling rates.    That is a myth.

 

However once you have the signal (with its infinite timing precision) .... then you may damage that precision if you convert it from one rate to another  (and this conversion happens more often than you might expect .... eg. inside a DAC).    So it is often better to make (and keep) audio in high sample rate formats, simply to avoid this conversion.

 

Edited by davewantsmoore

9 minutes ago, davewantsmoore said:

It depends on exactly what you mean by "phase".

 

You seem to mean where exactly a part of the signal (eg. a short click) occurs in time ..... as opposed to whether signal components at 20Hz, 200Hz, 2khz and 20khz are all in time with each other.

 

To get the click to occur at just the precise time.....  You do not need high sampling rates.

 

Thank you for the informative response. I do understand the traditional terminology for phase is the degree of delay between the frequencies but didn't know how best to describe "short click", I generalised it to phase. Is "click" the correct terminology?

 

The part where you said it's a myth to require high sampling rates to get the click right on the mark, are you able to elaborate further or point me in the direction of reading material? Thank you.


12 minutes ago, Standards said:

I generalised it to phase. Is "click" the correct terminology?

No... I just used "click" to give the impression of a very short pulse, and where it falls in time.

 

Everything is interrelated... and just different views of the same thing (the signal waveform).

 

12 minutes ago, Standards said:

The part where you said it's a myth to require high sampling rates to get the click right on the mark, are you able to elaborate further or point me in the direction of reading material? Thank you.

Not in a short post... and not without misleading you (and causing arguments from others who are misled .... likely by my poor explanation, rather than "facts").

 

In short.... imagine an analogue signal.    You can have a "spike"  (what I called a "click") occur at any point in time.   Where does it begin in time?  Where does it end in time?    The "time resolution" is infinite, right?   It can start at time = x  .... and you could move the start back or forward in time by infinitely small amounts.

 

 

Let's say you now capture this in a digital format ... with the sampling rate at least twice as high [NOTE] as the highest frequency you want to represent.   (eg. 44.1khz for audio frequencies up to ~22khz).

 

You can represent this infinite time resolution perfectly with your 44.1khz sampling rate.    If you adjust your analogue waveform to move the "click" backwards or forwards in time by one femtosecond .... or 0.0001 femtosecond, whatever....   You are still able to capture the waveform precisely.

 

People get confused about where the sampling points are (and their finite resolution) ... and the resulting waveform it represents (with its resulting infinite resolution)


[NOTE]   This bit is important.    When you see people saying what I've said above is incorrect ... what they will do is violate this restriction.    What they're doing is mixing up the arguments about the SHAPE of the waveform ..... and the TIMING (where a part of the waveform begins/ends in time, and how accurately that can be represented).

 

We're just talking about the TIMING here.    The timing resolution of digital audio is infinite.
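A rough numerical check of this (a sketch, not a proof): sample a band-limited pulse at 44.1khz, then sample the same pulse moved 1 microsecond later, and see whether the shift can be read back from the samples alone. The Gaussian pulse shape, the `SIGMA` width and the amplitude-weighted-mean estimator are all assumptions chosen to keep the example short, and the samples are floating point, so quantisation and noise are ignored.

```python
import math

FS = 44_100.0   # sampling rate in Hz
SIGMA = 40e-6   # assumed pulse width (s); wide enough that content above FS/2 is negligible
N = 441         # 10 ms worth of samples

def sample_pulse(centre_s):
    """Sample a Gaussian pulse centred at centre_s seconds."""
    return [math.exp(-((n / FS - centre_s) ** 2) / (2 * SIGMA ** 2)) for n in range(N)]

def estimate_centre(samples):
    """Estimate the pulse centre (s) as the amplitude-weighted mean of the sample times."""
    total = sum(samples)
    return sum((n / FS) * x for n, x in enumerate(samples)) / total

early = estimate_centre(sample_pulse(5.000e-3))
late = estimate_centre(sample_pulse(5.001e-3))  # same pulse, 1 microsecond later

print(f"sample period  : {1e6 / FS:.2f} us")
print(f"recovered shift: {(late - early) * 1e6:.4f} us")
```

The sample period is about 22.7 microseconds, yet the two sets of samples differ enough that the 1 microsecond move is recoverable, which is the distinction being drawn between where the sampling points sit and what timing the samples can represent.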

 




1 hour ago, davewantsmoore said:

You can represent this infinite time resolution perfectly with your 44.1khz sampling rate.    If you adjust your analogue waveform to move the "click" backwards or forwards in time by one femtosecond .... or 0.0001 femtosecond, whatever....   You are still able to capture the waveform precisely.

I'm having trouble wrapping my head around this. 

 

Let's say we are recording a 22.05khz tone, which is the highest frequency that 44.1khz can capture. So there are about 22.7 microseconds between samples. If a click occurs 5 microseconds after a sample, and finishes in 15 microseconds, does this get captured in the digital recording? If so, how is this possible? A click with a duration of 15 microseconds contains frequencies above 22.05khz.


57 minutes ago, Standards said:

If a click occurs 5 microseconds after a sample, and finishes in 15 microseconds

No... because this click is too short.   ie.  it contains higher frequencies than 22khz.   (and we can't hear that fast).

 

 

If you instead look at a click which is 22khz .... ie. at minimum (one cycle of 22khz) 45 microseconds long.

 

Where does it begin on the recording?   It can begin anywhere.... the resolution is infinite.   It could begin at X.... it could begin at X plus 1 microsecond ..... or wherever.

 

We CAN hear the difference in where the 45us long click occurs in time down to very fine precision..... especially when it is slightly different between multiple channels.

 

..... but we cannot hear the high(er) frequency content of the "click".

 

People routinely mix up the difference.
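The numbers in this exchange can be worked through in a few lines. This is only back-of-envelope arithmetic; the one assumption is that the 15 microsecond click is treated as an ideal rectangular pulse, whose spectrum has its first null at 1/duration, to give a feel for how far its content extends.

```python
def period_us(freq_hz):
    """Length of one period, in microseconds, at the given frequency."""
    return 1e6 / freq_hz

FS_HZ = 44_100.0
CLICK_US = 15.0  # the click duration from the question above

sample_period_us = period_us(FS_HZ)            # spacing between samples
one_cycle_at_nyquist_us = period_us(FS_HZ / 2)  # one full cycle at 22.05khz
# Assumption: ideal rectangular pulse, first spectral null at 1 / duration.
first_null_hz = 1e6 / CLICK_US

print(f"sample period at 44.1khz : {sample_period_us:.2f} us")
print(f"one cycle at 22.05khz    : {one_cycle_at_nyquist_us:.2f} us")
print(f"15 us click, first null  : {first_null_hz / 1000:.1f} khz")
```

So the shortest "click" that fits under 22.05khz is roughly 45 microseconds long, while a 15 microsecond rectangular click has spectral content extending to around 67khz and beyond.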

 




8 minutes ago, davewantsmoore said:

but we cannot hear the high(er) frequency content of the "click".

So we come back to my original point, which was that assuming Leshowitz is correct and that we can indeed hear a 10 microsecond click (not in frequency, but in timing), then a loudspeaker capped at 25khz in frequency response won't be able to fully render the click? Regardless whether the digital side is fed 44.1khz or 96khz. 

 

Again, I'm not saying I'm siding with Leshowitz, only that I'm assuming he is correct in theory, and that I'm understanding his paper correctly.


11 hours ago, Standards said:

For example, I want to play a high-res file at 96khz, but my speakers only go up to 25khz in frequency (so equivalent to a 50khz sampling rate). I do this because I want to avoid reconstruction aliasing in the analog signal after it passes through my NOS DAC, and also to preserve the minimum perceivable phase information in my analog output, which is about 10 microseconds (with a femtosecond clock, this is easily doable). But given the limitations of my loudspeakers, is the phase information above 25khz lost anyway?

 

To me it seems like manufacturers care about the audible frequency range (20-20) but do not consider our ability to detect phase information that is far finer. Or do I have this all wrong?

This won't answer your question directly, but it seems the Yamaha Professional Audio team do care about speakers reproducing temporal resolution down to 10 microseconds or even less, for a studio environment. Please read section 5.9 of this Yamaha white paper and make up your own mind. 

 

https://uk.yamaha.com/en/products/contents/proaudio/docs/audio_quality/05_audio_quality.html

 

As said above already, much of this subject has been debated and continues to draw debate; there's nothing new I can add to it. I just want to point out, in response to your OP question, that some manufacturers are aware of the issues.


Apologies up front, this could be a red herring or rabbit hole.

 

This compilation paper by Bohdan Raczynski pointing to the need for linear phase speakers may, or may not, be what OP is thinking about. http://www.bodziosoftware.com.au/Attributes_Of_Linear_Phase_Loudspeakers.pdf

 

If the content within is relevant, but has already been discussed before on SNA, perhaps someone could just post a link to previous threads to avoid going over old ground here.




9 hours ago, LHC said:

This won't answer your question directly, but it seems the Yamaha Professional Audio team do care about speakers reproducing temporal resolution down to 10 microseconds or even less

...

https://uk.yamaha.com/en/products/contents/proaudio/docs/audio_quality/05_audio_quality.html

Very nice article, I'm still going through it and the other paper you quoted will come later.

 

So at this point my original question has been answered: most loudspeaker companies do not design their speakers to render 10us of temporal resolution. <-- I have graduated from saying "click"!

 

(I know the Yamaha article says 6us, but doesn't quote references - or I've missed it).


56 minutes ago, Standards said:

(I know the Yamaha article says 6us, but doesn't quote references - or I've missed it).

The reference for 6us is in the previous section 4. That reference has been debated before on SNA; not everyone here accepts the validity of the reasoning that led to the 6us limit. I just want to acknowledge that. It is related to what Dave has said about timing resolution being infinite.


17 hours ago, Standards said:

So we come back to my original point, which was that assuming Leshowitz is correct and that we can indeed hear a 10 microsecond click (not in frequency, but in timing)

 

Ahh... Leshowitz

 

The type of signal used in this experiment is a square wave, ie.  pulsed DC.

 

Yes.  If we could record and reproduce this .... practically and without masking .... then yes, you could hear it.

 

 

But real signals aren't so extreme.   They don't have infinitely high spectral content, followed by zero (DC).

 

Leshowitz's experiment doesn't show you can hear high frequencies ....  it shows that DC (or very low frequencies) will mask high frequency noise (which is what the rise/fall spectra of the pulsed DC would become in practice).

 

 

... and it shows that the (low pass filter effect) frequency response right up to the top of the audible range is important.

 

That being said.... real music recordings don't have much power there (ie. above about 10khz) ... so in practice it's actually NOT important to have a flat FR up to 20khz (although it would be if the recording had content of significant SPL there)

 

.... but what it means is that where the content DOES have significant power .... you better not be distorting the amplitude.

 

ie. if you have wiggles in your FR at 5khz 8khz 10khz (whatever) .... and your music has power there.    You will damage the reproduction.

 

 

 

BUT.... that is only one channel.

 

If you look at this in 2 channels.    THEN you can have one signal starting now... and one signal starting 1us later on the other channel   (you don't need high frequencies to do this, because the signal doesn't have to rise, fall, and then begin again).

 

.... and you can get results which say you need very very (!!!) accurate timing.   10us... even smaller results have been shown.

 

.... but even low sampling rate audio has the ability to do this time precision.
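As a hedged illustration of the two-channel case: the sketch below samples a 1khz tone at 44.1khz on a "left" channel and the same tone delayed by 1 microsecond on a "right" channel, then reads the delay back from the phase difference between the channels. A steady tone is used instead of a signal that starts and stops purely to keep the code short, and the tone frequency, window length and single-bin DFT estimator are assumptions for the example rather than anything from the posts above.

```python
import cmath
import math

FS = 44_100.0      # sampling rate in Hz
TONE_HZ = 1_000.0  # assumed test tone, far below the Nyquist frequency
N = 441            # 10 ms, i.e. exactly 10 cycles of the tone

def sample_tone(delay_s):
    """Sample a sine tone that has been delayed by delay_s seconds."""
    return [math.sin(2 * math.pi * TONE_HZ * (n / FS - delay_s)) for n in range(N)]

def tone_phase(samples):
    """Phase (radians) of the TONE_HZ component, from a single-bin DFT."""
    acc = sum(x * cmath.exp(-2j * math.pi * TONE_HZ * n / FS)
              for n, x in enumerate(samples))
    return cmath.phase(acc)

def interchannel_delay(left, right):
    """Delay of right relative to left, in seconds, from the phase difference."""
    dphi = tone_phase(left) - tone_phase(right)
    return dphi / (2 * math.pi * TONE_HZ)

left = sample_tone(0.0)
right = sample_tone(1e-6)  # 1 microsecond later than the left channel

print(f"sample period  : {1e6 / FS:.2f} us")
print(f"recovered delay: {interchannel_delay(left, right) * 1e6:.4f} us")
```

Nothing in the signal is above 1khz, and the delay is a small fraction of one sample period, but the samples still carry it.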

 

 

 

 

I hope all that makes sense (I'm rushing).    Like I've said before, this stuff is so much easier when we can draw pictures, and talk about them as they're being drawn.    Pictures, thousand words.... interactivity, etc.   (frequency response and timing are views of the same thing)

 

 

In very very short terms.

 

#1.  You don't need frequencies above 20khz.

#2.  You do need high timing precision.

#3.  You don't want to filter or resample the audio, as it is likely you will damage the timing precision [and signal amplitude / shape] you have previously captured.

 

 

#3 is why MQA captures all the HF information.... or why if an album is recorded today in 384khz, then I'd prefer to get it in that format (unless someone has been super duper duper careful with it) .....  not because #1 is wrong.

 

It's why high sampling rate DACs (even at the expense of bits, ie. noise) are a thing..... but why feeding those high rate converters, with very precise upsamplers is a thing (rather than letting the DACs use their own internal upsampling, because it's not good enough).


1 hour ago, davewantsmoore said:

#3.  You don't want to filter or resample the audio, as it is likely you will damage the timing precision [and signal amplitude / shape] you have previously captured.

Could you elaborate on that Dave?

 

I note it's common practice to record at 96/24 and distribute as a 44/16 CD.  Is that processing to 44/16 a thing that will "damage the timing precision [and signal amplitude / shape] you have previously captured", and if so how serious is that damage, in your view?



