Sleeping through tanh
Finally! SIX BLESSED HOURS of sleep, after weeks of halting at 3.5 hours and being unable to resume. Good old Tanh, the
formula of Life, turned out to be the answer.
Specifically: I finally found a proper way to compress the audio in my bedtime OTR playlist. I've been trying various shareware progs for years, with no real luck.
Most of the available "normalizers" act linearly without altering the internal shape of the waves. This method fails because the algorithms don't account for silence. They run a straightforward RMS of the whole file, which means that a clip with pauses will have an
apparent mean that's considerably lower than its
actual sound. It will then end up WAY too loud after processing. A 'peaky' clip has the same problem in the other direction.
Non-sleep finally raised my grumpy energy to a threshold where it overflowed into coding energy.
I've always known what's needed. Reduce the
deltas. The old AM broadcast compression circuit was just right. A non-linear AGC (soft-clipping or diode clipping) with syllable-length attack and release, leading to an overall equalization
within the wave. Louder cycles should come down a bit, and softer cycles should come up to 'fill in' the envelope.
Turns out I didn't need the attack and release. After I got the non-linear equalization properly tuned, it's unquestionably
good enough.
Here's a short bit of waveform to show what I mean by equalizing.
Original is above, equalized is below. You can see that the peaks end up about the same and the low zones 'fill in' to some extent. It's not dramatic in the graph, but it's ENOUGH. I can turn down the physical volume knob on the speaker; I can still understand the speech without straining; then after I've faded into dreamland, nothing is salient or surprising enough to wake me. Even 'peaky' clips (like
wartime news reports with a mix of studio and SW remotes) stay firmly clamped within the same range.
I won't bother to upload the entire script because it's customized to run a batch job on my own folders, and it doesn't have a GUI or anything.
But in case someone wants to use the formula, here's the relevant bit of Python code:
= = = = =
# Overall idea: Normalize and floatify so that
# highest peak becomes 1, then tanh the normalized array.
# First run through to find highest peak:
Peak=-1000000
for i in range(RealBlock):
if SourceArray[i] > Peak: Peak = SourceArray[i]
# Now Peak holds the highest actual DAC value.
# Normalize and tanh all at once:
for i in range(RealBlock):
NormedArray[i]= 1.2 * math.tanh( 1.2 * float(SourceArray[i]) / float(Peak) )
# Now put back in DAC-style numbers, multiplying by a constant Standard Peak
# which is about 1/3 of full range:
for i in range(RealBlock):
SourceArray[i] = int(float(NormedArray[i]) * float(StandardPeak))
# Place the result in Dest file
fpDst.writeframes(SourceArray.tostring())
= = = = =
Note: It's coincidence that 1.2 appears twice. The outer 1.2 brings the level back up after the tanh takes it down by about .75, and the 1.2 inside the tanh function serves to shape the equalizing. This parameter was 'tuned' by trial and error.
You might be able to rework the essential formula into a macro for Audacity or Sox or Ffmpeg or whatever.
Labels: Grand Blueprint, Zero Problems