Suppose you're using FSK with signals at 17.5kHz and 18.5kHz, modulating at 600 baud. What do these signals look like sampled at 44.1ksps? How can you detect them?

(This notebook got big and unwieldy, so I'm continuing in bleep-2.ipynb.)

In [1]:

%pylab inline
import seaborn
style.use('seaborn-poster')
rcParams['lines.linewidth'] = 0.5
random.seed(0)

/usr/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
/usr/lib/python3/dist-packages/matplotlib/__init__.py:874: UserWarning: axes.color_cycle is deprecated and replaced with axes.prop_cycle; please use the latter.
  warnings.warn(self.msg_depr % (key, alt_key))

Populating the interactive namespace from numpy and matplotlib

In [2]:

fs = 44100
baud = 600
freqs = 17500, 18500
message = 'What hath God wrought?'
message_bytes = list(map(ord, message))
' '.join('%02x' % b for b in message_bytes)

Out[2]:

'57 68 61 74 20 68 61 74 68 20 47 6f 64 20 77 72 6f 75 67 68 74 3f'

In [3]:

message_bits = array([1 if byte & (1 << i) else 0 for byte in message_bytes for i in range(8)])
subplot(311); stem(message_bits, '.-')
str(list(message_bits)), len(message_bits)

Out[3]:

('[1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0]',
 176)

In [4]:

msglen = int(ceil(fs * (len(message_bits)) / baud))
t = arange(msglen) / fs  # the time in seconds of each sample
msglen, t

Out[4]:

(12936, array([0.00000000e+00, 2.26757370e-05, 4.53514739e-05, ...,
        2.93265306e-01, 2.93287982e-01, 2.93310658e-01]))

In [5]:

bitidx = floor(t * baud).astype(int)  # the index of the bit we'd ideally be transmitting in each sample
plot(bitidx)
bitidx[::150]

Out[5]:

array([  0,   2,   4,   6,   8,  10,  12,  14,  16,  18,  20,  22,  24,
        26,  28,  30,  32,  34,  36,  38,  40,  42,  44,  46,  48,  51,
        53,  55,  57,  59,  61,  63,  65,  67,  69,  71,  73,  75,  77,
        79,  81,  83,  85,  87,  89,  91,  93,  95,  97, 100, 102, 104,
       106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130,
       132, 134, 136, 138, 140, 142, 144, 146, 148, 151, 153, 155, 157,
       159, 161, 163, 165, 167, 169, 171, 173, 175])

In [6]:

subplot(211); stem(t[:300], message_bits[bitidx][:300])
subplot(212); plot(t, message_bits[bitidx])
message_bits[bitidx]  # The actual bit we'd ideally be transmitting at each sample

Out[6]:

array([1, 1, 1, ..., 0, 0, 0])

In [7]:

fs/baud

Out[7]:

73.5

Let's apply a triangular window with a 73-sample slope to that signal.

In [8]:

def box_filter(sig, width):
    cs = sig.cumsum()  # this may induce rounding error for long floating-point signals
    return concatenate((cs[:width], cs[width:] - cs[:-width])) / width

In [9]:

bw = box_filter(message_bits[bitidx], 73)  # bits windowed
subplot(811); plot(t, bw)
bw

Out[9]:

array([0.01369863, 0.02739726, 0.04109589, ..., 0.        , 0.        ,
       0.        ])

In [10]:

c0 = sin(17500 * 2 * pi * t)  # carrier 0
c1 = sin(18500 * 2 * pi * t)  # carrier 1
subplot(811); xlim(0, 0.001); plot(t, c0); plot(t, c1)
c0

Out[10]:

array([ 0.        ,  0.60380441, -0.96262425, ...,  0.78183148,
       -0.2467574 , -0.3884348 ])

In [11]:

# formatting from https://stackoverflow.com/a/29188910
subplot(811); xticks(linspace(0, .30, 16)); gca().xaxis.set_major_formatter(FormatStrFormatter('%.2f')); plot(t, bw * c1)
bw * c1

Out[11]:

array([ 0.        ,  0.01327392, -0.03483577, ...,  0.        ,
       -0.        ,  0.        ])

In [12]:

sig = bw * c1 + (1 - bw) * c0
subplot(811); plot(t, sig)
subplot(812); xlim(0, .008); plot(t, sig)
sig

Out[12]:

array([ 0.        ,  0.60053574, -0.95790012, ...,  0.78183148,
       -0.2467574 , -0.3884348 ])

In [13]:

gca().set_yscale('log')
sig_f = fft.fft(sig)
freq = fs * arange(len(sig_f)) / len(sig_f)
ylim(1, len(sig))
xlim(0, fs/2)
plot(freq, abs(sig_f))
sig_f

Out[13]:

array([-0.16018656+0.j        , -0.16008777-0.02442436j,
       -0.19167646+0.01235782j, ..., -0.16908622+0.00585703j,
       -0.19167646-0.01235782j, -0.16008777+0.02442436j])

The spreading of the sharp peaks there is a result of the windowing of the signal; the near-triangular window we're using here is dead simple and makes the spreading a lot less than a rectangular window would. The unequal height of the peaks might be a result of unequal numbers of 1s and 0s in the message, or it might be a result of giving the zeroes an extra sample, but I'm guessing the former.

To recover the bits, we can correlate the signal with the two carrier signals. In reality we'd have to do this with not just the real carriers but also their quadrature counterparts (the imaginary parts of the complex exponentials they came from), because we don't know their phase. But in reality we'll just use the Goertzel algorthm instead. Or maybe even something even simpler.

In [14]:

subplot(811); plot(t, sig * c0)
subplot(812); plot(t, sig * c1)
sig * c0

Out[14]:

array([0.        , 0.36260613, 0.92209788, ..., 0.61126047, 0.06088921,
       0.15088159])

That looks promising. Let's window it a bit.

In [15]:

# Note that using a sum table here is cheating a bit since we're in floating point,
# but it's okay for such a short signal
cs_w = box_filter(sig * c1, 73)
subplot(811); plot(t, cs_w)
cs_w    

Out[15]:

array([0.        , 0.00398573, 0.01510879, ..., 0.08690622, 0.09558987,
       0.08836754])

That looks sort of okay. How about if we compare the two correlated signals?

In [16]:

cs0_w = box_filter(sig * c0, 73)
subplot(811); plot(t, cs0_w)
subplot(812); plot(t, cs_w)
subplot(813); plot(t, cs_w > cs0_w)
subplot(814); plot(t, message_bits[bitidx])
cs0_w

Out[16]:

array([0.        , 0.00496721, 0.01759869, ..., 0.49949984, 0.4977551 ,
       0.49928414])

Okay, that seems to have worked reasonably well, aside from the phase shift produced by the windowing. What about if there's noise?

In [17]:

noise = rand(len(sig))
subplot(811); plot(t, noise)
noise

Out[17]:

array([0.5488135 , 0.71518937, 0.60276338, ..., 0.93613178, 0.54873621,
       0.17713388])

In [18]:

ns = sig + noise  # noisy signal
ns0 = box_filter(ns * c0, 73)
ns1 = box_filter(ns * c1, 73)
subplot(811); plot(t, ns0)
subplot(812); plot(t, ns1)
subplot(813); plot(t, ns1 > ns0)
subplot(814); plot(t, message_bits[bitidx])
ns1

Out[18]:

array([0.        , 0.00873241, 0.01285624, ..., 0.09829513, 0.09040483,
       0.0882162 ])

Well, it didn't completely fall apart. There were a couple of extra transitions induced by noise during a transition; hysteresis is the standard way to clean those up, though I'm not going to touch that here.

What does the spectrum of the signal look like?

In [19]:

gca().set_yscale('log')
ns_f = fft.fft(ns)
freq = fs * arange(len(ns)) / len(ns)
ylim(1, len(ns))
xlim(0, fs/2)
plot(freq, abs(ns_f))
ns_f

Out[19]:

array([ 6.41384625e+03 +0.j        ,  3.40931632e+01 -3.32793284j,
       -3.49511780e-02-41.65632472j, ..., -2.28415030e+01+22.53356689j,
       -3.49511780e-02+41.65632472j,  3.40931632e+01 +3.32793284j])

It's kind of an easy problem if your signal is 20dB above the noise. What if it's just barely above the noise?

In [20]:

rns = sig + 10 * noise  # really noisy signal
subplot(411); plot(t, rns)
rns

Out[20]:

array([ 5.48813504,  7.75242941,  5.06973365, ..., 10.14314929,
        5.24060474,  1.38290404])

In [21]:

gca().set_yscale('log')
plot(abs(fft.fft(rns)))
rns

Out[21]:

array([ 5.48813504,  7.75242941,  5.06973365, ..., 10.14314929,
        5.24060474,  1.38290404])

In [22]:

rns0 = box_filter(rns * c0, 73)
rns1 = box_filter(rns * c1, 73)
subplot(811); plot(t, rns0)
subplot(812); plot(t, rns1)
subplot(813); plot(t, rns1 - rns0)
subplot(814); plot(t, rns1 > rns0)
subplot(815); plot(t, message_bits[bitidx])
rns1

Out[22]:

array([ 0.        ,  0.05145256, -0.00741679, ...,  0.20079531,
        0.04373947,  0.08685417])

That still mostly worked but there are for sure some points where it didn't. It's a little hard to see though. Let's try some hysteresis after all.

In [23]:

def _schmitt(diff, size):
    val = 0  # presume diff starts negative
    for item in diff:
        threshold = size/2 if val == 0 else -size/2
        val = 1 if item > threshold else 0
        yield val

def schmitt(diff, size):
    return array(list(_schmitt(diff, size)))

subplot(811); plot(t, message_bits[bitidx])
subplot(812); plot(t, schmitt(rns1 - rns0, 0))  # no hysteresis
subplot(813); plot(t, schmitt(rns1 - rns0, .05))
subplot(814); plot(t, schmitt(rns1 - rns0, .1))
subplot(815); plot(t, schmitt(rns1 - rns0, .2))
subplot(816); plot(t, schmitt(rns1 - rns0, .5))
subplot(817); plot(t, message_bits[bitidx])
rns1 - rns0

Out[23]:

array([ 0.        , -0.01267006, -0.00468669, ..., -0.44152594,
       -0.62325049, -0.57961247])

Well, that does seem to help, but it doesn't solve the problem completely; once you crank the hysteresis high enough to eliminate all the false transitions, it eliminates a lot of the true ones too. You could probably use this approach with some error correction purely in the digital domain, with no PRML or anything — after all, most of the bits are fine at the penultimate hysteresis level there.

This is leaving out the fact that hypothetically we don't know the phase of the carrier signals. But you could probably lock onto those pretty well and avoid the extra loss of noise immunity that comes from taking the absolute values here:

In [24]:

subplot(811); plot(t, abs(ns1) - abs(ns0))
subplot(812); plot(t, schmitt(abs(ns1) - abs(ns0), .1))
subplot(813); plot(t, message_bits[bitidx])
subplot(814); plot(t, schmitt(abs(rns1) - abs(rns0), .2))
subplot(815); plot(t, abs(rns1) - abs(rns0))
abs(ns1) - abs(ns0)

Out[24]:

array([ 0.        , -0.00215033, -0.00270957, ..., -0.41548685,
       -0.42427376, -0.42778619])

Hmm, perhaps part of the problem is that we're in effect windowing the bits too much; after passing through two 73-sample-wide box filters, an isolated 1 bit between two 0s, or an isolated 0 between two 1s, won't reach full response levels, which seems like it could hurt our noise immunity:

In [25]:

subplot(811); plot(t, box_filter(bw, 73))
box_filter(bw, 73)

Out[25]:

array([0.00018765, 0.00056296, 0.00112591, ..., 0.        , 0.        ,
       0.        ])

If we use narrower box-filter windows, we won't attenuate noise as much, but we also won't attenuate the signal. This is sort of like the simple-moving-average version of using a root-raised-cosine window instead of a raised-cosine window. My thought is that you want to window at the sender to avoid producing a really wideband signal (in this case, one with audible sidebands) and you want to window at the receiver so you can filter out wideband frequency noise.

In [26]:

bw2 = box_filter(message_bits[bitidx], 37)
subplot(811); plot(t, bw2)
subplot(812); plot(t, box_filter(bw2, 37))
bw2

Out[26]:

array([0.02702703, 0.05405405, 0.08108108, ..., 0.        , 0.        ,
       0.        ])

In [27]:

sig2 = bw2 * c1 + (1 - bw2) * c0
rns2 = sig2 + 10 * noise
gca().set_yscale('log')
plot(abs(fft.fft(rns2)))
rns2

Out[27]:

array([ 5.48813504,  7.74924908,  5.0743301 , ..., 10.14314929,
        5.24060474,  1.38290404])

In [28]:

subplot(811); plot(t, sig2)
subplot(812); plot(t, rns2)
subplot(813); xlim(0, .008); plot(t, rns2)
sig2

Out[28]:

array([ 0.        ,  0.59735542, -0.95330366, ...,  0.78183148,
       -0.2467574 , -0.3884348 ])

In [29]:

rns2_c0 = box_filter(rns2 * c0, 37)
rns2_c1 = box_filter(rns2 * c1, 37)
subplot(811); plot(t, rns2_c0)
subplot(812); plot(t, rns2_c1)
subplot(813); fill(t, where(t == max(t), 0, schmitt(rns2_c1 - rns2_c0, .8)))
subplot(814); fill(t, where(t == 0, 0, message_bits[bitidx]))
rns2_c1

Out[29]:

array([ 0.        ,  0.10147287, -0.01478009, ...,  0.22902223,
       -0.02087547,  0.24205015])

I think that may be better, but I'm not really sure. On the other hand, it is an awful lot of noise. Let's see what happens with noise 10dB lower.

In [30]:

rns3 = sig2 + 3.2 * noise
subplot(211).set_yscale('log'); plot(abs(fft.fft(rns3)))
rns3

Out[30]:

array([1.75620321, 2.88596139, 0.97553914, ..., 3.77745318, 1.50919849,
       0.17839363])

In [31]:

rns3_c0 = box_filter(rns3 * c0, 37)
rns3_c1 = box_filter(rns3 * c1, 37)
subplot(811); plot(t, rns3_c0)
subplot(812); plot(t, rns3_c1)
subplot(813); plot(t, rns3_c1 - rns3_c0)
subplot(814); fill(t, schmitt(rns3_c1 - rns3_c0, .8))
subplot(815); fill(t, where(t == 0, 0, message_bits[bitidx]))
rns3_c1

Out[31]:

array([0.        , 0.03779034, 0.01544073, ..., 0.10600469, 0.04352777,
       0.13868389])

In [32]:

# From vocoder.py
import wave

def read_wav(filename):
    "Returns a Numpy array of the left channel of a .wav file."
    with wave.open(filename) as wav:
        assert wav.getnchannels() == 2, filename
        assert wav.getsampwidth() == 2, filename
        assert wav.getframerate() == 44100, filename
        nframes = wav.getparams().nframes
        samples = wav.readframes(nframes)

    # XXX note that the right channel is still stored in memory, just
    # inaccessible through this view
    return ndarray(shape=(nframes*2,), dtype='<h', buffer=samples)[::2]

def write_wav(filename, samples):
    buf = zeros(len(samples) * 2, dtype='<h')
    buf[1::2] = buf[::2] = numpy.round(samples)
    with wave.open(filename, 'w') as outf:
        outf.setnchannels(2)
        outf.setsampwidth(2)
        outf.setframerate(44100)
        outf.writeframes(buf.tobytes())

def dump(filename, sig):
    max_s = max(sig)
    min_s = min(sig)
    sig = (sig - (max_s + min_s)/2) / (max_s - min_s)
    write_wav(filename, 32767 * sig)

In [33]:

dump('ultrafsk.wav', sig2)

In [34]:

%%bash
play -q ultrafsk.wav

Of course I couldn't hear that. But I can definitely for sure hear this. (These are SoX commands. Install SoX.)

In [35]:

%%bash
play -q -r 11025 ultrafsk.wav

Let's try recording this laptop's speaker with its own microphone.

In [36]:

%%bash
# uncomment the next line to actually do this again:
#rec -q -r 44100 testout.wav trim 0 0:01 &
play -q ultrafsk.wav
wait
play -q testout.wav

In [37]:

testout = read_wav('testout.wav')
tot = arange(len(testout)) / fs  # testout time
subplot(411); plot(tot, testout)
testout

Out[37]:

array([ 147,  171,  234, ..., -493, -459, -485], dtype=int16)

You can clearly see the burst of ultrasound there near the beginning, which is pretty good evidence that both the laptop's speaker and microphone are passing it, even though I can't hear it. (This is normal; audio equipment is normally rated to go to 20kHz.) What does its spectrum look like?

In [38]:

subplot(211).set_yscale('log')
tos = fft.fft(testout)
plot(abs(tos[:len(tos)//2]))
tos

Out[38]:

array([ 934179.             +0.j        ,
        519515.61618426  +3163.66613733j,
        791230.45140102+152554.68504232j, ...,
       1072553.58266427 +75348.60789066j,
        791230.45140102-152554.68504232j,
        519515.61618426  -3163.66613733j])

That looks pretty promising, although the 18.5kHz peak is dismayingly lower than the 17.5kHz peak by almost 10dB; I thought maybe the high-frequency rolloffs of the speakers and microphone was already kicking in, but when I repeated the experiment I usually didn't get this result. Now I think it might have been multipath fading. I think the peaks at odd multiples of 1kHz are some kind of problem in the laptop's recording hardware; I've seen them before on this laptop.

Let's do the crudest possible detection of an unknown-phase 18.5kHz wave.

In [39]:

c0t = tot * 17500 * 2 * pi  # carrier 0 theta
c0c = cos(c0t)
c0s = sin(c0t)
subplot(411); plot(tot, c0c * testout)
subplot(412); plot(tot, c0s * testout)
c0c

Out[39]:

array([ 1.        , -0.79713251,  0.27084047, ...,  0.36534102,
        0.27084047, -0.79713251])

So far that is not very fucking inspiring, although there does seem to be a little bit of asymmetry toward the beginning. Let's try running a low-pass box filter over it again and see if that cleans it up.

In [40]:

subplot(411); plot(tot, box_filter(c0c * testout, 37))
subplot(412); plot(tot, box_filter(c0s * testout, 37))
None

Well, it's definitely detecting some kind of signal, and somewhat alarmingly, it looks like it's experiencing a phase shift during the pulse, suggesting that either the playback and recording clocks are running at different speeds, or the distance from the speaker to the microphone was changing. The phase shift looks to be about 90°, which works out to about

In [41]:

1000*1000/17500/4

Out[41]:

14.285714285714286

14 μs, which is

In [42]:

14 * 343

Out[42]:

5000 microns, or 5 millimeters. And actually it's totally plausible that the microphone and speaker could have moved by that much, since they're on totally different parts of the laptop, with the hinge in between. Good thing we're not using QPSK.

I was sort of expecting we might see multipath interference, since I did this in a pretty echoey bedroom, but I'm not seeing it yet.

Let's look at the magnitude:

In [43]:

c0m = sqrt(box_filter(c0c * testout, 37)**2 + box_filter(c0s * testout, 37)**2)  # carrier 0 magnitude
subplot(411); plot(tot, c0m)
subplot(412); xlim(0.02, 0.32); plot(tot, c0m)
subplot(413); fill(t, where(t == max(t), 0, 1 - message_bits[bitidx]))
c0m

Out[43]:

array([ 3.97297297,  2.80547315,  3.85746607, ..., 12.71870699,
       17.35055758,  4.80839189])

That does kind of look like it encodes our desired signal, although for some reason the amplitude dropped to almost half around millisecond 200. Let's look at the 18.5kHz signal, which may not be as pretty.

In [44]:

c1t = tot * 18500 * 2 * pi  # carrier 1 theta
c1c = cos(c1t)
c1s = sin(c1t)
c1m = sqrt(box_filter(c1c * testout, 37)**2 + box_filter(c1s * testout, 37)**2)  # carrier 1 magnitude
subplot(811); plot(tot, box_filter(c1c * testout, 37))
subplot(812); plot(tot, box_filter(c1s * testout, 37))
subplot(813); plot(tot, c1m)
subplot(814); xlim(0.02, 0.32); plot(tot, c1m)
subplot(815); fill(t, where(t == 0, 0, message_bits[bitidx]))
c1m

Out[44]:

array([ 3.97297297,  2.24026016,  4.53190035, ..., 16.40043827,
        8.59627017, 18.37483381])

Okay, while that clearly captured some kind of information about our original signal, it also looks like total shit. I suspect that what's happening here is that the 10dB stronger signal at 17.5 kHz is leaking over too much, because there is way more noise here than you would expect from looking at the region when there was no ultrasound, and also the interference dies down when we're transmitting 1 bits.

So, the dumbest possible demodulation scheme may not work here.

Not quite the dumbest possible demodulation scheme¶

Maybe we can improve matters by using a less derpy filter, maybe make it a bit wider, and also cancel out entirely the leakage by putting a null at the beat frequency. 18.5 kHz - 17.5 kHz = 1 kHz, so let's place it at 44 samples.

In [45]:

c1cld = box_filter(box_filter(box_filter(c1c * testout, 44), 20), 3)  # carrier 1 cosine less derpy
c1sld = box_filter(box_filter(box_filter(c1s * testout, 44), 20), 3)  # carrier 1 sine less derpy
c1mld = sqrt(c1cld**2 + c1sld**2)                                      # carrier 1 magnitude less derpy
subplot(811); plot(tot, c1cld)
subplot(812); plot(tot, c1sld)
subplot(813); plot(tot, c1mld)
subplot(814); xlim(0.02, 0.32); plot(tot, c1mld)
subplot(815); fill(t, where(t == 0, 0, message_bits[bitidx]))
c1mld

Out[45]:

array([0.05568182, 0.11475714, 0.21198083, ..., 1.96398406, 2.20469144,
       2.437543  ])

Well, that's a bit better. But not enough. At least it canceled out the extreme oscillations of leakage from the 17.5kHz signal, which were actually large enough to invert things here at times.

Let's try the 44-sample trick with the 0 carrier, without the extra cascaded filters:

In [46]:

c0mld = sqrt(box_filter(c0c * testout, 44)**2 + box_filter(c0s * testout, 44)**2)  # carrier 0 magnitude less derpy
subplot(511); plot(tot, c0mld, label="carrier 0 magnitude"); legend()
subplot(512); xlim(0.02, 0.32); plot(tot, c0mld, label="zoomed"); legend()
subplot(513); fill(t, where(t == max(t), 0, 1 - message_bits[bitidx]), label="zero bits"); legend()
extrashift =11  # 11.5 samples of shift on c1mld not present in c0mld
compare_vscale = lambda: ylim(-800, 400)
subplot(514); xlim(.02, .32); compare_vscale();
plot(tot[:-extrashift], c1mld[extrashift:] - c0mld[:-extrashift], label="combined signals"); legend(loc=4)
subplot(515); xlim(.02, .32); compare_vscale(); plot(tot, -c0mld, label="carrier 0 inverted alone"); legend(loc=4)
c0mld

Out[46]:

array([ 3.34090909,  2.35914787,  3.24377829, ..., 16.27394997,
       13.42463531,  8.56497466])

That does seem considerably cleaner, although the c1mld signal is providing only minimal assistance there. It seems clear that decent performance is going to need some kind of slowly varying threshold level, not just a Schmitt trigger at a fixed level; we haven't used an equal-weight code like the two-of-five code used in decimal applications, so there are legitimately parts of the message that have more zeroes (like the spaces) and parts that have more ones. That means that a simple average is going to have a hard time, but maybe it's good enough. Also, a code with more transitions could maybe provide easier clock recovery.

We're getting smaller peaks and troughs for isolated 0s and 1s because widening the smoothing filter to 44 samples, while very helpful for separating the two carriers, also makes the transitions take longer. It's not quite as bad as with our initial 73-sample filter, but you can definitely see the effect, and it's probably bad. Probably the solution is to narrow the smoothing filter used on transmission to compensate, resulting in spreading out the sidebands over more bandwidth; this may worsen the leakage problem again, but I don't think it will be as bad as it was originally. Also maybe that filter could gain a stage of its own.

Also maybe we could use frequencies that are a bit further apart and whose beat frequency divides evenly into 44100 Hz. For example, 17.5 kHz and 18970 Hz, which beat at 30 samples. Then we could use a cascade of two 30-sample-wide box filters both on input and on output and life will be good.

In [47]:

17500 + 44100/30

Out[47]:

18970.0

Comb-filter separation¶

If we want to use a couple of simple comb filters on the input to attenuate the competing signals more, what lags should we use? Consider, for example, 7:

In [48]:

def odd_forward_comb(n):
    k = zeros(fs)
    k[0] = 1
    k[n] = -1
    return k

k7f = fft.fft(odd_forward_comb(7))
subplot(211).set_yscale('log'); plot(abs(k7f[:len(k7f)//2]))
k7f

Out[48]:

array([0.00000000e+00+0.j        , 4.97334522e-07+0.00099733j,
       1.98933759e-06+0.00199466j, ..., 4.47600773e-06-0.00299199j,
       1.98933759e-06-0.00199466j, 4.97334522e-07-0.00099733j])

This has a series of sharp nulls, one of which is pointed almost exactly at the 18970Hz carrier:

In [49]:

argmin(k7f[15000:20000])+15000

Out[49]:

Compare its frequency response with a similar comb filter with a lag of 5:

In [50]:

subplot(411).set_yscale('log'); xlim(17000, 19000); ylim(.01, 2); xticks([17000, 17500, 18000, 18500, 18970])
plot(abs(k7f), label="7"); plot(abs(fft.fft(odd_forward_comb(5))), label="5"); plot(abs(fft.fft(odd_forward_comb(31))), label="31")
legend()
None

We can plot the attenuations given to a particular frequency by different lags in a very simple way:

In [51]:

lags = arange(61)
c0_attenuations = abs(cos(lags * 17500 * 2 * pi / 44100)-cos(0))
subplot(411); stem(c0_attenuations)
subplot(412).set_yscale('log'); stem(c0_attenuations)
c0_attenuations

Out[51]:

array([0.        , 1.79713251, 0.72915953, 0.63465898, 1.85329088,
       0.00496922, 1.73305187, 0.82635182, 0.54378934, 1.90096887,
       0.01982751, 1.66168584, 0.92526991, 0.45745374, 1.93969262,
       0.04442719, 1.58374367, 1.02493069, 0.3765102 , 1.96907729,
       0.07852379, 1.5       , 1.1243437 , 0.30176318, 1.98883083,
       0.12177843, 1.4112871 , 1.22252093, 0.23395556, 1.99875692,
       0.17376123, 1.31848665, 1.31848665, 0.17376123, 1.99875692,
       0.23395556, 1.22252093, 1.4112871 , 0.12177843, 1.98883083,
       0.30176318, 1.1243437 , 1.5       , 0.07852379, 1.96907729,
       0.3765102 , 1.02493069, 1.58374367, 0.04442719, 1.93969262,
       0.45745374, 0.92526991, 1.66168584, 0.01982751, 1.90096887,
       0.54378934, 0.82635182, 1.73305187, 0.00496922, 1.85329088,
       0.63465898])

In [52]:

c1_attenuations = abs(cos(lags * 18970 * 2 * pi / 44100)-cos(0))
subplot(411); stem(c1_attenuations)
subplot(412).set_yscale('log'); xticks(lags[::2]); stem(c1_attenuations)
c1_attenuations

Out[52]:

array([0.        , 1.90525125, 0.36104037, 1.25158676, 1.18346117,
       0.41625633, 1.8734082 , 0.00243595, 1.932684  , 0.30893735,
       1.31848665, 1.11444178, 0.47431623, 1.83731   , 0.00973193,
       1.95557281, 0.26020112, 1.3838349 , 1.04486483, 0.53493721,
       1.79713251, 0.0218524 , 1.97380616, 0.21506912, 1.44731315,
       0.97506931, 0.59782393, 1.75307147, 0.0387383 , 1.98729523,
       0.17376123, 1.50861213, 0.90539525, 0.66267001, 1.70534154,
       0.06030738, 1.99597429, 0.13647868, 1.5674332 , 0.83618209,
       0.72915953, 1.65417525, 0.08645454, 1.99980107, 0.10340313,
       1.6234898 , 0.76776704, 0.79696856, 1.59982189, 0.11705241,
       1.99875692, 0.0746957 , 1.67650882, 0.7004834 , 0.86576673,
       1.54254626, 0.1519519 , 1.99284693, 0.05049626, 1.72623195,
       0.63465898])

If we want to optimize for comb-filter nulls, we could maybe use frequencies that are periodic at an integer number of samples, like 10 and 14 or 15 or something like that. Unfortunately having a large LCM means that these periods generate a lot of the same frequencies:

In [53]:

multiply.outer(44100/array([10, 15]), arange(10))

Out[53]:

array([[    0.,  4410.,  8820., 13230., 17640., 22050., 26460., 30870.,
        35280., 39690.],
       [    0.,  2940.,  5880.,  8820., 11760., 14700., 17640., 20580.,
        23520., 26460.]])

These comb filters are sufficiently strong that we can visually see the effect easily, and the attenuation at low frequencies is also noticeably helpful. Here I'm using a 31-sample lag to attenuate the 18500Hz signal used in the first tests; its nulls are both more numerous and narrower, and one of them happens to hit 18500Hz almost exactly.

In [54]:

def sub_ff_comb(sig, lag):
    return concatenate((sig[:lag], sig[lag:] - sig[:-lag]))

subplot(611); plot(t, sig2)
subplot(612); plot(t, sub_ff_comb(sig2, 5))
subplot(613); plot(t, sub_ff_comb(sig2, 31))
testlim = lambda: xlim(0, .35)
subplot(614); testlim(); plot(tot, testout)
subplot(615); testlim(); plot(tot, sub_ff_comb(testout, 5))
subplot(616); testlim(); plot(tot, sub_ff_comb(testout, 31))
3

Out[54]:

Combined with the other smoothing filters we were using previously seems to give good results, although the combined smoothing effects are attenuating some of our peaks.

In [55]:

a = sqrt(box_filter(box_filter(c0c * sub_ff_comb(testout, 31), 44), 20)**2 +
         box_filter(box_filter(c0s * sub_ff_comb(testout, 31), 44), 20)**2)
b = sqrt(box_filter(box_filter(c1c * sub_ff_comb(testout, 5), 44), 20)**2 +
         box_filter(box_filter(c1s * sub_ff_comb(testout, 5), 44), 20)**2)

subplot(511); testlim(); plot(tot, a)
subplot(512); testlim(); plot(tot, b)
subplot(513); testlim(); plot(tot, b-a)
subplot(514); testlim(); plot(tot[:-extrashift], c1mld[extrashift:] - c0mld[:-extrashift], label="previous results")
legend(loc=4)
subplot(515); fill(t, where(t==0, 0, message_bits[bitidx]))
3

Out[55]:

Let's look at the transitions themselves, by using a similar comb filter to approximate the derivative over a short distance without amplifying high-frequency noise too much. This compensates for the signal drift and somewhat for the attenuation of the peaks — the transitions still reach almost the same steepest slope.

In [56]:

baderiv = sub_ff_comb(b-a, 30)/30
threshold = quantile(baderiv, .96)
subplot(411); testlim(); plot(tot, b-a)
subplot(412); testlim(); plot(tot, baderiv);
subplot(413); testlim(); fill(tot, where(tot==tot.max(), 0, schmitt(baderiv, threshold)))
subplot(414); xlim(-.018, .332); fill(t, where(t==0, 0, message_bits[bitidx]))
baderiv

Out[56]:

array([ 0.        , -0.00083324,  0.00134265, ..., -0.14149   ,
       -0.14818087, -0.16480344])

Well, that looks like an almost decent demodulation. But repeating the experiment a few times, I never got quite such good results; this suggests that I overfit the parameters above (and maybe the algorithm) to this data. Here's using an adaptive threshold on the levels rather than the transitions, which doesn't work as well:

In [57]:

baadj = (b-a)[100:-100]-box_filter(b-a, 200)[:-200]
totadj = tot[100:-100]
subplot(411); testlim(); plot(totadj, baadj)
subplot(412); testlim(); fill(totadj, where((totadj == totadj.min()) | (totadj==totadj.max()), 0, schmitt(baadj, 5*threshold)))
subplot(413); xlim(-.018, .332); fill(t, where(t==0, 0, message_bits[bitidx]))
0

Out[57]:

I think part of the problem is that the trapezoidal window we're using to transmit has enough spectral leakage as to make it really hard to distinguish the frequencies, especially if one of them is badly attenuated. This inference is in part due to being able to hear what sounds like quiet white noise down in audible frequencies during the transmission. Here we see that the 17500-Hz signal amplitude is only attenuated by 30× (30dB) at 18500 Hz.

This is not really a problem with the corners of the trapezoidal window so much as with the width of the untouched area in the middle. Using a more Gaussian window approximated by four 37-sample box filters gets the attenuation to 3000× (70dB).

In [58]:

pulselen = int(ceil(5 * fs/baud))
pulse_t = arange(pulselen) / fs
window = box_filter(where(((2/baud) < pulse_t) & (pulse_t <= 3/baud), 1, 0), 37)
# window = exp(-((pulse_t - 2.75/baud)*1000)**2) # vaguely similar actual Gaussian
pulse = c0[:pulselen] * window
pulse_ext = concatenate((pulse, zeros(fs-len(pulse))))  # extend it out to a second so our frequencies are in Hz
pulse_spectrum = fft.fft(pulse_ext)

subplot(411); plot(pulse_t, pulse); plot(pulse_t, window)
subplot(412).set_yscale('log'); xlim(0, fs/2); ylim(.0001, 100); plot(abs(pulse_spectrum))
subplot(413).set_yscale('log'); xlim(17000, 20000); ylim(.00001, 100); plot(abs(pulse_spectrum))

pulselen

Out[58]:

What does the spectrum of a simple CPFSK signal look like?

In [59]:

phi_t = (where(message_bits[bitidx], 17500, 18500)/fs * 2 * pi).cumsum()
cpfsk = cos(phi_t)
subplot(511);plot(t, sub_ff_comb(phi_t, 1))
subplot(512); plot(t, cpfsk)
subplot(513); xlim(0, .008); plot(t, cpfsk)
subplot(514).set_yscale('log'); xlim(0, fs/2); ylim(.1, 10000); plot(abs(fft.fft(concatenate((cpfsk, zeros(fs-len(cpfsk)))))))
subplot(515).set_yscale('log'); xlim(0, fs/2); ylim(.1, 10000); plot(abs(fft.fft(concatenate((sig2, zeros(fs-len(sig2)))))))
cpfsk

Out[59]:

array([-0.79713251,  0.27084047,  0.36534102, ..., -0.44985987,
       -0.03917084,  0.51839257])

Not too bad I guess? And it looks like we can demodulate it too, using the same approaches we were using before:

In [60]:

cpfsk_c0 = sqrt(box_filter(box_filter(c0c[:len(cpfsk)] * sub_ff_comb(cpfsk, 31), 44), 20)**2 +
                box_filter(box_filter(c0s[:len(cpfsk)] * sub_ff_comb(cpfsk, 31), 44), 20)**2)
cpfsk_c1 = sqrt(box_filter(box_filter(c1c[:len(cpfsk)] * sub_ff_comb(cpfsk, 5), 44), 20)**2 +
                box_filter(box_filter(c1s[:len(cpfsk)] * sub_ff_comb(cpfsk, 5), 44), 20)**2)

subplot(411); plot(t, cpfsk_c0)
subplot(412); xlim(0, .008); plot(t, cpfsk_c0)
subplot(413); plot(t, cpfsk_c1)
subplot(414); xlim(0, .008); plot(t, cpfsk_c1)
cpfsk_c0

Out[60]:

array([9.05832395e-04, 2.06537820e-03, 3.09585351e-03, ...,
       1.71696240e-05, 1.10494901e-05, 2.10186226e-05])

In [60]: