Think DSP

1.2 Spectral decomposition

  • The most important topic in this book is spectral decomposition, which is the idea that any signal can be expressed as the sum of sinusoids with different frequencies.
  • The most important mathematical idea in this book is the Discrete Fourier Transform, or DFT, which takes a signal and produces its spectrum. The spectrum is the set of sinusoids that add up to produce the signal.
  • The most important algorithm in this book is the Fast Fourier Transform, or FFT, which is an efficient way to compute the DFT.

1.3 ~ 1.5

1.6 Wave objects

  • Given a Signal, you can make a Wave.
  • Given a Wave, you can make a Spectrum, and vice versa.
  • A Wave object contains three attributes: ys is a NumPy array that contains the values in the signal.
  • ts is an array of the times where the signal was evaluated or sampled.
  • framerate is the number of samples per unit of time. The unit of time is usually seconds, but it doesn’t have to be. In one of my examples, it’s days.
  • To modify a wave, you can access the ts and ys directly. For example: wave.ys *= 2, wave.ts += 1. The first line scales the wave by a factor of 2, making it louder. The second line shifts the wave in time, making it start 1 second later.
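The same manipulations can be sketched with plain NumPy arrays standing in for a Wave's ys and ts attributes (the 440 Hz sinusoid and the framerate here are assumed example values, not from the notes):

```python
import numpy as np

# A minimal stand-in for a Wave: ys, ts, and a framerate (samples per second).
framerate = 11025
ts = np.arange(0, 1, 1 / framerate)   # one second of sample times
ys = np.sin(2 * np.pi * 440 * ts)     # a 440 Hz sinusoid

ys *= 2   # scale the wave by a factor of 2, making it louder
ts += 1   # shift the wave in time, making it start 1 second later
```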

Chapter 3. Non-periodic signals

3.4 Spectrogram

  • To recover the relationship between frequency and time, we can break the chirp into segments and plot the spectrum of each segment. The result is called a short-time Fourier Transform (STFT).
  • The most common way to visualize a STFT is a spectrogram, which shows time on the x-axis and frequency on the y-axis. Each column in the spectrogram shows the spectrum of a short segment, using color or grayscale to represent amplitude.
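The segment-and-transform step can be sketched like this with plain NumPy; the chirp parameters and seg_length are assumed choices for illustration, not values from the notes:

```python
import numpy as np

# Sketch of a short-time Fourier transform: synthesize a linear chirp,
# break it into segments, and take the spectrum of each segment.
framerate = 11025
ts = np.arange(0, 1, 1 / framerate)
freqs = np.linspace(220, 880, len(ts))           # frequency sweeps 220 -> 880 Hz
ys = np.sin(2 * np.pi * np.cumsum(freqs) / framerate)

seg_length = 512
n_segs = len(ys) // seg_length
stft = []
for i in range(n_segs):
    seg = ys[i * seg_length:(i + 1) * seg_length]
    stft.append(np.abs(np.fft.rfft(seg)))        # one column of the spectrogram
stft = np.array(stft).T   # rows are frequency bins, columns are time segments
```

Plotting stft with time on the x-axis and frequency on the y-axis gives the spectrogram; the peak bin climbs from column to column as the chirp's frequency rises.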

3.6 Leakage

  • The DFT treats waves as if they are periodic; that is, it assumes that the finite segment is a complete period from an infinite signal that repeats over all time. In practice, this assumption is often false, which creates problems.
  • One common problem is discontinuity at the beginning and end of the segment. Because DFT assumes that the signal is periodic, it implicitly connects the end of the segment back to the beginning to make a loop. If the end does not connect smoothly to the beginning, the discontinuity creates additional frequency components in the segment that are not in the signal.

  • Figure 3.4 (middle) shows the spectrum of this segment. Again, the peak is at 440 Hz, but now there are additional components spread out from 240 to 640 Hz. This spread is called spectral leakage, because some of the energy that is actually at the fundamental frequency leaks into other frequencies. In this example, leakage happens because we are using DFT on a segment that becomes discontinuous when we treat it as periodic.

3.7 Windowing

  • We can reduce leakage by smoothing out the discontinuity between the beginning and end of the segment, and one way to do that is windowing.

  • A “window” is a function designed to transform a non-periodic segment into something that can pass for periodic. Figure 3.5 (top) shows a segment where the end does not connect smoothly to the beginning.
  • Figure 3.5 (middle) shows a “Hamming window”, one of the more common window functions. No window function is perfect, but some can be shown to be optimal for different applications, and Hamming is a good, all-purpose window.
  • Figure 3.5 (bottom) shows the result of multiplying the window by the original signal. Where the window is close to 1, the signal is unchanged. Where the window is close to 0, the signal is attenuated. Because the window tapers at both ends, the end of the segment connects smoothly to the beginning.
  • Figure 3.4 (right) shows the spectrum of the windowed signal. Windowing has reduced leakage substantially, but not completely.
  • The Wave class provides a window method that applies a window to a wave, and NumPy provides np.hamming, which computes a Hamming window of a given length.
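A sketch of windowing with plain NumPy arrays (the 440 Hz segment is an assumed example; the book's Wave.window method does the same elementwise multiplication):

```python
import numpy as np

# Multiply a segment by a Hamming window so its ends taper toward zero.
framerate = 11025
ts = np.arange(0, 0.5, 1 / framerate)
ys = np.sin(2 * np.pi * 440 * ts)    # segment whose ends need not line up

window = np.hamming(len(ys))         # Hamming window, same length as ys
windowed = ys * window               # near the ends the signal is attenuated
```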


Chapter 5. Autocorrelation

5.1 Correlation

  • In general, correlation between variables means that if you know the value of one, you have some information about the other. There are several ways to quantify correlation, but the most common is the Pearson product-moment correlation coefficient, usually denoted ρ. For two variables, x and y, that each contain N values:

ρ = Σ (x_i − μ_x)(y_i − μ_y) / (N σ_x σ_y)

where μ_x and μ_y are the means of x and y, and σ_x and σ_y are their standard deviations.

  • Pearson’s correlation is always between -1 and +1 (including both). If ρ is positive, we say that the correlation is positive, which means that when one variable is high, the other tends to be high. If ρ is negative, the correlation is negative, so when one variable is high, the other tends to be low.
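Pearson's correlation can be computed directly from its definition, or with np.corrcoef, which returns a correlation matrix; the two agree. The synthetic data here is an assumed example:

```python
import numpy as np

# Two positively correlated variables: ys is xs plus independent noise.
np.random.seed(17)
xs = np.random.normal(size=1000)
ys = xs + np.random.normal(scale=0.5, size=1000)

# Pearson's rho from the definition (means subtracted, normalized by N*sigmas).
rho = np.mean((xs - xs.mean()) * (ys - ys.mean())) / (xs.std() * ys.std())

# The same quantity via NumPy's correlation matrix.
rho_np = np.corrcoef(xs, ys)[0, 1]
```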

Chapter 6. Discrete Cosine Transform (DCT)

6.2 Synthesis with arrays

  • Writing this computation in terms of linear algebra makes the code smaller and faster. Linear algebra provides concise notation for operations on matrices and vectors. For example, we could write synthesize like this:
  • M = cos(2πt ⊗ f), y = Ma, where a is a vector of amplitudes, t is a vector of times, f is a vector of frequencies, and ⊗ is the symbol for the outer product of two vectors.
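In NumPy the outer product and the matrix-vector product are np.outer and dot; the frequencies and framerate below are assumed example values:

```python
import numpy as np

# Synthesis as matrix multiplication: M = cos(2*pi * t (outer) f), y = M a.
amps = np.array([0.6, 0.25, 0.1, 0.05])
fs = np.array([100, 200, 300, 400])        # component frequencies (assumed)
framerate = 11025
ts = np.arange(0, 0.01, 1 / framerate)     # times to evaluate the signal

M = np.cos(2 * np.pi * np.outer(ts, fs))   # one column per frequency
ys = M.dot(amps)                           # the synthesized samples
```

At t = 0 every cosine is 1, so the first sample is just the sum of the amplitudes.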

6.3 Analysis

  • Suppose I give you a wave and tell you that it is the sum of cosines with a given set of frequencies. How would you find the amplitude for each frequency component? In other words, given ys, ts and fs, can you recover amps?
  • In terms of linear algebra, the first step is the same as for synthesis: we compute M = cos(2πt ⊗ f). Then we want to find a so that y = Ma; in other words, we want to solve a linear system. NumPy provides linalg.solve, which does exactly that.

  • In this way we can recover amps from ys, fs, and ts, but solving a linear system of equations is slow for large N.
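A sketch of analysis by solving the linear system. M must be square, so the signal is evaluated at as many times as there are frequency components; the frequencies and the framerate of 800 are assumed values chosen so that M is well conditioned:

```python
import numpy as np

# Synthesize from known amplitudes, then recover them by solving y = M a.
amps = np.array([0.6, 0.25, 0.1, 0.05])
fs = np.array([100, 200, 300, 400])
framerate = 800
ts = np.arange(len(fs)) / framerate        # N sample times -> square M

M = np.cos(2 * np.pi * np.outer(ts, fs))
ys = M.dot(amps)                           # synthesize...

recovered = np.linalg.solve(M, ys)         # ...then solve for the amplitudes
```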

6.4 Orthogonal matrices

  • Since y = Ma, we can recover a efficiently if we can compute the inverse of M quickly.
  • In particular, if M is orthogonal, the inverse of M is just the transpose of M, written M^T. In NumPy, transposing an array is a constant-time operation.
  • Again, a matrix is orthogonal if its transpose is also its inverse.
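A small illustration of that definition (the rotation matrix is an assumed example, not from the book):

```python
import numpy as np

# A 2-D rotation matrix is orthogonal: its transpose is its inverse.
theta = 0.3
M = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

product = M.T.dot(M)   # for an orthogonal matrix this is the identity
```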

6.5 DCT-IV

  • If we choose ts and fs carefully, we can make M orthogonal. There are several ways to do it, which is why there are several versions of the Discrete Cosine Transform (DCT).
  • One simple option is to shift ts and fs by a half unit. This version is called DCT-IV, where “IV” is a roman numeral indicating that this is the fourth of eight versions of the DCT.

  • Notice two changes relative to the earlier version of dct (not shown in these notes): first, 0.5 is added to ts and fs; second, time_units cancels out, which simplifies the expression for fs.
  • Then M^T M is almost 2I, which means M is almost orthogonal. Because M is also symmetric, the inverse of M is just M/2. Now we can write a more efficient version of analyze.

  • Instead of using np.linalg.solve, we just multiply by M/2.
  • Combining test2 and analyze2, we can write an implementation of DCT-IV.

  • Again, ys is the wave array. We don’t have to pass ts and fs as parameters; dct_iv can figure them out based on N, the length of ys. We can test it by synthesizing a wave from known amplitudes and checking that dct_iv recovers them.

  • The result is very small (1e-16), which is what we expect due to floating-point errors.
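A sketch of dct_iv and the round-trip test, using N = 4 as in the book's example (with these ts and fs and N = 4, M is symmetric and M^T M = 2I, so multiplying by M/2 inverts the synthesis):

```python
import numpy as np

def dct_iv(ys):
    """DCT-IV: recover amplitudes by multiplying by M/2 instead of solving."""
    N = len(ys)
    ts = (0.5 + np.arange(N)) / N          # times shifted by half a sample
    fs = (0.5 + np.arange(N)) / 2          # frequencies shifted by half a unit
    M = np.cos(2 * np.pi * np.outer(ts, fs))
    return M.dot(ys) / 2

# Synthesize a wave from known amplitudes, then try to recover them.
amps = np.array([0.6, 0.25, 0.1, 0.05])
ts = (0.5 + np.arange(4)) / 4
fs = (0.5 + np.arange(4)) / 2
M = np.cos(2 * np.pi * np.outer(ts, fs))
ys = M.dot(amps)

recovered = dct_iv(ys)   # differs from amps only by floating-point error
```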

6.6 Inverse DCT

  • Finally, notice that analyze2 and synthesize2 are almost identical. The only difference is that analyze2 divides the result by 2. We can use this insight to compute the inverse DCT.

  • inverse_dct_iv solves the synthesis problem: it takes the vector of amplitudes and returns the wave array, ys.
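A sketch of the pair, again with N = 4 as in the book's example; the only difference between the two functions is the factor of 2:

```python
import numpy as np

def dct_iv(ys):
    """Analysis: wave array -> amplitudes (multiply by M, divide by 2)."""
    N = len(ys)
    ts = (0.5 + np.arange(N)) / N
    fs = (0.5 + np.arange(N)) / 2
    M = np.cos(2 * np.pi * np.outer(ts, fs))
    return M.dot(ys) / 2

def inverse_dct_iv(amps):
    """Synthesis: amplitudes -> wave array (same computation, no division)."""
    return dct_iv(amps) * 2

# Round trip: analyzing and then synthesizing should recover the wave.
ys = np.array([0.3, -0.1, 0.45, 0.2])      # an assumed example wave array
round_trip = inverse_dct_iv(dct_iv(ys))
```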

Chapter 7. Discrete Fourier Transform

7.1 Complex exponentials

e^φ = 1 + φ + φ^2/2! + φ^3/3! + ...

  • This definition works with real numbers, imaginary numbers and, by a simple extension, with complex numbers. Applying this definition to a pure imaginary number, iφ, we get

e^(iφ) = 1 + iφ − φ^2/2! − i(φ^3)/3! + ...

  • By rearranging terms, we can show that this is equivalent to:

e^(iφ) = cos φ + i sin φ

  • This formula implies that e^(iφ) is a complex number with magnitude 1; if you think of it as a point in the complex plane, it is always on the unit circle. And if you think of it as a vector, the angle in radians between the vector and the positive x-axis is the argument, φ.

In the case where the exponent is a complex number, we have: e^(a+iφ) = (e^a)·(e^(iφ)) = A·e^(iφ)

where 'A' is a real number that indicates amplitude and e^(iφ) is a unit complex number that indicates angle.

  • When the argument to np.exp is imaginary or complex, the result is a complex number.

  • This example confirms that e^(iφ) is a complex number with magnitude 1 and angle φ radians.
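The check can be sketched like this (φ = 1.5 is an assumed example value):

```python
import numpy as np

# np.exp with an imaginary argument returns a complex number on the unit circle.
phi = 1.5
z = np.exp(1j * phi)

magnitude = np.abs(z)     # should be 1
angle = np.angle(z)       # should be phi, in radians
```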

7.3 The synthesis problem

  • We can create compound signals by adding up complex sinusoids with different frequencies. And that brings us to the complex version of the synthesis problem: given the frequency and amplitude of each complex component, how do we evaluate the signal?

7.4 Synthesis with matrices

  • As we saw in Section 6.2, we can also express the synthesis problem in terms of matrix multiplication

  • Again, amps is a NumPy array that contains a sequence of amplitudes. fs is a sequence containing the frequencies of the components. ts contains the times where we will evaluate the signal.
  • args contains the outer product of ts and fs, with the ts running down the rows and the fs running across the columns (you might want to refer back to Figure 6.1). Each column of matrix M contains a complex sinusoid with a particular frequency, evaluated at a sequence of ts.
  • When we multiply M by the amplitudes, the result is a vector whose elements correspond to the ts; each element is the sum of several complex sinusoids, evaluated at a particular time.
  • Remember that we can think of a complex number in two ways: either the sum of a real and imaginary part, x + iy, or the product of a real amplitude and a complex exponential, Ae^(iφ0) . Using the second interpretation, we can see what happens when we multiply a complex amplitude by a complex sinusoid. For each frequency, f , we have:

Ae^(iφ0) · e^(i2πft) = Ae^(i(2πft + φ0))

  • Multiplying by Ae^(iφ0) multiplies the amplitude by A and adds the phase offset φ0.
  • Now that we have the more general solution to the synthesis problem – one that handles complex amplitudes – we are ready for the analysis problem.
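Complex synthesis can be sketched the same way as the real version, with exp in place of cos; the frequencies, framerate, and the shared phase offset of π/4 are assumed example values:

```python
import numpy as np

# Complex synthesis: M = exp(i 2*pi * t (outer) f), y = M a.
# Each complex amplitude carries both a magnitude and a phase offset.
amps = np.array([0.6, 0.25, 0.1, 0.05]) * np.exp(1j * np.pi / 4)
fs = np.array([100, 200, 300, 400])           # component frequencies (assumed)
framerate = 11025
ts = np.arange(0, 0.01, 1 / framerate)

M = np.exp(1j * 2 * np.pi * np.outer(ts, fs))
ys = M.dot(amps)                              # the complex wave array
```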

7.5 The analysis problem

  • The analysis problem is the inverse of the synthesis problem: given a sequence of samples, y, and knowing the frequencies that make up the signal, can we compute the complex amplitudes of the components, a?

7.6 Efficient analysis

  • Unfortunately, solving a linear system of equations is slow. For the DCT, we were able to speed things up by choosing fs and ts so that M is orthogonal. That way, the inverse of M is the transpose of M, and we can compute both DCT and inverse DCT by matrix multiplication.
  • We’ll do the same thing for the DFT, with one small change. Since M is complex, we need it to be unitary, rather than orthogonal, which means that the inverse of M is the conjugate transpose of M, which we can compute by transposing the matrix and negating the imaginary part of each element.
  • The NumPy methods conj and transpose do what we want. Here’s the code that computes M for N = 4 components:

  • If M is unitary, M^*M = I, where M^* is the conjugate transpose of M, and I is the identity matrix. We can test whether M is unitary like this:

We can use this result to write a faster version of analyze1:
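A sketch of the whole sequence for N = 4. With this choice of ts and fs, M^*M works out to N·I rather than I (M is unitary up to a factor of √N), which is why the faster analysis divides by N:

```python
import numpy as np

# Build M for N = 4 components.
N = 4
ts = np.arange(N) / N
fs = np.arange(N)
M = np.exp(1j * 2 * np.pi * np.outer(ts, fs))

# Check near-unitarity: the conjugate transpose times M is N times the identity.
MstarM = M.conj().transpose().dot(M)

# Faster analysis: multiply by the conjugate transpose instead of solving.
amps = np.array([0.6, 0.25, 0.1, 0.05])
ys = M.dot(amps)
recovered = M.conj().transpose().dot(ys) / N
```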

7.7 DFT

  • As a function, analyze2 would be hard to use because it only works if fs and ts are chosen correctly. Instead, I will rewrite it to take just ys and compute fs and ts itself.
  • First, I’ll make a function to compute the synthesis matrix, M:

  • We are almost done; analyze3 computes something very close to the DFT, with one difference: the conventional definition of DFT does not divide by N.

  • The inverse DFT is almost the same, except we don’t have to transpose and conjugate M, and now we have to divide through by N:
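The pair can be sketched like this and checked against NumPy's FFT, which follows the same convention (the function names mirror the book's; the test signal is an assumed example):

```python
import numpy as np

def synthesis_matrix(N):
    """Matrix of complex sinusoids: M[n, k] = exp(i 2*pi * n*k / N)."""
    ts = np.arange(N) / N
    fs = np.arange(N)
    return np.exp(1j * 2 * np.pi * np.outer(ts, fs))

def dft(ys):
    """Forward DFT: conjugate transpose, no division by N (the convention)."""
    M = synthesis_matrix(len(ys))
    return M.conj().transpose().dot(ys)

def idft(amps):
    """Inverse DFT: plain M, divided through by N."""
    M = synthesis_matrix(len(amps))
    return M.dot(amps) / len(amps)

ys = np.array([0.5, 0.25, 0.125, 0.0625])
amps = dft(ys)   # matches np.fft.fft(ys) within floating-point error
```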

  • If I could go back in time, I might change the definition of DFT so it divides by N and the inverse DFT doesn’t. That would be more consistent with my presentation of the synthesis and analysis problems. Or I might change the definition so that both operations divide through by √N. Then the DFT and inverse DFT would be more symmetric. But I can’t go back in time (yet!), so we’re stuck with a slightly weird convention. For practical purposes it doesn’t really matter.