
I have a question about Python programming.
I am writing a program that displays the spectrum.
The program is below. Could you tell me what to do when the lengths of the arrays do not match?
This is my first time asking a question, so please go easy on me.

ValueError: operands could not be broadcast together with shapes (27200,) (27264,)
Relevant source code
# Module imports
import numpy as np
import matplotlib.pyplot as plt
import soundfile as sf
import scipy
from scipy import signal as sg
# Read audio
x, fs = sf.read('speech1.wav')
window_num = 256  # window width in samples
stride_num = 128  # stride width in samples
# Spectrogram calculation
f, t, X1 = sg.stft(x, fs=fs, nperseg=window_num, noverlap=(window_num - stride_num))
# Reconstruction by inverse STFT
_, y = sg.istft(X1, fs=fs, nperseg=window_num, noverlap=(window_num - stride_num))
# Save output audio
sf.write('outout.wav', y, fs)
# Display graphs
# -- Original waveform
plt.figure('Original waveform')
plt.plot(x)
# -- Reconstructed waveform
plt.figure('Reconstructed waveform')
plt.plot(y)
# -- Difference waveform (this is the part I don't understand. I want the difference signal x - y between the input signal x and the output signal y)
plt.figure('Signal difference waveform')
plt.plot(x - y)
What I tried

After receiving answers to my question, I tried to match the number of elements using nperseg and noverlap, referring to the following page: https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.istft.html

  • Answer #1

    About the error in the question

    When I ran stft followed by istft, the length of the result differed from that of the original array. The two arrays therefore no longer have the same length, and plt.plot(x - y) fails with ValueError: operands could not be broadcast together with shapes (27200,) (27264,). This is the error the questioner reported.

    The stft documentation states that "in order to enable inversion of an STFT via the inverse STFT in istft, the signal windowing must obey the constraint of 'Nonzero OverLap Add' (NOLA), and the input signal must have complete windowing coverage (i.e. (x.shape[axis] - nperseg) % (nperseg-noverlap) == 0)". In other words, for istft to invert the transform cleanly, (x.shape[axis] - nperseg) % (nperseg - noverlap) == 0 must hold, so that the input signal divides into windows with nothing left over. The cause of the error was that the input signal left a remainder that did not fill a window. With its default setting, the padded option filled out that remainder, one more window was computed, and the reconstructed result came out longer than the input.
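The length arithmetic can be checked by hand. The following is a minimal sketch, assuming stft's default settings boundary='zeros' and padded=True. The helper name istft_output_length is hypothetical (it is not part of SciPy) and only mirrors my understanding of how those defaults extend and pad the signal.

```python
def istft_output_length(n, nperseg, noverlap):
    """Hypothetical helper: expected length after stft followed by istft,
    assuming the defaults boundary='zeros' and padded=True."""
    step = nperseg - noverlap
    half = nperseg // 2
    n_ext = n + 2 * half                           # zero extension at both ends
    n_pad = (-(n_ext - nperseg) % step) % nperseg  # zeros appended to fill the last window
    return n_ext + n_pad - 2 * half                # istft strips the boundary extension again


print(istft_output_length(27200, 256, 128))  # 27264
print(istft_output_length(27136, 256, 128))  # 27136
```

The first call reproduces the two lengths in the questioner's error message: 27200 samples go in and 27264 come out. The second shows that an input trimmed by 64 samples keeps its length.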

    Therefore, I fixed it by trimming a few samples from the end of the input array so that it divides evenly into windows. This cleared the error.
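The effect of the trimming can be tried without an audio file. This is only a sketch: the random signal is a synthetic stand-in for speech1.wav with the same length as in the question, and the sampling rate is an assumed value chosen for illustration.

```python
import numpy as np
from scipy import signal as sg

fs = 16000  # assumed sampling rate, for illustration only
window_num = 256
stride_num = 128
noverlap = window_num - stride_num

# Synthetic stand-in for the audio, same length as in the question
rng = np.random.default_rng(0)
x = rng.standard_normal(27200)

# Without trimming, the round trip returns a longer array
_, _, X1 = sg.stft(x, fs=fs, nperseg=window_num, noverlap=noverlap)
_, y = sg.istft(X1, fs=fs, nperseg=window_num, noverlap=noverlap)
print(len(x), len(y))  # 27200 27264

# Trim the tail so that (len - nperseg) % stride == 0.
# Slicing to an explicit end index also works when the remainder is 0.
x_trim = x[:len(x) - (len(x) - window_num) % stride_num]
_, _, X2 = sg.stft(x_trim, fs=fs, nperseg=window_num, noverlap=noverlap)
_, y_trim = sg.istft(X2, fs=fs, nperseg=window_num, noverlap=noverlap)
print(len(x_trim), len(y_trim))  # 27136 27136

# The difference signal can now be computed
diff = x_trim - y_trim
print(np.max(np.abs(diff)))

# Alternative (my own suggestion): keep x as it is and cut y instead
diff_alt = x - y[:len(x)]
```

If dropping the last few input samples is undesirable, cutting the output to y[:len(x)] before subtracting should give the same kind of difference signal. That variant is my own suggestion and not the fix I applied.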

    About errors not in the question

    However, before I got that far I ran into a different error. The line f, t, X1 = sg.stft(x, fs=fs, nperseg=window_num, noverlap=window_num - stride_num) raised ValueError: noverlap must be less than nperseg.

    In short, soundfile returns a one-dimensional array for a monaural file and a two-dimensional array for a stereo file, and in the two-dimensional case the order of the dimensions differs from what stft expects.

    To confirm this, I got wav files from a site that provides both monaural and stereo samples and compared how soundfile loads them. The shape was (8250520,) for monaural and (8250624, 2) for stereo. Note that the time direction is the first dimension.

    The stft documentation, on the other hand, says "Axis along which the STFT is computed; the default is over the last axis (i.e. axis=-1)", so by default the last dimension (-1) is treated as the time direction. Because of this mismatch, monaural input goes through but stereo input raises an error. When stereo audio is passed with the default settings, the array is only 2 samples long in what stft takes to be the time direction, so the one and only window (= nperseg) ends up too small relative to the configured noverlap. I think that is what triggered the error. Once you know the cause, the error message makes sense.

    As a countermeasure, this behavior can be changed with the axis option, but istft works on the same assumption, so I think that when analyzing a file loaded by soundfile with scipy.signal, it is quicker to swap the dimensions. The solution is therefore to transpose (.T) the audio array for the scipy.signal processing. This eliminated the error. The questioner's audio file is monaural, which is presumably why this error did not occur for them.
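The stereo case can be reproduced with a synthetic array in place of a real file. This is a sketch under assumptions: the (time, channel) array below stands in for what soundfile returns for a stereo file, and the length and sampling rate are made-up values.

```python
import warnings

import numpy as np
from scipy import signal as sg

fs = 8000  # assumed sampling rate, for illustration only
window_num = 256
stride_num = 128
noverlap = window_num - stride_num

# Synthetic stand-in for a stereo file: shape (time, channel)
rng = np.random.default_rng(0)
stereo = rng.standard_normal((27136, 2))

# Passing it as-is makes stft treat the 2 channels as the time axis
error_message = None
try:
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # hide the warning about nperseg being reduced
        sg.stft(stereo, fs=fs, nperseg=window_num, noverlap=noverlap)
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Transposing puts time on the last axis, which is what stft and istft expect
_, _, X = sg.stft(stereo.T, fs=fs, nperseg=window_num, noverlap=noverlap)
_, y = sg.istft(X, fs=fs, nperseg=window_num, noverlap=noverlap)
y = y.T  # back to (time, channel)
print(stereo.shape, y.shape)
```

For a monaural (one-dimensional) array, .T leaves the array unchanged, so the same code path should handle both cases.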

    This is the source with the above two modifications.

    # Module imports
    import numpy as np
    import matplotlib.pyplot as plt
    import soundfile as sf
    import scipy
    from scipy import signal as sg
    # Read audio
    x, fs = sf.read('speech1.wav')
    window_num = 256  # window width in samples
    stride_num = 128  # stride width in samples
    # ★ Fix: trim the end of the input array slightly so that it divides evenly into windows
    # (slicing to an explicit end index also works when the remainder is 0)
    x = x[:len(x) - (len(x) - window_num) % stride_num]
    # ★ Fix: transpose before and after the sg calculation
    # Spectrogram calculation
    f, t, X1 = sg.stft(x.T, fs=fs, nperseg=window_num, noverlap=(window_num - stride_num))
    # Reconstruction by inverse STFT
    _, y = sg.istft(X1, fs=fs, nperseg=window_num, noverlap=(window_num - stride_num))
    # Transpose back so that time is the first dimension again
    y = y.T
