
Thank you for reading.
I want to read an Excel CSV file, calculate the root mean square (RMS) of every 100 data points, and write the result to another CSV file.
I tried to write the program myself, but it fails when I run it.
The CSV file to read is as follows.
It has about 4500 rows of data in columns A and B, and I want the RMS of column B.
Thank you.

Error message

The program does not run and stops with the following error.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-2b1f58483765> in <module>()
     16 data = pd.read_csv("P_emg-1524717370.csv", index_col="time")
     17 df_emg1 = data.iloc[:, [0]]
---> 18 RMS1 = window_rms(df_emg1, WINDOW_SIZE)
     19 df_RMS_emg1 = RMS1
     20 

<ipython-input-4-2b1f58483765> in window_rms(a, window_size)
     11     a2 = np.power(a, 2)
     12     window = np.ones(window_size)/float(window_size)
---> 13     return np.sqrt(np.convolve(a2, window, "same"))
     14 
     15 # Load csv file

~\Anaconda3\lib\site-packages\numpy\core\numeric.py in convolve(a, v, mode)
   1034         raise ValueError('v cannot be empty')
   1035     mode = _mode_from_name(mode)
-> 1036     return multiarray.correlate(a, v[::-1], mode)
   1037 
   1038 

ValueError: object too deep for desired array
Applicable source code

# coding: utf-8

import csv
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt
import pandas as pd

WINDOW_SIZE = 100

# Create RMS formula
def window_rms(a, window_size):
    a2 = np.power(a, 2)
    window = np.ones(window_size) / float(window_size)
    return np.sqrt(np.convolve(a2, window, "same"))

# Load csv file
data = pd.read_csv("example.csv", index_col="semit")
df_iwhr1 = data.iloc[:, [0]]
RMS1 = window_rms(df_iwhr1, WINDOW_SIZE)
df_RMS_iwhr1 = RMS1

# Create csv file
csvfile = open("new1.csv", 'w', newline="")
df_RMS_iwhr1.to_csv(csvfile)


  • Answer # 1

    RMS can be calculated with a moving average: it is the square root of the moving average of the squared values.
    In pandas, you can write it like this.

    import pandas as pd
    import numpy as np
    window_size = 3
    rms = lambda d: np.sqrt((d ** 2).sum() / d.size)
    data = pd.read_csv('in.csv')
    data['rms'] = data.iloc[:, 0].rolling(window=window_size, min_periods=1, center=True).apply(rms)
    data.to_csv('out.csv', index=None)
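
    The question speaks of an RMS "for every 100 data". The code above uses a moving window; if one value per consecutive, non-overlapping block of 100 rows is wanted instead (an assumption about the intent, not something the question confirms), a minimal sketch could look like the following. The names BLOCK_SIZE and block_rms and the synthetic input are made up for illustration.

```python
import numpy as np
import pandas as pd

BLOCK_SIZE = 100  # rows per block, taken from the question (hypothetical name)


def block_rms(values: pd.Series, block_size: int) -> pd.Series:
    """Return one RMS value per consecutive, non-overlapping block."""
    # Rows 0..block_size-1 get label 0, the next block_size rows get label 1, ...
    block_id = np.arange(len(values)) // block_size
    # RMS = square root of the mean of the squared values within each block.
    return np.sqrt((values ** 2).groupby(block_id).mean())


# Synthetic stand-in for column B of the real file: 250 rows alternating 3, -3.
demo = pd.Series(np.tile([3.0, -3.0], 125))
result = block_rms(demo, BLOCK_SIZE)
print(result)  # three blocks (100, 100 and 50 rows), each with RMS 3.0
```

    Note that the last block is shorter than the others when the number of rows is not a multiple of the block size. The result is a Series, so it can be written out with its to_csv method.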

    [Supplement]

    A brief explanation.

    First, for a one-dimensional array (pandas.Series) such as [1,2,3,4,5,6,7,8,9], the RMS can be obtained as follows.

    import pandas as pd
    import numpy as np
    row_data = pd.Series([1,2,3,4,5,6,7,8,9])
    res = np.sqrt((row_data ** 2).sum() / row_data.size)
    print(res)
    # 5.627314338711377

    Turning this RMS expression into a function with lambda gives the following.

    import pandas as pd
    import numpy as np
    row_data = pd.Series([1,2,3,4,5,6,7,8,9])
    rms = lambda d: np.sqrt((d ** 2).sum() / d.size)
    res = rms(row_data)
    print(res)
    # 5.627314338711377

    Next, the code in this question needs the RMS of a fixed interval rather than the RMS of the whole column, and Series.rolling() handles that part.
    To explain Series.rolling() with a simple sample, applying rolling().sum() to a one-dimensional array such as [1,2,3,4,5,6,7,8,9] gives the following.

    row_data = pd.Series([1,2,3,4,5,6,7,8,9])
    res = row_data.rolling(window=3).sum()
    print(res)
    # 0     NaN
    # 1     NaN
    # 2     6.0
    # 3     9.0
    # 4    12.0
    # 5    15.0
    # 6    18.0
    # 7    21.0
    # 8    24.0
    # dtype: float64

    In this way, you can find the total for each fixed interval (here, the interval specified by window=3).
    In this case I want to apply my own function (rms) to each fixed interval rather than take the total, so I use Series.rolling().apply().

    row_data = pd.Series([1,2,3,4,5,6,7,8,9])
    rms = lambda d: np.sqrt((d ** 2).sum() / d.size)
    res = row_data.rolling(window=3).apply(rms)
    print(res)
    # 0         NaN
    # 1         NaN
    # 2    2.160247
    # 3    3.109126
    # 4    4.082483
    # 5    5.066228
    # 6    6.055301
    # 7    7.047458
    # 8    8.041559
    # dtype: float64

    Lastly, regarding the rolling parameters: in the example above, the results at index = 0 and 1 are NaN. This is because fewer input values are available there than the window size (window = 3), so the value cannot be calculated. Passing min_periods = 1 tells rolling to calculate a result whenever at least one input value is available. Also, in the example above the result for the input data at index = 0, 1, 2 is placed at index = 2. Passing center = True places the result for index = 0, 1, 2 at index = 1 instead.
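
    To make the effect of these two parameters visible, here is a small sketch that runs the same series through rolling with the default settings and with min_periods = 1, center = True. The column names "default" and "centered" are only labels chosen for this illustration.

```python
import numpy as np
import pandas as pd

row_data = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9])


def rms(d):
    # Root mean square of the values inside one window.
    return np.sqrt((d ** 2).sum() / d.size)


# Default: result sits at the last index of the window, incomplete windows give NaN.
default = row_data.rolling(window=3).apply(rms)

# min_periods=1 allows shorter windows at the edges,
# center=True places the result at the middle index of the window.
centered = row_data.rolling(window=3, min_periods=1, center=True).apply(rms)

comparison = pd.DataFrame({"default": default, "centered": centered})
print(comparison)
```

    With center = True the value that the default call reports at index 2 appears at index 1, and the first and last rows are computed from the two values that are available there instead of being NaN.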

  • Answer # 2

    df_iwhr1 on the caller side is a DataFrame, so the shape of a2 inside the window_rms function becomes (12345, 1), and the error in question occurs.
    In the code in question all rows are targeted, so there is no need to use .iloc; use data['semit'] to pass the column values as a Series.
    Reference: ValueError: object too deep for desired array while using convolution

    import numpy as np
    import pandas as pd

    def window_rms(a, window_size):
        a2 = np.power(a, 2)
        window = np.ones(window_size) / float(window_size)
        return np.sqrt(np.convolve(a2, window, "same"))

    WINDOW_SIZE = 3
    data = pd.DataFrame({'semit': [i + 1 for i in range(10)], 'iwhr1': [(i + 1) * 10 for i in range(10)]})
    print(data)
    RMS1 = window_rms(data['semit'], WINDOW_SIZE)
    data['RMS'] = RMS1
    print(data)
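
    The difference between the two ways of selecting a column can be checked directly. The sketch below uses a small made-up DataFrame; the variable names as_frame, as_series and frame_error are chosen only for this illustration.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({"semit": [1, 2, 3, 4, 5], "iwhr1": [10, 20, 30, 40, 50]})
window = np.ones(3) / 3.0

as_frame = data.iloc[:, [1]]  # list selector: result is a DataFrame (2-D)
as_series = data.iloc[:, 1]   # scalar selector: result is a Series (1-D)
print(as_frame.shape)   # (5, 1)
print(as_series.shape)  # (5,)

# np.convolve only accepts one-dimensional input, so the 2-D case fails.
try:
    np.convolve(as_frame ** 2, window, "same")
    frame_error = None
except ValueError as exc:
    frame_error = str(exc)
print(frame_error)

# The 1-D Series is accepted and yields one value per input row.
smoothed = np.convolve(as_series ** 2, window, "same")
print(smoothed.shape)  # (5,)
```

    So either a scalar column position in .iloc or a column label such as data['semit'] gives np.convolve the one-dimensional input it needs.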

  • Answer # 3

    Since the data has already been loaded into a DataFrame, here is code that makes full use of it.

    data['sq'] = np.square(df_iwhr1) # square all elements
    data['mean'] = data.rolling(window=WINDOW_SIZE).mean()['sq'] # Moving average
    data['rms'] = np.sqrt(data['mean']) # Square root of the result


    The intermediate results of the calculation are also kept in data, so take out the columns you need and write them to a CSV file.
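
    As a quick check, the sketch below runs the three steps (square, moving average, square root) on a small made-up column and compares the result with applying an RMS function to each window. The sample values and the name direct are assumptions made for this illustration.

```python
import numpy as np
import pandas as pd

WINDOW_SIZE = 3
data = pd.DataFrame({"iwhr1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]})

# Step by step: square, moving average of the squares, square root.
data["sq"] = data["iwhr1"] ** 2
data["mean"] = data["sq"].rolling(window=WINDOW_SIZE).mean()
data["rms"] = np.sqrt(data["mean"])

# Same quantity computed by applying an RMS function to each window.
direct = data["iwhr1"].rolling(window=WINDOW_SIZE).apply(
    lambda d: np.sqrt((d ** 2).sum() / d.size)
)

print(data)
print(np.allclose(data["rms"], direct, equal_nan=True))
```

    Both routes agree, and the first WINDOW_SIZE - 1 rows are NaN because their windows are incomplete. To write only the final column, something like data[["rms"]].to_csv("new1.csv") would do.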