
### python - handling of sigma in weighted least squares (WLS) using curve_fit from scipy.optimize

I would like to estimate parameters with the weighted least squares method, following the sample on the site below.
https://scipython.com/book/chapter-8-scipy/examples/weighted-and-non-weighted-least-squares-fitting/

Sample

``````
import numpy as np
from scipy.optimize import curve_fit
import pylab

x0, A, gamma = 12, 3, 5
n = 200
x = np.linspace(1, 20, n)
yexact = A * gamma**2 / (gamma**2 + (x - x0)**2)

# Add some noise with a sigma of 0.5 apart from a particularly noisy region
# near x0 where sigma is 3
sigma = np.ones(n) * 0.5
sigma[np.abs(x - x0 + 1) < 1] = 3
noise = np.random.randn(n) * sigma
y = yexact + noise

def f(x, x0, A, gamma):
    """The Lorentzian centered at x0 with amplitude A and HWHM gamma."""
    return A * gamma**2 / (gamma**2 + (x - x0)**2)

def rms(y, yfit):
    return np.sqrt(np.sum((y - yfit)**2))

# Unweighted fit
p0 = 10, 4, 2
popt, pcov = curve_fit(f, x, y, p0)
yfit = f(x, *popt)
print('Unweighted fit parameters:', popt)
print('Covariance matrix:'); print(pcov)
print('rms error in fit:', rms(yexact, yfit))
print()

# Weighted fit
popt2, pcov2 = curve_fit(f, x, y, p0, sigma=sigma, absolute_sigma=True)
yfit2 = f(x, *popt2)
print('Weighted fit parameters:', popt2)
print('Covariance matrix:'); print(pcov2)
print('rms error in fit:', rms(yexact, yfit2))

pylab.plot(x, yexact, label='Exact')
pylab.plot(x, y, 'o', label='Noisy data')
pylab.plot(x, yfit, label='Unweighted fit')
pylab.plot(x, yfit2, label='Weighted fit')
pylab.ylim(-1, 4)
pylab.legend(loc='lower center')
pylab.show()
``````
My code

So far I have only written the unweighted fit, which I intend to use for comparison.

``````
import numpy as np
import matplotlib.pyplot as plt
import math
from scipy.optimize import curve_fit

# Function I want to fit
def func(x, a):
    return ((-(a + (b * x))) + ((a + ((b * x) ** 2)) - (4 * b * math.log(0.1))) ** 0.5) / (2 * b)

# Original data of the distribution function I want to find
x = np.array([6.26379, 8.57417, 8.66527, 8.75069, 11.6708, 12.3487, 14.5032, 15.7422, 21.7646, 23.0518, 26.5069, 26.4035, 26.321, 23.0045, 19.2654, 17.9425, 14.5669, 13.513, 10.4902, 9.95136, 9.77395])
y = np.array([3.709910308, 3.300454417, 3.219869361, 2.879991517, 2.250120678, 2.24981186, 1.859931899, 1.839996231, 1.560029151, 1.360016958, 1.210037387, 1.527926405, 1.320005022, 1.340038138, 1.618120234, 1.410033737, 1.83006856, 1.849465])
plt.plot(x, y, 'bo', label='Experimental data')
plt.legend()

# Fit the observation data to find the coefficient
popt, pcov = curve_fit(func, x, y)
popt

# Draw each curve
plt.plot(x, y, 'bo', label="Experimental data")
plt.plot(x, func(x, *popt), 'ro', label="Unweighted fitting: a = {:.3f}".format(*popt))
plt.legend()
``````
Part related to weighted fitting

I am working on this part now, but when I write the following code I get an error saying that sigma is not defined. What code should I add if I want to use experimental data for both x and y, instead of generating noise as the sample does?

``````
# Fit the observation data to find the coefficient
# WLS
popt2, pcov2 = curve_fit(func, x, y, sigma=sigma, absolute_sigma=True)
popt2
``````

The error is

``````
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-44-d451ecf14c51> in <module>
      2 # WLS
      3
----> 4 popt2, pcov2 = curve_fit(func, x, y, sigma=sigma, absolute_sigma=True)
      5 popt2

NameError: name 'sigma' is not defined
``````

Sorry that my explanation is brief, but I would appreciate your help.
Thank you in advance.

• Answer # 1

The sigma argument of curve_fit is exactly what sets the "weights" in weighted least squares, so the questioner has to define it before calling curve_fit.

Reference: Yamamoto's Laboratory --SciPy Fitting

sigma: None or M-length sequence or MxM array, optional. Default: sigma = None
Determines the uncertainty in ydata. If we define the residuals as r = ydata - f(xdata, *popt), then the interpretation of sigma depends on its number of dimensions.
A 1-D sigma should contain the standard deviations of the errors in ydata. In this case, the function minimized by the fit is chisq = sum((r / sigma) ** 2).
A 2-D sigma should contain the covariance matrix of the errors in ydata. In this case, the minimized function is chisq = r.T @ inv(sigma) @ r.
None (default) is equivalent to a 1-D sigma filled with ones.
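
To make the three cases above concrete, here is a small sketch. The straight-line model `line`, the synthetic data, and the noise levels are all assumptions made up for this illustration; they are not the questioner's function or measurements.

```python
import numpy as np
from scipy.optimize import curve_fit

def line(x, a, b):
    # Hypothetical model, used only for this illustration
    return a * x + b

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 30)
sigma_1d = np.where(x > 5.0, 2.0, 0.5)  # assumed: second half is noisier
y = line(x, 1.5, -1.0) + rng.normal(size=x.size) * sigma_1d

# sigma=None behaves like a 1-D sigma filled with ones
popt_none, _ = curve_fit(line, x, y)
popt_ones, _ = curve_fit(line, x, y, sigma=np.ones_like(x))

# 1-D sigma: one standard deviation per data point
popt_1d, _ = curve_fit(line, x, y, sigma=sigma_1d, absolute_sigma=True)

# 2-D sigma: covariance matrix; a diagonal of variances gives the same weighting
cov = np.diag(sigma_1d**2)
popt_2d, _ = curve_fit(line, x, y, sigma=cov, absolute_sigma=True)

# The two chi-square expressions agree for a diagonal covariance
r = y - line(x, *popt_1d)
chisq_1d = np.sum((r / sigma_1d) ** 2)
chisq_2d = r.T @ np.linalg.inv(cov) @ r

print(popt_none, popt_ones)
print(popt_1d, popt_2d)
print(chisq_1d, chisq_2d)
```

Note that the 2-D form takes variances (sigma squared) on the diagonal, while the 1-D form takes standard deviations.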

In this case, suppose you define a function that computes the standard deviation sigma from x and y, sigma = s(x, y), with the signature sequence x sequence → sequence. You can then compute sigma by passing in the observed x and y values as they are.
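
A minimal sketch of such a function, under stated assumptions: the error rule inside `s` (5% of |y| with a floor of 0.05), the model `decay`, and the data are all hypothetical. The real rule has to come from what is known about the measurement errors.

```python
import numpy as np
from scipy.optimize import curve_fit

def s(x, y):
    # Hypothetical error model (an assumption, not from the question):
    # 5% relative error on y, never smaller than 0.05.
    # x is unused here but kept to match the s(x, y) form.
    return np.maximum(0.05 * np.abs(y), 0.05)

def decay(x, a, k):
    # Hypothetical model, for illustration only
    return a * np.exp(-k * x)

# Synthetic "observations" standing in for experimental data
rng = np.random.default_rng(1)
x = np.linspace(0.5, 10.0, 25)
y_true = decay(x, 4.0, 0.3)
y = y_true + rng.normal(size=x.size) * s(x, y_true)

# Defining sigma before the call is what removes the NameError
sigma = s(x, y)  # same length as y
popt2, pcov2 = curve_fit(decay, x, y, p0=(1.0, 0.1),
                         sigma=sigma, absolute_sigma=True)
print(popt2)
```

Because `s` is built from NumPy operations, it already maps whole arrays to an array of the same length, which is what the 1-D form of sigma requires.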

By the way, if you define a scalar version instead, sigma_scalar = s1(x, y) with the signature scalar x scalar → scalar, you can easily turn it into a sequence x sequence → sequence function with numpy.vectorize.

Reference: How to use numpy.vectorize
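
As a sketch of the numpy.vectorize route: the scalar rule in `s1` (a larger error above x = 20) and the sample values are assumptions chosen only to show the mechanics.

```python
import numpy as np

def s1(x, y):
    # Hypothetical scalar rule (an assumption for illustration):
    # points with x above 20 are treated as less reliable.
    # The plain `if` works on scalars but not on whole arrays.
    if x > 20.0:
        return 0.3
    return 0.1

# Wrap the scalar function so it accepts sequences element by element
s = np.vectorize(s1, otypes=[float])

# Made-up sample values
x = np.array([5.0, 15.0, 22.0, 26.0])
y = np.array([3.5, 1.9, 1.5, 1.2])

sigma = s(x, y)
print(sigma)  # [0.1 0.1 0.3 0.3]
```

numpy.vectorize is a convenience wrapper around a Python loop, so it helps with readability rather than speed. For large arrays, an expression such as `np.where(x > 20.0, 0.3, 0.1)` gives the same result directly.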