LinearRegression() returns a list within a list (sklearn)

I am doing multidimensional linear regression in Python (sklearn), but for some reason the coefficients are not returned as a flat list. Instead, a list nested inside a list is returned:

    from sklearn import linear_model
    clf = linear_model.LinearRegression()
    # clf.fit([[0, 0, 0], [1, 1, 1], [2, 2, 2]], [0, 1, 2])
    clf.fit([[394, 3878, 13, 4, 0, 0], [384, 10175, 14, 4, 0, 0]], [3, 9])
    print 'coef array', clf.coef_
    print 'length', len(clf.coef_)
    print 'getting value 0:', clf.coef_[0]
    print 'getting value 1:', clf.coef_[1]

This returns the values in a nested list [[]] instead of a flat list []. Any idea why this is happening? Output:

    coef array [[ 1.03428648e-03 9.54477167e-04 1.45135995e-07 0.00000000e+00 0.00000000e+00 0.00000000e+00]]
    length 1
    getting value 0: [ 1.03428648e-03 9.54477167e-04 1.45135995e-07 0.00000000e+00 0.00000000e+00 0.00000000e+00]
    getting value 1: Traceback (most recent call last):
      File "regress.py", line 8, in <module>
        print 'getting value 1:', clf.coef_[1]
    IndexError: index out of bounds

But this works:

    from sklearn import linear_model
    clf = linear_model.LinearRegression()
    clf.fit([[0, 0, 0], [1, 1, 1], [2, 2, 2]], [0, 1, 2])
    # clf.fit([[394, 3878, 13, 4, 0, 0], [384, 10175, 14, 4, 0, 0]], [3, 9])
    print 'coef array', clf.coef_
    print 'length', len(clf.coef_)
    print 'getting value 0:', clf.coef_[0]
    print 'getting value 1:', clf.coef_[1]

Output:

    coef array [ 0.33333333 0.33333333 0.33333333]
    length 3
    getting value 0: 0.333333333333
    getting value 1: 0.333333333333
+4
python list regression
Jul 18 '12 at 20:12
4 answers

This is fixed by updating two files in the scikit-learn folder.

The code is here: https://github.com/scikit-learn/scikit-learn/commit/d0b20f0a21ba42b85375b1fbc7202dc3962ae54f

+2
Jul 19 '12 at 21:31

There seems to be a problem with scipy.linalg. If you trace the chain of calls, it first goes to https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/base.py#L218 , and then it reaches the if statement at https://github.com/scipy/scipy/blob/master/scipy/linalg/basic.py#L468 . This if statement is what distinguishes your two tests. In the first case, m,n=2,6 , and in the second, m,n=3,3 .
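To make the m,n values above concrete, here is a small pure-Python sketch that computes them for the two training sets from the question. The helper name design_shape is hypothetical, and reading m as the number of samples and n as the number of features is an assumption on my part, not something taken from the scipy source.

```python
def design_shape(X):
    """Return (m, n): number of rows (samples) and columns (features).

    Hypothetical helper for illustration; assumes X is a non-empty
    list of equally long rows.
    """
    return len(X), len(X[0])


# The two training sets from the question
X_first = [[394, 3878, 13, 4, 0, 0], [384, 10175, 14, 4, 0, 0]]
X_second = [[0, 0, 0], [1, 1, 1], [2, 2, 2]]

for name, X in (("first", X_first), ("second", X_second)):
    m, n = design_shape(X)
    # m < n means there are fewer samples than features
    print(name, "m,n =", (m, n), "fewer samples than features:", m < n)
```

The first set has fewer samples than features while the second does not, which at least matches the observation that the two calls take different branches.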

+2
Jul 18 2018-12-18T00:

I have never used the multidimensional linear regression module you are talking about, so I can't say why this is happening. But if you just want to solve your problem, you can flatten the list:

 flat_list = clf.coef_[0] 

If the list may contain several sublists (and you want to combine them all into one flat list), you can use a more general way to flatten it:

 flat_list = [item for sublist in clf.coef_ for item in sublist] 


EDIT: While waiting for a real explanation or fix from the package developers, you can rely on a workaround like this:

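    import numpy as np
    # coef_ is a NumPy array, so test its dimensionality; isinstance(..., list) would never match
    if np.ndim(clf.coef_) > 1:
        clf.coef_ = clf.coef_[0]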

This flattens the list only if there is a nested list inside it.
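If you would rather not care which of the two shapes comes back, the same idea can be sketched as a small helper that accepts either form and always returns a flat list. The name flatten_coefficients is hypothetical and this is only a sketch, not part of sklearn; it assumes the coefficients are numbers, possibly nested inside lists, tuples or array rows. It is written so that it runs on Python 3 as well.

```python
def flatten_coefficients(coef):
    """Return a flat list of floats from a flat or nested coefficient sequence.

    Hypothetical helper for illustration; not part of sklearn.
    """
    flat = []
    for item in coef:
        if hasattr(item, "__iter__"):
            # item is itself a sequence (e.g. an inner row): recurse into it
            flat.extend(flatten_coefficients(item))
        else:
            # item is a plain number
            flat.append(float(item))
    return flat


nested = [[0.5, 0.25, 0.0]]      # shape like the first result in the question
already_flat = [0.5, 0.25, 0.0]  # shape like the second result

print(flatten_coefficients(nested))
print(flatten_coefficients(already_flat))
```

Both calls produce the same flat list, so code that reads the coefficients afterwards can index them the same way in either case.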

+1
Jul 18 2018-12-18T00:

This isn't really a question about the Python language; it should be a question for the sklearn developers. But if you know that this is the format in which your data will be returned, you can simply index into the inner list first:

    # note the extra [0] before the element index
    print 'getting value 0:', clf.coef_[0][0]
    print 'getting value 1:', clf.coef_[0][1]
-1
Jul 18 '12 at 20:26


