Reshape A Data For Sklearn

I have a list of colors: initialColors = [u'black' u'black' u'black' u'white' u'white' u'white' u'powderblue' u'whitesmoke' u'black' u'cornflowerblue' u'powderblue' u'powderblue'

Solution 1:

Quick answer:

  • Do features_train = features_train.reshape(-1, 1);
  • Do NOT do labels_train = labels_train.reshape(-1, 1). Leave labels_train as it is.
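The two bullets above can be sketched as a small helper. This is only an illustration: the function name to_sklearn_shapes and the sample values are hypothetical and not part of the original question, and it assumes a single feature per sample.

```python
import numpy as np


def to_sklearn_shapes(features, labels):
    """Return (X, y) with X of shape (n_samples, 1) and y of shape (n_samples,).

    Hypothetical helper for the single-feature case discussed here.
    """
    X = np.asarray(features).reshape(-1, 1)  # 2D: one column, numpy infers the rows
    y = np.asarray(labels).ravel()           # 1D: left flat, as fit() expects
    if X.shape[0] != y.shape[0]:
        raise ValueError("features and labels must have the same number of samples")
    return X, y


# Made-up data: 6 samples, 1 feature each.
X, y = to_sklearn_shapes([0.1, 0.4, 0.5, 0.9, 1.2, 1.3], [0, 0, 0, 1, 1, 1])
print(X.shape)  # (6, 1)
print(y.shape)  # (6,)
```

The resulting X and y can then be passed to an estimator's fit(X, y).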

Some details:

It seems you are confused about why estimators require a 2D array as input. The training data X has shape (n_samples, n_features). So features_train.reshape(-1, 1) is correct in your case: you have only 1 feature, and the -1 lets numpy infer how many samples there are. This does solve your first error.

The target values y have shape (n_samples,), so a 1D array is expected. When you do labels_train = labels_train.reshape(-1, 1), you convert it to a 2D column vector. That's why you got the second warning. Note that it is only a warning: fit() figured it out and did the correct conversion, so your program continues to run and the result should be correct.
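To see the warning in isolation, here is a minimal sketch. The estimator (LogisticRegression) and the toy data are assumptions chosen for illustration; the original question does not say which estimator was used.

```python
import warnings

import numpy as np
from sklearn.exceptions import DataConversionWarning
from sklearn.linear_model import LogisticRegression

# Made-up data: 6 samples, 1 feature, binary labels.
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0]).reshape(-1, 1)
y_1d = np.array([0, 0, 0, 1, 1, 1])
y_col = y_1d.reshape(-1, 1)  # shape (6, 1): a 2D column vector

# Record warnings instead of printing them, so we can inspect what fit() raised.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    LogisticRegression().fit(X, y_col)

got_conversion_warning = any(
    issubclass(w.category, DataConversionWarning) for w in caught
)
print(got_conversion_warning)

# ravel() turns the column vector back into the 1D shape that fit() expects.
y_flat = y_col.ravel()
print(y_flat.shape)  # (6,)
```

Passing y_1d (or y_flat) instead of y_col should avoid the warning altogether.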

When you do:

features_train = features_train.reshape(1, -1)
labels_train = labels_train.reshape(1, -1)

First, this is the wrong conversion for features_train in your case, because X.reshape(1, -1) means you have 1 sample and let numpy infer how many features there are. That is not what you want, but fit() doesn't know that and will process it accordingly, giving you the wrong result.

That being said, your last error does not come from features_train = features_train.reshape(1, -1). It comes from labels_train = labels_train.reshape(1, -1). Your labels_train now has shape (1, 29), a 2D row vector, which is neither a 1D array nor a column vector. Though we know it should be interpreted as a 1D array of target values, fit() is not that smart yet and doesn't know what to do with it.
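The shapes involved can be checked with numpy alone. In this sketch the length 29 is taken from the (1, 29) shape mentioned above; the array contents are placeholders, not the asker's data.

```python
import numpy as np

n_samples = 29  # from the (1, 29) shape quoted in the answer

# Placeholder values; only the shapes matter here.
features_train = np.arange(n_samples, dtype=float)
labels_train = np.arange(n_samples) % 2

# The reshape(1, -1) calls from the question: 1 row, 29 columns each.
wrong_X = features_train.reshape(1, -1)
wrong_y = labels_train.reshape(1, -1)
print(wrong_X.shape)  # (1, 29): read as 1 sample with 29 features
print(wrong_y.shape)  # (1, 29): 2D row, not a 1D array or a column vector

# The shapes an estimator expects for one feature per sample.
right_X = features_train.reshape(-1, 1)
right_y = wrong_y.ravel()  # flatten the labels back to 1D
print(right_X.shape)  # (29, 1)
print(right_y.shape)  # (29,)
```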
