comparing floating point arrays to arrays of integers in Numpy - amjass12 - Jul-26-2021

I am working on a multi-label classification task and am using sklearn's OneHotEncoder and MultiLabelBinarizer to one-hot encode my target labels.

I have mainly been using OneHotEncoder, but I also ran the encoding with MultiLabelBinarizer to check that both produce the same target encodings. They do; however, the output dtype of OneHotEncoder is 'float64' by default, while for MultiLabelBinarizer it is 'int64'. A preview of each is as follows:


MultiLabelBinarizer
array([[1, 0, 0, ..., 0, 0, 0],
       [0, 1, 0, ..., 0, 0, 0], .... etc

OneHotEncoder
array([[1., 0., 0., ..., 0., 0., 0.],
       [0., 1., 0., ..., 0., 0., 0.], ... etc

When I compare these two arrays using np.array_equal():

np.array_equal(mlb, ohe)  # mlb and ohe hold the arrays above; the code that generates them is not shown

the result is True, i.e. the arrays are reported as equal.
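
A minimal reproduction of the comparison, using small hand-written arrays in place of the real encoder outputs (the values and the 2x3 shape are made up for illustration; only the dtypes match what is described above):

```python
import numpy as np

# Stand-ins for the encoder outputs: same values, different dtypes.
mlb = np.array([[1, 0, 0], [0, 1, 0]], dtype=np.int64)            # like MultiLabelBinarizer
ohe = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], dtype=np.float64)  # like OneHotEncoder

same_values = np.array_equal(mlb, ohe)   # compares shape and element values
same_dtype = mlb.dtype == ohe.dtype      # compares the data types only

print(same_values)
print(same_dtype)
```

Here the first check passes while the second does not, which is the same situation as with the full arrays.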

This is good, as it confirms that both encoders one-hot encode my target labels the same way. My question, though, is:

Why does np.array_equal() return True when the data types differ? The encoded values are the same, but one array is floating point and the other is integer. How is this behaviour explained?
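
A possible explanation, sketched rather than stated as definitive: np.array_equal() checks that the shapes match and that the elements compare equal with ==, and an int64-versus-float64 comparison promotes both sides to a common type first, so 1 and 1.0 compare equal. The dtype itself is not part of the check. The helper strictly_equal below is hypothetical (it is not a NumPy function) and just adds a dtype check on top:

```python
import numpy as np


def strictly_equal(a, b):
    """Hypothetical helper: True only if dtype, shape and values all match."""
    a, b = np.asarray(a), np.asarray(b)
    return bool(a.dtype == b.dtype and np.array_equal(a, b))


# Illustrative arrays, not the real encoder outputs.
ints = np.array([[1, 0], [0, 1]], dtype=np.int64)
floats = ints.astype(np.float64)

# Type that int64 and float64 operands are promoted to for the comparison.
common = np.result_type(ints, floats)

# Elementwise comparison; each integer is compared with its float counterpart.
elementwise = ints == floats

print(common)
print(elementwise.all())
print(strictly_equal(ints, floats))
print(strictly_equal(ints, floats.astype(np.int64)))
```

If the dtype matters for a later step, casting one array with astype() before comparing or training is one option.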


Thank you!