How To Use Mask Indexing On Numpy Arrays Of Classes?
Solution 1:
So your array is dtype=object (print it to see), and each element points to an instance of your class:
items = np.array([TestClass() for _ in range(10)])
Now try:
items.active
items is an array; active is an attribute of your class, not an attribute of the array of your objects. Your definition does not add any functionality to the ndarray class. The error isn't in the masking; it's in trying to get the instance attribute.
Many operations on arrays like this have to be done iteratively. This kind of array is similar to a plain Python list.
[obj.active for obj in items]
or to turn it back into an array
np.array([obj...])
items[[True,False,True,...]] should work, but that's because the mask is already a boolean list or array.
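The point above can be sketched end to end. This is a minimal illustration, assuming a hypothetical TestClass that sets active per instance (the class from the question is not shown here): attribute access on the array fails, while a mask built with a list comprehension indexes normally.

```python
import numpy as np

class TestClass:
    # Hypothetical stand-in for the class in the question.
    def __init__(self, val):
        self.active = bool(val % 2)

items = np.array([TestClass(i) for i in range(10)])  # dtype=object

# The array itself has no `active` attribute; only its elements do.
try:
    items.active
    raised = False
except AttributeError:
    raised = True

# Build the mask iteratively, then use ordinary boolean indexing.
mask = np.array([obj.active for obj in items])
selected = items[mask]
```

Here mask is a genuine boolean array, so items[mask] behaves like any other boolean index.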
====================
Let's modify your class so it shows something interesting. Note I am assigning active to instances, not, as you did, to the class:
In [1671]: class TestClass:
      ...:     def __init__(self,val):
      ...:         self.active = bool(val%2)
In [1672]: items = np.array([TestClass(i) for i in range(10)])
In [1674]: items
Out[1674]:
array([<__main__.TestClass object at 0xb106758c>,
<__main__.TestClass object at 0xb117764c>,
...
<__main__.TestClass object at 0xb269850c>], dtype=object)
# The printed array isn't interesting. The class needs a `__repr__` method, since the array display uses the repr of each element.
This is simple iterative access to the attribute:
In [1675]: [i.active for i in items]
Out[1675]: [False, True, False, True, False, True, False, True, False, True]
np.frompyfunc provides a more powerful way of accessing each element of an array. operator.attrgetter('active')(i) is a functional way of writing i.active (it needs import operator first).
In [1676]: f=np.frompyfunc(operator.attrgetter('active'),1,1)
In [1677]: f(items)
Out[1677]: array([False, True, False, True, False, True, False, True, False, True], dtype=object)
But the main advantage of this function appears when I change the shape of the array: the result keeps that shape.
In [1678]: f(items.reshape(2,5))
Out[1678]:
array([[False, True, False, True, False],
[True, False, True, False, True]], dtype=object)
Note this array is dtype object. That's what frompyfunc does. To get an array of booleans we need to change the type:
In [1679]: f(items.reshape(2,5)).astype(bool)
Out[1679]:
array([[False, True, False, True, False],
[ True, False, True, False, True]], dtype=bool)
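To connect this back to mask indexing, here is a small sketch that uses the frompyfunc result as a boolean mask on a 2-D object array. The val attribute is a hypothetical addition, included only so the selected elements can be identified.

```python
import operator
import numpy as np

class TestClass:
    def __init__(self, val):
        self.val = val                  # hypothetical, for inspection only
        self.active = bool(val % 2)

items = np.array([TestClass(i) for i in range(10)])
get_active = np.frompyfunc(operator.attrgetter('active'), 1, 1)

grid = items.reshape(2, 5)
# frompyfunc returns dtype=object; cast so it is treated as a boolean mask.
mask = get_active(grid).astype(bool)

# A 2-D boolean mask selects elements in row-major order as a 1-D array.
active_vals = [obj.val for obj in grid[mask]]
```

The astype(bool) step matters here: without it the index is an object array, which NumPy does not accept as a boolean mask.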
np.vectorize uses frompyfunc, and makes the dtype a little more user friendly. But in timings it's a bit slower.
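As a rough sketch of the np.vectorize route, passing otypes=[bool] asks for a boolean result directly, so no astype step is needed. The lambda and the name is_active are illustrative choices, not part of the original answer.

```python
import numpy as np

class TestClass:
    def __init__(self, val):
        self.active = bool(val % 2)

items = np.array([TestClass(i) for i in range(6)])

# otypes fixes the output dtype, so the result can be used as a mask as-is.
is_active = np.vectorize(lambda obj: obj.active, otypes=[bool])
mask = is_active(items)
selected = items[mask]
```

This is a convenience rather than a speed-up; the function is still called once per element in Python.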
===============
Expanding on Jon's comment
In [1702]: class TestClass:
      ...:     def __init__(self,val):
      ...:         self.active = bool(val%2)
      ...:     def __bool__(self):
      ...:         return self.active
      ...:     def __str__(self):
      ...:         return 'TestClass(%s)'%self.active
      ...:     def __repr__(self):
      ...:         return str(self)
In [1707]: items = np.array([TestClass(i) for i in range(5)])
items now displays in an informative manner, and converts to strings:
In [1708]: items
Out[1708]:
array([TestClass(False), TestClass(True), TestClass(False),
TestClass(True), TestClass(False)], dtype=object)
In [1709]: items.astype('S20')
Out[1709]:
array([b'TestClass(False)', b'TestClass(True)', b'TestClass(False)',
b'TestClass(True)', b'TestClass(False)'],
dtype='|S20')
and converts to bool:
In [1710]: items.astype(bool)
Out[1710]: array([False, True, False, True, False], dtype=bool)
In effect astype is applying the conversion method to each element of the array. We could also define __int__, __add__, and so on. This shows that it is easier to add functionality to the custom class than to the array class itself. I wouldn't expect to get the same speed as with native types.
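Putting the __bool__ approach to work as a mask might look like the following sketch. It assumes the same TestClass as above; only the final indexing step is new.

```python
import numpy as np

class TestClass:
    def __init__(self, val):
        self.active = bool(val % 2)
    def __bool__(self):
        # Lets astype(bool) read the flag from each element.
        return self.active
    def __repr__(self):
        return 'TestClass(%s)' % self.active

items = np.array([TestClass(i) for i in range(5)])

mask = items.astype(bool)   # applies __bool__ to each element
selected = items[mask]      # ordinary boolean mask indexing
```

With __bool__ defined, items[items.astype(bool)] is probably the closest equivalent to the mask indexing the question was after.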