Override A Dict With Numpy Support

Using the base idea from How to 'perfectly' override a dict?, I coded a dictionary-based class that should support assigning dot-delimited keys, i.e. Extendeddict('level1.leve

Solution 1:

The problem is in the np.array constructor step. It digs into its inputs trying to create a higher dimensional array.

In [99]: basic={'test.field':'test'}

In [100]: eb=Extendeddict(basic)

In [104]: eba=np.array([eb],object)
<keys: 0,[0]>
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-104-5591a58c168a> in <module>()
----> 1 eba=np.array([eb],object)

<ipython-input-88-a7d937b1c8fd> in __getitem__(self, key)
     11         keys = self._keytransform(key);print key;print keys
     12         if len(keys) == 1:
---> 13             return self._store[key]
     14         else:
     15             key1 = '.'.join(keys[1:])

KeyError: 0

But if I create the array first and then assign the object, it works fine:

In [105]: eba=np.zeros((1,),object)

In [106]: eba[0]=eb

In [107]: eba
Out[107]: array([{'test': {'field': 'test'}}], dtype=object)

np.array is a tricky function to use with dtype=object. Compare np.array([[1,2],[2,3]],dtype=object) and np.array([[1,2],[2]],dtype=object). The first has shape (2,2), the second (2,). np.array tries to make a 2d array, and falls back to a 1d array with list elements only if that fails. Something along those lines is happening here.
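The shape difference described above can be checked directly. This is a small sketch using only the two literals from the paragraph; the variable names are mine.

```python
import numpy as np

# Every inner list has the same length, so np.array builds a 2-D array.
regular = np.array([[1, 2], [2, 3]], dtype=object)

# The inner lists differ in length, so np.array falls back to a 1-D array
# whose elements are the original Python lists.
ragged = np.array([[1, 2], [2]], dtype=object)

print(regular.shape)     # (2, 2)
print(ragged.shape)      # (2,)
print(type(ragged[0]))   # <class 'list'>
```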

I see two solutions. One is this roundabout way of constructing the array, which I've used on other occasions. The other is to figure out why np.array doesn't dig into a dict but does dig into yours. np.array is compiled, so that may require reading through some tough source code on GitHub.
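The roundabout construction can be wrapped in a small helper. The sketch below is not the asker's code: DotDict is a hypothetical, simplified stand-in for Extendeddict (without the dot-key handling), and object_array is a helper name I made up. It relies only on the fact that assigning to a single integer index of an object array stores the object itself.

```python
from collections.abc import MutableMapping

import numpy as np


class DotDict(MutableMapping):
    """Hypothetical, simplified stand-in for Extendeddict (no dot-key logic)."""

    def __init__(self, *args, **kwargs):
        self._store = dict(*args, **kwargs)

    def __getitem__(self, key):
        return self._store[key]

    def __setitem__(self, key, value):
        self._store[key] = value

    def __delitem__(self, key):
        del self._store[key]

    def __iter__(self):
        return iter(self._store)

    def __len__(self):
        return len(self._store)


def object_array(items):
    """Build a 1-D object array without letting np.array inspect the items."""
    items = list(items)
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = item  # element assignment stores the object as-is
    return out


eb = DotDict({'test.field': 'test'})
eba = object_array([eb])
print(eba.shape)      # (1,)
print(eba[0] is eb)   # True
```

Because np.array never sees the mapping, its __len__ and __getitem__ are not consulted while the array is built.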


I tried a solution with f=np.frompyfunc(lambda x:x,1,1), but that doesn't work (see my edit history for details). But I found that mixing an Extendeddict with a dict does work:

In [139]: np.array([eb,basic])
Out[139]: array([{'test': {'field': 'test'}}, {'test.field': 'test'}], dtype=object)

So does mixing it with something else, like None or an empty list:

In [140]: np.array([eb,[]])
Out[140]: array([{'test': {'field': 'test'}}, []], dtype=object)

In [142]: np.array([eb,None])[:-1]
Out[142]: array([{'test': {'field': 'test'}}], dtype=object)

This is another common trick for constructing an object array of lists.

It also works if you give it two or more Extendeddict objects with different lengths:

np.array([eb, Extendeddict({})])

In other words, it works if the len(...) values differ (just as with mixed lists).
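The None-padding trick mentioned above can be sketched as a helper. The function name is hypothetical, and the demonstration uses plain lists rather than Extendeddict, since how np.array treats a custom mapping may differ between NumPy versions.

```python
import numpy as np


def object_array_via_padding(items):
    """Append None so np.array cannot form a higher-dimensional array,
    then slice the padding element off again."""
    padded = list(items) + [None]
    return np.array(padded, dtype=object)[:-1]


# Without padding, two equal-length lists become a (2, 2) array.
direct = np.array([[1, 2], [3, 4]], dtype=object)

# With padding, the lists survive as elements of a 1-D object array.
rows = object_array_via_padding([[1, 2], [3, 4]])

print(direct.shape)  # (2, 2)
print(rows.shape)    # (2,)
```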

Solution 2:

Numpy tries to do what it's supposed to do:

Numpy checks for each element if it's iterable (by using len and iter) because what you pass in may be interpreted as a multidimensional array.

There is a catch here: dict-like classes (meaning isinstance(element, dict) == True) will not be interpreted as another dimension (that is why passing in [{}] works). Probably they should check if it's a collections.Mapping instead of a dict. Maybe you can file a bug on their issue tracker.

If you change your class definition to:

class Extendeddict(collections.MutableMapping, dict):
    ...

or change your __len__-method:

def __len__(self):
    raise NotImplementedError

it works. Neither of these may be something you want to do, but numpy just uses duck typing to create the array: unless your class subclasses dict directly or makes len inaccessible, numpy sees it as something that ought to be another dimension. This is rather clever and convenient if you want to pass in customized sequences (subclasses of collections.Sequence), but inconvenient for collections.Mapping or collections.MutableMapping. I think this is a bug.
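The first variant (also inheriting from dict) can be tried out as follows. This is a sketch, not the asker's class: DictBackedMapping is a hypothetical stand-in for Extendeddict without the dot-key handling, and it imports MutableMapping from collections.abc, which is where the ABC lives on Python 3.

```python
from collections.abc import MutableMapping

import numpy as np


class DictBackedMapping(MutableMapping, dict):
    """Hypothetical stand-in for Extendeddict that also inherits from dict."""

    def __init__(self, *args, **kwargs):
        self._store = {}
        # MutableMapping.update routes every item through __setitem__ below.
        self.update(dict(*args, **kwargs))

    def __getitem__(self, key):
        return self._store[key]

    def __setitem__(self, key, value):
        self._store[key] = value

    def __delitem__(self, key):
        del self._store[key]

    def __iter__(self):
        return iter(self._store)

    def __len__(self):
        return len(self._store)


eb = DictBackedMapping({'test.field': 'test'})
arr = np.array([eb], dtype=object)

print(isinstance(eb, dict))  # True
print(arr.shape)             # (1,)
print(arr[0] is eb)          # True
```

Since the instance passes the isinstance(..., dict) check, np.array stores it as a single element instead of treating it as another dimension. Note that the data still lives in _store, so methods inherited from dict that are not overridden (such as __repr__) see an empty underlying dict.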
