This module contains recursive algorithms for accessing, sorting, and analyzing most Python objects. The core functions are `get`, `give`, `sort`, and `maps`; supporting utilities include `depth`, `flatten`, and `pipe`.

depth and flattening

For nested lists and dicts, it's often convenient to know how deep the rabbit hole goes, and to extract every element into a single flat iterable.

from typing import Union

def depth(obj : Union[dict,list]):
    '''
    Recursively count the number of layers of a nested
    list or dictionary `obj`.
    '''
    if type(obj) is dict and obj:
        return 1 + max(depth(obj[a]) for a in obj)
    if type(obj) is list and obj:
        return 1 + max(depth(a) for a in obj)
    return 0

depth

depth(obj:Union[dict, list])

Recursively count the number of layers of a nested list or dictionary obj.

depth([0,[1,[2]]])
3
depth({0:{0:{0:0}}})
3
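
A few edge cases follow directly from the definition above. The snippet below repeats that definition so it runs on its own; the sample values are made up for illustration.

```python
from typing import Union

def depth(obj: Union[dict, list]) -> int:
    """Count the nesting layers of a list or dict (same logic as the definition above)."""
    if type(obj) is dict and obj:
        return 1 + max(depth(obj[a]) for a in obj)
    if type(obj) is list and obj:
        return 1 + max(depth(a) for a in obj)
    return 0

empty = depth([])                     # empty containers fall through to the final return
nested_empty = depth([[]])            # one layer wrapped around an empty list
mixed = depth({'a': [1, {'b': 2}]})   # dict -> list -> dict
as_tuple = depth((1, [2]))            # tuples are neither list nor dict, so they are not inspected
print(empty, nested_empty, mixed, as_tuple)  # 0 1 3 0
```

Because the checks use `type(obj) is ...`, subclasses of list or dict and other containers such as tuples count as leaves.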

flatten

flatten(obj:Union[dict, list], parent_key='', sep=',')

Concatenate the nested input obj into an equivalent data structure of depth 1. For nested dictionaries, the keys along each path are joined using parent_key and sep. See https://stackoverflow.com/questions/6027558/flatten-nested-dictionaries-compressing-keys

flatten([0,[1,[2,[3]]]])
[0, 1, 2, 3]
i=flatten({'a':0,'b':{'c':1,'d':{'e':2}}})
print(i)
{'a': 0, 'b,c': 1, 'b,d,e': 2}
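
The key-joining idea can be sketched in a few lines. This is a minimal illustration, not the module's implementation: `flatten_sketch` is a hypothetical name, and it assumes dictionary keys are non-empty and can be converted with `str`.

```python
def flatten_sketch(obj, parent_key='', sep=','):
    """Hypothetical stand-in for flatten: returns a depth-1 dict or list."""
    if isinstance(obj, dict):
        flat = {}
        for key, value in obj.items():
            # Join the path of keys walked so far with the separator.
            new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
            if isinstance(value, dict) and value:
                flat.update(flatten_sketch(value, new_key, sep))
            else:
                flat[new_key] = value
        return flat
    if isinstance(obj, list):
        flat = []
        for item in obj:
            # Nested lists are spliced into the parent list.
            if isinstance(item, list):
                flat.extend(flatten_sketch(item))
            else:
                flat.append(item)
        return flat
    return obj

flat_list = flatten_sketch([0, [1, [2, [3]]]])
flat_dict = flatten_sketch({'a': 0, 'b': {'c': 1, 'd': {'e': 2}}})
print(flat_list)  # [0, 1, 2, 3]
print(flat_dict)  # {'a': 0, 'b,c': 1, 'b,d,e': 2}
```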

unflatten

unflatten(d:dict, sep=',')

Rebuild a nested dictionary from a flattened dictionary d whose keys were joined with the separator sep. See https://gist.github.com/fmder/494aaa2dd6f8c428cede

unflatten(i)
{'a': 0, 'b': {'c': 1, 'd': {'e': 2}}}
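
The reverse direction can be sketched the same way: split each key on the separator and walk (or create) one dictionary per segment. `unflatten_sketch` is a hypothetical name, and the sketch assumes every key is a string.

```python
def unflatten_sketch(d, sep=','):
    """Hypothetical stand-in for unflatten: rebuild nesting from joined keys."""
    nested = {}
    for flat_key, value in d.items():
        *parents, leaf = flat_key.split(sep)
        node = nested
        for part in parents:
            # Create intermediate dictionaries on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested

rebuilt = unflatten_sketch({'a': 0, 'b,c': 1, 'b,d,e': 2})
print(rebuilt)  # {'a': 0, 'b': {'c': 1, 'd': {'e': 2}}}
```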

get and give

These functions access and assign elements, attributes, methods, and function calls of arbitrary Python objects, keeping whichever access attempts succeed.

get

get(obj:Union[object, dict, list, tuple, callable, ndarray], *args:Union[NoneType, str, int, float, tuple, list, dict], retnone:bool=True, call:bool=False, **kwargs)

Recursively accesses obj by the given ordered attributes/keys/indexes args.

If call is True, the last accessed object is called with **kwargs.

If retnone is True, returns None when obj can't be accessed by any of the args; if False, returns obj unchanged in that case.

a=np.array([[0,1],[2,3]])
get(a,0)
array([0, 1])
get(a,'doesnt exist')
get(a,'doesnt exist',retnone=False)
array([[0, 1],
       [2, 3]])
get(a,'mean')
<function ndarray.mean>
get(a,'mean',-1)
array([0.5, 2.5])
get(a,'mean',call=True,axis=-1)
array([0.5, 2.5])
get(a,'mean',call=True)
1.5
get(a,'mean','doesnt exist',call=True)
1.5

Because get tries several kinds of access (indexing, attributes, calls) and skips any arg that fails, it suits a flexible, exploratory programming style: data can be extracted and transformed without knowing the exact structure in advance.
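
The try-and-skip behavior can be sketched with plain Python objects. This is a simplified illustration under stated assumptions, not the module's code: `get_sketch` is a hypothetical name, it only tries item access and attribute access, and it leaves out the call handling shown in the examples above.

```python
# Errors that signal "this arg does not apply to the current object".
_ACCESS_ERRORS = (KeyError, IndexError, TypeError, AttributeError)

def get_sketch(obj, *args, retnone=True):
    """Hypothetical stand-in for get: walk obj by args, skipping args that fail."""
    current = obj
    accessed = False
    for arg in args:
        try:
            current = current[arg]          # first try key/index access
            accessed = True
            continue
        except _ACCESS_ERRORS:
            pass
        try:
            current = getattr(current, arg)  # then try attribute access
            accessed = True
        except _ACCESS_ERRORS:
            pass                             # skip this arg and keep going
    if args and not accessed and retnone:
        return None
    return current

data = {'a': [10, {'b': 'x'}]}
deep = get_sketch(data, 'a', 1, 'b')           # dict key, list index, dict key
skipped = get_sketch(data, 'a', 'missing', 0)  # 'missing' is skipped, then index 0
nothing = get_sketch(data, 'missing')          # no arg worked and retnone is True
same = get_sketch(data, 'missing', retnone=False)
print(deep, skipped, nothing, same is data)    # x 10 None True
```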

Next, we will use a similar method to insert arbitrary methods and data into objects with give.

give

give(obj:Union[object, dict, list, tuple, ndarray], *args:Union[str, int, float, tuple, list, dict], **kwargs:object)

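Assign a value to the element of obj located by args. Without kwargs, the last item of args is the value to assign and the preceding items locate the element. With kwargs, each keyword and its value are assigned at the location given by args.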

a=[0]
give(a,0,1) #give the 0th element of a the value 1
print(a)
[1]

give only works on indexable objects:

a=0
give(a,0)
Not enough args or kwargs - require access to elements of obj.
a=np.array([[0,1],[2,3]])
give(a,(1,1),100)
a #arrays must be accessed using tuples
array([[  0,   1],
       [  2, 100]])
a=[[0,1],[2,3]]
give(a,1,1,100)
a #lists must be accessed using sequences
[[0, 1], [2, 100]]
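
The two calling conventions can be sketched as follows. `give_sketch` is a hypothetical, simplified stand-in that only uses item access; raising an error when there are too few arguments is a choice made for this sketch, whereas the example above shows give printing a message instead.

```python
def give_sketch(obj, *args, **kwargs):
    """Hypothetical stand-in for give: assign in place and return None."""
    if kwargs:
        # With kwargs, every positional arg is part of the path.
        target = obj
        for step in args:
            target = target[step]
        for name, value in kwargs.items():
            target[name] = value
        return None
    if len(args) < 2:
        raise ValueError("need at least one index and one value")
    # Without kwargs, the last arg is the value and the one before it the key.
    *path, key, value = args
    target = obj
    for step in path:
        target = target[step]
    target[key] = value
    return None

nested = [[0, 1], [2, 3]]
give_sketch(nested, 1, 1, 100)
edges = {(0, 1): {}}
give_sketch(edges, (0, 1), weight=0.5)
print(nested)  # [[0, 1], [2, 100]]
print(edges)   # {(0, 1): {'weight': 0.5}}
```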

give can be used to replace loops:

a=[0]
[give(a,0,a[0]+1) for i in range(10)]
print(a)
[10]

Let's use a more complex example. We'll use a networkx graph object to create a simple ring of nodes 0->1->2->0 and apply some attributes.

import networkx as nx
g=nx.DiGraph()
g.add_edges_from([(0,1),(1,2),(2,0)])
[get(g.edges,e,['weight']) for e in g.edges] #normally this would produce an error!
[{}, {}, {}]
[give(g.edges,e,weight=np.random.random(1)) for e in g.edges] #give returns None
[None, None, None]
for e in g.edges: #newly applied values
    print(e,g.edges[e]['weight'])
(0, 1) [0.29207978]
(1, 2) [0.19885117]
(2, 0) [0.66909138]
[get(g.nodes,n,'a') for n in g.nodes] #normally would give error!
[{}, {}, {}]
[give(g.nodes,n,a=np.random.random(1)) for n in g.nodes]
[get(g.nodes,n,'a') for n in g.nodes]
[array([0.26994865]), array([0.57217611]), array([0.32531337])]

These functions can be used in many more ways than those shown here, and encourage exploratory programming.

sort

Now we examine sorting data structures with sort using get.

sort

sort(obj:Iterable[T_co], *args:Union[NoneType, str, int, float, tuple, list, dict], by:Union[NoneType, object, dict, list, tuple, callable, ndarray]=None, key:callable=<lambda>, sift:callable=<lambda>, reverse:bool=True)

Recursively sorts obj by args using key.

Treats obj as an iterator and wraps by around the elements of obj, before optionally wrapping any remaining args, and returning the evaluated tuples generated over obj. If by is None, searches for args.

If both by and args are None, sorts over obj only (the default behavior).

sift defaults to keeping all elements, while key defaults to treating empty objects as having value 0.

reverse defaults to True, so results are sorted in descending order by default.

sort([9,0,1,8,7])
[9, 8, 7, 1, 0]
sort([9,0,1,8,7],by=lambda t:t%3)
[(8, 2), (1, 1), (7, 1), (9, 0), (0, 0)]
sort(g.nodes,'a')
[array([0.57217611]), array([0.32531337]), array([0.26994865])]
sort(g.nodes,'a',by=g.nodes)
[(1, array([0.57217611])), (2, array([0.32531337])), (0, array([0.26994865]))]
sort(g.edges,'weight')
[array([0.66909138]), array([0.29207978]), array([0.19885117])]
sort(g.edges,'weight',by=g.edges)
[((2, 0), array([0.66909138])),
 ((0, 1), array([0.29207978])),
 ((1, 2), array([0.19885117]))]
g.add_edges_from([(0,0)]) #add another edge
sort(g.nodes,by=g.edges,key=lambda t:len(sort(list(t[-1])))) #then sort nodes by their number of outgoing edges
[(0, OutEdgeDataView([(0, 1), (0, 0)])),
 (1, OutEdgeDataView([(1, 2)])),
 (2, OutEdgeDataView([(2, 0)]))]
sort(g.nodes,by=g.in_degree) #sort by in-degree using the 'in_degree' attribute of networkx; the order matches because in- and out-degrees coincide in this graph
[(0, 2), (1, 1), (2, 1)]
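
The pairing behavior in the first two examples above can be sketched with the built-in `sorted`. This is a rough illustration only: `sort_sketch` is a hypothetical name that assumes `by` is a callable and ignores args, key, and sift.

```python
def sort_sketch(items, by=None, reverse=True):
    """Hypothetical stand-in for sort: pair each element with by(element)."""
    if by is None:
        return sorted(items, reverse=reverse)
    pairs = [(item, by(item)) for item in items]
    # Compare on the last entry of each tuple; ties keep their input order.
    return sorted(pairs, key=lambda pair: pair[-1], reverse=reverse)

plain = sort_sketch([9, 0, 1, 8, 7])
paired = sort_sketch([9, 0, 1, 8, 7], by=lambda t: t % 3)
print(plain)   # [9, 8, 7, 1, 0]
print(paired)  # [(8, 2), (1, 1), (7, 1), (9, 0), (0, 0)]
```

Python's `sorted` is stable even with `reverse=True`, which is why the tied pairs keep their original relative order, as in the output shown above.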

In conclusion, sort handles a wide range of data structures, and it's best understood by playing around with it and seeing what works!

A word of caution, though: the more complex and custom the Python object, the harder it is to cast into something sortable. Remember to transform your data so that the key function returns values that can actually be compared with one another; since key is an arbitrary lambda, it remains up to you to make sure the resulting data types support comparison.
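
The comparison problem is easy to reproduce with the built-in `sorted`, independently of this module. The mixed list below is made up for illustration.

```python
mixed = [3, '10', 2.5]

# Integers and strings cannot be ordered against each other in Python 3.
try:
    sorted(mixed)
    comparable = True
except TypeError:
    comparable = False

# A key that maps every element to a float makes the comparison valid.
by_value = sorted(mixed, key=float, reverse=True)
print(comparable)  # False
print(by_value)    # ['10', 3, 2.5]
```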

pipe

pipe(func, otype=None, ftype=None, cast=<sidis.conversion.Caster object>, *args, **kwargs)

Pipelines func to act on an object supplied later. Returns a partially evaluated function over any args and kwargs. The object is cast to type otype before being evaluated, and the output of the function is cast to ftype.

pipe(lambda x,y:x+y,otype=int,ftype=int,y=1.9)(1.9) #convert the input to int, and the output to int
2
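
The cast-call-cast sequence can be sketched as a closure. `pipe_sketch` is a hypothetical name; it assumes otype and ftype are callables such as `int` or `list` and applies them directly, rather than going through the module's cast object.

```python
def pipe_sketch(func, otype=None, ftype=None, **kwargs):
    """Hypothetical stand-in for pipe: defer func until an object arrives."""
    def piped(obj):
        if otype is not None:
            obj = otype(obj)          # cast the input first
        result = func(obj, **kwargs)  # then evaluate with the stored kwargs
        return ftype(result) if ftype is not None else result
    return piped

add_then_truncate = pipe_sketch(lambda x, y: x + y, otype=int, ftype=int, y=1.9)
result = add_then_truncate(1.9)  # int(1.9) = 1, then 1 + 1.9 = 2.9, then int(2.9) = 2
print(result)  # 2
```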

pipe is useful for converting the input and output types of a function before passing it as an argument to other functions such as maps, as follows:

def maps(obj,
             *funcs,
             depth=0,
             zipit=False,
             to=None,
             squeeze=True):
    '''
    Sequentially map `funcs` over the elements of `obj`, "o".
    The first `depth` number of funcs are mapped sequentially f(g(h(...(o)..)))=x.
    The remaining number of funcs are mapped separately (u(x),v(x),...).
    Use partial `funcs` to fill in all args but `obj` if other parameters needed.
    If `zipit`, return tuples of the object elements "o" along with map outputs.
    '''
    if not funcs:
        return obj
    else:
        obj=obj if hasattr(obj,'__iter__') and not isinstance(obj,str) else [obj] #wrap non-iterables and strings in a list
        r=[o for o in obj]
        [[give(r,i,get(f,r[i])) for f in funcs[:depth]] for i in range(len(r))]
        [give(r,i,[get(f,r[i]) for f in funcs[depth:]]) for i in range(len(r))]
        if squeeze:
            r=np.ndarray.tolist(np.squeeze(np.array(r,dtype=object)))
        if zipit:
            r=list(zip(obj,r))
        return cast(r,to)

maps

maps(obj, *funcs, depth=0, zipit=False, to=None, squeeze=True)

Sequentially map funcs over the elements of obj, "o". The first depth funcs are applied in sequence, f(g(h(...(o)..)))=x. The remaining funcs are each applied separately to x, giving (u(x),v(x),...). Use partial funcs to fill in all args but obj if other parameters are needed. If zipit, return tuples of the object elements "o" along with the map outputs.

maps(0,lambda t:t+1,lambda t:t+1,lambda t:t+1,depth=0) #add 1 separately to 0, three times
[1, 1, 1]
maps(0,lambda t:t+1,lambda t:t+1,lambda t:t+1,depth=1) #add 1 to 0, then add 1 to that separately twice
[2, 2]
maps(0,lambda t:t+1,lambda t:t+1,lambda t:t+1,depth=-1) #sequentially apply all functions
3
maps([0,1],lambda t:t+1,lambda t:t+1,lambda t:t+1) #apply functions over elements of iterable
[[1, 1, 1], [2, 2, 2]]
maps([0,1],lambda t:t+1,lambda t:t+1,lambda t:t+1,depth=-1)
[3, 4]
maps([0,1],lambda t:t+1,lambda t:t+1,lambda t:t+1,depth=-1,zipit=True) #zip the arguments with their func outputs
[(0, 3), (1, 4)]
maps(g.nodes,pipe(g.predecessors,None,list),pipe(g.successors,None,list),zipit=True) #use a pipeline
[(0, [[2, 0], [1, 0]]), (1, [[0], [2]]), (2, [[1], [0]])]
maps(g.nodes,pipe(g.predecessors,None,list),pipe(g.successors,None,list),zipit=True,to=dict) #convert the output
{0: (0, [[2, 0], [1, 0]]), 1: (1, [[0], [2]]), 2: (2, [[1], [0]])}
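
The role of depth in the lambda examples above can be sketched without numpy. `maps_sketch` is a hypothetical, simplified version: it only treats lists and tuples as iterables, and it imitates squeeze by unwrapping single-element results.

```python
def maps_sketch(obj, *funcs, depth=0):
    """Hypothetical stand-in for maps: chain funcs[:depth], fan out funcs[depth:]."""
    if not funcs:
        return obj
    items = obj if isinstance(obj, (list, tuple)) else [obj]
    results = []
    for item in items:
        x = item
        for f in funcs[:depth]:
            x = f(x)                        # chained: each func feeds the next
        outputs = [f(x) for f in funcs[depth:]]  # fanned out: each func sees the same x
        results.append(outputs[0] if len(outputs) == 1 else outputs)
    return results[0] if len(results) == 1 else results

inc = lambda t: t + 1
separate = maps_sketch(0, inc, inc, inc, depth=0)
one_then_two = maps_sketch(0, inc, inc, inc, depth=1)
chained = maps_sketch(0, inc, inc, inc, depth=-1)
over_list = maps_sketch([0, 1], inc, inc, inc, depth=-1)
print(separate, one_then_two, chained, over_list)  # [1, 1, 1] [2, 2] 3 [3, 4]
```

With depth=-1 the slice `funcs[:-1]` chains all but the last function and `funcs[-1:]` applies the last one on its own, which is why the result is equivalent to applying every function in sequence.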