Components#
Axis#
- class tensorkrowch.Axis(num, name, node=None, node1=True)[source]#
Axes are the objects that stick edges to nodes. Each instance of the AbstractNode class has a list of \(N\) axes, each corresponding to one edge. Each axis stores information that facilitates accessing that edge, such as its name and num (index). Additionally, an axis keeps track of its batch and node1 attributes.

batch: If the axis name contains the word "batch", the edge will be a batch edge, which means that it cannot be connected to other nodes. Instead, it specifies a dimension that allows for batch operations (e.g., batch contraction). If the name of the axis is changed and no longer contains the word "batch", the corresponding edge will no longer be a batch edge. Furthermore, instances of the StackNode and ParamStackNode classes always have an axis with name "stack" whose edge is a batch edge.

node1: When two dangling edges are connected, the result is a new edge linking two nodes, say nodeA and nodeB. If the connection is performed in the following order:

new_edge = nodeA[edgeA] ^ nodeB[edgeB]

then nodeA will be the node1 of new_edge and nodeB the node2. Hence, to access one of the nodes from new_edge, one needs to know whether it is node1 or node2.

Even though we can create Axis instances directly, that will not usually be the case, since axes are created automatically when instantiating a new node.

Another thing one must take into account is the naming of Axes. Since the name of an Axis is used to access it from the Node, the same name cannot be used by more than one Axis. In that case, repeated names get an automatic enumeration of the form "name_{number}" (underscore followed by number). To add a custom enumeration in a user-defined way, one may use brackets or parentheses: "name_({number})".
- Parameters:
num (int) – Index of the axis in the node’s axes list.
name (str) – Axis name, should not contain blank spaces or special characters. If it contains the word “batch”, the axis will correspond to a batch edge. The word “stack” cannot be used in the name, since it is reserved for stacks.
node (AbstractNode, optional) – Node to which the axis belongs.
node1 (bool) – Boolean indicating whether node1 of the edge attached to this axis is the node that contains the axis (True). Otherwise, the node is node2 of the edge (False).
Examples
Although Axis will not usually be explicitly instantiated, it can be done like so:
>>> axis = tk.Axis(0, 'left')
>>> axis
Axis( left (0) )
>>> axis.is_node1()
True
>>> axis.is_batch()
False
Since "batch" is not contained in "left", axis does not correspond to a batch edge, but that can be changed:
>>> axis.name = 'mybatch'
>>> axis.is_batch()
True
Also, as explained before, knowing whether a node is the node1 or node2 of an edge enables users to access that node from the edge:
>>> nodeA = tk.Node(shape=(2, 3), axes_names=('left', 'right'))
>>> nodeB = tk.Node(shape=(3, 4), axes_names=('left', 'right'))
>>> new_edge = nodeA['right'] ^ nodeB['left']
...
>>> # nodeA is node1 and nodeB is node2 of new_edge
>>> nodeA == new_edge.nodes[1 - nodeA.get_axis('right').is_node1()]
True
>>> nodeB == new_edge.nodes[nodeA.get_axis('right').is_node1()]
True
The node1 attribute is extended to resultant nodes that inherit edges.
>>> nodeA = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> nodeB = tk.randn(shape=(3, 4), axes_names=('left', 'right'))
>>> nodeC = tk.randn(shape=(4, 5), axes_names=('left', 'right'))
>>> edge1 = nodeA['right'] ^ nodeB['left']
>>> edge2 = nodeB['right'] ^ nodeC['left']
>>> result = nodeA @ nodeB
...
>>> # result inherits the edges nodeA['left'] and edge2
>>> result['left'] == nodeA['left']
True
>>> result['right'] == edge2
True
>>> # result is still node1 of edge2, since nodeA was
>>> result.is_node1('right')
True
- property num#
Index in the node’s axes list.
- property name#
Axis name, used to access edges by name of the axis. It cannot contain blank spaces or special characters. If it contains the word “batch”, the axis will correspond to a batch edge. The word “stack” cannot be used in the name, since it is reserved for stacks.
- property node#
Node to which the axis belongs.
Nodes#
AbstractNode#
- class tensorkrowch.AbstractNode(shape=None, axes_names=None, name=None, network=None, data=False, virtual=False, override_node=False, tensor=None, edges=None, override_edges=False, node1_list=None, init_method=None, device=None, dtype=None, **kwargs)[source]#
Abstract class for all types of nodes. Defines what a node is and most of its properties and methods. Since it is an abstract class, it cannot be instantiated.
Nodes are the elements that make up a TensorNetwork. At its most basic level, a node is a container for a torch.Tensor that stores other relevant information, which enables building any network and operating nodes to contract it (and train it!). Some of the information carried by nodes includes the following (see the sketch after this list):

Shape: Every node needs a shape to know whether connections with other nodes are possible. Even if the tensor is not specified, an empty node needs a shape.

Tensor: The key ingredient of the node. Although the node acts as a container for the tensor, the node does not contain it. Actually, for efficiency purposes, the tensors are stored in a sort of memory that is shared by all the nodes of the TensorNetwork. Therefore, all that nodes contain is a memory address. Furthermore, some nodes can share the same (or part of the same) tensor, thus containing the same address. Sometimes, to maintain consistency, when two nodes share a tensor, one stores its memory address, and the other one stores a reference to the former.

Axes: A list of Axes that make it easy to access edges using just a name or an index.

Edges: A list of Edges, one for each dimension of the node. Each edge is attached to the node via an Axis. Edges are useful to connect several nodes, creating a TensorNetwork.

Network: The TensorNetwork to which the node belongs. If the network is not specified when creating the node, a new TensorNetwork is created to contain the node. Although the network can be thought of as a graph, it is a torch.nn.Module, so it is much more than that. Actually, the TensorNetwork can contain different types of nodes, not all of them being part of the graph, but being used for different purposes.

Successors: A dictionary with information about the nodes that result from Operations in which the current node was involved. See Successor.
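For instance, a freshly created node already carries this information. A minimal sketch (exact reprs may vary; tk is assumed to be the imported tensorkrowch package):

>>> import tensorkrowch as tk
>>> node = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> node.shape
torch.Size([2, 3])
>>> node.axes
[Axis( left (0) ), Axis( right (1) )]
>>> # A new TensorNetwork was created automatically to contain the node
>>> isinstance(node.network, tk.TensorNetwork)
True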
Carrying this information with the node is what makes it easy to:

Perform tensor network Operations, such as contraction of two neighbouring nodes, without having to worry about tensor shapes, order of axes, etc.

Perform more advanced operations such as stack() or unbind(), saving memory and time.

Keep track of the operations in which a node has taken part, so that several steps can be skipped in further training iterations. See TensorNetwork.trace().
Also, there are 4 mutually exclusive types of nodes that play different roles in the TensorNetwork (a short sketch follows this list):

leaf: These are the nodes that form the TensorNetwork (together with the data nodes). Usually, these will be the trainable nodes. These nodes can store their own tensors or use another node's tensor.

data: These are similar to leaf nodes, but they are never trainable, and are used to store the temporary tensors coming from input data. These nodes can store their own tensors or use another node's tensor.

virtual: These nodes are a sort of ancillary, hidden nodes that accomplish some useful task (e.g., in uniform tensor networks a virtual node can store the shared tensor, while all the other nodes in the network just keep a reference to it). These nodes always store their own tensors.

resultant: These are nodes that result from an Operation. They are intermediate nodes that (almost always) inherit edges from leaf and data nodes, the ones that really form the network. These nodes can store their own tensors or use another node's tensor. The name of a resultant node is the name of the Operation that originated it.
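A minimal sketch of checking these types (assuming default behaviour):

>>> nodeA = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> nodeB = tk.randn(shape=(3, 4), axes_names=('left', 'right'))
>>> _ = nodeA['right'] ^ nodeB['left']
>>> nodeA.is_leaf()
True
>>> # Contracting two leaf nodes produces a resultant node
>>> result = nodeA @ nodeB
>>> result.is_resultant()
True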
See TensorNetwork and reset() to learn more about the importance of these 4 types of nodes.

Another thing one should take into account are reserved node names:

"stack_data_memory": Name of the virtual StackNode that is created in set_data_nodes() to store the whole data tensor, from which each data node might take just one slice. There should be at most one "stack_data_memory" in the network. To learn more about this, see set_data_nodes() and add_data().

"virtual_result": Name of virtual nodes that are not explicitly part of the network, but are required in some situations during contraction. For instance, the ParamStackNode that results from stacking ParamNodes as the first operation in the network contraction, if the auto_stack mode is set to True. To learn more about this, see ParamStackNode.

"virtual_uniform": Name of the virtual Node or ParamNode that is used in uniform (translationally invariant) tensor networks to store the tensor that will be shared by all leaf nodes. There might be as many "virtual_uniform" nodes as shared memories are used for the leaf nodes in the network (usually just one).

For "virtual_result" and "virtual_uniform", these special behaviours are not restricted to nodes having exactly those names, but also apply to nodes whose names contain those strings. Although these names can in principle be used for other nodes, this can lead to undesired behaviour.
See reset() to learn more about the importance of these reserved node names.

Another thing one must take into account is the naming of Nodes. Since the name of a Node is used to access it from the TensorNetwork, the same name cannot be used by more than one Node. In that case, repeated names get an automatic enumeration of the form "name_{number}" (underscore followed by number), as sketched below. To add a custom enumeration to keep track of the nodes of the network in a user-defined way, one may use brackets or parentheses: "name_({number})". The same automatic enumeration of names occurs for the names of Axes in a Node.
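For instance, a minimal sketch of the automatic enumeration (assuming, as described above, that both repeated names get enumerated):

>>> net = tk.TensorNetwork()
>>> nodeA = tk.Node(shape=(2,), name='my_node', network=net)
>>> nodeB = tk.Node(shape=(2,), name='my_node', network=net)
>>> # Both nodes get the "name_{number}" enumeration
>>> nodeA.name, nodeB.name
('my_node_0', 'my_node_1')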
Refer to the subclasses of AbstractNode to see how to instantiate nodes:
- property tensor#
Node's tensor. It can be a torch.Tensor, a torch.nn.Parameter, or None if the node is empty.
- property network#
TensorNetwork where the node belongs. If the node is moved to another TensorNetwork, the entire connected component of the graph that contains the node will be moved.
- property successors#
Dictionary with Operations' names as keys, and dictionaries of Successors of the node as values. The inner dictionaries use as keys the arguments used when the operation was called.
- property name#
Node's name, used to access the node from the tensor network where it belongs. It cannot contain blank spaces.
- is_leaf()[source]#
Returns a boolean indicating whether the node is a leaf node. These are the nodes that form the TensorNetwork (together with the data nodes). Usually, these will be the trainable nodes. These nodes can store their own tensors or use another node's tensor.
- is_data()[source]#
Returns a boolean indicating whether the node is a data node. These nodes are similar to leaf nodes, but they are never trainable, and are used to store the temporary tensors coming from input data. These nodes can store their own tensors or use another node's tensor.
- is_virtual()[source]#
Returns a boolean indicating whether the node is a virtual node. These nodes are a sort of ancillary, hidden nodes that accomplish some useful task (e.g., in uniform tensor networks a virtual node can store the shared tensor, while all the other nodes in the network just keep a reference to it). These nodes always store their own tensors.
If a virtual node is used as the node storing the shared tensor in a uniform (translationally invariant) TensorNetwork, it is recommended to include the string "virtual_uniform" in the node's name (e.g. "virtual_uniform_mps").
- is_resultant()[source]#
Returns a boolean indicating whether the node is a resultant node. These are nodes that result from an Operation. They are intermediate nodes that (almost always) inherit edges from leaf and data nodes, the ones that really form the network. These nodes can store their own tensors or use another node's tensor.
- is_conj()[source]#
Equivalent to torch.is_conj().
- is_complex()[source]#
Equivalent to torch.is_complex().
- is_floating_point()[source]#
Equivalent to torch.is_floating_point().
- size(axis=None)[source]#
Returns the size of the node's tensor. If axis is specified, returns the size of that axis; otherwise returns the shape of the node (same as shape).
- Parameters:
axis (int, str or Axis, optional) – Axis for which to retrieve the size.
- Return type:
int or torch.Size
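A minimal sketch:

>>> node = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> node.size()
torch.Size([2, 3])
>>> node.size('left')
2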
- is_node1(axis=None)[source]#
Returns the node1 attribute of the axes of the node. If axis is specified, returns only the node1 of that axis; otherwise returns the node1 of all axes of the node.
- Parameters:
axis (int, str or Axis, optional) – Axis for which to retrieve the node1 attribute.
- Return type:
bool or list[bool]
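A minimal sketch (following the connection-order convention described for Axis):

>>> nodeA = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> nodeB = tk.randn(shape=(3, 4), axes_names=('left', 'right'))
>>> _ = nodeA['right'] ^ nodeB['left']
>>> # nodeA was the first operand in the connection
>>> nodeA.is_node1('right')
True
>>> nodeB.is_node1('left')
False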
- neighbours(axis=None)[source]#
Returns the neighbours of the node, the nodes to which it is connected.
If self is a resultant node, this will return the neighbours of the leaf nodes from which self inherits its edges. Therefore, one cannot check whether two resultant nodes are connected by looking into their neighbours lists. To do that, use is_connected_to().
- Parameters:
axis (int, str or Axis, optional) – Axis for which to retrieve the neighbour.
- Return type:
AbstractNode or list[AbstractNode]
Examples
>>> nodeA = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> nodeB = tk.randn(shape=(3, 4), axes_names=('left', 'right'))
>>> nodeC = tk.randn(shape=(4, 5), axes_names=('left', 'right'))
>>> _ = nodeA['right'] ^ nodeB['left']
>>> _ = nodeB['right'] ^ nodeC['left']
>>> set(nodeB.neighbours()) == {nodeA, nodeC}
True
>>> nodeB.neighbours('right') == nodeC
True

Nodes resultant from operations are still connected to the original neighbours.
>>> result = nodeA @ nodeB
>>> result.neighbours('right') == nodeC
True
- get_edge(axis)[source]#
Returns the Edge given the Axis (or its name or num) where it is attached to the node.
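A minimal sketch (get_edge is equivalent to the bracket notation used throughout these examples):

>>> node = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> node.get_edge('left') == node['left']
True
>>> node.get_edge(0) == node['left']
True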
- reattach_edges(axes=None, override=False)[source]#
Substitutes the current edges by copies of them that are attached to the node. It can happen that an edge is not attached to the node if the node is the result of an Operation and, hence, inherits its edges from the operands. In that case, the new copied edges will be attached to the resultant node, replacing each previous node1 or node2 with it (according to the node1 attribute of each axis).

Used for in-place operations like permute_() or split_() and to (de)parameterize nodes.
- Parameters:
axes (list[int, str or Axis] or tuple[int, str or Axis], optional) – The edges attached to these axes will be reattached. If None, all edges will be reattached.
override (bool) – Boolean indicating whether the new, reattached edges should also replace the corresponding edges in the node's neighbours (True). Otherwise, the neighbours' edges will keep pointing to the original nodes from which the current node inherits its edges (False).
Examples
>>> nodeA = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> nodeB = tk.randn(shape=(3, 4), axes_names=('left', 'right'))
>>> nodeC = tk.randn(shape=(4, 5), axes_names=('left', 'right'))
>>> _ = nodeA['right'] ^ nodeB['left']
>>> _ = nodeB['right'] ^ nodeC['left']
>>> result = nodeA @ nodeB

Node result inherits its right edge from nodeB.
>>> result['right'] == nodeB['right']
True

However, nodeB['right'] still connects nodeB and nodeC. There is no reference to result.
>>> result in result['right'].nodes
False

One can reattach its edges so that result's edges do have references to it.
>>> result.reattach_edges()
>>> result in result['right'].nodes
True

If override is True, nodeB['right'] would be replaced by the new result['right'].
- disconnect(axis=None)[source]#
Disconnects all edges of the node if they were connected to other nodes. If axis is specified, only the corresponding edge is disconnected.
- Parameters:
axis (int, str or Axis, optional) – Axis whose edge will be disconnected.
Examples
>>> nodeA = tk.Node(shape=(2, 3), axes_names=('left', 'right'))
>>> nodeB = tk.Node(shape=(3, 4), axes_names=('left', 'right'))
>>> nodeC = tk.Node(shape=(4, 5), axes_names=('left', 'right'))
>>> _ = nodeA['right'] ^ nodeB['left']
>>> _ = nodeB['right'] ^ nodeC['left']
>>> set(nodeB.neighbours()) == {nodeA, nodeC}
True
>>> nodeB.disconnect()
>>> nodeB.neighbours() == []
True
- make_tensor(shape=None, init_method='zeros', device=None, dtype=None, **kwargs)[source]#
Returns a tensor that can be put in the node, initialized according to init_method. By default, it has the same shape as the node.
- Parameters:
shape (list[int], tuple[int] or torch.Size, optional) – Shape of the tensor. If None, the node's shape will be used.
init_method ({"zeros", "ones", "copy", "rand", "randn"}, optional) – Initialization method.
device (torch.device, optional) – Device where to initialize the tensor.
dtype (torch.dtype, optional) – Dtype of the tensor.
kwargs (float) – Keyword arguments for the different initialization methods:
low, high for uniform initialization. See torch.rand().
mean, std for normal initialization. See torch.randn().
- Return type:
torch.Tensor
- Raises:
ValueError – If init_method is not one of "zeros", "ones", "copy", "rand", "randn".
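A minimal sketch (assuming tk.empty creates a node without a tensor):

>>> node = tk.empty(shape=(2, 3))
>>> tensor = node.make_tensor(init_method='ones')
>>> torch.equal(tensor, torch.ones(2, 3))
True
>>> # make_tensor only returns the tensor; the node is still empty
>>> node.tensor is None
True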
- set_tensor(tensor=None, init_method='zeros', device=None, dtype=None, **kwargs)[source]#
Sets a new tensor in the node, or creates one with make_tensor() and sets it. Before setting it, the tensor is cast to the correct type: torch.Tensor for Node and torch.nn.Parameter for ParamNode.

When a tensor is set in the node, it means the node stores it; that is, the node has its own memory address for its tensor, rather than a reference to another node's tensor. Because of this, set_tensor cannot be applied to nodes that have a reference to another node's tensor, since that tensor would also be changed in the referenced node. To overcome this issue, see reset_tensor_address().

This can only be used for non-resultant nodes that store their own tensors. For resultant nodes, tensors are set automatically when computing Operations.

Although this can also be used for data nodes, input data will usually be set into nodes automatically when calling the TensorNetwork.forward() method of TensorNetwork with a data tensor or a sequence of tensors. This method calls TensorNetwork.add_data(), which can also be used to set data tensors into the data nodes.
- Parameters:
tensor (torch.Tensor, optional) – Tensor to be set in the node. If None and init_method is provided, the tensor is created with make_tensor(). Otherwise, None is set as the node's tensor.
init_method ({"zeros", "ones", "copy", "rand", "randn"}, optional) – Initialization method.
device (torch.device, optional) – Device where to initialize the tensor.
dtype (torch.dtype, optional) – Dtype of the tensor.
kwargs (float) – Keyword arguments for the different initialization methods. See make_tensor().
- Raises:
ValueError – If the node is a resultant node or if it does not store its own tensor.
Examples
>>> node = tk.Node(shape=(2, 3), axes_names=('left', 'right'))
...
>>> # Calling set_tensor without arguments uses the
>>> # default init_method ("zeros")
>>> node.set_tensor()
>>> torch.equal(node.tensor, torch.zeros(node.shape))
True
>>> node.set_tensor(init_method='randn', mean=1., std=2., device='cuda')
>>> torch.equal(node.tensor, torch.zeros(node.shape, device='cuda'))
False
>>> node.device
device(type='cuda', index=0)
>>> tensor = torch.randn(2, 3)
>>> node.set_tensor(tensor)
>>> torch.equal(node.tensor, tensor)
True
- unset_tensor()[source]#
Replaces the node's tensor with None. This can only be used for non-resultant nodes that store their own tensors.
Examples
>>> node = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> node.tensor is None
False
>>> node.unset_tensor()
>>> node.tensor is None
True
- set_tensor_from(other)[source]#
Sets the node's tensor as the tensor used by the other node. That is, when setting the tensor this way, the current node will store a reference to the other node's tensor, instead of having its own tensor.

The node and other should both be of the same type (Node or ParamNode). Also, they should be in the same TensorNetwork.
- Parameters:
other (Node or ParamNode) – Node whose tensor is to be set in the current node.
- Raises:
TypeError – If other is of a different type than the current node, or if it is in a different network.
Examples
>>> nodeA = tk.randn(shape=(2, 3),
...                  name='nodeA',
...                  axes_names=('left', 'right'))
>>> nodeB = tk.empty(shape=(2, 3),
...                  name='nodeB',
...                  axes_names=('left', 'right'),
...                  network=nodeA.network)
>>> nodeB.set_tensor_from(nodeA)
>>> print(nodeB.tensor_address())
nodeA

Since nodeB has a reference to nodeA's tensor, if the latter is changed, nodeB will reproduce all the changes.
>>> nodeA.tensor = torch.randn(nodeA.shape)
>>> torch.equal(nodeA.tensor, nodeB.tensor)
True
- reset_tensor_address()[source]#
Resets memory address of node’s tensor to reference the node itself. Thus, the node will store its own tensor, instead of having a reference to other node’s tensor.
Examples
>>> nodeA = tk.randn(shape=(2, 3),
...                  name='nodeA',
...                  axes_names=('left', 'right'))
>>> nodeB = tk.empty(shape=(2, 3),
...                  name='nodeB',
...                  axes_names=('left', 'right'),
...                  network=nodeA.network)
>>> nodeB.set_tensor_from(nodeA)
>>> print(nodeB.tensor_address())
nodeA

Now one cannot set in nodeB a tensor different from the one in nodeA, unless the tensor address is reset in nodeB.
>>> nodeB.reset_tensor_address()
>>> nodeB.tensor = torch.randn(nodeB.shape)
>>> torch.equal(nodeA.tensor, nodeB.tensor)
False
- move_to_network(network, visited=None)[source]#
Moves the node to another network. All other nodes connected to it, directly or through intermediate nodes, are also moved to the new network.
If a node that does not store its own tensor is moved to another network, it will recover the "ownership" of its tensor.
- Parameters:
network (TensorNetwork) – Tensor Network to which the nodes will be moved.
visited (list[AbstractNode], optional) – List indicating the nodes that have been already moved to the new network, used by this DFS-like algorithm.
Examples
>>> net = tk.TensorNetwork()
>>> nodeA = tk.Node(shape=(2, 3),
...                 axes_names=('left', 'right'),
...                 network=net)
>>> nodeB = tk.Node(shape=(3, 4),
...                 axes_names=('left', 'right'),
...                 network=net)
>>> nodeC = tk.Node(shape=(5, 5),
...                 axes_names=('left', 'right'),
...                 network=net)
>>> _ = nodeA['right'] ^ nodeB['left']

If nodeA is moved to another network, nodeB will also be moved, but nodeC will not.
>>> net2 = tk.TensorNetwork()
>>> nodeA.network = net2
>>> nodeA.network == nodeB.network
True
>>> nodeA.network != nodeC.network
True
- sum(axis=None)[source]#
Returns the sum of all elements in the node's tensor. If an axis is specified, the sum is performed over that axis. If axis is a sequence of axes, the reduction is performed over all of them.
This is not a node Operation, hence it returns a torch.Tensor instead of a Node.
See also torch.sum().
- Parameters:
axis (int, str, Axis or list[int, str or Axis], optional) – Axis or sequence of axes over which to reduce.
- Return type:
torch.Tensor
Examples
>>> node = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> node.tensor
tensor([[-0.2799, -0.4383, -0.8387],
        [ 1.6225, -0.3370, -1.2316]])
>>> node.sum()
tensor(-1.5029)
>>> node.sum('left')
tensor([ 1.3427, -0.7752, -2.0704])
- mean(axis=None)[source]#
Returns the mean of all elements in the node's tensor. If an axis is specified, the mean is computed over that axis. If axis is a sequence of axes, the reduction is performed over all of them.
This is not a node Operation, hence it returns a torch.Tensor instead of a Node.
See also torch.mean().
- Parameters:
axis (int, str, Axis or list[int, str or Axis], optional) – Axis or sequence of axes over which to reduce.
- Return type:
torch.Tensor
Examples
>>> node = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> node.tensor
tensor([[ 1.4005, -0.0521, -1.2091],
        [ 1.9844,  0.3513, -0.5920]])
>>> node.mean()
tensor(0.3139)
>>> node.mean('left')
tensor([ 1.6925,  0.1496, -0.9006])
- std(axis=None)[source]#
Returns the standard deviation of all elements in the node's tensor. If an axis is specified, the std is computed over that axis. If axis is a sequence of axes, the reduction is performed over all of them.
This is not a node Operation, hence it returns a torch.Tensor instead of a Node.
See also torch.std().
- Parameters:
axis (int, str, Axis or list[int, str or Axis], optional) – Axis or sequence of axes over which to reduce.
- Return type:
torch.Tensor
Examples
>>> node = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> node.tensor
tensor([[ 0.2111, -0.9551, -0.7812],
        [ 0.2254,  0.3381, -0.2461]])
>>> node.std()
tensor(0.5567)
>>> node.std('left')
tensor([0.0101, 0.9145, 0.3784])
- norm(p=2, axis=None, keepdim=False)[source]#
Returns the norm of all elements in the node's tensor. If an axis is specified, the norm is computed over that axis. If axis is a sequence of axes, the reduction is performed over all of them.
This is not a node Operation, hence it returns a torch.Tensor instead of a Node.
See also torch.norm().
- Parameters:
p (int or float) – Order of the norm (see torch.norm()).
axis (int, str, Axis or list[int, str or Axis], optional) – Axis or sequence of axes over which to reduce.
keepdim (bool) – Boolean indicating whether the reduced dimensions should be retained in the output tensor.
- Return type:
torch.Tensor
Examples
>>> node = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> node.tensor
tensor([[ 1.5570,  1.8441, -0.0743],
        [ 0.4572,  0.7592,  0.6356]])
>>> node.norm()
tensor(2.6495)
>>> node.norm(axis='left')
tensor([1.6227, 1.9942, 0.6399])
- numel()[source]#
Returns the total number of elements in the node’s tensor.
See also torch.numel().
- Return type:
int
Examples
>>> node = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> node.numel()
6
- conj()#
Returns a view of the node’s tensor with a flipped conjugate bit. If the node has a non-complex dtype, this function returns a new node with the same tensor.
See conj in the PyTorch documentation.
- Return type:
Node
Examples
>>> nodeA = tk.randn((3, 3), dtype=torch.complex64)
>>> conjA = nodeA.conj()
>>> conjA.is_conj()
True
- contract_between(node2)#
Contracts all edges shared between two nodes. Batch contraction is automatically performed when both nodes have batch edges with the same names. It can also be performed using the operator @.

Nodes resultant from this operation are called "contract_edges". The node that keeps information about the Successor is self.
- Parameters:
node2 (AbstractNode) – Second node of the contraction. Its non-contracted edges will appear last in the list of inherited edges of the resultant node.
- Return type:
Node
Examples
>>> nodeA = tk.randn(shape=(10, 15, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(15, 7, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeB')
...
>>> _ = nodeA['right'] ^ nodeB['left']
>>> result = nodeA @ nodeB
>>> result.shape
torch.Size([100, 10, 7])
- contract_between_(node2)#
In-place version of contract_between().

Following the PyTorch convention, names of functions ended with an underscore indicate in-place operations.

Nodes resultant from this operation are called "contract_edges_ip".
- Parameters:
node2 (AbstractNode) – Second node of the contraction. Its non-contracted edges will appear last in the list of inherited edges of the resultant node.
- Return type:
Node
Examples
>>> nodeA = tk.randn(shape=(10, 15, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(15, 7, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeB')
...
>>> _ = nodeA['right'] ^ nodeB['left']
>>> result = nodeA.contract_between_(nodeB)
>>> result.shape
torch.Size([100, 10, 7])

nodeA and nodeB have been removed from the network.
>>> nodeA.network is None
True
>>> nodeB.network is None
True
>>> del nodeA
>>> del nodeB
- permute(axes)#
Permutes the node's tensor, as well as its axes and edges, to match the new shape.
See permute in the PyTorch documentation.
Nodes resultant from this operation are called "permute". The node that keeps information about the Successor is self.
Examples
>>> node = tk.randn((2, 5, 7))
>>> result = node.permute((2, 0, 1))
>>> result.shape
torch.Size([7, 2, 5])
- permute_(axes)#
Permutes the node's tensor, as well as its axes and edges, to match the new shape (in-place).
Following the PyTorch convention, names of functions ended with an underscore indicate in-place operations.
See permute.
Nodes resultant from this operation use the same name as node.
- Parameters:
axes (list[int, str or Axis]) – List of axes in the permuted order.
- Returns:
Node
Examples
>>> node = tk.randn((2, 5, 7))
>>> node = node.permute_((2, 0, 1))
>>> node.shape
torch.Size([7, 2, 5])
- renormalize(p=2, axis=None)#
Normalizes the node with the specified norm. That is, the tensor of the node is divided by its norm.
Different norms can be taken, specifying the argument p, and across different dimensions, or node axes, specifying the argument axis.
See also torch.norm().
- Parameters:
p (int or float) – Order of the norm (see torch.norm()).
axis (int, str, Axis or list[int, str or Axis], optional) – Axis or sequence of axes over which to compute the norm.
- Return type:
Node
Examples
>>> nodeA = tk.randn((3, 3))
>>> renormA = nodeA.renormalize()
>>> renormA.norm()
tensor(1.)
- split(node1_axes, node2_axes, mode='svd', side='left', rank=None, cum_percentage=None, cutoff=None)#
Splits one node in two via the decomposition specified by mode. See split() for a more complete explanation.

Since the node is split in two, a new edge appears connecting both nodes. The axis that corresponds to this edge has the name "split".

Nodes resultant from this operation are called "split". The node that keeps information about the Successor is self.
- Parameters:
node1_axes (list[int, str or Axis]) – First set of edges, will appear as the edges of the first (left) resultant node.
node2_axes (list[int, str or Axis]) – Second set of edges, will appear as the edges of the second (right) resultant node.
mode ({"svd", "svdr", "qr", "rq"}) – Decomposition to be used.
side (str, optional) – If mode is "svd" or "svdr", indicates the side to which the diagonal matrix \(S\) should be contracted. If "left", the first resultant node's tensor will be \(US\), and the other node's tensor will be \(V^{\dagger}\). If "right", their tensors will be \(U\) and \(SV^{\dagger}\), respectively.
rank (int, optional) – Number of singular values to keep.
cum_percentage (float, optional) – Proportion that should be satisfied between the sum of all singular values kept and the total sum of all singular values:
\[\frac{\sum_{i \in \{kept\}}{s_i}}{\sum_{i \in \{all\}}{s_i}} \ge cum\_percentage\]
cutoff (float, optional) – Quantity that lower-bounds singular values in order for them to be kept.
- Return type:
tuple[Node, Node]
Examples
>>> node = tk.randn(shape=(10, 15, 100),
...                 axes_names=('left', 'right', 'batch'))
>>> node_left, node_right = node.split(['left'], ['right'],
...                                    mode='svd',
...                                    rank=5)
>>> node_left.shape
torch.Size([100, 10, 5])
>>> node_right.shape
torch.Size([100, 5, 15])
>>> node_left['split']
Edge( split_0[split] <-> split_1[split] )
- split_(node1_axes, node2_axes, mode='svd', side='left', rank=None, cum_percentage=None, cutoff=None)#
In-place version of split().

Following the PyTorch convention, names of functions ended with an underscore indicate in-place operations.

Since the node is split in two, a new edge appears connecting both nodes. The axis that corresponds to this edge has the name "split".

Nodes resultant from this operation are called "split_ip".
- Parameters:
node1_axes (list[int, str or Axis]) – First set of edges, will appear as the edges of the first (left) resultant node.
node2_axes (list[int, str or Axis]) – Second set of edges, will appear as the edges of the second (right) resultant node.
mode ({"svd", "svdr", "qr", "rq"}) – Decomposition to be used.
side (str, optional) – If mode is "svd" or "svdr", indicates the side to which the diagonal matrix \(S\) should be contracted. If "left", the first resultant node's tensor will be \(US\), and the other node's tensor will be \(V^{\dagger}\). If "right", their tensors will be \(U\) and \(SV^{\dagger}\), respectively.
rank (int, optional) – Number of singular values to keep.
cum_percentage (float, optional) – Proportion that should be satisfied between the sum of all singular values kept and the total sum of all singular values:
\[\frac{\sum_{i \in \{kept\}}{s_i}}{\sum_{i \in \{all\}}{s_i}} \ge cum\_percentage\]
cutoff (float, optional) – Quantity that lower-bounds singular values in order for them to be kept.
- Return type:
tuple[Node, Node]
Examples
>>> node = tk.randn(shape=(10, 15, 100),
...                 axes_names=('left', 'right', 'batch'))
>>> node_left, node_right = node.split_(['left'], ['right'],
...                                     mode='svd',
...                                     rank=5)
>>> node_left.shape
torch.Size([100, 10, 5])
>>> node_right.shape
torch.Size([100, 5, 15])
>>> node_left['split']
Edge( split_ip_0[split] <-> split_ip_1[split] )

node has been removed from the network, but it still exists until it is deleted.
>>> node.network is None
True
>>> del node
Node#
- class tensorkrowch.Node(shape=None, axes_names=None, name=None, network=None, data=False, virtual=False, override_node=False, tensor=None, edges=None, override_edges=False, node1_list=None, init_method=None, device=None, dtype=None, **kwargs)[source]#
Base class for non-trainable nodes. Should be subclassed by any class of nodes that are not intended to be trained (e.g. StackNode).

Can be used for fixed nodes of the TensorNetwork, or for intermediate nodes that result from an Operation between nodes.

All 4 types of nodes (leaf, data, virtual and resultant) can be Node. In fact, data and resultant nodes can only be of class Node, since they are not intended to be trainable. To learn more about these 4 types of nodes, see AbstractNode.

For a complete list of properties and methods, see also AbstractNode.
- Parameters:
shape (list[int], tuple[int] or torch.Size, optional) – Node's shape, that is, the shape of its tensor. If shape and init_method are provided, a tensor will be made for the node. Otherwise, tensor would be required.
axes_names (list[str] or tuple[str], optional) – Sequence of names for each of the node's axes. Names are used to access the edge that is attached to the node in a certain axis. Hence, they should all be distinct. They cannot contain blank spaces or special characters. By default, axes names will be "axis_0", …, "axis_n", n being the number of axes. If an axis' name contains the word "batch", it will define a batch edge. The word "stack" cannot be used, since it is reserved for the stack edge of StackNode.
name (str, optional) – Node's name, used to access the node from the TensorNetwork where it belongs. It cannot contain blank spaces. By default, it is the name of the class (e.g. "node", "paramnode").
network (TensorNetwork, optional) – Tensor network where the node should belong. If None, a new tensor network will be created to contain the node.
data (bool) – Boolean indicating whether the node is a data node.
virtual (bool) – Boolean indicating whether the node is a virtual node.
override_node (bool) – Boolean indicating whether the node should override (True) another node in the network that has the same name (e.g. if a node is parameterized, it would be required that a new ParamNode replaces the non-parameterized node in the network).
tensor (torch.Tensor, optional) – Tensor that is to be stored in the node. If None, shape and init_method will be required.
edges (list[Edge], optional) – List of edges that are to be attached to the node. This can be used in case the node inherits the edges from other node(s), like results from Operations.
override_edges (bool) – Boolean indicating whether the provided edges should be overridden (True) when reattached (e.g. if a node is parameterized, it would be required that the new ParamNode's edges are indeed connected to it, instead of to the original non-parameterized node).
node1_list (list[bool], optional) – If edges are provided, the list of node1 attributes of each edge should also be provided.
init_method ({"zeros", "ones", "copy", "rand", "randn"}, optional) – Initialization method.
device (torch.device, optional) – Device where to initialize the tensor if init_method is provided.
dtype (torch.dtype, optional) – Dtype of the tensor if init_method is provided.
kwargs (float) – Keyword arguments for the different initialization methods. See AbstractNode.make_tensor().
Examples
>>> node = tk.Node(shape=(2, 5, 2),
...                axes_names=('left', 'input', 'right'),
...                name='my_node',
...                init_method='randn',
...                mean=0.,
...                std=1.)
>>> node
Node(
    name: my_node
    tensor:
        tensor([[[-1.2517, -1.8147],
                 [-0.7997, -0.0440],
                 [-0.2808,  0.3508],
                 [-1.2380,  0.8859],
                 [-0.3585,  0.8815]],
                [[-0.2898, -2.2775],
                 [ 1.2856, -0.3222],
                 [-0.8911, -0.4216],
                 [ 0.0086,  0.2449],
                 [-2.1998, -1.6295]]])
    axes:
        [left
         input
         right]
    edges:
        [my_node[left] <-> None
         my_node[input] <-> None
         my_node[right] <-> None])
Also, one can use one of the Initializers to simplify:
>>> node = tk.randn((2, 5, 2))
>>> node
Node(
    name: node
    tensor:
        tensor([[[ 0.6545, -0.0445],
                 [-0.9265, -0.2730],
                 [-0.5069, -0.6524],
                 [-0.8227, -1.1211],
                 [ 0.2390,  0.9432]],
                [[ 0.8633,  0.4402],
                 [-0.6982,  0.4461],
                 [-0.0633, -0.9320],
                 [ 1.6023,  0.5406],
                 [ 0.3489, -0.3088]]])
    axes:
        [axis_0
         axis_1
         axis_2]
    edges:
        [node[axis_0] <-> None
         node[axis_1] <-> None
         node[axis_2] <-> None])
- parameterize(set_param=True)[source]#
Replaces the node with a parameterized version of it, that is, turns a fixed Node into a trainable ParamNode.

Since the node is replaced, it will be completely removed from the network, and its neighbours will point to the new parameterized node.
- Parameters:
set_param (bool) – Boolean indicating whether the node should be parameterized (True). Otherwise (False), the non-parameterized node itself will be returned.
- Returns:
The original node or a parameterized version of it.
- Return type:
Node or ParamNode
Examples
>>> nodeA = tk.randn((2, 3))
>>> nodeB = tk.randn((3, 4))
>>> _ = nodeA[1] ^ nodeB[0]
>>> paramnodeA = nodeA.parameterize()
>>> nodeB.neighbours() == [paramnodeA]
True
>>> isinstance(paramnodeA.tensor, torch.nn.Parameter)
True

nodeA still exists and has an edge pointing to nodeB, but the latter does not "see" the former. It should be deleted.
>>> del nodeA

To avoid this issue, one should simply override nodeA:
>>> nodeA = nodeA.parameterize()
- copy(share_tensor=False)[source]#
Returns a copy of the node. That is, returns a node whose tensor is a copy of the original, whose edges are directly inherited (these are not copies, but the exact same edges), and whose name is extended with the suffix "_copy".

To create a copy that has its own (non-inherited) edges, one can use reattach_edges() afterwards.
- Parameters:
share_tensor (bool) – Boolean indicating whether the copied node should store its own copy of the tensor (False) or share it with the original node (True), storing a reference to it.
- Return type:
Node
Examples
>>> node = tk.randn(shape=(2, 3), name='node')
>>> copy = node.copy()
>>> node.tensor_address() != copy.tensor_address()
True
>>> torch.equal(node.tensor, copy.tensor)
True

If the tensor is shared:
>>> copy = node.copy(True)
>>> node.tensor_address() == copy.tensor_address()
True
>>> torch.equal(node.tensor, copy.tensor)
True
- change_type(leaf=False, data=False, virtual=False)[source]#
Changes the node's type, as long as the node is not a resultant node.
- Parameters:
leaf (bool) – Boolean indicating whether the new node type is leaf.
data (bool) – Boolean indicating whether the new node type is data.
virtual (bool) – Boolean indicating whether the new node type is virtual.
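A minimal sketch (assuming the node stores its own tensor, as virtual nodes must):

>>> node = tk.randn(shape=(2, 2))
>>> node.is_leaf()
True
>>> node.change_type(virtual=True)
>>> node.is_virtual()
True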
ParamNode#
- class tensorkrowch.ParamNode(shape=None, axes_names=None, name=None, network=None, virtual=False, override_node=False, tensor=None, edges=None, override_edges=False, node1_list=None, init_method=None, device=None, dtype=None, **kwargs)[source]#
Class for trainable nodes. Should be subclassed by any class of nodes that are intended to be trained (e.g. ParamStackNode).

Should be used for the initial nodes conforming the TensorNetwork, if it is going to be trained. When operating these initial nodes, the resultant nodes will be non-parameterized (e.g. Node, StackNode).

The main difference with Nodes is that ParamNodes have torch.nn.Parameter tensors instead of torch.Tensor. Therefore, a ParamNode is a sort of parameter that is attached to the TensorNetwork (which is itself a torch.nn.Module). That is, the list of parameters of the tensor network module contains the tensors of all ParamNodes.

ParamNodes can only be leaf and virtual (e.g. a virtual node used in a uniform TensorNetwork to store the tensor that is shared by all the trainable nodes must also be a ParamNode, since it stores a torch.nn.Parameter).
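Since the TensorNetwork is a torch.nn.Module, the tensors of its ParamNodes appear in its parameter list. A minimal sketch (assuming a network containing a single param-node):

>>> import torch
>>> paramnode = tk.randn((2, 3), param_node=True)
>>> isinstance(paramnode.tensor, torch.nn.Parameter)
True
>>> # The param-node's tensor is registered in its network's parameters
>>> len(list(paramnode.network.parameters()))
1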
For a complete list of properties and methods, see also AbstractNode.
- Parameters:
shape (list[int], tuple[int] or torch.Size, optional) – Node's shape, that is, the shape of its tensor. If shape and init_method are provided, a tensor will be made for the node. Otherwise, tensor would be required.
axes_names (list[str] or tuple[str], optional) – Sequence of names for each of the node's axes. Names are used to access the edge that is attached to the node in a certain axis. Hence, they should all be distinct. They cannot contain blank spaces or special characters. By default, axes names will be "axis_0", …, "axis_n", n being the number of axes. If an axis' name contains the word "batch", it will define a batch edge. The word "stack" cannot be used, since it is reserved for the stack edge of StackNode.
name (str, optional) – Node's name, used to access the node from the TensorNetwork where it belongs. It cannot contain blank spaces. By default, it is the name of the class (e.g. "node", "paramnode").
network (TensorNetwork, optional) – Tensor network where the node should belong. If None, a new tensor network will be created to contain the node.
virtual (bool) – Boolean indicating whether the node is a virtual node.
override_node (bool) – Boolean indicating whether the node should override (True) another node in the network that has the same name (e.g. if a node is parameterized, it would be required that a new ParamNode replaces the non-parameterized node in the network).
tensor (torch.Tensor, optional) – Tensor that is to be stored in the node. If None, shape and init_method will be required.
edges (list[Edge], optional) – List of edges that are to be attached to the node. This can be used in case the node inherits the edges from other node(s), like results from Operations.
override_edges (bool) – Boolean indicating whether the provided edges should be overridden (True) when reattached (e.g. if a node is parameterized, it would be required that the new ParamNode's edges are indeed connected to it, instead of to the original non-parameterized node).
node1_list (list[bool], optional) – If edges are provided, the list of node1 attributes of each edge should also be provided.
init_method ({"zeros", "ones", "copy", "rand", "randn"}, optional) – Initialization method.
device (torch.device, optional) – Device where to initialize the tensor if init_method is provided.
dtype (torch.dtype, optional) – Dtype of the tensor if init_method is provided.
kwargs (float) – Keyword arguments for the different initialization methods. See AbstractNode.make_tensor().
Examples
>>> node = tk.ParamNode(shape=(2, 5, 2),
...                     axes_names=('left', 'input', 'right'),
...                     name='my_paramnode',
...                     init_method='randn',
...                     mean=0.,
...                     std=1.)
>>> node
ParamNode(
    name: my_paramnode
    tensor:
        Parameter containing:
        tensor([[[ 1.8090, -0.1371],
                 [-0.0501, -1.0371],
                 [ 1.4588, -0.8361],
                 [-0.4974, -1.9957],
                 [ 0.3760, -1.0412]],
                [[ 0.3393, -0.2503],
                 [ 1.7752, -0.0188],
                 [-0.9561, -0.0806],
                 [-1.0465, -0.5731],
                 [ 1.5021,  0.4181]]], requires_grad=True)
    axes:
        [left
         input
         right]
    edges:
        [my_paramnode[left] <-> None
         my_paramnode[input] <-> None
         my_paramnode[right] <-> None])
Also, one can use one of the Initializers to simplify:
>>> node = tk.randn((2, 5, 2),
...                 param_node=True)
>>> node
ParamNode(
    name: paramnode
    tensor:
        Parameter containing:
        tensor([[[-0.8442,  1.4184],
                 [ 0.4431, -1.4385],
                 [-0.5161, -0.6492],
                 [ 0.2095,  0.5760],
                 [-0.9925, -1.5797]],
                [[-0.8649, -0.5401],
                 [-0.1091,  1.1654],
                 [-0.3821, -0.2477],
                 [-0.7688, -2.4731],
                 [-0.0234,  0.9618]]], requires_grad=True)
    axes:
        [axis_0
         axis_1
         axis_2]
    edges:
        [paramnode[axis_0] <-> None
         paramnode[axis_1] <-> None
         paramnode[axis_2] <-> None])
- property grad#
Returns gradient of the param-node’s tensor.
See also torch.Tensor.grad
- Return type:
torch.Tensor or None
Examples
>>> paramnode = tk.randn((2, 3), param_node=True)
>>> paramnode.tensor
Parameter containing:
tensor([[-0.3340,  0.6811, -0.2866],
        [ 1.3371,  1.4761,  0.6551]], requires_grad=True)
>>> paramnode.sum().backward()
>>> paramnode.grad
tensor([[1., 1., 1.],
        [1., 1., 1.]])
- parameterize(set_param=True)[source]#
Replaces the param-node with a de-parameterized version of it, that is, turns a ParamNode into a non-trainable, fixed Node.

Since the param-node is replaced, it will be completely removed from the network, and its neighbours will point to the new node.
- Parameters:
set_param (bool) – Boolean indicating whether the node should stay parameterized (True), thus returning the param-node itself. Otherwise (False), the param-node will be de-parameterized.
- Returns:
The original param-node or a de-parameterized version of it.
- Return type:
Node or ParamNode
Examples
>>> paramnodeA = tk.randn((2, 3), param_node=True)
>>> paramnodeB = tk.randn((3, 4), param_node=True)
>>> _ = paramnodeA[1] ^ paramnodeB[0]
>>> nodeA = paramnodeA.parameterize(False)
>>> paramnodeB.neighbours() == [nodeA]
True
>>> isinstance(nodeA.tensor, torch.nn.Parameter)
False

paramnodeA still exists and has an edge pointing to paramnodeB, but the latter does not "see" the former. It should be deleted.
>>> del paramnodeA

To avoid this issue, one should simply override paramnodeA:
>>> paramnodeA = paramnodeA.parameterize()
- copy(share_tensor=False)[source]#
Returns a copy of the param-node. That is, returns a param-node whose tensor is a copy of the original, whose edges are directly inherited (these are not copies, but the exact same edges), and whose name is extended with the suffix "_copy".

To create a copy that has its own (non-inherited) edges, one can use reattach_edges() afterwards.
- Parameters:
share_tensor (bool) – Boolean indicating whether the copied param-node should store its own copy of the tensor (False) or share it with the original param-node (True), storing a reference to it.
- Return type:
ParamNode
Examples
>>> paramnode = tk.randn(shape=(2, 3), name='node', param_node=True)
>>> copy = paramnode.copy()
>>> paramnode.tensor_address() != copy.tensor_address()
True
>>> torch.equal(paramnode.tensor, copy.tensor)
True

If the tensor is shared:
>>> copy = paramnode.copy(True)
>>> paramnode.tensor_address() == copy.tensor_address()
True
>>> torch.equal(paramnode.tensor, copy.tensor)
True
StackNode#
- class tensorkrowch.StackNode(nodes=None, axes_names=None, name=None, network=None, override_node=False, tensor=None, edges=None, node1_list=None)[source]#
Class for stacked nodes. StackNodes are nodes that store the information of a list of nodes that are stacked via stack(), although they can also be instantiated directly. To do so, there are two options:

Provide a sequence of nodes: if nodes are provided, their tensors will be stacked and stored in the StackNode. It is necessary that all nodes are of the same class (Node or ParamNode), have the same rank (although the dimension of each leg can differ between nodes, in which case smaller tensors are extended with 0's to match the dimensions of the largest tensor in the stack), have the same axes names (to ensure that only the "same kind" of nodes are stacked), belong to the same network, and have edges of the same type in each axis (Edge or ParamEdge).

Provide a stacked tensor: if the stacked tensor is provided, it is also necessary to specify the axes_names, network, edges and node1_list.

StackNodes have an additional axis for the new stack dimension, which is a batch edge. This way, some contractions can be computed in parallel by first stacking two sequences of nodes (connected pair-wise), performing the batch contraction, and finally unbinding the StackNodes to retrieve just one sequence of nodes.

For the rest of the axes, a list of the edges corresponding to all nodes in the stack is stored, so that, when unbinding the stack, it can be inferred to which nodes the unbound nodes have to be connected.
- Parameters:
nodes (list[AbstractNode] or tuple[AbstractNode], optional) – Sequence of nodes that are to be stacked. They should all be of the same class (Node or ParamNode), have the same rank, the same axes names, and belong to the same network. They do not need to have equal shapes.
axes_names (list[str] or tuple[str], optional) – Sequence of names for each of the node's axes. Names are used to access the edge that is attached to the node in a certain axis. Hence, they should all be distinct. Necessary if nodes are not provided.
name (str, optional) – Node's name, used to access the node from the TensorNetwork where it belongs. It cannot contain blank spaces.
network (TensorNetwork, optional) – Tensor network where the node should belong. Necessary if nodes are not provided.
override_node (bool, optional) – Boolean indicating whether the node should override (True) another node in the network that has the same name (e.g. if a node is parameterized, it would be required that a new ParamNode replaces the non-parameterized node in the network).
tensor (torch.Tensor, optional) – Tensor that is to be stored in the node. Necessary if nodes are not provided.
edges (list[Edge], optional) – List of edges that are to be attached to the node. Necessary if nodes are not provided.
node1_list (list[bool], optional) – If edges are provided, the list of node1 attributes of each edge should also be provided. Necessary if nodes are not provided.
Examples
>>> net = tk.TensorNetwork()
>>> nodes = [tk.randn(shape=(2, 4, 2),
...                   axes_names=('left', 'input', 'right'),
...                   network=net)
...          for _ in range(10)]
>>> data = [tk.randn(shape=(4,),
...                  axes_names=('feature',),
...                  network=net)
...         for _ in range(10)]
...
>>> for i in range(10):
...     _ = nodes[i]['input'] ^ data[i]['feature']
...
>>> stack_nodes = tk.stack(nodes)
>>> stack_data = tk.stack(data)
...
>>> # It is necessary to re-connect stacks
>>> _ = stack_nodes['input'] ^ stack_data['feature']
>>> result = tk.unbind(stack_nodes @ stack_data)
>>> print(result[0].name)
unbind_0
>>> result[0].axes
[Axis( left (0) ), Axis( right (1) )]
>>> result[0].shape
torch.Size([2, 2])
- property edges_dict#
Returns a dictionary where the keys are the axes. For each axis, the value is the list of all the edges (one from each node) that correspond to that axis.
- property node1_lists_dict#
Returns a dictionary where the keys are the axes. For each axis, the value is the list with the node1 attribute of that axis for all nodes.
- reconnect(other)[source]#
Re-connects the StackNode to another (Param)StackNode, in the axes where the original stacked nodes were already connected.
- unbind()#
Unbinds a StackNode or ParamStackNode, where the first dimension is assumed to be the stack dimension.

If auto_unbind() is set to False, each resultant node will store its own tensor. Otherwise, they will have only a reference to the corresponding slice of the (Param)StackNode.

See TensorNetwork to learn how the auto_unbind mode affects the computation of unbind().

Nodes resultant from this operation are called "unbind". The node that keeps information about the Successor is self.
- Return type:
list[Node]
Examples
>>> net = tk.TensorNetwork()
>>> nodes = [tk.randn(shape=(2, 4, 2),
...                   axes_names=('left', 'input', 'right'),
...                   network=net)
...          for _ in range(10)]
>>> data = [tk.randn(shape=(4,),
...                  axes_names=('feature',),
...                  network=net)
...         for _ in range(10)]
...
>>> for i in range(10):
...     _ = nodes[i]['input'] ^ data[i]['feature']
...
>>> stack_nodes = tk.stack(nodes)
>>> stack_data = tk.stack(data)
...
>>> # It is necessary to re-connect stacks
>>> _ = stack_nodes['input'] ^ stack_data['feature']
>>> result = stack_nodes @ stack_data
>>> result = result.unbind()
>>> print(result[0].name)
unbind_0
>>> result[0].axes
[Axis( left (0) ), Axis( right (1) )]
>>> result[0].shape
torch.Size([2, 2])
ParamStackNode#
- class tensorkrowch.ParamStackNode(nodes, name=None, virtual=False, override_node=False)[source]#
Class for parametric stacked nodes. They are essentially the same as StackNodes, but they are ParamNodes.

They are used to optimize memory usage and save some time when the first operation applied to param-nodes in a contraction (that might be computed several times during training) is stack(). If this is the case, the param-nodes no longer store their own tensors; rather, they make reference to a slice of a greater ParamStackNode (if the auto_stack attribute of the TensorNetwork is set to True). Hence, that first stack() is never actually computed.

The ParamStackNode that results from this process has the name "virtual_result_stack", which contains the reserved name "virtual_result", as explained here. This node stores the tensor from which all the stacked ParamNodes just take one slice.

This behaviour occurs when stacking param-nodes via stack(), not when instantiating ParamStackNode manually. ParamStackNodes can only be instantiated by providing a sequence of nodes.
- Parameters:
nodes (list[AbstractNode] or tuple[AbstractNode]) – Sequence of nodes that are to be stacked. They should all be of the same class (Node or ParamNode), have the same rank, the same axes names, and belong to the same network. They do not need to have equal shapes.
name (str, optional) – Node's name, used to access the node from the TensorNetwork where it belongs. It cannot contain blank spaces.
virtual (bool, optional) – Boolean indicating whether the node is a virtual node. Since it will be used mainly for the case described here, the node will be virtual, and it will not be an effective part of the tensor network.
override_node (bool, optional) – Boolean indicating whether the node should override (True) another node in the network that has the same name (e.g. if a node is parameterized, it would be required that a new ParamNode replaces the non-parameterized node in the network).
Examples
>>> net = tk.TensorNetwork()
>>> net.auto_stack = True
>>> nodes = [tk.randn(shape=(2, 4, 2),
...                   axes_names=('left', 'input', 'right'),
...                   network=net,
...                   param_node=True)
...          for _ in range(10)]
>>> data = [tk.randn(shape=(4,),
...                  axes_names=('feature',),
...                  network=net)
...         for _ in range(10)]
...
>>> for i in range(10):
...     _ = nodes[i]['input'] ^ data[i]['feature']
...
>>> stack_nodes = tk.stack(nodes)
>>> stack_nodes.name = 'my_stack'
>>> print(nodes[0].tensor_address())
my_stack
>>> stack_data = tk.stack(data)
...
>>> # It is necessary to re-connect stacks
>>> _ = stack_nodes['input'] ^ stack_data['feature']
>>> result = tk.unbind(stack_nodes @ stack_data)
>>> print(result[0].name)
unbind_0
>>> print(result[0].axes)
[Axis( left (0) ), Axis( right (1) )]
>>> print(result[0].shape)
torch.Size([2, 2])
- property edges_dict#
Returns a dictionary where the keys are the axes. For each axis, the value is the list of all the edges (one from each node) that correspond to that axis.
- property node1_lists_dict#
Returns a dictionary where the keys are the axes. For each axis, the value is the list with the node1 attribute of that axis for all nodes.
- reconnect(other)[source]#
Re-connects the StackNode to another (Param)StackNode, in the axes where the original stacked nodes were already connected.
- unbind()#
Unbinds a StackNode or ParamStackNode, where the first dimension is assumed to be the stack dimension.

If auto_unbind() is set to False, each resultant node will store its own tensor. Otherwise, they will have only a reference to the corresponding slice of the (Param)StackNode.

See TensorNetwork to learn how the auto_unbind mode affects the computation of unbind().

Nodes resultant from this operation are called "unbind". The node that keeps information about the Successor is self.
- Return type:
list[Node]
Examples
>>> net = tk.TensorNetwork()
>>> nodes = [tk.randn(shape=(2, 4, 2),
...                   axes_names=('left', 'input', 'right'),
...                   network=net)
...          for _ in range(10)]
>>> data = [tk.randn(shape=(4,),
...                  axes_names=('feature',),
...                  network=net)
...         for _ in range(10)]
...
>>> for i in range(10):
...     _ = nodes[i]['input'] ^ data[i]['feature']
...
>>> stack_nodes = tk.stack(nodes)
>>> stack_data = tk.stack(data)
...
>>> # It is necessary to re-connect stacks
>>> _ = stack_nodes['input'] ^ stack_data['feature']
>>> result = stack_nodes @ stack_data
>>> result = result.unbind()
>>> print(result[0].name)
unbind_0
>>> result[0].axes
[Axis( left (0) ), Axis( right (1) )]
>>> result[0].shape
torch.Size([2, 2])
Edges#
Edge#
- class tensorkrowch.Edge(node1, axis1, node2=None, axis2=None)[source]#
Base class for edges. Should be subclassed by any new class of edges.
An edge is nothing more than an object that wraps references to the nodes it connects. Thus, it stores information like the nodes it connects, the corresponding nodes' axes it is attached to, whether it is dangling or a batch edge, its size, etc.

Above all, its importance lies in the fact that edges enable connecting nodes to form any possible graph, and easily performing Operations like contracting and splitting nodes.

Furthermore, edges have specific operations like contract() or svd() (and their variations), as well as in-place versions of them (contract_(), svd_(), etc.), which allow in-place modification of the TensorNetwork.
- Parameters:
node1 (AbstractNode) – First node to which the edge is connected.
axis1 (int, str or Axis) – Axis of node1 where the edge is attached.

node2 (AbstractNode, optional) – Second node to which the edge is connected. If None, the edge will be dangling.

axis2 (int, str or Axis, optional) – Axis of node2 where the edge is attached.
Examples
>>> nodeA = tk.randn((2, 3))
>>> nodeB = tk.randn((3, 4))
>>> _ = nodeA[1] ^ nodeB[0]
>>> nodeA[0]
Edge( node_0[axis_0] <-> None ) (Dangling Edge)

>>> nodeA[1]
Edge( node_0[axis_1] <-> node_1[axis_0] )

>>> nodeB[1]
Edge( node_1[axis_1] <-> None ) (Dangling Edge)
- property node1#
Returns node1 of the edge.
- property node2#
Returns node2 of the edge. If the edge is dangling, it is None.
- property nodes#
Returns a list with node1 and node2.
- property axis1#
Returns the axis where the edge is attached to node1.
- property axis2#
Returns the axis where the edge is attached to node2. If the edge is dangling, it is None.
- property axes#
Returns a list of axes where the edge is attached to node1 and node2, respectively.
- property name#
Returns the edge's name. It is formed from the corresponding nodes' and axes' names.
Examples
>>> nodeA = tk.Node(shape=(2, 3),
...                 name='nodeA',
...                 axes_names=['left', 'right'])
>>> edge = nodeA['right']
>>> print(edge.name)
nodeA[right] <-> None

>>> nodeB = tk.Node(shape=(3, 4),
...                 name='nodeB',
...                 axes_names=['left', 'right'])
>>> _ = new_edge = nodeA['right'] ^ nodeB['left']
>>> print(new_edge.name)
nodeA[right] <-> nodeB[left]
- change_size(size)[source]#
Changes the size of the edge, thus changing the size of the tensors of node1 and node2 at the corresponding axes. If the new size is smaller, the tensor will be cropped; if larger, the tensor will be expanded with zeros. In both cases, the process (cropping/expanding) occurs at the "right", "bottom", "back", etc. of each dimension.

- Parameters:
size (int) – New size of the edge.
Examples
>>> nodeA = tk.ones((2, 3))
>>> nodeB = tk.ones((3, 4))
>>> _ = edge = nodeA[1] ^ nodeB[0]
>>> edge.size()
3

>>> edge.change_size(4)
>>> nodeA.tensor
tensor([[1., 1., 1., 0.],
        [1., 1., 1., 0.]])

>>> nodeB.tensor
tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [0., 0., 0., 0.]])

>>> edge.size()
4

>>> edge.change_size(2)
>>> nodeA.tensor
tensor([[1., 1.],
        [1., 1.]])

>>> nodeB.tensor
tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.]])

>>> edge.size()
2
- copy()[source]#
Returns a copy of the edge, that is, a new edge referencing the same nodes at the same axes.
Examples
>>> nodeA = tk.randn((2, 3))
>>> nodeB = tk.randn((3, 4))
>>> _ = edge = nodeA[1] ^ nodeB[0]
>>> copy = edge.copy()
>>> copy != edge
True

>>> copy.is_attached_to(nodeA)
True

>>> copy.is_attached_to(nodeB)
True
- connect(other)[source]#
Connects a dangling edge to another dangling edge. Both edges must have the same size so that contractions along that edge can be computed.
Note that this connects edges from leaf (or data, virtual) nodes, but never from resultant nodes. If one tries to connect one of the inherited edges of a resultant node, the new connected edge will be attached to the original leaf nodes from which the resultant node inherited its edges. Hence, the resultant node will not "see" the connection until the TensorNetwork is reset().

If the nodes that are being connected come from different networks, node2 (and its connected component) will be moved to node1's network. See also move_to_network().
other (Edge) – The other edge to which the current edge will be connected.
- Return type:
Examples
To connect two edges, the overloaded operator ^ can also be used.

>>> nodeA = tk.Node(shape=(2, 3),
...                 name='nodeA',
...                 axes_names=('left', 'right'))
>>> nodeB = tk.Node(shape=(3, 4),
...                 name='nodeB',
...                 axes_names=('left', 'right'))
>>> _ = new_edge = nodeA['right'] ^ nodeB['left']  # Same as .connect()
>>> print(new_edge.name)
nodeA[right] <-> nodeB[left]
- disconnect()[source]#
Disconnects a connected edge, that is, splits it into two dangling edges, one for each node.
Examples
To disconnect an edge, the overloaded operator | can also be used.

>>> nodeA = tk.Node(shape=(2, 3),
...                 name='nodeA',
...                 axes_names=('left', 'right'))
>>> nodeB = tk.Node(shape=(3, 4),
...                 name='nodeB',
...                 axes_names=('left', 'right'))
>>> _ = new_edge = nodeA['right'] ^ nodeB['left']
>>> new_edgeA, new_edgeB = new_edge | new_edge  # Same as .disconnect()
>>> print(new_edgeA.name)
nodeA[right] <-> None

>>> print(new_edgeB.name)
nodeB[left] <-> None
- contract()#
Contracts the nodes that are connected through the edge.
This only works if the nodes connected through the edge are leaf nodes. Otherwise, this will perform the contraction between the leaf nodes that were connected through this edge.

Nodes resultant from this operation are called "contract_edges". The node that keeps information about the Successor is self.node1.

- Return type:
Examples
>>> nodeA = tk.randn(shape=(10, 15, 20),
...                  axes_names=('one', 'two', 'three'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(10, 15, 20),
...                  axes_names=('one', 'two', 'three'),
...                  name='nodeB')
...
>>> _ = nodeA['one'] ^ nodeB['one']
>>> _ = nodeA['two'] ^ nodeB['two']
>>> _ = nodeA['three'] ^ nodeB['three']
>>> result = nodeA['one'].contract()
>>> result.shape
torch.Size([15, 20, 15, 20])
- contract_()#
In-place version of contract().

Following the PyTorch convention, names of functions ending with an underscore denote in-place operations.

Nodes resultant from this operation are called "contract_edges_ip".

- Return type:
Examples
>>> nodeA = tk.randn(shape=(10, 15, 20),
...                  axes_names=('one', 'two', 'three'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(10, 15, 20),
...                  axes_names=('one', 'two', 'three'),
...                  name='nodeB')
...
>>> _ = nodeA['one'] ^ nodeB['one']
>>> _ = nodeA['two'] ^ nodeB['two']
>>> _ = nodeA['three'] ^ nodeB['three']
>>> result = nodeA['one'].contract_()
>>> result.shape
torch.Size([15, 20, 15, 20])

nodeA and nodeB have been removed from the network.

>>> nodeA.network is None
True

>>> nodeB.network is None
True

>>> del nodeA
>>> del nodeB
- qr()#
Contracts an edge via contract() and splits it via split() using mode = "qr". See split() for a more complete explanation.

This only works if the nodes connected through the edge are leaf nodes. Otherwise, this will perform the contraction between the leaf nodes that were connected through this edge.

Examples
>>> nodeA = tk.randn(shape=(10, 15, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(15, 20, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeB')
...
>>> new_edge = nodeA['right'] ^ nodeB['left']
>>> new_nodeA, new_nodeB = new_edge.qr()
...
>>> new_nodeA.shape
torch.Size([10, 10, 100])

>>> new_nodeB.shape
torch.Size([10, 20, 100])

>>> print(new_nodeA.axes_names)
['left', 'right', 'batch']

>>> print(new_nodeB.axes_names)
['left', 'right', 'batch']
The original nodes still exist in the network:

>>> assert nodeA.network == new_nodeA.network
>>> assert nodeB.network == new_nodeB.network
- qr_()#
In-place version of qr().

Contracts an edge in-place via contract_() and splits it in-place via split_() using mode = "qr". See split() for a more complete explanation.

Following the PyTorch convention, names of functions ending with an underscore denote in-place operations.

Nodes resultant from this operation use the same names as the original nodes connected by self.

Examples
>>> nodeA = tk.randn(shape=(10, 15, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(15, 20, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeB')
...
>>> new_edge = nodeA['right'] ^ nodeB['left']
>>> nodeA, nodeB = new_edge.qr_()
...
>>> nodeA.shape
torch.Size([10, 10, 100])

>>> nodeB.shape
torch.Size([10, 20, 100])

>>> print(nodeA.axes_names)
['left', 'right', 'batch']

>>> print(nodeB.axes_names)
['left', 'right', 'batch']
- rq()#
Contracts an edge via contract() and splits it via split() using mode = "rq". See split() for a more complete explanation.

This only works if the nodes connected through the edge are leaf nodes. Otherwise, this will perform the contraction between the leaf nodes that were connected through this edge.

Examples
>>> nodeA = tk.randn(shape=(10, 15, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(15, 20, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeB')
...
>>> new_edge = nodeA['right'] ^ nodeB['left']
>>> new_nodeA, new_nodeB = new_edge.rq()
...
>>> new_nodeA.shape
torch.Size([10, 10, 100])

>>> new_nodeB.shape
torch.Size([10, 20, 100])

>>> print(new_nodeA.axes_names)
['left', 'right', 'batch']

>>> print(new_nodeB.axes_names)
['left', 'right', 'batch']
The original nodes still exist in the network:

>>> assert nodeA.network == new_nodeA.network
>>> assert nodeB.network == new_nodeB.network
- rq_()#
In-place version of rq().

Contracts an edge in-place via contract_() and splits it in-place via split_() using mode = "rq". See split() for a more complete explanation.

Following the PyTorch convention, names of functions ending with an underscore denote in-place operations.

Nodes resultant from this operation use the same names as the original nodes connected by self.

Examples
>>> nodeA = tk.randn(shape=(10, 15, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(15, 20, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeB')
...
>>> new_edge = nodeA['right'] ^ nodeB['left']
>>> nodeA, nodeB = tk.rq_(new_edge)
...
>>> nodeA.shape
torch.Size([10, 10, 100])

>>> nodeB.shape
torch.Size([10, 20, 100])

>>> print(nodeA.axes_names)
['left', 'right', 'batch']

>>> print(nodeB.axes_names)
['left', 'right', 'batch']
- svd(side='left', rank=None, cum_percentage=None, cutoff=None)#
Contracts an edge via contract() and splits it via split() using mode = "svd". See split() for a more complete explanation.

This only works if the nodes connected through the edge are leaf nodes. Otherwise, this will perform the contraction between the leaf nodes that were connected through this edge.

- Parameters:
side (str, optional) – Indicates the side to which the diagonal matrix \(S\) should be contracted. If “left”, the first resultant node’s tensor will be \(US\), and the other node’s tensor will be \(V^{\dagger}\). If “right”, their tensors will be \(U\) and \(SV^{\dagger}\), respectively.
rank (int, optional) – Number of singular values to keep.
cum_percentage (float, optional) –
Proportion that should be satisfied between the sum of all singular values kept and the total sum of all singular values.
\[\frac{\sum_{i \in \{kept\}}{s_i}}{\sum_{i \in \{all\}}{s_i}} \ge cum\_percentage\]

cutoff (float, optional) – Quantity that lower-bounds the singular values in order for them to be kept.
- Return type:
Examples
>>> nodeA = tk.randn(shape=(10, 15, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(15, 20, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeB')
...
>>> new_edge = nodeA['right'] ^ nodeB['left']
>>> new_nodeA, new_nodeB = new_edge.svd(rank=7)
...
>>> new_nodeA.shape
torch.Size([10, 7, 100])

>>> new_nodeB.shape
torch.Size([7, 20, 100])

>>> print(new_nodeA.axes_names)
['left', 'right', 'batch']

>>> print(new_nodeB.axes_names)
['left', 'right', 'batch']
The original nodes still exist in the network:

>>> assert nodeA.network == new_nodeA.network
>>> assert nodeB.network == new_nodeB.network
- svd_(side='left', rank=None, cum_percentage=None, cutoff=None)#
In-place version of svd().

Contracts an edge in-place via contract_() and splits it in-place via split_() using mode = "svd". See split() for a more complete explanation.

Following the PyTorch convention, names of functions ending with an underscore denote in-place operations.

Nodes resultant from this operation use the same names as the original nodes connected by self.

- Parameters:
side (str, optional) – Indicates the side to which the diagonal matrix \(S\) should be contracted. If “left”, the first resultant node’s tensor will be \(US\), and the other node’s tensor will be \(V^{\dagger}\). If “right”, their tensors will be \(U\) and \(SV^{\dagger}\), respectively.
rank (int, optional) – Number of singular values to keep.
cum_percentage (float, optional) –
Proportion that should be satisfied between the sum of all singular values kept and the total sum of all singular values.
\[\frac{\sum_{i \in \{kept\}}{s_i}}{\sum_{i \in \{all\}}{s_i}} \ge cum\_percentage\]

cutoff (float, optional) – Quantity that lower-bounds the singular values in order for them to be kept.
- Return type:
Examples
>>> nodeA = tk.randn(shape=(10, 15, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(15, 20, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeB')
...
>>> new_edge = nodeA['right'] ^ nodeB['left']
>>> nodeA, nodeB = new_edge.svd_(rank=7)
...
>>> nodeA.shape
torch.Size([10, 7, 100])

>>> nodeB.shape
torch.Size([7, 20, 100])

>>> print(nodeA.axes_names)
['left', 'right', 'batch']

>>> print(nodeB.axes_names)
['left', 'right', 'batch']
- svdr(side='left', rank=None, cum_percentage=None, cutoff=None)#
Contracts an edge via contract() and splits it via split() using mode = "svdr". See split() for a more complete explanation.

This only works if the nodes connected through the edge are leaf nodes. Otherwise, this will perform the contraction between the leaf nodes that were connected through this edge.

- Parameters:
side (str, optional) – Indicates the side to which the diagonal matrix \(S\) should be contracted. If “left”, the first resultant node’s tensor will be \(US\), and the other node’s tensor will be \(V^{\dagger}\). If “right”, their tensors will be \(U\) and \(SV^{\dagger}\), respectively.
rank (int, optional) – Number of singular values to keep.
cum_percentage (float, optional) –
Proportion that should be satisfied between the sum of all singular values kept and the total sum of all singular values.
\[\frac{\sum_{i \in \{kept\}}{s_i}}{\sum_{i \in \{all\}}{s_i}} \ge cum\_percentage\]

cutoff (float, optional) – Quantity that lower-bounds the singular values in order for them to be kept.
- Return type:
Examples
>>> nodeA = tk.randn(shape=(10, 15, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(15, 20, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeB')
...
>>> new_edge = nodeA['right'] ^ nodeB['left']
>>> new_nodeA, new_nodeB = new_edge.svdr(rank=7)
...
>>> new_nodeA.shape
torch.Size([10, 7, 100])

>>> new_nodeB.shape
torch.Size([7, 20, 100])

>>> print(new_nodeA.axes_names)
['left', 'right', 'batch']

>>> print(new_nodeB.axes_names)
['left', 'right', 'batch']
The original nodes still exist in the network:

>>> assert nodeA.network == new_nodeA.network
>>> assert nodeB.network == new_nodeB.network
- svdr_(side='left', rank=None, cum_percentage=None, cutoff=None)#
In-place version of svdr().

Contracts an edge in-place via contract_() and splits it in-place via split_() using mode = "svdr". See split() for a more complete explanation.

Following the PyTorch convention, names of functions ending with an underscore denote in-place operations.

Nodes resultant from this operation use the same names as the original nodes connected by self.

- Parameters:
side (str, optional) – Indicates the side to which the diagonal matrix \(S\) should be contracted. If “left”, the first resultant node’s tensor will be \(US\), and the other node’s tensor will be \(V^{\dagger}\). If “right”, their tensors will be \(U\) and \(SV^{\dagger}\), respectively.
rank (int, optional) – Number of singular values to keep.
cum_percentage (float, optional) –
Proportion that should be satisfied between the sum of all singular values kept and the total sum of all singular values.
\[\frac{\sum_{i \in \{kept\}}{s_i}}{\sum_{i \in \{all\}}{s_i}} \ge cum\_percentage\]

cutoff (float, optional) – Quantity that lower-bounds the singular values in order for them to be kept.
- Return type:
Examples
>>> nodeA = tk.randn(shape=(10, 15, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(15, 20, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeB')
...
>>> new_edge = nodeA['right'] ^ nodeB['left']
>>> nodeA, nodeB = new_edge.svdr_(rank=7)
...
>>> nodeA.shape
torch.Size([10, 7, 100])

>>> nodeB.shape
torch.Size([7, 20, 100])

>>> print(nodeA.axes_names)
['left', 'right', 'batch']

>>> print(nodeB.axes_names)
['left', 'right', 'batch']
StackEdge#
- class tensorkrowch.StackEdge(edges, node1_list, node1, axis1, node2=None, axis2=None)[source]#
Class for stacked edges. They are just like Edges but used when stacking a collection of nodes into a StackNode. When doing this, all edges of the stacked nodes must be kept, since they carry the information regarding the nodes' neighbours, which will be used when unbinding the stack.

- Parameters:
edges (list[Edge]) – List of edges (one from each node that is being stacked) that are attached to the equivalent of axis1 in each node.

node1_list (list[bool]) – List of node1 attributes (one from each node that is being stacked) of the equivalent of axis1 in each node.

node1 (StackNode or ParamStackNode) – First node to which the edge is connected.

axis1 (int, str or Axis) – Axis of node1 where the edge is attached.

node2 (StackNode or ParamStackNode, optional) – Second node to which the edge is connected. If None, the edge will be dangling.

axis2 (int, str or Axis, optional) – Axis of node2 where the edge is attached.
- property edges#
Returns list of stacked edges corresponding to this axis.
- property node1_list#
Returns list of node1 attributes corresponding to this axis.
- connect(other)[source]#
Same as connect(), but it is first verified that all stacked edges() corresponding to both StackEdges are the same.

That is, this is a redundant operation to re-connect a list of edges that should already be connected. However, it is mandatory, since when stacking two sequences of nodes independently it cannot be inferred that the resultant StackNodes had to be connected.

- Parameters:
other (StackEdge) – The other edge to which the current edge will be connected.
- Return type:
Examples
To connect two stack-edges, the overloaded operator ^ can also be used.

>>> net = tk.TensorNetwork()
>>> nodes = [tk.randn(shape=(2, 4, 2),
...                   axes_names=('left', 'input', 'right'),
...                   network=net)
...          for _ in range(10)]
>>> data = [tk.randn(shape=(4,),
...                  axes_names=('feature',),
...                  network=net)
...         for _ in range(10)]
...
>>> for i in range(10):
...     _ = nodes[i]['input'] ^ data[i]['feature']
...
>>> stack_nodes = tk.stack(nodes)
>>> stack_data = tk.stack(data)
...
>>> # It is necessary to re-connect stacks to be able to contract
>>> _ = new_edge = stack_nodes['input'] ^ stack_data['feature']
>>> print(new_edge.name)
stack_0[input] <-> stack_1[feature]
Successor#
- class tensorkrowch.Successor(node_ref, index, child, hints=None)[source]#
Class for successors. This is a sort of cache memory for Operations that have already been computed.

For instance, contracting two nodes produces a new node that stores the tensor resulting from contracting both nodes' tensors. However, when training a TensorNetwork, the tensors inside the nodes will change every epoch, but there is actually no need to create a new resultant node every time. Instead, it is more efficient to keep track of which node arose as the result of an operation, and simply change its tensor.

Hence, a Successor is instantiated providing details to get the operand nodes' tensors, as well as a reference to the resultant node, and some hints that might help accelerate the computations the next time the operation is performed.

These properties can be accessed via successor.node_ref, successor.index, successor.child and successor.hints.

See the different operations to learn which resultant node keeps the Successor information.
node_ref (Node, ParamNode, or list[Node, ParamNode]) – For the nodes that are involved in an operation, these are the corresponding nodes that store their tensors.

index (list[int, slice] or list[list[int, slice]], optional) – For the nodes that are involved in an operation, these are the corresponding indices used to access their tensors.
child (Node or list[Node]) – The node or list of nodes that result from an operation.
hints (any, optional) – A dictionary of hints created the first time an operation is computed in order to save some computation in the next calls of the operation.
Examples
When contracting two nodes, a Successor is created and added to the list of successors of the first node (left operand).

>>> nodeA = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> nodeB = tk.randn(shape=(3, 4), axes_names=('left', 'right'))
>>> _ = nodeA['right'] ^ nodeB['left']
...
>>> # Contract nodes
>>> result = nodeA @ nodeB
>>> print(result.name)
contract_edges

>>> # To get a successor, the name of the operation and the arguments have
>>> # to be provided as keys of the successors dictionary
>>> nodeA.successors['contract_edges'][(None, nodeA, nodeB)].child == result
True
Tensor Network#
- class tensorkrowch.TensorNetwork(name=None)[source]#
Class for arbitrary Tensor Networks. Subclass of PyTorch torch.nn.Module.

Tensor Networks are the central objects of TensorKrowch. Basically, a tensor network is a graph with vertices (Nodes) connected by Edges. In these models, nodes' tensors will be trained so that the contraction of the whole network approximates a certain function. Hence, TensorNetwork's are the trainable objects of TensorKrowch, very much like torch.nn.Module's are the trainable objects of PyTorch.

Recall that the common way of defining models out of torch.nn.Module is by defining a subclass where the __init__ and forward methods are overridden:

__init__: Defines the model itself (its layers, attributes, etc.).

forward: Defines the way the model operates, that is, how the different parts of the model might combine to get an output from a particular input.
With TensorNetwork, the workflow is similar, though there are other methods that should be overridden (see the sketch after this list):

__init__: Defines the graph of the tensor network and initializes the tensors of the nodes. See AbstractNode and Edge to learn how to create nodes and connect them.

set_data_nodes (optional): Creates the data nodes where the data tensor(s) will be placed. Usually, it will just select the edges to which the data nodes should be connected, and call the parent method. See set_data_nodes() to learn good practices to override it. See also add_data().

add_data (optional): Adds new data tensors that will be stored in data nodes. Usually it will not be necessary to override this method, but if one wants to customize how data is set into the data nodes, add_data() can be overridden.

contract: Defines the contraction algorithm of the whole tensor network, thus returning a single node. Very much like forward, this is the main method that describes how the components of the network are combined. Hence, in TensorNetwork the forward() method shall not be overridden, since it will just call set_data_nodes(), if needed, add_data() and contract(), and then return the tensor corresponding to the last resultant node. Thus, the order in which Operations are called from contract is important; the last operation must be the one returning the final node.
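As a rough illustration of this workflow, a minimal subclass might look like the following sketch. The class name, shapes, axes names and the simple pairwise contraction are illustrative (not part of the API), and the data nodes are assumed to be accessible by name, as in net['data_0'] (see set_data_nodes()).

import tensorkrowch as tk

class MyModel(tk.TensorNetwork):
    def __init__(self):
        super().__init__(name='my_model')
        # Two trainable nodes forming a small chain (illustrative graph)
        self.node1 = tk.randn(shape=(2, 5, 3),
                              axes_names=('left', 'input', 'right'),
                              network=self,
                              param_node=True)
        self.node2 = tk.randn(shape=(3, 5, 2),
                              axes_names=('left', 'input', 'right'),
                              network=self,
                              param_node=True)
        _ = self.node1['right'] ^ self.node2['left']

    def set_data_nodes(self):
        # Select the dangling input edges and call the parent method
        input_edges = [self.node1['input'], self.node2['input']]
        super().set_data_nodes(input_edges, num_batch_edges=1)

    def contract(self):
        # Contract each node with its data node, then both results;
        # the last operation must return the final node
        op1 = self.node1 @ self['data_0']
        op2 = self.node2 @ self['data_1']
        return op1 @ op2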
Although one can define how the network is going to be contracted, there are a couple of modes that can change how this contraction behaves at a lower level:
auto_stack (True by default): This mode indicates whether the operation stack() can take control of the memory management of the network to skip some steps in future computations. If auto_stack is set to True and a collection of ParamNodes are stacked (as the first operation in which these nodes are involved), then those nodes will no longer store their own tensors, but rather a virtual ParamStackNode will store the stacked tensor, avoiding the computation of that first stack() in every contraction. This behaviour is not possible if auto_stack is set to False, in which case all nodes will always store their own tensors.

Setting auto_stack to True will be faster for both inference and training. However, while experimenting with TensorNetwork's one might want all nodes to store their own tensors to avoid problems.

auto_unbind (False by default): This mode indicates whether the operation unbind() has to actually unbind the stacked tensor or just generate a collection of references. That is, if auto_unbind is set to False, unbind() creates a collection of nodes, each of them storing the corresponding slice of the stacked tensor. If auto_unbind is set to True, unbind() just creates the nodes and gives each of them an index to reference the stacked tensor, so that each node's tensor would be retrieved by indexing the stack. This avoids performing the operation, since these indices will be the same in subsequent iterations.

Setting auto_unbind to True will be faster for inference, but slower for training.
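For instance, both modes can be toggled directly on the network. A brief sketch (the defaults shown follow the property descriptions further below; remember that changing a mode automatically resets the network):

>>> net = tk.TensorNetwork()
>>> net.auto_stack, net.auto_unbind
(True, False)
>>> net.auto_stack = False
>>> net.auto_unbind = True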
Once the training algorithm starts, these modes should not be changed (very often, at least), since changing them entails first resetting the whole network, which is a costly operation.

When the TensorNetwork is defined, it has a bunch of leaf, data and virtual nodes that make up the network structure, each of them storing its own tensor. However, when the network is contracted, several resultant nodes become new members of the network, even modifying its memory (depending on the auto_stack and auto_unbind modes).

Therefore, if one wants to reset() the network to its initial state after performing some operations, all the resultant nodes should be deleted, and all the tensors should return to their nodes (each node stores its own tensor). This is exactly what reset() does. Besides, since auto_stack and auto_unbind can change how the tensors are stored, if one wants to change these modes, the network should first be reset (this is already done automatically when changing the modes).

See AbstractNode to learn about the 4 mutually exclusive types of nodes, and reset() to learn about how these nodes are treated differently.

There are also some special nodes that one should take into account. These are specified by name. See AbstractNode to learn about reserved nodes' names, and reset() to learn about how these nodes are treated differently.

Another thing one must take into account is the naming of Nodes. Since the name of a Node is used to access it from the TensorNetwork, the same name cannot be used by more than one Node. In that case, repeated names get an automatic enumeration of the form "name_{number}" (underscore followed by number).

To add a custom enumeration to keep track of the nodes of the network in a user-defined way, one may use brackets or parentheses: "name_({number})".

For an example, check this tutorial.
- Parameters:
name (str, optional) – Network's name. By default, it is the name of the class (e.g. "tensornetwork").
- property nodes#
Returns a dictionary with all the nodes belonging to the network (leaf, data, virtual and resultant).
- property nodes_names#
Returns a list of names of all the nodes belonging to the network (leaf, data, virtual and resultant).
- property leaf_nodes#
Returns a dictionary of leaf nodes of the network.
- property data_nodes#
Returns a dictionary of data nodes of the network.
- property virtual_nodes#
Returns a dictionary of virtual nodes of the network.
- property resultant_nodes#
Returns a dictionary of resultant nodes of the network.
- property edges#
Returns a list of dangling, non-batch edges of the network. Dangling edges from virtual nodes are not included.
- property auto_stack#
Returns boolean indicating whether auto_stack mode is active. By default, it is True.

This mode indicates whether the operation stack() can take control of the memory management of the network to skip some steps in future computations. If auto_stack is set to True and a collection of ParamNodes are stacked (as the first operation in which these nodes are involved), then those nodes will no longer store their own tensors, but rather a virtual ParamStackNode will store the stacked tensor, avoiding the computation of that first stack() in every contraction. This behaviour is not possible if auto_stack is set to False, in which case all nodes will always store their own tensors.

Setting auto_stack to True will be faster for both inference and training. However, while experimenting with TensorNetwork's one might want all nodes to store their own tensors to avoid problems.

Be aware that changing the auto_stack mode entails resetting the network, which will modify its nodes. Hence, this mode should only be changed deliberately, in order to avoid undesired behaviour.
- property auto_unbind#
Returns boolean indicating whether auto_unbind mode is active. By default, it is False.

This mode indicates whether the operation unbind() has to actually unbind the stacked tensor or just generate a collection of references. That is, if auto_unbind is set to False, unbind() creates a collection of nodes, each of them storing the corresponding slice of the stacked tensor. If auto_unbind is set to True, unbind() just creates the nodes and gives each of them an index to reference the stacked tensor, so that each node's tensor would be retrieved by indexing the stack. This avoids performing the operation, since these indices will be the same in subsequent iterations.

Setting auto_unbind to True will be faster for inference, but slower for training.

Be aware that changing the auto_unbind mode entails resetting the network, which will modify its nodes. Hence, this mode should only be changed deliberately, in order to avoid undesired behaviour.
- delete_node(node, move_names=True)[source]#
Disconnects node from all its neighbours and removes it from the network. To completely get rid of the node, do not forget to delete it:
>>> del node
or overwrite it:

>>> node = node.copy()  # .copy() calls .delete_node()
- Parameters:
node (AbstractNode) – Node to be deleted from the network.

move_names (bool) – Boolean indicating whether names' enumerations should be decreased when removing a node (True) or kept as they are (False). This is useful when several nodes are being modified at once, and each resultant node has the same enumeration as the corresponding original node.
Examples
>>> nodeA = tk.randn((2, 3))
>>> nodeB = tk.randn((3, 4))
>>> _ = nodeA[1] ^ nodeB[0]
>>> print(nodeA.name, nodeB.name)
node_0 node_1

>>> nodeB.network.delete_node(nodeB)
>>> nodeA.neighbours() == []
True

>>> print(nodeA.name)
node
If move_names is set to False, the enumeration is not removed. This is useful to avoid managing the enumeration of a list of nodes that are all going to be deleted.

>>> nodeA = tk.randn((2, 3))
>>> nodeB = tk.randn((3, 4))
>>> _ = nodeA[1] ^ nodeB[0]
>>> nodeB.network.delete_node(nodeB, False)
>>> print(nodeA.name)
node_0
- parameterize(set_param=True, override=False)[source]#
Parameterizes all leaf nodes of the network. If there are resultant nodes in the TensorNetwork, it will first be reset().

- Parameters:
set_param (bool) – Boolean indicating whether the tensor network has to be parameterized (True) or de-parameterized (False).

override (bool) – Boolean indicating whether the tensor network should be parameterized in-place (True) or copied and then parameterized (False).
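A brief usage sketch (assuming the parameterized copy is returned when override is False):

>>> net = tk.TensorNetwork()
>>> _ = tk.randn(shape=(2, 3),
...              axes_names=('left', 'right'),
...              network=net)
>>> param_net = net.parameterize()       # copy whose leaf nodes are ParamNodes
>>> _ = net.parameterize(override=True)  # parameterize the original in place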
- set_data_nodes(input_edges, num_batch_edges)[source]#
Creates data nodes with as many batch edges as num_batch_edges and one feature edge. Then it connects each of these nodes' feature edges to an edge from the list input_edges (following the provided order). Thus, the edges in input_edges need to be dangling. Also, if there are already data nodes (or the "stack_data_memory") in the network, they should be unset() first.

If all the data nodes have the same shape, a virtual node will contain all the tensors stacked into one, which will save some memory and time in computations. This node is "stack_data_memory". See AbstractNode to learn more about this node.

If this method is overridden in subclasses, it can be done in two flavours:
def set_data_nodes(self):
    # Collect input edges
    input_edges = [node_1[i], ..., node_n[j]]
    # Define number of batches
    num_batch_edges = m
    # Call parent method
    super().set_data_nodes(input_edges, num_batch_edges)
def set_data_nodes(self):
    # Create data nodes directly
    data_nodes = [
        tk.Node(shape=(batch_1, ..., batch_m, feature_dim),
                axes_names=('batch_1', ..., 'batch_m', 'feature'),
                network=self,
                data=True)
        for _ in range(n)]
    # Connect them with the leaf nodes
    for i, data_node in enumerate(data_nodes):
        data_node['feature'] ^ self.my_nodes[i]['input']
If this method is overridden, there is no need to call it explicitly during training, since it will be done in the forward() call.

On the other hand, if one does not override set_data_nodes, it should be called before starting training.

- Parameters:
input_edges (list[Edge]) – List of edges to which the data nodes' feature edges will be connected.

num_batch_edges (int) – Number of batch edges in the data nodes.
Examples
>>> nodeA = tk.Node(shape=(2, 5, 2),
...                 axes_names=('left', 'input', 'right'),
...                 name='nodeA',
...                 init_method='randn')
>>> nodeB = tk.Node(shape=(2, 5, 2),
...                 axes_names=('left', 'input', 'right'),
...                 name='nodeB',
...                 init_method='randn')
>>> _ = nodeA['right'] ^ nodeB['left']
...
>>> net = nodeA.network
>>> input_edges = [nodeA['input'], nodeB['input']]
>>> net.set_data_nodes(input_edges, 1)
>>> list(net.data_nodes.keys())
['data_0', 'data_1']

>>> net['data_0']
Node( name: data_0
      tensor: None
      axes: [batch
             feature]
      edges: [data_0[batch] <-> None
              data_0[feature] <-> nodeA[input]])
- unset_data_nodes()[source]#
Deletes all data nodes (including the "stack_data_memory" when this node exists).
- add_data(data)[source]#
Adds data tensor(s) to data nodes, that is, replaces their tensors by new data tensors when a new batch is provided.

If all data nodes have the same shape, thus having their tensors stored in "stack_data_memory", the whole data tensor will be stored by this node. The data nodes will just store a reference to a slice of that tensor.

Otherwise, each tensor in the list (data) will be stored by each data node in the network, in the order they appear in data_nodes().

If one wants to customize how data is set into the data nodes, this method can be overridden.

- Parameters:
data (torch.Tensor or list[torch.Tensor]) –
If all data nodes have the same shape, thus having their tensors stored in "stack_data_memory", data should be a tensor of shape

\[batch\_size_{0} \times ... \times batch\_size_{n} \times n_{features} \times feature\_dim\]

Otherwise, it should be a list with \(n_{features}\) elements, each of them being a tensor with shape

\[batch\_size_{0} \times ... \times batch\_size_{n} \times feature\_dim\]
Examples
>>> nodeA = tk.Node(shape=(3, 5, 3),
...                 axes_names=('left', 'input', 'right'),
...                 name='nodeA',
...                 init_method='randn')
>>> nodeB = tk.Node(shape=(3, 5, 3),
...                 axes_names=('left', 'input', 'right'),
...                 name='nodeB',
...                 init_method='randn')
>>> _ = nodeA['right'] ^ nodeB['left']
...
>>> net = nodeA.network
>>> input_edges = [nodeA['input'], nodeB['input']]
>>> net.set_data_nodes(input_edges, 1)
...
>>> net.add_data(torch.randn(100, 2, 5))
>>> net['data_0'].shape
torch.Size([100, 5])
- reset()[source]#
Resets the TensorNetwork to its initial state, before computing any non-in-place Operation. Different actions apply to different types of nodes:

leaf: These nodes retrieve their tensors in case they were just referencing a slice of the tensor in the ParamStackNode that is created when stacking ParamNodes (if auto_stack mode is active). If there is a "virtual_uniform" node in the network from which all leaf nodes take their tensor, this is not modified.

virtual: Only virtual nodes created in operations are deleted. This only includes nodes using the reserved name "virtual_result".

resultant: These nodes are deleted from the network.
Also, the dictionaries of Successors of all leaf and data nodes are emptied.

The TensorNetwork is automatically reset when parameterizing it, changing the auto_stack() or auto_unbind() modes, or tracing.

See AbstractNode to learn more about the 4 types of nodes and the reserved names. For an example, check this tutorial.
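For instance, a small network can be contracted and then restored to its initial state (a short sketch following the patterns used in the examples above):

>>> nodeA = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> nodeB = tk.randn(shape=(3, 4), axes_names=('left', 'right'))
>>> _ = nodeA['right'] ^ nodeB['left']
>>> net = nodeA.network
>>> _ = nodeA @ nodeB
>>> len(net.resultant_nodes) > 0
True

>>> net.reset()  # deletes resultant nodes; tensors return to their nodes
>>> len(net.resultant_nodes)
0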
- trace(example=None, *args, **kwargs)[source]#
Traces the tensor network contraction algorithm with two purposes:
Create all the intermediate resultant nodes that result from Operations, so that in the next contractions only the tensor operations have to be computed, thus saving a lot of time.

Keep track of the tensors that are used to compute operations, so that intermediate results that are no longer useful can be deleted, thus saving a lot of memory. This is achieved by constructing an inverse_memory that, given a memory address, stores the nodes that use the tensor located at that address of the network's memory.
To trace a tensor network, it is necessary to provide the same arguments that would be required in the forward call. In case the tensor network is contracted with some input data, an example tensor with batch dimension 1 and filled with zeros would be enough to trace the contraction.
For an example, check this tutorial.
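As a brief sketch, reusing the hypothetical MyModel subclass from the class description above (the data shape is assumed):

>>> model = MyModel()
>>> example = torch.zeros(1, 2, 5)  # batch_size=1, n_features=2, feature_dim=5
>>> model.trace(example)
>>> output = model(torch.randn(100, 2, 5))  # subsequent calls reuse the trace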
- Parameters:
example (torch.Tensor, optional) – Example tensor used to trace the contraction of the tensor network. In case the tensor network is contracted with some input data, an example tensor with batch dimension 1 and filled with zeros would be enough to trace the contraction.
args – Arguments that might be used in contract().

kwargs – Keyword arguments that might be used in contract().
- contract()[source]#
Contracts the whole tensor network, returning a single Node. This method is not implemented; subclasses of TensorNetwork should override it to define the contraction algorithm of the network.
- forward(data=None, *args, **kwargs)[source]#
Contracts the TensorNetwork with input data. It can be called using the __call__ operator ().

Overrides the forward method of PyTorch's torch.nn.Module. Sets data nodes automatically whenever set_data_nodes() is overridden, adds data tensor(s) to these nodes, and contracts the whole network according to contract(), returning a single torch.Tensor.

Furthermore, to optimize the contraction algorithm during training, once the TensorNetwork is traced, all that forward does is call the different Operations used in contract() in the same order they appeared in the code. Hence, the last operation in contract() should be the one that returns the single output Node.

For an example, check this tutorial.
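A brief usage sketch, again with the hypothetical MyModel subclass from the class description above:

>>> model = MyModel()
>>> data = torch.randn(100, 2, 5)  # batch_size x n_features x feature_dim
>>> output = model(data)           # equivalent to model.forward(data)
>>> isinstance(output, torch.Tensor)
True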
- Parameters:
data (torch.Tensor or list[torch.Tensor], optional) –
If all data nodes have the same shape, thus having their tensors stored in "stack_data_memory", data should be a tensor of shape

\[batch\_size_{0} \times ... \times batch\_size_{n} \times n_{features} \times feature\_dim\]

Otherwise, it should be a list with \(n_{features}\) elements, each of them being a tensor with shape

\[batch\_size_{0} \times ... \times batch\_size_{n} \times feature\_dim\]

Also, it is not necessary that the network has data nodes, so None is also valid.

args – Arguments that might be used in contract().

kwargs – Keyword arguments that might be used in contract().