Components#
Axis#
- class tensorkrowch.Axis(num, name, node=None, node1=True)[source]#
Axes are the objects that stick edges to nodes. Each instance of the `AbstractNode` class has a list of \(N\) axes, each corresponding to one edge. Each axis stores information that facilitates accessing that edge, such as its `name` and `num` (index). Additionally, an axis keeps track of its `batch` and `node1` attributes.

- batch: If the axis name contains the word “batch”, the edge will be a batch edge, which means that it cannot be connected to other nodes. Instead, it specifies a dimension that allows for batch operations (e.g., batch contraction). If the name of the axis is changed and no longer contains the word “batch”, the corresponding edge will no longer be a batch edge. Furthermore, instances of the `StackNode` and `ParamStackNode` classes always have an axis named “stack” whose edge is a batch edge.
- node1: When two dangling edges are connected, the result is a new edge linking two nodes, say `nodeA` and `nodeB`. If the connection is performed in the following order:

  new_edge = nodeA[edgeA] ^ nodeB[edgeB]

  then `nodeA` will be the `node1` of `new_edge` and `nodeB`, the `node2`. Hence, to access one of the nodes from `new_edge` one needs to know whether it is `node1` or `node2`.
Even though we can create `Axis` instances, that will not usually be the case, since axes are automatically created when instantiating a new `node`.

Another thing one must take into account is the naming of `Axes`. Since the name of an `Axis` is used to access it from the `Node`, the same name cannot be used by more than one `Axis`. In that case, repeated names get an automatic enumeration of the form `"name_{number}"` (underscore followed by a number). To add a custom enumeration in a user-defined way, one may use brackets or parentheses: `"name_({number})"`.
- Parameters
num (int) – Index of the axis in the node’s axes list.
name (str) – Axis name, should not contain blank spaces or special characters. If it contains the word “batch”, the axis will correspond to a batch edge. The word “stack” cannot be used in the name, since it is reserved for stacks.
node (AbstractNode, optional) – Node to which the axis belongs.
node1 (bool) – Boolean indicating whether `node1` of the edge attached to this axis is the node that contains the axis (`True`). Otherwise, the node is `node2` of the edge (`False`).
Examples
Although `Axis` will not usually be explicitly instantiated, it can be done like so:
>>> axis = tk.Axis(0, 'left')
>>> axis
Axis( left (0) )
>>> axis.is_node1()
True
>>> axis.is_batch()
False
Since “batch” is not contained in “left”, `axis` does not correspond to a batch edge, but that can be changed:

>>> axis.name = 'mybatch'
>>> axis.is_batch()
True
Also, as explained before, knowing if a node is the `node1` or `node2` of an edge enables users to access that node from the edge:

>>> nodeA = tk.Node(shape=(2, 3), axes_names=('left', 'right'))
>>> nodeB = tk.Node(shape=(3, 4), axes_names=('left', 'right'))
>>> new_edge = nodeA['right'] ^ nodeB['left']
...
>>> # nodeA is node1 and nodeB is node2 of new_edge
>>> nodeA == new_edge.nodes[1 - nodeA.get_axis('right').is_node1()]
True
>>> nodeB == new_edge.nodes[nodeA.get_axis('right').is_node1()]
True
The `node1` attribute is extended to `resultant` nodes that inherit edges.

>>> nodeA = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> nodeB = tk.randn(shape=(3, 4), axes_names=('left', 'right'))
>>> nodeC = tk.randn(shape=(4, 5), axes_names=('left', 'right'))
>>> edge1 = nodeA['right'] ^ nodeB['left']
>>> edge2 = nodeB['right'] ^ nodeC['left']
>>> result = nodeA @ nodeB
...
>>> # result inherits the edges nodeA['left'] and edge2
>>> result['left'] == nodeA['left']
True
>>> result['right'] == edge2
True
>>> # result is still node1 of edge2, since nodeA was
>>> result.is_node1('right')
True
- property num#
Index in the node’s axes list.
- property name#
Axis name, used to access edges by name of the axis. It cannot contain blank spaces or special characters. If it contains the word “batch”, the axis will correspond to a batch edge. The word “stack” cannot be used in the name, since it is reserved for stacks.
- property node#
Node to which the axis belongs.
Nodes#
AbstractNode#
- class tensorkrowch.AbstractNode(shape=None, axes_names=None, name=None, network=None, data=False, virtual=False, override_node=False, tensor=None, edges=None, override_edges=False, node1_list=None, init_method=None, device=None, dtype=None, **kwargs)[source]#
Abstract class for all types of nodes. Defines what a node is and most of its properties and methods. Since it is an abstract class, it cannot be instantiated.
Nodes are the elements that make up a `TensorNetwork`. At its most basic level, a node is a container for a `torch.Tensor` that stores other relevant information which makes it possible to build any network and operate nodes to contract it (and train it!). Some of the information that is carried by the nodes includes:

- Shape: Every node needs a shape to know if connections with other nodes are possible. Even if the tensor is not specified, an empty node needs a shape.
- Tensor: The key ingredient of the node. Although the node acts as a container for the tensor, the node does not contain it. Actually, for efficiency purposes, the tensors are stored in a sort of memory that is shared by all the nodes of the `TensorNetwork`. Therefore, all that nodes contain is a memory address. Furthermore, some nodes can share the same (or a part of the same) tensor, thus containing the same address. Sometimes, to maintain consistency, when two nodes share a tensor, one stores its memory address, and the other one stores a reference to the former.
- Axes: A list of `Axes` that make it easy to access edges just using a name or an index.
- Edges: A list of `Edges`, one for each dimension of the node. Each edge is attached to the node via an `Axis`. Edges are useful to connect several nodes, creating a `TensorNetwork`.
- Network: The `TensorNetwork` to which the node belongs. If the network is not specified when creating the node, a new `TensorNetwork` is created to contain the node. Although the network can be thought of as a graph, it is a `torch.nn.Module`, so it is much more than that. Actually, the `TensorNetwork` can contain different types of nodes, not all of them being part of the graph, but being used for different purposes.
- Successors: A dictionary with information about the nodes that result from `Operations` in which the current node was involved. See `Successor`.
Carrying this information with the node is what makes it easy to:

- Perform tensor network `Operations` such as contraction of two neighbouring nodes, without having to worry about tensors’ shapes, order of axes, etc.
- Perform more advanced operations such as `stack()` or `unbind()`, saving memory and time.
- Keep track of the operations in which a node has taken part, so that several steps can be skipped in further training iterations. See `TensorNetwork.trace()`.
Also, there are 4 mutually exclusive types of nodes that play different roles in the `TensorNetwork`, as sketched right after this list:

- leaf: These are the nodes that form the `TensorNetwork` (together with the `data` nodes). Usually, these will be the trainable nodes. These nodes can store their own tensors or use another node’s tensor.
- data: These are similar to `leaf` nodes, but they are never trainable, and are used to store the temporary tensors coming from input data. These nodes can store their own tensors or use another node’s tensor.
- virtual: These nodes are a sort of ancillary, hidden nodes that accomplish some useful task (e.g. in uniform tensor networks a virtual node can store the shared tensor, while all the other nodes in the network just have a reference to it). These nodes always store their own tensors.
- resultant: These are nodes that result from an `Operation`. They are intermediate nodes that (almost always) inherit edges from `leaf` and `data` nodes, the ones that really form the network. These nodes can store their own tensors or use another node’s tensor. The names of `resultant` nodes are the names of the `Operations` that originated them.
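A quick sketch of these types (using the `is_leaf()`/`is_resultant()` checks documented below; the resultant node’s name follows the `"contract_edges"` convention described under `contract_between()`):

>>> nodeA = tk.randn(shape=(2, 3))
>>> nodeB = tk.randn(shape=(3, 4))
>>> _ = nodeA[1] ^ nodeB[0]
>>> result = nodeA @ nodeB
>>> nodeA.is_leaf()
True
>>> result.is_resultant()
True
>>> print(result.name)
contract_edges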
See `TensorNetwork` and `reset()` to learn more about the importance of these 4 types of nodes.

Another thing one should take into account are the reserved nodes’ names:
- “stack_data_memory”: Name of the `virtual` `StackNode` that is created in `set_data_nodes()` to store the whole data tensor from which each `data` node might take just one slice. There should be at most one `"stack_data_memory"` in the network. To learn more about this, see `set_data_nodes()` and `add_data()`.
- “virtual_result”: Name of `virtual` nodes that are not explicitly part of the network, but are required for some situations during contraction. For instance, the `ParamStackNode` that results from stacking `ParamNodes` as the first operation in the network contraction, if the `auto_stack` mode is set to `True`. To learn more about this, see `ParamStackNode`.
- “virtual_uniform”: Name of the `virtual` `Node` or `ParamNode` that is used in uniform (translationally invariant) tensor networks to store the tensor that will be shared by all `leaf` nodes. There might be as many `"virtual_uniform"` nodes as shared memories are used for the `leaf` nodes in the network (usually just one).
For `"virtual_result"` and `"virtual_uniform"`, these special behaviours are not restricted to nodes having exactly those names; they also apply to nodes whose names contain those strings. Although these names can in principle be used for other nodes, doing so can lead to undesired behaviour.
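As an illustration, a minimal sketch of the “virtual_uniform” pattern (it assumes the initializer `tk.empty` accepts the same `param_node` flag as `tk.randn`, and relies on `set_tensor_from()` and `tensor_address()` as documented below):

>>> net = tk.TensorNetwork()
>>> uniform = tk.ParamNode(shape=(2, 5, 2),
...                        name='virtual_uniform',
...                        network=net,
...                        virtual=True,
...                        init_method='randn')
>>> node = tk.empty(shape=(2, 5, 2), network=net, param_node=True)
>>> node.set_tensor_from(uniform)
>>> print(node.tensor_address())
virtual_uniform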
See `reset()` to learn more about the importance of these reserved nodes’ names.

Another thing one must take into account is the naming of `Nodes`. Since the name of a `Node` is used to access it from the `TensorNetwork`, the same name cannot be used by more than one `Node`. In that case, repeated names get an automatic enumeration of the form `"name_{number}"` (underscore followed by a number). To add a custom enumeration to keep track of the nodes of the network in a user-defined way, one may use brackets or parentheses: `"name_({number})"`.
The same automatic enumeration of names occurs for `Axes`’ names in a `Node`.

Refer to the subclasses of `AbstractNode` to see how to instantiate nodes.

- property tensor#
Node’s tensor. It can be a `torch.Tensor`, a `torch.nn.Parameter` or `None` if the node is empty.
- property network#
`TensorNetwork` where the node belongs. If the node is moved to another `TensorNetwork`, the entire connected component of the graph where the node is will be moved.
- property successors#
Dictionary with `Operations`’ names as keys, and dictionaries of `Successors` of the node as values. The inner dictionaries use as keys the arguments used when the operation was called.
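For example (a minimal sketch; the operation name follows the `"contract_edges"` convention described below, while the exact format of the inner keys is an implementation detail):

>>> nodeA = tk.randn(shape=(2, 3))
>>> nodeB = tk.randn(shape=(3, 4))
>>> _ = nodeA[1] ^ nodeB[0]
>>> _ = nodeA @ nodeB
>>> list(nodeA.successors.keys())
['contract_edges']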
- property name#
Node’s name, used to access the node from the tensor network where it belongs. It cannot contain blank spaces.
- is_leaf()[source]#
Returns a boolean indicating if the node is a `leaf` node. These are the nodes that form the `TensorNetwork` (together with the `data` nodes). Usually, these will be the trainable nodes. These nodes can store their own tensors or use another node’s tensor.
- is_data()[source]#
Returns a boolean indicating if the node is a `data` node. These nodes are similar to `leaf` nodes, but they are never trainable, and are used to store the temporary tensors coming from input data. These nodes can store their own tensors or use another node’s tensor.
- is_virtual()[source]#
Returns a boolean indicating if the node is a `virtual` node. These nodes are a sort of ancillary, hidden nodes that accomplish some useful task (e.g. in uniform tensor networks a virtual node can store the shared tensor, while all the other nodes in the network just have a reference to it). These nodes always store their own tensors.

If a `virtual` node is used as the node storing the shared tensor in a uniform (translationally invariant) `TensorNetwork`, it is recommended to use the string “virtual_uniform” in the node’s name (e.g. “virtual_uniform_mps”).
- is_resultant()[source]#
Returns a boolean indicating if the node is a `resultant` node. These are nodes that result from an `Operation`. They are intermediate nodes that (almost always) inherit edges from `leaf` and `data` nodes, the ones that really form the network. These nodes can store their own tensors or use another node’s tensor.
- is_conj()[source]#
Equivalent to torch.is_conj().
- is_complex()[source]#
Equivalent to torch.is_complex().
- is_floating_point()[source]#
Equivalent to torch.is_floating_point().
- size(axis=None)[source]#
Returns the size of the node’s tensor. If `axis` is specified, returns the size of that axis; otherwise returns the shape of the node (same as `shape`).
- Parameters
axis (int, str or Axis, optional) – Axis for which to retrieve the size.
- Return type
int or torch.Size
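Examples

A short sketch of both call forms (return types as documented above):

>>> node = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> node.size()
torch.Size([2, 3])
>>> node.size('left')
2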
- is_node1(axis=None)[source]#
Returns the `node1` attribute of the axes of the node. If `axis` is specified, returns only the `node1` of that axis; otherwise returns the `node1` of all axes of the node.
- Parameters
axis (int, str or Axis, optional) – Axis for which to retrieve the `node1` attribute.
- Return type
bool or list[bool]
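Examples

A short sketch; dangling edges of a freshly created node are assumed to default to `node1=True`, as described in `Axis`:

>>> node = tk.randn(shape=(2, 3))
>>> node.is_node1()
[True, True]
>>> node.is_node1(0)
True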
- neighbours(axis=None)[source]#
Returns the neighbours of the node, the nodes to which it is connected.
If `self` is a `resultant` node, this will return the neighbours of the `leaf` nodes from which `self` inherits the edges. Therefore, one cannot check if two `resultant` nodes are connected by looking into their neighbours lists. To do that, use `is_connected_to()`.
- Parameters
axis (int, str or Axis, optional) – Axis for which to retrieve the neighbour.
- Return type
AbstractNode or list[AbstractNode]
Examples
>>> nodeA = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> nodeB = tk.randn(shape=(3, 4), axes_names=('left', 'right'))
>>> nodeC = tk.randn(shape=(4, 5), axes_names=('left', 'right'))
>>> _ = nodeA['right'] ^ nodeB['left']
>>> _ = nodeB['right'] ^ nodeC['left']
>>> set(nodeB.neighbours()) == {nodeA, nodeC}
True
>>> nodeB.neighbours('right') == nodeC
True

Nodes `resultant` from operations are still connected to the original neighbours.

>>> result = nodeA @ nodeB
>>> result.neighbours('right') == nodeC
True
- get_edge(axis)[source]#
Returns the `Edge` given the `Axis` (or its `name` or `num`) where it is attached to the node.
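Examples

A quick sketch; `node[axis]` indexing is used throughout these docs as a shorthand for `get_edge()`:

>>> node = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> node.get_edge('left') == node['left']
True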
- reattach_edges(axes=None, override=False)[source]#
Substitutes the current edges by copies of them that are attached to the node. It can happen that an edge is not attached to the node if the node is the result of an `Operation` and, hence, inherits edges from the operands. In that case, the new copied edges will be attached to the resultant node, replacing each previous `node1` or `node2` with it (according to the `node1` attribute of each axis).

Used for in-place operations like `permute_()` or `split_()` and to (de)parameterize nodes.
- Parameters
axes (list[int, str or Axis] or tuple[int, str or Axis], optional) – The edges attached to these axes will be reattached. If `None`, all edges will be reattached.
override (bool) – Boolean indicating if the new, reattached edges should also replace the corresponding edges in the node’s neighbours (`True`). Otherwise, the neighbours’ edges will keep pointing to the original nodes from which the current node inherits its edges (`False`).
Examples
>>> nodeA = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> nodeB = tk.randn(shape=(3, 4), axes_names=('left', 'right'))
>>> nodeC = tk.randn(shape=(4, 5), axes_names=('left', 'right'))
>>> _ = nodeA['right'] ^ nodeB['left']
>>> _ = nodeB['right'] ^ nodeC['left']
>>> result = nodeA @ nodeB

Node `result` inherits its `right` edge from `nodeB`.

>>> result['right'] == nodeB['right']
True

However, `nodeB['right']` still connects `nodeB` and `nodeC`. There is no reference to `result`.

>>> result in result['right'].nodes
False

One can reattach its edges so that `result`’s edges do have references to it.

>>> result.reattach_edges()
>>> result in result['right'].nodes
True

If `override` is `True`, `nodeB['right']` would be replaced by the new `result['right']`.
- disconnect(axis=None)[source]#
Disconnects all edges of the node if they were connected to other nodes. If `axis` is specified, only the corresponding edge is disconnected.
- Parameters
axis (int, str or Axis, optional) – Axis whose edge will be disconnected.
Examples
>>> nodeA = tk.Node(shape=(2, 3), axes_names=('left', 'right'))
>>> nodeB = tk.Node(shape=(3, 4), axes_names=('left', 'right'))
>>> nodeC = tk.Node(shape=(4, 5), axes_names=('left', 'right'))
>>> _ = nodeA['right'] ^ nodeB['left']
>>> _ = nodeB['right'] ^ nodeC['left']
>>> set(nodeB.neighbours()) == {nodeA, nodeC}
True
>>> nodeB.disconnect()
>>> nodeB.neighbours() == []
True
- make_tensor(shape=None, init_method='zeros', device=None, dtype=None, **kwargs)[source]#
Returns a tensor that can be put in the node, initialized according to `init_method`. By default, it has the same shape as the node.
- Parameters
shape (list[int], tuple[int] or torch.Size, optional) – Shape of the tensor. If `None`, the node’s shape will be used.
init_method ({"zeros", "ones", "copy", "rand", "randn"}, optional) – Initialization method.
device (torch.device, optional) – Device where to initialize the tensor.
dtype (torch.dtype, optional) – Dtype of the tensor.
kwargs (float) –
Keyword arguments for the different initialization methods:
- `low`, `high` for uniform initialization. See torch.rand().
- `mean`, `std` for normal initialization. See torch.randn().
- Return type
torch.Tensor
- Raises
ValueError – If `init_method` is not one of “zeros”, “ones”, “copy”, “rand”, “randn”.
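Examples

A minimal sketch; the node only provides the default shape, and the returned tensor is not set in the node:

>>> node = tk.Node(shape=(2, 3), axes_names=('left', 'right'))
>>> tensor = node.make_tensor(init_method='ones')
>>> torch.equal(tensor, torch.ones(2, 3))
True
>>> node.tensor is None
True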
- set_tensor(tensor=None, init_method='zeros', device=None, dtype=None, **kwargs)[source]#
Sets a new tensor in the node, or creates one with `make_tensor()` and sets it. Before setting it, it is cast to the correct type: `torch.Tensor` for `Node` and `torch.nn.Parameter` for `ParamNode`.

When a tensor is set in the node, it means the node stores it; that is, the node has its own memory address for its tensor, rather than a reference to another node’s tensor. Because of this, `set_tensor` cannot be applied to nodes that have a reference to another node’s tensor, since that tensor would also be changed in the referenced node. To overcome this issue, see `reset_tensor_address()`.

This can only be used for non-`resultant` nodes that store their own tensors. For `resultant` nodes, tensors are set automatically when computing `Operations`.

Although this can also be used for `data` nodes, input data will usually be set into nodes automatically when calling the `TensorNetwork.forward()` method of `TensorNetwork` with a data tensor or a sequence of tensors. This method calls `TensorNetwork.add_data()`, which can also be used to set data tensors into the `data` nodes.
- Parameters
tensor (torch.Tensor, optional) – Tensor to be set in the node. If `None` and `init_method` is provided, the tensor is created with `make_tensor()`. Otherwise, `None` is set as the node’s tensor.
init_method ({"zeros", "ones", "copy", "rand", "randn"}, optional) – Initialization method.
device (torch.device, optional) – Device where to initialize the tensor.
dtype (torch.dtype, optional) – Dtype of the tensor.
kwargs (float) – Keyword arguments for the different initialization methods. See `make_tensor()`.
- Raises
ValueError – If the node is a `resultant` node or if it does not store its own tensor.
Examples
>>> node = tk.Node(shape=(2, 3), axes_names=('left', 'right'))
...
>>> # Calling set_tensor without arguments uses the
>>> # default init_method ("zeros")
>>> node.set_tensor()
>>> torch.equal(node.tensor, torch.zeros(node.shape))
True
>>> node.set_tensor(init_method='randn', mean=1., std=2., device='cuda')
>>> torch.equal(node.tensor, torch.zeros(node.shape, device='cuda'))
False
>>> node.device
device(type='cuda', index=0)
>>> tensor = torch.randn(2, 3)
>>> node.set_tensor(tensor)
>>> torch.equal(node.tensor, tensor)
True
- unset_tensor()[source]#
Replaces the node’s tensor with `None`. This can only be used for non-`resultant` nodes that store their own tensors.

Examples

>>> node = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> node.tensor is None
False
>>> node.unset_tensor()
>>> node.tensor is None
True
- set_tensor_from(other)[source]#
Sets the node’s tensor as the tensor used by the `other` node. That is, when setting the tensor this way, the current node will store a reference to the `other` node’s tensor, instead of having its own tensor.

The node and `other` should both be of the same type (`Node` or `ParamNode`). Also, they should be in the same `TensorNetwork`.
- Parameters
other (Node or ParamNode) – Node whose tensor is to be set in the current node.
- Raises
TypeError – If `other` is of a different type than the current node, or if it is in a different network.
Examples
>>> nodeA = tk.randn(shape=(2, 3),
...                  name='nodeA',
...                  axes_names=('left', 'right'))
>>> nodeB = tk.empty(shape=(2, 3),
...                  name='nodeB',
...                  axes_names=('left', 'right'),
...                  network=nodeA.network)
>>> nodeB.set_tensor_from(nodeA)
>>> print(nodeB.tensor_address())
nodeA

Since `nodeB` has a reference to `nodeA`’s tensor, if the latter is changed, `nodeB` will reproduce all the changes.

>>> nodeA.tensor = torch.randn(nodeA.shape)
>>> torch.equal(nodeA.tensor, nodeB.tensor)
True
- reset_tensor_address()[source]#
Resets memory address of node’s tensor to reference the node itself. Thus, the node will store its own tensor, instead of having a reference to other node’s tensor.
Examples
>>> nodeA = tk.randn(shape=(2, 3),
...                  name='nodeA',
...                  axes_names=('left', 'right'))
>>> nodeB = tk.empty(shape=(2, 3),
...                  name='nodeB',
...                  axes_names=('left', 'right'),
...                  network=nodeA.network)
>>> nodeB.set_tensor_from(nodeA)
>>> print(nodeB.tensor_address())
nodeA

Now one cannot set a tensor in `nodeB` different from the one in `nodeA`, unless the tensor address is reset in `nodeB`.

>>> nodeB.reset_tensor_address()
>>> nodeB.tensor = torch.randn(nodeB.shape)
>>> torch.equal(nodeA.tensor, nodeB.tensor)
False
- move_to_network(network, visited=None)[source]#
Moves node to another network. All other nodes connected to it, or to a node connected to it, etc. are also moved to the new network.
If a node does not store its own tensor and is moved to another network, it will recover the “ownership” of its tensor.
- Parameters
network (TensorNetwork) – Tensor Network to which the nodes will be moved.
visited (list[AbstractNode], optional) – List indicating the nodes that have been already moved to the new network, used by this DFS-like algorithm.
Examples
>>> net = tk.TensorNetwork()
>>> nodeA = tk.Node(shape=(2, 3),
...                 axes_names=('left', 'right'),
...                 network=net)
>>> nodeB = tk.Node(shape=(3, 4),
...                 axes_names=('left', 'right'),
...                 network=net)
>>> nodeC = tk.Node(shape=(5, 5),
...                 axes_names=('left', 'right'),
...                 network=net)
>>> _ = nodeA['right'] ^ nodeB['left']

If `nodeA` is moved to another network, `nodeB` will also move, but `nodeC` will not.

>>> net2 = tk.TensorNetwork()
>>> nodeA.network = net2
>>> nodeA.network == nodeB.network
True
>>> nodeA.network != nodeC.network
True
- sum(axis=None)[source]#
Returns the sum of all elements in the node’s tensor. If an `axis` is specified, the sum is computed over that axis. If `axis` is a sequence of axes, the reduction is over all of them.

This is not a node `Operation`, hence it returns a `torch.Tensor` instead of a `Node`.

See also torch.sum().
- Parameters
axis (int, str, Axis or list[int, str or Axis], optional) – Axis or sequence of axes over which to reduce.
- Return type
torch.Tensor
Examples
>>> node = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> node.tensor
tensor([[-0.2799, -0.4383, -0.8387],
        [ 1.6225, -0.3370, -1.2316]])
>>> node.sum()
tensor(-1.5029)
>>> node.sum('left')
tensor([ 1.3427, -0.7752, -2.0704])
- mean(axis=None)[source]#
Returns the mean of all elements in the node’s tensor. If an `axis` is specified, the mean is computed over that axis. If `axis` is a sequence of axes, the reduction is over all of them.

This is not a node `Operation`, hence it returns a `torch.Tensor` instead of a `Node`.

See also torch.mean().
- Parameters
axis (int, str, Axis or list[int, str or Axis], optional) – Axis or sequence of axes over which to reduce.
- Return type
torch.Tensor
Examples
>>> node = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> node.tensor
tensor([[ 1.4005, -0.0521, -1.2091],
        [ 1.9844,  0.3513, -0.5920]])
>>> node.mean()
tensor(0.3139)
>>> node.mean('left')
tensor([ 1.6925,  0.1496, -0.9006])
- std(axis=None)[source]#
Returns the standard deviation of all elements in the node’s tensor. If an `axis` is specified, the standard deviation is computed over that axis. If `axis` is a sequence of axes, the reduction is over all of them.

This is not a node `Operation`, hence it returns a `torch.Tensor` instead of a `Node`.

See also torch.std().
- Parameters
axis (int, str, Axis or list[int, str or Axis], optional) – Axis or sequence of axes over which to reduce.
- Return type
torch.Tensor
Examples
>>> node = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> node.tensor
tensor([[ 0.2111, -0.9551, -0.7812],
        [ 0.2254,  0.3381, -0.2461]])
>>> node.std()
tensor(0.5567)
>>> node.std('left')
tensor([0.0101, 0.9145, 0.3784])
- norm(p=2, axis=None, keepdim=False)[source]#
Returns the norm of all elements in the node’s tensor. If an `axis` is specified, the norm is computed over that axis. If `axis` is a sequence of axes, the reduction is over all of them.

This is not a node `Operation`, hence it returns a `torch.Tensor` instead of a `Node`.

See also torch.norm().
- Parameters
p (int or float) – The order of the norm.
axis (int, str, Axis or list[int, str or Axis], optional) – Axis or sequence of axes over which to reduce.
keepdim (bool) – Boolean indicating whether the output tensor should keep the reduced dimensions.
- Return type
torch.Tensor
Examples
>>> node = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> node.tensor
tensor([[ 1.5570,  1.8441, -0.0743],
        [ 0.4572,  0.7592,  0.6356]])
>>> node.norm()
tensor(2.6495)
>>> node.norm(axis='left')
tensor([1.6227, 1.9942, 0.6399])
- numel()[source]#
Returns the total number of elements in the node’s tensor.
See also torch.numel().
- Return type
int
Examples
>>> node = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> node.numel()
6
- conj()#
Returns a view of the node’s tensor with a flipped conjugate bit. If the node has a non-complex dtype, this function returns a new node with the same tensor.
See conj in the PyTorch documentation.
- Return type
Node
Examples
>>> nodeA = tk.randn((3, 3), dtype=torch.complex64)
>>> conjA = nodeA.conj()
>>> conjA.is_conj()
True
- contract_between(node2)#
Contracts all edges shared between two nodes. Batch contraction is automatically performed when both nodes have batch edges with the same names. It can also be performed using the operator `@`.

Nodes `resultant` from this operation are called `"contract_edges"`. The node that keeps information about the `Successor` is `self`.
- Parameters
node2 (AbstractNode) – Second node of the contraction. Its non-contracted edges will appear last in the list of inherited edges of the resultant node.
- Return type
Node
Examples
>>> nodeA = tk.randn(shape=(10, 15, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(15, 7, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeB')
...
>>> _ = nodeA['right'] ^ nodeB['left']
>>> result = nodeA @ nodeB
>>> result.shape
torch.Size([100, 10, 7])
- contract_between_(node2)#
In-place version of `contract_between()`.

Following the PyTorch convention, names of functions ended with an underscore indicate in-place operations.

Nodes `resultant` from this operation are called `"contract_edges_ip"`.
- Parameters
node2 (AbstractNode) – Second node of the contraction. Its non-contracted edges will appear last in the list of inherited edges of the resultant node.
- Return type
Node
Examples
>>> nodeA = tk.randn(shape=(10, 15, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(15, 7, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeB')
...
>>> _ = nodeA['right'] ^ nodeB['left']
>>> result = nodeA.contract_between_(nodeB)
>>> result.shape
torch.Size([100, 10, 7])

`nodeA` and `nodeB` have been removed from the network.

>>> nodeA.network is None
True
>>> nodeB.network is None
True
>>> del nodeA
>>> del nodeB
- permute(axes)#
Permutes the nodes’ tensor, as well as its axes and edges to match the new shape.
See permute in the PyTorch documentation.
Nodes `resultant` from this operation are called `"permute"`. The node that keeps information about the `Successor` is `self`.

Examples
>>> node = tk.randn((2, 5, 7))
>>> result = node.permute((2, 0, 1))
>>> result.shape
torch.Size([7, 2, 5])
- permute_(axes)#
Permutes the nodes’ tensor, as well as its axes and edges to match the new shape (in-place).
Following the PyTorch convention, names of functions ended with an underscore indicate in-place operations.
See permute.
Nodes `resultant` from this operation use the same name as `node`.
- Parameters
axes (list[int, str or Axis]) – List of axes in the permuted order.
- Return type
Node

Examples
>>> node = tk.randn((2, 5, 7))
>>> node = node.permute_((2, 0, 1))
>>> node.shape
torch.Size([7, 2, 5])
- renormalize(p=2, axis=None)#
Normalizes the node with the specified norm. That is, the tensor of `node` is divided by its norm.

Different norms can be taken by specifying the argument `p`, and across different dimensions, or node axes, by specifying the argument `axis`.

See also torch.norm().
- Parameters
p (int or float) – The order of the norm.
axis (int, str, Axis or list[int, str or Axis], optional) – Axis or sequence of axes over which to normalize.
- Return type
Node
Examples
>>> nodeA = tk.randn((3, 3))
>>> renormA = nodeA.renormalize()
>>> renormA.norm()
tensor(1.)
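A per-axis sketch (assuming the norm is taken over the given axis, so the slices along the remaining axes end up with unit norm):

>>> renormA = nodeA.renormalize(axis=0)
>>> renormA.norm(axis=0).allclose(torch.ones(3))
True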
- split(node1_axes, node2_axes, mode='svd', side='left', rank=None, cum_percentage=None, cutoff=None)#
Splits one node in two via the decomposition specified in `mode`. See `split()` for a more complete explanation.

Since the node is split in two, a new edge appears connecting both nodes. The axis that corresponds to this edge has the name `"split"`.

Nodes `resultant` from this operation are called `"split"`. The node that keeps information about the `Successor` is `self`.
- Parameters
node1_axes (list[int, str or Axis]) – First set of edges, will appear as the edges of the first (left) resultant node.
node2_axes (list[int, str or Axis]) – Second set of edges, will appear as the edges of the second (right) resultant node.
mode ({"svd", "svdr", "qr", "rq"}) – Decomposition to be used.
side (str, optional) – If `mode` is “svd” or “svdr”, indicates the side to which the diagonal matrix \(S\) should be contracted. If “left”, the first resultant node’s tensor will be \(US\), and the other node’s tensor will be \(V^{\dagger}\). If “right”, their tensors will be \(U\) and \(SV^{\dagger}\), respectively.
rank (int, optional) – Number of singular values to keep.
cum_percentage (float, optional) –
Proportion that should be satisfied between the sum of all singular values kept and the total sum of all singular values.
\[\frac{\sum_{i \in \{kept\}}{s_i}}{\sum_{i \in \{all\}}{s_i}} \ge cum\_percentage\]
cutoff (float, optional) – Quantity that lower bounds singular values in order to be kept.
- Return type
tuple[Node, Node]
Examples
>>> node = tk.randn(shape=(10, 15, 100),
...                 axes_names=('left', 'right', 'batch'))
>>> node_left, node_right = node.split(['left'], ['right'],
...                                    mode='svd',
...                                    rank=5)
>>> node_left.shape
torch.Size([100, 10, 5])
>>> node_right.shape
torch.Size([100, 5, 15])
>>> node_left['split']
Edge( split_0[split] <-> split_1[split] )
- split_(node1_axes, node2_axes, mode='svd', side='left', rank=None, cum_percentage=None, cutoff=None)#
In-place version of `split()`.

Following the PyTorch convention, names of functions ended with an underscore indicate in-place operations.

Since the node is split in two, a new edge appears connecting both nodes. The axis that corresponds to this edge has the name `"split"`.

Nodes `resultant` from this operation are called `"split_ip"`.
- Parameters
node1_axes (list[int, str or Axis]) – First set of edges, will appear as the edges of the first (left) resultant node.
node2_axes (list[int, str or Axis]) – Second set of edges, will appear as the edges of the second (right) resultant node.
mode ({"svd", "svdr", "qr", "rq"}) – Decomposition to be used.
side (str, optional) – If `mode` is “svd” or “svdr”, indicates the side to which the diagonal matrix \(S\) should be contracted. If “left”, the first resultant node’s tensor will be \(US\), and the other node’s tensor will be \(V^{\dagger}\). If “right”, their tensors will be \(U\) and \(SV^{\dagger}\), respectively.
rank (int, optional) – Number of singular values to keep.
cum_percentage (float, optional) –
Proportion that should be satisfied between the sum of all singular values kept and the total sum of all singular values.
\[\frac{\sum_{i \in \{kept\}}{s_i}}{\sum_{i \in \{all\}}{s_i}} \ge cum\_percentage\]
cutoff (float, optional) – Quantity that lower bounds singular values in order to be kept.
- Return type
tuple[Node, Node]
Examples
>>> node = tk.randn(shape=(10, 15, 100),
...                 axes_names=('left', 'right', 'batch'))
>>> node_left, node_right = node.split_(['left'], ['right'],
...                                     mode='svd',
...                                     rank=5)
>>> node_left.shape
torch.Size([100, 10, 5])
>>> node_right.shape
torch.Size([100, 5, 15])
>>> node_left['split']
Edge( split_ip_0[split] <-> split_ip_1[split] )

`node` has been removed from the network, but it still exists until it is deleted.

>>> node.network is None
True
>>> del node
Node#
- class tensorkrowch.Node(shape=None, axes_names=None, name=None, network=None, data=False, virtual=False, override_node=False, tensor=None, edges=None, override_edges=False, node1_list=None, init_method=None, device=None, dtype=None, **kwargs)[source]#
Base class for non-trainable nodes. Should be subclassed by any class of nodes that are not intended to be trained (e.g. `StackNode`).

Can be used for fixed nodes of the `TensorNetwork`, or for intermediate nodes that result from an `Operation` between nodes.

All 4 types of nodes (`leaf`, `data`, `virtual` and `resultant`) can be `Node`. In fact, `data` and `resultant` nodes can only be of class `Node`, since they are not intended to be trainable. To learn more about these 4 types of nodes, see `AbstractNode`.

For a complete list of properties and methods, see also `AbstractNode`.
- Parameters
shape (list[int], tuple[int] or torch.Size, optional) – Node’s shape, that is, the shape of its tensor. If `shape` and `init_method` are provided, a tensor will be made for the node. Otherwise, `tensor` would be required.
axes_names (list[str] or tuple[str], optional) – Sequence of names for each of the node’s axes. Names are used to access the edge that is attached to the node in a certain axis. Hence, they should all be distinct. They cannot contain blank spaces or special characters. By default, axes names will be `"axis_0"`, …, `"axis_n"`, being `n` the number of axes. If an axis’ name contains the word `"batch"`, it will define a batch edge. The word `"stack"` cannot be used, since it is reserved for the stack edge of `StackNode`.
name (str, optional) – Node’s name, used to access the node from the `TensorNetwork` where it belongs. It cannot contain blank spaces. By default, it is the name of the class (e.g. `"node"`, `"paramnode"`).
network (TensorNetwork, optional) – Tensor network where the node should belong. If `None`, a new tensor network will be created to contain the node.
data (bool) – Boolean indicating if the node is a `data` node.
virtual (bool) – Boolean indicating if the node is a `virtual` node.
override_node (bool) – Boolean indicating whether the node should override (`True`) another node in the network that has the same name (e.g. if a node is parameterized, it would be required that a new `ParamNode` replaces the non-parameterized node in the network).
tensor (torch.Tensor, optional) – Tensor that is to be stored in the node. If `None`, `shape` and `init_method` will be required.
edges (list[Edge], optional) – List of edges that are to be attached to the node. This can be used in case the node inherits its edges from other node(s), like results from `Operations`.
override_edges (bool) – Boolean indicating whether the provided `edges` should be overridden (`True`) when reattached (e.g. if a node is parameterized, it would be required that the new `ParamNode`’s edges are indeed connected to it, instead of to the original non-parameterized node).
node1_list (list[bool], optional) – If `edges` are provided, the list of `node1` attributes of each edge should also be provided.
init_method ({"zeros", "ones", "copy", "rand", "randn"}, optional) – Initialization method.
device (torch.device, optional) – Device where to initialize the tensor if `init_method` is provided.
dtype (torch.dtype, optional) – Dtype of the tensor if `init_method` is provided.
kwargs (float) – Keyword arguments for the different initialization methods. See `AbstractNode.make_tensor()`.
Examples
>>> node = tk.Node(shape=(2, 5, 2),
...                axes_names=('left', 'input', 'right'),
...                name='my_node',
...                init_method='randn',
...                mean=0.,
...                std=1.)
>>> node
Node(
    name: my_node
    tensor:
        tensor([[[-1.2517, -1.8147],
                 [-0.7997, -0.0440],
                 [-0.2808,  0.3508],
                 [-1.2380,  0.8859],
                 [-0.3585,  0.8815]],
                [[-0.2898, -2.2775],
                 [ 1.2856, -0.3222],
                 [-0.8911, -0.4216],
                 [ 0.0086,  0.2449],
                 [-2.1998, -1.6295]]])
    axes: [left
           input
           right]
    edges: [my_node[left] <-> None
            my_node[input] <-> None
            my_node[right] <-> None])
Also, one can use one of the Initializers to simplify:
>>> node = tk.randn((2, 5, 2))
>>> node
Node(
    name: node
    tensor:
        tensor([[[ 0.6545, -0.0445],
                 [-0.9265, -0.2730],
                 [-0.5069, -0.6524],
                 [-0.8227, -1.1211],
                 [ 0.2390,  0.9432]],
                [[ 0.8633,  0.4402],
                 [-0.6982,  0.4461],
                 [-0.0633, -0.9320],
                 [ 1.6023,  0.5406],
                 [ 0.3489, -0.3088]]])
    axes: [axis_0
           axis_1
           axis_2]
    edges: [node[axis_0] <-> None
            node[axis_1] <-> None
            node[axis_2] <-> None])
- parameterize(set_param=True)[source]#
Replaces the node with a parameterized version of it, that is, turns a fixed `Node` into a trainable `ParamNode`.

Since the node is replaced, it will be completely removed from the network, and its neighbours will point to the new parameterized node.
- Parameters
set_param (bool) – Boolean indicating whether the node should be parameterized (`True`). Otherwise (`False`), the non-parameterized node itself will be returned.
- Returns
The original node or a parameterized version of it.
- Return type
Node or ParamNode
Examples
>>> nodeA = tk.randn((2, 3))
>>> nodeB = tk.randn((3, 4))
>>> _ = nodeA[1] ^ nodeB[0]
>>> paramnodeA = nodeA.parameterize()
>>> nodeB.neighbours() == [paramnodeA]
True
>>> isinstance(paramnodeA.tensor, torch.nn.Parameter)
True

`nodeA` still exists and has an edge pointing to `nodeB`, but the latter does not “see” the former. It should be deleted.

>>> del nodeA

To overcome this issue, one should override `nodeA`:

>>> nodeA = nodeA.parameterize()
- copy(share_tensor=False)[source]#
Returns a copy of the node. That is, returns a node whose tensor is a copy of the original, whose edges are directly inherited (these are not copies, but the exact same edges) and whose name is extended with the suffix `"_copy"`.

To create a copy that has its own (non-inherited) edges, one can use `reattach_edges()` afterwards.
- Parameters
share_tensor (bool) – Boolean indicating whether the copied node should store its own copy of the tensor (`False`) or share it with the original node (`True`), storing a reference to it.
- Return type
Node
Examples
>>> node = tk.randn(shape=(2, 3), name='node')
>>> copy = node.copy()
>>> node.tensor_address() != copy.tensor_address()
True
>>> torch.equal(node.tensor, copy.tensor)
True

If the tensor is shared:

>>> copy = node.copy(True)
>>> node.tensor_address() == copy.tensor_address()
True
>>> torch.equal(node.tensor, copy.tensor)
True
- change_type(leaf=False, data=False, virtual=False)[source]#
Changes node type, only if node is not a resultant node.
- Parameters
leaf (bool) – Boolean indicating if the new node type is `leaf`.
data (bool) – Boolean indicating if the new node type is `data`.
virtual (bool) – Boolean indicating if the new node type is `virtual`.
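Examples

A minimal sketch of re-typing a node (assuming the flags behave like the homonymous constructor arguments):

>>> node = tk.randn(shape=(2, 3))
>>> node.is_leaf()
True
>>> node.change_type(virtual=True)
>>> node.is_virtual()
True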
ParamNode#
- class tensorkrowch.ParamNode(shape=None, axes_names=None, name=None, network=None, virtual=False, override_node=False, tensor=None, edges=None, override_edges=False, node1_list=None, init_method=None, device=None, dtype=None, **kwargs)[source]#
Class for trainable nodes. Should be subclassed by any class of nodes that are intended to be trained (e.g. `ParamStackNode`).

Should be used for the initial nodes conforming the `TensorNetwork`, if it is going to be trained. When operating these initial nodes, the resultant nodes will be non-parameterized (e.g. `Node`, `StackNode`).

The main difference with `Nodes` is that `ParamNodes` have `torch.nn.Parameter` tensors instead of `torch.Tensor`. Therefore, a `ParamNode` is a sort of parameter that is attached to the `TensorNetwork` (which is itself a `torch.nn.Module`). That is, the list of parameters of the tensor network module contains the tensors of all `ParamNodes`.

`ParamNodes` can only be `leaf` and `virtual` (e.g. a `virtual` node used in a uniform `TensorNetwork` to store the tensor that is shared by all the trainable nodes must also be a `ParamNode`, since it stores a `torch.nn.Parameter`).

For a complete list of properties and methods, see also `AbstractNode`.
- Parameters
shape (list[int], tuple[int] or torch.Size, optional) – Node’s shape, that is, the shape of its tensor. If `shape` and `init_method` are provided, a tensor will be made for the node. Otherwise, `tensor` would be required.
axes_names (list[str] or tuple[str], optional) – Sequence of names for each of the node’s axes. Names are used to access the edge that is attached to the node in a certain axis. Hence, they should all be distinct. They cannot contain blank spaces or special characters. By default, axes names will be `"axis_0"`, …, `"axis_n"`, being `n` the number of axes. If an axis’ name contains the word `"batch"`, it will define a batch edge. The word `"stack"` cannot be used, since it is reserved for the stack edge of `StackNode`.
name (str, optional) – Node’s name, used to access the node from the `TensorNetwork` where it belongs. It cannot contain blank spaces. By default, it is the name of the class (e.g. `"node"`, `"paramnode"`).
network (TensorNetwork, optional) – Tensor network where the node should belong. If `None`, a new tensor network will be created to contain the node.
virtual (bool) – Boolean indicating if the node is a `virtual` node.
override_node (bool) – Boolean indicating whether the node should override (`True`) another node in the network that has the same name (e.g. if a node is parameterized, it would be required that a new `ParamNode` replaces the non-parameterized node in the network).
tensor (torch.Tensor, optional) – Tensor that is to be stored in the node. If `None`, `shape` and `init_method` will be required.
edges (list[Edge], optional) – List of edges that are to be attached to the node. This can be used in case the node inherits its edges from other node(s), like results from `Operations`.
override_edges (bool) – Boolean indicating whether the provided `edges` should be overridden (`True`) when reattached (e.g. if a node is parameterized, it would be required that the new `ParamNode`’s edges are indeed connected to it, instead of to the original non-parameterized node).
node1_list (list[bool], optional) – If `edges` are provided, the list of `node1` attributes of each edge should also be provided.
init_method ({"zeros", "ones", "copy", "rand", "randn"}, optional) – Initialization method.
device (torch.device, optional) – Device where to initialize the tensor if `init_method` is provided.
dtype (torch.dtype, optional) – Dtype of the tensor if `init_method` is provided.
kwargs (float) – Keyword arguments for the different initialization methods. See `AbstractNode.make_tensor()`.
Examples
>>> node = tk.ParamNode(shape=(2, 5, 2),
...                     axes_names=('left', 'input', 'right'),
...                     name='my_paramnode',
...                     init_method='randn',
...                     mean=0.,
...                     std=1.)
>>> node
ParamNode(
    name: my_paramnode
    tensor:
        Parameter containing:
        tensor([[[ 1.8090, -0.1371],
                 [-0.0501, -1.0371],
                 [ 1.4588, -0.8361],
                 [-0.4974, -1.9957],
                 [ 0.3760, -1.0412]],
                [[ 0.3393, -0.2503],
                 [ 1.7752, -0.0188],
                 [-0.9561, -0.0806],
                 [-1.0465, -0.5731],
                 [ 1.5021,  0.4181]]], requires_grad=True)
    axes: [left
           input
           right]
    edges: [my_paramnode[left] <-> None
            my_paramnode[input] <-> None
            my_paramnode[right] <-> None])
Also, one can use one of the Initializers to simplify:
>>> node = tk.randn((2, 5, 2),
...                 param_node=True)
>>> node
ParamNode(
    name: paramnode
    tensor:
        Parameter containing:
        tensor([[[-0.8442,  1.4184],
                 [ 0.4431, -1.4385],
                 [-0.5161, -0.6492],
                 [ 0.2095,  0.5760],
                 [-0.9925, -1.5797]],
                [[-0.8649, -0.5401],
                 [-0.1091,  1.1654],
                 [-0.3821, -0.2477],
                 [-0.7688, -2.4731],
                 [-0.0234,  0.9618]]], requires_grad=True)
    axes: [axis_0
           axis_1
           axis_2]
    edges: [paramnode[axis_0] <-> None
            paramnode[axis_1] <-> None
            paramnode[axis_2] <-> None])
- property grad#
Returns gradient of the param-node’s tensor.
See also torch.Tensor.grad
- Return type
torch.Tensor or None
Examples
>>> paramnode = tk.randn((2, 3), param_node=True)
>>> paramnode.tensor
Parameter containing:
tensor([[-0.3340,  0.6811, -0.2866],
        [ 1.3371,  1.4761,  0.6551]], requires_grad=True)
>>> paramnode.sum().backward()
>>> paramnode.grad
tensor([[1., 1., 1.],
        [1., 1., 1.]])
- parameterize(set_param=True)[source]#
Replaces the param-node with a de-parameterized version of it, that is, turns a `ParamNode` into a non-trainable, fixed `Node`.

Since the param-node is replaced, it will be completely removed from the network, and its neighbours will point to the new node.
- Parameters
set_param (bool) – Boolean indicating whether the node should stay parameterized (`True`), thus returning the param-node itself. Otherwise (`False`), the param-node will be de-parameterized.
- Returns
The original node or a de-parameterized version of it.
- Return type
Node or ParamNode
Examples
>>> paramnodeA = tk.randn((2, 3), param_node=True)
>>> paramnodeB = tk.randn((3, 4), param_node=True)
>>> _ = paramnodeA[1] ^ paramnodeB[0]
>>> nodeA = paramnodeA.parameterize(False)
>>> paramnodeB.neighbours() == [nodeA]
True
>>> isinstance(nodeA.tensor, torch.nn.Parameter)
False

`paramnodeA` still exists and has an edge pointing to `paramnodeB`, but the latter does not “see” the former. It should be deleted.

>>> del paramnodeA

To overcome this issue, one should override `paramnodeA`:

>>> paramnodeA = paramnodeA.parameterize()
- copy(share_tensor=False)[source]#
Returns a copy of the param-node. That is, returns a param-node whose tensor is a copy of the original, whose edges are directly inherited (these are not copies, but the exact same edges) and whose name is extended with the suffix `"_copy"`.

To create a copy that has its own (non-inherited) edges, one can use `reattach_edges()` afterwards.
- Parameters
share_tensor (bool) – Boolean indicating whether the copied param-node should store its own copy of the tensor (`False`) or share it with the original param-node (`True`), storing a reference to it.
- Return type
ParamNode
Examples
>>> paramnode = tk.randn(shape=(2, 3), name='node', param_node=True)
>>> copy = paramnode.copy()
>>> paramnode.tensor_address() != copy.tensor_address()
True
>>> torch.equal(paramnode.tensor, copy.tensor)
True

If the tensor is shared:

>>> copy = paramnode.copy(True)
>>> paramnode.tensor_address() == copy.tensor_address()
True
>>> torch.equal(paramnode.tensor, copy.tensor)
True
StackNode#
- class tensorkrowch.StackNode(nodes=None, axes_names=None, name=None, network=None, override_node=False, tensor=None, edges=None, node1_list=None)[source]#
Class for stacked nodes. `StackNodes` are nodes that store the information of a list of nodes that are stacked via `stack()`, although they can also be instantiated directly. To do so, there are two options:

- Provide a sequence of nodes: if `nodes` are provided, their tensors will be stacked and stored in the `StackNode`. It is necessary that all nodes are of the same class (`Node` or `ParamNode`), have the same rank (although the dimension of each leg can be different for different nodes, in which case smaller tensors are extended with 0’s to match the dimensions of the largest tensor in the stack), have the same axes names (to ensure only the “same kind” of nodes are stacked), belong to the same network and have edges of the same type in each axis (`Edge` or `ParamEdge`).
- Provide a stacked tensor: if the stacked `tensor` is provided, it is also necessary to specify the `axes_names`, `network`, `edges` and `node1_list`.

`StackNodes` have an additional axis for the new stack dimension, which is a batch edge. This way, some contractions can be computed in parallel by first stacking two sequences of nodes (connected pair-wise), performing the batch contraction and finally unbinding the `StackNodes` to retrieve just one sequence of nodes.

For the rest of the axes, a list of the edges corresponding to all nodes in the stack is stored, so that, when `unbinding` the stack, it can be inferred to which nodes the unbound nodes have to be connected.
- Parameters
nodes (list[AbstractNode] or tuple[AbstractNode], optional) – Sequence of nodes that are to be stacked. They should all be of the same class (`Node` or `ParamNode`), have the same rank, the same axes names and belong to the same network. They do not need to have equal shapes.
axes_names (list[str] or tuple[str], optional) – Sequence of names for each of the node’s axes. Names are used to access the edge that is attached to the node in a certain axis. Hence, they should all be distinct. Necessary if `nodes` are not provided.
name (str, optional) – Node’s name, used to access the node from the `TensorNetwork` where it belongs. It cannot contain blank spaces.
network (TensorNetwork, optional) – Tensor network where the node should belong. Necessary if `nodes` are not provided.
override_node (bool, optional) – Boolean indicating whether the node should override (`True`) another node in the network that has the same name (e.g. if a node is parameterized, it would be required that a new `ParamNode` replaces the non-parameterized node in the network).
tensor (torch.Tensor, optional) – Tensor that is to be stored in the node. Necessary if `nodes` are not provided.
edges (list[Edge], optional) – List of edges that are to be attached to the node. Necessary if `nodes` are not provided.
node1_list (list[bool], optional) – If `edges` are provided, the list of `node1` attributes of each edge should also be provided. Necessary if `nodes` are not provided.
Examples
>>> net = tk.TensorNetwork()
>>> nodes = [tk.randn(shape=(2, 4, 2),
...                   axes_names=('left', 'input', 'right'),
...                   network=net)
...          for _ in range(10)]
>>> data = [tk.randn(shape=(4,),
...                  axes_names=('feature',),
...                  network=net)
...         for _ in range(10)]
...
>>> for i in range(10):
...     _ = nodes[i]['input'] ^ data[i]['feature']
...
>>> stack_nodes = tk.stack(nodes)
>>> stack_data = tk.stack(data)
...
>>> # It is necessary to re-connect stacks
>>> _ = stack_nodes['input'] ^ stack_data['feature']
>>> result = tk.unbind(stack_nodes @ stack_data)
>>> print(result[0].name)
unbind_0
>>> result[0].axes
[Axis( left (0) ), Axis( right (1) )]
>>> result[0].shape
torch.Size([2, 2])
- property edges_dict#
Returns a dictionary where the keys are the axes. For each axis, the value is the list of all the edges (one from each node) that correspond to that axis.
- property node1_lists_dict#
Returns a dictionary where the keys are the axes. For each axis, the value is the list with the `node1` attribute of that axis for all nodes.
- reconnect(other)[source]#
Re-connects the `StackNode` to another `(Param)StackNode`, in the axes where the original stacked nodes were already connected.
- unbind()#
Unbinds a `StackNode` or `ParamStackNode`, where the first dimension is assumed to be the stack dimension.

If `auto_unbind()` is set to `False`, each resultant node will store its own tensor. Otherwise, they will have only a reference to the corresponding slice of the `(Param)StackNode`.

See `TensorNetwork` to learn how the `auto_unbind` mode affects the computation of `unbind()`.

Nodes `resultant` from this operation are called `"unbind"`. The node that keeps information about the `Successor` is `self`.
- Return type
list[Node]
Examples
>>> net = tk.TensorNetwork()
>>> nodes = [tk.randn(shape=(2, 4, 2),
...                   axes_names=('left', 'input', 'right'),
...                   network=net)
...          for _ in range(10)]
>>> data = [tk.randn(shape=(4,),
...                  axes_names=('feature',),
...                  network=net)
...         for _ in range(10)]
...
>>> for i in range(10):
...     _ = nodes[i]['input'] ^ data[i]['feature']
...
>>> stack_nodes = tk.stack(nodes)
>>> stack_data = tk.stack(data)
...
>>> # It is necessary to re-connect stacks
>>> _ = stack_nodes['input'] ^ stack_data['feature']
>>> result = stack_nodes @ stack_data
>>> result = result.unbind()
>>> print(result[0].name)
unbind_0
>>> result[0].axes
[Axis( left (0) ), Axis( right (1) )]
>>> result[0].shape
torch.Size([2, 2])
ParamStackNode#
- class tensorkrowch.ParamStackNode(nodes, name=None, virtual=False, override_node=False)[source]#
Class for parametric stacked nodes. They are essentially the same as `StackNodes`, but they are `ParamNodes`.

They are used to optimize memory usage and save some time when the first operation that occurs to param-nodes in a contraction (that might be computed several times during training) is `stack()`. If this is the case, the param-nodes no longer store their own tensors, but rather they make reference to a slice of a greater `ParamStackNode` (if the `auto_stack` attribute of the `TensorNetwork` is set to `True`). Hence, that first `stack()` is never actually computed.

The `ParamStackNode` that results from this process has the name `"virtual_result_stack"`, which contains the reserved name `"virtual_result"`, as explained here. This node stores the tensor from which all the stacked `ParamNodes` just take one slice.

This behaviour occurs when stacking param-nodes via `stack()`, not when instantiating `ParamStackNode` manually.

`ParamStackNodes` can only be instantiated by providing a sequence of nodes.
- Parameters
nodes (list[AbstractNode] or tuple[AbstractNode]) – Sequence of nodes that are to be stacked. They should all be of the same class (`Node` or `ParamNode`), have the same rank, the same axes names and belong to the same network. They do not need to have equal shapes.
name (str, optional) – Node’s name, used to access the node from the `TensorNetwork` where it belongs. It cannot contain blank spaces.
virtual (bool, optional) – Boolean indicating if the node is a `virtual` node. Since it will be used mainly for the case described here, the node will be `virtual`; it will not be an effective part of the tensor network.
override_node (bool, optional) – Boolean indicating whether the node should override (`True`) another node in the network that has the same name (e.g. if a node is parameterized, it would be required that a new `ParamNode` replaces the non-parameterized node in the network).
Examples
>>> net = tk.TensorNetwork()
>>> net.auto_stack = True
>>> nodes = [tk.randn(shape=(2, 4, 2),
...                   axes_names=('left', 'input', 'right'),
...                   network=net,
...                   param_node=True)
...          for _ in range(10)]
>>> data = [tk.randn(shape=(4,),
...                  axes_names=('feature',),
...                  network=net)
...         for _ in range(10)]
...
>>> for i in range(10):
...     _ = nodes[i]['input'] ^ data[i]['feature']
...
>>> stack_nodes = tk.stack(nodes)
>>> stack_nodes.name = 'my_stack'
>>> print(nodes[0].tensor_address())
my_stack
>>> stack_data = tk.stack(data)
...
>>> # It is necessary to re-connect stacks
>>> _ = stack_nodes['input'] ^ stack_data['feature']
>>> result = tk.unbind(stack_nodes @ stack_data)
>>> print(result[0].name)
unbind_0
>>> print(result[0].axes)
[Axis( left (0) ), Axis( right (1) )]
>>> print(result[0].shape)
torch.Size([2, 2])
- property edges_dict#
Returns a dictionary where the keys are the axes. For each axis, the value is the list of all the edges (one from each node) that correspond to that axis.
- property node1_lists_dict#
Returns a dictionary where the keys are the axes. For each axis, the value is the list with the `node1` attribute of that axis for all nodes.
- reconnect(other)[source]#
Re-connects the `StackNode` to another `(Param)StackNode`, in the axes where the original stacked nodes were already connected.
- unbind()#
Unbinds a `StackNode` or `ParamStackNode`, where the first dimension is assumed to be the stack dimension.

If `auto_unbind()` is set to `False`, each resultant node will store its own tensor. Otherwise, they will have only a reference to the corresponding slice of the `(Param)StackNode`.

See `TensorNetwork` to learn how the `auto_unbind` mode affects the computation of `unbind()`.

Nodes `resultant` from this operation are called `"unbind"`. The node that keeps information about the `Successor` is `self`.
- Return type
list[Node]
Examples
>>> net = tk.TensorNetwork()
>>> nodes = [tk.randn(shape=(2, 4, 2),
...                   axes_names=('left', 'input', 'right'),
...                   network=net)
...          for _ in range(10)]
>>> data = [tk.randn(shape=(4,),
...                  axes_names=('feature',),
...                  network=net)
...         for _ in range(10)]
...
>>> for i in range(10):
...     _ = nodes[i]['input'] ^ data[i]['feature']
...
>>> stack_nodes = tk.stack(nodes)
>>> stack_data = tk.stack(data)
...
>>> # It is necessary to re-connect stacks
>>> _ = stack_nodes['input'] ^ stack_data['feature']
>>> result = stack_nodes @ stack_data
>>> result = result.unbind()
>>> print(result[0].name)
unbind_0
>>> result[0].axes
[Axis( left (0) ), Axis( right (1) )]
>>> result[0].shape
torch.Size([2, 2])
Edges#
Edge#
- class tensorkrowch.Edge(node1, axis1, node2=None, axis2=None)[source]#
Base class for edges. Should be subclassed by any new class of edges.
An edge is nothing more than an object that wraps references to the nodes it connects. Thus, it stores information like the nodes it connects, the corresponding nodes’ axes it is attached to, whether it is dangling or batch, its size, etc.
Above all, its importance lies in the fact that edges make it possible to connect nodes, forming any possible graph, and to easily perform `Operations` like contracting and splitting nodes.

Furthermore, edges have specific operations like `contract()` or `svd()` (and their variations), as well as in-place versions of them (`contract_()`, `svd_()`, etc.) that allow in-place modification of the `TensorNetwork`.
- Parameters
node1 (AbstractNode) – First node to which the edge is connected.
axis1 (int, str or Axis) – Axis of node1 where the edge is attached.

node2 (AbstractNode, optional) – Second node to which the edge is connected. If None, the edge will be dangling.

axis2 (int, str, Axis, optional) – Axis of node2 where the edge is attached.
Examples
>>> nodeA = tk.randn((2, 3))
>>> nodeB = tk.randn((3, 4))
>>> _ = nodeA[1] ^ nodeB[0]
>>> nodeA[0]
Edge( node_0[axis_0] <-> None ) (Dangling Edge)

>>> nodeA[1]
Edge( node_0[axis_1] <-> node_1[axis_0] )

>>> nodeB[1]
Edge( node_1[axis_1] <-> None ) (Dangling Edge)
- property node1#
Returns node1 of the edge.

- property node2#
Returns node2 of the edge. If the edge is dangling, it is None.

- property nodes#
Returns a list with node1 and node2.

- property axis1#
Returns axis where the edge is attached to node1.

- property axis2#
Returns axis where the edge is attached to node2. If the edge is dangling, it is None.

- property axes#
Returns a list of axes where the edge is attached to node1 and node2, respectively.
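The properties above can be illustrated with a short example (dangling edges keep node2 as None):

>>> nodeA = tk.randn((2, 3))
>>> nodeB = tk.randn((3, 4))
>>> _ = nodeA[1] ^ nodeB[0]
>>> edge = nodeA[1]
>>> edge.node1 is nodeA
True
>>> edge.node2 is nodeB
True
>>> edge.nodes == [nodeA, nodeB]
True
>>> nodeA[0].node2 is None  # dangling edge
True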
- property name#
Returns the edge's name, which is formed from the corresponding nodes' and axes' names.
Examples
>>> nodeA = tk.Node(shape=(2, 3),
...                 name='nodeA',
...                 axes_names=['left', 'right'])
>>> edge = nodeA['right']
>>> print(edge.name)
nodeA[right] <-> None

>>> nodeB = tk.Node(shape=(3, 4),
...                 name='nodeB',
...                 axes_names=['left', 'right'])
>>> _ = new_edge = nodeA['right'] ^ nodeB['left']
>>> print(new_edge.name)
nodeA[right] <-> nodeB[left]
- change_size(size)[source]#
Changes the size of the edge, thus changing the size of the tensors of node1 and node2 at the corresponding axes. If the new size is smaller, the tensor will be cropped; if larger, the tensor will be expanded with zeros. In both cases, the process (cropping/expanding) occurs at the "right", "bottom", "back", etc. of each dimension.

- Parameters
size (int) – New size of the edge.
Examples
>>> nodeA = tk.ones((2, 3))
>>> nodeB = tk.ones((3, 4))
>>> _ = edge = nodeA[1] ^ nodeB[0]
>>> edge.size()
3

>>> edge.change_size(4)
>>> nodeA.tensor
tensor([[1., 1., 1., 0.],
        [1., 1., 1., 0.]])

>>> nodeB.tensor
tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [0., 0., 0., 0.]])

>>> edge.size()
4

>>> edge.change_size(2)
>>> nodeA.tensor
tensor([[1., 1.],
        [1., 1.]])

>>> nodeB.tensor
tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.]])

>>> edge.size()
2
- copy()[source]#
Returns a copy of the edge, that is, a new edge referencing the same nodes at the same axes.
Examples
>>> nodeA = tk.randn((2, 3))
>>> nodeB = tk.randn((3, 4))
>>> _ = edge = nodeA[1] ^ nodeB[0]
>>> copy = edge.copy()
>>> copy != edge
True

>>> copy.is_attached_to(nodeA)
True

>>> copy.is_attached_to(nodeB)
True
- connect(other)[source]#
Connects a dangling edge to another dangling edge. It is necessary that both edges have the same size so that contractions along that edge can be computed.
Note that this connects edges from leaf (or data, virtual) nodes, but never from resultant nodes. If one tries to connect one of the inherited edges of a resultant node, the new connected edge will be attached to the original leaf nodes from which the resultant node inherited its edges. Hence, the resultant node will not "see" the connection until the TensorNetwork is reset().

If the nodes that are being connected come from different networks, node2 (and its connected component) will be moved to node1's network. See also move_to_network().

Examples
To connect two edges, the overloaded operator ^ can also be used.

>>> nodeA = tk.Node(shape=(2, 3),
...                 name='nodeA',
...                 axes_names=('left', 'right'))
>>> nodeB = tk.Node(shape=(3, 4),
...                 name='nodeB',
...                 axes_names=('left', 'right'))
>>> _ = new_edge = nodeA['right'] ^ nodeB['left']  # Same as .connect()
>>> print(new_edge.name)
nodeA[right] <-> nodeB[left]
- disconnect()[source]#
Disconnects a connected edge, that is, the connected edge is split into two dangling edges, one for each node.
Examples
To disconnect an edge, the overloaded operator | can also be used.

>>> nodeA = tk.Node(shape=(2, 3),
...                 name='nodeA',
...                 axes_names=('left', 'right'))
>>> nodeB = tk.Node(shape=(3, 4),
...                 name='nodeB',
...                 axes_names=('left', 'right'))
>>> _ = new_edge = nodeA['right'] ^ nodeB['left']
>>> new_edgeA, new_edgeB = new_edge | new_edge  # Same as .disconnect()
>>> print(new_edgeA.name)
nodeA[right] <-> None

>>> print(new_edgeB.name)
nodeB[left] <-> None
- contract()#
Contracts the nodes that are connected through the edge.
This only works if the nodes connected through the edge are leaf nodes. Otherwise, this will perform the contraction between the leaf nodes that were connected through this edge.

Nodes resultant from this operation are called "contract_edges". The node that keeps information about the Successor is self.node1.

- Return type
Node
Examples
>>> nodeA = tk.randn(shape=(10, 15, 20),
...                  axes_names=('one', 'two', 'three'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(10, 15, 20),
...                  axes_names=('one', 'two', 'three'),
...                  name='nodeB')
...
>>> _ = nodeA['one'] ^ nodeB['one']
>>> _ = nodeA['two'] ^ nodeB['two']
>>> _ = nodeA['three'] ^ nodeB['three']
>>> result = nodeA['one'].contract()
>>> result.shape
torch.Size([15, 20, 15, 20])
- contract_()#
In-place version of contract().

Following the PyTorch convention, names of functions ended with an underscore indicate in-place operations.

Nodes resultant from this operation are called "contract_edges_ip".

- Return type
Node
Examples
>>> nodeA = tk.randn(shape=(10, 15, 20),
...                  axes_names=('one', 'two', 'three'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(10, 15, 20),
...                  axes_names=('one', 'two', 'three'),
...                  name='nodeB')
...
>>> _ = nodeA['one'] ^ nodeB['one']
>>> _ = nodeA['two'] ^ nodeB['two']
>>> _ = nodeA['three'] ^ nodeB['three']
>>> result = nodeA['one'].contract_()
>>> result.shape
torch.Size([15, 20, 15, 20])

nodeA and nodeB have been removed from the network.

>>> nodeA.network is None
True

>>> nodeB.network is None
True

>>> del nodeA
>>> del nodeB
- qr()#
Contracts an edge via contract() and splits it via split() using mode = "qr". See split() for a more complete explanation.

This only works if the nodes connected through the edge are leaf nodes. Otherwise, this will perform the contraction between the leaf nodes that were connected through this edge.

Examples
>>> nodeA = tk.randn(shape=(10, 15, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(15, 20, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeB')
...
>>> new_edge = nodeA['right'] ^ nodeB['left']
>>> new_nodeA, new_nodeB = new_edge.qr()
...
>>> new_nodeA.shape
torch.Size([10, 10, 100])

>>> new_nodeB.shape
torch.Size([10, 20, 100])

>>> print(new_nodeA.axes_names)
['left', 'right', 'batch']

>>> print(new_nodeB.axes_names)
['left', 'right', 'batch']

Original nodes still exist in the network.

>>> assert nodeA.network == new_nodeA.network
>>> assert nodeB.network == new_nodeB.network
- qr_()#
In-place version of qr().

Contracts an edge in-place via contract_() and splits it in-place via split_() using mode = "qr". See split() for a more complete explanation.

Following the PyTorch convention, names of functions ended with an underscore indicate in-place operations.

Nodes resultant from this operation use the same names as the original nodes connected by self.

Examples
>>> nodeA = tk.randn(shape=(10, 15, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(15, 20, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeB')
...
>>> new_edge = nodeA['right'] ^ nodeB['left']
>>> nodeA, nodeB = new_edge.qr_()
...
>>> nodeA.shape
torch.Size([10, 10, 100])

>>> nodeB.shape
torch.Size([10, 20, 100])

>>> print(nodeA.axes_names)
['left', 'right', 'batch']

>>> print(nodeB.axes_names)
['left', 'right', 'batch']
- rq()#
Contracts an edge via contract() and splits it via split() using mode = "rq". See split() for a more complete explanation.

This only works if the nodes connected through the edge are leaf nodes. Otherwise, this will perform the contraction between the leaf nodes that were connected through this edge.

Examples
>>> nodeA = tk.randn(shape=(10, 15, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(15, 20, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeB')
...
>>> new_edge = nodeA['right'] ^ nodeB['left']
>>> new_nodeA, new_nodeB = new_edge.rq()
...
>>> new_nodeA.shape
torch.Size([10, 10, 100])

>>> new_nodeB.shape
torch.Size([10, 20, 100])

>>> print(new_nodeA.axes_names)
['left', 'right', 'batch']

>>> print(new_nodeB.axes_names)
['left', 'right', 'batch']

Original nodes still exist in the network.

>>> assert nodeA.network == new_nodeA.network
>>> assert nodeB.network == new_nodeB.network
- rq_()#
In-place version of rq().

Contracts an edge in-place via contract_() and splits it in-place via split_() using mode = "rq". See split() for a more complete explanation.

Following the PyTorch convention, names of functions ended with an underscore indicate in-place operations.

Nodes resultant from this operation use the same names as the original nodes connected by self.

Examples
>>> nodeA = tk.randn(shape=(10, 15, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(15, 20, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeB')
...
>>> new_edge = nodeA['right'] ^ nodeB['left']
>>> nodeA, nodeB = tk.rq_(new_edge)
...
>>> nodeA.shape
torch.Size([10, 10, 100])

>>> nodeB.shape
torch.Size([10, 20, 100])

>>> print(nodeA.axes_names)
['left', 'right', 'batch']

>>> print(nodeB.axes_names)
['left', 'right', 'batch']
- svd(side='left', rank=None, cum_percentage=None, cutoff=None)#
Contracts an edge via contract() and splits it via split() using mode = "svd". See split() for a more complete explanation.

This only works if the nodes connected through the edge are leaf nodes. Otherwise, this will perform the contraction between the leaf nodes that were connected through this edge.

- Parameters
side (str, optional) – Indicates the side to which the diagonal matrix \(S\) should be contracted. If “left”, the first resultant node’s tensor will be \(US\), and the other node’s tensor will be \(V^{\dagger}\). If “right”, their tensors will be \(U\) and \(SV^{\dagger}\), respectively.
rank (int, optional) – Number of singular values to keep.
cum_percentage (float, optional) –
Proportion that should be satisfied between the sum of all singular values kept and the total sum of all singular values.
\[\frac{\sum_{i \in \{kept\}}{s_i}}{\sum_{i \in \{all\}}{s_i}} \ge cum\_percentage\]

cutoff (float, optional) – Quantity that lower bounds singular values in order to be kept.
- Return type
tuple[Node, Node]
Examples
>>> nodeA = tk.randn(shape=(10, 15, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(15, 20, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeB')
...
>>> new_edge = nodeA['right'] ^ nodeB['left']
>>> new_nodeA, new_nodeB = new_edge.svd(rank=7)
...
>>> new_nodeA.shape
torch.Size([10, 7, 100])

>>> new_nodeB.shape
torch.Size([7, 20, 100])

>>> print(new_nodeA.axes_names)
['left', 'right', 'batch']

>>> print(new_nodeB.axes_names)
['left', 'right', 'batch']

Original nodes still exist in the network.

>>> assert nodeA.network == new_nodeA.network
>>> assert nodeB.network == new_nodeB.network
- svd_(side='left', rank=None, cum_percentage=None, cutoff=None)#
In-place version of svd().

Contracts an edge in-place via contract_() and splits it in-place via split_() using mode = "svd". See split() for a more complete explanation.

Following the PyTorch convention, names of functions ended with an underscore indicate in-place operations.

Nodes resultant from this operation use the same names as the original nodes connected by self.

- Parameters
side (str, optional) – Indicates the side to which the diagonal matrix \(S\) should be contracted. If “left”, the first resultant node’s tensor will be \(US\), and the other node’s tensor will be \(V^{\dagger}\). If “right”, their tensors will be \(U\) and \(SV^{\dagger}\), respectively.
rank (int, optional) – Number of singular values to keep.
cum_percentage (float, optional) –
Proportion that should be satisfied between the sum of all singular values kept and the total sum of all singular values.
\[\frac{\sum_{i \in \{kept\}}{s_i}}{\sum_{i \in \{all\}}{s_i}} \ge cum\_percentage\]

cutoff (float, optional) – Quantity that lower bounds singular values in order to be kept.
- Return type
tuple[Node, Node]
Examples
>>> nodeA = tk.randn(shape=(10, 15, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(15, 20, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeB')
...
>>> new_edge = nodeA['right'] ^ nodeB['left']
>>> nodeA, nodeB = new_edge.svd_(rank=7)
...
>>> nodeA.shape
torch.Size([10, 7, 100])

>>> nodeB.shape
torch.Size([7, 20, 100])

>>> print(nodeA.axes_names)
['left', 'right', 'batch']

>>> print(nodeB.axes_names)
['left', 'right', 'batch']
- svdr(side='left', rank=None, cum_percentage=None, cutoff=None)#
Contracts an edge via contract() and splits it via split() using mode = "svdr". See split() for a more complete explanation.

This only works if the nodes connected through the edge are leaf nodes. Otherwise, this will perform the contraction between the leaf nodes that were connected through this edge.

- Parameters
side (str, optional) – Indicates the side to which the diagonal matrix \(S\) should be contracted. If “left”, the first resultant node’s tensor will be \(US\), and the other node’s tensor will be \(V^{\dagger}\). If “right”, their tensors will be \(U\) and \(SV^{\dagger}\), respectively.
rank (int, optional) – Number of singular values to keep.
cum_percentage (float, optional) –
Proportion that should be satisfied between the sum of all singular values kept and the total sum of all singular values.
\[\frac{\sum_{i \in \{kept\}}{s_i}}{\sum_{i \in \{all\}}{s_i}} \ge cum\_percentage\]

cutoff (float, optional) – Quantity that lower bounds singular values in order to be kept.
- Return type
tuple[Node, Node]
Examples
>>> nodeA = tk.randn(shape=(10, 15, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(15, 20, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeB')
...
>>> new_edge = nodeA['right'] ^ nodeB['left']
>>> new_nodeA, new_nodeB = new_edge.svdr(rank=7)
...
>>> new_nodeA.shape
torch.Size([10, 7, 100])

>>> new_nodeB.shape
torch.Size([7, 20, 100])

>>> print(new_nodeA.axes_names)
['left', 'right', 'batch']

>>> print(new_nodeB.axes_names)
['left', 'right', 'batch']

Original nodes still exist in the network.

>>> assert nodeA.network == new_nodeA.network
>>> assert nodeB.network == new_nodeB.network
- svdr_(side='left', rank=None, cum_percentage=None, cutoff=None)#
In-place version of svdr().

Contracts an edge in-place via contract_() and splits it in-place via split_() using mode = "svdr". See split() for a more complete explanation.

Following the PyTorch convention, names of functions ended with an underscore indicate in-place operations.

Nodes resultant from this operation use the same names as the original nodes connected by self.

- Parameters
side (str, optional) – Indicates the side to which the diagonal matrix \(S\) should be contracted. If “left”, the first resultant node’s tensor will be \(US\), and the other node’s tensor will be \(V^{\dagger}\). If “right”, their tensors will be \(U\) and \(SV^{\dagger}\), respectively.
rank (int, optional) – Number of singular values to keep.
cum_percentage (float, optional) –
Proportion that should be satisfied between the sum of all singular values kept and the total sum of all singular values.
\[\frac{\sum_{i \in \{kept\}}{s_i}}{\sum_{i \in \{all\}}{s_i}} \ge cum\_percentage\]

cutoff (float, optional) – Quantity that lower bounds singular values in order to be kept.
- Return type
tuple[Node, Node]
Examples
>>> nodeA = tk.randn(shape=(10, 15, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeA')
>>> nodeB = tk.randn(shape=(15, 20, 100),
...                  axes_names=('left', 'right', 'batch'),
...                  name='nodeB')
...
>>> new_edge = nodeA['right'] ^ nodeB['left']
>>> nodeA, nodeB = new_edge.svdr_(rank=7)
...
>>> nodeA.shape
torch.Size([10, 7, 100])

>>> nodeB.shape
torch.Size([7, 20, 100])

>>> print(nodeA.axes_names)
['left', 'right', 'batch']

>>> print(nodeB.axes_names)
['left', 'right', 'batch']
StackEdge#
- class tensorkrowch.StackEdge(edges, node1_list, node1, axis1, node2=None, axis2=None)[source]#
Class for stacked edges. They are just like Edges but used when stacking a collection of nodes into a StackNode. When doing this, all edges of the stacked nodes must be kept, since they have the information regarding the nodes' neighbours, which will be used when unbinding the stack.

- Parameters
edges (list[Edge]) – List of edges (one from each node that is being stacked) that are attached to the equivalent of axis1 in each node.

node1_list (list[bool]) – List of node1 attributes (one from each node that is being stacked) of the equivalent of axis1 in each node.

node1 (StackNode or ParamStackNode) – First node to which the edge is connected.

axis1 (int, str or Axis) – Axis of node1 where the edge is attached.

node2 (StackNode or ParamStackNode, optional) – Second node to which the edge is connected. If None, the edge will be dangling.

axis2 (int, str, Axis, optional) – Axis of node2 where the edge is attached.
- property edges#
Returns list of stacked edges corresponding to this axis.
- property node1_list#
Returns the list of node1 attributes corresponding to this axis.
- connect(other)[source]#
Same as connect(), but it is first verified that all the stacked edges() corresponding to both StackEdges are the same.

That is, this is a redundant operation to re-connect a list of edges that should already be connected. However, it is mandatory, since when stacking two sequences of nodes independently it cannot be inferred that the resultant StackNodes had to be connected.

- Parameters

other (StackEdge) – The other edge to which the current edge will be connected.

- Return type
Examples
To connect two stack-edges, the overloaded operator ^ can also be used.

>>> net = tk.TensorNetwork()
>>> nodes = [tk.randn(shape=(2, 4, 2),
...                   axes_names=('left', 'input', 'right'),
...                   network=net)
...          for _ in range(10)]
>>> data = [tk.randn(shape=(4,),
...                  axes_names=('feature',),
...                  network=net)
...         for _ in range(10)]
...
>>> for i in range(10):
...     _ = nodes[i]['input'] ^ data[i]['feature']
...
>>> stack_nodes = tk.stack(nodes)
>>> stack_data = tk.stack(data)
...
>>> # It is necessary to re-connect stacks to be able to contract
>>> _ = new_edge = stack_nodes['input'] ^ stack_data['feature']
>>> print(new_edge.name)
stack_0[input] <-> stack_1[feature]
Successor#
- class tensorkrowch.Successor(node_ref, index, child, hints=None)[source]#
Class for successors. This is a sort of cache memory for Operations that have already been computed.

For instance, when contracting two nodes, the result is a new node that stores the tensor resultant from contracting both nodes' tensors. However, when training a TensorNetwork, the tensors inside the nodes will change every epoch, but there is actually no need to create a new resultant node every time. Instead, it is more efficient to keep track of which node arose as the result of an operation, and simply change its tensor.

Hence, a Successor is instantiated providing details to get the operand nodes' tensors, as well as a reference to the resultant node, and some hints that might help accelerate the computations the next time the operation is performed.

These properties can be accessed via successor.node_ref, successor.index, successor.child and successor.hints.

See the different operations to learn which resultant node keeps the Successor information.

- Parameters
node_ref (Node, ParamNode, or list[Node, ParamNode]) – For the nodes that are involved in an operation, these are the corresponding nodes that store their tensors.

index (list[int, slice] or list[list[int, slice]], optional) – For the nodes that are involved in an operation, these are the corresponding indices used to access their tensors.
child (Node or list[Node]) – The node or list of nodes that result from an operation.
hints (any, optional) – A dictionary of hints created the first time an operation is computed in order to save some computation in the next calls of the operation.
Examples
When contracting two nodes, a Successor is created and added to the list of successors of the first node (left operand).

>>> nodeA = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> nodeB = tk.randn(shape=(3, 4), axes_names=('left', 'right'))
>>> _ = nodeA['right'] ^ nodeB['left']
...
>>> # Contract nodes
>>> result = nodeA @ nodeB
>>> print(result.name)
contract_edges

>>> # To get a successor, the name of the operation and the arguments have
>>> # to be provided as keys of the successors dictionary
>>> nodeA.successors['contract_edges'][(None, nodeA, nodeB)].child == result
True
Tensor Network#
- class tensorkrowch.TensorNetwork(name=None)[source]#
Class for arbitrary Tensor Networks. Subclass of PyTorch torch.nn.Module.

Tensor Networks are the central objects of TensorKrowch. Basically, a tensor network is a graph with vertices (Nodes) connected by Edges. In these models, nodes' tensors will be trained so that the contraction of the whole network approximates a certain function. Hence, TensorNetworks are the trainable objects of TensorKrowch, very much like torch.nn.Modules are the trainable objects of PyTorch.

Recall that the common way of defining models out of torch.nn.Module is by defining a subclass where the __init__ and forward methods are overridden:

__init__: Defines the model itself (its layers, attributes, etc.).
forward: Defines the way the model operates, that is, how the different parts of the model might combine to get an output from a particular input.
With TensorNetwork, the workflow is similar, though there are other methods that should be overridden:

__init__: Defines the graph of the tensor network and initializes the tensors of the nodes. See AbstractNode and Edge to learn how to create nodes and connect them.

set_data_nodes (optional): Creates the data nodes where the data tensor(s) will be placed. Usually, it will just select the edges to which the data nodes should be connected, and call the parent method. See set_data_nodes() to learn good practices to override it. See also add_data().

add_data (optional): Adds new data tensors that will be stored in data nodes. Usually it will not be necessary to override this method, but if one wants to customize how data is set into the data nodes, add_data() can be overridden.

contract: Defines the contraction algorithm of the whole tensor network, thus returning a single node. Very much like forward, this is the main method that describes how the components of the network are combined. Hence, in TensorNetwork the forward() method shall not be overridden, since it will just call set_data_nodes(), if needed, add_data() and contract(), and then it will return the tensor corresponding to the last resultant node. Hence, the order in which Operations are called from contract is important. The last operation must be the one returning the final node. A minimal sketch of such a subclass follows this list.
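The following is a minimal, hedged sketch of this workflow. Everything specific below (the class name TwoSiteModel, the node names, and the chosen shapes) is hypothetical and invented for illustration; only TensorNetwork, tk.randn, set_data_nodes() and the ^ and @ operators are pieces documented in this reference.

import torch
import tensorkrowch as tk

class TwoSiteModel(tk.TensorNetwork):
    # Hypothetical two-site model, made up for illustration
    def __init__(self):
        super().__init__(name='two_site_model')
        # __init__: define the graph and initialize the nodes' tensors
        self.site1 = tk.randn(shape=(5, 2),
                              axes_names=('input', 'right'),
                              name='site1',
                              network=self,
                              param_node=True)
        self.site2 = tk.randn(shape=(2, 5),
                              axes_names=('left', 'input'),
                              name='site2',
                              network=self,
                              param_node=True)
        self.site1['right'] ^ self.site2['left']

    def set_data_nodes(self):
        # select the dangling edges to which data nodes will be connected
        input_edges = [self.site1['input'], self.site2['input']]
        super().set_data_nodes(input_edges, num_batch_edges=1)

    def contract(self):
        # contract each site with its data node, then contract the bond;
        # the last operation must be the one returning the final node
        data = list(self.data_nodes.values())  # assumes order data_0, data_1
        return (self.site1 @ data[0]) @ (self.site2 @ data[1])

With these overrides, model = TwoSiteModel() followed by model(torch.randn(100, 2, 5)) would contract the network for a batch of 100 samples (data shape: batch_size × n_features × feature_dim).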
Although one can define how the network is going to be contracted, there are a couple of modes that can change how this contraction behaves at a lower level:
auto_stack (True by default): This mode indicates whether the operation stack() can take control of the memory management of the network to skip some steps in future computations. If auto_stack is set to True and a collection of ParamNodes are stacked (as the first operation in which these nodes are involved), then those nodes will no longer store their own tensors, but rather a virtual ParamStackNode will store the stacked tensor, avoiding the computation of that first stack() in every contraction. This behaviour is not possible if auto_stack is set to False, in which case all nodes will always store their own tensors.

Setting auto_stack to True will be faster for both inference and training. However, while experimenting with TensorNetworks one might want all nodes to store their own tensors to avoid problems.

auto_unbind (False by default): This mode indicates whether the operation unbind() has to actually unbind the stacked tensor or just generate a collection of references. That is, if auto_unbind is set to False, unbind() creates a collection of nodes, each of them storing the corresponding slice of the stacked tensor. If auto_unbind is set to True, unbind() just creates the nodes and gives each of them an index to reference the stacked tensor, so that each node's tensor would be retrieved by indexing the stack. This avoids performing the operation, since these indices will be the same in subsequent iterations.

Setting auto_unbind to True will be faster for inference, but slower for training. A short example of checking and toggling these modes follows this list.
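A short, hedged illustration (the printed defaults follow the property documentation below):

>>> net = tk.TensorNetwork()
>>> net.auto_stack
True
>>> net.auto_unbind
False
>>> net.auto_unbind = True  # changing a mode entails resetting the network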
Once the training algorithm starts, these modes should not be changed (at least, not very often), since changing them entails first resetting the whole network, which is a costly operation.
TensorNetwork
is defined, it has a bunch ofleaf
,data
andvirtual
nodes that make up the network structure, each of them storing its own tensor. However, when the network is contracted, severalresultant
nodes become new members of the network, even modifying its memory (depending on theauto_stack
andauto_unbind
modes).Therefore, if one wants to
reset()
the network to its initial state after performing some operations, all theresultant
nodes should be deleted, and all the tensors should return to its nodes (each node stores its own tensor). This is exactly whatreset()
does. Besides, sinceauto_stack
andauto_unbind
can change how the tensors are stored, if one wants to change these modes, the network should be first reset (this is already done automatically when changing the modes).See
AbstractNode
to learn about the 4 excluding types of nodes, andreset()
to learn about how these nodes are treated differently.There are also some special nodes that one should take into account. These are specified by name. See
AbstractNode
to learn about reserved nodes’ names, andreset()
to learn about how these nodes are treated differently.Other thing one must take into account is the naming of
Nodes
. Since the name of aNode
is used to access it from theTensorNetwork
, the same name cannot be used by more than oneNode
. In that case, repeated names get an automatic enumeration of the form"name_{number}"
(underscore followed by number).To add a custom enumeration to keep track of the nodes of the network in a user-defined way, one may use brackets or parenthesis:
"name_({number})"
.For an example, check this tutorial.
- Parameters
name (str, optional) – Network's name. By default, it is the name of the class (e.g. "tensornetwork").
- property nodes#
Returns dictionary with all the nodes belonging to the network (leaf, data, virtual and resultant).

- property nodes_names#
Returns list of names of all the nodes belonging to the network (leaf, data, virtual and resultant).

- property leaf_nodes#
Returns dictionary of leaf nodes of the network.

- property data_nodes#
Returns dictionary of data nodes of the network.

- property virtual_nodes#
Returns dictionary of virtual nodes of the network.

- property resultant_nodes#
Returns dictionary of resultant nodes of the network.

- property edges#
Returns list of dangling, non-batch edges of the network. Dangling edges from virtual nodes are not included.
- property auto_stack#
Returns boolean indicating whether the auto_stack mode is active. By default, it is True.

This mode indicates whether the operation stack() can take control of the memory management of the network to skip some steps in future computations. If auto_stack is set to True and a collection of ParamNodes are stacked (as the first operation in which these nodes are involved), then those nodes will no longer store their own tensors, but rather a virtual ParamStackNode will store the stacked tensor, avoiding the computation of that first stack() in every contraction. This behaviour is not possible if auto_stack is set to False, in which case all nodes will always store their own tensors.

Setting auto_stack to True will be faster for both inference and training. However, while experimenting with TensorNetworks one might want all nodes to store their own tensors to avoid problems.

Be aware that changing the auto_stack mode entails resetting the network, which will modify its nodes. Keep this in mind in order to avoid undesired behaviour.
- property auto_unbind#
Returns boolean indicating whether the auto_unbind mode is active. By default, it is False.

This mode indicates whether the operation unbind() has to actually unbind the stacked tensor or just generate a collection of references. That is, if auto_unbind is set to False, unbind() creates a collection of nodes, each of them storing the corresponding slice of the stacked tensor. If auto_unbind is set to True, unbind() just creates the nodes and gives each of them an index to reference the stacked tensor, so that each node's tensor would be retrieved by indexing the stack. This avoids performing the operation, since these indices will be the same in subsequent iterations.

Setting auto_unbind to True will be faster for inference, but slower for training.

Be aware that changing the auto_unbind mode entails resetting the network, which will modify its nodes. Keep this in mind in order to avoid undesired behaviour.
- delete_node(node, move_names=True)[source]#
Disconnects node from all its neighbours and removes it from the network. To completely get rid of the node, do not forget to delete it:
>>> del node
or override it:
>>> node = node.copy()  # .copy() calls .delete_node()
- Parameters

node (AbstractNode) – Node to be deleted.

move_names (bool) – Boolean indicating whether names' enumerations should be decreased when removing a node (True) or kept as they are (False). This is useful when several nodes are being modified at once, and each resultant node has the same enumeration as the corresponding original node.
Examples
>>> nodeA = tk.randn((2, 3))
>>> nodeB = tk.randn((3, 4))
>>> _ = nodeA[1] ^ nodeB[0]
>>> print(nodeA.name, nodeB.name)
node_0 node_1

>>> nodeB.network.delete_node(nodeB)
>>> nodeA.neighbours() == []
True

>>> print(nodeA.name)
node
If move_names is set to False, the enumeration is not removed. This is useful to avoid managing the enumeration of a list of nodes that are all going to be deleted.

>>> nodeA = tk.randn((2, 3))
>>> nodeB = tk.randn((3, 4))
>>> _ = nodeA[1] ^ nodeB[0]
>>> nodeB.network.delete_node(nodeB, False)
>>> print(nodeA.name)
node_0
- parameterize(set_param=True, override=False)[source]#
Parameterizes all leaf nodes of the network. If there are resultant nodes in the TensorNetwork, it will first be reset().

- Parameters
set_param (bool) – Boolean indicating whether the tensor network has to be parameterized (True) or de-parameterized (False).

override (bool) – Boolean indicating whether the tensor network should be parameterized in-place (True) or copied and then parameterized (False).
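A short, hedged sketch of typical usage (it is assumed here that, with override=False, the method returns the parameterized copy of the network):

>>> net = tk.TensorNetwork()
>>> node = tk.randn(shape=(2, 3), network=net)
>>> param_net = net.parameterize()  # copies the network, then parameterizes it
>>> all(isinstance(n, tk.ParamNode) for n in param_net.leaf_nodes.values())
True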
- set_data_nodes(input_edges, num_batch_edges)[source]#
Creates data nodes with as many batch edges as num_batch_edges and one feature edge. Then it connects each of these nodes' feature edges to an edge from the list input_edges (following the provided order). Thus, edges in input_edges need to be dangling. Also, if there are already data nodes (or the "stack_data_memory") in the network, they should be unset() first.

If all the data nodes have the same shape, a virtual node will contain all their tensors stacked in one, which will save some memory and time in computations. This node is "stack_data_memory". See AbstractNode to learn more about this node.

If this method is overridden in subclasses, it can be done in two flavours:
def set_data_nodes(self):
    # Collect input edges
    input_edges = [node_1[i], ..., node_n[j]]
    # Define number of batches
    num_batch_edges = m
    # Call parent method
    super().set_data_nodes(input_edges, num_batch_edges)
def set_data_nodes(self):
    # Create data nodes directly
    data_nodes = [
        tk.Node(shape=(batch_1, ..., batch_m, feature_dim),
                axes_names=('batch_1', ..., 'batch_m', 'feature'),
                network=self,
                data=True)
        for _ in range(n)]
    # Connect them with the leaf nodes
    for i, data_node in enumerate(data_nodes):
        data_node['feature'] ^ self.my_nodes[i]['input']
If this method is overridden, there is no need to call it explicitly during training, since it will be done in the forward() call.

On the other hand, if one does not override set_data_nodes, it should be called before starting training.

- Parameters
input_edges (list[Edge]) – List of edges to which the data nodes' feature edges will be connected.

num_batch_edges (int) – Number of batch edges in the data nodes.
Examples
>>> nodeA = tk.Node(shape=(2, 5, 2),
...                 axes_names=('left', 'input', 'right'),
...                 name='nodeA',
...                 init_method='randn')
>>> nodeB = tk.Node(shape=(2, 5, 2),
...                 axes_names=('left', 'input', 'right'),
...                 name='nodeB',
...                 init_method='randn')
>>> _ = nodeA['right'] ^ nodeB['left']
...
>>> net = nodeA.network
>>> input_edges = [nodeA['input'], nodeB['input']]
>>> net.set_data_nodes(input_edges, 1)
>>> list(net.data_nodes.keys())
['data_0', 'data_1']

>>> net['data_0']
Node( name: data_0
      tensor: None
      axes: [batch
             feature]
      edges: [data_0[batch] <-> None
              data_0[feature] <-> nodeA[input]])
- unset_data_nodes()[source]#
Deletes all data nodes (including the "stack_data_memory", when this node exists).
- add_data(data)[source]#
Adds data tensor(s) to the data nodes, that is, replaces their tensors with new data tensors when a new batch is provided.

If all data nodes have the same shape, thus having their tensors stored in "stack_data_memory", the whole data tensor will be stored by this node. The data nodes will just store a reference to a slice of that tensor.

Otherwise, each tensor in the list (data) will be stored by each data node in the network, in the order they appear in data_nodes().

If one wants to customize how data is set into the data nodes, this method can be overridden.

- Parameters
data (torch.Tensor or list[torch.Tensor]) –

If all data nodes have the same shape, thus having their tensors stored in "stack_data_memory", data should be a tensor of shape

\[batch\_size_{0} \times ... \times batch\_size_{n} \times n_{features} \times feature\_dim\]

Otherwise, it should be a list with \(n_{features}\) elements, each of them being a tensor with shape

\[batch\_size_{0} \times ... \times batch\_size_{n} \times feature\_dim\]
Examples
>>> nodeA = tk.Node(shape=(3, 5, 3),
...                 axes_names=('left', 'input', 'right'),
...                 name='nodeA',
...                 init_method='randn')
>>> nodeB = tk.Node(shape=(3, 5, 3),
...                 axes_names=('left', 'input', 'right'),
...                 name='nodeB',
...                 init_method='randn')
>>> _ = nodeA['right'] ^ nodeB['left']
...
>>> net = nodeA.network
>>> input_edges = [nodeA['input'], nodeB['input']]
>>> net.set_data_nodes(input_edges, 1)
...
>>> net.add_data(torch.randn(100, 2, 5))
>>> net['data_0'].shape
torch.Size([100, 5])
- reset()[source]#
Resets the TensorNetwork to its initial state, before computing any non-in-place Operation. Different actions apply to different types of nodes:

leaf: These nodes retrieve their tensors in case they were just referencing a slice of the tensor in the ParamStackNode that is created when stacking ParamNodes (if the auto_stack mode is active). If there is a "virtual_uniform" node in the network from which all leaf nodes take their tensor, it is not modified.

virtual: Only virtual nodes created in operations are deleted. This only includes nodes using the reserved name "virtual_result".

resultant: These nodes are deleted from the network.
Also, the dictionaries of Successors of all leaf and data nodes are emptied.

The TensorNetwork is automatically reset when parameterizing it, changing the auto_stack() or auto_unbind() modes, or tracing.

See AbstractNode to learn more about the 4 types of nodes and the reserved names.

For an example, check this tutorial.
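A short, hedged sketch of the effect of reset():

>>> nodeA = tk.randn(shape=(2, 3), axes_names=('left', 'right'))
>>> nodeB = tk.randn(shape=(3, 4), axes_names=('left', 'right'))
>>> _ = nodeA['right'] ^ nodeB['left']
>>> net = nodeA.network
>>> result = nodeA @ nodeB
>>> len(net.resultant_nodes) > 0
True
>>> net.reset()  # resultant nodes are deleted; each node stores its own tensor again
>>> len(net.resultant_nodes)
0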
- trace(example=None, *args, **kwargs)[source]#
Traces the tensor network contraction algorithm with two purposes:
Create all the intermediate resultant nodes that result from Operations, so that in the next contractions only the tensor operations have to be computed, thus saving a lot of time.

Keep track of the tensors that are used to compute operations, so that intermediate results that are no longer useful can be deleted, thus saving a lot of memory. This is achieved by constructing an inverse_memory that, given a memory address, stores the nodes that use the tensor located at that address of the network's memory.
To trace a tensor network, it is necessary to provide the same arguments that would be required in the forward call. In case the tensor network is contracted with some input data, an example tensor with batch dimension 1 and filled with zeros would be enough to trace the contraction.
For an example, check this tutorial.
- Parameters
example (torch.Tensor, optional) – Example tensor used to trace the contraction of the tensor network. In case the tensor network is contracted with some input data, an example tensor with batch dimension 1 and filled with zeros would be enough to trace the contraction.
args – Arguments that might be used in contract().

kwargs – Keyword arguments that might be used in contract().
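A short, hedged sketch, reusing the hypothetical TwoSiteModel from above (the zeros tensor has batch dimension 1, as described):

>>> model = TwoSiteModel()  # hypothetical subclass sketched earlier
>>> model.trace(torch.zeros(1, 2, 5))
>>> # After tracing, subsequent contractions only repeat the tensor operations
>>> result = model(torch.randn(100, 2, 5))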
- contract()[source]#
Contracts the whole tensor network, returning a single Node. This method is not implemented; subclasses of TensorNetwork should override it to define the contraction algorithm of the network.
- forward(data=None, *args, **kwargs)[source]#
Contracts the TensorNetwork with input data. It can be called using the __call__ operator ().

Overrides the forward method of PyTorch's torch.nn.Module. Sets data nodes automatically whenever set_data_nodes() is overridden, adds data tensor(s) to these nodes, and contracts the whole network according to contract(), returning a single torch.Tensor.

Furthermore, to optimize the contraction algorithm during training, once the TensorNetwork is traced, all that forward does is call the different Operations used in contract() in the same order they appeared in the code. Hence, the last operation in contract() should be the one that returns the single output Node.

For an example, check this tutorial.
- Parameters
data (torch.Tensor or list[torch.Tensor], optional) –

If all data nodes have the same shape, thus having their tensors stored in "stack_data_memory", data should be a tensor of shape

\[batch\_size_{0} \times ... \times batch\_size_{n} \times n_{features} \times feature\_dim\]

Otherwise, it should be a list with \(n_{features}\) elements, each of them being a tensor with shape

\[batch\_size_{0} \times ... \times batch\_size_{n} \times feature\_dim\]

Also, it is not necessary that the network has data nodes, thus None is also valid.

args – Arguments that might be used in contract().

kwargs – Keyword arguments that might be used in contract().
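A short, hedged usage sketch, again with the hypothetical TwoSiteModel from above (the output shape is an assumption: only the batch edge remains dangling after the sketched contraction):

>>> model = TwoSiteModel()  # hypothetical subclass sketched earlier
>>> model.trace(torch.zeros(1, 2, 5))
>>> output = model(torch.randn(100, 2, 5))  # __call__ invokes forward
>>> output.shape
torch.Size([100])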