Package mdp :: Class Flow

Class Flow


A 'Flow' is a sequence of nodes that are trained and executed together to form a more complex algorithm. Input data is sent to the first node and is successively processed by the subsequent nodes along the sequence.

Using a flow, as opposed to handling a set of nodes manually, has a clear advantage: the general flow implementation automates the training (including supervised training and multiple training phases), execution, and inverse execution (if defined) of the whole sequence.

Crash recovery is optionally available: in case of failure, the current state of the flow is saved for later inspection. A subclass of the basic flow class ('CheckpointFlow') allows user-supplied checkpoint functions to be executed at the end of each phase, for example to save the internal structures of a node for later analysis.

Flow objects are Python containers: most of the built-in 'list' methods are available. A 'Flow' can be saved or copied using the corresponding 'save' and 'copy' methods.
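The chaining behaviour described above can be illustrated with a minimal pure-Python stand-in (the `Double` and `Increment` classes below are hypothetical toy nodes, not mdp classes; real mdp nodes operate on numpy arrays and support training):

```python
# Minimal sketch of the Flow execution model: input data is sent to
# the first node and successively processed by the subsequent nodes.

class Double:
    def execute(self, x):
        return x * 2

class Increment:
    def execute(self, x):
        return x + 1

def run_flow(nodes, x):
    """Send x through the nodes in order, like Flow.execute."""
    for node in nodes:
        x = node.execute(x)
    return x

print(run_flow([Double(), Increment()], 10))  # 10 -> 20 -> 21
```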

Instance Methods
 
__add__(self, other)
 
__call__(self, iterable, nodenr=None)
Calling an instance is equivalent to calling its 'execute' method.
 
__contains__(self, item)
 
__delitem__(self, key)
 
__getitem__(self, key)
 
__iadd__(self, other)
 
__init__(self, flow, crash_recovery=False, verbose=False)
Keyword arguments:
 
__iter__(self)
 
__len__(self)
 
__repr__(self)
repr(x)
 
__setitem__(self, key, value)
 
__str__(self)
str(x)
 
_check_dimension_consistency(self, out, inp)
Raise ValueError when both dimensions are set and different.
 
_check_nodes_consistency(self, flow=None)
Check the dimension consistency of a list of nodes.
 
_check_value_type_isnode(self, value)
 
_close_last_node(self)
 
_execute_seq(self, x, nodenr=None)
 
_inverse_seq(self, x)
 
_propagate_exception(self, except_, nodenr)
 
_stop_training_hook(self)
Hook method that is called before stop_training is called.
 
_train_check_iterables(self, data_iterables)
Return the data iterables after some checks and sanitizing.
 
_train_node(self, data_iterable, nodenr)
Train a single node in the flow.
 
append(flow, node)
append node to flow end
 
copy(self, protocol=None)
Return a deep copy of the flow.
 
execute(self, iterable, nodenr=None)
Process the data through all nodes in the flow.
 
extend(flow, iterable)
extend flow by appending elements from the iterable
 
insert(flow, index, node)
insert node before index
 
inverse(self, iterable)
Process the data through all nodes in the flow backwards (starting from the last node up to the first node) by calling the inverse function of each node. Of course, all nodes in the flow must be invertible.
pop(flow, index=...)
remove and return node at index (default last)
 
save(self, filename, protocol=-1)
Save a pickled serialization of the flow to 'filename'. If 'filename' is None, return a string.
 
set_crash_recovery(self, state=True)
Set crash recovery capabilities.
 
train(self, data_iterables)
Train all trainable nodes in the flow.

Inherited from unreachable.newobject: __long__, __native__, __nonzero__, __unicode__, next

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __subclasshook__

Static Methods
 
_get_required_train_args(node)
Return arguments in addition to self and x for node.train.
Properties

Inherited from object: __class__

Method Details

__add__(self, other)
(Addition operator)

 

__call__(self, iterable, nodenr=None)
(Call operator)

 
Calling an instance is equivalent to calling its 'execute' method.

__contains__(self, item)
(In operator)

 

__delitem__(self, key)
(Index deletion operator)

 

__getitem__(self, key)
(Indexing operator)

 

__iadd__(self, other)

 

__init__(self, flow, crash_recovery=False, verbose=False)
(Constructor)

 

Keyword arguments:

flow -- a list of Nodes
crash_recovery -- set (or not) Crash Recovery Mode (save the nodes
                  in case of a failure)
verbose -- if True, print some basic progress information

Overrides: object.__init__

__iter__(self)

 

__len__(self)
(Length operator)

 

__repr__(self)
(Representation operator)

 
repr(x)

Overrides: object.__repr__
(inherited documentation)

__setitem__(self, key, value)
(Index assignment operator)

 

__str__(self)
(Informal representation operator)

 
str(x)

Overrides: object.__str__
(inherited documentation)

_check_dimension_consistency(self, out, inp)

 
Raise ValueError when both dimensions are set and different.

_check_nodes_consistency(self, flow=None)

 
Check the dimension consistency of a list of nodes.

_check_value_type_isnode(self, value)

 

_close_last_node(self)

 

_execute_seq(self, x, nodenr=None)

 

_get_required_train_args(node)
Static Method

 

Return arguments in addition to self and x for node.train.

Arguments that have a default value are ignored.

_inverse_seq(self, x)

 

_propagate_exception(self, except_, nodenr)

 

_stop_training_hook(self)

 
Hook method that is called before stop_training is called.

_train_check_iterables(self, data_iterables)

 
Return the data iterables after some checks and sanitizing.

Note that this method does not distinguish between iterables and
iterators, so this must be taken care of later.

_train_node(self, data_iterable, nodenr)

 
Train a single node in the flow.

nodenr -- index of the node in the flow

append(flow, node)

 
append node to flow end

copy(self, protocol=None)

 

Return a deep copy of the flow.

The protocol parameter should not be used.

execute(self, iterable, nodenr=None)

 

Process the data through all nodes in the flow.

'iterable' is an iterable or iterator (note that a list is also an iterable), which returns data arrays that are used as input to the flow. Alternatively, one can specify one data array as input.

If 'nodenr' is specified, the flow is executed only up to (and including) node number 'nodenr'. This is equivalent to 'flow[:nodenr+1](iterable)'.
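The 'nodenr' shortcut can be sketched with a pure-Python stand-in (the `Add` class is hypothetical, not an mdp node): running up to index 'nodenr' is the same as slicing the node list first.

```python
# Sketch of partial execution: with nodenr given, only the nodes up to
# and including that index are applied. (Illustrative only.)

class Add:
    def __init__(self, k):
        self.k = k
    def execute(self, x):
        return x + self.k

def execute_flow(nodes, x, nodenr=None):
    active = nodes if nodenr is None else nodes[:nodenr + 1]
    for node in active:
        x = node.execute(x)
    return x

nodes = [Add(1), Add(10), Add(100)]
print(execute_flow(nodes, 0, nodenr=1))  # only Add(1) and Add(10): 11
print(execute_flow(nodes, 0))            # all three nodes: 111
```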

extend(flow, iterable)

 
extend flow by appending elements from the iterable

insert(flow, index, node)

 
insert node before index

inverse(self, iterable)

 

Process the data through all nodes in the flow backwards (starting from the last node up to the first node) by calling the inverse function of each node. Of course, all nodes in the flow must be invertible.

'iterable' is an iterable or iterator (note that a list is also an iterable), which returns data arrays that are used as input to the flow. Alternatively, one can specify one data array as input.

Note that this is _not_ equivalent to 'flow[::-1](iterable)', which also executes the flow backwards but calls the 'execute' function of each node.
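The backward traversal can be sketched with invertible stand-in nodes (the `Scale` class is hypothetical, not an mdp node): 'inverse' walks the nodes in reverse order and calls each node's inverse function, undoing the forward transformation.

```python
# Sketch: inverse_flow visits the nodes backwards and calls inverse(),
# so inverse_flow(nodes, execute_flow(nodes, x)) recovers x.

class Scale:
    def __init__(self, factor):
        self.factor = factor
    def execute(self, x):
        return x * self.factor
    def inverse(self, y):
        return y / self.factor

def execute_flow(nodes, x):
    for node in nodes:
        x = node.execute(x)
    return x

def inverse_flow(nodes, y):
    for node in reversed(nodes):
        y = node.inverse(y)
    return y

nodes = [Scale(2.0), Scale(5.0)]
y = execute_flow(nodes, 3.0)   # 3 * 2 * 5 = 30
print(inverse_flow(nodes, y))  # recovers 3.0
```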

pop(flow, index=...)

 
remove and return node at index (default last)
Returns: node

save(self, filename, protocol=-1)

 

Save a pickled serialization of the flow to 'filename'. If 'filename' is None, return a string.

Note: the pickled Flow is not guaranteed to be upward or backward compatible.
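The save/reload round trip rests on standard pickling; a minimal sketch using the pickle module directly (a plain list stands in for the flow's node sequence):

```python
import os
import pickle
import tempfile

# Sketch of the save/reload round trip. protocol=-1 selects the
# highest pickle protocol available, matching the default above.

flow_like = ["node_a", "node_b"]  # stand-in for a Flow's node list

path = os.path.join(tempfile.mkdtemp(), "flow.pkl")
with open(path, "wb") as f:
    pickle.dump(flow_like, f, protocol=-1)

with open(path, "rb") as f:
    restored = pickle.load(f)
print(restored == flow_like)  # True
```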

set_crash_recovery(self, state=True)

 

Set crash recovery capabilities.

When a node raises an Exception during training, execution, or inverse execution that the flow is unable to handle, a FlowExceptionCR is raised. If crash recovery is set, a crash dump of the flow instance is saved for later inspection. The original exception can be found as the 'parent_exception' attribute of the FlowExceptionCR instance.

  • If 'state' is False, disable crash recovery.
  • If 'state' is a string, the crash dump is saved to a file with that name.
  • If 'state' is True, the crash dump is saved to a file created by the tempfile module.
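The dump-on-failure pattern can be sketched as follows (hypothetical helper; the real logic lives inside the Flow methods, which raise FlowExceptionCR carrying the original exception):

```python
import pickle
import tempfile

# Sketch of the crash-dump pattern: if a node raises during
# processing, pickle the current state to a temporary file before
# re-raising, so the failure can be inspected later. (Illustrative
# only; mdp wraps the original error in a FlowExceptionCR.)

def run_with_recovery(nodes, x):
    try:
        for node in nodes:
            x = node.execute(x)
        return x
    except Exception:
        with tempfile.NamedTemporaryFile(
                suffix=".pic", delete=False) as dumpfile:
            pickle.dump(nodes, dumpfile)
        print("crash dump saved to", dumpfile.name)
        raise
```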

train(self, data_iterables)

 

Train all trainable nodes in the flow.

'data_iterables' is a list of iterables, one for each node in the flow. The iterators returned by the iterables must return data arrays that are then used for the node training (so the data arrays are the 'x' for the nodes). Note that the data arrays are processed by the nodes preceding the node that is being trained, so the data dimension must match the input dimension of the first node.

If a node has only a single training phase then instead of an iterable you can alternatively provide an iterator (including generator-type iterators). For nodes with multiple training phases this is not possible, since the iterator cannot be restarted after the first iteration. For more information on iterators and iterables see http://docs.python.org/library/stdtypes.html#iterator-types .

In the special case that 'data_iterables' is one single array, it is used as the data array 'x' for all nodes and training phases.

Instead of a data array 'x' the iterators can also return a list or tuple, where the first entry is 'x' and the following are args for the training of the node (e.g. for supervised training).
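The training schedule described above can be sketched in pure Python (the `MeanNode` class is a hypothetical toy node with a single training phase; real mdp nodes are numpy-based and may have several phases):

```python
# Sketch of Flow.train's schedule: node i is trained on data that has
# already been executed through the nodes 0..i-1 in front of it.

class MeanNode:
    """Toy trainable node: learns the mean, then subtracts it."""
    def __init__(self):
        self.total, self.count, self.mean = 0.0, 0, None
    def train(self, x):
        self.total += x
        self.count += 1
    def stop_training(self):
        self.mean = self.total / self.count
    def execute(self, x):
        return x - self.mean

def train_flow(nodes, data_iterables):
    for i, node in enumerate(nodes):
        for x in data_iterables[i]:
            # forward x through the already-trained nodes in front
            for prev in nodes[:i]:
                x = prev.execute(x)
            node.train(x)
        node.stop_training()

nodes = [MeanNode(), MeanNode()]
train_flow(nodes, [[1.0, 3.0], [1.0, 3.0]])
print(nodes[0].mean)  # 2.0: learned from the raw data
print(nodes[1].mean)  # 0.0: learned from data centered by node 0
```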