An 'OnlineFlow' is a sequence of nodes that are trained online and executed together to form a more complex algorithm. Input data is sent to the first node and is successively processed by the subsequent nodes along the sequence. Using an online flow instead of handling a set of nodes manually has a clear advantage: the general online flow implementation automates the training (including supervised training and multiple training phases), execution, and inverse execution (if defined) of the whole sequence.

To understand which node sequences are compatible with an OnlineFlow, the following terminology is useful:

A "trainable" node: node.is_trainable() returns True, node.is_training() returns True.
A "trained" node: node.is_trainable() returns True, node.is_training() returns False.
A "non-trainable" node: node.is_trainable() returns False, node.is_training() returns False.

An OnlineFlow node sequence can contain:

(a) only OnlineNodes, e.g. [OnlineCenteringNode(), IncSFANode()];
(b) a mix of OnlineNodes and trained or non-trainable Nodes, e.g. [a fully trained PCANode, IncSFANode()] or [QuadraticExpansionNode(), IncSFANode()]; or
(c) a mix of OnlineNodes/trained/non-trainable Nodes followed by a terminal trainable Node (but not an OnlineNode) whose training has not finished, e.g. [IncSFANode(), QuadraticExpansionNode(), a partially trained or untrained SFANode].

Differences between a Flow and an OnlineFlow:

a) In a Flow, data is processed sequentially, training one node at a time: the second node's training starts only after the first node is trained. In an OnlineFlow, data is processed through all the nodes simultaneously, training them at the same time.

   E.g.: flow = Flow([node1, node2]), onlineflow = OnlineFlow([node1, node2]).
   Let the input be x = [x_0, x_1, ..., x_n], where each x_t is a sample or a mini-batch of samples.

   Flow training:
     node1 trains on the entire x. While node1 is training, node2 is inactive.
     node1's training completes. node2 then trains on node1(x).
   The Flow therefore goes through all the data twice, once for each node.

   OnlineFlow training:
     node1 trains on x_0; node2 trains on the output of node1 (node1(x_0)).
     node1 trains on x_1; node2 trains on the output of node1 (node1(x_1)).
     ...
     node1 trains on x_n; node2 trains on the output of node1 (node1(x_n)).
   The OnlineFlow goes through all the data only once.

b) A Flow requires a list of dataiterables with a length equal to the number of nodes, or a single numpy array. An OnlineFlow requires only one input dataiterable, since all the nodes are trained simultaneously.

c) Additional train args (supervised labels etc.) are passed to each node through the node-specific dataiterable. An OnlineFlow requires the dataiterable to return a list that contains tuples of args for each node: [x, (node0 args), (node1 args), ...]. See the train docstring.

Crash recovery is optionally available: in case of failure the current state of the flow is saved for later inspection.

OnlineFlow objects are Python containers. Most of the builtin 'list' methods are available. An 'OnlineFlow' can be saved or copied using the corresponding 'save' and 'copy' methods.
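The two training orders described above can be sketched in plain Python. This is a toy sketch with a hypothetical CountingNode stand-in, not the actual MDP node classes; it only illustrates how many passes over the data each scheme makes.

```python
# Toy sketch contrasting Flow-style (batch) and OnlineFlow-style (online)
# training order. CountingNode is a hypothetical stand-in, not an MDP class.

class CountingNode:
    """Counts how many samples it has trained on; execute() is the identity."""
    def __init__(self):
        self.samples_seen = 0
    def train(self, x):
        self.samples_seen += 1
    def execute(self, x):
        return x

def flow_style_train(nodes, data):
    """Flow: train one node at a time; each node needs a full pass over data."""
    passes = 0
    for i, node in enumerate(nodes):
        passes += 1                      # one full pass over the data per node
        for x in data:
            for prev in nodes[:i]:       # forward x through already-trained nodes
                x = prev.execute(x)
            node.train(x)
    return passes

def online_style_train(nodes, data):
    """OnlineFlow: all nodes train simultaneously during a single pass."""
    for x in data:
        for node in nodes:
            node.train(x)
            x = node.execute(x)          # output of node i feeds node i+1
    return 1                             # exactly one full pass over the data

data = [[0.0], [1.0], [2.0]]             # x_0, x_1, x_2

batch_nodes = [CountingNode(), CountingNode()]
flow_passes = flow_style_train(batch_nodes, data)      # two passes

online_nodes = [CountingNode(), CountingNode()]
online_passes = online_style_train(online_nodes, data)  # one pass
```

Note that both schemes ultimately feed every sample to every node; the difference is purely in the ordering, which is why OnlineFlow can train on a stream it sees only once.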
Inherited from Flow
Keyword arguments:

flow -- a list of Nodes
crash_recovery -- set (or not) Crash Recovery Mode (save the nodes in case of a failure)
verbose -- if True, print some basic progress information
Return the data iterable after some checks and sanitizing. Note that this method does not distinguish between iterables and iterators, so this must be taken care of later.
Train a single node in the flow.

nodenr -- index of the node in the flow
Train all trainable nodes in the flow.

'data_iterables' is a single iterable (including generator-type iterators, provided the last node has no multiple training phases) that must return data arrays to train the nodes (the data arrays are the 'x' for the nodes). Note that the data arrays are processed by the nodes in front of the node that is being trained, so the data dimension must match the input dimension of the first node.

'data_iterables' can also be a 2D or a 3D numpy array. A 2D array trains all the nodes incrementally, while a 3D array supports online training in batches (batch size = shape[1]).

'data_iterables' can also return a list or a tuple, where the first entry is 'x' and the rest are the required args for training all the nodes in the flow (e.g. for supervised training):

(x, (node-0 args), (node-1 args), ..., (node-n args)) - args for each node.

If, say, node-i does not require any args, the provided (node-i args) are ignored. So one can simply use None for the nodes that do not require args:

(x, (node-0 args), ..., None, ..., (node-n args)) - no args for the i-th node.
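The per-node args convention above can be illustrated with a small sketch. parse_train_item is a hypothetical helper written for this illustration, not part of the MDP API; it only shows how one item yielded by 'data_iterables' might be unpacked into the data array and one args tuple per node.

```python
import numpy as np

# Hypothetical helper (not the MDP implementation) that unpacks one item
# yielded by 'data_iterables' into (x, per-node args).

def parse_train_item(item, n_nodes):
    """Return (x, args) where args has one entry per node.

    'item' may be a bare data array 'x', or a list/tuple of the form
    (x, (node-0 args), ..., (node-n args)), with None standing in for
    nodes that take no extra training arguments.
    """
    if isinstance(item, (list, tuple)):
        x = item[0]
        args = list(item[1:])
        args += [None] * (n_nodes - len(args))  # pad missing trailing args
    else:
        x, args = item, [None] * n_nodes        # bare array: no extra args
    return x, args

x = np.zeros((1, 4))

# Supervised labels for node 0, no extra args for node 1:
labelled_item = (x, (np.array([1]),), None)
data0, args0 = parse_train_item(labelled_item, n_nodes=2)

# A bare array yields None args for every node:
data1, args1 = parse_train_item(x, n_nodes=2)
```

A real flow would then call something like node.train(x, *node_args) for each node whose args entry is not None, and node.train(x) otherwise.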
Generated by Epydoc 3.0.1-MDP on Mon Apr 27 21:56:16 2020 | http://epydoc.sourceforge.net |