Package mdp :: Package parallel :: Class ProcessScheduler
[hide private]
[frames] | no frames]

Class ProcessScheduler


Scheduler that distributes the task to multiple processes.

The subprocess module is used to start the requested number of processes.
The execution of each task is internally managed by dedicated thread.

This scheduler should work on all platforms (at least on Linux,
Windows XP and Vista).

Instance Methods [hide private]
 
__init__(self, result_container=None, verbose=False, n_processes=1, source_paths=None, python_executable=None, cache_callable=True)
Initialize the scheduler and start the slave processes.
 
_process_task(self, data, task_callable, task_index)
Add a task, if possible without blocking.
 
_shutdown(self)
Shut down the slave processes.
 
_task_thread(self, process, data, task_callable, task_index)
Thread function which cares for a single task.

Inherited from unreachable.newobject: __long__, __native__, __nonzero__, __unicode__, next

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

    Inherited from Scheduler
 
__enter__(self)
Return self.
 
__exit__(self, type, value, traceback)
Shutdown the scheduler.
 
_store_result(self, result, task_index)
Store a result in the internal result container.
 
add_task(self, data, task_callable=None)
Add a task to be executed.
 
get_results(self)
Get the accumulated results from the result container.
 
set_task_callable(self, task_callable)
Set the callable that will be used if no task_callable is given.
 
shutdown(self)
Controlled shutdown of the scheduler.
Properties [hide private]

Inherited from object: __class__

    Inherited from Scheduler
  n_open_tasks
This property counts of submitted but unfinished tasks.
  task_counter
This property counts the number of submitted tasks.
Method Details [hide private]

__init__(self, result_container=None, verbose=False, n_processes=1, source_paths=None, python_executable=None, cache_callable=True)
(Constructor)

 
Initialize the scheduler and start the slave processes.

result_container -- ResultContainer used to store the results.
verbose -- Set to True to get progress reports from the scheduler
    (default value is False).
n_processes -- Number of processes used in parallel. If None (default)
    then the number of detected CPU cores is used.
source_paths -- List of paths that are added to sys.path in
    the processes to make the task unpickling work. A single path
    instead of a list is also accepted.
    If None (default value) then source_paths is set to sys.path.
    To prevent this you can specify an empty list.
python_executable -- Python executable that is used for the processes.
    The default value is None, in which case sys.executable will be
    used.
cache_callable -- Cache the task objects in the processes (default
    is True). Disabling caching can reduce the memory usage, but will
    generally be less efficient since the task_callable has to be
    pickled each time.

Overrides: object.__init__

_process_task(self, data, task_callable, task_index)

 
Add a task, if possible without blocking.

It blocks when the system is not able to start a new thread
or when the processes are all in use.

Overrides: Scheduler._process_task

_shutdown(self)

 
Shut down the slave processes.

If a process is still running a task then an exception is raised.

Overrides: Scheduler._shutdown

_task_thread(self, process, data, task_callable, task_index)

 
Thread function which cares for a single task.

The task is pushed to the process via stdin, then we wait for the
result on stdout, pass the result to the result container, free
the process and exit.