Transform a count matrix to a TF or TF-IDF representation
This node has been automatically generated by wrapping the scikits.learn.feature_extraction.text.TfidfTransformer class
from the sklearn library. The wrapped instance can be accessed
through the scikits_alg attribute.
TF means term-frequency, while TF-IDF means term-frequency times inverse
document-frequency.
The goal of using TF-IDF instead of the raw frequencies of occurrence of a
token in a given document is to scale down the impact of tokens that occur
very frequently in a given corpus and that are hence empirically less
informative than features that occur in a small fraction of the training
corpus.
TF-IDF can be seen as a smooth alternative to stop-word filtering.
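The down-weighting described above can be illustrated with a minimal pure-Python sketch. It assumes one common weighting, count * (ln(N / df) + 1), where N is the number of documents and df is the number of documents containing the term. The wrapped class may use a different (for example smoothed or normalized) variant, and the helper name `tfidf_weights` is hypothetical.

```python
import math


def tfidf_weights(counts):
    """Weight a dense count matrix (list of rows, one row per document).

    Assumed formula (one common variant, not necessarily the one used by
    the wrapped class): weight = count * (ln(n_docs / doc_freq) + 1).
    """
    n_docs = len(counts)
    n_terms = len(counts[0])
    # Document frequency: in how many documents each term occurs at least once.
    doc_freq = [sum(1 for row in counts if row[j] > 0) for j in range(n_terms)]
    # Terms that never occur get an IDF of 0 to avoid dividing by zero.
    idf = [math.log(n_docs / df) + 1.0 if df else 0.0 for df in doc_freq]
    return [[row[j] * idf[j] for j in range(n_terms)] for row in counts]


# Term 0 occurs in every document, term 1 in only one of the three.
counts = [[3, 0], [2, 1], [4, 0]]
weights = tfidf_weights(counts)
```

In the second document the rare term ends up with a larger weight than the common one, even though its raw count is lower.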
__init__(self, input_dim=None, output_dim=None, dtype=None, **kwargs)
_get_supported_dtypes(self)
Return the list of dtypes supported by this node.
The types can be specified in any format allowed by numpy.dtype.
Returns: list
_stop_training(self, **kwargs)
Concatenate the collected data in a single array.
execute(self, x)
Transform a count matrix to a TF or TF-IDF representation
stop_training(self, **kwargs)
Learn the IDF vector (global term weights)
Inherited from unreachable.newobject: __long__, __native__, __nonzero__, __unicode__, next
Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __subclasshook__
_train(self, *args)
Collect all input data in a list.
train(self, *args)
Collect all input data in a list.
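The training pattern described by train and _stop_training (collect chunks, then concatenate them once) might look roughly like the sketch below. It is an illustrative stand-in built on plain Python lists, not the node's actual implementation; the class name `CollectingNode` and its attributes are hypothetical.

```python
class CollectingNode:
    """Hypothetical stand-in showing the collect-then-concatenate pattern."""

    def __init__(self):
        self._chunks = []       # one entry per call to train()
        self._training = True
        self.data = None        # filled in by stop_training()

    def is_training(self):
        return self._training

    def train(self, x):
        if not self._training:
            raise RuntimeError("training phase is already over")
        # Only store the chunk; no computation happens yet.
        self._chunks.append(list(x))

    def stop_training(self):
        # Concatenate all collected chunks (lists of rows) into one list.
        self.data = [row for chunk in self._chunks for row in chunk]
        self._chunks = []
        self._training = False


node = CollectingNode()
node.train([[1, 0], [0, 2]])
node.train([[3, 1]])
node.stop_training()
```

Deferring the work to stop_training lets a model that needs the whole data set at once be fed in several smaller calls.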
__call__(self, x, *args, **kwargs)
Calling an instance of Node is equivalent to calling
its execute method.
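The equivalence between calling a node and calling its execute method can be sketched with a toy class. `DoublingNode` is a hypothetical example, not part of the library; it only shows the delegation.

```python
class DoublingNode:
    """Hypothetical node whose execute() doubles every value."""

    def execute(self, x, *args, **kwargs):
        return [2 * value for value in x]

    def __call__(self, x, *args, **kwargs):
        # Delegate to execute() so that node(x) and node.execute(x) agree.
        return self.execute(x, *args, **kwargs)


node = DoublingNode()
called = node([1, 2, 3])
executed = node.execute([1, 2, 3])
```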
_refcast(self, x)
Helper function to cast arrays to the internal dtype.
copy(self, protocol=None)
Return a deep copy of the node.
inverse(self, y, *args, **kwargs)
Invert y.
is_training(self)
Return True if the node is in the training phase,
False otherwise.
save(self, filename, protocol=-1)
Save a pickled serialization of the node to filename.
If filename is None, return a string.
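The documented behaviour of save can be approximated with the standard pickle module, as in the sketch below. The helper `save_node` is hypothetical and a plain dictionary stands in for a trained node; note that under Python 3 `pickle.dumps` returns bytes rather than a text string.

```python
import os
import pickle
import tempfile


def save_node(node, filename, protocol=-1):
    """Hypothetical helper mirroring the documented behaviour of save()."""
    if filename is None:
        # No target file: hand back the serialized object instead.
        return pickle.dumps(node, protocol)
    with open(filename, "wb") as handle:
        pickle.dump(node, handle, protocol)
    return None


state = {"idf": [1.0, 2.1]}  # stand-in for a trained node
blob = save_node(state, None)

path = os.path.join(tempfile.mkdtemp(), "node.pkl")
save_node(state, path)
with open(path, "rb") as handle:
    restored = pickle.load(handle)
```

A protocol of -1 asks pickle for the highest protocol available in the running interpreter.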
set_dtype(self, t)
Set internal structures' dtype.