sklearn.preprocessing.Binarizer

xiaoxiao  2021-02-28

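The Binarizer class and the binarize function binarize features against a given threshold: values less than or equal to the threshold are set to 0, and values greater than the threshold are set to 1. In both cases threshold defaults to 0.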
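The rule itself can be sketched in a few lines of plain Python. The helper name binarize_values is hypothetical and is not part of scikit-learn; it only mirrors the comparison described above, where a value equal to the threshold maps to 0.

```python
def binarize_values(values, threshold=0.0):
    """Return 1 for each value strictly greater than threshold, otherwise 0.

    Hypothetical helper for illustration only; not a scikit-learn function.
    """
    return [1 if v > threshold else 0 for v in values]


sample = [-1.5, 0.0, 0.3, 2.0]

# Default threshold of 0: the value 0.0 is not greater than 0, so it maps to 0.
print(binarize_values(sample))        # [0, 0, 1, 1]

# Threshold of 0.3: the value 0.3 equals the threshold, so it also maps to 0.
print(binarize_values(sample, 0.3))   # [0, 0, 0, 1]
```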

① The binarize function: sklearn.preprocessing.binarize(X, threshold=0.0, copy=True)

a. For non-sparse (dense) matrices, threshold can be any float.

In [1]: from sklearn import preprocessing
   ...: from sklearn import datasets
   ...: import numpy as np
   ...: data = datasets.load_boston()
   ...: new_target = preprocessing.binarize(data.target[:,np.newaxis], threshold = data.target.mean()).astype(int)  # values <= the mean become 0, all others become 1
   ...: print(type(preprocessing.binarize(data.target[:,np.newaxis], threshold = data.target.mean())))
   ...: new_target[:5]
   ...:
<class 'numpy.ndarray'>
Out[1]:
array([[1],
       [0],
       [1],
       [1],
       [1]])

In [2]: preprocessing.binarize(data.target[:,np.newaxis], threshold = -1).astype(int)[:5]
Out[2]:
array([[1],
       [1],
       [1],
       [1],
       [1]])

b. For sparse matrices, threshold must be a float greater than or equal to 0.

In [3]: from scipy.sparse import coo
   ...: from sklearn import preprocessing
   ...: spar = coo.coo_matrix(np.random.binomial(1,0.25,100))
   ...: preprocessing.binarize(spar,threshold=-1)
   ...:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-ff778f656a6b> in <module>()
      2 from sklearn import preprocessing
      3 spar = coo.coo_matrix(np.random.binomial(1,0.25,100))
----> 4 preprocessing.binarize(spar,threshold=-1)

d:\softwore\python\lib\site-packages\sklearn\preprocessing\data.py in binarize(X, threshold, copy)
   1470     if sparse.issparse(X):
   1471         if threshold < 0:
-> 1472             raise ValueError('Cannot binarize a sparse matrix with threshold '
   1473                              '< 0')
   1474         cond = X.data > threshold

ValueError: Cannot binarize a sparse matrix with threshold < 0

In [4]: preprocessing.binarize(spar,threshold=0)
Out[4]:
<1x100 sparse matrix of type '<class 'numpy.int32'>'
        with 24 stored elements in Compressed Sparse Row format>

② The Binarizer class: sklearn.preprocessing.Binarizer(threshold=0.0, copy=True)

a. For non-sparse (dense) matrices, threshold can be any float.

In [5]: from sklearn import preprocessing
   ...: from sklearn import datasets
   ...: import numpy as np
   ...: data = datasets.load_boston()
   ...: bz = preprocessing.Binarizer(data.target.mean())
   ...: new_target = bz.fit_transform(data.target[:,np.newaxis]).astype(int)
   ...: print(bz)
   ...: new_target[:5]
   ...:
Binarizer(copy=True, threshold=22.532806324110677)
Out[5]:
array([[1],
       [0],
       [1],
       [1],
       [1]])

In [6]: preprocessing.Binarizer(-1).fit_transform(data.target[:,np.newaxis]).astype(int)[:5]
Out[6]:
array([[1],
       [1],
       [1],
       [1],
       [1]])

b. For sparse matrices, threshold must likewise be a float greater than or equal to 0.

In [7]: from scipy.sparse import coo
   ...: spar = coo.coo_matrix(np.random.binomial(1,0.25,100))
   ...: preprocessing.Binarizer(threshold= -1).fit_transform(spar)
   ...:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-fc5a78d3b8c5> in <module>()
      1 from scipy.sparse import coo
      2 spar = coo.coo_matrix(np.random.binomial(1,0.25,100))
----> 3 preprocessing.Binarizer(threshold= -1).fit_transform(spar)

d:\softwore\python\lib\site-packages\sklearn\base.py in fit_transform(self, X, y, **fit_params)
    492         if y is None:
    493             # fit method of arity 1 (unsupervised transformation)
--> 494             return self.fit(X, **fit_params).transform(X)
    495         else:
    496             # fit method of arity 2 (supervised transformation)

d:\softwore\python\lib\site-packages\sklearn\preprocessing\data.py in transform(self, X, y, copy)
   1549         """
   1550         copy = copy if copy is not None else self.copy
-> 1551         return binarize(X, threshold=self.threshold, copy=copy)
   1552
   1553

d:\softwore\python\lib\site-packages\sklearn\preprocessing\data.py in binarize(X, threshold, copy)
   1470     if sparse.issparse(X):
   1471         if threshold < 0:
-> 1472             raise ValueError('Cannot binarize a sparse matrix with threshold '
   1473                              '< 0')
   1474         cond = X.data > threshold

ValueError: Cannot binarize a sparse matrix with threshold < 0
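The sessions above were recorded on an older scikit-learn. In recent releases load_boston has been removed and Binarizer takes threshold as a keyword argument only, so the sketch below repeats the same comparisons on a small made-up array instead of the Boston data. It also hints at why sparse input needs a non-negative threshold: as the traceback shows, only the stored entries (X.data) are compared, and with a negative threshold every implicit zero would have to become 1. The array values and variable names are illustrative assumptions.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import Binarizer, binarize

# Small made-up column vector (illustrative data, not the Boston targets).
y = np.array([[-2.0], [0.0], [1.5], [3.0]])

# Dense input: any float threshold is accepted, including the mean (0.625 here).
above_mean = binarize(y, threshold=y.mean()).astype(int)

# Dense input with a negative threshold: 0.0 > -1.0, so zeros become 1.
above_minus_one = Binarizer(threshold=-1.0).fit_transform(y).astype(int)

# Sparse input: only the explicitly stored entries (2.0 and 0.5) are compared.
X_sparse = sparse.csr_matrix(np.array([[0.0, 2.0, 0.0, 0.5]]))
sparse_result = binarize(X_sparse, threshold=0.5)

# Sparse input with a negative threshold is rejected with a ValueError.
try:
    binarize(X_sparse, threshold=-1.0)
    sparse_error = None
except ValueError as exc:
    sparse_error = str(exc)

print(above_mean.ravel())        # [0 0 1 1]
print(above_minus_one.ravel())   # [0 1 1 1]
print(sparse_result.toarray())   # only the entry that was 2.0 becomes 1
print(sparse_error)
```

If a negative threshold is really needed for sparse data, one option is to convert to a dense array first with X_sparse.toarray(), at the cost of losing the memory savings of the sparse format.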

When reposting, please credit the original URL: https://www.6miu.com/read-70303.html
