Using Grid Search in scikit-learn

xiaoxiao  2021-02-28

1. Grid search is used to find the best hyperparameters for a model

First, import the required packages:

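from sklearn.ensemble import GradientBoostingClassifier
# GridSearchCV lives in sklearn.model_selection; the older sklearn.grid_search module was removed in scikit-learn 0.20
from sklearn.model_selection import GridSearchCV
from sklearn import metrics
import numpy as np
import pandas as pd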

2. Define the parameter grid to search

params = {
    'learning_rate': np.linspace(0.05, 0.25, 5),
    'max_depth': list(range(1, 8)),
    'min_samples_leaf': list(range(1, 5)),
    'n_estimators': list(range(50, 100, 10)),
}
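Before launching the search it helps to know how large it is. The sketch below counts the candidates in the grid above with ParameterGrid from sklearn.model_selection; the variable names n_candidates, cv_folds and n_fits are chosen here for illustration only.

```python
import numpy as np
from sklearn.model_selection import ParameterGrid

# Same grid as in step 2
params = {
    "learning_rate": np.linspace(0.05, 0.25, 5),
    "max_depth": list(range(1, 8)),
    "min_samples_leaf": list(range(1, 5)),
    "n_estimators": list(range(50, 100, 10)),
}

# ParameterGrid enumerates every combination; len() gives the candidate count
n_candidates = len(ParameterGrid(params))

# With 10-fold cross-validation each candidate is trained 10 times
cv_folds = 10
n_fits = n_candidates * cv_folds

print(n_candidates, n_fits)
```

The grid has 5 x 7 x 4 x 5 = 700 candidates, so cv=10 means 7000 model fits, plus one final refit on the full data. If that is too slow, shrinking the grid or lowering cv reduces the cost proportionally.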

3. Set up the model and the scoring metric, then train the model with each parameter combination

clf = GradientBoostingClassifier()
grid = GridSearchCV(clf, params, cv=10, scoring="f1")
grid.fit(X, y)
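The snippet above assumes X and y already exist. A minimal runnable sketch, assuming a synthetic binary dataset from make_classification and a deliberately small grid (small_params is a name made up for this example) so that it finishes quickly, might look like this:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic binary classification data standing in for the real X, y (assumption)
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# A reduced grid so the example runs in seconds
small_params = {
    "learning_rate": [0.05, 0.15],
    "max_depth": [1, 2],
    "n_estimators": [20],
}

clf = GradientBoostingClassifier(random_state=0)
grid = GridSearchCV(clf, small_params, cv=3, scoring="f1")
grid.fit(X, y)  # trains every combination with 3-fold cross-validation

print(grid.best_params_)
```

Passing n_jobs=-1 to GridSearchCV runs the fits in parallel on all available cores, which is usually worthwhile for the full grid.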

The predefined values accepted by scoring are as follows:

Classification

scoring                     function                          comment
accuracy                    metrics.accuracy_score
average_precision           metrics.average_precision_score
f1                          metrics.f1_score                  for binary targets
f1_micro                    metrics.f1_score                  micro-averaged
f1_macro                    metrics.f1_score                  macro-averaged
f1_weighted                 metrics.f1_score                  weighted average
f1_samples                  metrics.f1_score                  by multilabel sample
neg_log_loss                metrics.log_loss                  requires predict_proba support
precision etc.              metrics.precision_score           suffixes apply as with "f1"
recall etc.                 metrics.recall_score              suffixes apply as with "f1"
roc_auc                     metrics.roc_auc_score

Clustering

scoring                     function                          comment
adjusted_rand_score         metrics.adjusted_rand_score

Regression

scoring                     function                          comment
neg_mean_absolute_error     metrics.mean_absolute_error
neg_mean_squared_error      metrics.mean_squared_error
neg_median_absolute_error   metrics.median_absolute_error
r2                          metrics.r2_score
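The neg_ prefix on the loss-based scorers exists because scikit-learn always treats a higher score as better, so error metrics are reported with their sign flipped. The sketch below illustrates this with cross_val_score on synthetic regression data; the dataset, the GradientBoostingRegressor settings, and the variable names are assumptions made for the example.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

# Synthetic regression data (assumption, for illustration only)
X_reg, y_reg = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)

reg = GradientBoostingRegressor(n_estimators=20, random_state=0)

# One score per fold; each is the negated mean squared error
scores = cross_val_score(reg, X_reg, y_reg, cv=5, scoring="neg_mean_squared_error")

# Flip the sign to recover the ordinary (non-negative) MSE per fold
mse_per_fold = -scores

print(scores)
print(mse_per_fold)
```

The same convention applies inside GridSearchCV: with scoring="neg_mean_squared_error", best_score_ is a negative number and the candidate closest to zero wins.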

4. Inspect the best score and the best parameters

grid.best_score_   # best mean cross-validated score (here, the f1 score)
grid.best_params_  # parameter combination that achieved the best score
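Beyond the single best result, the fitted search object also keeps the scores of every candidate in its cv_results_ attribute. As a rough sketch, assuming synthetic data and a tiny one-parameter grid, the results can be loaded into a pandas DataFrame and ranked:

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data and a small grid (assumptions, to keep the example fast)
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
grid = GridSearchCV(
    GradientBoostingClassifier(n_estimators=20, random_state=0),
    {"max_depth": [1, 2, 3]},
    cv=3,
    scoring="f1",
)
grid.fit(X, y)

# cv_results_ is a dict of arrays with one entry per candidate
results = pd.DataFrame(grid.cv_results_)

# Keep the most useful columns and sort so the best candidate comes first
columns = ["params", "mean_test_score", "std_test_score", "rank_test_score"]
ranked = results[columns].sort_values("rank_test_score")

print(ranked)
```

Looking at std_test_score alongside mean_test_score shows whether the winner is clearly better or only marginally ahead of the other candidates.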

5. Retrieve the best model

grid.best_estimator_

6. Use the best model to make predictions

best_model = grid.best_estimator_
predict_y = best_model.predict(Test_X)
# Test_y is assumed to hold the true labels for Test_X
metrics.f1_score(Test_y, predict_y)
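Putting the steps together, here is a hedged end-to-end sketch that holds out a test set before searching. The synthetic data, the split sizes, and the names train_X, test_X, train_y and test_y are assumptions made for this example, not part of the original workflow.

```python
import numpy as np
from sklearn import metrics
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data, split so the test set is never seen during the search
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
train_X, test_X, train_y, test_y = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

grid = GridSearchCV(
    GradientBoostingClassifier(n_estimators=20, random_state=0),
    {"max_depth": [1, 2]},
    cv=3,
    scoring="f1",
)
grid.fit(train_X, train_y)

# best_estimator_ is the model refit on all training data with the best parameters
best_model = grid.best_estimator_
predict_y = best_model.predict(test_X)
test_f1 = metrics.f1_score(test_y, predict_y)

# With the default refit=True, grid.predict delegates to best_estimator_
same = np.array_equal(grid.predict(test_X), predict_y)

print(test_f1, same)
```

Because the search object forwards predict to the refit model, calling grid.predict(test_X) directly is an equivalent shortcut.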