mafese.wrapper package

mafese.wrapper.recursive module

class mafese.wrapper.recursive.RecursiveSelector(problem='classification', estimator='knn', estimator_paras=None, n_features=3, step=1, verbose=0, importance_getter='auto')[source]

Bases: mafese.selector.Selector

Defines a RecursiveSelector class that holds all recursive feature selection methods for feature selection problems

Parameters
  • problem (str, default = "classification") – The problem you are trying to solve (or type of dataset), “classification” or “regression”

  • estimator (str or Estimator instance (from scikit-learn or custom)) –

    If estimator is a str, we currently support:
    • svm: support vector machine with kernel = ‘linear’

    • rf: random forest

    • adaboost: AdaBoost

    • xgb: Gradient Boosting

    • tree: Extra Trees

    If estimator is an Estimator instance: you need to make sure it has a fit method that provides information about feature importance (e.g. coef_, feature_importances_).

  • estimator_paras (None or dict, default = None) – The parameters of the estimator; please see the official scikit-learn documentation for the selected estimator. If None, we use the best-known parameters for the selected estimator

  • n_features (int or float, default=3) – The number of features to select. If None, half of the features are selected. If integer, the parameter is the absolute number of features to select. If float between 0 and 1, it is the fraction of features to select.

  • step (int or float, default=1) – If greater than or equal to 1, then step corresponds to the (integer) number of features to remove at each iteration. If within (0.0, 1.0), then step corresponds to the percentage (rounded down) of features to remove at each iteration.

  • verbose (int, default=0) – Controls verbosity of output.

  • importance_getter (str or callable, default='auto') –

    If ‘auto’, uses the feature importance either through a coef_ or feature_importances_ attributes of estimator.

    Also accepts a string that specifies an attribute name/path for extracting feature importance (implemented with attrgetter). For example, give regressor_.coef_ in case of TransformedTargetRegressor, or named_steps.clf.feature_importances_ in case of a sklearn.pipeline.Pipeline with its last step named clf.

    If callable, overrides the default feature importance getter. The callable is passed with the fitted estimator and it should return importance for each feature.
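
Besides the built-in string names, an Estimator instance combined with an explicit importance_getter can be passed. The snippet below is a minimal sketch (the LogisticRegression settings are illustrative assumptions, not library defaults):

>>> from sklearn.linear_model import LogisticRegression
>>> from mafese.wrapper.recursive import RecursiveSelector
>>> # LogisticRegression exposes coef_, so importance_getter='auto' would also work;
>>> # the explicit attribute name is shown for illustration
>>> feat_selector = RecursiveSelector(problem="classification",
...                                   estimator=LogisticRegression(max_iter=1000),
...                                   n_features=0.5, importance_getter="coef_")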

Examples

The following example shows how to retrieve the most informative features using the RecursiveSelector FS method

>>> import pandas as pd
>>> from mafese.wrapper.recursive import RecursiveSelector
>>> # load dataset
>>> dataset = pd.read_csv('your_path/dataset.csv', index_col=0).values
>>> X, y = dataset[:, 0:-1], dataset[:, -1]     # assume the last column is the label column
>>> # define mafese feature selection method
>>> feat_selector = RecursiveSelector(problem="classification", estimator="rf", n_features=5)
>>> # find all relevant features
>>> feat_selector.fit(X, y)
>>> # check selected features - True (or 1) is selected, False (or 0) is not selected
>>> print(feat_selector.selected_feature_masks)
array([ True, True, True, False, False, True, False, False, False, True])
>>> print(feat_selector.selected_feature_solution)
array([ 1, 1, 1, 0, 0, 1, 0, 0, 0, 1])
>>> # check the index of selected features
>>> print(feat_selector.selected_feature_indexes)
array([ 0, 1, 2, 5, 9])
>>> # call transform() on X to filter it down to selected features
>>> X_filtered = feat_selector.transform(X)
SUPPORT = ['svm', 'rf', 'adaboost', 'xgb', 'tree']
fit(X, y=None)[source]

Learn the features to select from X.

Parameters
  • X (array-like of shape (n_samples, n_features)) – Training vectors, where n_samples is the number of samples and n_features is the number of predictors.

  • y (array-like of shape (n_samples,), default=None) – Target values. This parameter may be ignored for unsupervised learning.

Returns

self – Returns the instance itself.

Return type

object

mafese.wrapper.sequential module

class mafese.wrapper.sequential.SequentialSelector(problem='classification', estimator='knn', estimator_paras=None, n_features=3, direction='forward', tol=None, scoring=None, cv=5, n_jobs=None)[source]

Bases: mafese.selector.Selector

Defines a SequentialSelector class that holds all forward or backward feature selection methods for feature selection problems

Parameters
  • problem (str, default = "classification") – The problem you are trying to solve (or type of dataset), “classification” or “regression”

  • estimator (str or Estimator instance (from scikit-learn or custom)) –

    If estimator is a str, we currently support:
    • knn: k-nearest neighbors

    • svm: support vector machine

    • rf: random forest

    • adaboost: AdaBoost

    • xgb: Gradient Boosting

    • tree: Extra Trees

    • ann: Artificial Neural Network (Multi-Layer Perceptron)

    If estimator is an Estimator instance: you need to make sure it has a fit method that provides information about feature importance (e.g. coef_, feature_importances_).

  • estimator_paras (None or dict, default = None) – The parameters of the estimator; please see the official scikit-learn documentation for the selected estimator. If None, we use the default parameters for the selected estimator

  • n_features (int or float, default=3) – The number of features to select. If None, half of the features are selected. If integer, the parameter is the absolute number of features to select. If float between 0 and 1, it is the fraction of features to select.

  • direction ({'forward', 'backward'}, default='forward') – Whether to perform forward selection or backward selection.

  • tol (float, default=None) – If the score is not incremented by at least tol between two consecutive feature additions or removals, stop adding or removing. tol can be negative when removing features using direction=”backward”. It can be useful to reduce the number of features at the cost of a small decrease in the score. tol is enabled only when n_features is “auto”.

  • scoring (str or callable, default=None) – A single str (see the scikit-learn scoring parameter documentation) or a callable to evaluate the predictions on the test set. NOTE that when using a custom scorer, it should return a single value. If None, the estimator’s score method is used.

  • cv (int, cross-validation generator or an iterable, default=5) –

    Determines the cross-validation splitting strategy (a custom splitter is sketched after this parameter list). Possible inputs for cv are:

    • None, to use the default 5-fold cross validation,

    • integer, to specify the number of folds in a (Stratified)KFold,

    • CV splitter,

    • An iterable yielding (train, test) splits as arrays of indices.

    For integer/None inputs, if the estimator is a classifier and y is either binary or multiclass, StratifiedKFold is used. In all other cases, KFold is used. These splitters are instantiated with shuffle=False so the splits will be the same across calls.

  • n_jobs (int, default=None) – Number of jobs to run in parallel. When evaluating a new feature to add or remove, the cross-validation procedure is parallel over the folds. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
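
As an illustration of the cv and scoring options above, a custom splitter and a named scorer can be passed together. A minimal sketch, assuming both values are forwarded to the underlying cross-validation (the StratifiedKFold settings are illustrative):

>>> from sklearn.model_selection import StratifiedKFold
>>> from mafese.wrapper.sequential import SequentialSelector
>>> # use an explicit splitter and scorer instead of the integer cv and the default score method
>>> feat_selector = SequentialSelector(problem="classification", estimator="knn",
...                                    n_features=5, direction="backward",
...                                    scoring="accuracy", cv=StratifiedKFold(n_splits=3),
...                                    n_jobs=-1)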

Examples

The following example shows how to retrieve the most informative features using the Sequential-based (forward, backward) FS method

>>> import pandas as pd
>>> from mafese.wrapper.sequential import SequentialSelector
>>> # load dataset
>>> dataset = pd.read_csv('your_path/dataset.csv', index_col=0).values
>>> X, y = dataset[:, 0:-1], dataset[:, -1]     # assume the last column is the label column
>>> # define mafese feature selection method
>>> feat_selector = SequentialSelector(problem="classification", estimator="knn", n_features=5, direction="forward")
>>> # find all relevant features
>>> feat_selector.fit(X, y)
>>> # check selected features - True (or 1) is selected, False (or 0) is not selected
>>> print(feat_selector.selected_feature_masks)
array([ True, True, True, False, False, True, False, False, False, True])
>>> print(feat_selector.selected_feature_solution)
array([ 1, 1, 1, 0, 0, 1, 0, 0, 0, 1])
>>> # check the index of selected features
>>> print(feat_selector.selected_feature_indexes)
array([ 0, 1, 2, 5, 9])
>>> # call transform() on X to filter it down to selected features
>>> X_filtered = feat_selector.transform(X)
SUPPORT = ['knn', 'svm', 'rf', 'adaboost', 'xgb', 'tree', 'ann']
fit(X, y=None)[source]

Learn the features to select from X.

Parameters
  • X (array-like of shape (n_samples, n_features)) – Training vectors, where n_samples is the number of samples and n_features is the number of predictors.

  • y (array-like of shape (n_samples,), default=None) – Target values. This parameter may be ignored for unsupervised learning.

Returns

self – Returns the instance itself.

Return type

object

mafese.wrapper.mha module

class mafese.wrapper.mha.MhaSelector(problem='classification', estimator='knn', estimator_paras=None, optimizer='BaseGA', optimizer_paras=None, transfer_func='vstf_01', obj_name=None)[source]

Bases: mafese.selector.Selector

Defines a MhaSelector class that holds all Metaheuristic-based Feature Selection methods for feature selection problems

Parameters
  • problem (str, default = "classification") – The problem you are trying to solve (or type of dataset), “classification” or “regression”

  • estimator (str or Estimator instance (from scikit-learn or custom)) –

    If estimator is a str, we currently support:
    • knn: k-nearest neighbors

    • svm: support vector machine

    • rf: random forest

    • adaboost: AdaBoost

    • xgb: Gradient Boosting

    • tree: Extra Trees

    • ann: Artificial Neural Network (Multi-Layer Perceptron)

    If estimator is an Estimator instance: you need to make sure that it has fit and predict methods

  • estimator_paras (None or dict, default = None) – The parameters of the estimator; please see the official scikit-learn documentation for the selected estimator. If None, we use the default parameters for the selected estimator

  • optimizer (str or instance of Optimizer class (from the Mealpy library), default = "BaseGA") – The metaheuristic algorithm used to solve the feature selection problem. For the currently supported list, please check: https://github.com/thieu1995/mealpy. If a custom optimizer is passed, make sure it is an instance of the Optimizer class.

  • optimizer_paras (None or dict of parameters, default=None) – The parameters for the optimizer object. If None, the default parameters of the optimizer are used (defined in https://github.com/thieu1995/mealpy). If a dict is passed, make sure it has at least the epoch and pop_size parameters.

  • transfer_func (str or callable function, default="vstf_01") –

    The transfer function used to convert solutions from float to integer (a custom callable is sketched after this parameter list). Currently supported:
    • v-shape transfer function: “vstf_01”, “vstf_02”, “vstf_03”, “vstf_04”

    • s-shape transfer function: “sstf_01”, “sstf_02”, “sstf_03”, “sstf_04”

    If a callable is given, make sure it returns a list, tuple, or np.ndarray of values.

  • obj_name (None or str, default=None) –

    The name of the objective for the problem; the valid names depend on whether the problem is classification or regression.

    • If problem is classification, None will be replaced by AS (Accuracy score).

    • If problem is regression, None will be replaced by MSE (Mean squared error).
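
For example, a custom callable can replace the named transfer functions, and optimizer_paras then carries the required epoch and pop_size keys. A minimal sketch; the sigmoid below is a hypothetical transfer function (not one of the built-ins) and all parameter values are illustrative:

>>> import numpy as np
>>> from mafese.wrapper.mha import MhaSelector
>>> def my_transfer(x):
...     # hypothetical s-shaped (sigmoid) transfer: maps float solutions into (0, 1)
...     return 1.0 / (1.0 + np.exp(-np.asarray(x)))
>>> feat_selector = MhaSelector(problem="classification", estimator="knn",
...                             optimizer="OriginalPSO",
...                             optimizer_paras={"epoch": 50, "pop_size": 20},
...                             transfer_func=my_transfer, obj_name="F1S")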

Examples

The following example shows how to retrieve the most informative features using the MhaSelector FS method

>>> import pandas as pd
>>> from mafese.wrapper.mha import MhaSelector
>>> # load dataset
>>> dataset = pd.read_csv('your_path/dataset.csv', index_col=0).values
>>> X, y = dataset[:, 0:-1], dataset[:, -1]     # assume the last column is the label column
>>> # define mafese feature selection method
>>> feat_selector = MhaSelector(problem="classification", estimator="rf", optimizer="BaseGA")
>>> # find all relevant features
>>> feat_selector.fit(X, y)
>>> # check selected features - True (or 1) is selected, False (or 0) is not selected
>>> print(feat_selector.selected_feature_masks)
array([ True, True, True, False, False, True, False, False, False, True])
>>> print(feat_selector.selected_feature_solution)
array([ 1, 1, 1, 0, 0, 1, 0, 0, 0, 1])
>>> # check the index of selected features
>>> print(feat_selector.selected_feature_indexes)
array([ 0, 1, 2, 5, 9])
>>> # call transform() on X to filter it down to selected features
>>> X_filtered = feat_selector.transform(X)
SUPPORT = {'classification_objective': {'AS': 'max', 'BSL': 'min', 'CEL': 'min', 'CKS': 'max', 'F1S': 'max', 'F2S': 'max', 'FBS': 'max', 'GINI': 'min', 'GMS': 'max', 'HL': 'min', 'HS': 'max', 'JSI': 'max', 'KLDL': 'min', 'LS': 'max', 'MCC': 'max', 'NPV': 'max', 'PS': 'max', 'ROC-AUC': 'max', 'RS': 'max', 'SS': 'max'}, 'estimator': ['knn', 'svm', 'rf', 'adaboost', 'xgb', 'tree', 'ann'], 'optimizer': ['OriginalABC', 'OriginalACOR', 'AugmentedAEO', 'EnhancedAEO', 'ImprovedAEO', 'ModifiedAEO', 'OriginalAEO', 'MGTO', 'OriginalAGTO', 'BaseALO', 'OriginalALO', 'OriginalAO', 'OriginalAOA', 'IARO', 'LARO', 'OriginalARO', 'OriginalASO', 'OriginalAVOA', 'OriginalArchOA', 'AdaptiveBA', 'ModifiedBA', 'OriginalBA', 'BaseBBO', 'OriginalBBO', 'OriginalBBOA', 'OriginalBES', 'ABFO', 'OriginalBFO', 'OriginalBMO', 'BaseBRO', 'OriginalBRO', 'OriginalBSA', 'ImprovedBSO', 'OriginalBSO', 'CleverBookBeesA', 'OriginalBeesA', 'ProbBeesA', 'OriginalCA', 'OriginalCDO', 'OriginalCEM', 'OriginalCGO', 'BaseCHIO', 'OriginalCHIO', 'OriginalCOA', 'OCRO', 'OriginalCRO', 'OriginalCSA', 'OriginalCSO', 'OriginalCircleSA', 'OriginalCoatiOA', 'BaseDE', 'JADE', 'SADE', 'SAP_DE', 'DevDMOA', 'OriginalDMOA', 'OriginalDO', 'BaseEFO', 'OriginalEFO', 'OriginalEHO', 'AdaptiveEO', 'ModifiedEO', 'OriginalEO', 'OriginalEOA', 'LevyEP', 'OriginalEP', 'CMA_ES', 'LevyES', 'OriginalES', 'Simple_CMA_ES', 'OriginalESOA', 'OriginalEVO', 'OriginalFA', 'BaseFBIO', 'OriginalFBIO', 'OriginalFFA', 'OriginalFFO', 'OriginalFLA', 'BaseFOA', 'OriginalFOA', 'WhaleFOA', 'OriginalFOX', 'OriginalFPA', 'BaseGA', 'EliteMultiGA', 'EliteSingleGA', 'MultiGA', 'SingleGA', 'OriginalGBO', 'BaseGCO', 'OriginalGCO', 'OriginalGJO', 'OriginalGOA', 'BaseGSKA', 'OriginalGSKA', 'Matlab101GTO', 'Matlab102GTO', 'OriginalGTO', 'GWO_WOA', 'IGWO', 'OriginalGWO', 'RW_GWO', 'OriginalHBA', 'OriginalHBO', 'OriginalHC', 'SwarmHC', 'OriginalHCO', 'OriginalHGS', 'OriginalHGSO', 'OriginalHHO', 'BaseHS', 'OriginalHS', 'OriginalICA', 'OriginalINFO', 'OriginalIWO', 'BaseJA', 'LevyJA', 'OriginalJA', 'BaseLCO', 'ImprovedLCO', 'OriginalLCO', 'OriginalMA', 'BaseMFO', 'OriginalMFO', 'OriginalMGO', 'OriginalMPA', 'OriginalMRFO', 'WMQIMRFO', 'OriginalMSA', 'BaseMVO', 'OriginalMVO', 'OriginalNGO', 'ImprovedNMRA', 'OriginalNMRA', 'OriginalNRO', 'OriginalOOA', 'OriginalPFA', 'OriginalPOA', 'CL_PSO', 'C_PSO', 'HPSO_TVAC', 'OriginalPSO', 'PPSO', 'OriginalPSS', 'BaseQSA', 'ImprovedQSA', 'LevyQSA', 'OppoQSA', 'OriginalQSA', 'OriginalRIME', 'OriginalRUN', 'GaussianSA', 'OriginalSA', 'SwarmSA', 'BaseSARO', 'OriginalSARO', 'BaseSBO', 'OriginalSBO', 'BaseSCA', 'OriginalSCA', 'QleSCA', 'OriginalSCSO', 'ImprovedSFO', 'OriginalSFO', 'L_SHADE', 'OriginalSHADE', 'OriginalSHIO', 'OriginalSHO', 'ImprovedSLO', 'ModifiedSLO', 'OriginalSLO', 'BaseSMA', 'OriginalSMA', 'DevSOA', 'OriginalSOA', 'OriginalSOS', 'DevSPBO', 'OriginalSPBO', 'OriginalSRSR', 'BaseSSA', 'OriginalSSA', 'OriginalSSDO', 'OriginalSSO', 'OriginalSSpiderA', 'OriginalSSpiderO', 'OriginalSTO', 'OriginalSeaHO', 'OriginalServalOA', 'OriginalTDO', 'BaseTLO', 'ImprovedTLO', 'OriginalTLO', 'OriginalTOA', 'OriginalTPO', 'OriginalTS', 'OriginalTSA', 'OriginalTSO', 'EnhancedTWO', 'LevyTWO', 'OppoTWO', 'OriginalTWO', 'BaseVCS', 'OriginalVCS', 'OriginalWCA', 'OriginalWDO', 'OriginalWHO', 'HI_WOA', 'OriginalWOA', 'OriginalWaOA', 'OriginalWarSO', 'OriginalZOA'], 'regression_objective': {'A10': 'max', 'A20': 'max', 'A30': 'max', 'ACOD': 'max', 'APCC': 'max', 'AR': 'max', 'AR2': 'max', 'CI': 'max', 'COD': 'max', 'COR': 'max', 'COV': 'max', 'CRM': 'min', 'DRV': 'min', 
'EC': 'max', 'EVS': 'max', 'GINI': 'min', 'GINI_WIKI': 'min', 'JSD': 'min', 'KGE': 'max', 'MAAPE': 'min', 'MAE': 'min', 'MAPE': 'min', 'MASE': 'min', 'ME': 'min', 'MRB': 'min', 'MRE': 'min', 'MSE': 'min', 'MSLE': 'min', 'MedAE': 'min', 'NNSE': 'max', 'NRMSE': 'min', 'NSE': 'max', 'OI': 'max', 'PCC': 'max', 'PCD': 'max', 'R': 'max', 'R2': 'max', 'R2S': 'max', 'RAE': 'min', 'RMSE': 'min', 'RSE': 'min', 'RSQ': 'max', 'SMAPE': 'min', 'VAF': 'max', 'WI': 'max'}, 'transfer_func': ['vstf_01', 'vstf_02', 'vstf_03', 'vstf_04', 'sstf_01', 'sstf_02', 'sstf_03', 'sstf_04']}
fit(X, y=None, fit_weights=(0.9, 0.1), verbose=True, mode='single', n_workers=None, termination=None)[source]
Parameters
  • X ({array-like, sparse matrix} of shape (n_samples, n_features)) – The training input samples.

  • y (array-like of shape (n_samples,)) – The target values.

  • fit_weights (list, tuple or np.ndarray, default = (0.9, 0.1)) – The first weight is for the objective value and the second weight is for the number of features

  • verbose (bool, default = True) – Controls verbosity of output.

  • mode (str, default = 'single') –

    The mode used by the Optimizer from the Mealpy library (see the sketch after this parameter list). Parallel: 'process', 'thread'; Sequential: 'swarm', 'single'.

    • 'process': the parallel mode where multiple cores run the tasks

    • 'thread': the parallel mode where multiple threads run the tasks

    • 'swarm': the sequential mode that has no effect on the updating phase of other agents

    • 'single': the sequential mode that affects the updating phase of other agents (default)

  • n_workers (int or None, default = None) – The number of workers (cores or threads) used to run the tasks (only effective in parallel mode)

  • termination (dict or None, default = None) – The termination dictionary or an instance of the Termination class from the Mealpy library.
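
A minimal sketch of a fit() call combining these options, reusing feat_selector, X, and y from the example above; the termination key max_epoch follows the Mealpy convention and is an assumption here (see the Mealpy documentation for the valid keys):

>>> # weight the objective value more heavily than the feature count, update agents
>>> # in parallel threads, and stop after at most 100 epochs
>>> feat_selector.fit(X, y, fit_weights=(0.95, 0.05), verbose=False,
...                   mode="thread", n_workers=4,
...                   termination={"max_epoch": 100})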

fit_transform(X, y=None, fit_weights=(0.9, 0.1), verbose=True, mode='single', n_workers=None, termination=None)[source]

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters
  • X (array-like of shape (n_samples, n_features)) – Input samples.

  • y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).

  • **fit_params (dict) – Additional fit parameters.

Returns

X_new – Transformed array.

Return type

ndarray array of shape (n_samples, n_features_new)
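
A one-call sketch combining fitting and transforming (extra arguments as in fit()):

>>> X_new = feat_selector.fit_transform(X, y, fit_weights=(0.9, 0.1), mode="single")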

get_best_obj_and_fit()[source]
transform(X)[source]

Reduce X to the selected features.

Parameters

X (array of shape [n_samples, n_features]) – The input samples.

Returns

X_r – The input samples with only the selected features.

Return type

array of shape [n_samples, n_selected_features]

class mafese.wrapper.mha.MultiMhaSelector(problem='classification', estimator='knn', estimator_paras=None, list_optimizers=('BaseGA',), list_optimizer_paras=None, transfer_func='vstf_01', obj_name=None)[source]

Bases: mafese.selector.Selector
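
Defines a MultiMhaSelector class that holds multiple Metaheuristic-based Feature Selection methods, one per optimizer in list_optimizers, run over repeated trials so the optimizers can be compared. The following sketch mirrors the MhaSelector example above; the optimizer names come from the SUPPORT list below and all other values are illustrative:

>>> import pandas as pd
>>> from mafese.wrapper.mha import MultiMhaSelector
>>> dataset = pd.read_csv('your_path/dataset.csv', index_col=0).values
>>> X, y = dataset[:, 0:-1], dataset[:, -1]
>>> feat_selector = MultiMhaSelector(problem="classification", estimator="knn",
...                                  list_optimizers=("BaseGA", "OriginalPSO"))
>>> feat_selector.fit(X, y, n_trials=2, save_path="history")
>>> # keep only the features selected by BaseGA in trial 1
>>> X_filtered = feat_selector.transform(X, trial=1, model="BaseGA")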

SUPPORT = {'classification_objective': {'AS': 'max', 'BSL': 'min', 'CEL': 'min', 'CKS': 'max', 'F1S': 'max', 'F2S': 'max', 'FBS': 'max', 'GINI': 'min', 'GMS': 'max', 'HL': 'min', 'HS': 'max', 'JSI': 'max', 'KLDL': 'min', 'LS': 'max', 'MCC': 'max', 'NPV': 'max', 'PS': 'max', 'ROC-AUC': 'max', 'RS': 'max', 'SS': 'max'}, 'estimator': ['knn', 'svm', 'rf', 'adaboost', 'xgb', 'tree', 'ann'], 'optimizer': ['OriginalABC', 'OriginalACOR', 'AugmentedAEO', 'EnhancedAEO', 'ImprovedAEO', 'ModifiedAEO', 'OriginalAEO', 'MGTO', 'OriginalAGTO', 'BaseALO', 'OriginalALO', 'OriginalAO', 'OriginalAOA', 'IARO', 'LARO', 'OriginalARO', 'OriginalASO', 'OriginalAVOA', 'OriginalArchOA', 'AdaptiveBA', 'ModifiedBA', 'OriginalBA', 'BaseBBO', 'OriginalBBO', 'OriginalBBOA', 'OriginalBES', 'ABFO', 'OriginalBFO', 'OriginalBMO', 'BaseBRO', 'OriginalBRO', 'OriginalBSA', 'ImprovedBSO', 'OriginalBSO', 'CleverBookBeesA', 'OriginalBeesA', 'ProbBeesA', 'OriginalCA', 'OriginalCDO', 'OriginalCEM', 'OriginalCGO', 'BaseCHIO', 'OriginalCHIO', 'OriginalCOA', 'OCRO', 'OriginalCRO', 'OriginalCSA', 'OriginalCSO', 'OriginalCircleSA', 'OriginalCoatiOA', 'BaseDE', 'JADE', 'SADE', 'SAP_DE', 'DevDMOA', 'OriginalDMOA', 'OriginalDO', 'BaseEFO', 'OriginalEFO', 'OriginalEHO', 'AdaptiveEO', 'ModifiedEO', 'OriginalEO', 'OriginalEOA', 'LevyEP', 'OriginalEP', 'CMA_ES', 'LevyES', 'OriginalES', 'Simple_CMA_ES', 'OriginalESOA', 'OriginalEVO', 'OriginalFA', 'BaseFBIO', 'OriginalFBIO', 'OriginalFFA', 'OriginalFFO', 'OriginalFLA', 'BaseFOA', 'OriginalFOA', 'WhaleFOA', 'OriginalFOX', 'OriginalFPA', 'BaseGA', 'EliteMultiGA', 'EliteSingleGA', 'MultiGA', 'SingleGA', 'OriginalGBO', 'BaseGCO', 'OriginalGCO', 'OriginalGJO', 'OriginalGOA', 'BaseGSKA', 'OriginalGSKA', 'Matlab101GTO', 'Matlab102GTO', 'OriginalGTO', 'GWO_WOA', 'IGWO', 'OriginalGWO', 'RW_GWO', 'OriginalHBA', 'OriginalHBO', 'OriginalHC', 'SwarmHC', 'OriginalHCO', 'OriginalHGS', 'OriginalHGSO', 'OriginalHHO', 'BaseHS', 'OriginalHS', 'OriginalICA', 'OriginalINFO', 'OriginalIWO', 'BaseJA', 'LevyJA', 'OriginalJA', 'BaseLCO', 'ImprovedLCO', 'OriginalLCO', 'OriginalMA', 'BaseMFO', 'OriginalMFO', 'OriginalMGO', 'OriginalMPA', 'OriginalMRFO', 'WMQIMRFO', 'OriginalMSA', 'BaseMVO', 'OriginalMVO', 'OriginalNGO', 'ImprovedNMRA', 'OriginalNMRA', 'OriginalNRO', 'OriginalOOA', 'OriginalPFA', 'OriginalPOA', 'CL_PSO', 'C_PSO', 'HPSO_TVAC', 'OriginalPSO', 'PPSO', 'OriginalPSS', 'BaseQSA', 'ImprovedQSA', 'LevyQSA', 'OppoQSA', 'OriginalQSA', 'OriginalRIME', 'OriginalRUN', 'GaussianSA', 'OriginalSA', 'SwarmSA', 'BaseSARO', 'OriginalSARO', 'BaseSBO', 'OriginalSBO', 'BaseSCA', 'OriginalSCA', 'QleSCA', 'OriginalSCSO', 'ImprovedSFO', 'OriginalSFO', 'L_SHADE', 'OriginalSHADE', 'OriginalSHIO', 'OriginalSHO', 'ImprovedSLO', 'ModifiedSLO', 'OriginalSLO', 'BaseSMA', 'OriginalSMA', 'DevSOA', 'OriginalSOA', 'OriginalSOS', 'DevSPBO', 'OriginalSPBO', 'OriginalSRSR', 'BaseSSA', 'OriginalSSA', 'OriginalSSDO', 'OriginalSSO', 'OriginalSSpiderA', 'OriginalSSpiderO', 'OriginalSTO', 'OriginalSeaHO', 'OriginalServalOA', 'OriginalTDO', 'BaseTLO', 'ImprovedTLO', 'OriginalTLO', 'OriginalTOA', 'OriginalTPO', 'OriginalTS', 'OriginalTSA', 'OriginalTSO', 'EnhancedTWO', 'LevyTWO', 'OppoTWO', 'OriginalTWO', 'BaseVCS', 'OriginalVCS', 'OriginalWCA', 'OriginalWDO', 'OriginalWHO', 'HI_WOA', 'OriginalWOA', 'OriginalWaOA', 'OriginalWarSO', 'OriginalZOA'], 'regression_objective': {'A10': 'max', 'A20': 'max', 'A30': 'max', 'ACOD': 'max', 'APCC': 'max', 'AR': 'max', 'AR2': 'max', 'CI': 'max', 'COD': 'max', 'COR': 'max', 'COV': 'max', 'CRM': 'min', 'DRV': 'min', 
'EC': 'max', 'EVS': 'max', 'GINI': 'min', 'GINI_WIKI': 'min', 'JSD': 'min', 'KGE': 'max', 'MAAPE': 'min', 'MAE': 'min', 'MAPE': 'min', 'MASE': 'min', 'ME': 'min', 'MRB': 'min', 'MRE': 'min', 'MSE': 'min', 'MSLE': 'min', 'MedAE': 'min', 'NNSE': 'max', 'NRMSE': 'min', 'NSE': 'max', 'OI': 'max', 'PCC': 'max', 'PCD': 'max', 'R': 'max', 'R2': 'max', 'R2S': 'max', 'RAE': 'min', 'RMSE': 'min', 'RSE': 'min', 'RSQ': 'max', 'SMAPE': 'min', 'VAF': 'max', 'WI': 'max'}, 'transfer_func': ['vstf_01', 'vstf_02', 'vstf_03', 'vstf_04', 'sstf_01', 'sstf_02', 'sstf_03', 'sstf_04']}
evaluate(estimator=None, estimator_paras=None, data=None, metrics=None, save_path='history', verbose=False)[source]

Evaluate the new dataset. We re-train the estimator on the training set and return the metrics for both the training and testing sets

Parameters
  • estimator (str or Estimator instance (from scikit-learn or custom)) –

    If estimator is a str, we currently support:
    • knn: k-nearest neighbors

    • svm: support vector machine

    • rf: random forest

    • adaboost: AdaBoost

    • xgb: Gradient Boosting

    • tree: Extra Trees

    • ann: Artificial Neural Network (Multi-Layer Perceptron)

    If estimator is an Estimator instance: you need to make sure that it has fit and predict methods

  • estimator_paras (None or dict, default = None) – The parameters of the estimator; please see the official scikit-learn documentation for the selected estimator. If None, we use the default parameters for the selected estimator

  • data (Data, an instance of the Data class; it must contain training and testing sets) –

  • metrics (tuple, list, default = None) – Depends on whether you are tackling a regression or classification problem. The supported metrics can be found at: https://github.com/thieu1995/permetrics

  • save_path (str, default="history") – The path to save the file

  • verbose (bool, default=False) – Print the results to console or not.

Returns

metrics_results – The metrics for both the training and testing sets.

Return type

dict.
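
A short usage sketch, assuming data is an instance of mafese's Data class with training and testing sets already defined, and using metric names taken from the SUPPORT dictionary above:

>>> results = feat_selector.evaluate(estimator="svm", data=data,
...                                  metrics=("AS", "PS", "RS"), verbose=True)
>>> results   # dict of metrics for both the training and testing sets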

export_boxplot_figures(xlabel='Model', ylabel='Global best fitness value', title='Boxplot of comparison models', show_legend=True, show_mean_only=False, exts=('.png', '.pdf'))[source]
export_convergence_figures(xlabel='Epoch', ylabel='Fitness value', title='Convergence chart of comparison models', exts=('.png', '.pdf'))[source]
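
A short usage sketch for the two export methods above, assuming fit() has already been called so that trial histories exist; the arguments shown are either defaults from the signatures above or illustrative overrides:

>>> feat_selector.export_convergence_figures(xlabel="Epoch", ylabel="Fitness value")
>>> feat_selector.export_boxplot_figures(show_mean_only=True, exts=(".png",))
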
fit(X, y=None, n_trials=2, n_jobs=2, save_path='history', save_results=True, verbose=True, fit_weights=(0.9, 0.1), mode='single', n_workers=None, termination=None)[source]
Parameters
  • X ({array-like, sparse matrix} of shape (n_samples, n_features)) – The training input samples.

  • y (array-like of shape (n_samples,)) – The target values.

  • n_trials (int, default = 2) – Number of repetitions.

  • n_jobs (int or None, default = 2) – Number of processes used to speed up the computation (<=1 or None: sequential, >=2: parallel).

  • save_path (str, default = "history") – The path to the folder that holds the results.

  • save_results (bool, default = True) – Save the global best fitness and the loss (convergence/fitness) over generations to a csv file.

  • fit_weights (list, tuple or np.ndarray, default = (0.9, 0.1)) – The first weight is for the objective value and the second weight is for the number of features

  • verbose (bool, default = True) – Controls verbosity of output.

  • mode (str, default = 'single') –

    The mode used by the Optimizer from the Mealpy library. Parallel: 'process', 'thread'; Sequential: 'swarm', 'single'.

    • 'process': the parallel mode where multiple cores run the tasks

    • 'thread': the parallel mode where multiple threads run the tasks

    • 'swarm': the sequential mode that has no effect on the updating phase of other agents

    • 'single': the sequential mode that affects the updating phase of other agents (default)

  • n_workers (int or None, default = None) – The number of workers (cores or threads) used by the Optimizer (only effective in parallel mode)

  • termination (dict or None, default = None) – The termination dictionary or an instance of the Termination class from the Mealpy library.

fit_transform(X, y=None, n_trials=2, n_jobs=2, save_path='history', save_results=True, verbose=True, fit_weights=(0.9, 0.1), mode='single', n_workers=None, termination=None)[source]

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters
  • X (array-like of shape (n_samples, n_features)) – Input samples.

  • y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).

  • **fit_params (dict) – Additional fit parameters.

Returns

X_new – Transformed array.

Return type

ndarray array of shape (n_samples, n_features_new)

transform(X, trial=1, model='BaseGA', all_models=False)[source]

Reduce X to the selected features.

Parameters

X (array of shape [n_samples, n_features]) – The input samples.

Returns

X_r – The input samples with only the selected features.

Return type

array of shape [n_samples, n_selected_features]