Algorithm Selection

We currently support regression and classification algorithms from scikit-learn, XGBoost, and LightGBM.

Algorithms

Gradient Boosting

| Algorithm | Regression | Classification |
|---|---|---|
| xgboost | XGBRegressor | XGBClassifier |
| xgboost_random_forest | XGBRFRegressor | XGBRFClassifier |
| lightgbm | LGBMRegressor | LGBMClassifier |

Scikit Ensembles

| Algorithm | Regression | Classification |
|---|---|---|
| ada_boost | AdaBoostRegressor | AdaBoostClassifier |
| bagging | BaggingRegressor | BaggingClassifier |
| extra_trees | ExtraTreesRegressor | ExtraTreesClassifier |
| gradient_boosting_trees | GradientBoostingRegressor | GradientBoostingClassifier |
| random_forest | RandomForestRegressor | RandomForestClassifier |
| hist_gradient_boosting | HistGradientBoostingRegressor | HistGradientBoostingClassifier |

Support Vector Machines

| Algorithm | Regression | Classification |
|---|---|---|
| svm | SVR | SVC |
| nu_svm | NuSVR | NuSVC |
| linear_svm | LinearSVR | LinearSVC |

Linear Models

| Algorithm | Regression | Classification |
|---|---|---|
| linear | LinearRegression | LogisticRegression |
| ridge | Ridge | RidgeClassifier |
| lasso | Lasso | - |
| elastic_net | ElasticNet | - |
| least_angle | LARS | - |
| lasso_least_angle | LassoLars | - |
| orthogonal_matching_pursuit | OrthogonalMatchingPursuit | - |
| bayesian_ridge | BayesianRidge | - |
| automatic_relevance_determination | ARDRegression | - |
| stochastic_gradient_descent | SGDRegressor | SGDClassifier |
| perceptron | - | Perceptron |
| passive_aggressive | PassiveAggressiveRegressor | PassiveAggressiveClassifier |
| ransac | RANSACRegressor | - |
| theil_sen | TheilSenRegressor | - |
| huber | HuberRegressor | - |
| quantile | QuantileRegressor | - |

Other

| Algorithm | Regression | Classification |
|---|---|---|
| kernel_ridge | KernelRidge | - |
| gaussian_process | GaussianProcessRegressor | GaussianProcessClassifier |

Comparing Algorithms

Any of the algorithms above can be passed to the pgml.train() function via the algorithm parameter. If the parameter is omitted, the linear algorithm is used by default.

Example

SELECT * FROM pgml.train(
    'My First PostgresML Project',
    task => 'classification',
    relation_name => 'pgml.digits',
    y_column_name => 'target',
    algorithm => 'xgboost'
);

The hyperparams argument passes hyperparameters through to the underlying algorithm. See each algorithm's documentation for its valid hyperparameters; our interface uses scikit-learn notation for all parameters.

Example

SELECT * FROM pgml.train(
    'My First PostgresML Project',
    algorithm => 'xgboost',
    hyperparams => '{
        "n_estimators": 25
    }'
);

Once prepared, the training data can be efficiently reused by other PostgresML algorithms for training and predictions. Every time the pgml.train() function receives both the relation_name and y_column_name arguments, it creates a new snapshot of the relation (table) and saves it in the pgml schema.

To train another algorithm on the same dataset, omit the two arguments. PostgresML will reuse the latest snapshot with the new algorithm.

Tip

Try experimenting with multiple algorithms to explore their performance characteristics on your dataset. It's often hard to know in advance which algorithm will perform best.
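
A minimal sketch of this workflow (project name and algorithm choices are illustrative): the first call snapshots the relation, and subsequent calls omit relation_name and y_column_name to reuse that snapshot with a different algorithm, so every model trains on identical data.

```sql
-- First call: snapshots pgml.digits and trains the default linear algorithm.
SELECT * FROM pgml.train(
    'My First PostgresML Project',
    task => 'classification',
    relation_name => 'pgml.digits',
    y_column_name => 'target'
);

-- Subsequent calls: reuse the latest snapshot with other algorithms.
SELECT * FROM pgml.train('My First PostgresML Project', algorithm => 'random_forest');
SELECT * FROM pgml.train('My First PostgresML Project', algorithm => 'xgboost');
```

Each call records its metrics, so the resulting models can be compared side by side on a fair benchmark.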

Dashboard

The PostgresML dashboard makes it easy to compare various algorithms on your dataset. You can explore individual metrics & compare algorithms to each other, all trained on the same dataset for a fair benchmark.
