Trainer Node

The Trainer node is the core of model training in MLOps Desktop. It supports three modes: training new models, loading pre-trained models, and hyperparameter tuning with Optuna.

| Property | Value |
| --- | --- |
| Type | Processing node |
| Inputs | DataFrame (from DataLoader or DataSplit) |
| Outputs | Trained model |
| Library | scikit-learn |
| Modes | Train, Load, Tune |

The Trainer node has three modes, selected via toggle buttons:

Train mode: train a new model from scratch.

Configuration:

  • Model Type — Select from 12 algorithms
  • Target Column — Column to predict
  • Test Split — Ratio for train/test split (if not using DataSplit node)

Best for initial model development and quick experiments.
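As a rough illustration of what Train mode does with these three settings, the sketch below fits a scikit-learn model on a DataFrame. The function name `train_model`, the Iris dataset, and the stratified split are assumptions made for the example, not the node's actual implementation.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def train_model(df, target_column, test_split=0.2, seed=42):
    """Hypothetical helper: split off the target, hold out a test set, fit a model."""
    X = df.drop(columns=[target_column])
    y = df[target_column]
    # Test Split: fraction of rows held out for evaluation.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_split, random_state=seed, stratify=y
    )
    # Model Type: Random Forest Classifier, chosen only for illustration.
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    return model, model.score(X_test, y_test)


# Example data; in the app the DataFrame would come from DataLoader or DataSplit.
iris = load_iris(as_frame=True).frame
model, accuracy = train_model(iris, target_column="target", test_split=0.2)
print(f"held-out accuracy: {accuracy:.3f}")
```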

Classification models:

| Model | Description | Key Hyperparameters |
| --- | --- | --- |
| Logistic Regression | Linear classifier, interpretable | C, max_iter |
| Random Forest Classifier | Ensemble of decision trees | n_estimators, max_depth, min_samples_split |
| Gradient Boosting Classifier | Sequential boosting, high accuracy | n_estimators, learning_rate, max_depth |
| SVM (SVC) | Support vector machine | C, kernel, gamma |
| KNN Classifier | Distance-based classification | n_neighbors, weights, metric |
| MLP Classifier | Neural network | hidden_layer_sizes, alpha, learning_rate_init |

Regression models:

| Model | Description | Key Hyperparameters |
| --- | --- | --- |
| Linear Regression | Simple linear model | None (no tuning) |
| Random Forest Regressor | Ensemble for regression | n_estimators, max_depth, min_samples_split |
| Gradient Boosting Regressor | Boosted trees for regression | n_estimators, learning_rate, max_depth |
| SVM (SVR) | Support vector regression | C, kernel, gamma |
| KNN Regressor | Distance-based regression | n_neighbors, weights, metric |
| MLP Regressor | Neural network for regression | hidden_layer_sizes, alpha, learning_rate_init |
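The twelve model names correspond to standard scikit-learn estimators. A lookup along the following lines is one plausible way to turn a Model Type selection into an estimator. The registry dictionaries and `build_model` are hypothetical names for this sketch; only the scikit-learn classes themselves are real.

```python
from sklearn.ensemble import (
    GradientBoostingClassifier,
    GradientBoostingRegressor,
    RandomForestClassifier,
    RandomForestRegressor,
)
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor
from sklearn.neural_network import MLPClassifier, MLPRegressor
from sklearn.svm import SVC, SVR

# Hypothetical registries keyed by the display names used on this page.
CLASSIFIERS = {
    "Logistic Regression": LogisticRegression,
    "Random Forest Classifier": RandomForestClassifier,
    "Gradient Boosting Classifier": GradientBoostingClassifier,
    "SVM (SVC)": SVC,
    "KNN Classifier": KNeighborsClassifier,
    "MLP Classifier": MLPClassifier,
}
REGRESSORS = {
    "Linear Regression": LinearRegression,
    "Random Forest Regressor": RandomForestRegressor,
    "Gradient Boosting Regressor": GradientBoostingRegressor,
    "SVM (SVR)": SVR,
    "KNN Regressor": KNeighborsRegressor,
    "MLP Regressor": MLPRegressor,
}


def build_model(name, **hyperparams):
    """Instantiate the estimator for a display name with the given hyperparameters."""
    registry = {**CLASSIFIERS, **REGRESSORS}
    return registry[name](**hyperparams)


model = build_model("Random Forest Classifier", n_estimators=200, max_depth=10)
print(model)
```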

When Tune mode is selected, click the tuning config button to open the TuningPanel.

| Strategy | Description | Best For |
| --- | --- | --- |
| Bayesian (TPE) | Tree-structured Parzen Estimator, learns from past trials | Most cases (default) |
| Random | Uniform random sampling | Baseline comparison |
| Grid | Exhaustive enumeration of all combinations | Small, discrete spaces |

| Setting | Range | Default | Description |
| --- | --- | --- | --- |
| Number of Trials | 1-1000 | 50 | How many configurations to try |
| CV Folds | 2-10 | 3 | Cross-validation folds |
| Scoring Metric | varies | accuracy/r2 | Metric to optimize |

Each model has predefined search ranges:

Random Forest:

n_estimators: 50-300 (step 50)
max_depth: [null, 10, 15, 20, 30]
min_samples_split: 2-10 (step 2)
min_samples_leaf: 1-4 (step 1)

Gradient Boosting:

n_estimators: 50-300 (step 50)
learning_rate: 0.01-0.3 (log scale)
max_depth: 3-8 (step 1)
subsample: 0.7-1.0 (uniform)

SVM (SVC/SVR):

C: 0.1-100 (log scale)
kernel: [rbf, linear, poly]
gamma: [scale, auto]

KNN:

n_neighbors: 3-21 (step 2)
weights: [uniform, distance]
metric: [euclidean, manhattan, minkowski]

MLP Neural Network:

hidden_layer_sizes: [(50,), (100,), (100,50), (100,100)]
alpha: 0.0001-0.1 (log scale)
learning_rate_init: 0.0001-0.1 (log scale)
max_iter: 200-1000 (step 100)
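To make the range notation concrete, the sketch below draws one random configuration from the Gradient Boosting ranges using only the standard library. It illustrates what "step" and "log scale" mean and is not how the app samples internally. The helper names are made up for the example.

```python
import math
import random


def log_uniform(rng, low, high):
    """Sample uniformly in log space, so equal ratios are equally likely."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def stepped_int(rng, low, high, step):
    """Sample from low, low + step, ... up to and including high."""
    return rng.randrange(low, high + 1, step)


def sample_gradient_boosting(rng):
    return {
        "n_estimators": stepped_int(rng, 50, 300, 50),  # 50, 100, ..., 300
        "learning_rate": log_uniform(rng, 0.01, 0.3),   # log scale
        "max_depth": stepped_int(rng, 3, 8, 1),         # 3, 4, ..., 8
        "subsample": rng.uniform(0.7, 1.0),             # plain uniform
    }


rng = random.Random(0)
params = sample_gradient_boosting(rng)
print(params)
```

On a log scale, a learning rate between 0.01 and 0.03 is as likely as one between 0.1 and 0.3, which suits parameters whose effect depends on order of magnitude.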

Classification:

  • Accuracy
  • F1 Score
  • Precision
  • Recall
  • ROC AUC

Regression:

  • R² Score
  • Neg MSE
  • Neg MAE
  • Neg RMSE
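The regression error metrics are negated so that, as with accuracy or R², a higher score always means a better model. The sketch below shows this with plain Python. The mapping to scikit-learn scorer strings is an assumption about how the metric names might be passed along, although the strings themselves are valid scikit-learn scorer names.

```python
import math

# Assumed mapping from display names to scikit-learn scorer strings.
SCORER_NAMES = {
    "Accuracy": "accuracy",
    "F1 Score": "f1",
    "Precision": "precision",
    "Recall": "recall",
    "ROC AUC": "roc_auc",
    "R² Score": "r2",
    "Neg MSE": "neg_mean_squared_error",
    "Neg MAE": "neg_mean_absolute_error",
    "Neg RMSE": "neg_root_mean_squared_error",
}


def neg_mse(y_true, y_pred):
    """Negated mean squared error: 0 is perfect, more negative is worse."""
    return -sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def neg_mae(y_true, y_pred):
    """Negated mean absolute error."""
    return -sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def neg_rmse(y_true, y_pred):
    """Negated root mean squared error."""
    return -math.sqrt(-neg_mse(y_true, y_pred))


y_true = [3.0, 5.0, 7.0]
rough = [2.0, 5.0, 9.0]   # errors of 1, 0, and 2
close = [3.0, 5.0, 8.0]   # errors of 0, 0, and 1

# The closer predictions get the higher (less negative) score.
print(neg_mse(y_true, rough), neg_mse(y_true, close))
```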

The Trainer automatically handles:

  • Missing Values — Numeric columns filled with median, categorical with mode
  • Categorical Encoding — Label encoding for all categorical columns
  • ID Column Filtering — Drops columns like id, index, name, ticket, cabin
  • High-Cardinality Columns — Drops columns with >50 unique values
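The four preprocessing steps can be approximated with pandas as shown below. This is a sketch under several assumptions that the page does not confirm: ID-like columns are matched by exact, case-insensitive name, the 50-unique-value limit applies only to non-numeric columns, and the target column is left untouched. The function name `preprocess` is hypothetical.

```python
import pandas as pd

ID_LIKE = {"id", "index", "name", "ticket", "cabin"}  # names listed on this page
MAX_UNIQUE = 50


def preprocess(df, target_column):
    """Approximate the Trainer's automatic cleanup on a copy of df."""
    out = df.copy()
    features = [c for c in out.columns if c != target_column]

    # ID column filtering (assumed: exact, case-insensitive name match).
    to_drop = [c for c in features if c.lower() in ID_LIKE]

    # High-cardinality filtering (assumed: non-numeric columns only).
    for c in features:
        is_numeric = pd.api.types.is_numeric_dtype(out[c])
        if c not in to_drop and not is_numeric and out[c].nunique() > MAX_UNIQUE:
            to_drop.append(c)
    out = out.drop(columns=to_drop)

    for c in out.columns:
        if c == target_column:
            continue
        if pd.api.types.is_numeric_dtype(out[c]):
            # Missing numeric values: fill with the column median.
            out[c] = out[c].fillna(out[c].median())
        else:
            # Missing categorical values: fill with the mode, then label-encode.
            out[c] = out[c].fillna(out[c].mode().iloc[0])
            out[c] = out[c].astype("category").cat.codes
    return out


raw = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "Age": [22.0, None, 30.0, 40.0],
        "Embarked": ["S", "C", None, "S"],
        "Survived": [0, 1, 1, 0],
    }
)
clean = preprocess(raw, target_column="Survived")
print(clean)
```

Label encoding assigns integer codes in sorted category order, which implies an ordering that may not exist in the data. Tree-based models usually tolerate this better than linear or distance-based ones.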

| Direction | Node Types |
| --- | --- |
| Input from | DataLoader, DataSplit |
| Output to | Evaluator, ModelExporter |

Typical pipeline:

DataLoader → DataSplit → Trainer → Evaluator

When tuning, results appear in the Trials tab:

| Column | Description |
| --- | --- |
| Trial # | Trial number |
| Score | Cross-validation score |
| Parameters | Hyperparameter values used |
| Duration | Time taken |
| Status | Complete, Pruned, or Failed |

The best trial is highlighted with a star icon.
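One way to picture the Trials tab is as a list of records from which the best completed trial is picked. The `TrialRow` class and `best_trial` function below are hypothetical and only mirror the columns described above. The sketch assumes that a higher score is always better, which holds for the negated error metrics as well.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrialRow:
    number: int
    score: Optional[float]  # None when the trial produced no final score
    params: dict
    duration_s: float
    status: str             # "Complete", "Pruned", or "Failed"


def best_trial(rows):
    """Return the completed trial with the highest score, or None if there is none."""
    completed = [r for r in rows if r.status == "Complete" and r.score is not None]
    return max(completed, key=lambda r: r.score) if completed else None


rows = [
    TrialRow(1, 0.91, {"n_neighbors": 5}, 0.8, "Complete"),
    TrialRow(2, None, {"n_neighbors": 21}, 0.2, "Pruned"),
    TrialRow(3, 0.94, {"n_neighbors": 9}, 0.9, "Complete"),
    TrialRow(4, None, {"n_neighbors": 3}, 0.1, "Failed"),
]
best = best_trial(rows)
print(best.number, best.score)
```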

Linear Regression has no tunable hyperparameters. Use Train mode instead, or select a different model.

Install Optuna:

    pip install optuna

If tuning runs are slow:

  • Reduce the number of trials
  • Use Random search instead of Grid
  • Reduce CV folds (minimum 2)
  • Use a faster model (for example, Logistic Regression instead of MLP)
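The first and third tips matter because tuning cost grows with the product of trials and folds. The arithmetic below is a rough upper bound, assuming that every trial runs a full cross-validation and none are pruned early. `total_fits` is a hypothetical helper.

```python
def total_fits(n_trials, cv_folds):
    """Upper bound on model fits: one fit per fold for every trial."""
    return n_trials * cv_folds


default_cost = total_fits(n_trials=50, cv_folds=3)  # the documented defaults
reduced_cost = total_fits(n_trials=20, cv_folds=2)
print(default_cost, reduced_cost)
```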

Check that the column name matches exactly (case-sensitive). Use the DataLoader preview to verify column names.
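A small check like the following can help spot a mismatched target column name before training. It is a standalone sketch using the standard library, not a feature of the app. `find_column` is a hypothetical helper that reports whether the name matches exactly and, if not, suggests the closest existing column.

```python
import difflib


def find_column(columns, target):
    """Return (found, suggestion) for a case-sensitive column lookup."""
    if target in columns:
        return True, None
    # Same name with different casing is the most common mistake.
    by_lower = {c.lower(): c for c in columns}
    if target.lower() in by_lower:
        return False, by_lower[target.lower()]
    # Otherwise fall back to the closest spelling, if any is similar enough.
    close = difflib.get_close_matches(target, columns, n=1)
    return False, close[0] if close else None


columns = ["PassengerId", "Survived", "Age"]
print(find_column(columns, "Survived"))
print(find_column(columns, "survived"))
print(find_column(columns, "Survivd"))
```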