Quickstart

In this quickstart, you’ll build a complete ML pipeline that:

  1. Loads the Titanic dataset
  2. Splits into train/test sets
  3. Trains a Random Forest classifier
  4. Evaluates accuracy and generates explanations

Time to complete: ~5 minutes

Before starting, make sure you have:

  • MLOps Desktop installed
  • Python 3.9+ with packages: pip install scikit-learn pandas shap

  1. Open MLOps Desktop

    Launch the app. You’ll see an empty canvas with a toolbar at the top and a node palette on the left.

  2. Add a DataLoader node

    From the Components panel on the left, drag DataLoader onto the canvas.

    Click the node to select it. In the node, click Browse and select a CSV file. For this tutorial, use any classification dataset (or see “Sample Dataset” below).
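
    If you want to sanity-check the file before loading it, you can open it with pandas (already installed in the prerequisites); the filename below is just an example:

    import pandas as pd

    # Quick check that the CSV has the rows and columns you expect.
    df = pd.read_csv("titanic.csv")   # example filename
    print(df.shape)                   # (rows, columns)
    print(df.columns.tolist())        # confirm the target column is present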

  3. Add a DataSplit node

    Drag DataSplit from the Components panel onto the canvas.

    Connect the DataLoader’s right handle to the DataSplit’s left handle.

    Configure DataSplit:

    • Test Split: 20% (default)
    • Random State: 42
    • Stratify: Enable, set column to Survived (or your target)
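
    This configuration corresponds roughly to a stratified scikit-learn split. A minimal sketch (the node's internals may differ; the Iris frame from the "Sample Dataset" section is used so it runs as-is):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split

    # Rough equivalent of the DataSplit settings above (sketch, not the node's actual code).
    df = load_iris(as_frame=True).frame
    train_df, test_df = train_test_split(
        df,
        test_size=0.20,           # Test Split: 20%
        random_state=42,          # Random State: 42
        stratify=df["target"],    # Stratify column ("Survived" for Titanic)
    )
    print(len(train_df), len(test_df))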
  4. Add a Trainer node

    Drag Trainer onto the canvas and connect from DataSplit.

    Configure Trainer:

    • Mode: Train (default)
    • Model Type: Random Forest Classifier
    • Target Column: Survived
  5. Add an Evaluator node

    Drag Evaluator onto the canvas and connect from Trainer.

    No configuration needed—it auto-detects the model type.
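
    Together, the Trainer and Evaluator steps amount to the standard scikit-learn fit/predict/score loop. A minimal self-contained sketch, again using Iris (the app's exact hyperparameters and metric averaging are assumptions):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score, confusion_matrix, f1_score
    from sklearn.model_selection import train_test_split

    # Split, train a Random Forest, and score it on the held-out test set.
    df = load_iris(as_frame=True).frame
    X_train, X_test, y_train, y_test = train_test_split(
        df.drop(columns=["target"]), df["target"],
        test_size=0.20, random_state=42, stratify=df["target"],
    )
    model = RandomForestClassifier(random_state=42).fit(X_train, y_train)
    y_pred = model.predict(X_test)

    print("Accuracy:", accuracy_score(y_test, y_pred))
    print("F1:", f1_score(y_test, y_pred, average="weighted"))  # averaging choice is an assumption
    print(confusion_matrix(y_test, y_pred))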

  6. Run the pipeline

    Click the Run button in the toolbar.

    Watch the Logs tab as each node executes:

    [DataLoader] Loaded titanic.csv: 891 rows, 12 columns
    [DataSplit] Split: 712 train, 179 test (stratified by Survived)
    [Trainer] Training RandomForestClassifier...
    [Trainer] Training complete
    [Evaluator] Accuracy: 0.821, F1: 0.756
  7. View results

    Click the Metrics tab to see:

    • Bar chart with Accuracy, Precision, Recall, F1
    • Confusion matrix heatmap

    Click Explain to generate:

    • Feature importance chart
    • SHAP beeswarm plot
    • Partial dependence plots

Sample Dataset

If you don’t have a CSV file, create the Titanic dataset:

import pandas as pd
from sklearn.datasets import fetch_openml

# Download the Titanic dataset from OpenML (requires an internet connection).
# Note: column names in this OpenML version are lowercase (e.g. "survived");
# if the Trainer reports "Target column not found", match the case exactly.
titanic = fetch_openml("titanic", version=1, as_frame=True)
df = titanic.frame
df.to_csv("titanic.csv", index=False)
print(f"Saved titanic.csv with {len(df)} rows")

Or use the Iris dataset for a simpler example:

import pandas as pd
from sklearn.datasets import load_iris

# Iris ships with scikit-learn, so this works offline.
iris = load_iris(as_frame=True)
df = iris.frame
df.to_csv("iris.csv", index=False)
print("Saved iris.csv")

For Iris, set Target Column to target in the Trainer.

Understanding the metrics:

Metric      What It Means
Accuracy    Overall correctness (correct / total)
Precision   When we predict positive, how often are we right?
Recall      Of all actual positives, how many did we find?
F1 Score    Balance of precision and recall
An example confusion matrix:

                    Predicted
                    Died    Survived
Actual  Died          98          12
        Survived      21          48
  • Diagonal = correct predictions
  • Off-diagonal = errors
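
To make the definitions concrete, here is how each metric falls out of a 2x2 matrix like the illustrative one above (these counts are examples, not the output of your run):

# Counts from the illustrative confusion matrix (rows = actual, columns = predicted).
tn, fp = 98, 12   # actual Died:     predicted Died / predicted Survived
fn, tp = 21, 48   # actual Survived: predicted Died / predicted Survived

accuracy  = (tp + tn) / (tp + tn + fp + fn)               # correct / total
precision = tp / (tp + fp)                                # of predicted Survived, how many were right
recall    = tp / (tp + fn)                                # of actual Survived, how many we found
f1        = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")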

Click Explain to see why your model makes predictions:

  • Feature Importance — Which features matter most
  • SHAP Beeswarm — How each feature pushes predictions up/down
  • Partial Dependence — How changing a feature affects predictions
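
The same plots can be produced directly with the libraries from the prerequisites. A minimal sketch with shap and scikit-learn, assuming a fitted tree model (model) and its test features (X_test) as a DataFrame:

import shap
from sklearn.inspection import PartialDependenceDisplay

# Feature importance: built into tree-based models.
importances = sorted(zip(X_test.columns, model.feature_importances_),
                     key=lambda kv: kv[1], reverse=True)
print(importances)

# SHAP beeswarm: per-row contribution of each feature to the prediction.
explainer = shap.TreeExplainer(model)
shap_values = explainer(X_test)
# Classifiers may return one slice per class depending on your shap version;
# if beeswarm complains about the shape, try shap_values[:, :, 1].
shap.plots.beeswarm(shap_values)

# Partial dependence: effect of varying one feature while averaging over the rest.
# (For multiclass models, also pass target=<class>.)
PartialDependenceDisplay.from_estimator(model, X_test, features=[0, 1])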

Explore the other tabs at the bottom:

Tab            Purpose
Logs           Execution output and errors
Data Profile   Dataset statistics
Metrics        Model performance charts
Runs           History of all pipeline runs
Models         Registered models with versioning
Trials         Hyperparameter tuning results
Serving        HTTP model server

To save your pipeline, click Save in the toolbar:

  1. Enter a name (e.g., “titanic-classifier”)
  2. Click Save

Your pipeline is stored locally and appears in the Load dropdown.


Troubleshooting:

  • “No Python found” — See Python Setup
  • “Module not found: sklearn” — Run pip install scikit-learn
  • “Target column not found” — Check column name matches exactly (case-sensitive)
  • Pipeline stuck — Check the Logs tab for error messages