
Exporter Node

The Exporter node saves trained models to disk in various formats for deployment or later use.

Property    Value
Type        Terminal node
Inputs      Trained model (from Trainer)
Outputs     Model file(s) on disk
Formats     joblib, pickle, ONNX (coming soon)

The output path setting controls where the model file is written, for example:

~/Desktop/my_model.joblib
/path/to/models/classifier_v1.pkl

The joblib format is the best choice for Python applications:

  • Efficient for NumPy arrays
  • Smaller file sizes for large models
  • Fast loading

Load the exported model back with:

import joblib

model = joblib.load("model.joblib")
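
If you exported in pickle format instead, the standard library loads the .pkl file (a minimal sketch, using the example path above):

import pickle

with open("classifier_v1.pkl", "rb") as f:
    model = pickle.load(f)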

When metadata is enabled, the Exporter saves a companion .json file with:

{
  "model_type": "RandomForestClassifier",
  "sklearn_version": "1.3.0",
  "feature_names": ["sepal_length", "sepal_width", "petal_length", "petal_width"],
  "target_names": ["setosa", "versicolor", "virginica"],
  "n_features": 4,
  "n_classes": 3,
  "training_date": "2024-01-15T10:30:00Z",
  "metrics": {
    "accuracy": 0.967,
    "f1_weighted": 0.965
  },
  "hyperparameters": {
    "n_estimators": 100,
    "max_depth": 10
  }
}

When you export model.joblib with metadata enabled, two files are written:

File               Contents
model.joblib       The trained model
model_meta.json    Metadata (features, metrics, etc.)
To load the model and its metadata in Python:

import json

import joblib
import numpy as np

# Load model
model = joblib.load("model.joblib")

# Load metadata
with open("model_meta.json") as f:
    meta = json.load(f)

# Validate input
expected_features = meta["feature_names"]
print(f"Model expects {len(expected_features)} features: {expected_features}")

# Make predictions
X_new = np.array([[5.1, 3.5, 1.4, 0.2]])
prediction = model.predict(X_new)
print(f"Predicted: {meta['target_names'][prediction[0]]}")
To serve predictions over HTTP, a minimal Flask app looks like this:

from flask import Flask, request, jsonify
import joblib
import numpy as np

app = Flask(__name__)
model = joblib.load("model.joblib")

@app.route("/predict", methods=["POST"])
def predict():
    data = request.json
    X = np.array(data["features"])
    predictions = model.predict(X)
    return jsonify({"predictions": predictions.tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
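
With the server running, you can exercise the endpoint from Python (this assumes the third-party requests package is installed):

import requests

# Send one row of features matching the model's expected input
resp = requests.post(
    "http://localhost:5000/predict",
    json={"features": [[5.1, 3.5, 1.4, 0.2]]},
)
print(resp.json())  # {"predictions": [...]}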

Best practices for managing model versions:

{model_name}_v{version}_{algorithm}_{date}.joblib

Examples:

  • churn_model_v1_rf_2024-01-15.joblib
  • churn_model_v2_gb_tuned_2024-01-20.joblib
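
At export time you can assemble such a filename programmatically (an illustrative sketch; the name parts are placeholders):

from datetime import date

model_name, version, algorithm = "churn_model", 2, "gb_tuned"
filename = f"{model_name}_v{version}_{algorithm}_{date.today().isoformat()}.joblib"
# e.g. churn_model_v2_gb_tuned_2024-01-20.joblib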
A versioned directory layout, with a symlink pointing production at the current model:

models/
├── production/
│   └── churn_model_current.joblib -> ../v2/churn_model.joblib
├── v1/
│   ├── churn_model.joblib
│   └── churn_model_meta.json
└── v2/
    ├── churn_model.joblib
    └── churn_model_meta.json
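
When promoting a new version, update the production symlink in place, for example:

ln -sfn ../v2/churn_model.joblib models/production/churn_model_current.joblib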

For models larger than 100 MB, use Git LFS:

git lfs install
git lfs track "*.joblib"
git add .gitattributes
git add models/
git commit -m "Add trained model"

Models saved with one scikit-learn version may not load with another.

Solution: Check the sklearn_version in metadata and install the matching version:

pip install scikit-learn==1.3.0
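
You can read the required version straight from the metadata file (a small sketch, assuming metadata was exported alongside the model):

import json

with open("model_meta.json") as f:
    required = json.load(f)["sklearn_version"]
print(f"pip install scikit-learn=={required}")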

Model files can be large, especially for Random Forest models with many trees.

Solutions:

  • Reduce n_estimators
  • Use compression (see the snippet below)
  • Export to ONNX for smaller files (coming soon)
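
Compressed saving is a one-line change (compress accepts levels 0-9; higher levels produce smaller files but save and load more slowly):

import joblib

# compress=3 is a reasonable middle ground between file size and speed
joblib.dump(model, "model.joblib", compress=3)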

The loading environment must have the same packages as the training environment.

Solution: Include a requirements.txt:

scikit-learn==1.3.0
pandas==2.0.0
numpy==1.24.0
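
You can capture the training environment's exact versions with pip freeze, then trim the output to the packages you actually need:

pip freeze > requirements.txt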

When loading model files, follow these security practices:

  • Only load models you created or trust
  • Verify file integrity with checksums (see the sketch below)
  • Use ONNX for sharing models publicly (safer format)
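
A checksum can be computed with the standard library and compared against the value published with the model (a minimal sketch):

import hashlib

with open("model.joblib", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()
print(digest)  # compare against the expected checksum before loading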
For reference, the equivalent export code in plain Python (model, X, accuracy, and f1 come from your training run):

import json
from datetime import datetime

import joblib

model_path = "/path/to/model.joblib"

# Save model
joblib.dump(model, model_path)

# Save metadata alongside it
metadata = {
    "model_type": type(model).__name__,
    "feature_names": list(X.columns),
    "training_date": datetime.now().isoformat(),
    "metrics": {
        "accuracy": accuracy,
        "f1_score": f1
    }
}
with open("/path/to/model_meta.json", "w") as f:
    json.dump(metadata, f, indent=2)

print(f"Model saved to {model_path}")