MLOps: Deploying and Maintaining AI Models in Production
Successfully trained an AI model to 95% accuracy? Congratulations! But here's the thing: training the model is only about 20% of the work. The rest? Deploying, monitoring, and maintaining the model in production. That is what MLOps (Machine Learning Operations) is all about!
What Is MLOps?
MLOps is the practice of combining Machine Learning, DevOps, and Data Engineering to automate and streamline the deployment and maintenance of AI models in production.
Comparison: ML Research vs MLOps
| Aspek | ML Research | MLOps/Production |
|---|---|---|
| Code | Jupyter notebook | Modular, tested, versioned |
| Data | Static dataset | Streaming, real-time |
| Model | Single trained model | Versioned, A/B tested |
| Deployment | Manual/saved file | Automated pipeline |
| Monitoring | Validation metrics | Real-time performance |
| Updates | Manual retraining | Automated retraining |
ML Lifecycle: From Experiment to Production
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│  Data Prep   │────▶│   Training   │────▶│   Evaluate   │
└──────────────┘     └──────────────┘     └──────────────┘
                                                  │
                                                  ▼
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│   Monitor    │◀────│    Deploy    │◀────│     Test     │
│  & Retrain   │     │              │     │              │
└──────────────┘     └──────────────┘     └──────────────┘
Step 1: Model Packaging
Save Model
import joblib
import torch
# Scikit-learn
joblib.dump(model, 'model.pkl')
# TensorFlow/Keras
model.save('my_model.h5')
# PyTorch
torch.save(model.state_dict(), 'model.pth')
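Loading a saved model back is the mirror operation. A minimal sketch using only the standard library's pickle (a stand-in for joblib on scikit-learn models; the ToyModel class here is hypothetical):

```python
import pickle

# Hypothetical stand-in for a trained model object
class ToyModel:
    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, xs):
        return [1 if x >= self.threshold else 0 for x in xs]

# Serialize to disk, then load it back, just like joblib.dump/load
model = ToyModel(threshold=0.5)
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)

with open('model.pkl', 'rb') as f:
    restored = pickle.load(f)

print(restored.predict([0.2, 0.9]))  # [0, 1]
```

The serving layer should only ever load artifacts; it never retrains.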
Model Registry
Store models with versioning:
MLflow:
import mlflow
import mlflow.sklearn
mlflow.set_experiment("sentiment-analysis")
with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_metric("accuracy", 0.95)
    mlflow.sklearn.log_model(model, "model")
Step 2: Model Deployment
Option 1: REST API (Flask/FastAPI)
FastAPI (Recommended):
from fastapi import FastAPI
from pydantic import BaseModel
import joblib
app = FastAPI()
model = joblib.load('model.pkl')
class PredictionRequest(BaseModel):
    text: str

@app.post("/predict")
def predict(request: PredictionRequest):
    prediction = model.predict([request.text])
    return {"prediction": prediction[0]}
# Run: uvicorn main:app --host 0.0.0.0 --port 8000
Testing:
curl -X POST "http://localhost:8000/predict" \
-H "Content-Type: application/json" \
-d '{"text": "This movie is amazing!"}'
Option 2: Docker Container
Dockerfile:
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY model.pkl .
COPY app.py .
EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
Build & Run:
docker build -t my-ml-model .
docker run -p 8000:8000 my-ml-model
Option 3: Cloud Deployment
AWS SageMaker:
import sagemaker
from sagemaker.sklearn import SKLearnModel
model = SKLearnModel(
model_data='s3://my-bucket/model.tar.gz',
role=role,
entry_point='inference.py'
)
predictor = model.deploy(
initial_instance_count=1,
instance_type='ml.m5.large'
)
Google Cloud AI Platform:
gcloud ai-platform models create my_model
gcloud ai-platform versions create v1 \
--model=my_model \
--runtime-version=2.5 \
--python-version=3.7 \
--framework=scikit-learn \
--origin=gs://my-bucket/model/
Step 3: Model Serving Patterns
Pattern 1: Online Serving (Real-time)
- Use case: Chatbot, recommendation, fraud detection
- Latency requirement: < 100ms
- Tech: REST API, gRPC
Pattern 2: Batch Serving
- Use case: Daily report, churn prediction
- Latency requirement: Minutes to hours OK
- Tech: Apache Spark, Airflow
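The batch pattern boils down to: load the model once, score records in chunks, write the results out. A pure-Python sketch (the scoring function, field names, and chunk size are placeholders for your real model and data store):

```python
def score(record):
    # Placeholder for model.predict on a single record
    return 1 if record["amount"] > 100 else 0

def batch_predict(records, chunk_size=2):
    """Score records chunk by chunk, as a nightly batch job would."""
    results = []
    for start in range(0, len(records), chunk_size):
        chunk = records[start:start + chunk_size]
        results.extend({"id": r["id"], "prediction": score(r)} for r in chunk)
    return results

records = [
    {"id": 1, "amount": 50},
    {"id": 2, "amount": 150},
    {"id": 3, "amount": 300},
]
print(batch_predict(records))
```

In production the loop body would read from a warehouse table and write predictions back, orchestrated by Spark or Airflow.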
Pattern 3: Edge Deployment
- Use case: Mobile app, IoT devices
- Constraint: Limited compute, offline capable
- Tech: TensorFlow Lite, ONNX, Core ML
# TensorFlow Lite conversion
import tensorflow as tf
converter = tf.lite.TFLiteConverter.from_saved_model('my_model')
tflite_model = converter.convert()
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
Step 4: Monitoring Models in Production
Metrics to Monitor
1. Model Performance Metrics
# Track prediction confidence
def log_prediction(features, prediction, confidence):
    mlflow.log_metric("confidence", confidence)
    if confidence < 0.7:
        send_alert("Low confidence prediction detected!")  # send_alert: your own alerting hook
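A single low-confidence prediction is noisy; averaging over a sliding window of recent predictions gives a steadier alert signal. A minimal sketch (the window size and 0.7 threshold are illustrative):

```python
from collections import deque

class ConfidenceMonitor:
    """Alert when average confidence over a sliding window drops too low."""

    def __init__(self, window=100, threshold=0.7):
        self.window = deque(maxlen=window)
        self.threshold = threshold

    def record(self, confidence):
        """Add one prediction's confidence; return True if an alert should fire."""
        self.window.append(confidence)
        avg = sum(self.window) / len(self.window)
        return avg < self.threshold

monitor = ConfidenceMonitor(window=3, threshold=0.7)
print(monitor.record(0.9))  # False: average 0.9
print(monitor.record(0.8))  # False: average 0.85
print(monitor.record(0.5))  # False: average ~0.73
print(monitor.record(0.3))  # True: window is now [0.8, 0.5, 0.3], average ~0.53
```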
2. Data Drift Detection
Has the data in production shifted away from the training data?
Evidently AI:
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset
report = Report(metrics=[DataDriftPreset()])
report.run(
    reference_data=training_data,
    current_data=production_data
)
report.save_html("drift_report.html")
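To see what a drift library computes under the hood, the Population Stability Index (PSI) is one common score: bin a feature on the reference data, then compare bin frequencies on production data. A pure-Python sketch (the bin edges and the 0.2 alert cutoff are conventional rules of thumb, not Evidently's exact method):

```python
import math

def psi(reference, current, bin_edges):
    """Population Stability Index between two samples of one feature."""
    def proportions(values):
        counts = [0] * (len(bin_edges) - 1)
        for v in values:
            for i in range(len(bin_edges) - 1):
                if bin_edges[i] <= v < bin_edges[i + 1]:
                    counts[i] += 1
                    break
        # Small epsilon avoids log(0) for empty bins
        return [max(c / len(values), 1e-6) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))

edges = [0, 25, 50, 75, 100]
reference = [10, 20, 30, 40, 60, 70, 80, 90]  # spread across all bins
shifted = [80, 85, 90, 95, 70, 75, 99, 88]    # mass moved to the top bin
print(psi(reference, reference, edges))       # 0.0: identical distributions
print(psi(reference, shifted, edges) > 0.2)   # True: large shift flags drift
```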
3. Concept Drift
Has the relationship between features and the target changed?
Example:
- Training: "iPhone" = luxury item (2010)
- Production: "iPhone" = common item (2024)
- The model needs retraining on fresh data!
Monitoring Tools
| Tool | Use Case |
|---|---|
| Prometheus + Grafana | Infrastructure metrics |
| Evidently AI | ML-specific metrics |
| MLflow | Experiment tracking |
| Weights & Biases | Experiment tracking + visualization |
| WhyLabs | Data drift detection |
Step 5: Retraining Pipeline
Triggers for Retraining
- Scheduled: Retrain every week/month
- Performance-based: Accuracy drops below a threshold
- Data-based: Data drift is detected
- Manual: A data scientist triggers retraining
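The triggers above can be folded into one decision function that the pipeline evaluates on every run. A minimal sketch (the thresholds and status fields are illustrative):

```python
def should_retrain(status, accuracy_threshold=0.90, max_days_since_training=30):
    """Combine scheduled, performance-based, data-based, and manual triggers.

    Returns the name of the trigger that fired, or None.
    """
    if status["days_since_training"] >= max_days_since_training:
        return "scheduled"
    if status["accuracy"] < accuracy_threshold:
        return "performance"
    if status["drift_detected"]:
        return "drift"
    if status.get("manual_request"):
        return "manual"
    return None  # no trigger fired

healthy = {"days_since_training": 5, "accuracy": 0.95, "drift_detected": False}
degraded = {"days_since_training": 5, "accuracy": 0.82, "drift_detected": False}
print(should_retrain(healthy))   # None
print(should_retrain(degraded))  # performance
```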
Automated Retraining with Airflow
from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime
def check_drift():
    # Decide whether retraining is needed (detect_drift is your own check)
    return detect_drift()
def retrain_model():
    # Fetch new data
    # Train model
    # Validate
    # Deploy if better
    pass
with DAG('ml_retraining', start_date=datetime(2024, 1, 1)) as dag:
    check = PythonOperator(task_id='check_drift', python_callable=check_drift)
    retrain = PythonOperator(task_id='retrain', python_callable=retrain_model)
    check >> retrain
Step 6: A/B Testing for Models
Test a new model without putting every user at risk.
# Route 10% of traffic to the new model; use a stable hash so each
# user always gets the same variant (built-in hash() is salted per process)
import hashlib
def get_model_version(user_id):
    bucket = int(hashlib.md5(str(user_id).encode()).hexdigest(), 16) % 100
    if bucket < 10:  # 10%
        return "model_v2"
    return "model_v1"
# Compare metrics
# If v2 better, increase traffic gradually
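"Compare metrics" in practice means checking whether the gap between variants is statistically meaningful, not noise. A minimal two-proportion z-test sketch in pure Python (the 1.96 cutoff corresponds to ~95% confidence; the counts are made up):

```python
import math

def two_proportion_z(success_a, total_a, success_b, total_b):
    """z-statistic for the difference between two conversion rates."""
    p_a = success_a / total_a
    p_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    return (p_b - p_a) / se

# model_v1: 500 positive outcomes of 5000; model_v2: 290 of 2500
z = two_proportion_z(500, 5000, 290, 2500)
print(round(z, 2))
print(abs(z) > 1.96)  # True means the difference is significant at ~95%
```

Only when the difference clears significance (and business metrics agree) should v2's traffic share be increased.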
Tools:
- MLflow Model Registry: Manage model versions
- Seldon: Advanced deployment patterns (canary, shadow)
- KFServing: Kubernetes-native model serving
Step 7: CI/CD for ML
ML Pipeline with GitHub Actions
# .github/workflows/ml-pipeline.yml
name: ML Pipeline
on: [push]
jobs:
  train:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Setup Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.9'
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Train model
        run: python train.py
      - name: Evaluate
        run: python evaluate.py
      - name: Deploy if good
        if: success()
        run: python deploy.py
MLOps Best Practices
✅ 1. Version Everything
- Code: Git
- Data: DVC (Data Version Control)
- Model: MLflow, Weights & Biases
- Environment: Docker, Conda
✅ 2. Reproducibility
# environment.yml
name: ml-project
channels:
  - conda-forge
dependencies:
  - python=3.9
  - scikit-learn=1.2.0
  - pandas=1.5.0
✅ 3. Testing
- Unit tests for preprocessing
- Integration tests for the pipeline
- Model performance tests
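A model performance test is just an assertion on held-out metrics that runs in CI before deploy. A minimal sketch with a dummy model and plain asserts (in a real suite this would be a pytest test against your evaluation set; the gate value is illustrative):

```python
def evaluate_accuracy(model_fn, examples):
    """Fraction of (features, label) examples the model labels correctly."""
    correct = sum(1 for features, label in examples if model_fn(features) == label)
    return correct / len(examples)

# Dummy stand-in for a trained classifier
def toy_model(x):
    return 1 if x > 0 else 0

holdout = [(-2, 0), (-1, 0), (1, 1), (3, 1), (0.5, 1)]

MIN_ACCURACY = 0.8  # deployment gate; tune to your baseline

accuracy = evaluate_accuracy(toy_model, holdout)
assert accuracy >= MIN_ACCURACY, f"accuracy {accuracy} below gate {MIN_ACCURACY}"
print(accuracy)  # 1.0
```

If the assertion fails, CI stops and the deploy step never runs.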
✅ 4. Documentation
- Model cards (data, limitations, bias)
- API documentation
- Runbooks for on-call
✅ 5. Security
- Encrypt model artifacts
- Access control for the model API
- Audit logs for predictions
Popular MLOps Tools
End-to-End Platforms
- Kubeflow: Kubernetes-native ML workflows
- MLflow: Experiment tracking, model registry, deployment
- Azure Machine Learning: Cloud MLOps platform
- AWS SageMaker: Managed ML platform
Specialized Tools
| Category | Tools |
|---|---|
| Experiment Tracking | MLflow, Weights & Biases, Neptune |
| Data Versioning | DVC, Pachyderm |
| Feature Store | Feast, Tecton |
| Model Serving | Seldon, KFServing, BentoML |
| Monitoring | Evidently, WhyLabs, Arize |
Conclusion
MLOps is the bridge between ML research and production. Without it, even a great AI model just sits in a notebook and never delivers value.
Key takeaways:
- Deployment = REST API, Docker, or a cloud service
- Monitoring = Track performance, data drift, and concept drift
- Retraining = Automated pipelines to keep the model fresh
- Testing = A/B testing for safe model updates
- Tools = MLflow, Kubeflow, Evidently, and more
Next step: Try deploying a simple model with Flask/FastAPI, containerize it with Docker, and set up basic monitoring. Happy MLOps-ing!
Ever deployed a model to production? Share your wins or lessons learned!