MLOps: Deploying and Maintaining AI Models in Production

Learn MLOps from scratch: how to deploy AI models to production, monitor and retrain them, and follow best practices for maintaining models over the long term.

AI Content Hub · 30 March 2026

Managed to train an AI model to 95% accuracy? Congratulations! But did you know that training is only about 20% of the work? The rest is deploying, monitoring, and maintaining the model in production, and that is what MLOps (Machine Learning Operations) is all about! 🚀

What Is MLOps?

MLOps is the practice of combining Machine Learning, DevOps, and Data Engineering to automate and streamline the deployment and maintenance of AI models in production.

Comparison: ML Research vs MLOps

Aspect      | ML Research          | MLOps/Production
Code        | Jupyter notebook     | Modular, tested, versioned
Data        | Static dataset       | Streaming, real-time
Model       | Single trained model | Versioned, A/B tested
Deployment  | Manual/saved file    | Automated pipeline
Monitoring  | Validation metrics   | Real-time performance
Updates     | Manual retraining    | Automated retraining

ML Lifecycle: From Experiment to Production

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Data Prep  │────▶│   Training  │────▶│   Evaluate  │
└─────────────┘     └─────────────┘     └─────────────┘
                                                │
                                                ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Monitor   │◀────│   Deploy    │◀────│    Test     │
│  & Retrain  │     │             │     │             │
└─────────────┘     └─────────────┘     └─────────────┘

Step 1: Model Packaging

Save Model

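import joblib
import torch

# Each snippet assumes `model` is an already-trained model of that framework.

# Scikit-learn: joblib handles objects holding large NumPy arrays efficiently
joblib.dump(model, 'model.pkl')

# TensorFlow/Keras: HDF5 file (newer Keras releases prefer the .keras format)
model.save('my_model.h5')

# PyTorch: save only the learned weights (state_dict), not the whole object
torch.save(model.state_dict(), 'model.pth')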
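
A saved model file only helps if the serving side loads exactly the artifact you validated. Here is a minimal sketch, assuming a picklable model object; the helper names `save_with_checksum` and `load_verified` are hypothetical and not part of any library. It records a SHA-256 digest at save time and checks it before unpickling:

```python
import hashlib
import pickle
from pathlib import Path


def save_with_checksum(model, path):
    """Pickle `model` to `path` and return the SHA-256 digest of the file."""
    payload = pickle.dumps(model)
    Path(path).write_bytes(payload)
    return hashlib.sha256(payload).hexdigest()


def load_verified(path, expected_digest):
    """Load the pickle only if the file still matches the recorded digest."""
    payload = Path(path).read_bytes()
    if hashlib.sha256(payload).hexdigest() != expected_digest:
        raise ValueError(f"Checksum mismatch for {path}")
    return pickle.loads(payload)
```

Store the digest next to the model version (for example as a tag in your registry) so the serving code can refuse a file that was swapped or corrupted.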

Model Registry

Store models with versioning:

MLflow:

import mlflow

mlflow.set_experiment("sentiment-analysis")
with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_metric("accuracy", 0.95)
    mlflow.sklearn.log_model(model, "model")

Step 2: Model Deployment

Option 1: REST API (Flask/FastAPI)

FastAPI (Recommended):

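from fastapi import FastAPI
from pydantic import BaseModel
import joblib

app = FastAPI()
# Load once at startup; assumes model.pkl is a full pipeline that accepts raw text
model = joblib.load('model.pkl')

class PredictionRequest(BaseModel):
    text: str

@app.post("/predict")
def predict(request: PredictionRequest):
    prediction = model.predict([request.text])
    # tolist() converts NumPy scalars to native Python types so the response is JSON-serializable
    return {"prediction": prediction.tolist()[0]}

# Run (file saved as app.py, matching the Dockerfile): uvicorn app:app --host 0.0.0.0 --port 8000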

Testing:

curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{"text": "This movie is amazing!"}'
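
The same call can be made from Python, which is handy for smoke tests. This is a minimal sketch using only the standard library; the function names `build_predict_request` and `predict` are hypothetical, and the URL assumes the API is running locally on port 8000 as above:

```python
import json
import urllib.request


def build_predict_request(text, base_url="http://localhost:8000"):
    """Build a POST request for the /predict endpoint."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(text, base_url="http://localhost:8000", timeout=5.0):
    """Send the request and return the decoded JSON response."""
    request = build_predict_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server up, `predict("This movie is amazing!")` returns the parsed JSON body as a dictionary.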

Option 2: Docker Container

Dockerfile:

FROM python:3.9-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY model.pkl .
COPY app.py .

EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

Build & Run:

docker build -t my-ml-model .
docker run -p 8000:8000 my-ml-model

Option 3: Cloud Deployment

AWS SageMaker:

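import sagemaker
from sagemaker.sklearn import SKLearnModel

# IAM role that SageMaker assumes. get_execution_role() works inside SageMaker
# notebooks; elsewhere, pass the role ARN explicitly.
role = sagemaker.get_execution_role()

model = SKLearnModel(
    model_data='s3://my-bucket/model.tar.gz',
    role=role,
    entry_point='inference.py',
    framework_version='1.2-1',  # assumed; match the scikit-learn version you trained with
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.large'
)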

Google Cloud AI Platform (legacy service; Google has since replaced it with Vertex AI):

gcloud ai-platform models create my_model
gcloud ai-platform versions create v1 \
  --model=my_model \
  --runtime-version=2.5 \
  --python-version=3.7 \
  --framework=scikit-learn \
  --origin=gs://my-bucket/model/

Step 3: Model Serving Patterns

Pattern 1: Online Serving (Real-time)

Pattern 2: Batch Serving
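
Batch serving scores many records in one scheduled job instead of answering one request at a time. Here is a minimal sketch, assuming a hypothetical `predict_batch` callable that stands in for something like `model.predict`; `chunked` and `score_in_batches` are illustrative names:

```python
from itertools import islice


def chunked(records, size):
    """Yield successive lists of at most `size` records."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def score_in_batches(records, predict_batch, batch_size=1000):
    """Score records chunk by chunk so memory use stays bounded."""
    results = []
    for batch in chunked(records, batch_size):
        results.extend(predict_batch(batch))
    return results
```

Chunking keeps memory bounded when the input is a large file or a database cursor rather than a list.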

Pattern 3: Edge Deployment

# TensorFlow Lite conversion
import tensorflow as tf

# 'my_model' must be a SavedModel directory, not an .h5 file
converter = tf.lite.TFLiteConverter.from_saved_model('my_model')
tflite_model = converter.convert()

with open('model.tflite', 'wb') as f:
    f.write(tflite_model)

Step 4: Monitoring Models in Production

Metrics to Monitor

1. Model Performance Metrics

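import logging
import mlflow

def send_alert(message):
    # Placeholder: swap in your own alerting channel (Slack, email, PagerDuty)
    logging.warning(message)

# Track prediction confidence
def log_prediction(features, prediction, confidence):
    mlflow.log_metric("confidence", confidence)

    if confidence < 0.7:
        send_alert("Low confidence prediction detected!")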
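
Alerting on every single low-confidence prediction gets noisy fast. A sketch of one alternative is to alert only when the share of low-confidence predictions in a sliding window gets too high. The class name `ConfidenceMonitor` and its default values are assumptions for illustration:

```python
from collections import deque


class ConfidenceMonitor:
    """Track the share of low-confidence predictions over a sliding window."""

    def __init__(self, window_size=500, threshold=0.7, max_low_share=0.2):
        self.threshold = threshold
        self.max_low_share = max_low_share
        self.window = deque(maxlen=window_size)

    def record(self, confidence):
        """Store one prediction; return True when an alert should fire."""
        self.window.append(confidence < self.threshold)
        if len(self.window) < self.window.maxlen:
            return False  # not enough data yet
        low_share = sum(self.window) / len(self.window)
        return low_share > self.max_low_share
```

Call `record(confidence)` after each prediction and send the alert only when it returns True.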

2. Data Drift Detection

Has the production data drifted away from the training data?

Evidently AI:

# Uses the Report API from Evidently 0.4.x; newer releases may use different import paths
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

# training_data and production_data are pandas DataFrames with the same columns
report = Report(metrics=[DataDriftPreset()])
report.run(
    reference_data=training_data,
    current_data=production_data
)
report.save_html("drift_report.html")
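
To get a feel for what a drift score measures, here is a hand-rolled sketch of the Population Stability Index (PSI) for a single numeric feature. This is a different, simpler technique than whatever Evidently runs internally. The function name, the equal-width binning, and the 1e-6 floor are assumptions made for illustration:

```python
import math


def population_stability_index(reference, current, bins=10):
    """Compute PSI between two numeric samples using equal-width bins."""
    low, high = min(reference), max(reference)
    width = (high - low) / bins or 1.0  # avoid zero width for constant data

    def bin_shares(values):
        counts = [0] * bins
        for value in values:
            index = int((value - low) / width)
            counts[min(max(index, 0), bins - 1)] += 1  # clamp out-of-range values
        # A small floor keeps the logarithm defined for empty bins.
        return [max(count / len(values), 1e-6) for count in counts]

    ref_shares = bin_shares(reference)
    cur_shares = bin_shares(current)
    return sum(
        (cur - ref) * math.log(cur / ref)
        for ref, cur in zip(ref_shares, cur_shares)
    )
```

A common rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the right threshold depends on your data.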

3. Concept Drift

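Has the relationship between the features and the target changed?

Example: in fraud detection, fraudsters change tactics, so transaction patterns that used to be safe start signaling fraud even though the inputs themselves look the same.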
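
Concept drift usually becomes visible only once ground-truth labels arrive: accuracy on recent traffic falls below the level you measured earlier. Here is a minimal sketch, assuming you can collect (prediction, actual) pairs. The function names and the 0.05 tolerance are hypothetical:

```python
def accuracy(pairs):
    """Share of (prediction, actual) pairs that match."""
    return sum(1 for predicted, actual in pairs if predicted == actual) / len(pairs)


def concept_drift_suspected(baseline_pairs, recent_pairs, max_drop=0.05):
    """Flag drift when recent accuracy falls more than `max_drop` below baseline."""
    return accuracy(baseline_pairs) - accuracy(recent_pairs) > max_drop
```

This check cannot tell concept drift apart from other causes of an accuracy drop, so treat a True result as a prompt to investigate.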

Monitoring Tools

Tool                  | Use Case
Prometheus + Grafana  | Infrastructure metrics
Evidently AI          | ML-specific metrics
MLflow                | Experiment tracking
Weights & Biases      | Experiment tracking + visualization
WhyLabs               | Data drift detection

Step 5: Retraining Pipeline

Retraining Triggers

  1. Scheduled: Retrain every week or month
  2. Performance-based: Accuracy drops below a threshold
  3. Data-based: Data drift is detected
  4. Manual: A data scientist triggers retraining
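
The automated triggers above can be combined into one decision function. This is a sketch under assumed thresholds (a 30-day maximum age and 0.90 minimum accuracy), and `retraining_reasons` is a hypothetical name:

```python
from datetime import datetime, timedelta


def retraining_reasons(last_trained, accuracy, drift_detected, now,
                       max_age=timedelta(days=30), min_accuracy=0.90):
    """Return the list of triggers that currently call for retraining."""
    reasons = []
    if now - last_trained >= max_age:
        reasons.append("scheduled")
    if accuracy < min_accuracy:
        reasons.append("performance")
    if drift_detected:
        reasons.append("data_drift")
    return reasons
```

Returning the reasons instead of a bare boolean makes it easy to log why a retraining run started.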

Automated Retraining with Airflow

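from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator, ShortCircuitOperator

def detect_drift():
    # Placeholder: plug in your own drift check and return True or False
    raise NotImplementedError

def check_drift():
    # Returning False makes ShortCircuitOperator skip the downstream tasks
    return detect_drift()

def retrain_model():
    # Fetch new data
    # Train model
    # Validate
    # Deploy if better
    pass

# `schedule` needs Airflow 2.4+ (older versions use `schedule_interval`);
# the weekly cadence is an assumed example
with DAG('ml_retraining', start_date=datetime(2024, 1, 1),
         schedule='@weekly', catchup=False) as dag:
    check = ShortCircuitOperator(task_id='check_drift', python_callable=check_drift)
    retrain = PythonOperator(task_id='retrain', python_callable=retrain_model)

    check >> retrain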

Step 6: A/B Testing for Models

Test a new model without exposing every user to the risk.

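import hashlib

# Route 10% of traffic to the new model
def get_model_version(user_id):
    # hashlib gives a stable bucket; built-in hash() is salted per process for strings
    digest = hashlib.sha256(str(user_id).encode("utf-8")).hexdigest()
    if int(digest, 16) % 100 < 10:  # 10%
        return "model_v2"
    return "model_v1"

# Compare metrics
# If v2 is better, increase traffic gradually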
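
Before shifting more traffic, it helps to check whether the difference between the two versions is more than noise. One common option is a two-proportion z-test on a success metric such as the share of correct predictions. This is a minimal sketch with a hypothetical function name, not a full experimentation framework:

```python
import math


def two_proportion_z_test(successes_a, total_a, successes_b, total_b):
    """Return (z, two-sided p-value) for the difference between two rates."""
    rate_a = successes_a / total_a
    rate_b = successes_b / total_b
    pooled = (successes_a + successes_b) / (total_a + total_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_error == 0:
        return 0.0, 1.0  # both groups are all-success or all-failure
    z = (rate_b - rate_a) / std_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A positive z means version B has the higher rate; a small p-value (0.05 is a common cutoff) suggests the gap is unlikely to be chance alone.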

Tools:

Step 7: CI/CD for ML

ML Pipeline with GitHub Actions

# .github/workflows/ml-pipeline.yml
name: ML Pipeline
on: [push]

jobs:
  train:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.9"  # quoted so YAML does not parse it as a number
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Train model
        run: python train.py
      - name: Evaluate
        run: python evaluate.py  # must exit non-zero to block the deploy step
      - name: Deploy if good
        if: success()
        run: python deploy.py
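
The deploy step only runs when the earlier steps succeed, so the evaluation script has to exit with a non-zero code when the model is not good enough. Here is a sketch of such a gate, assuming `train.py` writes a hypothetical `metrics.json` file and that 0.90 accuracy is the bar. Both are illustrative choices:

```python
import json


def passes_gate(metrics, thresholds):
    """Return True when every required metric meets its minimum value."""
    return all(metrics.get(name, 0.0) >= minimum
               for name, minimum in thresholds.items())


def main(metrics_path="metrics.json"):
    """Exit code 0 lets the pipeline continue; 1 stops it before deploy."""
    with open(metrics_path) as handle:
        metrics = json.load(handle)
    return 0 if passes_gate(metrics, {"accuracy": 0.90}) else 1


# In evaluate.py, end the script with: raise SystemExit(main())
```

A missing metric counts as a failure here, which is the safer default for a deployment gate.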

MLOps Best Practices

✅ 1. Version Everything

✅ 2. Reproducibility

# environment.yml
name: ml-project
channels:
  - conda-forge
dependencies:
  - python=3.9
  - scikit-learn=1.2.0
  - pandas=1.5.0

✅ 3. Testing

✅ 4. Documentation

✅ 5. Security

Popular MLOps Tools

End-to-End Platforms

Specialized Tools

Category             | Tools
Experiment Tracking  | MLflow, Weights & Biases, Neptune
Data Versioning      | DVC, Pachyderm
Feature Store        | Feast, Tecton
Model Serving        | Seldon, KServe (formerly KFServing), BentoML
Monitoring           | Evidently, WhyLabs, Arize

Conclusion

MLOps is the bridge between ML research and production. Without MLOps, even a great AI model stays stuck in a notebook and never delivers value.

Key takeaways:

  1. Deployment = REST API, Docker, or a cloud service
  2. Monitoring = Track performance, data drift, and concept drift
  3. Retraining = An automated pipeline to keep the model fresh
  4. Testing = A/B testing for safe model updates
  5. Tools = MLflow, Kubeflow, Evidently, and others

Next step: Try deploying a simple model with Flask/FastAPI, containerize it with Docker, and set up basic monitoring. Happy MLOps-ing! 🚀


Have you deployed a model to production? Share your success stories or lessons learned!