GitLab CI/CD & Alternative Platforms
GitLab Model Registry Integration
GitLab Model Registry Overview
GitLab introduced a built-in Model Registry that provides MLflow-compatible model versioning directly within the platform. This eliminates the need for external model registries and integrates seamlessly with GitLab CI/CD pipelines.
Key features:
- MLflow-compatible API
- Model versioning with semantic versions
- Candidate and production model stages
- Direct integration with CI/CD pipelines
- Artifact storage with GitLab's infrastructure
Enabling Model Registry
The Model Registry is available at the project level under Deploy > Model registry. To use it programmatically, point an MLflow client at the project's MLflow-compatible API endpoint and authenticate with a job or personal access token:
```python
# scripts/register_model.py
import mlflow
import os

# Configure MLflow to use GitLab's Model Registry. GitLab exposes an
# MLflow-compatible endpoint per project at
# <API v4 URL>/projects/<project id>/ml/mlflow
mlflow.set_tracking_uri(
    f"{os.environ['CI_API_V4_URL']}/projects/{os.environ['CI_PROJECT_ID']}/ml/mlflow"
)
mlflow.set_experiment(os.environ["CI_PROJECT_PATH"])

# Authenticate with the job token
os.environ["MLFLOW_TRACKING_TOKEN"] = os.environ["CI_JOB_TOKEN"]


def register_model(model_path, model_name, metrics):
    """Register a model with the GitLab Model Registry."""
    with mlflow.start_run() as run:
        # Log metrics
        for metric_name, value in metrics.items():
            mlflow.log_metric(metric_name, value)

        # Log model artifacts
        mlflow.log_artifacts(model_path, "model")

        # Register the model
        model_uri = f"runs:/{run.info.run_id}/model"
        mlflow.register_model(model_uri, model_name)

        return run.info.run_id


if __name__ == "__main__":
    metrics = {
        "accuracy": 0.94,
        "f1_score": 0.92,
        "latency_ms": 45,
    }
    run_id = register_model("models/trained/", "sentiment-classifier", metrics)
    print(f"Registered model with run ID: {run_id}")
```
CI/CD Integration for Model Registration
Integrate model registration into your GitLab CI/CD pipeline:
```yaml
# .gitlab-ci.yml
stages:
  - train
  - register
  - deploy

variables:
  MODEL_NAME: "sentiment-classifier"
  # GitLab's MLflow-compatible endpoint; the MLflow client reads both
  # variables from the environment automatically
  MLFLOW_TRACKING_URI: "$CI_API_V4_URL/projects/$CI_PROJECT_ID/ml/mlflow"
  MLFLOW_TRACKING_TOKEN: "$CI_JOB_TOKEN"

train-model:
  stage: train
  image: python:3.11
  script:
    - pip install -r requirements.txt
    - python scripts/train.py
    - python scripts/evaluate.py > metrics.json
  artifacts:
    paths:
      - models/
      - metrics.json
    expire_in: 7 days

register-model:
  stage: register
  image: python:3.11
  script:
    - pip install mlflow
    - |
      python << 'EOF'
      import mlflow
      import json
      import os

      with open("metrics.json") as f:
          metrics = json.load(f)

      with mlflow.start_run() as run:
          for k, v in metrics.items():
              mlflow.log_metric(k, v)
          mlflow.log_artifacts("models/", "model")
          model_uri = f"runs:/{run.info.run_id}/model"
          mlflow.register_model(model_uri, os.environ["MODEL_NAME"])
      EOF
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
  needs:
    - train-model

deploy-model:
  stage: deploy
  script:
    - echo "Deploying model $MODEL_NAME"
    - python scripts/deploy_from_registry.py
  environment:
    name: production
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: manual
  needs:
    - register-model
```
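The `train-model` job redirects `scripts/evaluate.py` into `metrics.json`, and the `register-model` job assumes that file is a flat JSON object mapping metric names to numbers (one `mlflow.log_metric` call per entry). A minimal sketch of that contract, with illustrative metric names:

```python
import json

# Writer side: evaluate.py is assumed to emit a flat name -> number mapping
metrics = {"accuracy": 0.94, "f1_score": 0.92, "latency_ms": 45}
with open("metrics.json", "w") as f:
    json.dump(metrics, f)

# Reader side: the register job logs each entry as an MLflow metric,
# so every value must be numeric
with open("metrics.json") as f:
    loaded = json.load(f)
assert all(isinstance(v, (int, float)) for v in loaded.values())
print(loaded)
```

Keeping the file flat (no nested objects) keeps the registration heredoc trivial; nested evaluation reports would need flattening before `log_metric`.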
Model Versioning Workflow
Implement a complete model versioning workflow:
```yaml
# .gitlab-ci.yml
variables:
  MODEL_NAME: "fraud-detector"
  ACCURACY_THRESHOLD: "0.90"
  MLFLOW_TRACKING_URI: "$CI_API_V4_URL/projects/$CI_PROJECT_ID/ml/mlflow"
  MLFLOW_TRACKING_TOKEN: "$CI_JOB_TOKEN"

stages:
  - train
  - validate
  - register
  - promote

train-candidate:
  stage: train
  script:
    - python scripts/train.py
    - python scripts/evaluate.py --output metrics.json
  artifacts:
    paths:
      - models/candidate/
      - metrics.json

validate-candidate:
  stage: validate
  script:
    - |
      python << 'EOF'
      import json
      import os
      import sys

      with open("metrics.json") as f:
          metrics = json.load(f)

      # The heredoc delimiter is quoted, so shell variables are not
      # interpolated; read the threshold from the environment instead
      threshold = float(os.environ["ACCURACY_THRESHOLD"])
      if metrics["accuracy"] < threshold:
          print(f"Model accuracy {metrics['accuracy']} below threshold {threshold}")
          sys.exit(1)
      print(f"Model passed validation with accuracy {metrics['accuracy']}")
      EOF
  needs:
    - train-candidate

register-candidate:
  stage: register
  script:
    - pip install mlflow
    - |
      python << 'EOF'
      import mlflow
      import os

      model_name = os.environ["MODEL_NAME"]
      with mlflow.start_run() as run:
          mlflow.log_artifacts("models/candidate/", "model")
          model_uri = f"runs:/{run.info.run_id}/model"
          # Register as a new version, then move it to the "Staging" stage
          result = mlflow.register_model(model_uri, model_name)

      client = mlflow.tracking.MlflowClient()
      client.transition_model_version_stage(
          name=model_name,
          version=result.version,
          stage="Staging",
      )
      print(f"Registered version {result.version} in Staging")
      EOF
  needs:
    - validate-candidate

promote-to-production:
  stage: promote
  script:
    - pip install mlflow
    - |
      python << 'EOF'
      import mlflow
      import os

      model_name = os.environ["MODEL_NAME"]
      client = mlflow.tracking.MlflowClient()

      # Get the latest Staging version and promote it
      staging_versions = client.get_latest_versions(model_name, stages=["Staging"])
      if staging_versions:
          version = staging_versions[0].version
          client.transition_model_version_stage(
              name=model_name,
              version=version,
              stage="Production",
              archive_existing_versions=True,
          )
          print(f"Promoted version {version} to Production")
      EOF
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: manual
  needs:
    - register-candidate
```
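The validation gate above hard-codes a single accuracy check. A slightly more general sketch (a hypothetical helper, not a GitLab or MLflow API) checks several metrics against thresholds, treating latency as lower-is-better:

```python
def passes_gates(metrics, gates):
    """Check a dict of metrics against gates; return (ok, failures).

    Each gate is (threshold, direction): "min" means the metric must be
    at least threshold, "max" means at most threshold.
    """
    failures = []
    for name, (threshold, direction) in gates.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: missing")
        elif direction == "min" and value < threshold:
            failures.append(f"{name}: {value} < {threshold}")
        elif direction == "max" and value > threshold:
            failures.append(f"{name}: {value} > {threshold}")
    return (not failures, failures)


gates = {
    "accuracy": (0.90, "min"),
    "latency_ms": (100, "max"),
}
ok, failures = passes_gates({"accuracy": 0.94, "latency_ms": 45}, gates)
print(ok)  # True: both gates satisfied
```

In a `validate` job, a falsy result would translate to `sys.exit(1)` with the failure list printed, failing the pipeline before registration.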
Fetching Models from Registry
Deploy models by fetching from the registry:
```python
# scripts/deploy_from_registry.py
import mlflow
import os


def _configure_tracking():
    """Point MLflow at the project's GitLab Model Registry endpoint."""
    mlflow.set_tracking_uri(
        f"{os.environ['CI_API_V4_URL']}/projects/{os.environ['CI_PROJECT_ID']}/ml/mlflow"
    )


def load_production_model(model_name):
    """Load the current production model from the GitLab Model Registry."""
    _configure_tracking()
    # Load whichever version is currently in the Production stage
    model_uri = f"models:/{model_name}/Production"
    return mlflow.pyfunc.load_model(model_uri)


def get_model_metadata(model_name):
    """Get metadata for the production model, or None if there isn't one."""
    _configure_tracking()
    client = mlflow.tracking.MlflowClient()
    versions = client.get_latest_versions(model_name, stages=["Production"])
    if versions:
        version = versions[0]
        return {
            "version": version.version,
            "run_id": version.run_id,
            "created_at": version.creation_timestamp,
            "description": version.description,
        }
    return None


if __name__ == "__main__":
    model_name = os.environ.get("MODEL_NAME", "sentiment-classifier")
    metadata = get_model_metadata(model_name)
    if metadata is None:
        raise SystemExit(f"No Production version found for {model_name}")
    print(f"Deploying model version {metadata['version']}")
    model = load_production_model(model_name)
    # Deploy model to serving infrastructure...
```
Key Takeaways
| Aspect | GitLab Model Registry |
|---|---|
| API Compatibility | MLflow-compatible |
| Model Stages | None, Staging, Production, Archived |
| Authentication | CI_JOB_TOKEN in pipelines |
| Versioning | Automatic version incrementing |
| Integration | Native GitLab CI/CD support |
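Outside CI (for example, from a laptop), the same endpoint shape applies; you would pair it with a personal access token in `MLFLOW_TRACKING_TOKEN` instead of `CI_JOB_TOKEN`. A small helper (hypothetical, for local experimentation) that builds the documented `<server>/api/v4/projects/<project id>/ml/mlflow` URI:

```python
def gitlab_mlflow_uri(server_url: str, project_id: int) -> str:
    """Build the MLflow tracking URI for a GitLab project's Model Registry."""
    return f"{server_url.rstrip('/')}/api/v4/projects/{project_id}/ml/mlflow"


# Example: point a local MLflow client at a project on gitlab.com
print(gitlab_mlflow_uri("https://gitlab.com", 12345))
```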