PIPE-1052: Prevent modifying user's config.yaml by introducing config-lock.yaml in clarifai pipeline upload #753

@nitinbhojwani

Problem

Currently, during clarifai pipeline upload, the CLI tool modifies the user’s config.yaml once all pipeline_steps (from step_directories) are uploaded.
This implicit mutation makes config.yaml stateful and hard to track, as users may unintentionally commit those changes or lose their original configuration.

We want to:

  • Prevent modifications to the user’s config.yaml.
  • Introduce a consistent, lockfile-based mechanism (config-lock.yaml) to capture uploaded pipeline metadata (IDs, version IDs, etc.).

Proposed Solution

1. Introduce config-lock.yaml

  • Do not modify config.yaml during clarifai pipeline upload.
  • Generate or update a config-lock.yaml whenever a pipeline upload happens (unless --no-lockfile is explicitly passed).
  • config-lock.yaml should contain:
    • user_id
    • app_id
    • pipeline_id
    • pipeline_version_id
    • all pipeline-step IDs and version IDs (same as previously captured in config.yaml).
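A minimal sketch of how the upload command might assemble these fields before serializing them, assuming the key layout of the sample config-lock.yaml further down. The StepVersion type, the build_lock_data function, and the step_versions key are hypothetical names used for illustration only; the sample carries step versions inside the Argo spec instead.

```python
from dataclasses import dataclass


@dataclass
class StepVersion:
    """One uploaded pipeline step (hypothetical helper type)."""
    step_id: str
    version_id: str


def step_ref(user_id, app_id, step):
    """Build a step reference in the path form used by the sample lockfile."""
    return (
        f"users/{user_id}/apps/{app_id}"
        f"/pipeline-steps/{step.step_id}/versions/{step.version_id}"
    )


def build_lock_data(user_id, app_id, pipeline_id, pipeline_version_id, steps):
    """Collect the lockfile fields into one mapping, ready to serialize as YAML."""
    required = {
        "user_id": user_id,
        "app_id": app_id,
        "pipeline_id": pipeline_id,
        "pipeline_version_id": pipeline_version_id,
    }
    missing = [name for name, value in required.items() if not value]
    if missing:
        raise ValueError("cannot build lockfile, missing: " + ", ".join(missing))
    return {
        "pipeline": {
            "id": pipeline_id,
            "user_id": user_id,
            "app_id": app_id,
            "version_id": pipeline_version_id,
            # Hypothetical key: a flat list of step references.
            "step_versions": [step_ref(user_id, app_id, s) for s in steps],
        }
    }
```

Building the mapping in memory and writing it to config-lock.yaml in one step keeps config.yaml out of the write path entirely.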

2. Modify CLI behaviors

clarifai pipeline upload

  • Default: generate/modify config-lock.yaml (leave config.yaml untouched).
  • Add flag --no-lockfile → bypass lockfile creation.
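The flag behavior above could be sketched as follows. This uses argparse purely for illustration; the actual clarifai CLI may be built on a different framework, and the positional path argument and both function names are assumptions.

```python
import argparse


def build_upload_parser():
    """Parser for a hypothetical stand-in of `clarifai pipeline upload`."""
    parser = argparse.ArgumentParser(prog="clarifai pipeline upload")
    # Assumed positional: directory that holds config.yaml.
    parser.add_argument("path", nargs="?", default=".")
    parser.add_argument(
        "--no-lockfile",
        action="store_true",
        help="skip creating or updating config-lock.yaml",
    )
    return parser


def should_write_lockfile(argv):
    """Return True unless --no-lockfile was passed explicitly."""
    args = build_upload_parser().parse_args(argv)
    return not args.no_lockfile
```

Writing the lockfile stays the default, so a user has to opt out explicitly.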

clarifai pipeline run

  • If config-lock.yaml is present:
    • Validate presence of at least: user_id, app_id, pipeline_id, pipeline_version_id.
    • Use lockfile values as defaults.
  • CLI inputs take precedence: command-line arguments override lockfile values.
  • If lockfile is missing but CLI args are provided, execution should still succeed.
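The precedence rules above can be sketched as a small resolver. It assumes the lockfile has already been parsed into a dict shaped like the sample config-lock.yaml; resolve_run_params and REQUIRED_KEYS are hypothetical names, not part of the existing CLI.

```python
REQUIRED_KEYS = ("user_id", "app_id", "pipeline_id", "pipeline_version_id")


def resolve_run_params(cli_args, lock_data=None):
    """Merge CLI arguments over lockfile defaults for `clarifai pipeline run`.

    cli_args: dict of values given on the command line (absent or None if unset).
    lock_data: parsed config-lock.yaml, or None when no lockfile exists.
    """
    defaults = {}
    if lock_data is not None:
        pipeline = lock_data.get("pipeline") or {}
        defaults = {
            "user_id": pipeline.get("user_id"),
            "app_id": pipeline.get("app_id"),
            "pipeline_id": pipeline.get("id"),
            "pipeline_version_id": pipeline.get("version_id"),
        }
        # A lockfile that is present must carry all four identifiers.
        missing = [key for key in REQUIRED_KEYS if not defaults[key]]
        if missing:
            raise ValueError("config-lock.yaml is missing: " + ", ".join(missing))

    # CLI values win; lockfile values only fill the gaps.
    resolved = {key: cli_args.get(key) or defaults.get(key) for key in REQUIRED_KEYS}
    unresolved = [key for key in REQUIRED_KEYS if not resolved[key]]
    if unresolved:
        raise ValueError("no value for: " + ", ".join(unresolved))
    return resolved
```

With this shape, a run with no lockfile succeeds as long as all four values arrive on the command line, and a run with only a lockfile succeeds without any arguments.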

3. Validator CLI Tool (pipeline-config-lock-yaml-validator)

  • Explicit command (not automatically invoked during run)
    clarifai pipeline validate-lock config-lock.yaml (if no path is given, look for config-lock.yaml in the current working directory)

  • Validates that:

    • user_id, app_id, pipeline_id, pipeline_version_id are present.
    • Every pipeline-step version reference is consistent between templateRef name and template.
    • File schema is well-formed.
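A rough sketch of the checks above, operating on an already parsed lockfile dict. Two assumptions are worth flagging: "consistent" is read here as templateRef name and template being identical (as in the sample), and step references are expected in the users/.../apps/.../pipeline-steps/.../versions/... form. The function and pattern names are hypothetical, and the Argo spec is scanned with a regular expression rather than a YAML parser to keep the example dependency-free.

```python
import re

# Path form taken from the sample lockfile.
STEP_REF = re.compile(
    r"^users/[^/\s]+/apps/[^/\s]+/pipeline-steps/[^/\s]+/versions/[^/\s]+$"
)
# Finds each templateRef block with its name and template values.
TEMPLATE_REF = re.compile(
    r"templateRef:\s*\n\s*name:\s*(?P<name>\S+)\s*\n\s*template:\s*(?P<template>\S+)"
)


def validate_lock(lock_data):
    """Return a list of problems found in a parsed config-lock.yaml (empty if none)."""
    pipeline = lock_data.get("pipeline") if isinstance(lock_data, dict) else None
    if not isinstance(pipeline, dict):
        return ["top-level 'pipeline' mapping is missing"]

    errors = []
    for key in ("id", "user_id", "app_id", "version_id"):
        if not pipeline.get(key):
            errors.append(f"pipeline.{key} is missing")

    spec = (pipeline.get("orchestration_spec") or {}).get("argo_orchestration_spec") or ""
    for match in TEMPLATE_REF.finditer(spec):
        name, template = match["name"], match["template"]
        if not STEP_REF.match(name):
            errors.append(f"templateRef name is not a versioned step reference: {name}")
        if name != template:
            errors.append(f"templateRef name and template differ: {name} vs {template}")
    return errors
```

Returning a list of messages instead of raising on the first problem lets the command report every issue in one pass.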

Acceptance Criteria

  • config.yaml is never modified by clarifai pipeline upload.
  • A config-lock.yaml is generated/updated during pipeline upload (unless --no-lockfile is used).
  • config-lock.yaml always includes user_id, app_id, pipeline_id, pipeline_version_id, and all step IDs/versions.
  • clarifai pipeline run works with only the lockfile present.
  • CLI inputs always override lockfile values.
  • Validator can be run explicitly to detect schema or reference issues.

Sample config-lock.yaml (for pipeline):

pipeline:
  id: yolov7-training-pipeline
  user_id: nitin-clarifai
  app_id: workflows-hack
  version_id: 0f95a54ab1c84e2a93bcb544e5ec71f9
  orchestration_spec:
    argo_orchestration_spec: |
      apiVersion: argoproj.io/v1alpha1
      kind: Workflow
      metadata:
        generateName: yolov7-training-pipeline-
      spec:
        entrypoint: sequence
        templates:
        - name: sequence
          steps:
          - - name: yolov7-training-ps
              templateRef:
                name: users/nitin-clarifai/apps/workflows-hack/pipeline-steps/yolov7-training-ps/versions/5ab6e5b2da4f43158ff2350373f291d8
                template: users/nitin-clarifai/apps/workflows-hack/pipeline-steps/yolov7-training-ps/versions/5ab6e5b2da4f43158ff2350373f291d8
