Source Code Migration Handler (sc-migration-handler)

The sc-migration-handler project automates the migration of source code repositories from an on-premises GitLab CE instance to a GitHub organization. It transfers repository content (commits and branches; migrating issues and merge requests is listed under Future Improvements) while maintaining security, efficiency, and traceability. It is designed for organizations moving from self-hosted GitLab to GitHub.com, with a focus on reliability and DevSecOps best practices.

Table of Contents

  1. Project Plan
  2. Prerequisites and Assumptions
  3. Technical Setup and Execution
  4. Expected Outcomes
  5. DevSecOps Best Practices
  6. Future Improvements
  7. Contributing
  8. License

Project Plan

The sc-migration-handler project was designed to migrate 15 repositories (repo-1 to repo-15) from an on-premises GitLab CE instance (http://localhost:8080, onprem-org group) to a GitHub organization (pxkundu7-org). The plan included the following phases:

  1. Inventory Collection:

    • Use inventory.sh to generate ../data/repo_inventory.json, listing repo details (path, ID, URL) for repo-1 to repo-15.

    • Example repo_inventory.json:

      [
        {"id": 16, "path": "repo-1", "http_url_to_repo": "http://localhost:8080/onprem-org/repo-1.git"},
        ...
        {"id": 30, "path": "repo-15", "http_url_to_repo": "http://localhost:8080/onprem-org/repo-15.git"}
      ]
  2. Dry Run:

    • Execute dry_run.sh to simulate migration for a subset of repos (e.g., repo-1 to repo-5) using GitHub Enterprise Importer (GEI) or equivalent logic.
    • Validate repo existence and configuration on GitHub.
  3. Production Migration:

    • Run production_migration.sh to clone GitLab repos and push content to existing GitHub repos.
    • Handle authentication with GitLab PAT and GitHub PAT, ensuring no user input (e.g., password prompts).
  4. Post-Migration Tasks:

    • Use post_migration.sh to set GitHub repo visibility to private and notify developers via GitHub issues.
    • Update ../docs/migration_plan.md with migration status.
  5. Validation:

    • Verify GitHub repos contain all commits, branches, and expected metadata.
    • Ensure developers can access and clone repos.

The project prioritized automation, error handling, and logging to ensure traceability and minimal downtime.
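
The inventory phase could be sketched roughly as follows. This is an illustration only, not the project's inventory.sh: it assumes the GitLab "list a group's projects" endpoint (/api/v4/groups/<group>/projects) and the variables from ../config/.env, and the function names to_inventory and build_inventory are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an inventory step; the real inventory.sh may differ.

# Reduce a GitLab project list (JSON on stdin) to the three inventory fields,
# ordered by project ID.
to_inventory() {
  jq '[.[] | {id, path, http_url_to_repo}] | sort_by(.id)'
}

# Fetch the group's projects and write the inventory file.
# Assumes GITLAB_HOST, GITLAB_PAT, SOURCE_ORG and DATA_DIR are already set.
# per_page=100 is enough for 15 repos; larger groups would need pagination.
build_inventory() {
  mkdir -p "$DATA_DIR" || return 1
  curl --fail --silent --show-error \
    --header "PRIVATE-TOKEN: $GITLAB_PAT" \
    "$GITLAB_HOST/api/v4/groups/$SOURCE_ORG/projects?per_page=100" \
    | to_inventory > "$DATA_DIR/repo_inventory.json"
}
```

After sourcing ../config/.env, calling build_inventory would write a file with the same shape as the repo_inventory.json example above.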

Prerequisites and Assumptions

Prerequisites

  • System Requirements:
    • Linux environment (e.g., Ubuntu) with bash, git, curl, and jq installed.
    • GitHub CLI (gh) installed: sudo apt-get install gh.
    • Docker for running GitLab CE locally (if testing).
  • GitLab CE Setup:
    • On-premises GitLab CE instance at http://localhost:8080.
    • Group onprem-org (ID: 2) with repos repo-1 to repo-15 (IDs 16-30).
    • GitLab PAT with api, read_repository, write_repository scopes for user group_2_bot_f67ac8ac54c062793d6a1ff0b030e6c8.
  • GitHub Setup:
    • GitHub organization pxkundu7-org with empty repos repo-1 to repo-15.
    • GitHub PAT with repo, admin:org scopes.
  • Project Files:
    • ../config/.env with:

      SOURCE_ORG=onprem-org
      DEST_ORG=pxkundu7-org
      GITLAB_PAT=glpat-...
      GH_PAT=ghp_...
      GITLAB_HOST=http://localhost:8080
      DATA_DIR=../data
      LOG_DIR=../logs
    • ../data/repo_inventory.json generated by inventory.sh.

  • Permissions:
    • Write access to ../data and ../logs directories.
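
A preflight check along these lines could confirm the prerequisites before any script touches a repository. It is a sketch under assumptions: the helper names (missing_tools, missing_vars, preflight) are hypothetical, and the tool and variable lists are copied from the prerequisites above.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helpers; names are illustrative, not from the project.

# Print every command from the argument list that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Print every named variable that is unset or empty.
missing_vars() {
  local name
  for name in "$@"; do
    [[ -n "${!name:-}" ]] || printf '%s\n' "$name"
  done
}

# Report anything that is absent and return non-zero if the list is not empty.
preflight() {
  local missing item
  missing="$(missing_tools bash git curl jq gh
             missing_vars SOURCE_ORG DEST_ORG GITLAB_PAT GH_PAT \
                          GITLAB_HOST DATA_DIR LOG_DIR)"
  [[ -z "$missing" ]] && return 0
  while IFS= read -r item; do
    printf '[ERROR] Missing prerequisite: %s\n' "$item" >&2
  done <<< "$missing"
  return 1
}
```

A script could source ../config/.env and then run preflight || exit 1 so that a missing tool or token stops the run before the first clone.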

Assumptions

  • GitLab repos have content (commits, branches) generated by generate_repo_history.sh.
  • GitHub repos are pre-created and empty, requiring only content updates.
  • Network access to http://localhost:8080 and https://github.com.
  • Single user (group_2_bot_...) manages GitLab access; GitHub PAT is for an org admin.
  • No external dependencies (e.g., third-party APIs) beyond GitLab and GitHub.
  • Migration is one-time, with no incremental updates required.

Technical Setup and Execution

Setup

  1. Clone the Repository:

    git clone https://github.com/pxkundu7/sc-migration-handler.git
    cd sc-migration-handler
  2. Configure GitLab CE (if testing locally):

    docker run -d --name gitlab -p 8080:80 -v gitlab-data:/var/opt/gitlab gitlab/gitlab-ce:latest
    • Access http://localhost:8080, set up onprem-org, and create repos repo-1 to repo-15.
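    • Generate the GitLab token:
      • group_2_bot_... is the bot user that GitLab creates for a group access token, so it cannot sign in to the UI.
      • As an owner of onprem-org, go to the group's Settings > Access Tokens.
      • Create a token with the api, read_repository, and write_repository scopes.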
  3. Configure .env:

    mkdir -p config data logs
    nano config/.env
    • Add:

      SOURCE_ORG=onprem-org
      DEST_ORG=pxkundu7-org
      GITLAB_PAT=glpat-...
      GH_PAT=ghp_...
      GITLAB_HOST=http://localhost:8080
      DATA_DIR=../data
      LOG_DIR=../logs
    • Set permissions:

      chmod 600 config/.env
      chmod 755 data logs
  4. Install Dependencies:

    sudo apt-get update
    sudo apt-get install -y jq gh
  5. Generate Repo Inventory:

    cd scripts
    ./create_repos.sh
    ./generate_repo_history.sh
    ./inventory.sh
    • Verify ../data/repo_inventory.json:

      jq -r '.[].path' ../data/repo_inventory.json
  6. Verify GitHub Repos:

    source ../config/.env
    gh repo list "$DEST_ORG" --limit 15
    • Ensure pxkundu7-org/repo-1 to repo-15 exist (empty).
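
Rather than reading the gh repo list output by eye, the existence check could be scripted. The sketch below assumes the repo-1 to repo-N naming used in this project and relies on gh repo view exiting non-zero when a repository cannot be found; the function name missing_github_repos is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical check that every expected destination repo exists.

# Print the repos (repo-1 .. repo-N) that gh cannot find in the organization.
# Usage: missing_github_repos <org> <count>
missing_github_repos() {
  local org="$1" count="$2" i
  for ((i = 1; i <= count; i++)); do
    gh repo view "$org/repo-$i" >/dev/null 2>&1 || printf 'repo-%d\n' "$i"
  done
}
```

With ../config/.env sourced, missing_github_repos "$DEST_ORG" 15 would print nothing when all 15 destination repos are present. It does not check that the repos are empty.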

Execution

  1. Dry Run (Optional):

    ./dry_run.sh
    • Simulates migration for repo-1 to repo-5.

    • Check logs:

      cat ../logs/dry_run_*.log
  2. Production Migration:

    ./production_migration.sh
    • Clones GitLab repos and pushes to GitHub.

    • Example output:

      [DEBUG] Loading ../config/.env...
      [INFO] Logging to ../logs/migration_summary.log
      [INFO] Found 15 repositories to migrate
      [DEBUG] Processing repo-1...
      [INFO] Successfully migrated repo-1
      ...
      [INFO] Migration completed. 15/15 repositories migrated successfully.
      
    • Check logs:

      cat ../logs/migration_summary.log
  3. Post-Migration Tasks:

    ./post_migration.sh
    • Sets repos to private and creates notification issues.

    • Example output:

      [INFO] Logging to ../logs/post_migration_20250519_*.log
      [INFO] Found 15 repositories for post-migration tasks
      [INFO] Successfully processed repo-1
      ...
      [INFO] Post-migration tasks completed. 15/15 repositories processed successfully.
      
    • Check logs:

      cat ../logs/post_migration_*.log
  4. Validate:

    git clone https://github.com/pxkundu7-org/repo-1.git
    cd repo-1
    git log --oneline
    git branch -r
    cd ..
    rm -rf repo-1
    gh issue list --repo pxkundu7-org/repo-1
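
The manual validation above could be complemented by an automated comparison of branch and tag tips on both sides. This is a sketch, not part of the project scripts: remote_refs and refs_match are hypothetical names, and both URLs are assumed to be reachable with whatever credentials git already has.

```shell
#!/usr/bin/env bash
# Hypothetical post-migration check: compare branch and tag tips on both sides.

# Print "<sha><TAB><ref>" for every branch and tag of a remote, sorted by ref.
remote_refs() {
  local out
  out="$(git ls-remote --heads --tags "$1")" || return 1
  printf '%s\n' "$out" | sort -k2
}

# Return 0 if source and destination expose identical, non-empty ref lists,
# 1 if they differ, and 2 if either remote could not be read.
refs_match() {
  local src_refs dst_refs
  src_refs="$(remote_refs "$1")" || return 2
  dst_refs="$(remote_refs "$2")" || return 2
  [ -n "$src_refs" ] && [ "$src_refs" = "$dst_refs" ]
}
```

For example, refs_match "$GITLAB_HOST/onprem-org/repo-1.git" "https://github.com/pxkundu7-org/repo-1.git" would succeed only when every branch and tag points at the same commit in both places. Matching tips imply matching history, because a commit ID covers all of its ancestors.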

Expected Outcomes

  • Repository Migration:
    • All 15 GitHub repos (pxkundu7-org/repo-1 to repo-15) contain GitLab content (commits, branches).
    • Repos are private and have notification issues for developers.
  • Traceability:
    • Logs in ../logs/migration_summary.log, dry_run_*.log, and post_migration_*.log detail every step.
    • ../docs/migration_plan.md records completion timestamps.
  • Developer Access:
    • Developers can clone repos and see issues instructing them to update remotes to https://github.com/pxkundu7-org/<repo>.git.
  • No Data Loss:
    • All GitLab repo history is preserved in GitHub.
  • Automation:
    • Scripts run without user input, with errors logged and handled gracefully.
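
Running without user input mostly comes down to how git receives its credentials. One common approach, shown here as an assumption rather than as what the project scripts do, is to embed the token in the clone URL and disable git's terminal prompt. The helper name with_token is hypothetical, and oauth2 is used as the username on the assumption that the GitLab instance accepts it for token authentication.

```shell
#!/usr/bin/env bash
# Hypothetical helper for prompt-free cloning; not taken from the project scripts.

# Insert "oauth2:<token>@" after the scheme of an HTTP(S) clone URL.
# http://host/group/repo.git -> http://oauth2:<token>@host/group/repo.git
with_token() {
  local url="$1" token="$2"
  local scheme="${url%%://*}" rest="${url#*://}"
  printf '%s://oauth2:%s@%s\n' "$scheme" "$token" "$rest"
}

# Make git fail immediately instead of waiting for a username/password prompt.
export GIT_TERMINAL_PROMPT=0
```

A token embedded this way is stored in the clone's .git/config and can show up in process listings, so the working copy should be temporary and the URL should never be written to the logs.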

DevSecOps Best Practices

The project adheres to DevSecOps principles to ensure secure, efficient, and reliable migration:

  1. Security:

    • Secure Credentials: Stores GITLAB_PAT and GH_PAT in ../config/.env with 600 permissions.

      chmod 600 ../config/.env
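    • Scoped Tokens: Limits the GitLab PAT to the api, read_repository, and write_repository scopes and the GitHub PAT to repo and admin:org. Note that api already grants full API access, so a read-only source token would be narrower still.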

    • No Hardcoding: Avoids embedding sensitive data in scripts.

    • Audit Logging: Captures all actions in timestamped logs for security audits.

  2. Automation:

    • Fully automated scripts (inventory.sh, production_migration.sh, post_migration.sh) eliminate manual steps.

    • Validates prerequisites (tools, auth, files) before execution.

      : "${GITLAB_PAT:?Missing GITLAB_PAT}"
      command -v gh >/dev/null || { echo "[ERROR] GitHub CLI required"; exit 1; }
  3. Reliability:

    • Error Handling: Skips failed repos and continues migration, tracking failures.

      if ! git clone --mirror "$fixed_url"; then
        echo "[ERROR] Failed to clone $repo_name"
        FAILURES=$((FAILURES + 1))  # unlike ((FAILURES++)), never returns non-zero under set -e
        continue
      fi
    • Idempotency: Handles existing GitHub repos by updating content, avoiding duplicates.

    • Validation: Verifies auth and repo existence before actions.

  4. Traceability:

    • Logs debug, info, and error messages to terminal and files.

      echo "[INFO] Successfully migrated $repo_name" | tee -a "$LOG_FILE"
    • Maintains migration_plan.md for documentation.

  5. Collaboration:

    • Creates GitHub issues to notify developers, ensuring team awareness.

      gh issue create --repo "$DEST_ORG/$repo_name" \
        --title "Migration Complete: Update Remotes" \
        --body "Update your remotes to: https://github.com/$DEST_ORG/$repo_name.git"
  6. Testing:

    • Includes dry_run.sh to simulate migration for a subset, reducing risk.
    • Supports local GitLab CE via Docker for testing.
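
The error-handling and logging fragments above could fit together in a driver loop like the one below. It is a sketch, not the contents of production_migration.sh: log, migrate_one, and migrate_all are hypothetical names, and the input format (one "name source-url destination-url" line per repo) is an assumption made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical driver loop combining logging, mirroring and failure tracking.

LOG_FILE="${LOG_FILE:-migration_summary.log}"
FAILURES=0

# Print a timestamped, levelled message and append it to the log file.
log() {
  local level="$1"; shift
  printf '%s [%s] %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$level" "$*" \
    | tee -a "$LOG_FILE"
}

# Mirror one repository from src_url to dst_url through a temporary bare clone.
migrate_one() {
  local src_url="$1" dst_url="$2" workdir
  workdir="$(mktemp -d)" || return 1
  git clone --quiet --mirror "$src_url" "$workdir/repo.git" \
    && git -C "$workdir/repo.git" push --quiet --mirror "$dst_url"
  local status=$?
  rm -rf "$workdir"
  return "$status"
}

# Migrate each "name src dst" line on stdin; a failure is logged and counted,
# and the loop moves on. Returns non-zero if any repository failed.
migrate_all() {
  local name src dst total=0
  while read -r name src dst; do
    total=$((total + 1))
    if migrate_one "$src" "$dst" </dev/null; then
      log INFO "Successfully migrated $name"
    else
      log ERROR "Failed to migrate $name"
      FAILURES=$((FAILURES + 1))
    fi
  done
  log INFO "Migration completed. $((total - FAILURES))/$total repositories migrated successfully."
  [ "$FAILURES" -eq 0 ]
}
```

Feeding the function with a redirect (migrate_all < repos.txt, where repos.txt is a hypothetical file) keeps FAILURES visible to the caller afterwards; piping into it runs the loop in a subshell, so only the exit status survives.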

Future Improvements

To make the project more efficient, dynamic, and robust, consider:

  1. Incremental Migrations:

    • Add support for incremental updates (e.g., new commits post-migration) using git fetch and git push.
    • Implement a --incremental flag in production_migration.sh.
  2. Parallel Processing:

    • Optimize production_migration.sh for parallel cloning/pushing with rate limiting:

      migrate_repo "$repo" & JOB_PIDS[$repo]=$!   # assumes: declare -A JOB_PIDS
      sleep 0.2                                   # stagger launches to ease rate limits
    • Balance parallelism with GitHub API limits.

  3. Dynamic Repo Selection:

    • Add CLI arguments to select repos dynamically:

      ./production_migration.sh --repos repo-1,repo-2
  4. Enhanced Notifications:

    • Integrate email or Slack notifications via webhooks:

      curl -X POST -H "Content-Type: application/json" \
        -d '{"text":"Migration complete for repo-1"}' "$SLACK_WEBHOOK_URL"
  5. Security Enhancements:

    • Use secret management (e.g., AWS Secrets Manager) for PATs instead of .env.
    • Implement token rotation scripts to refresh PATs periodically.
  6. Monitoring and Metrics:

    • Add metrics (e.g., migration time, success rate) to logs.
    • Integrate with a monitoring tool (e.g., Prometheus) for real-time alerts.
  7. Support for Additional Metadata:

    • Migrate issues, MRs, and wikis using GitLab and GitHub APIs.

      ISSUES=$(curl -s -H "Private-Token: $GITLAB_PAT" "$GITLAB_HOST/api/v4/projects/$repo_id/issues")
      gh issue create --title "$(echo "$issue" | jq -r '.title')" --body "..."
  8. CI/CD Integration:

    • Run scripts in a CI/CD pipeline (e.g., GitHub Actions) for automated scheduling:

      name: Migration
      on:
        schedule:
          - cron: '0 0 * * *'
      jobs:
        migrate:
          runs-on: ubuntu-latest
          steps:
            - uses: actions/checkout@v3
            - run: ./scripts/production_migration.sh
              env:
                GITLAB_PAT: ${{ secrets.GITLAB_PAT }}
                GH_PAT: ${{ secrets.GH_PAT }}
  9. Cross-Platform Support:

    • Test scripts on macOS and Windows (WSL) to ensure portability.
    • Use platform-agnostic commands (e.g., mktemp alternatives).
  10. Documentation:

    • Add script-specific READMEs in scripts/ with usage examples.
    • Create a troubleshooting guide for common errors (e.g., PAT issues).
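
Items 2 and 3 above (parallel processing and dynamic repo selection) could be prototyped together. The sketch below is one possible shape, not a planned implementation: parse_repos and run_parallel are hypothetical names, migrate_repo stands in for the real per-repo work, and jobs run in fixed-size batches rather than a sliding window to keep the logic short.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: select repos from a comma-separated list and migrate
# them in batches. migrate_repo is assumed to be defined elsewhere.

# Split "repo-1,repo-2" into one name per line.
parse_repos() {
  printf '%s\n' "$1" | tr ',' '\n'
}

# Wait for the given PIDs; print how many of them exited non-zero.
count_failures() {
  local pid failed=0
  for pid in "$@"; do
    wait "$pid" || failed=$((failed + 1))
  done
  printf '%d\n' "$failed"
}

# Run migrate_repo for each name on stdin, at most $1 jobs per batch.
# Returns the number of failed jobs (0 when everything succeeded).
run_parallel() {
  local max_jobs="$1" failed=0 repo
  local -a pids=()
  while read -r repo; do
    migrate_repo "$repo" &
    pids+=("$!")
    if (( ${#pids[@]} >= max_jobs )); then
      # Redirect to a descriptor instead of $(...) so wait runs in this shell.
      failed=$((failed + $(wait_batch "${pids[@]}")))
      pids=()
    fi
  done
  failed=$((failed + $(wait_batch "${pids[@]}")))
  return "$failed"
}

# Same as count_failures, but executed in the calling shell via a temp file,
# because wait can only collect children of the current shell.
wait_batch() {
  local tmp pid failed=0
  for pid in "$@"; do
    wait "$pid" || failed=$((failed + 1))
  done
  printf '%d\n' "$failed"
}
```

Note that wait only works on children of the shell that calls it, so the command substitutions above would not see the background jobs; a working variant keeps the counting inline, as in this corrected form of run_parallel:

    run_parallel() {
      local max_jobs="$1" failed=0 repo pid
      local -a pids=()
      while read -r repo; do
        migrate_repo "$repo" &
        pids+=("$!")
        if (( ${#pids[@]} >= max_jobs )); then
          for pid in "${pids[@]}"; do wait "$pid" || failed=$((failed + 1)); done
          pids=()
        fi
      done
      for pid in "${pids[@]}"; do wait "$pid" || failed=$((failed + 1)); done
      return "$failed"
    }

With that definition, parse_repos "repo-1,repo-2" | run_parallel 4 would migrate only the named repos, four at a time. A short sleep between launches, as in the fragment under item 2, could be added if GitHub rate limits become a concern.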

Contributing

Contributions are welcome! To contribute:

  1. Fork the repository.
  2. Create a feature branch: git checkout -b feature-name.
  3. Commit changes: git commit -m "Add feature".
  4. Push to branch: git push origin feature-name.
  5. Open a pull request.

Please include tests and update ../docs/migration_plan.md with changes.

License

This project is licensed under the MIT License. See LICENSE for details.


Last updated by @pxkundu[https://github/pxkundu]: May 19, 2025, 08:45 PM EDT
