The sc-migration-handler project automates the migration of source code repositories from an on-premises GitLab CE instance to a GitHub organization. This project ensures a seamless transition of repository content (commits, branches, and optionally issues/MRs) while maintaining security, efficiency, and traceability. It is designed for organizations moving from self-hosted GitLab to GitHub.com, with a focus on reliability and DevSecOps best practices.
- Project Plan
- Prerequisites and Assumptions
- Technical Setup and Execution
- Expected Outcomes
- DevSecOps Best Practices
- Future Improvements
- Contributing
- License
The sc-migration-handler project was designed to migrate 15 repositories (repo-1 to repo-15) from an on-premises GitLab CE instance (http://localhost:8080, onprem-org group) to a GitHub organization (pxkundu7-org). The plan included the following phases:
- Inventory Collection:
  - Use inventory.sh to generate ../data/repo_inventory.json, listing repo details (path, ID, URL) for repo-1 to repo-15.
  - Example repo_inventory.json:
        [
          {"id": 16, "path": "repo-1", "http_url_to_repo": "http://localhost:8080/onprem-org/repo-1.git"},
          ...
          {"id": 30, "path": "repo-15", "http_url_to_repo": "http://localhost:8080/onprem-org/repo-15.git"}
        ]
- Dry Run:
  - Execute dry_run.sh to simulate migration for a subset of repos (e.g., repo-1 to repo-5) using GitHub Enterprise Importer (GEI) or equivalent logic.
  - Validate repo existence and configuration on GitHub.
- Production Migration:
  - Run production_migration.sh to clone GitLab repos and push content to existing GitHub repos.
  - Handle authentication with GitLab PAT and GitHub PAT, ensuring no user input (e.g., password prompts).
- Post-Migration Tasks:
  - Use post_migration.sh to set GitHub repo visibility to private and notify developers via GitHub issues.
  - Update ../docs/migration_plan.md with migration status.
- Validation:
  - Verify GitHub repos contain all commits, branches, and expected metadata.
  - Ensure developers can access and clone repos.
The project prioritized automation, error handling, and logging to ensure traceability and minimal downtime.
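As a rough illustration of the inventory phase, the sketch below lists a group's projects through the GitLab REST API (`GET /api/v4/groups/:id/projects`) and keeps only the three fields shown in the example repo_inventory.json. It is not the project's actual inventory.sh: the function names `inventory_url` and `fetch_inventory` are hypothetical, and it assumes the group has at most 100 projects so that a single page is enough.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an inventory step; not the project's actual inventory.sh.

# Build the GitLab REST endpoint that lists a group's projects.
# A trailing slash on the host is dropped so the URL has no "//".
inventory_url() {
  local host="$1" group="$2"
  printf '%s/api/v4/groups/%s/projects?per_page=100\n' "${host%/}" "$group"
}

# Fetch the project list and keep only id, path and http_url_to_repo.
# Requires curl and jq; GITLAB_PAT must be set by the caller.
fetch_inventory() {
  local host="$1" group="$2" out="$3" json
  json=$(curl --fail --silent --show-error \
              --header "PRIVATE-TOKEN: ${GITLAB_PAT:?Missing GITLAB_PAT}" \
              "$(inventory_url "$host" "$group")") || return 1
  printf '%s\n' "$json" | jq '[.[] | {id, path, http_url_to_repo}]' > "$out"
}

# Example call (not executed here):
#   fetch_inventory "$GITLAB_HOST" "$SOURCE_ORG" "$DATA_DIR/repo_inventory.json"
```

Capturing the curl output before piping it to jq means an HTTP failure is reported instead of silently producing an empty inventory file.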
- System Requirements:
  - Linux environment (e.g., Ubuntu) with bash, git, curl, and jq installed.
  - GitHub CLI (gh) installed: sudo apt-get install gh.
  - Docker for running GitLab CE locally (if testing).
- GitLab CE Setup:
  - On-premises GitLab CE instance at http://localhost:8080.
  - Group onprem-org (ID: 2) with repos repo-1 to repo-15 (IDs 16-30).
  - GitLab PAT with api, read_repository, write_repository scopes for user group_2_bot_f67ac8ac54c062793d6a1ff0b030e6c8.
- GitHub Setup:
  - GitHub organization pxkundu7-org with empty repos repo-1 to repo-15.
  - GitHub PAT with repo, admin:org scopes.
- Project Files:
  - ../config/.env with:
        SOURCE_ORG=onprem-org
        DEST_ORG=pxkundu7-org
        GITLAB_PAT=glpat-...
        GH_PAT=ghp-...
        GITLAB_HOST=http://localhost:8080
        DATA_DIR=../data
        LOG_DIR=../logs
  - ../data/repo_inventory.json generated by inventory.sh.
- Permissions:
  - Write access to ../data and ../logs directories.
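The configuration requirements above could be checked at script start-up along the following lines. This is a minimal sketch, not code taken from the project: `load_env` and `REQUIRED_VARS` are hypothetical names, and the list of required keys is simply the set of variables shown in the .env example.

```shell
#!/usr/bin/env bash
# Hypothetical helper; load_env and REQUIRED_VARS are illustrative names.

REQUIRED_VARS=(SOURCE_ORG DEST_ORG GITLAB_PAT GH_PAT GITLAB_HOST DATA_DIR LOG_DIR)

# Source a KEY=value file, export its variables, and report every missing key.
# Pass a path containing a slash (e.g. ../config/.env) so "." does not search PATH.
load_env() {
  local env_file="$1" name missing=0
  [ -r "$env_file" ] || { echo "[ERROR] Cannot read $env_file" >&2; return 1; }
  set -a            # export every variable assigned while sourcing
  # shellcheck disable=SC1090
  . "$env_file"
  set +a
  for name in "${REQUIRED_VARS[@]}"; do
    if [ -z "${!name:-}" ]; then
      echo "[ERROR] Missing $name in $env_file" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call (not executed here):
#   load_env ../config/.env || exit 1
```

Reporting all missing keys in one pass, rather than stopping at the first, tends to shorten the fix-and-retry loop when setting up a new environment.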
- GitLab repos have content (commits, branches) generated by generate_repo_history.sh.
- GitHub repos are pre-created and empty, requiring only content updates.
- Network access to http://localhost:8080 and https://github.com.
- Single user (group_2_bot_...) manages GitLab access; GitHub PAT is for an org admin.
- No external dependencies (e.g., third-party APIs) beyond GitLab and GitHub.
- Migration is one-time, with no incremental updates required.
- Clone the Repository:
      git clone https://github.com/pxkundu7/sc-migration-handler.git
      cd sc-migration-handler
- Configure GitLab CE (if testing locally):
      docker run -d --name gitlab -p 8080:80 -v gitlab-data:/var/opt/gitlab gitlab/gitlab-ce:latest
  - Access http://localhost:8080, set up onprem-org, and create repos repo-1 to repo-15.
  - Generate GitLab PAT:
    - Log in as group_2_bot_...
    - Go to User Settings > Access Tokens.
    - Create token with api, read_repository, write_repository.
- Configure .env:
      mkdir -p config data logs
      nano config/.env
  - Add:
        SOURCE_ORG=onprem-org
        DEST_ORG=pxkundu7-org
        GITLAB_PAT=glpat-...
        GH_PAT=ghp-...
        GITLAB_HOST=http://localhost:8080
        DATA_DIR=../data
        LOG_DIR=../logs
  - Set permissions:
        chmod 600 config/.env
        chmod 755 data logs
- Install Dependencies:
      sudo apt-get update
      sudo apt-get install -y jq gh
- Generate Repo Inventory:
      cd scripts
      ./create_repos.sh
      ./generate_repo_history.sh
      ./inventory.sh
  - Verify ../data/repo_inventory.json:
        jq -r '.[].path' ../data/repo_inventory.json
- Verify GitHub Repos:
      source ../config/.env
      gh repo list "$DEST_ORG" --limit 15
  - Ensure pxkundu7-org/repo-1 to repo-15 exist (empty).
- Dry Run (Optional):
      ./dry_run.sh
  - Simulates migration for repo-1 to repo-5.
  - Check logs:
        cat ../logs/dry_run_*.log
- Production Migration:
      ./production_migration.sh
  - Clones GitLab repos and pushes to GitHub.
  - Example output:
        [DEBUG] Loading ../config/.env...
        [INFO] Logging to ../logs/migration_summary.log
        [INFO] Found 15 repositories to migrate
        [DEBUG] Processing repo-1...
        [INFO] Successfully migrated repo-1
        ...
        [INFO] Migration completed. 15/15 repositories migrated successfully.
  - Check logs:
        cat ../logs/migration_summary.log
- Post-Migration Tasks:
      ./post_migration.sh
  - Sets repos to private and creates notification issues.
  - Example output:
        [INFO] Logging to ../logs/post_migration_20250519_*.log
        [INFO] Found 15 repositories for post-migration tasks
        [INFO] Successfully processed repo-1
        ...
        [INFO] Post-migration tasks completed. 15/15 repositories processed successfully.
  - Check logs:
        cat ../logs/post_migration_*.log
- Validate:
      git clone https://github.com/pxkundu7-org/repo-1.git
      cd repo-1
      git log --oneline
      git branch -r
      cd ..
      rm -rf repo-1
      gh issue list --repo pxkundu7-org/repo-1
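The manual validation step can be complemented by comparing branch and tag tips on both sides. The sketch below is one possible approach and is not part of the project's scripts: `list_refs` and `compare_refs` are hypothetical names, and it assumes git can already authenticate to both remotes.

```shell
#!/usr/bin/env bash
# Hypothetical check; list_refs and compare_refs are illustrative names.

# Print "<sha> <ref>" for every branch and tag of a remote, sorted by ref name.
list_refs() {
  git ls-remote --heads --tags "$1" | sort -k2
}

# Compare two saved ref listings; print any differences and
# return non-zero when the listings do not match.
compare_refs() {
  local source_refs="$1" dest_refs="$2"
  if diff "$source_refs" "$dest_refs"; then
    echo "[INFO] Refs match"
  else
    echo "[ERROR] Refs differ" >&2
    return 1
  fi
}

# Example calls (not executed here):
#   list_refs "$GITLAB_HOST/$SOURCE_ORG/repo-1.git"      > /tmp/src.refs
#   list_refs "https://github.com/$DEST_ORG/repo-1.git" > /tmp/dst.refs
#   compare_refs /tmp/src.refs /tmp/dst.refs
```

Because `git ls-remote` reports the commit each ref points to, identical listings indicate that every branch and tag ends at the same commit on GitLab and GitHub, without cloning either repository.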
- Repository Migration:
  - All 15 GitHub repos (pxkundu7-org/repo-1 to repo-15) contain GitLab content (commits, branches).
  - Repos are private and have notification issues for developers.
- Traceability:
  - Logs in ../logs/migration_summary.log, dry_run_*.log, and post_migration_*.log detail every step.
  - ../docs/migration_plan.md records completion timestamps.
- Developer Access:
  - Developers can clone repos and see issues instructing them to update remotes to https://github.com/pxkundu7-org/<repo>.git.
- No Data Loss:
  - All GitLab repo history is preserved in GitHub.
- Automation:
  - Scripts run without user input, with errors logged and handled gracefully.
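One way to achieve prompt-free git operations is to pass the PAT in the remote URL and tell git never to ask on the terminal. The sketch below is an assumption about how this could be done, not a description of the project's scripts: `with_credentials` is a hypothetical helper, and the user names `oauth2` (GitLab) and `x-access-token` (GitHub) are conventions that should be checked against your instance.

```shell
#!/usr/bin/env bash
# Hypothetical helper; with_credentials is an illustrative name.

# Insert "user:token@" after the scheme of an HTTP(S) URL.
with_credentials() {
  local url="$1" user="$2" token="$3"
  local scheme="${url%%://*}" rest="${url#*://}"
  printf '%s://%s:%s@%s\n' "$scheme" "$user" "$token" "$rest"
}

# Make git fail immediately instead of prompting when credentials are rejected.
export GIT_TERMINAL_PROMPT=0

# Example calls (not executed here):
#   src=$(with_credentials "$GITLAB_HOST/$SOURCE_ORG/repo-1.git" oauth2 "$GITLAB_PAT")
#   dst=$(with_credentials "https://github.com/$DEST_ORG/repo-1.git" x-access-token "$GH_PAT")
```

A URL built this way contains the token in clear text, so it should not be echoed to logs, and any temporary clone that stores it in its git config should be deleted after the push.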
The project adheres to DevSecOps principles to ensure secure, efficient, and reliable migration:
- Security:
  - Secure Credentials: Stores GITLAB_PAT and GH_PAT in ../config/.env with 600 permissions.
        chmod 600 ../config/.env
  - Scoped Tokens: Uses a GitLab PAT limited to the api, read_repository, and write_repository scopes and a GitHub PAT limited to repo and admin:org.
  - No Hardcoding: Avoids embedding sensitive data in scripts.
  - Audit Logging: Captures all actions in timestamped logs for security audits.
- Automation:
  - Fully automated scripts (inventory.sh, production_migration.sh, post_migration.sh) eliminate manual steps.
  - Validates prerequisites (tools, auth, files) before execution.
        : "${GITLAB_PAT:?Missing GITLAB_PAT}"
        command -v gh >/dev/null || { echo "[ERROR] GitHub CLI required"; exit 1; }
- Reliability:
  - Error Handling: Skips failed repos and continues migration, tracking failures.
        if ! git clone --mirror "$fixed_url"; then
          echo "[ERROR] Failed to clone $repo_name"
          # Plain arithmetic assignment; ((FAILURES++)) returns non-zero when FAILURES is 0
          FAILURES=$((FAILURES + 1))
          continue
        fi
  - Idempotency: Handles existing GitHub repos by updating content, avoiding duplicates.
  - Validation: Verifies auth and repo existence before actions.
- Traceability:
  - Logs debug, info, and error messages to terminal and files.
        # tee prints to the terminal; -a appends so earlier log entries are kept
        echo "[INFO] Successfully migrated $repo_name" | tee -a "$LOG_FILE"
  - Maintains migration_plan.md for documentation.
- Collaboration:
  - Creates GitHub issues to notify developers, ensuring team awareness.
        gh issue create --repo "$DEST_ORG/$repo_name" \
          --title "Migration Complete: Update Remotes" \
          --body "Update your remotes to: https://github.com/$DEST_ORG/$repo_name.git"
- Testing:
  - Includes dry_run.sh to simulate migration for a subset, reducing risk.
  - Supports local GitLab CE via Docker for testing.
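The reliability and traceability practices above can be combined in a small driver loop. The following is a sketch under stated assumptions rather than the project's implementation: `log` and `migrate_all` are hypothetical names, and the per-repo work is passed in as a command so the loop can be exercised without touching GitLab or GitHub.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; log and migrate_all are illustrative names.

LOG_FILE="${LOG_FILE:-/dev/null}"

# Print a timestamped, levelled message and append it to LOG_FILE.
log() {
  local level="$1"; shift
  printf '%s [%s] %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$level" "$*" | tee -a "$LOG_FILE"
}

# Run "<command> <repo>" for each repo; keep going after a failure.
# Usage: migrate_all <command> <repo>...
# Returns non-zero if at least one repo failed.
migrate_all() {
  local migrate_cmd="$1"; shift
  local repo total=$# failures=0
  for repo in "$@"; do
    if "$migrate_cmd" "$repo"; then
      log INFO "Successfully migrated $repo"
    else
      log ERROR "Failed to migrate $repo"
      failures=$((failures + 1))
    fi
  done
  log INFO "Migration completed. $((total - failures))/$total repositories migrated successfully."
  [ "$failures" -eq 0 ]
}

# Example call (not executed here), with migrate_one defined elsewhere:
#   LOG_FILE=../logs/migration_summary.log migrate_all migrate_one repo-1 repo-2
```

Returning a non-zero status when any repository fails lets a calling pipeline detect partial migrations while the log still records every individual outcome.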
To make the project more efficient, dynamic, and robust, consider:
- Incremental Migrations:
  - Add support for incremental updates (e.g., new commits post-migration) using git fetch and git push.
  - Implement a --incremental flag in production_migration.sh.
- Parallel Processing:
  - Optimize production_migration.sh for parallel cloning/pushing with rate limiting:
        migrate_repo() { ... }
        migrate_repo "$repo" & JOB_PIDS[$repo]=$!; sleep 0.2
  - Balance parallelism with GitHub API limits.
- Dynamic Repo Selection:
  - Add CLI arguments to select repos dynamically:
        ./production_migration.sh --repos repo-1,repo-2
- Enhanced Notifications:
  - Integrate email or Slack notifications via webhooks:
        curl -X POST -H "Content-Type: application/json" \
          -d '{"text":"Migration complete for repo-1"}' "$SLACK_WEBHOOK_URL"
- Security Enhancements:
  - Use secret management (e.g., AWS Secrets Manager) for PATs instead of .env.
  - Implement token rotation scripts to refresh PATs periodically.
- Monitoring and Metrics:
  - Add metrics (e.g., migration time, success rate) to logs.
  - Integrate with a monitoring tool (e.g., Prometheus) for real-time alerts.
- Support for Additional Metadata:
  - Migrate issues, MRs, and wikis using GitLab and GitHub APIs.
        ISSUES=$(curl -s -H "Private-Token: $GITLAB_PAT" "$GITLAB_HOST/api/v4/projects/$repo_id/issues")
        gh issue create --title "$(echo "$issue" | jq -r '.title')" --body "..."
- CI/CD Integration:
  - Run scripts in a CI/CD pipeline (e.g., GitHub Actions) for automated scheduling:
        name: Migration
        on:
          schedule:
            - cron: '0 0 * * *'
        jobs:
          migrate:
            runs-on: ubuntu-latest
            steps:
              - uses: actions/checkout@v3
              - run: ./scripts/production_migration.sh
                env:
                  GITLAB_PAT: ${{ secrets.GITLAB_PAT }}
                  GH_PAT: ${{ secrets.GH_PAT }}
- Cross-Platform Support:
  - Test scripts on macOS and Windows (WSL) to ensure portability.
  - Use platform-agnostic commands (e.g., mktemp alternatives).
- Documentation:
  - Add script-specific READMEs in scripts/ with usage examples.
  - Create a troubleshooting guide for common errors (e.g., PAT issues).
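The proposed --repos and --incremental options could be parsed roughly as follows. Neither flag exists in the current scripts, so this is only a sketch of one possible design: `parse_args`, `is_selected`, `SELECTED_REPOS`, and `INCREMENTAL` are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; neither flag exists in the current scripts.

INCREMENTAL=0
SELECTED_REPOS=()

# Parse "--repos a,b,c" and "--incremental" into the globals above.
parse_args() {
  INCREMENTAL=0
  SELECTED_REPOS=()
  while [ $# -gt 0 ]; do
    case "$1" in
      --repos)
        [ $# -ge 2 ] || { echo "[ERROR] --repos needs a value" >&2; return 1; }
        # Split the comma-separated value into an array.
        IFS=',' read -r -a SELECTED_REPOS <<< "$2"
        shift 2
        ;;
      --incremental)
        INCREMENTAL=1
        shift
        ;;
      *)
        echo "[ERROR] Unknown option: $1" >&2
        return 1
        ;;
    esac
  done
}

# Succeed if the repo should be migrated; no --repos means "all repos".
is_selected() {
  local repo="$1" candidate
  [ "${#SELECTED_REPOS[@]}" -eq 0 ] && return 0
  for candidate in "${SELECTED_REPOS[@]}"; do
    [ "$candidate" = "$repo" ] && return 0
  done
  return 1
}

# Example use inside a migration loop (not executed here):
#   parse_args "$@" || exit 1
#   is_selected "$repo_name" || continue
```

Treating an absent --repos as "migrate everything" keeps the current behaviour as the default, so existing invocations of production_migration.sh would not need to change.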
Contributions are welcome! To contribute:
- Fork the repository.
- Create a feature branch: git checkout -b feature-name.
- Commit changes: git commit -m "Add feature".
- Push to branch: git push origin feature-name.
- Open a pull request.
Please include tests and update ../docs/migration_plan.md with changes.
This project is licensed under the MIT License. See LICENSE for details.
Last updated by @pxkundu[https://github/pxkundu]: May 19, 2025, 08:45 PM EDT