Version Controlling data #11
Deepthi-Chand started this conversation in Ideas
Overview
The DataSpace platform needs a version detection system that automatically determines the appropriate version increment (major, minor, or patch) when resources are updated. This document explains the proposed technical implementation, triggering mechanisms, and version classification logic.
Version Increment Types
The system follows semantic versioning principles (X.Y.Z), where X is the major, Y the minor, and Z the patch component.
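To make the X.Y.Z scheme concrete, here is a minimal sketch of how an increment type could be applied to a version string. The function name `bump_version` and the string-based representation are assumptions for illustration, not part of the proposal; the reset-to-zero behaviour follows standard semantic versioning.

```python
from typing import Literal

Increment = Literal["major", "minor", "patch"]


def bump_version(version: str, increment: Increment) -> str:
    """Return the next X.Y.Z version string for the given increment type.

    Hypothetical helper: the proposal does not specify how versions are stored.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if increment == "major":
        # A major bump resets minor and patch to zero.
        return f"{major + 1}.0.0"
    if increment == "minor":
        # A minor bump resets only the patch component.
        return f"{major}.{minor + 1}.0"
    if increment == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown increment type: {increment!r}")
```

For example, `bump_version("1.4.2", "minor")` gives `"1.5.0"`.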
Triggering Mechanisms
Version changes are automatically triggered by the following events:
Manual trigger (management command create_major_version)
Technical Implementation
Components
Signal Flow
Change Detection Logic
The system uses different strategies based on file type:
CSV/Tabular Files
Major Version triggers:
Minor Version triggers:
Patch Version triggers:
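As a rough illustration of how a tabular comparison could map onto the three levels, the sketch below compares only the header rows of two CSV versions. The rules it uses (a removed column is major, an added column is minor, anything else is patch) are assumptions chosen for the example and may differ from the proposal's actual triggers; `classify_csv_change` is a hypothetical name.

```python
import csv
import io


def classify_csv_change(old_text: str, new_text: str) -> str:
    """Classify a CSV update as 'major', 'minor', or 'patch'.

    Assumed rules (illustrative only):
      - any column removed -> major (existing consumers may break)
      - any column added   -> minor (backwards-compatible extension)
      - otherwise          -> patch (only cell values or rows changed)
    """
    old_rows = list(csv.reader(io.StringIO(old_text)))
    new_rows = list(csv.reader(io.StringIO(new_text)))

    # Treat the first row as the header; an empty file has no columns.
    old_cols = set(old_rows[0]) if old_rows else set()
    new_cols = set(new_rows[0]) if new_rows else set()

    if old_cols - new_cols:
        return "major"
    if new_cols - old_cols:
        return "minor"
    return "patch"
```

A header-only comparison keeps the check cheap for large files; a fuller implementation would presumably also look at row counts and column types.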
JSON Files
Major Version triggers:
Minor Version triggers:
Patch Version triggers:
XML Files
Major Version triggers:
Minor Version triggers:
Patch Version triggers:
Generic Files
For non-structured files, the system uses file size differences:
Major Version triggers:
Minor Version triggers:
Patch Version triggers:
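A size-based classification of this kind could look like the sketch below. The thresholds (a relative change of 50% or more is major, 10% or more is minor) are placeholder assumptions, not values from the proposal, and `classify_size_change` is a hypothetical name.

```python
def classify_size_change(
    old_size: int,
    new_size: int,
    major_threshold: float = 0.5,  # assumed: >= 50% size change is major
    minor_threshold: float = 0.1,  # assumed: >= 10% size change is minor
) -> str:
    """Classify an update to a non-structured file by relative size difference."""
    if old_size == 0:
        # Relative change is undefined for an empty original file:
        # treat any new content as major, and no change as patch.
        return "major" if new_size > 0 else "patch"

    relative_change = abs(new_size - old_size) / old_size
    if relative_change >= major_threshold:
        return "major"
    if relative_change >= minor_threshold:
        return "minor"
    return "patch"
```

Size alone cannot tell a rewritten file from an unchanged one of the same length, so a content hash would be a natural complement when sizes match.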
Technical Details
Dependencies
Performance Considerations
Example
When a CSV resource is updated:
detect_version_change_type loads both versions of the file
Management Commands
The system includes management commands for manual version control:
create_major_version: Force a major version increment for a resource
setup_dvc: Configure DVC repository and remotes
Error Handling
The version detection system includes robust error handling to ensure that:
Future Improvements