Skip to content

Implement encryption and shard management functionalities #47

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

unniznd
Copy link
Contributor

@unniznd unniznd commented Jul 25, 2025

User description

What have I changed

Implemented all the encryption function

Why have I changed

All encryption function is missing

How to test it

Test all the files in test_encryption folder and test_upload_encrypted.py. Make sure you own the cid in the testing files else Access Denied error will come.


PR Type

Enhancement


Description

  • Implement comprehensive encryption and key management system

  • Add Shamir's secret sharing for key sharding and recovery

  • Integrate encrypted file upload with shard distribution

  • Provide access control and ownership transfer capabilities


Diagram Walkthrough

flowchart LR
  A["File Upload"] --> B["Generate Master Key"]
  B --> C["Encrypt File"]
  C --> D["Create Key Shards"]
  D --> E["Save Shards to Nodes"]
  E --> F["Upload Encrypted File"]
  G["Access Control"] --> H["Share to Address"]
  G --> I["Revoke Access"]
  G --> J["Transfer Ownership"]
  K["Key Recovery"] --> L["Collect Shards"]
  L --> M["Reconstruct Master Key"]
Loading

File Walkthrough

Relevant files

- Added save_shards.py to handle saving key shards to multiple nodes with CID validation and error handling.
- Introduced share_to_address.py for sharing access to addresses with CID validation.
- Created shared_key.py for generating key shards using Shamir's secret sharing scheme.
- Developed transfer_ownership.py to manage ownership transfers of encrypted data with CID validation.
- Implemented utility functions in utils.py for CID validation, object comparison, and API request handling.
- Added upload_encrypted.py for uploading encrypted files with key generation and shard saving.
- Created comprehensive unit tests for encryption, shard management, and ownership transfer functionalities.
- Ensured robust error handling and validation across all new functionalities.
Copy link

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 PR contains tests
🔒 Security concerns

Sensitive information exposure:
The code in tests/test_encryption/test_get_jwt.py (lines 19-22) and other test files contains a hardcoded private key '0x8218aa5dbf4dbec243142286b93e26af521b3e91219583595a06a7765abc9c8b'. Even if this is just a test key, it's a security risk to include private keys directly in the code. These should be stored in environment variables or secure configuration files.

⚡ Recommended focus areas for review

Error Handling

The api_node_handler function has nested exception handling that might mask important errors. The inner exception is only raised if it doesn't contain "fetch", which could lead to silent failures or misleading error messages.

except Exception as error:
    if "fetch" not in str(error):
        raise
    if attempt == retry_count - 1:
        raise
Regex Parsing

The code uses regex to parse JSON from response text instead of using the response.json() method directly, which could lead to parsing errors or unexpected behavior.

response_text = response.text
import re
match = re.search(r'\[.*\]$', response_text, re.DOTALL)
if not match:
    raise Exception('No JSON array found in response')
Hard-coded Values

The function validates that key_shards must be exactly 5 objects, but this value should be configurable to match the key_count parameter used in key generation functions.

if not isinstance(key_shards, list) or len(key_shards) != 5:
    return {
        "isSuccess": False,
        "error": "keyShards must be an array of 5 objects"
    }

Copy link

codiumai-pr-agent-free bot commented Jul 25, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Security
Add authentication data to encryption

The function is using None as the associated data parameter in AES-GCM
encryption, which reduces security. AES-GCM should include associated data (AAD)
for authentication to prevent tampering. Consider using a fixed string or
metadata as AAD.

src/lighthouseweb3/functions/upload_encrypted.py [48-74]

 def encrypt_file(file_data: bytes, password: str) -> bytes:
     try:
         # Convert password to bytes
         password_bytes = password.encode('utf-8')
 
         # Generate random salt and IV
         salt = secrets.token_bytes(16)
         iv = secrets.token_bytes(12)
 
         # Derive key using PBKDF2
         kdf = PBKDF2HMAC(
             algorithm=hashes.SHA256(),
             length=32,  # 256-bit key for AES-GCM
             salt=salt,
             iterations=250000,
             backend=default_backend()
         )
         aes_key = kdf.derive(password_bytes)
 
-        # Encrypt the file data using AES-GCM
+        # Use metadata as associated data for authentication
+        aad = b"lighthouse-encrypted-file"
+        
+        # Encrypt the file data using AES-GCM with AAD
         aesgcm = AESGCM(aes_key)
-        cipher_bytes = aesgcm.encrypt(iv, file_data, None)
+        cipher_bytes = aesgcm.encrypt(iv, file_data, aad)
 
         # Combine salt, IV, and cipher bytes
         result_bytes = salt + iv + cipher_bytes
 
         return result_bytes
  • Apply / Chat
Suggestion importance[1-10]: 8

__

Why: The suggestion correctly points out that using Associated Authenticated Data (AAD) with AES-GCM is a security best practice to prevent certain attacks, thus improving the encryption's integrity.

Medium
Possible issue
Handle empty API responses

The function assumes response will always be a list with at least one element.
If the API returns an empty list or a different structure, accessing
response[0]['message'] will cause an IndexError or KeyError. Add validation to
handle potential empty responses.

src/lighthouseweb3/functions/encryption/get_auth_message.py [5-10]

 def get_auth_message(address: str) -> dict[str, Any]:
   try:
     response = api_node_handler(f"/api/message/{address}", "GET")
-    return {'message': response[0]['message'], 'error': None}
+    if response and isinstance(response, list) and len(response) > 0 and 'message' in response[0]:
+      return {'message': response[0]['message'], 'error': None}
+    return {'message': None, 'error': 'Invalid response format'}
   except Exception as e:
     return {'message': None, 'error':str(e)}
  • Apply / Chat
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies that response[0] could raise an error if the API returns an empty list, and the proposed validation makes the function more robust.

Medium
Fix URL construction logic

The ternary expression for URL construction is using a logical operator (and)
instead of a conditional operator, which can lead to unexpected behavior. If
Config.is_dev is True but Config.lighthouse_bls_node_dev evaluates to a falsy
value, the expression will return that falsy value instead of the intended URL.

src/lighthouseweb3/functions/encryption/utils.py [22-34]

 def api_node_handler(
     endpoint: str,
     verb: str,
     auth_token: str = "",
     body: Any = None,
     retry_count: int = 3
 ) -> Dict[str, Any]:
 
-    url = f"{Config.is_dev and Config.lighthouse_bls_node_dev or Config.lighthouse_bls_node}{endpoint}"
+    base_url = Config.lighthouse_bls_node_dev if Config.is_dev else Config.lighthouse_bls_node
+    url = f"{base_url}{endpoint}"
     headers = {
         "Content-Type": "application/json",
         "Authorization": f"Bearer {auth_token}" if auth_token else ""
     }
  • Apply / Chat
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies that using and/or for conditional assignment is not idiomatic and can be error-prone, proposing a clearer and safer ternary operator for URL construction.

Medium
  • Update

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant