20 changes: 20 additions & 0 deletions metadata-ingestion/src/datahub/ingestion/source/aws/glue.py
@@ -337,6 +337,26 @@ class GlueSource(StatefulIngestionSourceBase):
}
```

### Glue Cross-account Access

Glue ingestion supports cross-account access: set the `catalog_id` parameter in the ingestion recipe to the Glue catalog of the target AWS account.
This lets you ingest Glue metadata from other AWS accounts and supports cross-account lineage.
The IAM roles and permissions required for cross-account access must be set up correctly.

Example: There are two AWS accounts, A and B, and A has shared its metadata with B. Account A has a Glue table, tableA.
Ingesting account A with the Glue source creates the dataset tableA in DataHub.
To ingest tableA through account B, set the `catalog_id` parameter in the recipe to A's catalog ID.
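
As a rough sketch of the scenario above, the recipe run from account B could be assembled like this. The account ID, region, and the `build_glue_recipe` helper are hypothetical placeholders, and the example assumes the Glue catalog ID is the owning account's AWS account ID; only `catalog_id`, `platform_instance`, and `aws_region` are meant to correspond to recipe parameters.

```python
import json

# Hypothetical values for illustration only; replace with your own.
ACCOUNT_A_CATALOG_ID = "111111111111"  # assumed: catalog ID of account A, which owns tableA
AWS_REGION = "us-east-1"


def build_glue_recipe(catalog_id=None, platform_instance=None):
    # Hypothetical helper: builds the recipe as a plain dict and adds
    # the optional keys only when a value is supplied.
    config = {"aws_region": AWS_REGION}
    if catalog_id is not None:
        config["catalog_id"] = catalog_id
    if platform_instance is not None:
        config["platform_instance"] = platform_instance
    return {"source": {"type": "glue", "config": config}}


# Recipe run with account B's credentials, pointing at account A's catalog.
recipe_from_b = build_glue_recipe(catalog_id=ACCOUNT_A_CATALOG_ID)
print(json.dumps(recipe_from_b, indent=2))
```

The same dict can be written out as a YAML recipe file; the sink section is omitted here for brevity.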

**Ingestion without platform instance parameter**
- If both catalogs are ingested without the platform instance parameter, DataHub should recognize that the databases and tables are the same.
- DataHub will create a single entity for tableA.
- It should show lineage between Glue and S3.
  S3 must be ingested as a separate source (https://docs.datahub.com/docs/generated/ingestion/sources/s3).

**Ingestion with platform instance parameter**
- DataHub will create separate entities for tableA, because each platform instance results in a different URN path.
- It should show lineage between Glue and S3.
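
The difference between the two modes can be illustrated with a simplified URN builder. This is a sketch only, not DataHub's actual implementation: `sketch_dataset_urn`, the database name `db`, and the instance names are hypothetical, and the URN layout (platform instance prefixed to the dataset name) is an assumption about the convention.

```python
def sketch_dataset_urn(name, platform="glue", platform_instance=None, env="PROD"):
    # Simplified illustration: when a platform instance is set, it is
    # prefixed to the dataset name, which changes the resulting URN.
    if platform_instance:
        name = f"{platform_instance}.{name}"
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


# Without a platform instance, both ingestions yield the same URN,
# so they resolve to a single entity.
without_instance = {
    sketch_dataset_urn("db.tableA"),
    sketch_dataset_urn("db.tableA"),
}

# With distinct platform instances, the URNs differ,
# so each ingestion produces its own entity.
with_instance = {
    sketch_dataset_urn("db.tableA", platform_instance="account_a"),
    sketch_dataset_urn("db.tableA", platform_instance="account_b"),
}

print(len(without_instance))  # 1
print(len(with_instance))  # 2
```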

"""

source_config: GlueSourceConfig