dbt: create new env variable override for local development on prod external tables #4187
base: main
Terraform plan in iac/cal-itp-data-infra-staging/airflow/us
Plan: 0 to add, 3 to change, 0 to destroy.
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
!~ update in-place
Terraform will perform the following actions:
# google_storage_bucket_object.calitp-staging-composer-catalog will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-staging-composer-catalog" {
!~ content = (sensitive value)
!~ crc32c = "scSVQA==" -> (known after apply)
!~ detect_md5hash = "LzB7tkV9giBECRASvwDTJg==" -> "different hash"
!~ generation = 1758060434829105 -> (known after apply)
id = "calitp-staging-composer-data/warehouse/target/catalog.json"
!~ md5hash = "LzB7tkV9giBECRASvwDTJg==" -> (known after apply)
name = "data/warehouse/target/catalog.json"
# (16 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-staging-composer-dags["dbt_project.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-staging-composer-dags" {
!~ crc32c = "41/WGA==" -> (known after apply)
!~ detect_md5hash = "tXa6XCOXKda0qRNtC+2eAg==" -> "different hash"
!~ generation = 1758060092636212 -> (known after apply)
id = "calitp-staging-composer-data/warehouse/dbt_project.yml"
!~ md5hash = "tXa6XCOXKda0qRNtC+2eAg==" -> (known after apply)
name = "data/warehouse/dbt_project.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-staging-composer-manifest will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-staging-composer-manifest" {
!~ content = (sensitive value)
!~ crc32c = "LrVEZQ==" -> (known after apply)
!~ detect_md5hash = "nv12yYcJcck2UWfr3ppWyA==" -> "different hash"
!~ generation = 1758060435641990 -> (known after apply)
id = "calitp-staging-composer-data/warehouse/target/manifest.json"
!~ md5hash = "nv12yYcJcck2UWfr3ppWyA==" -> (known after apply)
name = "data/warehouse/target/manifest.json"
# (16 unchanged attributes hidden)
}
Plan: 0 to add, 3 to change, 0 to destroy.
📝 Plan generated in Plan Terraform for Warehouse and DAG changes #640
Terraform plan in iac/cal-itp-data-infra/airflow/us
Plan: 0 to add, 22 to change, 0 to destroy.
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
!~ update in-place
Terraform will perform the following actions:
# google_storage_bucket_object.calitp-composer-catalog will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-catalog" {
!~ content = (sensitive value)
!~ crc32c = "Tev42g==" -> (known after apply)
!~ detect_md5hash = "WH9/csQm9d1jFHYJ9tDp6w==" -> "different hash"
!~ generation = 1757541552203979 -> (known after apply)
id = "calitp-composer-data/warehouse/target/catalog.json"
!~ md5hash = "WH9/csQm9d1jFHYJ9tDp6w==" -> (known after apply)
name = "data/warehouse/target/catalog.json"
# (16 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-dags["dbt_project.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-dags" {
!~ crc32c = "XPuwJQ==" -> (known after apply)
!~ detect_md5hash = "ZVQqwNQ/pizS7TrWVGpWrA==" -> "different hash"
!~ generation = 1755538683311062 -> (known after apply)
id = "calitp-composer-data/warehouse/dbt_project.yml"
!~ md5hash = "ZVQqwNQ/pizS7TrWVGpWrA==" -> (known after apply)
name = "data/warehouse/dbt_project.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-dags["models/_source_gtfs_schedule_history.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-dags" {
!~ crc32c = "BXdZpA==" -> (known after apply)
!~ detect_md5hash = "HVHCS36vhuXW2Wdk8hBKlg==" -> "different hash"
!~ generation = 1751416662709951 -> (known after apply)
id = "calitp-composer-data/warehouse/models/_source_gtfs_schedule_history.yml"
!~ md5hash = "HVHCS36vhuXW2Wdk8hBKlg==" -> (known after apply)
name = "data/warehouse/models/_source_gtfs_schedule_history.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-dags["models/staging/amplitude/_amplitude.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-dags" {
!~ crc32c = "5ai1mg==" -> (known after apply)
!~ detect_md5hash = "CCXiffBEEPZ5HLszdyijvw==" -> "different hash"
!~ generation = 1751416666866881 -> (known after apply)
id = "calitp-composer-data/warehouse/models/staging/amplitude/_amplitude.yml"
!~ md5hash = "CCXiffBEEPZ5HLszdyijvw==" -> (known after apply)
name = "data/warehouse/models/staging/amplitude/_amplitude.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-dags["models/staging/audit/_src_audit.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-dags" {
!~ crc32c = "yeSEyg==" -> (known after apply)
!~ detect_md5hash = "LP1avYyKTuGYkgdiP9r8GA==" -> "different hash"
!~ generation = 1751416661148830 -> (known after apply)
id = "calitp-composer-data/warehouse/models/staging/audit/_src_audit.yml"
!~ md5hash = "LP1avYyKTuGYkgdiP9r8GA==" -> (known after apply)
name = "data/warehouse/models/staging/audit/_src_audit.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-dags["models/staging/gtfs/_src_gtfs_rt_external_tables.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-dags" {
!~ crc32c = "irg5NA==" -> (known after apply)
!~ detect_md5hash = "VMjLStCSrjbmN/SVzNZsKA==" -> "different hash"
!~ generation = 1751416666081058 -> (known after apply)
id = "calitp-composer-data/warehouse/models/staging/gtfs/_src_gtfs_rt_external_tables.yml"
!~ md5hash = "VMjLStCSrjbmN/SVzNZsKA==" -> (known after apply)
name = "data/warehouse/models/staging/gtfs/_src_gtfs_rt_external_tables.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-dags["models/staging/gtfs/_src_gtfs_schedule_external_tables.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-dags" {
!~ crc32c = "JRuXXA==" -> (known after apply)
!~ detect_md5hash = "Caqsk8kIhrzLrYxkwYo53g==" -> "different hash"
!~ generation = 1751416666931203 -> (known after apply)
id = "calitp-composer-data/warehouse/models/staging/gtfs/_src_gtfs_schedule_external_tables.yml"
!~ md5hash = "Caqsk8kIhrzLrYxkwYo53g==" -> (known after apply)
name = "data/warehouse/models/staging/gtfs/_src_gtfs_schedule_external_tables.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-dags["models/staging/hqta/_hqta.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-dags" {
!~ crc32c = "Uuiodw==" -> (known after apply)
!~ detect_md5hash = "E8kXOfywGmTU9ixt2SEdjw==" -> "different hash"
!~ generation = 1751416667415993 -> (known after apply)
id = "calitp-composer-data/warehouse/models/staging/hqta/_hqta.yml"
!~ md5hash = "E8kXOfywGmTU9ixt2SEdjw==" -> (known after apply)
name = "data/warehouse/models/staging/hqta/_hqta.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-dags["models/staging/kuba/_src.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-dags" {
!~ crc32c = "VFjYpg==" -> (known after apply)
!~ detect_md5hash = "d+eUGgQXAgoXWUQWdd4hHg==" -> "different hash"
!~ generation = 1755538683298627 -> (known after apply)
id = "calitp-composer-data/warehouse/models/staging/kuba/_src.yml"
!~ md5hash = "d+eUGgQXAgoXWUQWdd4hHg==" -> (known after apply)
name = "data/warehouse/models/staging/kuba/_src.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-dags["models/staging/ntd_annual_reporting/_src.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-dags" {
!~ crc32c = "6xrw7Q==" -> (known after apply)
!~ detect_md5hash = "u1GUGr+qHvrdM3CByD7P5g==" -> "different hash"
!~ generation = 1751416664560138 -> (known after apply)
id = "calitp-composer-data/warehouse/models/staging/ntd_annual_reporting/_src.yml"
!~ md5hash = "u1GUGr+qHvrdM3CByD7P5g==" -> (known after apply)
name = "data/warehouse/models/staging/ntd_annual_reporting/_src.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-dags["models/staging/ntd_assets/_src.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-dags" {
!~ crc32c = "xZy+xA==" -> (known after apply)
!~ detect_md5hash = "D87Dg9H2Ttxydrw5TCb+kw==" -> "different hash"
!~ generation = 1751416663042995 -> (known after apply)
id = "calitp-composer-data/warehouse/models/staging/ntd_assets/_src.yml"
!~ md5hash = "D87Dg9H2Ttxydrw5TCb+kw==" -> (known after apply)
name = "data/warehouse/models/staging/ntd_assets/_src.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-dags["models/staging/ntd_funding_and_expenses/_src.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-dags" {
!~ crc32c = "Rk9vbA==" -> (known after apply)
!~ detect_md5hash = "1oG4kAhIV+IUFXaMzCafdg==" -> "different hash"
!~ generation = 1751416666040352 -> (known after apply)
id = "calitp-composer-data/warehouse/models/staging/ntd_funding_and_expenses/_src.yml"
!~ md5hash = "1oG4kAhIV+IUFXaMzCafdg==" -> (known after apply)
name = "data/warehouse/models/staging/ntd_funding_and_expenses/_src.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-dags["models/staging/ntd_ridership/_src.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-dags" {
!~ crc32c = "vxfhWw==" -> (known after apply)
!~ detect_md5hash = "tIKyKSkVeHX6Hy5dMmfDkg==" -> "different hash"
!~ generation = 1751416665043819 -> (known after apply)
id = "calitp-composer-data/warehouse/models/staging/ntd_ridership/_src.yml"
!~ md5hash = "tIKyKSkVeHX6Hy5dMmfDkg==" -> (known after apply)
name = "data/warehouse/models/staging/ntd_ridership/_src.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-dags["models/staging/ntd_safety_and_security/_src.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-dags" {
!~ crc32c = "pFHMCQ==" -> (known after apply)
!~ detect_md5hash = "VdG5ha4mT+RoWi4UQOT35Q==" -> "different hash"
!~ generation = 1758049886357399 -> (known after apply)
id = "calitp-composer-data/warehouse/models/staging/ntd_safety_and_security/_src.yml"
!~ md5hash = "VdG5ha4mT+RoWi4UQOT35Q==" -> (known after apply)
name = "data/warehouse/models/staging/ntd_safety_and_security/_src.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-dags["models/staging/ntd_validation/_src_api_externaltable.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-dags" {
!~ crc32c = "aVRpvA==" -> (known after apply)
!~ detect_md5hash = "V377ulINuAVHuOmUpO696g==" -> "different hash"
!~ generation = 1751416666842087 -> (known after apply)
id = "calitp-composer-data/warehouse/models/staging/ntd_validation/_src_api_externaltable.yml"
!~ md5hash = "V377ulINuAVHuOmUpO696g==" -> (known after apply)
name = "data/warehouse/models/staging/ntd_validation/_src_api_externaltable.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-dags["models/staging/payments/elavon/_elavon.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-dags" {
!~ crc32c = "K73NkQ==" -> (known after apply)
!~ detect_md5hash = "F3n1bpsPdvlWZzdDllkL5g==" -> "different hash"
!~ generation = 1751416668573349 -> (known after apply)
id = "calitp-composer-data/warehouse/models/staging/payments/elavon/_elavon.yml"
!~ md5hash = "F3n1bpsPdvlWZzdDllkL5g==" -> (known after apply)
name = "data/warehouse/models/staging/payments/elavon/_elavon.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-dags["models/staging/payments/littlepay/_littlepay.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-dags" {
!~ crc32c = "z0l1MQ==" -> (known after apply)
!~ detect_md5hash = "pQ3Or9N/wSYHXt235de9Zw==" -> "different hash"
!~ generation = 1751416662768513 -> (known after apply)
id = "calitp-composer-data/warehouse/models/staging/payments/littlepay/_littlepay.yml"
!~ md5hash = "pQ3Or9N/wSYHXt235de9Zw==" -> (known after apply)
name = "data/warehouse/models/staging/payments/littlepay/_littlepay.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-dags["models/staging/payments/littlepay_v3/_littlepay_v3.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-dags" {
!~ crc32c = "OLGd3Q==" -> (known after apply)
!~ detect_md5hash = "CT1Is41WF82GxtvZlgAPfw==" -> "different hash"
!~ generation = 1751416665701155 -> (known after apply)
id = "calitp-composer-data/warehouse/models/staging/payments/littlepay_v3/_littlepay_v3.yml"
!~ md5hash = "CT1Is41WF82GxtvZlgAPfw==" -> (known after apply)
name = "data/warehouse/models/staging/payments/littlepay_v3/_littlepay_v3.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-dags["models/staging/rt/_src_gtfs_rt_external_tables.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-dags" {
!~ crc32c = "SwYd1A==" -> (known after apply)
!~ detect_md5hash = "Z8wuwg61jF+m3F4tDyOR9Q==" -> "different hash"
!~ generation = 1751416664716660 -> (known after apply)
id = "calitp-composer-data/warehouse/models/staging/rt/_src_gtfs_rt_external_tables.yml"
!~ md5hash = "Z8wuwg61jF+m3F4tDyOR9Q==" -> (known after apply)
name = "data/warehouse/models/staging/rt/_src_gtfs_rt_external_tables.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-dags["models/staging/state_geoportal/_src.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-dags" {
!~ crc32c = "wL4kuQ==" -> (known after apply)
!~ detect_md5hash = "RurA8+fnfohppuRhvXSA1w==" -> "different hash"
!~ generation = 1751416661849873 -> (known after apply)
id = "calitp-composer-data/warehouse/models/staging/state_geoportal/_src.yml"
!~ md5hash = "RurA8+fnfohppuRhvXSA1w==" -> (known after apply)
name = "data/warehouse/models/staging/state_geoportal/_src.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-dags["models/staging/transit_database/_src_airtable.yml"] will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-dags" {
!~ crc32c = "XoDIpQ==" -> (known after apply)
!~ detect_md5hash = "L1F0Z7s3z9BVOE64Fg/zDg==" -> "different hash"
!~ generation = 1751416665331439 -> (known after apply)
id = "calitp-composer-data/warehouse/models/staging/transit_database/_src_airtable.yml"
!~ md5hash = "L1F0Z7s3z9BVOE64Fg/zDg==" -> (known after apply)
name = "data/warehouse/models/staging/transit_database/_src_airtable.yml"
# (17 unchanged attributes hidden)
}
# google_storage_bucket_object.calitp-composer-manifest will be updated in-place
!~ resource "google_storage_bucket_object" "calitp-composer-manifest" {
!~ content = (sensitive value)
!~ crc32c = "0PcBgA==" -> (known after apply)
!~ detect_md5hash = "TuukowzlGluvU+BnYGmOkA==" -> "different hash"
!~ generation = 1757541553964048 -> (known after apply)
id = "calitp-composer-data/warehouse/target/manifest.json"
!~ md5hash = "TuukowzlGluvU+BnYGmOkA==" -> (known after apply)
name = "data/warehouse/target/manifest.json"
# (16 unchanged attributes hidden)
}
Plan: 0 to add, 22 to change, 0 to destroy.
📝 Plan generated in Plan Terraform for Warehouse and DAG changes #640
Not a formal reviewer but since we discussed offline with Erika, I took a look -- just one comment:
7bb0573 to 4ba7cae
Terraform plan in iac/cal-itp-data-infra/composer/us
No changes. Your infrastructure matches the configuration.
📝 Plan generated in Plan Terraform for Warehouse and DAG changes #640
5f26ba9 to aaf0c29
@lauriemerrell |
Except for cloudaudit and information_schema. [#4188]
aaf0c29 to 947ec58
Charlie, I did some tests and made some changes:
✅ I added some comments to the existing variables in dbt_project.yml
✅ Local testing wasn't working at first, so I had to invert the order of the variables, and then it worked. We can run with --vars:
✅ Pushed the change to Airflow Staging and run
❌ Tested on Airflow Staging adding
@erikamov sorry, maybe something changed or I linked the wrong line. I meant to link: https://github.com/cal-itp/data-infra/blob/main/warehouse/dbt_project.yml#L31 -- basically, I think we need to document how this new variable interacts with the existing variable that already points at a project. It looks like the comments in
@@ -3,7 +3,7 @@ version: 2
 sources:
   - name: external_gtfs_rt
     description: Hive-partitioned external tables reading GTFS RT data and validation errors from GCS.
-    database: "{{ env_var('GOOGLE_CLOUD_PROJECT', var('GOOGLE_CLOUD_PROJECT')) }}"
+    database: "{{ env_var('GOOGLE_CLOUD_PROJECT', var('EXTERNAL_TABLE_SOURCE')) }}"
The PR description says that GTFS RT data is meant to be excluded from this change. Won't this update mean that it applies to GTFS-RT too? We might remove this change so that GTFS-RT is excluded, or update the PR description to be clear that it affects all sources.
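For context on the lookup in the changed line: dbt's `env_var(name, default)` returns the environment variable when it is set and the default otherwise, so the RT source's `database` comes from `GOOGLE_CLOUD_PROJECT` if present and from the `EXTERNAL_TABLE_SOURCE` var if not. A rough shell analogue of that precedence follows; `resolve_database` is a hypothetical helper and the project names are placeholders, not values from the repo.

```shell
#!/bin/sh
# Hypothetical helper mirroring
#   env_var('GOOGLE_CLOUD_PROJECT', var('EXTERNAL_TABLE_SOURCE'))
# $1 stands in for the dbt var EXTERNAL_TABLE_SOURCE (the fallback value).
resolve_database() {
  # ${VAR-default} substitutes the default only when VAR is unset,
  # which is how the default argument of env_var behaves.
  printf '%s\n' "${GOOGLE_CLOUD_PROJECT-$1}"
}

# Environment variable unset: the dbt var is used, so the RT source follows it too.
unset GOOGLE_CLOUD_PROJECT
resolve_database "example-prod-project"

# Environment variable set: it takes precedence over the dbt var.
GOOGLE_CLOUD_PROJECT="example-staging-project"
resolve_database "example-prod-project"
```

Under this reading, whether GTFS-RT is affected depends on whether `GOOGLE_CLOUD_PROJECT` is set in the environment where dbt runs.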
Description
During our infrastructure refactor, we determined the need to preserve local warehouse development on production external tables as sources of truth WITH THE EXCEPTION OF RT EXTERNAL TABLES. This PR introduces the environment variable DEV_SOURCE_GOOGLE_CLOUD_PROJECT which, when set locally, allows dbt to reference production external tables at runtime WITH THE EXCEPTION OF RT EXTERNAL TABLES.
Resolves: #4188
Type of change
How has this been tested?
Post-merge follow-ups
To use the production external tables in local warehouse development, the following line needs to be added to your environment:
export DEV_SOURCE_GOOGLE_CLOUD_PROJECT='cal-itp-data-infra'
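If you would rather not export the variable for the whole session, one option is to scope it to a single command. The sketch below uses `printenv` as a stand-in for the dbt invocation (an assumption for illustration; substitute your real command) to show that the value is visible to that one child process only.

```shell
#!/bin/sh
# Start from a clean state so the example is predictable.
unset DEV_SOURCE_GOOGLE_CLOUD_PROJECT

# Scope the override to one command; printenv stands in for the dbt command here.
DEV_SOURCE_GOOGLE_CLOUD_PROJECT='cal-itp-data-infra' printenv DEV_SOURCE_GOOGLE_CLOUD_PROJECT

# Later commands in the same shell do not see the override.
printenv DEV_SOURCE_GOOGLE_CLOUD_PROJECT || echo "DEV_SOURCE_GOOGLE_CLOUD_PROJECT is not set"
```

This keeps the default sources in effect for every other command run from the same shell.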