@DavidHuber-NOAA DavidHuber-NOAA commented Oct 31, 2025

Description

This updates the staging YAMLs to point to the EE2-standardized filenames in the ICSDIR.

Resolves #4173

Type of change

  • Bug fix (fixes something broken)
  • New feature (adds functionality)
  • Maintenance (code refactor, clean-up, new CI test, etc.)

Change characteristics

  • Is this a breaking change (a change in existing functionality)? YES: Old experiments will not be able to restart from old filenames.
  • Does this change require a documentation update? NO
  • Does this change require an update to any of the following submodules? NO

How has this been tested?

Will test staging and first cycle forecast jobs on all platforms.

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have documented my code, including function, input, and output descriptions
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • This change is covered by an existing CI test or a new one has been added
  • Any new scripts have been added to the .github/CODEOWNERS file with owners
  • I have made corresponding changes to the system documentation if necessary

@@ -0,0 +1,421 @@
#!/bin/bash
Contributor

It might be good to have a script like this that isn't specific to the IC folders but more general. For example if someone downloads an old experiment or GFSv16 (although that has other complications now I guess).

I'm currently making a new -> old script so I can test if we can reproduce from HPSS or not. Not sure if this PR is ready to be used or not, so I'm doing it kind of as I run into errors, which is probably not the best way to go about it. I just found this so I'll probably use some of what's here to add to that.

@DavidHuber-NOAA DavidHuber-NOAA marked this pull request as ready for review November 7, 2025 14:02
DavidHuber-NOAA commented Nov 7, 2025

Opening this PR for review. All stage and forecast jobs ran successfully on Ursa. Once I have approvals, I will sync the links in the ICSDIR from Ursa to all other platforms and begin extended testing.

{% for itile in range(6) %}
{% for source_ftype, dest_ftype in
[
('atminc', 'jedi_increment.atm.i006'),
Contributor

Should there be some sort of if statement here and be able to handle either the old or the new names?

Current operations and many old experiments have the old names. We're going to need to run experiments from those ICs. That either needs to be handled here or a script to link to the new names from the old names for generic cases should be provided in this PR.
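To make the suggestion above concrete, here is a minimal sketch (not code from this PR) of a shell helper that prefers the new name and falls back to the old one. The function name `resolve_ic_file` and the filenames in the usage comment are hypothetical; the real staging logic lives in the Jinja-templated YAMLs and would need an equivalent conditional there.

```shell
#!/bin/bash
# Hypothetical helper (not part of the PR): print the path of the first
# candidate file that exists, preferring the new name over the old one.
# Usage: resolve_ic_file <dir> <new_name> <old_name>
resolve_ic_file() {
    local dir="$1" new_name="$2" old_name="$3"
    local candidate
    for candidate in "${new_name}" "${old_name}"; do
        if [[ -e "${dir}/${candidate}" ]]; then
            echo "${dir}/${candidate}"
            return 0
        fi
    done
    # Neither naming convention is present: report on stderr and fail
    echo "ERROR: neither ${new_name} nor ${old_name} found in ${dir}" >&2
    return 1
}

# Example call (placeholder directory and filenames):
#   src=$(resolve_ic_file "/path/to/ics" "new_name.nc" "old_name.nc") || exit 1
```

Preferring the new name means that directories which already carry both names (for example, old files plus links) resolve to the new name, while untouched old directories still work.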

Contributor Author

I can make this happen for operational cold-start cases, but not for warm-starting past development cases -- these are too varied and complicated for us to support. Similarly, current operations runs at a different resolution (C768), so in order to run it in the global workflow, it would need to go through gdas_init for cold starts. I've made a PR into UFS_Utils to update the gdas_init tool to map the operational abias and radstat files to the EE2-compliant names (1114). I will add a utility to this PR to do this conversion as well.

Contributor

What would we need to do to warm start an ongoing coupled C1152 experiment that uses the old names so that it runs with the new names?

We have current needs for that functionality. Is that some external script? Something else?

Contributor Author

@JessicaMeixner-NOAA I will write a one-off script to handle this. Location TBD. Can you provide a location for the experiment(s) you would need links for?

Contributor

I might not be the only person that will need this...

But most importantly, we need to continue the realtime.

@DavidHuber-NOAA
Contributor Author

Offline testing on both C6 and Ursa has completed successfully. Links have been created on all tier-1 systems.

Additionally, @JessicaMeixner-NOAA started a retro test on Gaea C6 that I attempted to restart from using the updated YAMLs in this PR. I was unable to at first because the warm Ice restarts were not being archived on the correct cycle. I have fixed this in this PR. I then spoofed these restart files by copying them from a later cycle of the retro test and editing the global attributes. This allowed the forecasts to run, though they failed with MPI errors -- it's unclear to me whether this was a result of the spoofed Ice products or an issue with Gaea. Once this PR is merged into a retro test, this test can be repeated.

I will now remove the make_ee2_links.sh script from this PR.

@JessicaMeixner-NOAA
Contributor

> Offline testing on both C6 and Ursa have completed successfully. Links have been created on all tier-1 systems.
> Additionally, @JessicaMeixner-NOAA started a retro test on Gaea C6 that I attempted to restart from using the updated YAMLs in this PR. I was unable to at first as it was determined that the warm Ice restarts were not being archived on the correct cycle. I have fixed this in this PR. I then spoofed these restart files by copying them from a later cycle of the retro test and editing the global attributes. This allowed the forecasts to run, though they failed with MPI errors -- it's unclear to me if this was a result of the spoofed Ice products or an issue with Gaea. Once this PR is merged into a retro test, this test can be repeated.
> I will now remove the make_ee2_links.sh script from this PR.

Why was make_ee2_links removed? Won't we want to keep it so that others can use it?

Although I guess it was only for RTs the last time I saw it, not general. But I am likely not the only person who will want or need this script in a general form.

@DavidHuber-NOAA
Contributor Author

@JessicaMeixner-NOAA I am going to re-stage the ICs on the S3 bucket with the links in place, so there should be no future need for the make_ee2_links.sh script. It was only needed as a one-time script and should not be used in general cases. I can place it in glopara-cm-tools for reference, but it shouldn't stay in the GW, to avoid confusion. I just wanted it versioned while I was working through the kinks.

@JessicaMeixner-NOAA
Contributor

@DavidHuber-NOAA - Have you checked with the GEFS, GCAFS, and SFS leads for what their needs might be? What about other developers running experiments trying to continue things after updating? I don't think this is as one-off as you think this is. I don't think support should be forever, but maybe mark this as a breaking change, as this is breaking things for many people running things outside of CI tests.

@DavidHuber-NOAA
Contributor Author

@JessicaMeixner-NOAA I have opened issue #4242 to handle the creation of a script to create GFS links to restart experiments. I would like to move forward with this PR as-is and then the GW team can handle the script creation. Please update #4242 with GFS requirements so they can be adequately captured and addressed.
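For illustration only, a link-creation script of this kind might look roughly like the sketch below. It is not the make_ee2_links.sh script from this PR, and the function name `link_old_to_new` and the mapping-file format (one `<old_name> <new_name>` pair per line) are assumptions. It also assumes the old and new names live in the same directory, and any real mapping would have to come from the actual EE2 filename conventions.

```shell
#!/bin/bash
# Hypothetical sketch (not the PR's make_ee2_links.sh): for each
# "<old_name> <new_name>" pair in a mapping file, create a symlink with
# the new name that points at the existing old-named file.
# Usage: link_old_to_new <dir> <map_file>
link_old_to_new() {
    local dir="$1" map_file="$2"
    local old_name new_name
    # The trailing test lets a final line without a newline be processed
    while read -r old_name new_name || [[ -n "${old_name}" ]]; do
        # Skip blank lines and comment lines
        if [[ -z "${old_name}" || "${old_name}" == \#* ]]; then
            continue
        fi
        if [[ ! -e "${dir}/${old_name}" ]]; then
            echo "WARNING: ${dir}/${old_name} not found, skipping" >&2
            continue
        fi
        # Relative target keeps the link valid if the directory is moved;
        # -f replaces a link left over from an earlier run
        ln -sf "${old_name}" "${dir}/${new_name}"
    done < "${map_file}"
}

# Example call (placeholder paths):
#   link_old_to_new "/path/to/experiment/ics" "/path/to/name_map.txt"
```

Keeping the name mapping in a separate file, rather than hard-coding it, is one way to let the same script serve different applications with different file lists.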

@JessicaMeixner-NOAA
Contributor

What is the timeline for #4242? I'll be honest, I'm concerned about this PR going in without the other. This PR does solve issues we currently have, but I'm more concerned about being able to continue the realtime forward, which we can no longer do if this PR is merged and #4242 has not been resolved.

@DavidHuber-NOAA
Contributor Author

@JessicaMeixner-NOAA I believe I can address #4242 within a week. I will have a one-off script ready for the realtime sooner than that.

Comment on lines +378 to +382
if [[ "${EXP_WARM_START}" = ".true." ]]; then
    export DO_STARTMEM_FROM_JEDIICE="YES"
else
    export DO_STARTMEM_FROM_JEDIICE="{{ DO_STARTMEM_FROM_JEDIICE }}"
fi
@JessicaMeixner-NOAA
Contributor

> @JessicaMeixner-NOAA I believe I can address #4242 within a week. I will have a one-off script ready for the realtime sooner than that.

Thanks for the timeline information. I would prefer that the script be included here, or that stage_ic be modified with "or" conditions to handle either naming convention, rather than treating these as separate issues. With things as they are now, while we can't yet start from an HPSS tarball, we can at least start all of the GFSv17 retros and continue the realtime. If this is merged without an old -> new name script, we lose that ability until the script exists. I was thinking before that this would only impact the realtime, but our staged ICs also have the old names. I understand it's not my call at the end of the day, but I wanted to share my concerns and the impacts on the GFS project here.

Development

Successfully merging this pull request may close these issues.

Add symbolic links for EE2 filenames to existing IC data

3 participants