
Analysis routines #46

Open
orianac opened this issue Nov 23, 2021 · 1 comment

Comments

orianac (Member) commented Nov 23, 2021

Here are some initial thoughts about design for the analysis routines. Feedback (wish lists? prioritization? reorganization?) is heartily welcomed! @jhamman @tcchiao @norlandrhagen

Structure:
qaqc.py
Will include checking for:

  • NaNs
  • aphysical quantities (e.g., temperatures below 0 K)
  • aphysical temporal patterns (e.g., a simple seasonality check on temperature)
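As a rough illustration of the first two checks, here is a minimal sketch; the function names and the plausible-range threshold are assumptions for this example, not the project's actual API:

```python
import numpy as np

def has_nans(arr):
    """Flag any missing values in the array."""
    return bool(np.isnan(arr).any())

def aphysical_temperature(arr, min_k=0.0, max_k=350.0):
    """Flag temperatures (in Kelvin) outside an assumed plausible range."""
    return bool(((arr < min_k) | (arr > max_k)).any())

temps = np.array([250.0, 280.0, 310.0])
bad_temps = np.array([-5.0, 280.0, np.nan])

print(has_nans(temps), aphysical_temperature(temps))          # False False
print(has_nans(bad_temps), aphysical_temperature(bad_temps))  # True True
```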

metrics.py
Will include:

  • comparison of downscaled/bias-corrected outputs with training data for the historical period
  • assessment of projected future changes

Comparisons will occur:

  • for different variables
  • at different timescales and aggregations (daily, weekly, monthly, annual, decadal)
  • for different frequencies (assessing different return intervals)
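A sketch of what a historical-period comparison at multiple timescales could look like; the metric choices (bias and RMSE) and the block-averaging scheme are assumptions for illustration:

```python
import numpy as np

def bias(downscaled, training):
    """Mean difference between downscaled output and training data."""
    return float(np.mean(downscaled - training))

def rmse(downscaled, training):
    """Root-mean-square error of downscaled output against training data."""
    return float(np.sqrt(np.mean((downscaled - training) ** 2)))

def aggregate(daily, window):
    """Block-average a daily series to a coarser timescale (e.g., window=7 for weekly)."""
    n = len(daily) // window * window
    return daily[:n].reshape(-1, window).mean(axis=1)

rng = np.random.default_rng(0)
training = rng.normal(15.0, 5.0, 365)               # one year of daily "observations"
downscaled = training + rng.normal(0.5, 1.0, 365)   # model output with a slight warm bias

daily_bias = bias(downscaled, training)
monthly_bias = bias(aggregate(downscaled, 30), aggregate(training, 30))
```

Running the same metric at several aggregation windows is what lets the daily/weekly/monthly/annual comparisons above share one code path.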

plotting.py
Will include:

  • ability to visualize results from qaqc.py and metrics.py

analysis.py
Will include:

  • automatic running of a set of plotting routines for every downscaling run
  • ability to compare two downscaling runs (e.g., feed in runs from two different downscaling set-ups and generate a diff of the metrics outputs for those two runs)
  • ensemble analyses (for comparing more than two runs)
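The two-run diff could be as simple as subtracting metric dictionaries; this is a hypothetical sketch, with illustrative names and values:

```python
def metrics_diff(run_a, run_b):
    """Return metric-by-metric differences (run_a minus run_b) for shared metric keys."""
    return {k: run_a[k] - run_b[k] for k in run_a.keys() & run_b.keys()}

# Illustrative metrics outputs for two downscaling set-ups
run_a = {"bias": 0.4, "rmse": 1.2}
run_b = {"bias": 0.1, "rmse": 1.5}

diff = metrics_diff(run_a, run_b)  # e.g., {'bias': ~0.3, 'rmse': ~-0.3}
```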
orianac (Member, Author) commented Nov 30, 2021

More design for the analysis routines.

analysis.py will include tasks that perform

  1. QAQC (drawn from qaqc.py). It would be great to also include functionality to check for missing chunks.
  2. Assessment of performance with respect to a benchmark. This is dataset agnostic and will take as input a pointer to a benchmark dataset. For example, it could compare a historical downscaled GCM dataset against the historical training dataset (this is what will run automatically after every downscaling run), or compare the raw GCM against a coarsened historical training dataset (to assess how good the raw GCM was).
  3. Future changes for a given GCM (across three different climatic periods: 2030s, 2050s, 2080s)
  4. Comparisons across multiple datasets (e.g. multiple GCMs, multiple downscaling methods)

Tasks 1-3 will run automatically as part of every downscaling flow execution; task 4 will run on demand to compare multiple selected datasets.
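The split between the automatic tasks (1-3) and the on-demand task (4) might be wired up roughly like this; every function name here is an assumption, and the bodies are stubs standing in for the real routines:

```python
def run_qaqc(ds):
    """Task 1: QAQC checks drawn from qaqc.py (stub)."""
    return {"status": "ok"}

def benchmark_performance(ds, benchmark):
    """Task 2: dataset-agnostic comparison against a benchmark dataset (stub)."""
    return {"status": "ok"}

def future_changes(ds, periods=("2030s", "2050s", "2080s")):
    """Task 3: future changes across the three climatic periods (stub)."""
    return {p: "ok" for p in periods}

def analyze(ds, benchmark):
    """Tasks 1-3, intended to run automatically after every downscaling flow."""
    return {
        "qaqc": run_qaqc(ds),
        "benchmark": benchmark_performance(ds, benchmark),
        "future": future_changes(ds),
    }

def compare(datasets, benchmark=None):
    """Task 4, run on demand across multiple GCMs or downscaling methods."""
    return {name: analyze(ds, benchmark) for name, ds in datasets.items()}

result = analyze(ds={}, benchmark=None)
ensemble = compare({"gcm_a": {}, "gcm_b": {}})
```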

Some other things to think about:

  • Do we want to produce summary files as a byproduct of analysis runs, or add an automatic summary-file calculation step that always runs first, with analysis reading from those files (i.e., a step 0 that calculates the summary files)?
  • We will use the coarsened ERA5 files during downscaling and again in analysis; do we want to save those more permanently?

Plots:
For regional averages (SREX regions) and selected individual points (tower locations?):

  • CDFs of daily, weekly, monthly, and annual averages
  • time series (daily, weekly, monthly, annual averages)
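The data behind a CDF plot is simple to compute; this is a minimal sketch of an empirical CDF for one region's averages (the plotting call itself is omitted, and the function name is illustrative):

```python
import numpy as np

def empirical_cdf(values):
    """Return sorted values and their cumulative probabilities for a CDF plot."""
    x = np.sort(np.asarray(values, dtype=float))
    p = np.arange(1, len(x) + 1) / len(x)
    return x, p

x, p = empirical_cdf([3.0, 1.0, 2.0])
# x -> [1. 2. 3.]; p rises from 1/3 to 1.0
```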

Maps:
For different time periods (historical, future):

  • Mean
  • Max
  • Stdev
  • Variable-specific metrics: for precipitation, the number of wet days; for temperature, the number of days over 85; and multiple-variable metrics (cold+dry, hot+wet, etc.)

This doesn't include analyses specific to a given downscaling method. The structure also assumes the analysis is post-processing of a complete global dataset.
