Meta Analysis Quarto NoteBook #3707

AritraDey-Dev · 2025-12-07T03:27:23Z

Description

This PR implements a Quarto notebook for the meta-analysis. It uses the posterior files and trait.data.Rdata to run the Meta Analysis demo. A pecan.xml file is also included to run the workflow.

Motivation and Context

Review Time Estimate

Immediately
Within one week
When possible

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

My change requires a change to the documentation.
My name is in the list of CITATION.cff
I agree that PEcAn Project may distribute my contribution under any or all of
- the same license as the existing code,
- and/or the BSD 3-clause license.
I have updated the CHANGELOG.md.
I have updated the documentation accordingly.
I have read the CONTRIBUTING document.
I have added tests to cover my changes.
All new and existing tests passed.


mdietze · 2025-12-08T11:07:34Z

documentation/tutorials/Demo_03_Meta_Analysis/meta_analysis.qmd

+
+Meta-analysis in PEcAn is a hierarchical Bayesian statistical approach that synthesizes plant trait data from literature to constrain ecosystem model parameters. The **PEcAn.MA** module implements this functionality to combine prior information with observational data, generating posterior distributions for model parameters.
+
+In a standard PEcAn workflow, this step queries the BETYdb database for trait data and priors. For this demonstration, we will use pre-generated data files to simulate the workflow without requiring an active database connection during the notebook execution.


I'd rephrase this. The MA runs on tabular data in a specific format. One easy way to get data in that format is to query BETYdb, but you can also generate that format manually if you have other trait data. Indeed, a great new Issue for first-time PEcAn developers would be to create helper function(s) that reformat trait data from common trait databases (e.g. TRY) into the tabular format this module expects. Given that no one is actively updating BETY, this approach is probably going to be the de facto norm for most users, and in the future there should be an update to this demo once we have functions that enable this.

created here #3717

mdietze · 2025-12-08T11:09:38Z

documentation/tutorials/Demo_03_Meta_Analysis/meta_analysis.qmd

+
+**Context & modeling scenario:**
+
+We simulate plant and ecosystem carbon balance (Net Primary Productivity and Net Ecosystem Exchange) at the AmeriFlux Niwot Ridge Forest site ([US‑NR1](https://ameriflux.lbl.gov/sites/siteinfo/US-NR1)) during the year 2004. We use SIPNET parameterized as a temperate conifer PFT and driven by AmeriFlux meteorology following the analysis in [Moore et al. (2007)](https://doi.org/10.1016/j.agrformet.2008.04.013). This notebook also provides a compact template that can be extended to more years, locations, and PFTs.


language here is a bit rough. Is this something you're telling the user they want to do? An example? A reference back to Demo 1 and Demo 2? Also note that one can use the MA without then feeding the posteriors into a model (though we definitely want to highlight the latter)

Actually, this was taken from Demo 1 and Demo 2 to show the scenario we were considering for this notebook. However, in our case this scenario is incorrect because we are not using site information or met data in the settings file. I need to update the scenario to match the minimal settings file used in this demo, i.e. #3707 (comment).

Adjusted this in ac4ece1.

mdietze · 2025-12-08T11:10:14Z

documentation/tutorials/Demo_03_Meta_Analysis/meta_analysis.qmd

+
+In this specific demo, we are performing a meta-analysis for the **temperate coniferous** Plant Functional Type (PFT). The goal is to estimate the probability distributions for key model parameters (e.g., SLA, leaf turnover rate) by combining:
+*   **Priors**: Existing knowledge about the parameters.
+*   **Data**: Observed trait data (simulated for this demo).


Is the demo really using simulated data? Why? We've got plenty of real data

Not simulated; it should say pre-generated. I queried the trait data from the db (local Postgres setup) with these settings and saved it as trait.data.Rdata specifically for this demo.

I'd hoped that was the case. Just make sure the text matches what you actually did.

mdietze · 2025-12-08T11:12:02Z

documentation/tutorials/Demo_03_Meta_Analysis/meta_analysis.qmd

+  pecanproject = 'https://pecanproject.r-universe.dev',
+  CRAN = 'https://cloud.r-project.org'))
+# Download and install PEcAn.all in R
+install.packages('PEcAn.all')


I'd recommend a bit more nuance here. If you just want to run a trait meta-analysis you don't need to install PEcAn.all, but rather just a subset of packages (and it would be good to show which subset). If you want to then run a model using those posteriors you probably will end up installing PEcAn.all.

Yes, I just copied this part from Demo 1 and Demo 2. Fixed it now.

mdietze · 2025-12-08T11:12:45Z

documentation/tutorials/Demo_03_Meta_Analysis/meta_analysis.qmd

+install.packages('PEcAn.all')
+```
+
+*   **A valid `pecan.xml` configuration file**: Start with the example at `pecan/documentation/tutorials/Demo_03_Meta_Analysis/pecan.xml`.


Issue to open: one should be able to run the MA itself without a full settings object

<?xml version="1.0" encoding="UTF-8"?>
<pecan>
  <pfts>
    <pft>
      <name>temperate.coniferous</name>
      <posterior.files>pft/temperate.coniferous</posterior.files>
      <outdir>pft/temperate.coniferous</outdir>
    </pft>
  </pfts>
  <meta.analysis>
    <iter>3000</iter>
    <random.effects>
      <on>FALSE</on>
      <use_ghs>TRUE</use_ghs>
    </random.effects>
    <threshold>1.2</threshold>
  </meta.analysis>
</pecan>

This configuration alone is enough to run the meta-analysis; no additional model or run blocks are required.

Cool. So I think it would be fine in the Issue I'm recommending to recommend that the MA module be refactored to take in the following as arguments:

trait dataframe

prior dataframe

output directory

list containing MA configs (iter, random.effects, use_ghs, threshold, etc.) with some sensible defaults

Here in the demo one could elect to grab those things from a settings object, but one could also build a demo based on just specifying those things.
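As an illustration of the four arguments listed above, here is a minimal base-R sketch of what such an interface might look like. It is not the PEcAn.MA API: `run_ma_sketch()` and `default_ma_config()` are hypothetical names, the toy data frame columns are placeholders rather than the real trait or prior format, and no model is fitted. The defaults are copied from the `<meta.analysis>` block of the minimal pecan.xml quoted earlier in this thread.

```r
# Hypothetical interface sketch; these functions do not exist in PEcAn.MA.
# Defaults mirror the <meta.analysis> block of the minimal pecan.xml.
default_ma_config <- function() {
  list(
    iter = 3000,
    random.effects = list(on = FALSE, use_ghs = TRUE),
    threshold = 1.2
  )
}

run_ma_sketch <- function(trait_data, prior_distns, outdir,
                          ma_config = list()) {
  stopifnot(
    is.data.frame(trait_data),
    is.data.frame(prior_distns),
    is.character(outdir), length(outdir) == 1
  )
  # modifyList() merges nested lists recursively, so a caller can override
  # one entry (e.g. random.effects$on) and keep every other default.
  config <- modifyList(default_ma_config(), ma_config)

  dir.create(outdir, recursive = TRUE, showWarnings = FALSE)

  # A real implementation would run the meta-analysis here. This sketch only
  # returns the resolved inputs so the shape of the interface is visible.
  list(
    n_obs = nrow(trait_data),
    n_priors = nrow(prior_distns),
    outdir = outdir,
    config = config
  )
}

# Toy inputs with placeholder columns (NOT the format PEcAn.MA expects).
toy_traits <- data.frame(trait = c("SLA", "SLA"), value = c(1, 2))
toy_priors <- data.frame(trait = "SLA", distn = "gamma")

res <- run_ma_sketch(
  toy_traits, toy_priors,
  outdir = file.path(tempdir(), "ma_demo"),
  ma_config = list(iter = 5000, random.effects = list(on = TRUE))
)
str(res$config)
```

The point of the design is that a caller supplies only what differs from the defaults, and a demo could still fill `ma_config` from `settings$meta.analysis` when a settings object is available.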

mdietze · 2025-12-08T11:13:51Z

documentation/tutorials/Demo_03_Meta_Analysis/meta_analysis.qmd

+*   **SIPNET binary**: While not strictly used for the meta-analysis calculation itself, it is part of the broader workflow context.
+*   **Pre-generated Data**: This demo relies on `trait.data.Rdata` and `prior.distns.Rdata` files which are included in the `pft/temperate.coniferous` directory.
+
+## Install SIPNET and Meteorological Data


If this was covered in Demo 1 or Demo 2 send users there rather than duplicating text. If you duplicate then there's twice as much text to keep up-to-date if anything changes

Yes, for this meta-analysis the posterior files are not needed. Will remove this block.

A follow-up question: in this case, the model block in the settings configuration file (pecan.xml) isn’t needed, and run$input isn’t needed either. Basically, this whole section:

<model>
  <type>SIPNET</type>
  <revision>git</revision>
  <delete.raw>FALSE</delete.raw>
  <binary>demo_outdir/sipnet</binary>
</model>
<run>
  <site>
    <met.start>2004/01/01</met.start>
    <met.end>2004/12/31</met.end>
    <name>Niwot Ridge Forest/LTER NWT1 (US-NR1)</name>
    <lat>40.0329</lat>
    <lon>-105.546</lon>
  </site>
  <inputs>
    <met>
      <source>AmerifluxLBL</source>
      <output>SIPNET</output>
      <username>Aritra_2004</username>
      <path>
        <path1>dbfiles/AMF_US-NR1_BASE_HH_23-5.2004-01-01.2004-12-31.clim</path1>
      </path>
    </met>
  </inputs>
  <start.date>2004/01/01</start.date>
  <end.date>2004/12/31</end.date>
</run>

Should we remove this, or keep it so that users still get a clear idea of what the settings file normally looks like (Demo 1 and Demo 2 do include it, though)?

I'd vote against keeping anything that you don't need. It's fine, in practice, to say that the settings for using these posteriors for forward model simulation are more complicated (see Demo 1 and Demo 2) and that in practice you can write one settings file that contains both the model run and MA settings and run it all as a single workflow. Making the settings here minimal is an asset for making the module more accessible and for ultimately moving towards dropping settings as an argument and instead passing the function just the info it needs.

mdietze · 2025-12-08T11:15:12Z

documentation/tutorials/Demo_03_Meta_Analysis/meta_analysis.qmd

+
+See Demo 1 Section 6 for details on what these functions do. Briefly, they read the XML file, convert it into an R list object that PEcAn can use, check that settings are valid, fill in defaults, and create the output directory.
+
+## Explore the Settings Object


not needed except for the parts related to the MA, which you've already covered

mdietze · 2025-12-08T11:16:37Z

documentation/tutorials/Demo_03_Meta_Analysis/meta_analysis.qmd

+# Run Meta Analysis
+
+We now run the meta-analysis. The `runModule.run.meta.analysis` function will:
+1.  Read the `trait.data.Rdata` and `prior.distns.Rdata` from the PFT output directory.


Issue to open: would be great to be able to pass data into the MA directly, rather than it having to come from a file with a very specific file name

Yes, I believe this needs a separate function to implement it. We just need to pass trait.data.Rdata and prior.distns.Rdata to make it work.

Instead, can we do this within this notebook by adding a section where the user only needs to provide the paths to trait.data.Rdata and prior.distns.Rdata ? If a user already has these two files, they can run the meta-analysis directly.

Follow-up: I'm not requesting a change to this PR to implement what I'm suggesting. I'm instead asking that you open a new Issue to improve what we're doing in the future. Specifically, I'd recommend that the MA module take in the trait dataframe and prior dataframe as arguments to the function itself, rather than relying on the functions knowing to load those specific files from paths provided within an overly complex settings object. This will push a tiny bit of work into the demo (load the example files, look at them to see how they are formatted, pass them into the MA function) but IMHO will greatly increase the usability of the MA module as a stand-alone tool. Right now, it's functionally easy to use the MA outside the PEcAn workflow, but it's CONCEPTUALLY hard to do so because there's so much mystery in what it's doing. Right now, no one can actually run this as a stand-alone module in practice without a whole lot of diving into the code to see what the module actually does and what it actually needs to work. Actually getting this working and documented might be a good place for a new GSOC student.
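The "load the example files, look at them" step mentioned above can be sketched in base R as follows. This is only an illustration: `inspect_rdata()` is a hypothetical helper, not a PEcAn function, and the paths assume the `pft/temperate.coniferous` layout described in the demo. Nothing is assumed about what the files contain; the helper just reports which objects each file holds and their classes.

```r
# Hypothetical helper: list the objects stored in an .Rdata file.
# Loading into a fresh environment avoids overwriting anything in the
# global workspace.
inspect_rdata <- function(path) {
  stopifnot(file.exists(path))
  env <- new.env()
  loaded <- load(path, envir = env)  # load() returns the object names
  # Named character vector: object name -> first class of that object
  vapply(loaded, function(nm) class(get(nm, envir = env))[1], character(1))
}

# Paths as described in the demo; adjust them to your own checkout.
pft_dir <- file.path("pft", "temperate.coniferous")
for (f in file.path(pft_dir, c("trait.data.Rdata", "prior.distns.Rdata"))) {
  if (file.exists(f)) {
    print(inspect_rdata(f))
  } else {
    message("not found: ", f)
  }
}
```

Once the object names are known, `str()` on each object shows the columns a manually built trait or prior table would need to match.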

Done here #3718

@AritraDey-Dev Thanks for creating issue #3717! I've implemented the format_try_for_ma() function in PR #3720.

This connects directly to @mdietze's suggestion about using external data. Once PR #3720 (TRY formatter) and issue #3718 (MA refactoring) are both implemented, this tutorial could show how to use TRY data with PEcAn's meta-analysis.

mdietze · 2025-12-08T13:04:13Z

documentation/tutorials/Demo_03_Meta_Analysis/meta_analysis.qmd

+
+# Visualize Meta Analysis Results {#sec-visualize}
+
+It is important to check the MCMC chains for convergence. We can visualize the trace plots and density plots for each trait.


might be nice to provide additional explanation about what these figures are showing and how to interpret them

Will add them.

added in eb4c2f4

mdietze · 2025-12-08T13:08:04Z

documentation/tutorials/Demo_03_Meta_Analysis/meta_analysis.qmd

+# }
+```
+
+# Conclusion


Notebook doesn't go on to use these posteriors in a set of model runs so there's definitely no need to include the text and code for installing the model, installing all PEcAn packages, etc.

Notebook doesn't explain HOW one goes on to use these posteriors in a set of model runs. Could be as simple as "now edit your pecan.xml to point <posterior.files> to these outputs and rerun Demo 02. How did your results change in terms of the width of the CI and the uncertainty analyses?"

mdietze · 2025-12-08T13:09:42Z

documentation/tutorials/Demo_03_Meta_Analysis/meta_analysis.qmd

+[Explore](https://github.com/PecanProject/pecan/blob/main/documentation/tutorials/sensitivity/PEcAn_sensitivity_tutorial_v1.0.Rmd) how model error changes as a function of parameter value (i.e. data assimilation ‘by hand’)
+
+
+**MCMC Concepts**


Both of these modules are fairly pedagogical about calibration concepts, but neither really shows how to run the PEcAn calibration or SDA code. I'd include those modules too.


AritraDey-Dev · 2025-12-11T17:18:05Z

@mdietze Thanks for the review and the detailed comments. I’ve tried to address and fix the suggested items in the subsequent commits.

Mayanknishad9 · 2025-12-15T07:50:10Z

@mdietze This is a great suggestion. My format_try_for_ma() function (PR #3720) creates the trait_data dataframe from TRY exports, which could feed directly into such a simplified MA function.

If the MA module is refactored as you suggest, users could:

Use format_try_for_ma() to convert TRY data to proper format
Pass that dataframe directly to a simplified run_meta_analysis() function
Bypass the complex settings object entirely

This would make PEcAn's meta-analysis much more accessible to researchers with external data sources.

AritraDey-Dev added 8 commits December 7, 2025 04:59

feat(demo03): add pecan.xml configuration for meta-analysis demo

a367876

Signed-off-by: Aritra Dey <adey01027@gmail.com>

feat(demo03): add pre-generated trait data and priors

3bf73ee

Signed-off-by: Aritra Dey <adey01027@gmail.com>

meta analysis quarto notebook

02a4f60

Add Demo 03 notebook to run meta-analysis with pre-generated data. Signed-off-by: Aritra Dey <adey01027@gmail.com>

fix module name

fae7c50

Signed-off-by: Aritra Dey <adey01027@gmail.com>

add comments on update = TRUE

8ed8ba5

Signed-off-by: Aritra Dey <adey01027@gmail.com>

renamed quarto notebook

f752656

Signed-off-by: Aritra Dey <adey01027@gmail.com>

rename section name for output files

76389bc

Signed-off-by: Aritra Dey <adey01027@gmail.com>

use pkg name for meta analysis function

a0fecfc

Signed-off-by: Aritra Dey <adey01027@gmail.com>

github-actions bot added the Documentation label Dec 7, 2025

AritraDey-Dev requested a review from dlebauer December 7, 2025 03:27

fix meta analysis function name in comments

aa2f4b2

Signed-off-by: Aritra Dey <adey01027@gmail.com>

AritraDey-Dev added the Type: Enhancement label Dec 7, 2025

add changelog.md

4dbfc10

Signed-off-by: Aritra Dey <adey01027@gmail.com>

mdietze reviewed Dec 8, 2025


AritraDey-Dev and others added 14 commits December 9, 2025 00:07

remove duplicate explore settings object

542e49f

Signed-off-by: Aritra Dey <adey01027@gmail.com>

use only subset of pkgs for meta analysis

8eb5f99

Signed-off-by: Aritra Dey <adey01027@gmail.com>

fix: minor formatting

66b167a

Signed-off-by: Aritra Dey <adey01027@gmail.com>

posteriior files not needed meta analysis

d4f65ec

Signed-off-by: Aritra Dey <adey01027@gmail.com>

remove sipnet installation note

9e0dd6f

Signed-off-by: Aritra Dey <adey01027@gmail.com>

Merge branch 'develop' into quarto-meta-analysis-demo

115c24a

fix instructions

3c28589

Signed-off-by: Aritra Dey <adey01027@gmail.com>

meta analysis should work with minimal settings config

9c09299

Signed-off-by: Aritra Dey <adey01027@gmail.com>

remove prepare_settings

c7b0b3f

Signed-off-by: Aritra Dey <adey01027@gmail.com>

add explain on plots of meta analysis

eb4c2f4

Signed-off-by: Aritra Dey <adey01027@gmail.com>

refine conclusion of the meta analysis

ea9c1ce

Signed-off-by: Aritra Dey <adey01027@gmail.com>

fix: pecan.xml

6c6b290

Signed-off-by: Aritra Dey <adey01027@gmail.com>

add context to PDA and SDA links

85322ec

Signed-off-by: Aritra Dey <adey01027@gmail.com>

clarify data sources in introduction

f4489a5

Signed-off-by: Aritra Dey <adey01027@gmail.com>

AritraDey-Dev mentioned this pull request Dec 11, 2025

Helper functions to format trait data from external databases (e.g. TRY) for PEcAn.MA #3717

Open

AritraDey-Dev mentioned this pull request Dec 11, 2025

Allow MA module to accept trait and prior dataframes directly #3718

Open

context and model scneraio

ac4ece1

Signed-off-by: Aritra Dey <adey01027@gmail.com>

AritraDey-Dev requested a review from mdietze December 11, 2025 17:15

Mayanknishad9 mentioned this pull request Dec 15, 2025

Add TRY database formatter for meta-analysis (Issue #3717) #3720

Open

14 tasks

