Commit 918aec7

committed
Update package site:
- Reorganize README/landing page
- Move full installation instruction to README
- Add R Markdown for MRP method description
1 parent 613bd85 commit 918aec7

File tree

4 files changed

+143
-40
lines changed

README.md

Lines changed: 25 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -16,47 +16,49 @@
1616

1717
## Getting Started
1818

19+
You can use **shinymrp** in two flexible ways:
1920

20-
You can use **shinymrp** in two flexible ways, both available through a single easy installation:
21-
22-
1. Shiny App
21+
### Shiny App
2322

2423
The graphical user interface (GUI), built with the Shiny framework, is designed for newcomers and those looking for an interactive, code-free analysis experience.
2524

26-
2. Object-Oriented Programming Interface
25+
Launch the app locally in R with:
2726

28-
Leverage the full flexibility of the exported R6 classes for a programmatic workflow, ideal for advanced users and those integrating MRP into larger R projects.
27+
```r
28+
shinymrp::run_app()
29+
```
2930

30-
### Installation
31+
#### Try the Demo
3132

32-
To get started, install the latest development version from [GitHub](https://github.com/mrp-interface/shinymrp):
33+
Explore the Shiny app without installation via our [online demo](https://mrpinterface.shinyapps.io/shinymrp/).
3334

34-
```R
35-
# If you don't have 'remotes', install it first:
36-
install.packages('remotes')
37-
remotes::install_github('mrp-interface/shinymrp')
38-
```
39-
### Launch the Shiny App
35+
Need a walkthrough? Watch our step-by-step [video tutorial](https://youtu.be/CUcRYn92fmU?si=EhcAbuwuG2XM-0N0).
4036

41-
New to **shinymrp**? We recommend starting with the Shiny app:
37+
### Object-Oriented Programming Interface
4238

43-
```R
44-
shinymrp::run_app()
39+
Leverage the full flexibility of the exported R6 classes for a programmatic workflow, ideal for advanced users and those integrating MRP into larger R projects.
40+
41+
Import **shinymrp** in scripts or R Markdown documents just like any other R package:
42+
43+
```r
44+
library(shinymrp)
4545
```
4646

47-
### Import programmatic components
47+
### Installation
4848

49-
For those experienced with R and object-oriented programming, use the package in scripts or R Markdown documents:
49+
To get started, install the latest development version of **shinymrp** from [GitHub](https://github.com/mrp-interface/shinymrp) using `remotes`:
5050

51-
```R
52-
library(shinymrp)
51+
```r
52+
# If 'remotes' is not installed:
53+
install.packages("remotes")
54+
remotes::install_github("mrp-interface/shinymrp")
5355
```
5456

55-
## Try the Demo
57+
The package installation does not automatically install all prerequisites. Specifically, **shinymrp** uses [CmdStanR](https://mc-stan.org/cmdstanr/) as the bridge to run [Stan](https://mc-stan.org/), a state-of-the-art platform for Bayesian modeling. Stan requires a modern C++ toolchain (compiler and GNU Make build utility).
5658

57-
Explore the **shinymrp** features instantly, no installation required, via our [online demo](https://mrpinterface.shinyapps.io/shinymrp/).
59+
- For setting up your toolchain, see [Stan’s documentation](https://mc-stan.org/docs/cmdstan-guide/installation.html#cpp-toolchain).
60+
- Once ready, follow the [CmdStanR installation instructions](https://mc-stan.org/cmdstanr/articles/cmdstanr.html#installing-cmdstan) to install CmdStanR and CmdStan.
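Once both steps are done, the setup can be sanity-checked from R. The helper below is only a sketch: `check_stan_setup()` is a hypothetical name, not part of **shinymrp**, and it assumes the `cmdstanr` functions `check_cmdstan_toolchain()` and `cmdstan_version()` behave as described in the CmdStanR documentation.

```r
# Hypothetical helper (not part of shinymrp): returns TRUE if CmdStanR,
# the C++ toolchain, and CmdStan all appear usable, FALSE otherwise.
check_stan_setup <- function() {
  if (!requireNamespace("cmdstanr", quietly = TRUE)) {
    message("cmdstanr is not installed; see the CmdStanR installation instructions.")
    return(FALSE)
  }
  tryCatch({
    cmdstanr::check_cmdstan_toolchain(quiet = TRUE)  # errors if the toolchain is missing
    cmdstanr::cmdstan_version()                      # errors if CmdStan is not found
    TRUE
  }, error = function(e) {
    message("Stan setup incomplete: ", conditionMessage(e))
    FALSE
  })
}

check_stan_setup()
```

If the check reports that CmdStan itself is missing, `cmdstanr::install_cmdstan()` downloads and builds it.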
5861

59-
Need a walkthrough? Watch our step-by-step [video tutorial](https://youtu.be/CUcRYn92fmU?si=EhcAbuwuG2XM-0N0).
6062

6163
## Learn More
6264

_pkgdown.yml

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@ url: https://mrp-interface.github.io/shinymrp/
44
template:
55
bootstrap: 5
66
bootswatch: cosmo
7+
math-rendering: mathjax
78

89
navbar:
910
logo:
@@ -47,8 +48,9 @@ articles:
4748
- getting-started
4849
- title: "More details"
4950
desc: >
50-
A deeper dive into the programmatic interface and data preprocessing in shinymrp.
51+
A deeper dive into the programmatic interface, the methodology, and data preprocessing in shinymrp.
5152
contents:
5253
- workflow
5354
- data-prep
55+
- method
5456
- example

vignettes/getting-started.Rmd

Lines changed: 1 addition & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -18,22 +18,7 @@ vignette: >
1818

1919
![](./figures/workflow.png)
2020

21-
If you prefer a graphical and interactive experience, you can launch the Shiny app with `shinymrp::run_app()`, which includes a built-in user guide. Users interested in using the programmatic interface can follow examples in the vignettes, starting with the [Key steps](https://mrp-interface.github.io/shinymrp/articles/getting-started#key-steps) section below. Regardless of the interface you choose, please follow the instructions below to install the package prerequisites.
22-
23-
## Installation
24-
25-
To get started, install the latest development version of **shinymrp** from [GitHub](https://github.com/mrp-interface/shinymrp):
26-
27-
```{r, eval = FALSE}
28-
# If 'remotes' is not installed:
29-
install.packages("remotes")
30-
remotes::install_github("mrp-interface/shinymrp")
31-
```
32-
The package installation does not automatically install all prerequisites. Specifically, **shinymrp** uses [CmdStanR](https://mc-stan.org/cmdstanr/) as the bridge to run [Stan](https://mc-stan.org/), a state-of-the-art platform for Bayesian modeling. Stan requires a modern C++ toolchain (compiler and GNU Make build utility).
33-
34-
- For setting up your toolchain, see [Stan’s documentation](https://mc-stan.org/docs/cmdstan-guide/installation.html#cpp-toolchain).
35-
- Once ready, follow the [CmdStanR installation instructions](https://mc-stan.org/cmdstanr/articles/cmdstanr.html#installing-cmdstan) to install CmdStanR and CmdStan.
36-
21+
If you prefer a graphical and interactive experience, you can launch the Shiny app with `shinymrp::run_app()`, which includes a built-in user guide. Users interested in using the programmatic interface can follow examples in the vignettes, starting with the [Key steps](https://mrp-interface.github.io/shinymrp/articles/getting-started#key-steps) section below.
3722

3823
## Key steps
3924

vignettes/method.Rmd

Lines changed: 114 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,114 @@
1+
---
2+
title: "MRP methodological guide"
3+
output: rmarkdown::html_vignette
4+
vignette: >
5+
%\VignetteIndexEntry{MRP methodological guide}
6+
%\VignetteEngine{knitr::rmarkdown}
7+
%\VignetteEncoding{UTF-8}
8+
---
9+
10+
```{r, include = FALSE}
11+
knitr::opts_chunk$set(
12+
collapse = TRUE,
13+
comment = "#>"
14+
)
15+
```
16+
17+
MRP has two key steps: (1) fit a multilevel model for the response with the adjustment variables based on the input data; and (2) poststratify using the population distribution of the adjustment variables, yielding prevalence estimates in the target population and subgroups.
18+
19+
## MRP for cross-sectional data
20+
21+
By cross-sectional data we mean a dataset whose measures are collected at a single time point, so that neither the model nor the poststratification adjustment accounts for temporal variation. We use a binary outcome of interest as an example. Let $y_i (=0/1)$ be the binary response for individual $i$, with $y_i=1$ indicating the positive response. We employ a logistic regression with varying effects for age, race, and ZIP code, where the ZIP-code-level variation is further explained by the ZIP-code-level predictors.
22+
\[
23+
\label{mrp-1}
24+
\textrm{Pr}(y_i = 1) = \textrm{logit}^{-1}(
25+
\beta_1+\beta_2{\rm male}_i +
26+
\alpha_{\rm a[i]}^{\rm age}
27+
+ \alpha_{\rm r[i]}^{\rm race}
28+
+ \alpha_{\rm s[i]}^{\rm ZIP}
29+
),
30+
\]
31+
where ${\rm male}_i$ is an indicator for men; $\alpha_{\rm a}^{\rm age}$ is the effect of age level $a$ on the log-odds of a positive response, with $a[i]$ denoting the age level of subject $i$; $\alpha_{\rm r}^{\rm race}$ is the racial effect; and $\alpha_{\rm s}^{\rm ZIP}$ is the ZIP-code-level effect. In the Bayesian framework, we assign hierarchical priors to the varying intercepts by default:
32+
\begin{align}
33+
\label{prior}
34+
\nonumber &\alpha^{\rm age} \sim \mbox{normal}(0,\sigma^{\rm age} ), \,\,\, \sigma^{\rm age}\sim \mbox{normal}_+ (0,2.5)\\
35+
&\alpha^{\rm race} \sim \mbox{normal}(0,\sigma^{\rm race} ), \,\,\, \sigma^{\rm race}\sim \mbox{normal}_+ (0,2.5).
36+
\end{align}
37+
Here $\mbox{normal}_+ (0,2.5)$ represents a half-normal distribution, that is, a normal distribution with mean $0$ and standard deviation $2.5$ restricted to positive values. Because we have ZIP-code-level predictors $\vec{Z}^{\rm ZIP}_{s}$, we build a second-level model in which $\alpha_{\rm s}^{\rm ZIP}$ is the outcome of a linear regression on those predictors:
38+
\begin{align}
39+
\label{prior-zip}
40+
\alpha_{\rm s}^{\rm ZIP} =\vec{\alpha}\vec{Z}^{\rm ZIP}_{s} + e_s, \,\,\, e_s\sim \mbox{normal}(0,\sigma^{\rm ZIP} ),\,\,\, \sigma^{\rm ZIP}\sim \mbox{normal}_+ (0,2.5),
41+
\end{align}
42+
where $e_s$ is a ZIP-code-level random error.
43+
44+
The interface allows users to specify alternative priors, including structured priors for high-order interaction terms developed by [Si et al. (2020)](https://www150.statcan.gc.ca/n1/en/pub/12-001-x/2020002/article/00003-eng.pdf?st=iF1_Fbrh).
45+
46+
Because the outcome model assumes that the people in the same poststratification cell share the same response probability, we can replace the microdata with cellwise aggregates and employ a binomial model for the sum of the responses in cell $j$ as $y^*_j \sim \textrm{binomial}(n_j, \theta_j)$, where $n_j$ is the sample cell size and $\theta_j=\textrm{logit}^{-1}(
47+
\beta_1+\beta_2{\rm male}_j +
48+
\alpha_{\rm a[j]}^{\rm age}
49+
+ \alpha_{\rm r[j]}^{\rm race}
50+
+ \alpha_{\rm s[j]}^{\rm ZIP}
51+
)
52+
$ using the cellwise effects of all factors. The interface thus allows users to upload microdata or cellwise aggregates as the input data.
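
The aggregation from microdata to cellwise counts can be sketched in base R. This is a minimal illustration on simulated data; the column names (`male`, `age`, `race`, `zip`, `y`) and category labels are hypothetical and are not necessarily the ones the package expects.

```r
# Simulated microdata: one row per individual (all names are hypothetical).
set.seed(1)
n_obs <- 200
micro <- data.frame(
  male = rbinom(n_obs, 1, 0.5),
  age  = sample(c("18-34", "35-64", "65+"), n_obs, replace = TRUE),
  race = sample(c("black", "white", "other"), n_obs, replace = TRUE),
  zip  = sample(c("48104", "48105"), n_obs, replace = TRUE),
  y    = rbinom(n_obs, 1, 0.3)
)
micro$one <- 1  # helper column so that summing it counts individuals

# Collapse to one row per observed cell j:
# n_j = sample cell size, y_star_j = number of positive responses.
cells <- aggregate(
  cbind(n_j = one, y_star_j = y) ~ male + age + race + zip,
  data = micro,
  FUN = sum
)
head(cells)
```

Each row of `cells` then carries the sufficient statistics $(n_j, y^*_j)$ for the binomial model, so no information relevant to that model is lost by aggregating.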
53+
54+
To generate overall population or subgroup estimates, we combine model predictions within the poststratification cells---in the contingency table of sex, age, race, and ZIP---weighted by the population cell frequencies $N_j$, which are derived from the linked ACS data in our application. Additionally, users may choose to upload custom poststratification data for specific target populations (e.g., a different country, rather than the U.S.). If we write the expected outcome in cell $j$ as $\hat{\theta}_j$, the population average from MRP is then:
55+
$$
56+
\hat{\theta}^{\rm pop} = \frac{\sum_j N_j \hat{\theta}_j}{\sum_j N_j}.
57+
$$
58+
The MRP estimator for county $c$ aggregates over the covered cells $j$ in that county:
59+
$$
60+
\hat{\theta}_c^{\rm pop} = \frac{\sum_{j \in \textrm{county } c} N_j \hat{\theta}_j}{\sum_{j \in \textrm{county } c} N_j}.
61+
$$
62+
We implement Bayesian inference for the estimates, where the variance estimates and 95\% credible intervals are computed based on the posterior samples.
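
The poststratification step and its posterior summaries can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: the cell counts, county labels, and the beta-distributed stand-in for posterior draws are all made up, and `poststratify()` is a hypothetical helper.

```r
# Hypothetical inputs: 4 poststratification cells in 2 counties.
set.seed(1)
N_j     <- c(1000, 3000, 2000, 4000)  # population cell counts
county  <- c("A", "A", "B", "B")      # county of each cell
n_draws <- 2000

# Stand-in for posterior draws of theta_j (rows = draws, columns = cells);
# in practice these would come from the fitted multilevel model.
theta_draws <- sapply(
  c(0.10, 0.20, 0.30, 0.40),
  function(m) rbeta(n_draws, 50 * m, 50 * (1 - m))
)

# Population-weighted average of the cell estimates, computed per draw.
poststratify <- function(draws, N) as.vector(draws %*% N) / sum(N)

theta_pop <- poststratify(theta_draws, N_j)

# County-level estimates restrict the sums to the cells in each county.
theta_county <- sapply(split(seq_along(N_j), county), function(idx) {
  poststratify(theta_draws[, idx, drop = FALSE], N_j[idx])
})

# Posterior mean and 95% credible interval for the population estimate,
# and interval summaries for each county.
c(mean = mean(theta_pop), quantile(theta_pop, c(0.025, 0.975)))
apply(theta_county, 2, quantile, probs = c(0.025, 0.5, 0.975))
```

Applying the weights draw by draw, rather than to posterior means, is what lets the credible intervals reflect the joint uncertainty across cells.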
63+
64+
When the outcome is continuous, we specify linear regression models and estimate the residual variance, to which we assign an additional prior distribution.
65+
66+
## MRP for time-varying data with measurement error
67+
68+
As an example of time-varying data, we model weekly PCR testing results. We use a Bayesian framework to account for the PCR testing sensitivity and specificity. Here, MRP proceeds in two steps: (1) fit a multilevel model to the testing data for incidence incorporating time and covariates, and (2) poststratify using the population distribution of the adjustment variables: sex, age, race, and ZIP codes, where we assume the population distribution is the same during the study period. Hence, the poststratification cell is defined by the cross-tabulation of sex, age, race, ZIP code, and indicators of time in weeks based on the test result dates.
69+
70+
We denote the PCR test result for individual $i$ as $y_i$, where $y_i=1$ indicates a positive result and $y_i=0$ indicates negative. As in the cross-sectional case, we assume that people in the same poststratification cell have the same infection rate, so we can model cellwise summaries directly. We obtain aggregated counts as the number of tests $n_j$ and the number of positive cases $y^*_j$ in cell $j$. Let $p_j=\textrm{Pr}(y_{j[i]}=1)$ be the probability that person $i$ in cell $j$ tests positive. We account for the PCR testing sensitivity and specificity, where the positivity $p_j$ is a function of the test sensitivity $\delta$, specificity $\gamma$, and the true incidence $\pi_j$ for people in cell $j$:
71+
\begin{align}
72+
\label{positivity}
73+
p_j=(1-\gamma)(1-\pi_j )+\delta \pi_j.
74+
\end{align}
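
As a quick numerical check of this relationship, the sketch below evaluates the positivity for a few incidence values. The function name `positivity()` and the sensitivity and specificity values are illustrative assumptions, not estimates from any study.

```r
# Expected test positivity given true incidence pi_j,
# sensitivity delta, and specificity gamma.
positivity <- function(pi_j, delta, gamma) {
  stopifnot(all(pi_j >= 0 & pi_j <= 1))
  (1 - gamma) * (1 - pi_j) + delta * pi_j
}

# Illustrative values only.
positivity(pi_j = c(0, 0.1, 1), delta = 0.7, gamma = 0.995)
# pi_j = 0 gives the false-positive rate 1 - gamma;
# pi_j = 1 gives the sensitivity delta.
```

The two boundary cases show why observed positivity can differ from true incidence in either direction: false positives inflate it when incidence is low, and imperfect sensitivity deflates it when incidence is high.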
75+
76+
We fit a binomial model for $y^*_j$, $y^*_j \sim \textrm{binomial}(n_j, p_j)$, with a logistic regression for $\pi_j$ on the covariates---sex, age, race, ZIP codes, and time in weeks---to allow time-varying incidence in the multilevel model:
77+
\begin{align}
78+
\label{pi}
79+
\textrm{logit}(\pi_j)=\beta_1+\beta_2{\rm male}_j+\alpha_{{\rm a}[j]}^{\rm age}+\alpha_{{\rm r}[j]}^{\rm race}+\alpha_{{\rm s}[j]}^{\rm ZIP}+\alpha_{{\rm t}[j]}^{\rm time},
80+
\end{align}
81+
where ${\rm male}_j$ is an indicator for men; ${\rm a}[j]$, ${\rm r}[j]$, and ${\rm s}[j]$ represent age, race, and ZIP levels; and ${\rm t}[j]$ denotes the time in weeks when the test result is collected for cell $j$. We include ZIP-code-level predictors $\vec{Z}^{\rm ZIP}_{s}$ for ZIP code $s$,
82+
\[
83+
\alpha_{s}^{\rm ZIP} =\vec{\alpha}\vec{Z}^{\rm ZIP}_{s} + e_s.
84+
\]
85+
We assign the varying intercepts and error terms $e_s$ the same priors as in the cross-sectional case:
86+
\begin{align}
87+
\nonumber &\alpha^{\rm age} \sim \mbox{normal}(0,\sigma^{\rm age} ), \,\,\, \sigma^{\rm age}\sim \mbox{normal}_+ (0,2.5)\\
88+
&\alpha^{\rm race} \sim \mbox{normal}(0,\sigma^{\rm race} ), \,\,\, \sigma^{\rm race}\sim \mbox{normal}_+ (0,2.5)\\
89+
&\alpha_{\rm s}^{\rm ZIP} =\vec{\alpha}\vec{Z}^{\rm ZIP}_{s} + e_s, \,\,\, e_s\sim \mbox{normal}(0,\sigma^{\rm ZIP} ),\,\,\, \sigma^{\rm ZIP}\sim \mbox{normal}_+ (0,2.5).
90+
\end{align}
91+
92+
For the time-varying effects, we assume $\alpha_{{\rm t}}^{\rm time} \sim \mbox{normal}(0,\sigma^{\rm time} )$, with a weakly informative hyperprior, $\sigma^{\rm time}\sim \mbox{normal}_+ (0,5)$.
93+
94+
As an example, we assign normal priors to the ZIP-code-level and time-varying effects. The interface leverages Stan’s modeling capabilities to allow alternative prior choices and can be extended with advanced modeling.
95+
96+
Using the estimated incidence $\hat{\pi}_j$, we adjust for selection bias by applying the sociodemographic distributions in the community with population cell counts $N_j$ based on the ACS, yielding population-level weekly incidence estimates:
97+
\[
98+
\hat{\pi}_{t} = \frac{\sum_{j \in \mbox{week } t} N_j\hat{\pi}_j}{\sum_{j \in \mbox{week } t} N_j},
99+
\]
100+
which can be restricted to specific subgroups or regions of interest, as another key property of MRP is to yield robust estimates for small groups. We obtain the Bayesian credible intervals from the posterior samples for inference.
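
The weekly estimator can be sketched in base R with a small made-up table of cell-level estimates. The data frame `cell_est` and its column names are hypothetical; the population counts are repeated across weeks to reflect the assumption that the population distribution is constant over the study period.

```r
# Hypothetical cell-level table: each row is a cell j (demographics x week).
cell_est <- data.frame(
  week   = c(1, 1, 2, 2),
  N_j    = c(1000, 3000, 1000, 3000),  # population counts, constant over weeks
  pi_hat = c(0.02, 0.04, 0.05, 0.09)   # estimated incidence per cell
)

# Weekly incidence: population-weighted mean of pi_hat within each week.
weekly <- sapply(split(cell_est, cell_est$week), function(d) {
  sum(d$N_j * d$pi_hat) / sum(d$N_j)
})
weekly
```

Restricting to a subgroup or region amounts to filtering the rows of `cell_est` before applying the same weighted mean.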
101+
102+
## Further reading
103+
104+
1. [Y Si, T Tran, J Gabry, M Morris, and A Gelman (2025). Multilevel Regression and Poststratification Interface: Application to Track Community-level COVID-19 Viral Transmission, Population Health Metrics (under review)](http://arxiv.org/abs/2405.05909).
105+
106+
2. [Y Si (2025). On the Use of Auxiliary Variables in Multilevel Regression and Poststratification, Statistical Science, 40(2), 272--288](http://dx.doi.org/10.1214/24-STS932).
107+
108+
3. [Y Si, L Covello, S Wang, T Covello, and A Gelman (2022). Beyond Vaccination Rates: A Synthetic Random Proxy Metric of Total SARS-CoV-2 Immunity Seroprevalence in the Community, Epidemiology, 33(4), 457--464](https://journals.lww.com/epidem/Fulltext/2022/07000/Beyond_Vaccination_Rates__A_Synthetic_Random_Proxy.3.aspx).
109+
110+
4. [L Covello, A Gelman, Y Si, and S Wang (2021). Routine Hospital-Based SARS-CoV-2 Testing Outperforms State-Based Data in Predicting Clinical Burden, Epidemiology, 32(6), 792--799](https://journals.lww.com/epidem/Fulltext/2021/11000/Routine_Hospital_based_SARS_CoV_2_Testing.4.aspx).
111+
112+
5. [Y Si, R Trangucci, J Gabry, and A Gelman (2020). Bayesian Hierarchical Weighting Adjustment and Survey Inference, Survey Methodology, 46(2), 181--214](https://www150.statcan.gc.ca/n1/en/pub/12-001-x/2020002/article/00003-eng.pdf?st=iF1_Fbrh).
113+
114+
