39 changes: 39 additions & 0 deletions data/nsides/offsides/meta.yaml
@@ -0,0 +1,39 @@
---
name: offsides
description: OffSIDES is a database of individual drug side effect signals mined from the FDA's Adverse Event Reporting System. The innovation of OffSIDES
is that a propensity score matching (PSM) model is used to identify control drugs and produce better PRR estimates. In OffSIDES we focus on drug safety
signals that are not already established by being listed on the structured product label -- hence they are off-label drug side effects.
targets:
- id: PRR
description: Proportional reporting ratio
type: continuous
names:
- Proportional reporting ratio
- id: PRR_error
description: Standard error of the PRR estimate
type: continuous
names:
- Proportional reporting ratio error
Collaborator

do you think this is something the model should be able to predict? (I'm just curious)

- id: mean_reporting_frequency
description: Proportion of reports for the drug that report the side effect
type: continuous
names:
- mean reporting frequency
Collaborator

is the absolute number something a model should be able to predict?

Contributor Author

yeah, according to the paper, a model should be able to predict it
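
For context on what the three targets measure, here is a minimal sketch of the textbook definitions computed from a 2x2 table of report counts. The function name, the count names `a`, `b`, `c`, `d`, and the example numbers are hypothetical, and treating `PRR_error` as the standard error of ln(PRR) is an assumption, not something confirmed by this PR.

```python
import math


def prr_summary(a: int, b: int, c: int, d: int) -> tuple[float, float, float]:
    """Return (PRR, standard error of ln(PRR), reporting frequency).

    a: reports for the drug that mention the side effect
    b: reports for the drug that do not mention it
    c: reports for the control drugs that mention the side effect
    d: reports for the control drugs that do not mention it
    """
    drug_rate = a / (a + b)        # share of the drug's reports with the effect
    control_rate = c / (c + d)     # same share among the control drugs
    prr = drug_rate / control_rate
    # Textbook standard error of ln(PRR); assumed, not taken from the dataset docs.
    se_log_prr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return prr, se_log_prr, drug_rate


# Hypothetical counts, chosen only for illustration.
prr, se, freq = prr_summary(a=20, b=80, c=50, d=950)
print(f"PRR={prr:.2f}, SE(ln PRR)={se:.3f}, frequency={freq:.2f}")
```

With these made-up counts the drug's reporting rate is four times the control rate, so the PRR comes out at about 4.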

identifier:
- id: drug_concept_name
description: RxNorm name string for the drug
type: categorical
- id: condition_concept_name
description: MedDRA name string for the side effect
type: categorical
Collaborator

Do we need both of them simultaneously for the ratio to be meaningful?
That is, a correct prompt would ask the model something like

"What is the proportional reporting ratio for for "

Contributor Author

yeah, so the prompt would be "what is the PRR of <condition_concept_name> for the <drug_concept_name>?". A higher PRR means the side effect is reported disproportionately more often for that particular drug.
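
A minimal sketch of how such a prompt could be assembled from the two identifier columns. The function name and the example drug/condition values are hypothetical; only the column names come from `meta.yaml`.

```python
def build_prr_prompt(drug_concept_name: str, condition_concept_name: str) -> str:
    """Fill both identifiers into one question; the PRR is defined per drug/side-effect pair."""
    return (
        f"What is the PRR of {condition_concept_name} "
        f"for the drug {drug_concept_name}?"
    )


# Hypothetical row values, used only to show the resulting string.
prompt = build_prr_prompt("ibuprofen", "Nausea")
print(prompt)
```

Both identifiers are required arguments because a PRR value belongs to a specific drug and side-effect pair, not to either one alone.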

license: CC BY 4.0
links:
- url: https://tatonettilab.org/resources/nsides/
description: data source
- url: https://nsides.io/
description: database website
num_points: 3042873
bibtex: "\n @article{Tatonetti2012,\n author = {Tatonetti, Nicholas P. and Ye, Peter P. and Daneshjou, Roxana and Altman, Russ B.},\n \
\ title = {Data-driven prediction of drug effects and interactions},\n journal = {Sci Transl Med},\n volume = {4},\n number\
\ = {125},\n pages = {125ra31},\n year = {2012},\n doi = {10.1126/scitranslmed.3003377},\n pmid = {22422992},\n pmcid\
\ = {PMC3382018}\n }\n "
Collaborator

can we remove those newlines somehow and just have a multiline string? I can help with that

Contributor Author

sure! I'll let you take care of it if that's ok
