Commit 7623a29

Update to README + WIP list (#555)
* Update to README + WIP list
* Update README.md

Co-authored-by: Stephen Bach <stephenhbach@gmail.com>
1 parent de86c4c commit 7623a29

3 files changed: +307 -3 lines changed

README.md

Lines changed: 21 additions & 3 deletions
@@ -1,15 +1,22 @@
11
# PromptSource
2-
Toolkit for collecting and applying templates of prompting instances.
2+
Promptsource is a toolkit for collecting and applying prompts to NLP datasets.
33

4-
WIP
4+
Promptsource uses a simple templating language to programmatically map an example of a dataset into a text input and a text target.
5+
6+
Promptsource contains a growing collection of prompts (which we call **P3**: **P**ublic **P**ool of **P**rompts). As of October 18th, there are ~2,000 prompts for 170+ datasets in P3.
7+
Feel free to use these prompts as they are (you'll find citation details [here](#citation)).
8+
9+
Note that a subset of the prompts is still *Work in Progress*. The prompts that may be modified in the near future are listed [here](WIP.md). Most modifications will consist of metadata collection, but some will affect the templates themselves. To facilitate traceability, Promptsource is currently pinned at version `0.1.0`.
510

611
## Setup
712
1. Download the repo
813
2. Navigate to root directory of the repo
914
3. Install requirements with `pip install -r requirements.txt` in a Python 3.7 environment
1015

1116
## Running
12-
From the root directory of the repo, you can launch the editor with
17+
You can browse through existing prompts on the [hosted version of Promptsource](https://bigscience.huggingface.co/promptsource).
18+
19+
If you want to launch a local version (in particular to write prompts), launch the editor from the root directory of the repo with:
1320
```
1421
streamlit run promptsource/app.py
1522
```
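As a rough illustration of what mapping a dataset example into a text input and a text target could look like, here is a minimal sketch. It uses Python's standard `string.Template` as a stand-in, since the templating language Promptsource actually uses is not shown in this diff. The prompt wording, the `apply_prompt` helper, the field names `text` and `label`, and the label order are all hypothetical.

```python
from string import Template
from typing import Dict, Tuple

# Hypothetical prompt for a sentiment dataset. string.Template is only a
# stand-in here; it is not Promptsource's own templating language.
INPUT_TEMPLATE = Template("Review: $text\nIs this review positive or negative?")

# Assumed label order: index 0 is negative, index 1 is positive.
TARGET_CHOICES = ["negative", "positive"]


def apply_prompt(example: Dict) -> Tuple[str, str]:
    """Map one dataset example to a (text input, text target) pair."""
    text_input = INPUT_TEMPLATE.substitute(text=example["text"])
    text_target = TARGET_CHOICES[example["label"]]
    return text_input, text_target


if __name__ == "__main__":
    example = {"text": "A moving, beautifully shot film.", "label": 1}
    text_input, text_target = apply_prompt(example)
    print(text_input)
    print(text_target)
```

Keeping the input and the target as two separate strings mirrors the README's description: the same example can be paired with many prompts, each producing its own input/target pair.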
@@ -68,3 +75,14 @@ For more information, read the [Contribution guidelines](CONTRIBUTING.md).
6875
**Warning or Error about Darwin on OS X:** Try downgrading PyArrow to 3.0.0.
6976

7077
**ConnectionRefusedError: [Errno 61] Connection refused:** Happens occasionally. Try restarting the app.
78+
79+
## Development structure
80+
81+
Promptsource was developed as part of the [BigScience project for open research 🌸](https://bigscience.huggingface.co/), a year-long initiative targeting the study of large models and datasets. The goal of the project is to research language models in a public environment outside large technology companies. The project has 600 researchers from 50 countries and more than 250 institutions.
82+
83+
## Citation
84+
85+
If you want to cite P3 or Promptsource, you can use this bibtex:
86+
```bibtex
87+
TODO
88+
```

WIP.md

Lines changed: 286 additions & 0 deletions
@@ -0,0 +1,286 @@
1+
# Which prompts are finalized?
2+
3+
A subset of the prompts in P3 is still *Work in Progress*. For reference, we list the datasets for which prompts have been finalized and the datasets for which prompts may still be modified in the near future. Most modifications will consist of metadata collection, but some will affect the templates themselves.
4+
5+
To facilitate traceability, Promptsource is currently pinned at version `0.1.0`.
6+
7+
# Finalized datasets
8+
9+
|Dataset|Subset (optional)|
10+
|-|-|
11+
|adversarial_qa|dbert|
12+
|adversarial_qa|dbidaf|
13+
|adversarial_qa|droberta|
14+
|adversarial_qa|adversarialQA|
15+
|ag_news||
16+
|ai2_arc|ARC-Challenge|
17+
|ai2_arc|ARC-Easy|
18+
|amazon_polarity||
19+
|anli||
20+
|app_reviews||
21+
|circa||
22+
|cnn_dailymail|3.0.0|
23+
|common_gen||
24+
|coqa||
25+
|cos_e|v1.11|
26+
|cos_e|v1.0|
27+
|cosmos_qa||
28+
|crows_pairs||
29+
|craffel/openai_lambada||
30+
|dbpedia_14||
31+
|dream||
32+
|drop||
33+
|duorc|ParaphraseRC|
34+
|duorc|SelfRC|
35+
|emo||
36+
|gigaword||
37+
|glue|cola|
38+
|glue|mrpc|
39+
|glue|qqp|
40+
|glue|sst2|
41+
|glue|stsb|
42+
|hans||
43+
|hellaswag||
44+
|imdb||
45+
|jeopardy||
46+
|jigsaw_toxicity_pred||
47+
|kilt_tasks|nq|
48+
|lambada||
49+
|mc_taco||
50+
|multi_news||
51+
|nq_open||
52+
|openbookqa|main|
53+
|openbookqa|additional|
54+
|paws|labeled_final|
55+
|paws|labeled_swap|
56+
|paws|unlabeled_final|
57+
|paws-x|en|
58+
|piqa||
59+
|qa_srl||
60+
|qasc||
61+
|quac||
62+
|quail||
63+
|quarel||
64+
|quartz||
65+
|quoref||
66+
|race|high|
67+
|race|middle|
68+
|race|all|
69+
|ropes||
70+
|rotten_tomatoes||
71+
|samsum||
72+
|sciq||
73+
|scitail|snli_format|
74+
|scitail|tsv_format|
75+
|social_i_qa||
76+
|squad_v2||
77+
|super_glue|wsc.fixed|
78+
|super_glue|boolq|
79+
|super_glue|cb|
80+
|super_glue|copa|
81+
|super_glue|multirc|
82+
|super_glue|record|
83+
|super_glue|rte|
84+
|super_glue|wic|
85+
|swag|regular|
86+
|trec||
87+
|trivia_qa|rc|
88+
|tydiqa||
89+
|web_questions||
90+
|wiki_bio||
91+
|wiki_hop|original|
92+
|wiki_qa||
93+
|winobias|*|
94+
|winogender||
95+
|winogrande|winogrande_debiased|
96+
|winogrande|winogrande_l|
97+
|winogrande|winogrande_m|
98+
|winogrande|winogrande_s|
99+
|winogrande|winogrande_xl|
100+
|winogrande|winogrande_xs|
101+
|wiqa||
102+
|xsum||
103+
|yelp_review_full||
104+
|Zaid/coqa_expanded||
105+
|Zaid/quac_expanded||
106+
107+
# Work in Progress datasets
108+
109+
|Dataset|Subset (optional)|
110+
|-|-|
111+
|acronym_identification||
112+
|ade_corpus_v2|Ade_corpus_v2_classification|
113+
|ade_corpus_v2|Ade_corpus_v2_drug_ade_relation|
114+
|ade_corpus_v2|Ade_corpus_v2_drug_dosage_relation|
115+
|aeslc||
116+
|amazon_reviews_multi|en|
117+
|amazon_us_reviews|Wireless_v1_00|
118+
|ambig_qa|light|
119+
|aqua_rat|raw|
120+
|art||
121+
|asnq||
122+
|asset|ratings|
123+
|asset|simplification|
124+
|banking77||
125+
|billsum||
126+
|bing_coronavirus_query_set||
127+
|blended_skill_talk||
128+
|boolq||
129+
|cbt|CN|
130+
|cbt|NE|
131+
|cbt|P|
132+
|cbt|raw|
133+
|cbt|V|
134+
|cc_news||
135+
|climate_fever||
136+
|codah|codah|
137+
|codah|fold_0|
138+
|codah|fold_1|
139+
|codah|fold_2|
140+
|codah|fold_3|
141+
|codah|fold_4|
142+
|commonsense_qa||
143+
|conv_ai||
144+
|conv_ai_2||
145+
|conv_ai_3||
146+
|coqa||
147+
|cord19|metadata|
148+
|covid_qa_castorini||
149+
|craigslist_bargains||
150+
|discofuse|discofuse-sport|
151+
|discofuse|discofuse-wikipedia|
152+
|discovery|discovery|
153+
|docred||
154+
|e2e_nlg_cleaned||
155+
|ecthr_cases|alleged-violation-prediction|
156+
|emotion||
157+
|esnli||
158+
|evidence_infer_treatment|1.1|
159+
|evidence_infer_treatment|2|
160+
|fever|v1.0|
161+
|fever|v2.0|
162+
|financial_phrasebank|sentences_allagree|
163+
|freebase_qa||
164+
|generated_reviews_enth||
165+
|glue|ax|
166+
|glue|mnli|
167+
|glue|mnli_matched|
168+
|glue|mnli_mismatched|
169+
|glue|qnli|
170+
|glue|rte|
171+
|glue|wnli|
172+
|google_wellformed_query||
173+
|guardian_authorship|cross_genre_1|
174+
|guardian_authorship|cross_topic_1|
175+
|guardian_authorship|cross_topic_4|
176+
|guardian_authorship|cross_topic_7|
177+
|gutenberg_time||
178+
|head_qa|en|
179+
|health_fact||
180+
|hlgd||
181+
|hotpot_qa|distractor|
182+
|hotpot_qa|fullwiki|
183+
|humicroedit|subtask-1|
184+
|humicroedit|subtask-2|
185+
|hyperpartisan_news_detection|byarticle|
186+
|hyperpartisan_news_detection|bypublisher|
187+
|jfleg||
188+
|kelm||
189+
|liar||
190+
|limit||
191+
|math_dataset|algebra__linear_1d|
192+
|math_dataset|algebra__linear_1d_composed|
193+
|math_dataset|algebra__linear_2d|
194+
|math_dataset|algebra__linear_2d_composed|
195+
|math_qa||
196+
|mdd|task1_qa|
197+
|mdd|task2_recs|
198+
|mdd|task3_qarecs|
199+
|medical_questions_pairs||
200+
|meta_woz|dialogues|
201+
|mocha||
202+
|movie_rationales||
203+
|multi_nli||
204+
|multi_nli_mismatch||
205+
|multi_x_science_sum||
206+
|mwsc||
207+
|narrativeqa||
208+
|ncbi_disease||
209+
|neural_code_search|evaluation_dataset|
210+
|newspop||
211+
|nlu_evaluation_data||
212+
|numer_sense||
213+
|onestop_english||
214+
|poem_sentiment||
215+
|pubmed_qa|pqa_labeled|
216+
|qa_zre||
217+
|qed||
218+
|quora||
219+
|samsum||
220+
|scan|addprim_jump|
221+
|scan|addprim_turn_left|
222+
|scan|filler_num0|
223+
|scan|filler_num1|
224+
|scan|filler_num2|
225+
|scan|filler_num3|
226+
|scan|length|
227+
|scan|simple|
228+
|scan|template_around_right|
229+
|scan|template_jump_around_right|
230+
|scan|template_opposite_right|
231+
|scan|template_right|
232+
|scicite||
233+
|scientific_papers|arxiv|
234+
|scientific_papers|pubmed|
235+
|scitldr|Abstract|
236+
|selqa|answer_selection_analysis|
237+
|sem_eval_2010_task_8||
238+
|sem_eval_2014_task_1||
239+
|sent_comp||
240+
|sick||
241+
|sms_spam||
242+
|snips_built_in_intents||
243+
|snli||
244+
|species_800||
245+
|spider||
246+
|squad||
247+
|squad_adversarial|AddSent|
248+
|squadshifts|amazon|
249+
|squadshifts|new_wiki|
250+
|squadshifts|nyt|
251+
|sst|default|
252+
|stsb_multi_mt|en|
253+
|subjqa|books|
254+
|subjqa|electronics|
255+
|subjqa|grocery|
256+
|subjqa|movies|
257+
|subjqa|restaurants|
258+
|subjqa|tripadvisor|
259+
|tab_fact|tab_fact|
260+
|tmu_gfm_dataset||
261+
|turk||
262+
|tweet_eval|emoji|
263+
|tweet_eval|emotion|
264+
|tweet_eval|hate|
265+
|tweet_eval|irony|
266+
|tweet_eval|offensive|
267+
|tweet_eval|sentiment|
268+
|tweet_eval|stance_abortion|
269+
|tweet_eval|stance_atheism|
270+
|tweet_eval|stance_climate|
271+
|tweet_eval|stance_feminist|
272+
|tweet_eval|stance_hillary|
273+
|tydiqa|primary_task|
274+
|tydiqa|secondary_task|
275+
|wiki_hop|masked|
276+
|wiki_qa||
277+
|wiki_split||
278+
|winograd_wsc|wsc273|
279+
|winograd_wsc|wsc285|
280+
|xnli|en|
281+
|xquad|xquad.en|
282+
|xquad_r|en|
283+
|yahoo_answers_qa||
284+
|yahoo_answers_topics||
285+
|yelp_polarity||
286+
|zest||
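
A few entries, such as `coqa`, `samsum` and `wiki_qa`, appear in both the finalized and the Work in Progress tables. The sketch below shows one way such overlaps could be detected by parsing the two-column tables into `(dataset, subset)` pairs. The `parse_table` helper is hypothetical and not part of Promptsource, and the two strings are short excerpts of the tables above, not the full lists.

```python
from typing import Set, Tuple


def parse_table(markdown: str) -> Set[Tuple[str, str]]:
    """Return the (dataset, subset) pairs of a two-column markdown table."""
    pairs = set()
    for line in markdown.strip().splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue
        # Drop the outer pipes, then split into the two cells.
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        # Skip malformed rows, the header row and the |-|-| separator row.
        if len(cells) != 2 or cells[0] == "Dataset" or set(cells[0]) <= {"-"}:
            continue
        pairs.add((cells[0], cells[1]))
    return pairs


# Excerpts only: a handful of rows copied from each table.
FINALIZED = """
|Dataset|Subset (optional)|
|-|-|
|coqa||
|glue|cola|
|samsum||
|wiki_qa||
"""

WORK_IN_PROGRESS = """
|Dataset|Subset (optional)|
|-|-|
|boolq||
|coqa||
|glue|rte|
|samsum||
|wiki_qa||
"""

overlap = sorted(parse_table(FINALIZED) & parse_table(WORK_IN_PROGRESS))
print(overlap)  # [('coqa', ''), ('samsum', ''), ('wiki_qa', '')]
```

Comparing full `(dataset, subset)` pairs rather than dataset names alone keeps entries such as `glue`, which has some subsets in each table, from being reported as overlaps.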

assets/promptsource_app.png

91.4 KB