
Commit ce6566a

Merge pull request #19 from nttcslab/2024t2/v3.x.x
Made DCASE2024 Task2 legacy
2 parents 68e8f1c + 9f98db6 commit ce6566a

23 files changed: +10314 additions, −39 deletions

README.md

Lines changed: 48 additions & 20 deletions
@@ -128,6 +128,16 @@ We will launch the datasets in three stages. Therefore, please download the data
  + section\_00\_test\_0000.wav
  + ...
  + section\_00\_test\_0199.wav
+ + /test_rename (convert from test directory using `tools/rename.py`)
+   + /section\_00\_source\_test\_normal\_\<0000\~0200\>\_\<attribute\>.wav
+   + ...
+   + /section\_00\_source\_test\_anomaly\_\<0000\~0200\>\_\<attribute\>.wav
+   + ...
+   + /section\_00\_target\_test\_normal\_\<0000\~0200\>\_\<attribute\>.wav
+   + ...
+   + /section\_00\_target\_test\_anomaly\_\<0000\~0200\>\_\<attribute\>.wav
+   + ...
+ + attributes\_00.csv (attributes CSV for section 00)
  + \<machine\_type1\_of\_additional\_dataset\> (The other machine types have the same directory structure as \<machine\_type0\_of\_additional\_dataset\>/.)

  ### 4. Change parameters
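The renamed files above follow a fixed naming scheme. As an illustration (not part of the repository), a small Python check of that scheme, with the regex inferred from the listing, could look like this:

```python
import re

# Pattern inferred from the directory listing above:
# section_<NN>_<source|target>_test_<normal|anomaly>_<NNNN>_<attribute>.wav
RENAMED = re.compile(
    r"section_(?P<section>\d{2})_"
    r"(?P<domain>source|target)_test_"
    r"(?P<label>normal|anomaly)_"
    r"(?P<index>\d{4})_"
    r"(?P<attribute>.+)\.wav$"
)

def parse_renamed(name):
    """Return the filename's fields as a dict, or None if it does not match."""
    m = RENAMED.match(name)
    return m.groupdict() if m else None
```

Anonymized names such as `section_00_test_0000.wav` do not match, which is one way to tell a `test` directory apart from a `test_rename` directory.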
@@ -238,7 +248,9 @@ After the evaluation dataset for the test is launched, download and unzip it. Mo
  $ 02a_test_2024t2.sh -e
  ```

- Anomaly scores are calculated using the evaluation dataset, i.e., `data/dcase2024t2/eval_data/raw/<machine_type>/test/`. The anomaly scores are stored as CSV files in the directory `results/`. You can submit the CSV files for the challenge. From the submitted CSV files, we will calculate AUC, pAUC, and your ranking.
+ Anomaly scores are calculated using the evaluation dataset, i.e., `data/dcase2024t2/eval_data/raw/<machine_type>/test/`. The anomaly scores are stored as CSV files in the directory `results/`. ~~You can submit the CSV files for the challenge. From the submitted CSV files, we will calculate AUC, pAUC, and your ranking.~~
+
+ If you use the [rename script](./tools/rename_eval_wav.py) to generate the `test_rename` directory, AUC and pAUC are also calculated.

  ### 9.2. Testing with the Selective Mahalanobis mode
@@ -248,7 +260,9 @@ After the evaluation dataset for the test is launched, download and unzip it. Mo
  $ 02b_test_2024t2.sh -e
  ```

- Anomaly scores are calculated using the evaluation dataset, i.e., `data/dcase2024t2/eval_data/raw/<machine_type>/test/`. The anomaly scores are stored as CSV files in the directory `results/`. You can submit the CSV files for the challenge. From the submitted CSV files, we will calculate AUC, pAUC, and your ranking.
+ Anomaly scores are calculated using the evaluation dataset, i.e., `data/dcase2024t2/eval_data/raw/<machine_type>/test/`. The anomaly scores are stored as CSV files in the directory `results/`. ~~You can submit the CSV files for the challenge. From the submitted CSV files, we will calculate AUC, pAUC, and your ranking.~~
+
+ If you use the [rename script](./tools/rename_eval_wav.py) to generate the `test_rename` directory, AUC and pAUC are also calculated.

  ### 10. Summarize results

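The AUC/pAUC computation that the renamed files enable can be sketched as follows. This is an illustration, not the repository's implementation: labels are inferred from the renamed file names, and `max_fpr=0.1` is assumed here to mirror the p = 0.1 partial AUC conventionally used in DCASE Task 2.

```python
from sklearn.metrics import roc_auc_score

def auc_and_pauc(file_names, scores, max_fpr=0.1):
    """Compute AUC and pAUC from renamed eval files and their anomaly scores.

    Labels come from the file names: 1 if "anomaly" appears, else 0.
    max_fpr=0.1 is an assumption mirroring the task's p = 0.1 pAUC.
    """
    y_true = [1 if "anomaly" in name else 0 for name in file_names]
    auc = roc_auc_score(y_true, scores)
    pauc = roc_auc_score(y_true, scores, max_fpr=max_fpr)
    return auc, pauc
```

For the official figures, use the linked evaluator repository rather than a sketch like this.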
@@ -268,7 +282,7 @@ If you want to change, summarize results directory or export directory, edit `03

  ## Legacy support

- This version takes the legacy datasets provided in DCASE2020 task2, DCASE2021 task2, DCASE2022 task2, and DCASE2023 task2 dataset for inputs.
+ This version takes the legacy datasets provided in DCASE2020 task2, DCASE2021 task2, DCASE2022 task2, DCASE2023 task2, and DCASE2024 task2 as inputs.
  The legacy support scripts are similar to the main scripts; they are in the `tools` directory.

  [learn more](README_legacy.md)
@@ -297,6 +311,18 @@ We developed and tested the source code on Ubuntu 20.04.4 LTS.
  - fasteners == 0.18

  ## Change Log
+ ### [3.3.0](https://github.com/nttcslab/dcase2023_task2_baseline_ae/releases/tag/v3.3.0)
+
+ #### Made DCASE2024 Task2 legacy
+
+ - Added a link to the DCASE2024 task2 evaluator that calculates the official score.
+   - [dcase2024_task2_evaluator](https://github.com/nttcslab/dcase2024_task2_evaluator)
+ - Added DCASE2024 Task2 Ground Truth data.
+   - [DCASE2024 task2 Ground truth data](datasets/eval_data_list_2024.csv)
+ - Added DCASE2024 Task2 Ground truth attributes.
+   - [DCASE2024 task2 Ground truth Attributes](datasets/ground_truth_attributes)
+ - The legacy script has been updated to be compatible with DCASE2024 Task2.
+
  ### [3.2.3](https://github.com/nttcslab/dcase2023_task2_baseline_ae/releases/tag/v3.2.3)

  #### Updated citation in README
@@ -372,37 +398,39 @@ We developed and tested the source code on Ubuntu 20.04.4 LTS.

  - Provides support for the legacy datasets used in DCASE2020, 2021, 2022, and 2023.

- ## Truth attribute of evaluation data

- ### Public ground truth
+ ## Ground truth attribute
+
+ ### Public ground truth of evaluation dataset

  The following code was used to calculate the official score. It also contains the ground truth for the evaluation datasets.

- - [dcase2023_task2_evaluator](https://github.com/nttcslab/dcase2023_task2_evaluator)
+ - [dcase2024_task2_evaluator](https://github.com/nttcslab/dcase2024_task2_evaluator)

- ### In this repository
+ ### Ground truth for evaluation datasets in this repository

  This repository contains the evaluation data's ground-truth CSV, which is used to rename the evaluation datasets.
  You can calculate AUC and other scores once the ground truth is added to the evaluation datasets' file names. *Usually, the rename function is executed along with the [download script](#description) and the [auto download function](#41-enable-auto-download-dataset).

- - [DCASE2023 task2](datasets/eval_data_list_2023.csv)
-
-
- ## Truth attribute of evaluation data
-
- ### Public ground truth
+ - [DCASE2024 task2 ground truth](datasets/eval_data_list_2024.csv)

- The following code was used to calculate the official score. Among these is evaluation datasets ground truth.
-
- - [dcase2023_task2_evaluator](https://github.com/nttcslab/dcase2023_task2_evaluator)
+ ### Ground truth attributes

- ### In this repository
+ Attribute information is hidden by default for the following machine types:

- This repository have evaluation data's ground truth csv. this csv is using to rename evaluation datasets.
- You can calculate AUC and other score if add ground truth to evaluation datasets file name. *Usually, rename function is executed along with [download script](#description) and [auto download function](#41-enable-auto-download-dataset).
+ - dev data
+   - gearbox
+   - slider
+   - ToyTrain
+ - eval data
+   - AirCompressor
+   - BrushlessMotor
+   - HoveringDrone
+   - ToothBrush

- - [DCASE2023 task2](datasets/eval_data_list_2023.csv)
+ You can view the hidden attributes in the following directory:

+ - [DCASE2024 task2 Ground truth Attributes](datasets/ground_truth_attributes)

  ## Citation

README_legacy.md

Lines changed: 123 additions & 4 deletions
@@ -1,6 +1,6 @@
  # Legacy support

- This version supports reading the datasets from DCASE2020 task2, DCASE2021 task2, DCASE2022 task2 and DCASE2023 task2 dataset for inputs.
+ This version supports reading the datasets from DCASE2020 task2, DCASE2021 task2, DCASE2022 task2, DCASE2023 task2, and DCASE2024 task2 as inputs.

  ## Description

@@ -20,6 +20,9 @@ Legacy-support scripts are similar to the main scripts. These are in `tools` dir
  - tools/data\_download\_2023.sh
    - This script downloads development data and evaluation data files and puts them into `data/dcase2023t2/dev_data/raw/` and `data/dcase2023t2/eval_data/raw/`.
    - Rename evaluation data after downloading the dataset to evaluate and calculate AUC score. Renamed data is stored in `data/dcase2023t2/eval_data/raw/test_rename`
+ - tools/data\_download\_2024.sh
+   - This script downloads development data and evaluation data files and puts them into `data/dcase2024t2/dev_data/raw/` and `data/dcase2024t2/eval_data/raw/`.
+   - Rename evaluation data after downloading the dataset to evaluate and calculate AUC score. Renamed data is stored in `data/dcase2024t2/eval_data/raw/test_rename`


  - tools/01\_train\_legacy.sh
@@ -43,6 +46,11 @@ Legacy-support scripts are similar to the main scripts. These are in `tools` dir
      - This script trains a model for each machine type for each section ID by using the directory `data/dcase2023t2/dev_data/raw/<machine_type>/train/<section_id>`
    - "Evaluation" mode:
      - This script trains a model for each machine type for each section ID by using the directory `data/dcase2023t2/eval_data/raw/<machine_type>/train/<section_id>`.
+ - DCASE2024 task2 mode:
+   - "Development" mode:
+     - This script trains a model for each machine type for each section ID by using the directory `data/dcase2024t2/dev_data/raw/<machine_type>/train/<section_id>`
+   - "Evaluation" mode:
+     - This script trains a model for each machine type for each section ID by using the directory `data/dcase2024t2/eval_data/raw/<machine_type>/train/<section_id>`.


  - tools/02a\_test\_legacy.sh (Use MSE as a score function for the Simple Autoencoder mode)
@@ -82,6 +90,15 @@ Legacy-support scripts are similar to the main scripts. These are in `tools` dir
      - This script generates a CSV file for each section, including the anomaly scores for each wav file in the directories `data/dcase2023t2/eval_data/raw/<machine_type>/test/`. (These directories will be made available with the "evaluation dataset".)
      - The generated CSV files are stored in the directory `results/`.
      - If `test_rename` directory is available, this script generates a CSV file including AUC, pAUC, precision, recall, and F1-score for each section.
+ - DCASE2024 task2 mode:
+   - "Development" mode:
+     - This script generates a CSV file for each section, including the anomaly scores for each wav file in the directories `data/dcase2024t2/dev_data/raw/<machine_type>/test/`.
+     - The generated CSV files will be stored in the directory `results/`.
+     - It also generates a CSV file including AUC, pAUC, precision, recall, and F1-score for each section.
+   - "Evaluation" mode:
+     - This script generates a CSV file for each section, including the anomaly scores for each wav file in the directories `data/dcase2024t2/eval_data/raw/<machine_type>/test/`. (These directories will be made available with the "evaluation dataset".)
+     - The generated CSV files are stored in the directory `results/`.
+     - If `test_rename` directory is available, this script generates a CSV file including AUC, pAUC, precision, recall, and F1-score for each section.

  - tools/02b\_test\_legacy.sh (Use Mahalanobis distance as a score function for the Selective Mahalanobis mode)
    - "Development" mode:
@@ -110,6 +127,15 @@ Legacy-support scripts are similar to the main scripts. These are in `tools` dir
      - This script generates a CSV file for each section, including the anomaly scores for each wav file in the directories `data/dcase2023t2/eval_data/raw/<machine_type>/test/`. (These directories will be made available with the "evaluation dataset".)
      - The generated CSV files are stored in the directory.
      - This script also generates a CSV file, containing AUC, pAUC, precision, recall, and F1-score for each section.
+ - DCASE2024 task2 mode:
+   - "Development" mode:
+     - This script generates a CSV file for each section, including the anomaly scores for each wav file in the directories `data/dcase2024t2/dev_data/raw/<machine_type>/test/`.
+     - The CSV files will be stored in the directory `results/`.
+     - It also generates a CSV file including AUC, pAUC, precision, recall, and F1-score for each section.
+   - "Evaluation" mode:
+     - This script generates a CSV file for each section, including the anomaly scores for each wav file in the directories `data/dcase2024t2/eval_data/raw/<machine_type>/test/`. (These directories will be made available with the "evaluation dataset".)
+     - The generated CSV files are stored in the directory.
+     - This script also generates a CSV file containing AUC, pAUC, precision, recall, and F1-score for each section.
  - 03_summarize_results.sh
    - This script summarizes results into a CSV file.
    - Use it in the same way as when summarizing DCASE2023T2 and DCASE2024T2 results.
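The two score functions named in the script descriptions above (MSE for the Simple Autoencoder mode, Mahalanobis distance for the Selective Mahalanobis mode) can be illustrated with a minimal sketch. This is simplified, not the baseline's actual code; the baseline operates on spectrogram features and statistics fitted on training data.

```python
import numpy as np

def mse_score(x, x_hat):
    """Simple Autoencoder mode: mean squared reconstruction error
    of one clip's feature vector."""
    return float(np.mean((x - x_hat) ** 2))

def mahalanobis_score(e, mean, cov_inv):
    """Selective Mahalanobis mode: Mahalanobis distance of the
    reconstruction-error vector e from the training-error
    distribution (mean, inverse covariance)."""
    d = e - mean
    return float(np.sqrt(d @ cov_inv @ d))
```

With an identity covariance the Mahalanobis score reduces to the Euclidean norm, which is one quick sanity check for an implementation.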
@@ -147,6 +173,13 @@ Legacy scripts in `tools` directory can be executed regardless of the current di
    + Download "eval\_data_<machine_type>_train.zip" from [https://zenodo.org/record/7830345](https://zenodo.org/record/7830345).
  + "Evaluation Dataset", i.e., the evaluation dataset for test
    + Download "eval\_data_<machine_type>_test.zip" from [https://zenodo.org/record/7860847](https://zenodo.org/record/7860847).
+ + DCASE2024T2
+   + "Development Dataset"
+     + Download "dev\_data_<machine_type>.zip" from [https://zenodo.org/records/10902294](https://zenodo.org/records/10902294).
+   + "Additional Training Dataset", i.e., the evaluation dataset for training
+     + Download "eval\_data_<machine_type>_train.zip" from [https://zenodo.org/records/11259435](https://zenodo.org/records/11259435).
+   + "Evaluation Dataset", i.e., the evaluation dataset for test
+     + Download "eval\_data_<machine_type>_test.zip" from [https://zenodo.org/records/11363076](https://zenodo.org/records/11363076).


  ### 3. Unzip the downloaded files and make the directory structure as follows:
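The DCASE2024T2 download entries above follow a single URL pattern per Zenodo record. A small helper that builds the development-data URLs might look like the sketch below; the machine-type list comes from the development-dataset tree shown later in this README, but the Zenodo record is authoritative.

```python
# Build DCASE2024T2 development-data URLs from the Zenodo pattern above.
BASE = "https://zenodo.org/records/10902294/files"

# Machine types as listed for the dev dataset; see the Zenodo record
# for the authoritative set.
DEV_MACHINE_TYPES = ["bearing", "fan", "gearbox", "slider",
                     "ToyCar", "ToyTrain", "valve"]

def dev_data_urls(machine_types=DEV_MACHINE_TYPES, base=BASE):
    """Return one dev_data zip URL per machine type."""
    return [f"{base}/dev_data_{m}.zip" for m in machine_types]
```

Feeding each URL to `wget` and unzipping in place reproduces what `tools/data_download_2024.sh` does for the development stage.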
@@ -188,6 +221,7 @@ $ bash tools/01_train.sh DCASE2020T2 -d
  - `DCASE2021T2`
  - `DCASE2022T2`
  - `DCASE2023T2`
+ - `DCASE2024T2`
  - Second parameters
  - `-d`
  - `-e`
@@ -209,6 +243,7 @@ $ bash tools/02a_test_legacy.sh DCASE2020T2 -d
  - `DCASE2021T2`
  - `DCASE2022T2`
  - `DCASE2023T2`
+ - `DCASE2024T2`
  - Second parameters
  - `-d`
  - `-e`
@@ -228,6 +263,7 @@ $ bash tools/02b_test_legacy.sh DCASE2020T2 -d
  - `DCASE2021T2`
  - `DCASE2022T2`
  - `DCASE2023T2`
+ - `DCASE2024T2`
  - Second parameters
  - `-d`
  - `-e`
@@ -574,18 +610,82 @@ Note that the wav file's parent directory. At that time dataset directory is `de
  - /ToyTank
  - /Vacuum

- ## Truth attribute of evaluation data
+ ### DCASE2024 task2
+ - dcase2023\_task2\_baseline\_ae
+   - /data/dcase2024t2/dev\_data/raw
+     - /bearing
+       - /train (only normal clips)
+         - /section\_00\_source\_train\_normal\_0001\_\<attribute\>.wav
+         - ...
+         - /section\_00\_source\_train\_normal\_0990\_\<attribute\>.wav
+         - /section\_00\_target\_train\_normal\_0001\_\<attribute\>.wav
+         - ...
+         - /section\_00\_target\_train\_normal\_0010\_\<attribute\>.wav
+       - test/
+         - /section\_00\_source\_test\_normal\_0001\_\<attribute\>.wav
+         - ...
+         - /section\_00\_source\_test\_normal\_0050\_\<attribute\>.wav
+         - /section\_00\_source\_test\_anomaly\_0001\_\<attribute\>.wav
+         - ...
+         - /section\_00\_source\_test\_anomaly\_0050\_\<attribute\>.wav
+         - /section\_00\_target\_test\_normal\_0001\_\<attribute\>.wav
+         - ...
+         - /section\_00\_target\_test\_normal\_0050\_\<attribute\>.wav
+         - /section\_00\_target\_test\_anomaly\_0001\_\<attribute\>.wav
+         - ...
+         - /section\_00\_target\_test\_anomaly\_0050\_\<attribute\>.wav
+       - attributes\_00.csv (attributes CSV for section 00)
+     - /fan (The other machine types have the same directory structure as fan.)
+     - /gearbox
+     - /slider (`slider` means "slide rail")
+     - /ToyCar
+     - /ToyTrain
+     - /valve
+   - /data/dcase2024t2/eval\_data/raw/
+     - /3DPrinter
+       - /train (after launch of the additional training dataset)
+         - /section\_00\_source\_train\_normal\_0001\_\<attribute\>.wav
+         - ...
+         - /section\_00\_source\_train\_normal\_0990\_\<attribute\>.wav
+         - /section\_00\_target\_train\_normal\_0001\_\<attribute\>.wav
+         - ...
+         - /section\_00\_target\_train\_normal\_0010\_\<attribute\>.wav
+       - /test (after launch of the evaluation dataset)
+         - /section\_00\_test\_0001.wav
+         - ...
+         - /section\_00\_test\_0200.wav
+       - /test_rename (convert from test directory using `tools/rename.py`)
+         - /section\_00\_source\_test\_normal\_\<0000\~0200\>\_\<attribute\>.wav
+         - ...
+         - /section\_00\_source\_test\_anomaly\_\<0000\~0200\>\_\<attribute\>.wav
+         - ...
+         - /section\_00\_target\_test\_normal\_\<0000\~0200\>\_\<attribute\>.wav
+         - ...
+         - /section\_00\_target\_test\_anomaly\_\<0000\~0200\>\_\<attribute\>.wav
+         - ...
+       - attributes\_00.csv (attributes CSV for section 00)
+     - /AirCompressor
+     - /BrushlessMotor
+     - /HairDryer
+     - /HoveringDrone
+     - /RoboticArm
+     - /Scanner
+     - /ToothBrush
+     - /ToyCircuit
+
+ ## Ground truth attribute

- ### Public ground truth
+ ### Public ground truth of evaluation dataset

  The following code was used to calculate the official score. It also contains the ground truth for the evaluation datasets.

  - [dcase2020_task2_evaluator](https://github.com/y-kawagu/dcase2020_task2_evaluator)
  - [dcase2021_task2_evaluator](https://github.com/y-kawagu/dcase2021_task2_evaluator)
  - [dcase2022_task2_evaluator](https://github.com/Kota-Dohi/dcase2022_evaluator)
  - [dcase2023_task2_evaluator](https://github.com/nttcslab/dcase2023_task2_evaluator)
+ - [dcase2024_task2_evaluator](https://github.com/nttcslab/dcase2024_task2_evaluator)

- ### In this repository
+ ### Ground truth for evaluation datasets in this repository

  This repository contains the evaluation data's ground-truth CSV, which is used to rename the evaluation datasets.
  You can calculate AUC and other scores once the ground truth is added to the evaluation datasets' file names. *Usually, the rename function is executed along with the [download script](#description) and the [auto download function](#41-enable-auto-download-dataset).
@@ -594,7 +694,26 @@ You can calculate AUC and other score if add ground truth to evaluation datasets
  - [DCASE2021 task2](datasets/eval_data_list_2021.csv)
  - [DCASE2022 task2](datasets/eval_data_list_2022.csv)
  - [DCASE2023 task2](datasets/eval_data_list_2023.csv)
+ - [DCASE2024 task2](datasets/eval_data_list_2024.csv)
+
+ ### Ground truth attributes
+
+ Attribute information is hidden by default for the following machine types:
+
+ - DCASE2024 Task2
+   - dev data
+     - gearbox
+     - slider
+     - ToyTrain
+   - eval data
+     - AirCompressor
+     - BrushlessMotor
+     - HoveringDrone
+     - ToothBrush
+
+ You can view the hidden attributes in the following directory:

+ - [DCASE2024 task2 Ground truth Attributes](datasets/ground_truth_attributes)

  ## Citation

data_download_2024eval.sh

Lines changed: 2 additions & 0 deletions
@@ -19,3 +19,5 @@ wget "https://zenodo.org/records/11363076/files/eval_data_${machine_type}_test.z
  unzip "eval_data_${machine_type}_test.zip"
  done

+ # Adds reference labels to test data.
+ python ${ROOT_DIR}/tools/rename_eval_wav.py --dataset_parent_dir=${parent_dir} --dataset_type=DCASE2024T2
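The rename step invoked above can be sketched as follows. This is a simplified illustration, not `tools/rename_eval_wav.py` itself; in particular, the two-column row layout (anonymized name, ground-truth name) is an assumption about the `eval_data_list_2024.csv` format.

```python
import csv
import os

def rename_from_list(list_csv, test_dir, out_dir):
    """Rename anonymized test files to their ground-truth names.

    Assumes CSV rows of the form: anonymized_name, ground_truth_name
    (single-column rows, e.g. machine-type headers, are skipped).
    The real CSV layout may differ; see tools/rename_eval_wav.py.
    """
    os.makedirs(out_dir, exist_ok=True)
    with open(list_csv, newline="") as f:
        for row in csv.reader(f):
            if len(row) < 2 or not row[0].endswith(".wav"):
                continue  # skip machine-type header rows etc.
            src = os.path.join(test_dir, row[0])
            dst = os.path.join(out_dir, row[1])
            if os.path.exists(src):
                os.replace(src, dst)
```

The renamed copies land in a `test_rename`-style directory, after which the test scripts can compute AUC and pAUC.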
