Supported Additional Training Dataset for DCASE2025T2 (#22)
* Updated DCASE2025 Task2
* Update README.md
Fixed some typos
* Previous DCASE scripts have been unified into legacy
* Added support for Additional training dataset
---------
Co-authored-by: Noboru Harada <64912994+noboru2000@users.noreply.github.com>
README.md: 55 additions & 35 deletions
@@ -17,23 +17,23 @@ Differences between the previous dcase2022\_baseline\_ae and this version are as
17
17
This system consists of three main scripts (01_train.sh, 02a_test.sh, and 02b_test.sh) with some helper scripts for DCASE2025T2 (For DCASE2024T2 and DCASE2023T2, see [README_legacy](README_legacy.md)):
- This script trains a model for each machine type for each section ID by using the directory `data/dcase2025t2/dev_data/raw/<machine_type>/train/<section_id>`.
33
-
<!-- - "Evaluation" mode:
34
-
- This script trains a model for each machine type for each section ID by using the directory `data/dcase2025t2/eval_data/raw/<machine_type>/train/<section_id>`.-->
33
+
- "Evaluation" mode: **Newly added!! (2025/05/15)**
34
+
- This script trains a model for each machine type for each section ID by using the directory `data/dcase2025t2/eval_data/raw/<machine_type>/train/<section_id>`.
35
35
36
-
- 02a_test_2025t2.sh (Use MSE as a score function for the Simple Autoencoder mode) **Newly added!! (2025/04/01)**
36
+
- 02a_test_2025t2.sh (Use MSE as a score function for the Simple Autoencoder mode) **Updated on (2025/04/01)**
37
37
- "Development" mode:
38
38
- This script makes a CSV file for each section, including the anomaly scores for each WAV file in the directories `data/dcase2025t2/dev_data/raw/<machine_type>/test/`.
39
39
- The CSV files will be stored in the directory `results/`.
@@ -42,7 +42,7 @@ This system consists of three main scripts (01_train.sh, 02a_test.sh, and 02b_te
42
42
- This script makes a CSV file for each section, including the anomaly scores for each wav file in the directories `data/dcase2025t2/eval_data/raw/<machine_type>/test/`. (These directories will be made available with the "evaluation dataset".)
43
43
- The CSV files are stored in the directory `results/`. -->
44
44
45
-
- 02b_test_2025t2.sh (Use Mahalanobis distance as a score function for the Selective Mahalanobis mode) **Newly added!! (2025/04/01)**
45
+
- 02b_test_2025t2.sh (Use Mahalanobis distance as a score function for the Selective Mahalanobis mode) **Updated on (2025/04/01)**
46
46
- "Development" mode:
47
47
- This script makes a CSV file for each section, including the anomaly scores for each wav file in the directories `data/dcase2025t2/dev_data/raw/<machine_type>/test/`.
48
48
- The CSV files will be stored in the directory `results/`.
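As a rough illustration of the two score functions named above, the sketch below computes a mean squared error between a feature vector and its reconstruction, and a Mahalanobis distance for a reconstruction-error vector. It is not the repository's implementation: the function names, the plain-list inputs, and the pre-computed `precision` (inverse covariance) matrix are assumptions made for this example, and the selection step of the Selective Mahalanobis mode is not shown.

```python
import math


def mse_score(x, x_hat):
    """Mean squared error between an input vector and its reconstruction."""
    if len(x) != len(x_hat):
        raise ValueError("x and x_hat must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def mahalanobis_score(error, mean, precision):
    """Mahalanobis distance of `error` from `mean`.

    `precision` is the inverse covariance matrix, given as a list of rows.
    It is assumed to have been estimated beforehand (hypothetical input).
    """
    d = [e - m for e, m in zip(error, mean)]
    # Quadratic form d^T * precision * d.
    quad = sum(
        d[i] * precision[i][j] * d[j]
        for i in range(len(d))
        for j in range(len(d))
    )
    return math.sqrt(quad)


# With an identity precision matrix, the distance reduces to the Euclidean norm.
print(mse_score([1.0, 2.0], [1.0, 4.0]))
print(mahalanobis_score([3.0, 4.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
```

In both cases a larger value would be read as a more anomalous clip.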
@@ -70,9 +70,9 @@ We will launch the datasets in three stages. Therefore, please download the data
70
70
+ DCASE 2025 Challenge Task 2
71
71
+ "Development Dataset" **New! (2025/04/01)**
72
72
+ Download "dev\_data_<machine_type>.zip" from [https://zenodo.org/records/15097779](https://zenodo.org/records/15097779).
73
-
<!--+ "Additional Training Dataset", i.e., the evaluation dataset for training
74
-
+ Download "eval\_data_<machine_type>_train.zip" from []().
75
-
+ "Evaluation Dataset", i.e., the evaluation dataset for test
73
+
+ "Additional Training Dataset", i.e., the evaluation dataset for training **New! (2025/05/15)**
74
+
+ Download "eval\_data_<machine_type>_train.zip" from [https://zenodo.org/records/15392814](https://zenodo.org/records/15392814).
75
+
<!--+ "Evaluation Dataset", i.e., the evaluation dataset for test
76
76
+ Download "eval\_data_<machine_type>_test.zip" from [](). -->
77
77
78
78
+ DCASE 2024 Challenge Task 2 (for DCASE2024T2, see [README_legacy](README_legacy.md))
@@ -121,32 +121,35 @@ We will launch the datasets in three stages. Therefore, please download the data
121
121
+ ...
122
122
+ section\_00\_target\_test\_anomaly\_0049\_.wav
123
123
+ attributes\_00.csv (attributes CSV for section 00)
124
-
++\<machine\_type1\_of\_additional\_dataset\> (The other machine types have the same directory structure as \<machine\_type0\_of\_additional\_dataset\>/.)
125
-
<!--+ data/dcase2025t2/eval\_data/raw/
124
+
+\<machine\_type1\_of\_additional\_dataset\> (The other machine types have the same directory structure as \<machine\_type0\_of\_additional\_dataset\>/.)
125
+
+ data/dcase2025t2/eval\_data/raw/
126
126
+\<machine\_type0\_of\_additional\_dataset\>/
127
+
+ supplemental/ (after launch of the additional training dataset)
128
+
+ section\_00\_machine\_0001\_.wav
129
+
+ ...
130
+
+ section\_00\_machine\_0100\_.wav
127
131
+ train/ (after launch of the additional training dataset)
128
132
+ section\_00\_source\_train\_normal\_0000\_.wav
129
133
+ ...
130
134
+ section\_00\_source\_train\_normal\_0989\_.wav
131
135
+ section\_00\_target\_train\_normal\_0000\_.wav
132
136
+ ...
133
137
+ section\_00\_target\_train\_normal\_0009\_.wav
138
+
<!-- + test/ (after launch of the evaluation dataset)
139
+
+ section\_00\_test\_0000.wav
140
+
+ ...
141
+
+ section\_00\_test\_0199.wav
142
+
+ test_rename/ (convert from test directory using `tools/rename.py`)
+ attributes\_00.csv (attributes CSV for section 00)
149
-
+ \<machine\_type1\_of\_additional\_dataset\> (The other machine types have the same directory structure as \<machine\_type0\_of\_additional\_dataset\>/.) -->
152
+
+\<machine\_type1\_of\_additional\_dataset\> (The other machine types have the same directory structure as \<machine\_type0\_of\_additional\_dataset\>/.)
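The file names in the `train/` and development `test/` directories listed above follow a regular pattern, so their fields can be recovered with a small parser. A minimal sketch, assuming the pattern `section_<id>_<domain>_<split>_<label>_<index>_<attributes>.wav` inferred from the listing; the function name `parse_wav_name` and the field names are hypothetical. Files that do not follow this pattern, such as those in `supplemental/` or the evaluation `test/` directory, return `None`.

```python
import re

# Pattern inferred from names such as
# "section_00_source_train_normal_0000_.wav" (an assumption, not a spec).
_NAME_RE = re.compile(
    r"^section_(?P<section>\d+)"
    r"_(?P<domain>source|target)"
    r"_(?P<split>train|test)"
    r"_(?P<label>normal|anomaly)"
    r"_(?P<index>\d+)"
    r"_(?P<attributes>.*)\.wav$"
)


def parse_wav_name(name):
    """Return the fields of a WAV file name as a dict, or None if no match."""
    match = _NAME_RE.match(name)
    return match.groupdict() if match else None


print(parse_wav_name("section_00_target_test_anomaly_0049_.wav"))
print(parse_wav_name("section_00_test_0000.wav"))
```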
### 8. Run training script for the additional training dataset (after May 15, 2025)
239
+
### 8. Run training script for the additional training dataset **Newly added!! (2025/05/15)**
237
240
238
-
<!--After the additional training dataset is launched, download and unzip it. Move it to `data/dcase2025t2/eval_data/raw/<machine_type>/train/`. Run the training script `01_train_2025t2.sh` with the option `-e`.
241
+
After the additional training dataset is launched, download and unzip it. Move it to `data/dcase2025t2/eval_data/raw/<machine_type>/train/`. Run the training script `01_train_2025t2.sh` with the option `-e`.
239
242
240
243
```dotnetcli
$ 01_train_2025t2.sh -e
```
243
246
244
-
Models are trained by using the additional training dataset `data/dcase2025t2/raw/eval_data/<machine_type>/train/`.-->
247
+
Models are trained by using the additional training dataset `data/dcase2025t2/eval_data/raw/<machine_type>/train/`.
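Before running `01_train_2025t2.sh -e`, it can help to confirm that each machine type has a `train/` directory in the expected place. A minimal sketch, assuming the layout `data/dcase2025t2/eval_data/raw/<machine_type>/train/` described in this step; the helper name `missing_train_dirs` and the machine type names in the usage line are hypothetical placeholders.

```python
from pathlib import Path


def missing_train_dirs(data_root, machine_types):
    """Return the machine types that have no train/ directory under
    <data_root>/eval_data/raw/<machine_type>/train/."""
    raw_root = Path(data_root) / "eval_data" / "raw"
    return [
        machine_type
        for machine_type in machine_types
        if not (raw_root / machine_type / "train").is_dir()
    ]


# "MachineA" and "MachineB" are placeholder names; substitute the machine
# types of the additional training dataset.
print(missing_train_dirs("data/dcase2025t2", ["MachineA", "MachineB"]))
```

An empty list means every listed machine type is in place.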
245
248
246
249
### 9. Run the test script for the evaluation dataset (after June 1, 2025)
247
250
@@ -271,17 +274,28 @@ If you use [rename script](./tools/rename_eval_wav.py) to generate `test_rename`
271
274
272
275
### 10. Summarize results
273
276
274
-
After the executed `02a_test_2025t2.sh`, `02b_test_2025t2.sh`, or both. Run the summarize script `03_summarize_results.sh` with the option `DCASE2025T2 -d` or `DCASE2025T2 -e`.
277
+
After executing `02a_test_2025t2.sh`, `02b_test_2025t2.sh`, or both, run the summarize script `03_summarize_results.sh` with the option `DCASE2025T2 -d`.
275
278
276
279
```dotnetcli
# Summarize development dataset 2025
$ 03_summarize_results.sh DCASE2025T2 -d
```
280
283
281
-
After the summary, the results are exported in CSV format to `results/dev_data/baseline/summarize/DCASE2025T2` or `results/eval_data/baseline/summarize/DCASE2025T2`.
284
+
After the summary, the results are exported in CSV format to `results/dev_data/baseline/summarize/DCASE2025T2`.
282
285
283
286
If you want to change the summarized results directory or the export directory, edit `03_summarize_results.sh`.
284
287
288
+
<!-- After the executed `02a_test_2025t2.sh`, `02b_test_2025t2.sh`, or both. Run the summarize script `03_summarize_results.sh` with the option `DCASE2025T2 -d` or `DCASE2025T2 -e`.
289
+
290
+
```dotnetcli
291
+
# Summarize development dataset 2025
292
+
$ 03_summarize_results.sh DCASE2025T2 -d
293
+
```
294
+
295
+
After the summary, the results are exported in CSV format to `results/dev_data/baseline/summarize/DCASE2025T2` or `results/eval_data/baseline/summarize/DCASE2025T2`.
296
+
297
+
If you want to change, summarize results directory or export directory, edit `03_summarize_results.sh`. -->
298
+
285
299
## Legacy support
286
300
287
301
This version accepts the legacy datasets provided in DCASE2020 task2, DCASE2021 task2, DCASE2022 task2, DCASE2023 task2, and DCASE2024 task2 as inputs.
@@ -314,6 +328,12 @@ We developed and tested the source code on Ubuntu 22.04.5 LTS.