Benchmarking Suites
We aim to collect a curated set of benchmarking suites that measure the performance of different methods in different out-of-distribution scenarios. These scenarios may come from domain adaptation, few-shot learning, or zero-shot learning/domain generalization. If you feel that the benchmarking suite or evaluation framework you are developing could fit, please feel free to contact us or submit a pull request with a link to your software to this wiki page. After discussion, we will be happy to share your contribution on this platform.
TBD Definition of Benchmarking Suite
From our point of view, a benchmarking suite consists of the following components (see the sketch after this list):
- datasets
- machine learning models
- possibly adaptation mechanisms to account for out-of-distribution data
- metrics that measure out-of-distribution performance
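A minimal sketch, in Python, of how these four components might be wired together. The names (`BenchmarkSuite`, `evaluate`, and the callable signatures) are illustrative assumptions and do not refer to any existing framework:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class BenchmarkSuite:
    """Hypothetical container for the four components listed above."""
    datasets: Dict[str, object]              # e.g. source/target splits per dataset
    models: Dict[str, Callable]              # model factories
    adaptation_methods: Dict[str, Callable]  # optional mechanisms for out-of-distribution data
    metrics: Dict[str, Callable]             # measures of out-of-distribution performance

    def evaluate(self) -> List[dict]:
        """Run every model x adaptation x metric combination on every dataset."""
        results = []
        for data_name, data in self.datasets.items():
            for model_name, build_model in self.models.items():
                for adapt_name, adapt in self.adaptation_methods.items():
                    model = adapt(build_model(), data)  # adapt the model to the new distribution
                    for metric_name, metric in self.metrics.items():
                        results.append({
                            "dataset": data_name,
                            "model": model_name,
                            "adaptation": adapt_name,
                            "metric": metric_name,
                            "score": metric(model, data),
                        })
        return results
```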
ADATIME
ADATIME is a benchmarking suite designed to evaluate unsupervised domain adaptation (UDA) methods on time series data systematically and fairly. It standardizes neural network architectures and datasets to ensure consistent evaluations across various UDA techniques. The suite includes implementations of 11 state-of-the-art UDA algorithms and introduces realistic model selection approaches that do not rely on labeled target data. Evaluations span five representative datasets, covering 50 cross-domain scenarios. Findings indicate that, with appropriate hyper-parameter tuning, visual domain adaptation methods can perform comparably to those specifically designed for time series data. ADATIME is implemented in PyTorch and is publicly available, providing a valuable resource for advancing domain adaptation in time series analysis.
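To make the notion of "cross-domain scenarios" concrete, here is a hedged sketch of a generic evaluation loop over source/target domain pairs with unlabeled target data during training. The helper names (`load_domain`, `train_uda`, `accuracy`) are assumptions for illustration and do not correspond to ADATIME's actual API:

```python
from itertools import permutations


def evaluate_uda(method, domains, load_domain, train_uda, accuracy):
    """Train a UDA method on each source->target pair and report target accuracy."""
    results = {}
    for source, target in permutations(domains, 2):
        source_data = load_domain(source, labeled=True)
        target_data = load_domain(target, labeled=False)  # no target labels during training
        model = train_uda(method, source_data, target_data)
        # Realistic model selection without labeled target data would plug in here,
        # e.g. via source risk or other unsupervised proxies.
        results[(source, target)] = accuracy(model, load_domain(target, labeled=True))
    return results
```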
Guidelines
TBD. This will contain a list of curated guidelines for the development of evaluation frameworks, with a focus on evaluating performance in out-of-distribution settings.
Contact: ekaterina.kutafina at uni-koeln.de, mayra.elwes at uk-koeln.de