The Metrics class is a bit of a mess. It should behave more like a Keras History callback, and it should be the object returned by a call to saber.train().
This would make hyperparameter tuning much easier. E.g. a simple script could collect the average value of some metric or loss from the Metrics object returned by saber.train(), and tune against it.
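
A rough sketch of what this could look like, not a proposal for the final API. It assumes a `history` dict mapping metric names to per-epoch lists, which is how Keras `History` stores its values. The `Metrics` methods shown here, the `fake_train` stand-in for saber.train(), and the toy loss values are all hypothetical and exist only to make the example runnable.

```python
from statistics import mean


class Metrics:
    """Hypothetical History-like container: metric name -> per-epoch values."""

    def __init__(self):
        self.history = {}

    def on_epoch_end(self, epoch, logs=None):
        # Append every value reported for this epoch, as Keras History does.
        for name, value in (logs or {}).items():
            self.history.setdefault(name, []).append(value)

    def average(self, name, last_n=None):
        # Mean of a metric over all epochs, or over the final `last_n` epochs.
        values = self.history.get(name, [])
        if not values:
            raise KeyError(f"no values recorded for metric {name!r}")
        return mean(values[-last_n:] if last_n else values)


def fake_train(learning_rate, epochs=3):
    """Stand-in for saber.train() (assumption): returns a filled Metrics object."""
    metrics = Metrics()
    for epoch in range(epochs):
        # Made-up loss: smallest when learning_rate == 0.01, shrinking per epoch.
        loss = abs(learning_rate - 0.01) + 1.0 / (epoch + 1)
        metrics.on_epoch_end(epoch, {"val_loss": loss})
    return metrics


def tune(candidates, metric="val_loss"):
    # Train once per candidate and keep the one with the lowest average metric.
    scores = {lr: fake_train(lr).average(metric) for lr in candidates}
    best = min(scores, key=scores.get)
    return best, scores


best_lr, scores = tune([0.1, 0.01, 0.001])
print(best_lr, scores[best_lr])
```

With something along these lines, the tuning script only needs the returned object and a metric name, and never has to reach into the trainer's internals.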