Version 1.3.0

@jamesdolezal jamesdolezal released this 10 Oct 20:29
· 1395 commits to master since this release

Workbench: interactive visualization

This release contains a new interactive whole-slide visualization tool, Slideflow Workbench. Workbench offers an interactive, real-time preview of whole-slide image processing (background filtering, stain normalization, and ROIs), as well as model predictions, heatmaps, uncertainty, and saliency. In addition to whole-slide images, Workbench supports visualization of StyleGAN-generated images, and includes experimental support for generating predictions from a real-time camera feed.

The goal of Workbench is to improve transparency of whole-slide image processing, make it easy to rapidly generate whole-slide predictions, and provide tools for model troubleshooting. Workbench is both fast and flexible, with an ImGui interface and OpenGL rendering that runs on Linux (x86 and aarch64), macOS, and Windows. It also runs on Raspberry Pi (tested on Pi 4, 4 GB model).

Run Workbench using the workbench.py script in the root directory of the repository:

python3 workbench.py

or by instantiating a Workbench object:

from slideflow.workbench import Workbench

bench = Workbench()
bench.run()

See the full documentation for examples and more information.

StyleGAN3 support

StyleGAN3 models can now be trained in addition to StyleGAN2 models, with the new submodule slideflow/gan/stylegan3. Training is executed as with StyleGAN2, but with model='stylegan3'. Additional StyleGAN3-specific arguments can be passed to the Project.gan_train() function. For example, to train StyleGAN3 with rotational equivariance:

P.gan_train(
    ...,
    model='stylegan3',
    cfg='stylegan3-r'
)

See the documentation for more information.

Train models directly from slides, without TFRecords

Models can now be trained directly from slides, without extracting tiles into TFRecords, using the argument P.train(..., from_wsi=True). Additional related changes include:

  • The sf.io.tensorflow.interleave() function allows interleaving of WSI Tensorflow datasets, without intermediate extraction to TFRecords. The first argument was renamed from tfrecords to paths. WSIs undergo background filtering with Otsu's thresholding only (no grayspace filtering). Dataset balancing/clipping is not supported.
  • New sf.WSI.tensorflow() function which returns a Tensorflow Dataset from the WSI generator.
  • New Dataset.slide_manifest() function estimates number of non-background tiles among slides in the dataset.
  • New argument lazy_iter for WSI/TMA build_generator, where tile extraction tasks are submitted to the Pool only in batches, preventing high memory usage when multiple slides are extracting tiles simultaneously.
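
The batching idea behind lazy_iter can be illustrated with a minimal, generic sketch. This is not Slideflow's implementation: extract_tile() and lazy_batched_map() are hypothetical names introduced here, and a ThreadPool stands in for whichever pool is actually used. The point is only that tasks are handed to the pool one batch at a time, so pending work does not accumulate in memory.

```python
from itertools import islice
from multiprocessing.pool import ThreadPool


def extract_tile(coord):
    """Hypothetical stand-in for reading one tile at an (x, y) coordinate."""
    x, y = coord
    return (x, y, x * y)


def lazy_batched_map(pool, func, items, batch_size):
    """Yield results in order, submitting at most `batch_size` tasks at a time."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        # Only this batch is queued in the pool; later items stay in `iterator`.
        yield from pool.map(func, batch)


# A generator of coordinates, so the full task list is never materialized.
coords = ((x, y) for x in range(4) for y in range(3))

with ThreadPool(4) as pool:
    tiles = list(lazy_batched_map(pool, extract_tile, coords, batch_size=5))
```

A smaller batch_size lowers peak memory at the cost of more synchronization points between batches.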

Heatmap updates

Heatmaps have undergone extensive optimization and feature expansion, with updates including:

  • New sf.heatmap.ModelHeatmap allows generating heatmaps from models (tf.keras.Model or torch.nn.Module) already loaded in memory
  • Heatmap slide argument now accepts either a path to a slide (str) or a WSI object.
  • Adds device argument to sf.Heatmap, for PyTorch backend
  • More efficient tile processing during whole-slide predictions/heatmaps for PNG-trained models
  • New num_threads and num_processes arguments for sf.Heatmap which allow specifying the type and amount of multithreading/multiprocessing to use during tile extraction from slides. If num_threads is non-zero, a ThreadPool will be used to parallelize tile extraction. If num_processes is non-zero, a multiprocessing Pool will be used.
  • Heatmaps can now be generated asynchronously by setting Heatmap(generate=False) then manually calling Heatmap.generate(asynchronous=True). The latter will return the grid (updated asynchronously with predictions) and the heatmap thread.
  • Heatmap performance improvement by using vectorized normalizers, when available
  • Heatmap logits/uncertainty can be saved with heatmap.save_npz(). heatmap.save() will also automatically export logits/uncertainty in npz format with the filename [slidename].npz.
  • Heatmap logits/uncertainty can be loaded with either heatmap.load() or heatmap.load_npz()
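
The asynchronous pattern described above (a grid returned immediately and filled in by a background thread) can be sketched generically as follows. This is an illustration under stated assumptions, not the Heatmap.generate() implementation: generate_async() and predict_tile() are hypothetical names, and the "prediction" is a placeholder value.

```python
import threading

import numpy as np


def predict_tile(row, col):
    """Hypothetical stand-in for a model prediction on one tile."""
    return float(row * 10 + col)


def generate_async(shape):
    """Start filling a prediction grid in a background thread.

    Returns the grid (initially all NaN) and the worker thread, so the
    caller can display partial results while predictions are computed.
    """
    grid = np.full(shape, np.nan, dtype=np.float32)

    def worker():
        for row in range(shape[0]):
            for col in range(shape[1]):
                grid[row, col] = predict_tile(row, col)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return grid, thread


grid, thread = generate_async((3, 4))
thread.join()  # Block until every tile has a prediction.
```

Until the thread finishes, cells that have not yet been predicted remain NaN, which lets a viewer distinguish pending tiles from completed ones.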

Normalizer optimizations

Several stain normalization improvements were made, including:

  • New PyTorch-native Macenko implementation
  • Efficiency improvements in Macenko (numpy) normalizer
  • Efficiency improvements in Macenko (Tensorflow) normalizer
  • Efficiency improvements in Vahadane normalizer
  • Default Vahadane normalizer ('vahadane') switched back to SPAMS. Both the scikit-learn and SPAMS implementations are still accessible via 'vahadane_sklearn' and 'vahadane_spams', respectively
  • Skip Vahadane normalizer throughput testing during functional testing

Other important updates

  • The 'r_squared' metric has been changed from r^2 (square of Pearson correlation coefficient) to R^2 (coefficient of determination), which is the more appropriate metric for determining model strength
  • TensorFloat-32 (TF32) is now disabled by default. TF32 can be enabled by passing allow_tf32=True to any function which also accepts the argument mixed_precision
  • New Dataset.summary() function provides an informative overview of a Dataset object
  • Bespoke QC methods can now be applied to slides with sf.WSI.apply_qc_mask.
  • New load_method argument for any project function that loads Tensorflow models allows specifying how models are loaded (either the full model is loaded, or just weights are loaded; see function docstrings for more information).
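
To illustrate why the change to the 'r_squared' metric matters, the following NumPy sketch compares the two quantities on a toy example. The function names are hypothetical and this is not the Slideflow metrics code. The predictions are perfectly correlated with the targets but carry a constant bias, so the squared Pearson correlation is 1.0 while the coefficient of determination is negative.

```python
import numpy as np


def pearson_r_squared(y_true, y_pred):
    """Square of the Pearson correlation coefficient (r^2)."""
    r = np.corrcoef(y_true, y_pred)[0, 1]
    return r ** 2


def coefficient_of_determination(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = y_true + 2.0  # Perfectly correlated, but biased by a constant offset.

r2_pearson = pearson_r_squared(y_true, y_pred)  # 1.0
r2_determination = coefficient_of_determination(y_true, y_pred)  # 1 - 16/5 = -2.2
```

Because r^2 ignores bias and scale errors, it can report a perfect score for predictions that are systematically wrong, whereas R^2 penalizes them.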

Smaller changes

  • SlideMap now supports ParametricUMAP for dimension reduction
  • Slight tweak to the ROI filtering algorithm provides slightly more accurate tile extraction from within ROIs
  • Allows 'jpeg' slide extension (in addition to 'jpg')
  • Can now generate saliency maps by ID, with the following syntax: SaliencyMap.get(img, method=sf.grad.VANILLA)
  • New function _VIPSWrapper.read_from_pyramid() allows for efficient reading from a VIPS image using the best available downsample layer
  • New sf.util.get_gan_config() utility function
  • Error checks for mismatched img_format for evaluate() and train()
  • Allow configuring a Project with sources equal to a string (rather than only list of str)
  • Disable UQ during mid-training validation (Tensorflow)
  • Change silent argument to verbose in sf.WSI
  • Decrease size of threadpool from 16 -> 8 for Dataset.load_indices()
  • Decrease default chunk_size from 16 -> 1 for PyTorch interleave
  • Rename preload_factor argument to prefetch_factor for PyTorch interleave_dataloader, to match PyTorch syntax
  • pin_memory now defaults to False for PyTorch dataloaders
  • persistent_workers now defaults to False for PyTorch dataloaders
  • TestSuite accepts tile_px argument to manually set tile size
  • Improvements to tile extraction progress bar
  • Rename sf.stats.eval_from_dataset -> .eval_dataset()
  • Rename sf.stats.predict_from_dataset -> .predict_dataset()
  • Make sf.model.tensorflow.eval_from_model() public, as with ...torch.eval_from_model(), and .predict_from_model() for both backends
  • Overhaul of Tensorflow evaluate/predict metrics functions with removal of unused arguments (e.g. pred_args) and removal of redundant code.
  • Tensorflow training no longer instantiates multiple redundant validation datasets, and now uses only one validation dataset.
  • Multiprocessing/pickling support for sf.WSI

Bug fixes

  • Fixes for UncertaintyInterface for Tensorflow when using classmethod .from_model().
  • Fixes tile extraction for JPEG slides.
  • Fixes PyTorch models saving without .zip extension
  • Fixes overflow error when calculating loss during evaluation/predictions in Tensorflow backend
  • Fixes bug where site-preserved cross-validation sometimes hangs when using Pyomo/Bonmin

Known issues

  • Progress bars sometimes prevent exiting via KeyboardInterrupt
  • from_wsi is not implemented for the PyTorch backend
  • Heatmaps do not work for multi-outcome models