Version 1.3.0
Workbench: interactive visualization
This release contains a new interactive whole-slide visualization tool, Slideflow Workbench. Workbench offers an interactive, real-time preview of whole-slide image processing (background filtering, stain normalization, and ROIs), as well as model predictions, heatmaps, uncertainty, and saliency. In addition to whole-slide images, Workbench supports visualization of StyleGAN-generated images and includes experimental support for generating predictions from a real-time camera feed.
The goal of Workbench is to improve the transparency of whole-slide image processing, make it easy to rapidly generate whole-slide predictions, and provide tools for model troubleshooting. Workbench is both fast and flexible, with an ImGui interface and OpenGL rendering that runs on Linux (x86 and aarch64), macOS, and Windows. It also runs on Raspberry Pi (tested on Pi 4, 4 GB model).
Run Workbench using the workbench.py script in the root repository directory:
python3 workbench.py
or by instantiating a Workbench object:
from slideflow.workbench import Workbench
bench = Workbench()
bench.run()
See the full documentation for examples and more information.
StyleGAN3 support
StyleGAN3 models can now be trained in addition to StyleGAN2 models, with the new submodule slideflow/gan/stylegan3. Training is executed as with StyleGAN2, but with model='stylegan3'. Additional StyleGAN3-specific arguments can be passed to the Project.gan_train() function. For example, to train StyleGAN3 with rotational equivariance:
P.gan_train(
    ...,
    model='stylegan3',
    cfg='stylegan3-r'
)
See the documentation for more information.
Train models directly from slides, without TFRecords
Models can now be trained directly from slides, without extracting tiles into TFRecords, using the argument P.train(..., from_wsi=True). Additional related changes include:
- The sf.io.tensorflow.interleave() function allows interleaving of WSI Tensorflow datasets, without intermediate extraction to TFRecords. The first argument was renamed from tfrecords to paths. WSIs undergo background filtering with Otsu's thresholding only (no grayspace filtering). Dataset balancing/clipping is not supported.
- New sf.WSI.tensorflow() function, which returns a Tensorflow Dataset from the WSI generator.
- New Dataset.slide_manifest() function, which estimates the number of non-background tiles among slides in the dataset.
- New argument lazy_iter for WSI/TMA build_generator, where tile extraction is submitted to the Pool only in batches to prevent high memory usage when multiple slides are extracting tiles simultaneously.
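The batching idea behind lazy_iter can be illustrated without Slideflow. The following is a minimal sketch, not Slideflow's implementation: extract_tile, tile_coordinates, lazy_batched_map, and extract_all are hypothetical names, and a ThreadPool stands in for the worker pool. Work is handed to the pool one batch at a time, so the full task list is never queued up front.

```python
from itertools import islice
from multiprocessing.pool import ThreadPool


def extract_tile(coord):
    """Hypothetical stand-in for tile extraction: returns the coordinate
    together with a fake tile payload."""
    x, y = coord
    return (x, y, f"tile@{x},{y}")


def tile_coordinates(width, height, step):
    """Lazily generate top-left tile coordinates in row-major order."""
    for y in range(0, height, step):
        for x in range(0, width, step):
            yield (x, y)


def lazy_batched_map(pool, func, iterable, batch_size):
    """Yield func(item) for each item, submitting work to the pool one
    batch at a time instead of queueing every task at once."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        # pool.map blocks until the batch is done and preserves order.
        yield from pool.map(func, batch)


def extract_all(width, height, step, batch_size=4, num_threads=2):
    """Extract every tile, keeping at most one batch in flight."""
    with ThreadPool(num_threads) as pool:
        coords = tile_coordinates(width, height, step)
        return list(lazy_batched_map(pool, extract_tile, coords, batch_size))
```

A smaller batch_size lowers peak memory at the cost of more synchronization points; the value of 4 here is arbitrary.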
Heatmap updates
Heatmaps have undergone extensive optimization and feature expansion, with updates including:
- New sf.heatmap.ModelHeatmap allows generating heatmaps from models (tf.keras.Model or torch.nn.Module) already loaded in memory
- Heatmap slide argument now accepts either path to slide (str) or WSI object.
- Adds device argument to sf.Heatmap, for PyTorch backend
- More efficient tile processing during whole-slide predictions/heatmaps for PNG-trained models
- New num_threads and num_processes arguments for sf.Heatmap which allow specifying the type and amount of multithreading/multiprocessing to use during tile extraction from slides. If num_threads is non-zero, a ThreadPool will be used to parallelize tile extraction. If num_processes is non-zero, a multiprocessing Pool will be used.
- Heatmaps can now be generated asynchronously by setting Heatmap(generate=False) then manually calling Heatmap.generate(asynchronous=True). The latter will return the grid (updated asynchronously with predictions) and the heatmap thread.
- Heatmap performance improvement by using vectorized normalizers, when available
- Heatmap logits/uncertainty can be saved with heatmap.save_npz(). heatmap.save() will also automatically export logits/uncertainty in npz format with the filename [slidename].npz.
- Heatmap logits/uncertainty can be loaded with either heatmap.load() or heatmap.load_npz()
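The asynchronous pattern described above (a call that returns a grid plus the thread that fills it) can be sketched in plain Python. This is an illustration of the pattern only, not Slideflow's code: predict_tile and this generate function are hypothetical, and the "predictions" are placeholder numbers.

```python
import threading


def predict_tile(row, col):
    """Hypothetical stand-in for a model prediction on one tile."""
    return (row + col) / 10.0


def generate(rows, cols, asynchronous=False):
    """Fill a rows x cols grid with predictions.

    With asynchronous=True, return (grid, thread) immediately; the grid
    is updated in place by the background thread. Otherwise, block until
    the grid is complete and return it.
    """
    grid = [[None] * cols for _ in range(rows)]

    def worker():
        for r in range(rows):
            for c in range(cols):
                grid[r][c] = predict_tile(r, c)

    if asynchronous:
        thread = threading.Thread(target=worker, daemon=True)
        thread.start()
        return grid, thread
    worker()
    return grid


# The caller can inspect the partially filled grid while the thread runs,
# then join the thread to wait for completion.
grid, thread = generate(2, 3, asynchronous=True)
thread.join()
```

Returning the thread alongside the grid lets the caller decide whether to poll the grid for partial results or to block with join().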
Normalizer optimizations
Several stain normalization improvements were made, including:
- New PyTorch-native Macenko implementation
- Efficiency improvements in Macenko (numpy) normalizer
- Efficiency improvements in Macenko (Tensorflow) normalizer
- Efficiency improvements in Vahadane normalizer
- Default Vahadane normalizer ('vahadane') switched back to SPAMS. Both the scikit-learn and SPAMS implementations are still accessible via 'vahadane_sklearn' and 'vahadane_spams'
- Skip Vahadane normalizer throughput testing during functional testing
Other important updates
- The 'r_squared' metric has been changed from r^2 (square of Pearson correlation coefficient) to R^2 (coefficient of determination), which is the more appropriate metric for determining model strength
- TensorFloat-32 (TF32) is now disabled by default. TF32 can be enabled by passing allow_tf32=True to any function which also accepts the argument mixed_precision
- New Dataset.summary() function provides an informative overview of a Dataset object
- Bespoke QC methods can now be applied to slides with sf.WSI.apply_qc_mask.
- New load_method argument for any project function that loads Tensorflow models allows specifying how models are loaded (either the full model is loaded, or just weights are loaded; see function docstrings for more information).
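The difference between the two 'r_squared' definitions can be seen with a small worked example. The sketch below uses only standard definitions and plain Python; the function names are illustrative and are not Slideflow functions. Predictions that are perfectly correlated with the targets but systematically offset score r^2 = 1, while R^2 penalizes the offset.

```python
def pearson_r_squared(y_true, y_pred):
    """Square of the Pearson correlation coefficient (r^2)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov ** 2 / (var_t * var_p)


def coefficient_of_determination(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [3.0, 4.0, 5.0, 6.0]  # perfectly correlated, but offset by +2

r2_pearson = pearson_r_squared(y_true, y_pred)        # equals 1.0
r2_det = coefficient_of_determination(y_true, y_pred)  # negative, about -2.2
```

Because r^2 is invariant to shifting and rescaling the predictions, it can report a perfect score for a model whose outputs are systematically wrong; R^2 compares the residuals against a predict-the-mean baseline and can be negative.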
Smaller changes
- SlideMap now supports ParametricUMAP for dimension reduction
- Slight tweak to the ROI filtering algorithm provides slightly more accurate tile extraction from within ROIs
- Allows 'jpeg' slide extension (in addition to 'jpg')
- Can now generate saliency maps by ID, with the following syntax: SaliencyMap.get(img, method=sf.grad.VANILLA)
- New function _VIPSWrapper.read_from_pyramid() allows for efficient reading from a VIPS image using the best available downsample layer
- New sf.util.get_gan_config() utility function
- Error checks for mismatched img_format for evaluate() and train()
- Allow configuring a Project with sources equal to a string (rather than only list of str)
- Disable UQ during mid-training validation (Tensorflow)
- Change silent argument to verbose in sf.WSI
- Decrease size of threadpool from 16 -> 8 for Dataset.load_indices()
- Decrease default chunk_size from 16 -> 1 for PyTorch interleave
- Rename preload_factor argument to prefetch_factor for PyTorch interleave_dataloader, to match PyTorch syntax
- pin_memory now defaults to False for PyTorch dataloaders
- persistent_workers now defaults to False for PyTorch dataloaders
- TestSuite accepts tile_px argument to manually set tile size
- Improvements to tile extraction progress bar
- Rename sf.stats.eval_from_dataset -> .eval_dataset()
- Rename sf.stats.predict_from_dataset -> .predict_dataset()
- Make sf.model.tensorflow.eval_from_model() public, as with ...torch.eval_from_model(), and .predict_from_model() for both backends
- Overhaul of Tensorflow evaluate/predict metrics functions with removal of unused arguments (e.g. pred_args) and removal of redundant code.
- Tensorflow training no longer instantiates multiple redundant validation datasets, and now instead only uses one validation dataset.
- Multiprocessing/pickling support for sf.WSI
Bug fixes
- Fixes for UncertaintyInterface for Tensorflow when using classmethod .from_model().
- Fixes tile extraction for JPEG slides.
- Fixes overflow when calculating loss during evaluation (Tensorflow)
- Fixes PyTorch models saving without .zip extension
- Fixes overflow error when calculating loss during evaluation/predictions in Tensorflow backend
- Fixes bug where preserved site cross validation sometimes hangs when using Pyomo/Bonmin
Known issues
- Progress bars sometimes prevent exiting via KeyboardInterrupt
- from_wsi not implemented for PyTorch backend
- Heatmaps do not work for multi-outcome models