TorchIO — System & OOP Architecture

Source: https://github.com/TorchIO-project/torchio · Analyzed: 2026-06-11 · Version: 1.2.1 · Type: Library / Package (with a thin CLI surface → mild Hybrid)


TorchIO is a Python library for efficient loading, preprocessing, augmentation, and patch-based sampling of 3D (and 2D/4D) medical images for deep learning with PyTorch. It targets the realities of medical imaging — large volumetric files, physical-space (affine) geometry, multi-modal subjects, and segmentation labels — that generic vision augmentation tools handle poorly.

The mental model is small and composable:

Image ──compose──▶ Subject ──wrap──▶ SubjectsDataset ──sample──▶ Queue/Loader ──▶ PyTorch model
▲                                                    │
└──────────────── Transform pipeline ────────────────┘ (preprocessing + augmentation)

Repo classification — Library / Package (mild Hybrid)

| Signal | Evidence |
| --- | --- |
| Public API meant to be imported | src/torchio/__init__.py curates the tio.* surface (ScalarImage, Subject, transforms, samplers, …) |
| Distribution metadata | pyproject.toml: name = "torchio", src/-layout package, py.typed marker, semver 1.2.1 |
| Defines abstractions for callers | Transform(ABC), Image, PatchSampler base classes |
| Thin runnable surface → Hybrid | three console scripts in pyproject.toml: tiohd, tiotr, torchio-transform |

It is therefore documented primarily as a library (public API, class design, usage flows), with a short note on the CLI.

| Concern | Dependency |
| --- | --- |
| Tensors, Dataset/DataLoader | torch (≥1.9) |
| Medical image I/O | SimpleITK (DICOM, resampling, interpolation) + nibabel (NIfTI) |
| Numerics | numpy, scipy, einops |
| Typing | jaxtyping (shaped tensor hints), py.typed |
| CLI / display | typer, rich, tqdm, humanize |
| Citation / deprecation | duecredit (via external/), deprecated |
| Optional extras | pandas (csv), matplotlib+colorcet (plot), ffmpeg-python (video), scikit-learn (PCA), monai (adapter) |

Who calls TorchIO, and what it depends on.

flowchart LR
    user(["ML researcher / training script"])
    cli(["CLI user (tiohd, tiotr)"])

    subgraph System["TorchIO"]
        core["Public API (tio.*)"]
    end

    disk[("Medical image files<br/>NIfTI / DICOM / etc.")]
    torch["PyTorch<br/>Dataset · DataLoader · Tensor"]
    itk["SimpleITK / NiBabel<br/>(I/O + spatial ops)"]
    repos[("Public dataset hosts<br/>IXI, MNI, MedMNIST, …")]

    user --> System
    cli --> System
    System --> disk
    System --> itk
    System --> torch
    System --> repos
  • Inbound: a training script imports torchio as tio; or a user runs the CLI to inspect/transform a file.
  • Outbound: TorchIO reads/writes images through SimpleITK/NiBabel, hands batches to PyTorch’s DataLoader, and can auto-download built-in datasets.

Top-level packages inside src/torchio/ and the direction of dependencies (arrow = “depends on / imports”).

flowchart TD
    cli["cli/ — Typer apps"]
    transforms["transforms/ — preprocessing + augmentation"]
    datasets["datasets/ — built-in datasets"]
    viz["visualization.py — plotting"]
    data["data/ — Image · Subject · Dataset · Queue · sampler · io"]
    download["download.py — archive fetch"]
    foundation["constants.py · types.py · utils.py"]

    cli --> transforms
    cli --> data
    transforms --> data
    datasets --> data
    datasets --> download
    viz --> data
    data --> foundation
    transforms --> foundation

| Path | Responsibility |
| --- | --- |
| src/torchio/data/ | Core domain: Image/ScalarImage/LabelMap, Subject, SubjectsDataset, Queue, SubjectsLoader, samplers, GridAggregator, I/O (io.py) |
| src/torchio/transforms/ | The transform engine: Transform base + preprocessing & augmentation transforms, Compose/OneOf |
| src/torchio/datasets/ | Ready-made SubjectsDataset subclasses (IXI, MNI templates, MedMNIST3D, RSNA, …) |
| src/torchio/cli/ | apply_transform.py (→ tiotr/torchio-transform), print_info.py (→ tiohd) |
| src/torchio/visualization.py | Matplotlib-based slicing, GIF/video export |
| src/torchio/download.py | Torchvision-style download + MD5 integrity + extract |
| src/torchio/constants.py | Magic strings/keys: INTENSITY, LABEL, DATA, AFFINE, TYPE, PATH, STEM, LOCATION |
| src/torchio/types.py | jaxtyping-based aliases (TypeData, TypeAffineMatrix, spacing/tuple types) |
| src/torchio/utils.py | Helpers (to_tuple, get_stem, collation, conversions) |
| src/torchio/reference.py, external/ | Citation metadata (duecredit), lazy optional imports |

Layering rule of thumb: everything flows down to data/, which flows down to the foundation modules. data/ never imports transforms/ (except for type-only references), keeping the core decoupled from the augmentation engine.


flowchart TD
    subgraph data["data/"]
        image["Image / ScalarImage / LabelMap<br/>(4D tensor + affine, lazy load)"]
        subject["Subject<br/>(dict of Images + history)"]
        dataset["SubjectsDataset<br/>(torch Dataset)"]
        io["io.py<br/>read_image / write_image"]
        queue["Queue<br/>(patch buffer)"]
        loader["SubjectsLoader<br/>(DataLoader)"]

        subgraph sampler["sampler/"]
            patch["PatchSampler"]
            grid["GridSampler"]
        end
        agg["GridAggregator<br/>(inference reassembly)"]
    end

    io --> image
    image --> subject
    subject --> dataset
    dataset --> queue
    patch --> queue
    queue --> loader
    dataset --> loader
    grid --> agg
  • io.py chooses a backend (SimpleITK first, NiBabel fallback) to materialize an Image’s tensor + affine.
  • Queue composes a SubjectsDataset and a PatchSampler to stream patches; SubjectsLoader is the DataLoader that batches Subjects with TorchIO-aware collation.
  • For sliding-window inference, GridSampler enumerates patch locations and GridAggregator stitches predictions back into a full volume.
flowchart TD
    call["transform(input)"]
    parser["DataParser<br/>normalize input → Subject"]
    base["Transform.__call__<br/>(probability · copy · include/exclude)"]
    apply["apply_transform(subject)<br/>(subclass-specific)"]
    history["record in Subject history"]

    call --> parser --> base --> apply --> history --> parser

A transform accepts a Subject, Image, torch.Tensor, np.ndarray, SimpleITK.Image, nibabel image, or dict. DataParser normalizes the input to a Subject, the Transform base class orchestrates the call, the subclass does the work in apply_transform, the applied transform is recorded in the Subject’s history for reproducibility, and the result is converted back to the caller’s original type.
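The normalize-then-restore round trip can be illustrated with a small standard-library sketch. It is a toy stand-in, not TorchIO's DataParser: the function name `apply_with_round_trip`, the `"default_image"` key, and the use of plain lists as "images" are all assumptions made for illustration.

```python
from typing import Any, Callable


def apply_with_round_trip(data: Any, operation: Callable[[dict], dict]) -> Any:
    """Normalize the input to a dict 'subject', apply, then restore the input type."""
    if isinstance(data, dict):
        # Already subject-like: work on a shallow copy and return a dict.
        subject = dict(data)
        restore = lambda s: s
    elif isinstance(data, list):
        # A bare "image" is wrapped under a hypothetical default key, then unwrapped.
        subject = {"default_image": list(data)}
        restore = lambda s: s["default_image"]
    else:
        raise TypeError(f"Unsupported input type: {type(data).__name__}")
    return restore(operation(subject))


def double(subject: dict) -> dict:
    """Example operation that only ever sees the normalized form."""
    return {name: [2 * v for v in values] for name, values in subject.items()}


as_list = apply_with_round_trip([1, 2], double)
as_dict = apply_with_round_trip({"t1": [1, 2]}, double)
```

The operation is written once against the normalized form, while callers get back the same kind of object they passed in.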


5.1 Domain primitives — Image and Subject

Both extend dict so metadata is just key/value data, while protected keys (DATA, AFFINE, TYPE, …) carry the structured payload. Data is lazy-loaded on first access.

classDiagram
    class Image {
        <<dict[str, object]>>
        +data : Tensor (C,W,H,D)
        +affine : 4x4 matrix
        +spatial_shape
        +spacing
        +orientation
        +load() unload() save()
        +as_sitk() as_pil()
    }
    class ScalarImage {
        type = INTENSITY
    }
    class LabelMap {
        type = LABEL
    }
    class Subject {
        <<dict[str, object]>>
        +get_images(intensity_only)
        +add_image() remove_image()
        +check_consistent_space()
        +applied_transforms
        +get_inverse_transform()
    }

    Image <|-- ScalarImage
    Image <|-- LabelMap
    Subject o-- "1..*" Image : composes
  • Pattern — lazy-loading via dict: memory is only spent when .data is touched, so a SubjectsDataset of thousands of volumes stays cheap until __getitem__.
  • Pattern — Composition: a Subject has many Images (e.g. t1, t2, label) and enforces they share physical space.
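The lazy-loading idea can be sketched with a dict subclass that reads its payload only when `.data` is first touched. This is a toy illustration under stated assumptions: `LazyImage`, `fake_reader`, and the lowercase `"data"` / `"path"` keys are hypothetical and do not reproduce TorchIO's Image class.

```python
class LazyImage(dict):
    """Toy dict-based image: metadata is stored eagerly, the payload lazily."""

    def __init__(self, path, reader, **metadata):
        super().__init__(path=path, **metadata)
        self._reader = reader  # callable that performs the expensive read

    @property
    def data(self):
        # Read from "disk" only on first access, then cache under a key.
        if "data" not in self:
            self["data"] = self._reader(self["path"])
        return self["data"]


reads = []


def fake_reader(path):
    """Stand-in for a real file reader; records every call."""
    reads.append(path)
    return [0.0, 1.0, 2.0]


image = LazyImage("t1.nii.gz", fake_reader, modality="T1")
loaded_before_access = "data" in image  # nothing has been read yet
first = image.data   # triggers the single read
second = image.data  # served from the cached key
```

Constructing many such objects costs only their metadata; the reader runs once per image, and only for images that are actually used.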

5.2 The transform hierarchy — base, mixins, and the random/deterministic pair

classDiagram
    class Transform {
        <<abstract>>
        +__call__(data) data
        +apply_transform(subject)* Subject
        +inverse() Transform
        +is_invertible() bool
        #p, copy, include, exclude
        #args_names
    }
    class SpatialTransform {
        get_images() all images
    }
    class IntensityTransform {
        get_images() intensity only
    }
    class LabelTransform {
        get_images() labels only
    }
    class RandomTransform {
        sample_uniform()
        sample_uniform_sextet()
    }
    class FourierTransform {
        <<mixin>>
        fourier_transform()
    }

    Transform <|-- SpatialTransform
    Transform <|-- IntensityTransform
    Transform <|-- LabelTransform
    Transform <|-- RandomTransform

Transform is an ABC (transforms/transform.py:56). __call__ is overloaded (one signature per accepted input type) and acts as a Template Method: it parses input, applies the probability gate p, optionally deep-copies, calls the abstract apply_transform (:254), and records history. Subclasses only implement apply_transform.
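A minimal sketch of this Template Method shape, using only the standard library. The class `ToyTransform`, the `copy_input` flag, and the `"history"` key are illustrative assumptions; the real base class handles many more input types and options.

```python
import copy
import random
from abc import ABC, abstractmethod


class ToyTransform(ABC):
    """Fixed call skeleton; subclasses supply only apply_transform."""

    def __init__(self, p=1.0, copy_input=True):
        self.p = p
        self.copy_input = copy_input

    def __call__(self, subject):
        # Probability gate: skip the transform with probability 1 - p.
        if random.random() >= self.p:
            return subject
        if self.copy_input:
            subject = copy.deepcopy(subject)  # leave the caller's object untouched
        subject = self.apply_transform(subject)
        # Record what was applied so the pipeline can be inspected later.
        subject.setdefault("history", []).append(type(self).__name__)
        return subject

    @abstractmethod
    def apply_transform(self, subject):
        """The only step a concrete transform has to implement."""


class AddOne(ToyTransform):
    def apply_transform(self, subject):
        subject["t1"] = [v + 1 for v in subject["t1"]]
        return subject


original = {"t1": [1, 2]}
applied = AddOne()(original)
skipped = AddOne(p=0.0)(original)
```

Because the gate, the copy, and the history bookkeeping live in `__call__`, every subclass gets them without repeating any of that logic.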

The signature pattern — RandomX delegates to deterministic X:

classDiagram
    class RandomAffine {
        scales, degrees, translation : ranges
        get_params() sampled values
        apply_transform() -> builds Affine
    }
    class Affine {
        scales, degrees, translation : fixed
        invert_transform : bool
        apply_transform() -> SimpleITK resample
    }
    RandomTransform <|-- RandomAffine
    SpatialTransform <|-- Affine
    RandomAffine ..> Affine : samples params,\n then delegates

A random transform (e.g. RandomAffine, RandomGamma, RandomNoise) samples its parameters, then constructs and calls the matching deterministic transform (Affine, Gamma, Noise) with those concrete values. Benefits: the actual math lives in one place, deterministic transforms are usable standalone, and the sampled parameters are recorded in the subject’s history for exact reproducibility.
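The delegation can be sketched as follows. `ToyGamma` and `ToyRandomGamma` are hypothetical stand-ins operating on lists of floats, and uniform sampling of the exponent is an assumption of this sketch rather than a description of how the library samples its parameters.

```python
import random


class ToyGamma:
    """Deterministic transform: all the math lives here."""

    def __init__(self, gamma):
        self.gamma = gamma

    def __call__(self, values):
        return [v ** self.gamma for v in values]


class ToyRandomGamma:
    """Random transform: samples a parameter, then delegates to ToyGamma."""

    def __init__(self, gamma_range, seed=None):
        self.gamma_range = gamma_range
        self.rng = random.Random(seed)
        self.last_applied = None  # the concrete transform used in the last call

    def __call__(self, values):
        gamma = self.rng.uniform(*self.gamma_range)
        self.last_applied = ToyGamma(gamma)  # keep it for exact replay
        return self.last_applied(values)


values = [0.25, 0.5, 1.0]
random_gamma = ToyRandomGamma((0.5, 2.0), seed=0)
augmented = random_gamma(values)
# Replaying the recorded deterministic transform reproduces the augmentation.
replayed = random_gamma.last_applied(values)
```

Keeping the concrete deterministic transform around is what makes a random augmentation replayable: the same values go in, the same values come out.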

Composites:

classDiagram
    class Compose {
        transforms : list
        apply_transform() applies in sequence
        inverse() reverses + inverts each
    }
    class OneOf {
        transforms : dict[Transform, weight]
        apply_transform() picks one (multinomial)
    }
    Transform <|-- Compose
    RandomTransform <|-- OneOf

Compose and OneOf are themselves Transforms holding other transforms — a Composite pattern that lets a whole pipeline be passed wherever a single transform is expected.
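A toy version of the two composites, assuming plain callables on lists of numbers. `ToyCompose`, `ToyOneOf`, `Scale`, and `Shift` are illustrative names; the weighted pick uses `random.choices` as a stand-in for the multinomial draw.

```python
import random


class Scale:
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, values):
        return [v * self.factor for v in values]


class Shift:
    def __init__(self, offset):
        self.offset = offset

    def __call__(self, values):
        return [v + self.offset for v in values]


class ToyCompose:
    """Applies its children in sequence; is itself callable like a child."""

    def __init__(self, transforms):
        self.transforms = list(transforms)

    def __call__(self, values):
        for transform in self.transforms:
            values = transform(values)
        return values


class ToyOneOf:
    """Picks exactly one child per call, according to the given weights."""

    def __init__(self, weighted_transforms, seed=None):
        self.transforms = list(weighted_transforms)
        self.weights = list(weighted_transforms.values())
        self.rng = random.Random(seed)

    def __call__(self, values):
        (chosen,) = self.rng.choices(self.transforms, weights=self.weights, k=1)
        return chosen(values)


# Composites nest freely because they share the call interface of a leaf.
pipeline = ToyCompose([
    Scale(2),
    ToyOneOf({Shift(1): 1.0, Shift(-1): 0.0}),  # zero weight: never chosen
    ToyCompose([Shift(10)]),
])
result = pipeline([1, 2])
```

Since a composite exposes the same call interface as a leaf, a nested pipeline can be handed to any code that expects a single transform.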

classDiagram
    class PatchSampler {
        patch_size
        extract_patch(subject, index_ini)
        __call__(subject, num_patches)
        _generate_patches()*
    }
    class RandomSampler {
        get_probability_map(subject)*
    }
    class UniformSampler {
        uniform probability
    }
    class WeightedSampler {
        probability_map name
    }
    class LabelSampler {
        label_probabilities
    }
    class GridSampler {
        patch_overlap
        __len__()
    }

    PatchSampler <|-- RandomSampler
    PatchSampler <|-- GridSampler
    RandomSampler <|-- UniformSampler
    RandomSampler <|-- WeightedSampler
    WeightedSampler <|-- LabelSampler
  • Template Method again: PatchSampler.__call__ drives patch generation; subclasses fill in _generate_patches / get_probability_map.
  • GridSampler is deterministic (regular grid → used for inference with GridAggregator); the RandomSampler branch is stochastic (training).
  • LabelSampler specializes WeightedSampler to build class-balanced probability maps from a label image.
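The sampler hierarchy can be sketched in one dimension. Everything here is a simplifying assumption: the classes are toy stand-ins, a "subject" is a dict of equal-length lists, and a patch is weighted by the probability-map value at its center element.

```python
import random
from abc import ABC, abstractmethod


class ToyRandomSampler(ABC):
    """Call skeleton for stochastic samplers; subclasses supply the map."""

    def __init__(self, patch_size, seed=None):
        self.patch_size = patch_size
        self.rng = random.Random(seed)

    def __call__(self, subject, num_patches):
        weights = self.get_probability_map(subject)
        length = len(subject["image"])
        starts = range(length - self.patch_size + 1)  # valid start indices
        half = self.patch_size // 2
        # Weight each candidate patch by the map value at its center.
        start_weights = [weights[s + half] for s in starts]
        chosen = self.rng.choices(starts, weights=start_weights, k=num_patches)
        return [subject["image"][s:s + self.patch_size] for s in chosen]

    @abstractmethod
    def get_probability_map(self, subject):
        """Return one non-negative weight per element."""


class ToyUniformSampler(ToyRandomSampler):
    def get_probability_map(self, subject):
        return [1.0] * len(subject["image"])


class ToyLabelSampler(ToyRandomSampler):
    def get_probability_map(self, subject):
        # Only elements carrying a non-zero label may be patch centers.
        return [1.0 if label else 0.0 for label in subject["label"]]


subject = {
    "image": [0, 1, 2, 3, 4, 5, 6, 7],
    "label": [0, 0, 0, 0, 0, 1, 0, 0],
}
uniform_patches = ToyUniformSampler(patch_size=3, seed=0)(subject, num_patches=4)
label_patches = ToyLabelSampler(patch_size=3, seed=0)(subject, num_patches=4)
```

With a single labeled element at index 5, every patch from the label-driven sampler is centered there, while the uniform sampler draws from all valid positions.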
classDiagram
    class Dataset { <<torch>> }
    class DataLoader { <<torch>> }
    class SubjectsDataset {
        subjects : list[Subject]
        transform
        __getitem__() copy+load+transform
    }
    class Queue {
        subjects_dataset
        sampler : PatchSampler
        max_length, samples_per_volume
        __getitem__() pops a patch
    }
    class SubjectsLoader {
        TorchIO-aware collate
    }

    Dataset <|-- SubjectsDataset
    Dataset <|-- Queue
    DataLoader <|-- SubjectsLoader
    Queue o-- SubjectsDataset
    Queue o-- PatchSampler

Queue is a Dataset that composes a SubjectsDataset + a PatchSampler — inheritance for the PyTorch contract, composition for the behavior.
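The buffering behavior can be sketched with a deque. `ToyQueue` and `first_patches` are hypothetical; background workers and shuffling, which a real patch queue would typically involve, are left out, and a "subject" is a plain list.

```python
from collections import deque


def first_patches(subject, num_patches, patch_size=2):
    """Hypothetical sampler: takes the first few sliding windows."""
    return [subject[i:i + patch_size] for i in range(num_patches)]


class ToyQueue:
    """Composes a subject source and a sampler; serves patches from a buffer."""

    def __init__(self, subjects, sampler, max_length, samples_per_volume):
        self.sampler = sampler
        self.max_length = max_length
        self.samples_per_volume = samples_per_volume
        self.buffer = deque()
        self.refills = 0
        self._subjects = iter(subjects)

    def _fill(self):
        self.refills += 1
        # Load whole subjects while their patches still fit in the buffer.
        while len(self.buffer) + self.samples_per_volume <= self.max_length:
            subject = next(self._subjects, None)
            if subject is None:
                break
            self.buffer.extend(self.sampler(subject, self.samples_per_volume))

    def pop(self):
        if not self.buffer:
            self._fill()  # refill only when the buffer has been drained
        if not self.buffer:
            raise IndexError("no patches left")
        return self.buffer.popleft()


subjects = [[0, 1, 2, 3], [10, 11, 12, 13]]
queue = ToyQueue(subjects, first_patches, max_length=2, samples_per_volume=2)
patches = [queue.pop() for _ in range(4)]
```

With `max_length=2` the buffer holds one subject's worth of patches at a time, so memory stays bounded regardless of how many subjects the dataset contains.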

| Pattern | Where | Why |
| --- | --- | --- |
| Template Method | Transform.__call__ → apply_transform; PatchSampler.__call__ → _generate_patches | Fix the algorithm skeleton; let subclasses fill the variable step |
| Random → Deterministic delegation | RandomAffine → Affine, RandomGamma → Gamma, … | Single implementation, standalone reuse, reproducible sampling |
| Mixin / intermediate base | SpatialTransform, IntensityTransform, LabelTransform, FourierTransform | Share image-selection / FFT behavior across many transforms |
| Composite | Compose, OneOf | A pipeline is itself a Transform |
| Lazy loading via dict | Image, Subject | Defer expensive volume reads until accessed |
| Composition over inheritance | Subject → Image, Queue ↔ (SubjectsDataset, PatchSampler), GridAggregator → GridSampler | Flexible, heterogeneous wiring |

sequenceDiagram
    participant U as Caller
    participant T as Transform.__call__
    participant P as DataParser
    participant A as apply_transform (subclass)
    participant S as Subject

    U->>T: transform(input)
    T->>P: normalize input → Subject
    T->>T: random() < p ? (probability gate)
    alt skipped
        T-->>U: input unchanged
    else applied
        T->>T: deepcopy if copy=True
        T->>A: apply_transform(subject)
        A->>S: modify images
        A-->>T: subject
        T->>S: record transform in history
        T->>P: Subject → original input type
        T-->>U: transformed output
    end

For a RandomAffine, the apply_transform step additionally samples parameters and delegates to a concrete Affine (§5.2) before returning.

sequenceDiagram
    participant DS as SubjectsDataset
    participant Q as Queue
    participant SAMP as PatchSampler
    participant L as SubjectsLoader
    participant M as Model

    Q->>DS: pull next Subject (background workers)
    Q->>SAMP: sample N patches from Subject
    SAMP-->>Q: patches → buffer (max_length)
    L->>Q: __getitem__ (pop patch)
    Q-->>L: patch (a small Subject)
    L->>M: collated batch of patches
    Note over Q: buffer refills when drained

For inference the dual flow applies: GridSampler enumerates fixed locations → the model predicts each patch → GridAggregator.add_batch(...) collects the predictions → get_output_tensor() reconstructs the full volume (with crop / average / hann overlap handling).
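The sliding-window round trip can be sketched in one dimension with the "average" overlap mode only. `grid_locations` and `ToyAverageAggregator` are illustrative names, the border handling (a final patch flush with the end) is an assumption of this sketch, and the "model" is a stand-in function.

```python
def grid_locations(length, patch_size, overlap):
    """Regular start indices; assumes length >= patch_size > overlap >= 0."""
    step = patch_size - overlap
    starts = list(range(0, length - patch_size + 1, step))
    if starts[-1] != length - patch_size:
        starts.append(length - patch_size)  # last patch flush with the border
    return starts


class ToyAverageAggregator:
    """Accumulates patch predictions and averages where patches overlap."""

    def __init__(self, length):
        self.total = [0.0] * length
        self.count = [0] * length

    def add_patch(self, start, prediction):
        for offset, value in enumerate(prediction):
            self.total[start + offset] += value
            self.count[start + offset] += 1

    def get_output(self):
        return [t / c for t, c in zip(self.total, self.count)]


def toy_model(patch):
    """Stand-in for a network: doubles every value."""
    return [2 * v for v in patch]


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
patch_size = 4
starts = grid_locations(len(signal), patch_size, overlap=2)

aggregator = ToyAverageAggregator(len(signal))
for start in starts:
    aggregator.add_patch(start, toy_model(signal[start:start + patch_size]))
output = aggregator.get_output()
```

Because this stand-in model is position-independent, overlapping predictions agree and averaging reproduces the full-signal result exactly; with a real network the overlap mode determines how disagreements near patch borders are resolved.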


| To customize… | Do this |
| --- | --- |
| A new transform | Subclass Transform (or a mixin like IntensityTransform/SpatialTransform) and implement apply_transform(subject); list configurable attrs in args_names for reproducibility |
| A reproducible random transform | Subclass RandomTransform, sample params, delegate to a deterministic counterpart |
| An invertible transform | Set invert_transform, branch on it inside apply_transform; inverse() toggles the flag |
| An inline/one-off transform | Use Lambda (transforms/lambda_transform.py) to wrap a callable |
| Reuse MONAI transforms | transforms/monai_adapter.py bridges MONAI into the TorchIO pipeline |
| A new patch-sampling strategy | Subclass PatchSampler (deterministic) and implement _generate_patches, or subclass RandomSampler (stochastic) and implement get_probability_map |
| A new built-in dataset | Subclass SubjectsDataset, build the Subject list (optionally via download.py) |
| Selective application | Use the built-in include / exclude / p args on any transform; no subclassing needed |
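The invertible-transform row can be illustrated with a toy padding transform that branches on an `invert_transform` flag. `ToyPad` is hypothetical and works on plain lists; only the flag-and-toggle shape is taken from the table above.

```python
import copy


class ToyPad:
    """Pads with zeros on both sides, or crops them again when inverted."""

    def __init__(self, amount, invert_transform=False):
        self.amount = amount
        self.invert_transform = invert_transform

    def __call__(self, values):
        if self.invert_transform:
            # Inverse branch: remove exactly what the forward branch added.
            return values[self.amount:len(values) - self.amount]
        padding = [0] * self.amount
        return padding + list(values) + padding

    def inverse(self):
        # Same parameters, flag toggled; the original object is untouched.
        clone = copy.copy(self)
        clone.invert_transform = not self.invert_transform
        return clone


pad = ToyPad(amount=2)
padded = pad([1, 2, 3])
restored = pad.inverse()(padded)
```

Keeping forward and inverse logic in one class means both directions share the same parameters, so an inverse can always be derived from a recorded forward transform.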

| Term | Meaning |
| --- | --- |
| Image | 4D tensor (channels, W, H, D) + a 4×4 affine mapping voxel indices → physical (world) coordinates |
| ScalarImage vs LabelMap | Intensity data (interpolated continuously) vs discrete segmentation labels (nearest-neighbor) |
| Subject | One scanned subject = a dict of co-registered Images sharing physical space, plus transform history |
| affine / physical space | The geometry that lets spatial transforms (resample, affine, flip) act in millimeters, not just voxels |
| Transform | A Subject → Subject operation; preprocessing (deterministic) or augmentation (often random) |
| Invertibility | Some transforms can be reversed (inverse()), e.g. for test-time augmentation or undoing preprocessing |
| Patch | A small sub-volume cropped from a Subject for memory-bounded training |
| Queue | A background buffer of patches that decouples CPU loading/sampling from GPU training |
| GridSampler / GridAggregator | Deterministic sliding-window splitting (sampler) and reassembly of predictions (aggregator) for inference |
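As a worked example of the affine entry, the sketch below maps voxel indices to world coordinates with a 4×4 matrix in homogeneous coordinates. The matrix values (2 mm isotropic spacing and an arbitrary origin) are made up for illustration, and `voxel_to_world` is a hypothetical helper.

```python
def voxel_to_world(affine, ijk):
    """Apply a 4x4 affine to a voxel index (i, j, k) in homogeneous form."""
    homogeneous = (*ijk, 1.0)
    # Only the first three rows produce coordinates; the last row is (0, 0, 0, 1).
    return tuple(
        sum(row[col] * homogeneous[col] for col in range(4))
        for row in affine[:3]
    )


# Illustrative affine: 2 mm spacing on each axis, origin at (-90, -126, -72) mm.
affine = [
    [2.0, 0.0, 0.0, -90.0],
    [0.0, 2.0, 0.0, -126.0],
    [0.0, 0.0, 2.0, -72.0],
    [0.0, 0.0, 0.0, 1.0],
]

origin = voxel_to_world(affine, (0, 0, 0))
point = voxel_to_world(affine, (10, 20, 30))
```

Voxel (0, 0, 0) lands on the origin stored in the last column, and each index step moves 2 mm along its axis, so voxel (10, 20, 30) maps to (-70, -86, -12) mm.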

  • Confidence: Every class, base class, file path, and inheritance edge above was verified directly against the source (rg on src/torchio/). The package-layout __init__.py import wiring and the random→deterministic pairing were confirmed for representative cases (RandomAffine/Affine, RandomGamma/Gamma).
  • Not deeply traced (treated as black boxes here): the exact numerical algorithms inside GridAggregator overlap modes (crop/average/hann), the Fourier-artifact math in Motion/Ghosting/Spike, and the depth of the MONAI adapter’s two-way mapping. These are implementation detail rather than architecture.
  • CLI is intentionally covered lightly (it is a thin Typer wrapper that resolves a transform by name and applies it); the library API is the architecturally significant surface.
  • args_names / history: reproducibility hinges on each transform faithfully listing its parameters in args_names; a custom transform that omits this will still run but won’t be perfectly reproducible from history.
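The args_names caveat can be made concrete with a toy record-and-rebuild cycle. The classes and the `record` / `rebuild` helpers are hypothetical; only the idea that history is reconstructed from the attributes listed in `args_names` comes from the note above.

```python
class ToyShiftScale:
    """Lists every constructor parameter, so it can be rebuilt faithfully."""

    args_names = ("scale", "shift")

    def __init__(self, scale=1.0, shift=0.0):
        self.scale = scale
        self.shift = shift

    def __call__(self, values):
        return [self.scale * v + self.shift for v in values]


class ForgetfulShiftScale(ToyShiftScale):
    """Same behavior, but 'shift' is missing from args_names."""

    args_names = ("scale",)


def record(transform):
    """Store only what the transform declares in args_names."""
    return {name: getattr(transform, name) for name in transform.args_names}


def rebuild(transform_class, recorded):
    """Recreate a transform from its recorded arguments."""
    return transform_class(**recorded)


values = [1.0, 2.0]

complete = ToyShiftScale(scale=2.0, shift=1.0)
complete_replay = rebuild(ToyShiftScale, record(complete))(values)

forgetful = ForgetfulShiftScale(scale=2.0, shift=1.0)
forgetful_replay = rebuild(ForgetfulShiftScale, record(forgetful))(values)
```

Both transforms run correctly the first time; only the replay differs, because the forgetful variant silently falls back to the default shift when rebuilt.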