Spike Sorting and Curation

SpikeLab includes a spike-sorting pipeline in the spikelab.spike_sorting sub-package. It supports three sorting algorithms — Kilosort2 (Pachitariu et al. 2016), Kilosort4 (Pachitariu et al. 2024), and RT-Sort (Van der Molen et al. 2024) — behind a unified interface built on SpikeInterface (Buccino et al. 2020). The pipeline returns curated SpikeData objects ready for downstream analysis.

SpikeLab also provides standalone curation methods that can be used on any SpikeData object, whether it came from the sorting pipeline or from an external source.

References:

  • Pachitariu, M., Steinmetz, N., Kadir, S., Carandini, M. & Harris, K. D. Kilosort: realtime spike-sorting for extracellular electrophysiology with hundreds of channels. bioRxiv (2016).

  • Pachitariu, M., Sridhar, S., Pennington, J. & Stringer, C. Spike sorting with Kilosort4. Nature Methods 21, 914–921 (2024).

  • Van der Molen, T., Lim, M., Bartram, J. et al. RT-Sort: An action potential propagation-based algorithm for real time spike detection and sorting with millisecond latencies. PLoS ONE 19, e0312438 (2024).

  • Buccino, A. P., Hurwitz, C. L., Garcia, S. et al. SpikeInterface, a unified framework for spike sorting. eLife 9, e61834 (2020).

Spike Sorting

Prerequisites

The sorting pipeline requires external dependencies that are not installed with SpikeLab by default:

  • Kilosort2 requires MATLAB (R2019b+) and the Kilosort2 repository. A Docker variant is available that removes the MATLAB requirement.

  • Kilosort4 is pure Python but requires torch and kilosort. A Docker variant is also available.

  • RT-Sort requires torch for its neural-network spike detection model.

For Maxwell Biosystems .h5 files the HDF5 decompression plugin must also be installed; follow the instructions printed by the loader if the plugin is missing.

Basic usage

The main entry point is sort_recording(), which accepts a list of recording files and returns a list of SpikeData objects:

from spikelab.spike_sorting import sort_recording

results = sort_recording(
    recording_files=["session1.raw.h5"],
    sorter="kilosort4",
)

sd = results[0]
print(sd.N, "units")
print(sd.length / 1000, "seconds")

The sorter parameter selects the algorithm: "kilosort2", "kilosort4", or "rt_sort".

Configuration and presets

The pipeline is configured via a SortingPipelineConfig dataclass composed of sub-configs for recording I/O, sorting, curation, waveform extraction, and execution. Pre-built presets provide sensible defaults:

from spikelab.spike_sorting.config import KILOSORT4

results = sort_recording(
    recording_files=["session1.raw.h5"],
    config=KILOSORT4,
)

To customise a preset, use the override method:

config = KILOSORT4.override(
    fr_min=0.1,             # stricter minimum firing rate (Hz)
    snr_min=6.0,            # stricter SNR threshold
    n_jobs=16,
    total_memory="32G",
)
results = sort_recording(["session1.raw.h5"], config=config)

Individual parameters can also be passed directly as keyword arguments to sort_recording, which builds a config internally:

results = sort_recording(
    recording_files=["session1.raw.h5"],
    sorter="kilosort2",
    kilosort_path="/opt/Kilosort2",
    fr_min=0.1,
    n_jobs=8,
)

Multi-stream recordings

For multi-stream files (e.g. Maxwell multi-well .raw.h5), use sort_multistream():

from spikelab.spike_sorting import sort_multistream

stream_results = sort_multistream(
    recording="multiwell.raw.h5",
    stream_ids=["well000", "well001"],
    sorter="kilosort4",
)

for stream_id, sds in stream_results.items():
    print(f"{stream_id}: {sds[0].N} units")

This returns a dict mapping stream IDs to lists of SpikeData objects.

Reusing intermediate results

The pipeline caches intermediate files (binary recordings, sorting output, waveforms). Control which stages are re-run with the recompute_* flags:

results = sort_recording(
    recording_files=["session1.raw.h5"],
    sorter="kilosort4",
    recompute_recording=False,   # reuse existing binary
    recompute_sorting=True,      # force re-sort
    reextract_waveforms=True,    # force waveform re-extraction
)

See the API reference for the full configuration options.

When two sorts target the same recording at once, the second raises ConcurrentSortError, enforced by a per-recording lock file in the intermediate folder; stale locks left behind by a crashed run are reclaimed automatically.

Built-in safeguards

The pipeline runs a set of preflight checks before each sort and live resource watchdogs during it — for free disk, host RAM, GPU memory, sorter log inactivity, and kernel I/O stalls. These are on by default and rarely need attention: a successful sort produces no extra noise, and a failure surfaces as one of the classified exceptions described in Handling Sort Failures (e.g. HostMemoryWatchdogError).

After a successful sort, a human-readable sorting_report.md is written next to the results, summarising configuration, timings, curation outcome, and per-unit quality stats. A machine-readable recording_report.json is written alongside it.

To tune thresholds (e.g. raise the GPU watchdog’s abort percentage, disable the sleep blocker on a workstation that genuinely needs to suspend), pass overrides via the ExecutionConfig sub-config of SortingPipelineConfig — see the autodoc for the full list of knobs.
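As a sketch of what such an override might look like — assuming the flat override style shown earlier also reaches ExecutionConfig fields, and using knob names that appear in Handling Sort Failures (host_ram_abort_pct, oom_retry_max, sorter_inactivity_max_s); check the autodoc for the authoritative names:

```python
from spikelab.spike_sorting.config import KILOSORT4

# Hedged sketch: knob names are taken from the failure-handling notes;
# verify spellings against the ExecutionConfig autodoc before relying on them.
config = KILOSORT4.override(
    host_ram_abort_pct=95.0,         # abort later under host-RAM pressure
    oom_retry_max=3,                 # allow more automatic GPU-OOM retries
    sorter_inactivity_max_s=1800.0,  # tolerate 30 min of silent preprocessing
)
```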

Pipeline canary

For long sorts, you can opt in to a short smoke test that runs the configured backend on the first N seconds of each recording before committing to the full sort. This catches “first-time” failures — a broken Docker image, a missing CUDA kernel, MEX compile errors — in seconds rather than hours.

from spikelab.spike_sorting.config import KILOSORT4

config = KILOSORT4.override(canary_first_n_s=30.0)
results = sort_recording(["session1.raw.h5"], config=config)

A canary failure that maps to a SpikeSortingClassifiedError short-circuits the full sort with that classified exception. Unexpected canary failures (e.g. canary OOM in a tiny window) are logged but not propagated — the live watchdogs handle resource-shaped issues during the real run. Recordings shorter than canary_first_n_s skip the canary.

The canary is off by default (canary_first_n_s=0.0).

Stimulation Artifact Removal

When sorting recordings with electrical stimulation, stim artifacts must be removed before spike detection. SpikeLab provides remove_stim_artifacts() for this purpose.

Two methods are available:

  • polynomial (default) — fits a low-order polynomial to the post-saturation artifact tail and subtracts it, preserving neural spikes that ride on the artifact decay. Saturated samples are blanked (zeroed).

  • blank — zeros the entire artifact window. Simpler but discards any spikes within the window.
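The polynomial idea can be illustrated self-containedly (a toy sketch, not the library's implementation): fit a low-order polynomial to the slow artifact decay and subtract it, so a brief spike riding on the tail survives while the decay is removed.

```python
import numpy as np

fs_hz = 20_000
t = np.arange(200) / fs_hz                # 10 ms post-saturation window
artifact = 500.0 * np.exp(-t / 0.002)     # slow exponential decay (uV)
trace = artifact.copy()
trace[80:84] -= 60.0                      # fast neural spike riding on the tail

# "polynomial" method, in miniature: fit the slow decay with a cubic
# and subtract the fit; 4 spike samples out of 200 barely pull the fit
coeffs = np.polyfit(t, trace, deg=3)
cleaned = trace - np.polyval(coeffs, t)

# Most of the artifact energy is gone, the spike remains prominent
assert np.std(cleaned) < 0.2 * np.std(trace)
assert cleaned[80:84].min() < -30.0
```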

Stim times logged by the hardware may not align exactly with the artifact in the recording. Use recenter_stim_times() to find the true artifact onset:

from spikelab.spike_sorting.stim_sorting.recentering import recenter_stim_times
from spikelab.spike_sorting.stim_sorting.artifact_removal import remove_stim_artifacts

# Recenter stim times to the actual artifact onset
corrected_times = recenter_stim_times(
    traces,                    # (channels, samples) raw voltage
    stim_times_ms,             # logged stim times in ms
    fs_Hz=20000,
    peak_mode="down_edge",     # alignment mode for biphasic pulses
)

# Remove artifacts
cleaned, blanked_mask = remove_stim_artifacts(
    traces,
    corrected_times,
    fs_Hz=20000,
    method="polynomial",
    artifact_window_ms=10.0,   # max tail duration after desaturation
    poly_order=3,              # cubic polynomial (default)
)

The peak_mode parameter controls how each artifact is aligned:

  • "abs_max" — largest absolute voltage (monophasic pulses)

  • "down_edge" — positive-to-negative zero-crossing (biphasic anodic-first)

  • "up_edge" — negative-to-positive zero-crossing (biphasic cathodic-first)

  • "pos_peak" / "neg_peak" — largest positive or most negative voltage

When multiple stim events occur in rapid succession (e.g. burst or paired-pulse protocols), the module automatically detects whether the signal has returned to baseline between events and extends the blanking region across the entire burst if needed.

Multi-pulse alignment (multi_peak)

When a recentering window may contain several pulses (paired-pulse, short burst, high-frequency train), pass multi_peak=True to lock onto a specific pulse rather than the strongest one. Available on recenter_stim_times(), sort_stim_recording(), and preprocess_stim_artifacts().

Parameters:

  • multi_peak (default False) — opt-in to multi-pulse-aware alignment.

  • multi_peak_select (default "first") — "first" for first-pulse alignment (typical for train PSTHs); "last" for after-train rebound.

  • multi_peak_threshold (default 0.6) — minimum amplitude as a fraction of the strongest peak in the search window for a peak to count as a real pulse.

  • multi_peak_min_separation_ms (default 2.0) — minimum spacing between candidate peaks; prevents a single broad pulse from being counted twice.

corrected_times = recenter_stim_times(
    traces, stim_times_ms, fs_Hz=20000,
    peak_mode="down_edge",
    multi_peak=True,
    multi_peak_select="first",
    multi_peak_threshold=0.6,
    multi_peak_min_separation_ms=2.0,
)

The polynomial detrend approach is related to SALPA, adapted for offline use:

  • Wagenaar, D. A. & Potter, S. M. Real-time multi-channel stimulus artifact suppression by local curve fitting. J Neurosci Methods 120, 113–120 (2002).

Unit Curation

SpikeLab provides curation methods that filter units by quality metrics. These work on any SpikeData object — not just output from the sorting pipeline.

Each curation method returns a tuple (sd_curated, result_dict) where result_dict contains:

  • "metric" — np.ndarray of shape (N,) with the per-unit metric for all original units.

  • "passed" — np.ndarray of shape (N,) boolean mask of units that passed.
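A minimal self-contained sketch of working with this contract, using a synthetic result dict of the documented shape (the values are made up):

```python
import numpy as np

# Synthetic result dict: per-unit metric values and a boolean pass mask,
# both of length N, as every curation method returns
res = {
    "metric": np.array([0.05, 0.8, 1.2, 0.02]),   # e.g. firing rates in Hz
    "passed": np.array([False, True, True, False]),
}

# Indices (into the original units) removed by the criterion
removed = np.flatnonzero(~res["passed"])

# Metric values of the surviving units
kept_metrics = res["metric"][res["passed"]]

print(removed.tolist())        # → [0, 3]
print(kept_metrics.tolist())   # → [0.8, 1.2]
```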

Individual criteria

Apply a single quality criterion at a time:

# Remove units with fewer than 50 spikes
sd_curated, res = sd.curate_by_min_spikes(min_spikes=50)

# Remove units below 0.1 Hz firing rate
sd_curated, res = sd.curate_by_firing_rate(min_rate_hz=0.1)

# Remove units with > 1% ISI violations
sd_curated, res = sd.curate_by_isi_violations(
    max_violation=1.0, threshold_ms=1.5,
)

# Remove units with low SNR
sd_curated, res = sd.curate_by_snr(min_snr=5.0)

# Remove units with inconsistent waveforms
sd_curated, res = sd.curate_by_std_norm(max_std_norm=1.0)

SNR and waveform consistency (curate_by_snr, curate_by_std_norm) require that the SpikeData object has raw_data attached. If the metrics have not been pre-computed, call compute_waveform_metrics() first:

sd, metrics = sd.compute_waveform_metrics()
sd_curated, res = sd.curate_by_snr(min_snr=5.0)

Merge-based deduplication

When spike sorting produces duplicate units for the same neuron recorded on adjacent electrodes, you can merge them using curate_by_merge_duplicates(). This finds spatially nearby unit pairs, filters by ISI violation rate and waveform cosine similarity, then greedily merges accepted pairs:

sd_merged, res = sd.curate_by_merge_duplicates(
    dist_um=24.8,            # max inter-electrode distance in um
    cosine_threshold=0.5,    # min waveform similarity
    max_isi_increase=0.04,   # reject merge if ISI violations rise too much
)

n_absorbed = (~res["passed"]).sum()
print(f"Merged {n_absorbed} duplicate units")

This requires neuron_attributes with position and avg_waveform entries, which are set automatically when the SpikeData comes from the sorting pipeline.

Batch curation

To apply multiple criteria in one call, use curate(). Only criteria with non-None values are applied:

sd_curated, results = sd.curate(
    min_spikes=50,
    min_rate_hz=0.1,
    isi_max=1.0,
    min_snr=5.0,
)

# results contains per-criterion entries
for criterion, res in results.items():
    n_removed = (~res["passed"]).sum()
    print(f"{criterion}: removed {n_removed} units")

Curation history

For reproducibility, build a serialisable record of what was removed and why:

history = sd.build_curation_history(
    sd_original=sd_raw,
    sd_curated=sd_curated,
    results=results,
)

The returned dict is JSON-serialisable and can be stored in a workspace or saved alongside the curated data.
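Because the record is plain JSON-safe data, persisting it is a standard-library round trip. A sketch with a hypothetical record (the field names below are illustrative, not the library's exact schema):

```python
import json

# Hypothetical curation-history record with JSON-safe contents
history = {
    "criteria": {"min_spikes": 50, "min_rate_hz": 0.1, "isi_max": 1.0},
    "n_units_before": 120,
    "n_units_after": 96,
    "removed_unit_ids": [3, 17, 42],
}

# Round-trip through JSON: what gets stored is exactly what comes back
text = json.dumps(history, indent=2)
assert json.loads(text) == history

# Saving alongside the curated data is then a one-liner, e.g.:
# Path("curation_history.json").write_text(text)
```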

Visualising sorted units

To inspect the spatial waveform footprint of one or more sorted units across the channel grid, use plot_unit_footprints() (also available as a convenience method on SpikeData). Channels whose per-channel peak-to-peak amplitude exceeds a configurable threshold are drawn at their (x, y) positions, with the primary channel highlighted; each unit gets its own subplot in an auto-sized grid. This is useful for sanity-checking footprint locality and for spotting duplicated or over-merged units after sorting.

Sorting Concatenated Recordings

When a directory containing multiple recording files is passed to sort_recording, the pipeline concatenates them into a single recording for sorting. The returned SpikeData objects are automatically split back into per-recording epochs. When a list of recording paths is passed instead, each file is processed sequentially without concatenation.

If you need to re-split a concatenated SpikeData manually (e.g. after loading a saved pickle that was not yet split), use split_epochs(). This requires rec_chunks_ms in metadata (set automatically by the sorting pipeline) and time-shifts each epoch to start at t=0:

epoch_sds = sd.split_epochs()

for i, epoch_sd in enumerate(epoch_sds):
    print(f"Epoch {i}: {epoch_sd.N} units, {epoch_sd.length:.0f} ms")

Handling Sort Failures

When a sorting run fails, SpikeLab can classify the failure into one of three categories so that batch scripts can implement skip/retry/stop policies without parsing generic error messages.

The three categories are:

  • BiologicalSortFailure – the recording itself cannot be sorted (too silent, all channels bad, no waveforms to compute metrics on). Recommended policy: mark the target as not-sortable and move on.

  • EnvironmentSortFailure – the host environment or container runtime is misconfigured (Docker down, HDF5 plugin missing). Recommended policy: stop and fix the environment.

  • ResourceSortFailure – the job exhausted a machine resource (GPU out of memory). Recommended policy: retry with reduced parameters.

All three inherit from SpikeSortingClassifiedError, which in turn inherits from RuntimeError.

Each sorter has a dedicated classifier that inspects both sorter logs and exception chains to identify known failure signatures:

  • classify_ks2_failure — Kilosort2

  • classify_ks4_failure — Kilosort4

  • classify_rt_sort_failure — RT-Sort

from spikelab.spike_sorting._classifier import (
    classify_ks2_failure,
    classify_ks4_failure,
    classify_rt_sort_failure,
)
from spikelab.spike_sorting._exceptions import (
    BiologicalSortFailure,
    EnvironmentSortFailure,
    ResourceSortFailure,
)

# Pick the classifier matching your sorter
classify = classify_ks4_failure  # or classify_ks2_failure, classify_rt_sort_failure

try:
    results = sort_recording(["session1.raw.h5"], sorter="kilosort4")
except Exception as exc:
    classified = classify(output_folder, exc)
    if classified is not None:
        if isinstance(classified, BiologicalSortFailure):
            print(f"Skipping (biology): {classified}")
        elif isinstance(classified, EnvironmentSortFailure):
            raise  # stop the batch
        elif isinstance(classified, ResourceSortFailure):
            print(f"Retry with smaller batch: {classified}")
    else:
        raise  # unrecognised failure

Specific exception classes carry diagnostic attributes. For example, InsufficientActivityError exposes threshold_crossings, units_at_failure, and nspks_at_failure parsed from sorter logs. RT-Sort additionally raises ModelLoadingError when the detection model cannot be loaded. See the API reference for the full exception hierarchy.

Watchdog and lock failures

The classified-error hierarchy includes additional subclasses raised by the resource watchdogs and the per-recording sort lock. They surface through sort_recording like any other classified failure and carry a RecordingResult.status string for batch scripts to dispatch on:

  • HostMemoryWatchdogError (status "oom_host_ram") — reduce n_jobs / total_memory, or raise host_ram_abort_pct.

  • GpuMemoryWatchdogError (status "oom_gpu") — halve the sorter’s per-batch knob (NT for KS2, batch_size for KS4, num_processes for RT-Sort) and retry. The pipeline does this automatically up to oom_retry_max times.

  • GpuThermalWatchdogError (status "gpu_thermal") — GPU temperature crossed the abort threshold. Pause until the device cools; check airflow, ambient temperature, and heatsink dust. A persistent trip indicates a cooling failure.

  • SorterTimeoutError (status "sorter_timeout") — the sorter log went silent past the inactivity tolerance. Raise sorter_inactivity_max_s for unusually long preprocessing, or investigate a hung subprocess.

  • DiskExhaustionError (status "disk_exhausted") — free space on the intermediate/results volume. Inspect disk_exhaustion_report.json next to the results.

  • IOStallError (status "io_stall") — the kernel I/O byte counters stopped advancing. Usually a hung network share or failing disk; investigate before retrying.

  • ConcurrentSortError (status "concurrent_sort") — another process is already sorting this recording (per-recording lock file). Wait for it to finish, or remove the stale lock if the previous sort crashed.
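A batch script can dispatch on these status strings with a plain lookup table. A sketch — the mapping from status to action is a policy choice for your batch runner, not part of the library:

```python
# Skip/retry/stop policy keyed on the watchdog/lock status strings above.
# The action names are illustrative; use whatever your batch runner expects.
POLICY = {
    "oom_host_ram": "retry_reduced",
    "oom_gpu": "retry_reduced",
    "gpu_thermal": "pause_and_cool",
    "sorter_timeout": "investigate",
    "disk_exhausted": "stop",
    "io_stall": "investigate",
    "concurrent_sort": "wait_and_retry",
}

def dispatch(status: str) -> str:
    """Map a status string to a batch action; unknown statuses re-raise."""
    return POLICY.get(status, "reraise")

print(dispatch("oom_gpu"))   # → retry_reduced
print(dispatch("mystery"))   # → reraise
```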