Spike Sorting

The spikelab.spike_sorting sub-package provides a full spike-sorting pipeline: loading raw recordings, running a sorter backend (Kilosort2, Kilosort4, or RT-Sort), extracting waveforms, curating units, and compiling results into SpikeData objects.

See the Spike Sorting and Curation guide for usage examples and environment setup instructions.

Entry Points

spikelab.spike_sorting.sort_recording(recording_files, config=None, sorter='kilosort2', intermediate_folders=None, results_folders=None, *, out_report=None, **kwargs)[source]

Run spike sorting on one or more recordings using any registered backend.

This is the primary entry point for the modular sorting pipeline.

Parameters:
  • recording_files (list) – Paths to recording files or directories. Each entry is sorted independently. Directories have their contents concatenated before sorting and split back into per-file SpikeData afterward.

  • config (SortingPipelineConfig or None) – Pre-built configuration. When provided, **kwargs are applied as overrides via config.override(). When None, a fresh config is built from sorter + **kwargs. Preset configs are available in spikelab.spike_sorting.config (e.g. KILOSORT2).

  • sorter (str) – Registered sorter backend name. Only used when config is None. Available: "kilosort2", "kilosort4".

  • intermediate_folders (list or None) – Intermediate result directories, one per recording. Auto-generated if None.

  • results_folders (list or None) – Output directories, one per recording. Auto-generated if None.

  • out_report (SortRunReport or None) – Optional report instance populated in-place with one RecordingResult per input recording. The same information is always written per-recording to <results_folder>/recording_report.json regardless of this argument; out_report only adds a programmatic accessor for the batch.

  • **kwargs – Override individual config fields (e.g. snr_min=5.0, use_docker=True, fr_min=0.05). See spikelab.spike_sorting.config for all available parameters, grouped by: RecordingConfig, SorterConfig, WaveformConfig, CurationConfig, CompilationConfig, FigureConfig, ExecutionConfig.

Returns:

One SpikeData per original recording file. For directory inputs, the concatenated recording is split back into per-file SpikeData objects.

Return type:

results (list[SpikeData])

Notes

  • Pickle files (sorted_spikedata_curated.pkl and optionally sorted_spikedata.pkl) are saved to each results folder.

  • hdf5_plugin_path (passed via config or kwargs) sets os.environ['HDF5_PLUGIN_PATH'] before any recording is loaded. This is needed for Maxwell .h5 files and applies to all backends.
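A minimal usage sketch of the call documented above. The helper names (results_folders_for, load_recording_report, sort_batch) and the folder naming scheme are hypothetical. Only sort_recording, its documented arguments, and the recording_report.json filename come from this page. The report is read as generic JSON because its schema is not described here.

```python
import json
from pathlib import Path


def results_folders_for(recording_files, results_root):
    """Hypothetical helper: build one output directory path per recording."""
    return [
        str(Path(results_root) / f"rec_{i:03d}")
        for i in range(len(recording_files))
    ]


def load_recording_report(results_folder):
    """Read <results_folder>/recording_report.json as plain JSON."""
    report_path = Path(results_folder) / "recording_report.json"
    with open(report_path, encoding="utf-8") as fh:
        return json.load(fh)


def sort_batch(recording_files, results_root):
    """Sort each recording with Kilosort4 and two flat config overrides."""
    # Imported lazily so the helpers above work without spikelab installed.
    from spikelab.spike_sorting import sort_recording

    folders = results_folders_for(recording_files, results_root)
    spike_data = sort_recording(
        recording_files,
        sorter="kilosort4",
        results_folders=folders,
        snr_min=5.0,  # CurationConfig field, passed as a flat override
        fr_min=0.05,  # CurationConfig field, passed as a flat override
    )
    reports = [load_recording_report(folder) for folder in folders]
    return spike_data, reports
```

Passing results_folders explicitly keeps the report locations predictable; with the auto-generated folders the caller would have to discover the paths after the run.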

spikelab.spike_sorting.sort_multistream(recording, stream_ids, config=None, sorter='kilosort2', **kwargs)[source]

Sort a multi-stream recording across multiple stream IDs.

Calls sort_recording once per stream ID, routing each stream to its own intermediate and results folders. Validates that the requested stream IDs exist in the recording file before sorting.

Parameters:
  • recording (str or Path) – Path to a single multi-stream recording file (e.g. MaxTwo .raw.h5) or a directory of such files. When a directory is given, all files are concatenated per stream.

  • stream_ids (list of str) – Stream identifiers to sort, e.g. ["well000", "well001", "well002"].

  • config (SortingPipelineConfig or None) – Pre-built configuration. When provided, **kwargs are applied as overrides.

  • sorter (str) – Registered sorter backend name (default "kilosort2"). Only used when config is None.

  • **kwargs

    Override individual config fields. The following must not be provided:

    • intermediate_folders and results_folders are auto-generated per stream.

    • stream_id is set automatically per iteration.

Returns:

{stream_id: list[SpikeData]}.

Return type:

results (dict)

Notes

  • Stream ID validation uses SpikeInterface’s extractor for the recording format. Currently supports Maxwell .h5 files. For other formats, validation is skipped and invalid stream IDs will produce errors at loading time.

  • When recording is a directory of files, the files are concatenated per stream before sorting. Channel count and sampling frequency must match across files; a mismatch raises ValueError. Mismatched channel IDs or locations only produce warnings.
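The reserved keyword arguments and the dictionary return value can be sketched as follows. check_multistream_kwargs and sort_wells are hypothetical helpers, and raising ValueError for a reserved key is a choice made for this sketch; how sort_multistream itself reacts to those keys is not stated here.

```python
RESERVED_KWARGS = ("intermediate_folders", "results_folders", "stream_id")


def check_multistream_kwargs(kwargs):
    """Reject keys that sort_multistream sets on its own for each stream."""
    clashes = sorted(set(kwargs) & set(RESERVED_KWARGS))
    if clashes:
        raise ValueError(f"do not pass {clashes} to sort_multistream")
    return dict(kwargs)


def sort_wells(recording, stream_ids, **overrides):
    """Sort the given streams and count the SpikeData objects per stream."""
    # Imported lazily so the check above works without spikelab installed.
    from spikelab.spike_sorting import sort_multistream

    overrides = check_multistream_kwargs(overrides)
    results = sort_multistream(
        recording, stream_ids, sorter="kilosort2", **overrides
    )
    # results maps each stream ID to a list of SpikeData objects.
    return {stream_id: len(per_file) for stream_id, per_file in results.items()}
```

A call such as sort_wells(path, ["well000", "well001"], snr_min=5.0) would pass the check, while adding stream_id="well000" would be rejected before any sorting starts.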

Configuration

Configuration dataclass for the spike sorting pipeline.

Replaces the ~80 module-level globals in kilosort2.py with a single typed, inspectable configuration object that is passed explicitly to every pipeline function.

class spikelab.spike_sorting.config.RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=<factory>, rec_chunks_s=<factory>, start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000)[source]

Bases: object

Parameters for recording loading and preprocessing.

stream_id: str | None = None
hdf5_plugin_path: str | None = None
first_n_mins: float | None = None
mea_y_max: int | None = None
gain_to_uv: float | None = None
offset_to_uv: float | None = None
rec_chunks: List[Tuple[int, int]]
rec_chunks_s: List[Tuple[float, float]]
start_time_s: float | None = None
end_time_s: float | None = None
freq_min: int = 300
freq_max: int = 6000
__init__(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=<factory>, rec_chunks_s=<factory>, start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000)
class spikelab.spike_sorting.config.SorterConfig(sorter_name='kilosort2', sorter_path=None, sorter_params=None, use_docker=False)[source]

Bases: object

Parameters for the spike sorter itself.

sorter_name: str = 'kilosort2'
sorter_path: str | None = None
sorter_params: Dict[str, Any] | None = None
use_docker: bool = False
__init__(sorter_name='kilosort2', sorter_path=None, sorter_params=None, use_docker=False)
class spikelab.spike_sorting.config.RTSortConfig(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None)[source]

Bases: object

Parameters for the RT-Sort detection and sorting backend.

RT-Sort is an action-potential-propagation-based spike sorter using a deep learning detection model followed by codetection clustering and template matching. See van der Molen, Lim et al. 2024 (PLOS ONE, DOI: 10.1371/journal.pone.0312438) for algorithmic details.

Parameters:
  • model_path (str or None) – Path to a folder containing init_dict.json and state_dict.pt for a pretrained ModelSpikeSorter. When None, the bundled model corresponding to probe is loaded.

  • probe (str) – Which bundled pretrained model to use when model_path is None. "mea" or "neuropixels".

  • device (str) – PyTorch device for inference. "cuda" or "cpu".

  • num_processes (int or None) – Number of worker processes for parallel detection/clustering stages. None selects an automatic value based on CPU count.

  • recording_window_ms (tuple or None) – (start_ms, end_ms) window of the recording to process. None processes the entire recording.

  • save_rt_sort_pickle (bool) – If True, serialize the final RTSort object to the sorter output folder so the trained sequences can be re-used in Phase 2 stim-aware sorting.

  • delete_inter (bool) – If True, delete the intermediate cache directory after sorting completes.

  • verbose (bool) – Print progress messages during sorting.

  • params (dict or None) – Override dictionary merged into the RT-Sort parameter set. Takes precedence over the preset defaults; useful for one-off tuning without editing a preset. Keys must match detect_sequences parameter names.

  • detection_window_s (float or None) – If set, run sequence detection on only the first detection_window_s seconds of the recording (the heavy GPU + clustering phase), then apply the resulting sequences to the full recording during sort_offline. Decouples the detection-phase memory ceiling from total recording length. None uses the full window for both phases (legacy behavior).
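The precedence rule for params (overrides win over preset defaults, and keys must match detect_sequences parameter names) can be illustrated with a small stand-in. detect_sequences_stub and its parameter names are placeholders invented for this sketch; the real detect_sequences signature is not shown on this page, and the KeyError is likewise a choice of this sketch.

```python
import inspect


def detect_sequences_stub(recording, example_thresh=0.1, example_window_ms=50.0):
    """Placeholder with made-up parameters; not the real detect_sequences."""


def merge_rt_sort_params(preset, params, target=detect_sequences_stub):
    """Merge override params into a preset, rejecting unknown keys."""
    params = params or {}
    accepted = set(inspect.signature(target).parameters)
    unknown = sorted(set(params) - accepted)
    if unknown:
        raise KeyError(f"not parameters of the target function: {unknown}")
    # Later entries win, so params take precedence over the preset defaults.
    return {**preset, **params}


preset = {"example_thresh": 0.1, "example_window_ms": 50.0}
merged = merge_rt_sort_params(preset, {"example_thresh": 0.2})
```

Validating keys against the target signature up front surfaces a misspelled override before the expensive detection phase begins.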

model_path: str | None = None
probe: str = 'mea'
device: str = 'cuda'
num_processes: int | None = None
recording_window_ms: Any | None = None
save_rt_sort_pickle: bool = True
delete_inter: bool = False
verbose: bool = True
params: Dict[str, Any] | None = None
detection_window_s: float | None = None
__init__(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None)
class spikelab.spike_sorting.config.WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True)[source]

Bases: object

Parameters for waveform extraction and template computation.

Memory-budget note: the default extractor pre-allocates one (n_spikes, nsamples, num_channels) .npy memmap per unit before extraction begins. For high-unit-count sorters on high-density MEAs this grows to tens of GB (e.g. 400 units × 1018 channels = ~39 GB). When that exceeds host RAM, set streaming=True (already the default value of this field) to use a one-unit-at-a-time path that discards each unit’s waveforms after templates and metrics are computed; peak RAM then becomes one unit’s buffer (~100 MB for MaxOne) regardless of total unit count. Waveform files are only written when save_waveform_files=True.
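The figures in the memory-budget note can be reproduced with a short calculation. This sketch assumes a 20 kHz sampling rate, 4-byte (float32) samples, and max_waveforms_per_unit=300 spikes per unit; none of these are stated in the note itself, and the real extractor may round the sample count differently.

```python
def waveform_buffer_bytes(n_spikes, ms_before, ms_after, num_channels,
                          sampling_rate_hz=20_000, bytes_per_sample=4):
    """Size of one (n_spikes, nsamples, num_channels) buffer in bytes."""
    n_samples = round((ms_before + ms_after) * sampling_rate_hz / 1000)
    return n_spikes * n_samples * num_channels * bytes_per_sample


# One unit: 300 spikes x 80 samples x 1018 channels x 4 bytes.
per_unit = waveform_buffer_bytes(300, 2.0, 2.0, 1018)
# Pre-allocating all 400 units at once.
total = 400 * per_unit

print(f"per unit:  {per_unit / 1e6:.1f} MB")  # 97.7 MB
print(f"400 units: {total / 1e9:.1f} GB")     # 39.1 GB
```

Under these assumptions the per-unit buffer matches the ~100 MB quoted for the streaming path, and the pre-allocated total matches the ~39 GB quoted for 400 units.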

ms_before: float = 2.0
ms_after: float = 2.0
pos_peak_thresh: float = 2.0
max_waveforms_per_unit: int = 300
compiled_ms_before: float = 2.0
compiled_ms_after: float = 2.0
scale_compiled_waveforms: bool = True
std_at_peak: bool = True
std_over_window_ms_before: float = 0.5
std_over_window_ms_after: float = 1.5
streaming: bool = True
save_waveform_files: bool = True
__init__(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True)
class spikelab.spike_sorting.config.CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0)[source]

Bases: object

Parameters for unit quality-control curation.

curate_first: bool = True
curate_second: bool = True
curation_epoch: int | None = None
fr_min: float | None = 0.05
isi_viol_max: float | None = 0.01
isi_violation_method: str = 'percent'
snr_min: float | None = 5.0
spikes_min_first: int | None = 30
spikes_min_second: int | None = 50
std_norm_max: float | None = 1.0
__init__(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0)
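One plausible reading of these thresholds is sketched below. UnitMetrics and passes_curation are hypothetical and not part of spikelab. The sketch assumes that "min" fields are inclusive lower bounds, "max" fields are inclusive upper bounds, and that None disables a check (suggested by the optional field types); the unit of isi_viol_max under the 'percent' method is not specified on this page.

```python
from dataclasses import dataclass


@dataclass
class UnitMetrics:
    """Hypothetical per-unit summary; not a spikelab type."""
    firing_rate_hz: float
    isi_violation: float
    snr: float
    n_spikes: int
    std_norm: float


def _at_least(value, minimum):
    # A threshold of None disables the check.
    return minimum is None or value >= minimum


def _at_most(value, maximum):
    # A threshold of None disables the check.
    return maximum is None or value <= maximum


def passes_curation(m, fr_min=0.05, isi_viol_max=0.01, snr_min=5.0,
                    spikes_min=30, std_norm_max=1.0):
    """Return True when a unit satisfies every enabled threshold."""
    return (
        _at_least(m.firing_rate_hz, fr_min)
        and _at_most(m.isi_violation, isi_viol_max)
        and _at_least(m.snr, snr_min)
        and _at_least(m.n_spikes, spikes_min)
        and _at_most(m.std_norm, std_norm_max)
    )


good = UnitMetrics(1.0, 0.005, 8.0, 200, 0.4)
weak = UnitMetrics(1.0, 0.005, 3.0, 200, 0.4)  # fails only on SNR
```

With these inputs, good passes the default thresholds, weak fails them, and weak passes once snr_min=None switches the SNR check off.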
class spikelab.spike_sorting.config.CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False)[source]

Bases: object

Parameters for result compilation and export.

compile_single_recording: bool = True
compile_to_mat: bool = False
compile_to_npz: bool = True
compile_waveforms: bool = False
save_electrodes: bool = True
save_spike_times: bool = True
save_raw_pkl: bool = False
save_dl_data: bool = False
__init__(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False)
class spikelab.spike_sorting.config.FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=<factory>, scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)')[source]

Bases: object

Parameters for QC figure generation.

create_figures: bool = False
create_unit_figures: bool = False
dpi: int | None = None
font_size: int = 12
bar_x_label: str = 'Recording'
bar_y_label: str = 'Number of Units'
bar_label_rotation: int = 0
bar_total_label: str = 'First Curation'
bar_selected_label: str = 'Selected Curation'
scatter_std_max_units_per_recording: int | None = None
scatter_recording_colors: List[str]
scatter_recording_alpha: float = 1.0
scatter_x_label: str = 'Number of Spikes'
scatter_y_label: str = 'avg. STD / amplitude'
scatter_x_max_buffer: float = 300.0
scatter_y_max_buffer: float = 0.2
templates_color_curated: str = '#000000'
templates_color_failed: str = '#FF0000'
templates_per_column: int = 50
templates_y_spacing: float = 50.0
templates_y_lim_buffer: float = 10.0
templates_window_ms_before: float = 5.0
templates_window_ms_after: float = 5.0
templates_line_ms_before: float | None = 1.0
templates_line_ms_after: float | None = 4.0
templates_x_label: str = 'Time Rel. to Peak (ms)'
__init__(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=<factory>, scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)')
class spikelab.spike_sorting.config.ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True, host_ram_watchdog=True, host_ram_warn_pct=85.0, host_ram_abort_pct=92.0, host_ram_poll_interval_s=2.0, preflight=True, preflight_strict=False, preflight_min_free_inter_gb=20.0, preflight_min_free_results_gb=2.0, preflight_min_available_ram_gb=4.0, preflight_min_free_vram_gb=2.0, sorter_inactivity_timeout=True, sorter_inactivity_base_s=600.0, sorter_inactivity_per_min_s=30.0, sorter_inactivity_max_s=7200.0, sorter_inactivity_in_process_grace_s=10.0, oom_retry_max=1, oom_retry_factor=0.5, canary_first_n_s=0.0, canary_min_recording_s=120.0, docker_image_expected_digest=None, disk_watchdog=True, disk_warn_free_gb=5.0, disk_abort_free_gb=1.0, disk_poll_interval_s=10.0, io_stall_watchdog=True, io_stall_s=300.0, io_stall_poll_interval_s=10.0, io_stall_mode='process', io_stall_include_descendants=True, cleanup_temp_files=True, prevent_system_sleep=True, gpu_watchdog=True, gpu_warn_pct=85.0, gpu_abort_pct=95.0, gpu_poll_interval_s=2.0, gpu_warn_temp_c=85.0, gpu_abort_temp_c=92.0, gpu_monitor_throttle_reasons=True, tee_log_policy='delete_on_success', generate_sorting_report=True)[source]

Bases: object

Parameters for pipeline execution control.

Includes safety knobs for the host-memory watchdog and the pre-loop preflight checks under spikelab.spike_sorting.guards. Defaults are tuned for a 32–64 GB workstation; adjust the GB thresholds on smaller hosts.
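The paired warn/abort percentages (host_ram_warn_pct and host_ram_abort_pct, gpu_warn_pct and gpu_abort_pct) suggest a two-level threshold. The sketch below shows that idea only; classify_usage is a hypothetical function, and how the real watchdogs sample usage or react to each level is not described on this page.

```python
def classify_usage(used_pct, warn_pct=85.0, abort_pct=92.0):
    """Map a usage percentage to 'ok', 'warn', or 'abort'."""
    if warn_pct > abort_pct:
        raise ValueError("warn_pct must not exceed abort_pct")
    if used_pct >= abort_pct:
        return "abort"
    if used_pct >= warn_pct:
        return "warn"
    return "ok"


# Host-RAM defaults (85 / 92) and GPU defaults (85 / 95).
levels = [classify_usage(pct) for pct in (50.0, 88.0, 93.0)]
gpu_level = classify_usage(93.0, warn_pct=85.0, abort_pct=95.0)
```

Here levels evaluates to ['ok', 'warn', 'abort'], while the same 93 % reading only warns under the wider GPU abort threshold.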

n_jobs: int = 8
total_memory: str = '16G'
use_parallel_processing_for_raw_conversion: bool = True
save_script: bool = False
out_file: str = 'sort_with_kilosort2.out'
random_seed: int = 1
recompute_recording: bool = False
recompute_sorting: bool = False
reextract_waveforms: bool = False
recurate_first: bool = False
recurate_second: bool = False
recompile_single_recording: bool = False
delete_inter: bool = True
host_ram_watchdog: bool = True
host_ram_warn_pct: float = 85.0
host_ram_abort_pct: float = 92.0
host_ram_poll_interval_s: float = 2.0
preflight: bool = True
preflight_strict: bool = False
preflight_min_free_inter_gb: float = 20.0
preflight_min_free_results_gb: float = 2.0
preflight_min_available_ram_gb: float = 4.0
preflight_min_free_vram_gb: float = 2.0
sorter_inactivity_timeout: bool = True
sorter_inactivity_base_s: float = 600.0
sorter_inactivity_per_min_s: float = 30.0
sorter_inactivity_max_s: float | None = 7200.0
sorter_inactivity_in_process_grace_s: float = 10.0
oom_retry_max: int = 1
oom_retry_factor: float = 0.5
canary_first_n_s: float = 0.0
canary_min_recording_s: float = 120.0
docker_image_expected_digest: str | None = None
disk_watchdog: bool = True
disk_warn_free_gb: float = 5.0
disk_abort_free_gb: float = 1.0
disk_poll_interval_s: float = 10.0
io_stall_watchdog: bool = True
io_stall_s: float = 300.0
io_stall_poll_interval_s: float = 10.0
io_stall_mode: str = 'process'
io_stall_include_descendants: bool = True
cleanup_temp_files: bool = True
prevent_system_sleep: bool = True
gpu_watchdog: bool = True
gpu_warn_pct: float = 85.0
gpu_abort_pct: float = 95.0
gpu_poll_interval_s: float = 2.0
gpu_warn_temp_c: float | None = 85.0
gpu_abort_temp_c: float | None = 92.0
gpu_monitor_throttle_reasons: bool = True
tee_log_policy: str = 'delete_on_success'
generate_sorting_report: bool = True
__init__(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True, host_ram_watchdog=True, host_ram_warn_pct=85.0, host_ram_abort_pct=92.0, host_ram_poll_interval_s=2.0, preflight=True, preflight_strict=False, preflight_min_free_inter_gb=20.0, preflight_min_free_results_gb=2.0, preflight_min_available_ram_gb=4.0, preflight_min_free_vram_gb=2.0, sorter_inactivity_timeout=True, sorter_inactivity_base_s=600.0, sorter_inactivity_per_min_s=30.0, sorter_inactivity_max_s=7200.0, sorter_inactivity_in_process_grace_s=10.0, oom_retry_max=1, oom_retry_factor=0.5, canary_first_n_s=0.0, canary_min_recording_s=120.0, docker_image_expected_digest=None, disk_watchdog=True, disk_warn_free_gb=5.0, disk_abort_free_gb=1.0, disk_poll_interval_s=10.0, io_stall_watchdog=True, io_stall_s=300.0, io_stall_poll_interval_s=10.0, io_stall_mode='process', io_stall_include_descendants=True, cleanup_temp_files=True, prevent_system_sleep=True, gpu_watchdog=True, gpu_warn_pct=85.0, gpu_abort_pct=95.0, gpu_poll_interval_s=2.0, gpu_warn_temp_c=85.0, gpu_abort_temp_c=92.0, gpu_monitor_throttle_reasons=True, tee_log_policy='delete_on_success', generate_sorting_report=True)
class spikelab.spike_sorting.config.SortingPipelineConfig(recording=<factory>, sorter=<factory>, rt_sort=<factory>, waveform=<factory>, curation=<factory>, compilation=<factory>, figures=<factory>, execution=<factory>)[source]

Bases: object

Complete configuration for a spike sorting pipeline run.

Groups all parameters into typed sub-configs. Passed explicitly to every pipeline function, replacing module-level globals.

Parameters:
recording: RecordingConfig
sorter: SorterConfig
rt_sort: RTSortConfig
waveform: WaveformConfig
curation: CurationConfig
compilation: CompilationConfig
figures: FigureConfig
execution: ExecutionConfig
classmethod from_kwargs(**kwargs)[source]

Build a config from flat keyword arguments.

Maps the flat parameter names used by sort_with_kilosort2() to the nested sub-config fields. Unknown keys raise TypeError.

Parameters:

**kwargs – Flat keyword arguments matching sort_with_kilosort2() parameter names.

Returns:

Populated configuration.

Return type:

config (SortingPipelineConfig)

override(**kwargs)[source]

Return a copy of this config with selected fields overridden.

Accepts the same flat keyword arguments as from_kwargs(). Unspecified fields retain their current values.

Parameters:

**kwargs – Flat keyword arguments to override.

Returns:

New config with overrides.

Return type:

config (SortingPipelineConfig)
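The documented behaviour of from_kwargs() and override() (flat keys routed to nested sub-configs, unknown keys raising TypeError, override returning a copy) can be imitated with a miniature stand-in. MiniPipelineConfig and its two sub-configs are hypothetical and reduced to four fields; this is a sketch of the pattern, not the library's implementation.

```python
from dataclasses import dataclass, field, fields, replace


@dataclass
class MiniCuration:
    snr_min: float = 5.0
    fr_min: float = 0.05


@dataclass
class MiniSorter:
    sorter_name: str = "kilosort2"
    use_docker: bool = False


@dataclass
class MiniPipelineConfig:
    curation: MiniCuration = field(default_factory=MiniCuration)
    sorter: MiniSorter = field(default_factory=MiniSorter)

    @classmethod
    def from_kwargs(cls, **kwargs):
        """Build a config from flat keyword arguments."""
        return cls().override(**kwargs)

    def override(self, **kwargs):
        """Return a copy with the given flat fields replaced."""
        remaining = dict(kwargs)
        updates = {}
        for group in fields(self):
            sub = getattr(self, group.name)
            names = {f.name for f in fields(sub)}
            # Route each flat key to the sub-config that declares it.
            chosen = {k: remaining.pop(k) for k in list(remaining) if k in names}
            updates[group.name] = replace(sub, **chosen)
        if remaining:
            raise TypeError(f"unknown config keys: {sorted(remaining)}")
        return replace(self, **updates)


base = MiniPipelineConfig.from_kwargs(snr_min=6.0)
docker = base.override(use_docker=True)
```

After these calls, docker carries both snr_min=6.0 and use_docker=True, while base still has use_docker=False, which is the copy semantics described for override().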

__init__(recording=<factory>, sorter=<factory>, rt_sort=<factory>, waveform=<factory>, curation=<factory>, compilation=<factory>, figures=<factory>, execution=<factory>)
spikelab.spike_sorting.config.KILOSORT2 = SortingPipelineConfig(recording=RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=[], rec_chunks_s=[], start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000), sorter=SorterConfig(sorter_name='kilosort2', sorter_path=None, sorter_params=None, use_docker=False), rt_sort=RTSortConfig(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None), waveform=WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True), curation=CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0), compilation=CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False), figures=FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=['#f74343', '#fccd56', '#74fc56', '#56fcf6', '#1e1efa', '#fa1ed2'], scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)'), execution=ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True, host_ram_watchdog=True, host_ram_warn_pct=85.0, host_ram_abort_pct=92.0, host_ram_poll_interval_s=2.0, preflight=True, preflight_strict=False, preflight_min_free_inter_gb=20.0, preflight_min_free_results_gb=2.0, preflight_min_available_ram_gb=4.0, preflight_min_free_vram_gb=2.0, sorter_inactivity_timeout=True, sorter_inactivity_base_s=600.0, sorter_inactivity_per_min_s=30.0, sorter_inactivity_max_s=7200.0, sorter_inactivity_in_process_grace_s=10.0, oom_retry_max=1, oom_retry_factor=0.5, canary_first_n_s=0.0, canary_min_recording_s=120.0, docker_image_expected_digest=None, disk_watchdog=True, disk_warn_free_gb=5.0, disk_abort_free_gb=1.0, disk_poll_interval_s=10.0, io_stall_watchdog=True, io_stall_s=300.0, io_stall_poll_interval_s=10.0, io_stall_mode='process', io_stall_include_descendants=True, cleanup_temp_files=True, prevent_system_sleep=True, gpu_watchdog=True, gpu_warn_pct=85.0, gpu_abort_pct=95.0, gpu_poll_interval_s=2.0, gpu_warn_temp_c=85.0, gpu_abort_temp_c=92.0, gpu_monitor_throttle_reasons=True, tee_log_policy='delete_on_success', generate_sorting_report=True))

Default configuration for Kilosort2. Parameters are compatible with Maxwell MEA and other probe types. Hardware-specific presets can be created by overriding parameters.

spikelab.spike_sorting.config.KILOSORT2_DOCKER = SortingPipelineConfig(recording=RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=[], rec_chunks_s=[], start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000), sorter=SorterConfig(sorter_name='kilosort2', sorter_path=None, sorter_params=None, use_docker=True), rt_sort=RTSortConfig(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None), waveform=WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True), curation=CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0), compilation=CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False), figures=FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=['#f74343', '#fccd56', '#74fc56', '#56fcf6', '#1e1efa', '#fa1ed2'], scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)'), execution=ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True, host_ram_watchdog=True, host_ram_warn_pct=85.0, host_ram_abort_pct=92.0, host_ram_poll_interval_s=2.0, preflight=True, preflight_strict=False, preflight_min_free_inter_gb=20.0, preflight_min_free_results_gb=2.0, preflight_min_available_ram_gb=4.0, preflight_min_free_vram_gb=2.0, sorter_inactivity_timeout=True, sorter_inactivity_base_s=600.0, sorter_inactivity_per_min_s=30.0, sorter_inactivity_max_s=7200.0, sorter_inactivity_in_process_grace_s=10.0, oom_retry_max=1, oom_retry_factor=0.5, canary_first_n_s=0.0, canary_min_recording_s=120.0, docker_image_expected_digest=None, disk_watchdog=True, disk_warn_free_gb=5.0, disk_abort_free_gb=1.0, disk_poll_interval_s=10.0, io_stall_watchdog=True, io_stall_s=300.0, io_stall_poll_interval_s=10.0, io_stall_mode='process', io_stall_include_descendants=True, cleanup_temp_files=True, prevent_system_sleep=True, gpu_watchdog=True, gpu_warn_pct=85.0, gpu_abort_pct=95.0, gpu_poll_interval_s=2.0, gpu_warn_temp_c=85.0, gpu_abort_temp_c=92.0, gpu_monitor_throttle_reasons=True, tee_log_policy='delete_on_success', generate_sorting_report=True))

Kilosort2 with Docker (no local MATLAB needed).

spikelab.spike_sorting.config.KILOSORT4 = SortingPipelineConfig(recording=RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=[], rec_chunks_s=[], start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000), sorter=SorterConfig(sorter_name='kilosort4', sorter_path=None, sorter_params=None, use_docker=False), rt_sort=RTSortConfig(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None), waveform=WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True), curation=CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0), compilation=CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False), figures=FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=['#f74343', '#fccd56', '#74fc56', '#56fcf6', '#1e1efa', '#fa1ed2'], scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)'), execution=ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True, host_ram_watchdog=True, host_ram_warn_pct=85.0, host_ram_abort_pct=92.0, host_ram_poll_interval_s=2.0, preflight=True, preflight_strict=False, preflight_min_free_inter_gb=20.0, preflight_min_free_results_gb=2.0, preflight_min_available_ram_gb=4.0, preflight_min_free_vram_gb=2.0, sorter_inactivity_timeout=True, sorter_inactivity_base_s=600.0, sorter_inactivity_per_min_s=30.0, sorter_inactivity_max_s=7200.0, sorter_inactivity_in_process_grace_s=10.0, oom_retry_max=1, oom_retry_factor=0.5, canary_first_n_s=0.0, canary_min_recording_s=120.0, docker_image_expected_digest=None, disk_watchdog=True, disk_warn_free_gb=5.0, disk_abort_free_gb=1.0, disk_poll_interval_s=10.0, io_stall_watchdog=True, io_stall_s=300.0, io_stall_poll_interval_s=10.0, io_stall_mode='process', io_stall_include_descendants=True, cleanup_temp_files=True, prevent_system_sleep=True, gpu_watchdog=True, gpu_warn_pct=85.0, gpu_abort_pct=95.0, gpu_poll_interval_s=2.0, gpu_warn_temp_c=85.0, gpu_abort_temp_c=92.0, gpu_monitor_throttle_reasons=True, tee_log_policy='delete_on_success', generate_sorting_report=True))

Default configuration for Kilosort4. Kilosort4 is pure Python (PyTorch) — no MATLAB required. Default parameters are tuned for Neuropixels probes but work for other probe types. Hardware-specific presets (e.g. for Maxwell MEAs) can be created by overriding detection/filtering parameters.

spikelab.spike_sorting.config.KILOSORT4_DOCKER = SortingPipelineConfig(recording=RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=[], rec_chunks_s=[], start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000), sorter=SorterConfig(sorter_name='kilosort4', sorter_path=None, sorter_params=None, use_docker=True), rt_sort=RTSortConfig(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None), waveform=WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True), curation=CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0), compilation=CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False), figures=FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=['#f74343', '#fccd56', '#74fc56', '#56fcf6', '#1e1efa', '#fa1ed2'], scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. 
STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)'), execution=ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True, host_ram_watchdog=True, host_ram_warn_pct=85.0, host_ram_abort_pct=92.0, host_ram_poll_interval_s=2.0, preflight=True, preflight_strict=False, preflight_min_free_inter_gb=20.0, preflight_min_free_results_gb=2.0, preflight_min_available_ram_gb=4.0, preflight_min_free_vram_gb=2.0, sorter_inactivity_timeout=True, sorter_inactivity_base_s=600.0, sorter_inactivity_per_min_s=30.0, sorter_inactivity_max_s=7200.0, sorter_inactivity_in_process_grace_s=10.0, oom_retry_max=1, oom_retry_factor=0.5, canary_first_n_s=0.0, canary_min_recording_s=120.0, docker_image_expected_digest=None, disk_watchdog=True, disk_warn_free_gb=5.0, disk_abort_free_gb=1.0, disk_poll_interval_s=10.0, io_stall_watchdog=True, io_stall_s=300.0, io_stall_poll_interval_s=10.0, io_stall_mode='process', io_stall_include_descendants=True, cleanup_temp_files=True, prevent_system_sleep=True, gpu_watchdog=True, gpu_warn_pct=85.0, gpu_abort_pct=95.0, gpu_poll_interval_s=2.0, gpu_warn_temp_c=85.0, gpu_abort_temp_c=92.0, gpu_monitor_throttle_reasons=True, tee_log_policy='delete_on_success', generate_sorting_report=True))

Kilosort4 with Docker.

spikelab.spike_sorting.config.RT_SORT_MEA = SortingPipelineConfig(recording=RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=[], rec_chunks_s=[], start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000), sorter=SorterConfig(sorter_name='rt_sort', sorter_path=None, sorter_params=None, use_docker=False), rt_sort=RTSortConfig(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None), waveform=WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True), curation=CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0), compilation=CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False), figures=FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=['#f74343', '#fccd56', '#74fc56', '#56fcf6', '#1e1efa', '#fa1ed2'], scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. 
STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)'), execution=ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True, host_ram_watchdog=True, host_ram_warn_pct=85.0, host_ram_abort_pct=92.0, host_ram_poll_interval_s=2.0, preflight=True, preflight_strict=False, preflight_min_free_inter_gb=20.0, preflight_min_free_results_gb=2.0, preflight_min_available_ram_gb=4.0, preflight_min_free_vram_gb=2.0, sorter_inactivity_timeout=True, sorter_inactivity_base_s=600.0, sorter_inactivity_per_min_s=30.0, sorter_inactivity_max_s=7200.0, sorter_inactivity_in_process_grace_s=10.0, oom_retry_max=1, oom_retry_factor=0.5, canary_first_n_s=0.0, canary_min_recording_s=120.0, docker_image_expected_digest=None, disk_watchdog=True, disk_warn_free_gb=5.0, disk_abort_free_gb=1.0, disk_poll_interval_s=10.0, io_stall_watchdog=True, io_stall_s=300.0, io_stall_poll_interval_s=10.0, io_stall_mode='process', io_stall_include_descendants=True, cleanup_temp_files=True, prevent_system_sleep=True, gpu_watchdog=True, gpu_warn_pct=85.0, gpu_abort_pct=95.0, gpu_poll_interval_s=2.0, gpu_warn_temp_c=85.0, gpu_abort_temp_c=92.0, gpu_monitor_throttle_reasons=True, tee_log_policy='delete_on_success', generate_sorting_report=True))

RT-Sort with the bundled MEA detection model. Uses the propagation-based RT-Sort algorithm (van der Molen, Lim et al. 2024, PLOS ONE) with the pretrained model tuned for Maxwell multi-electrode arrays.

spikelab.spike_sorting.config.RT_SORT_NEUROPIXELS = SortingPipelineConfig(recording=RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=[], rec_chunks_s=[], start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000), sorter=SorterConfig(sorter_name='rt_sort', sorter_path=None, sorter_params=None, use_docker=False), rt_sort=RTSortConfig(model_path=None, probe='neuropixels', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params={'stringent_thresh': 0.175, 'loose_thresh': 0.075, 'inference_scaling_numerator': 15.4, 'min_amp_dist_p': 0.1, 'max_latency_diff_spikes': 2.5, 'max_amp_median_diff_spikes': 0.45, 'max_latency_diff_sequences': 2.5, 'max_amp_median_diff_sequences': 0.45, 'max_root_amp_median_std_sequences': 2.5}, detection_window_s=None), waveform=WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True), curation=CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0), compilation=CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False), figures=FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=['#f74343', '#fccd56', 
'#74fc56', '#56fcf6', '#1e1efa', '#fa1ed2'], scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)'), execution=ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True, host_ram_watchdog=True, host_ram_warn_pct=85.0, host_ram_abort_pct=92.0, host_ram_poll_interval_s=2.0, preflight=True, preflight_strict=False, preflight_min_free_inter_gb=20.0, preflight_min_free_results_gb=2.0, preflight_min_available_ram_gb=4.0, preflight_min_free_vram_gb=2.0, sorter_inactivity_timeout=True, sorter_inactivity_base_s=600.0, sorter_inactivity_per_min_s=30.0, sorter_inactivity_max_s=7200.0, sorter_inactivity_in_process_grace_s=10.0, oom_retry_max=1, oom_retry_factor=0.5, canary_first_n_s=0.0, canary_min_recording_s=120.0, docker_image_expected_digest=None, disk_watchdog=True, disk_warn_free_gb=5.0, disk_abort_free_gb=1.0, disk_poll_interval_s=10.0, io_stall_watchdog=True, io_stall_s=300.0, io_stall_poll_interval_s=10.0, io_stall_mode='process', io_stall_include_descendants=True, cleanup_temp_files=True, prevent_system_sleep=True, gpu_watchdog=True, gpu_warn_pct=85.0, gpu_abort_pct=95.0, gpu_poll_interval_s=2.0, gpu_warn_temp_c=85.0, gpu_abort_temp_c=92.0, gpu_monitor_throttle_reasons=True, tee_log_policy='delete_on_success', generate_sorting_report=True))

RT-Sort with the bundled Neuropixels detection model. Uses Neuropixels-tuned detection thresholds and merge parameters.
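In the reprs above, KILOSORT4 and KILOSORT4_DOCKER appear to differ only in sorter.use_docker. The following is a minimal sketch of how a flat keyword override (such as use_docker=True or snr_min=4.0) could be routed to the sub-config that owns the field. The Mini* dataclasses and this override() body are hypothetical stand-ins written for illustration; they are not the spikelab implementation, and the real config.override() may behave differently.

```python
from dataclasses import dataclass, field, fields, replace


@dataclass(frozen=True)
class MiniSorterConfig:  # hypothetical stand-in for SorterConfig
    sorter_name: str = "kilosort4"
    use_docker: bool = False


@dataclass(frozen=True)
class MiniCurationConfig:  # hypothetical stand-in for CurationConfig
    fr_min: float = 0.05
    snr_min: float = 5.0


@dataclass(frozen=True)
class MiniPipelineConfig:  # hypothetical stand-in for SortingPipelineConfig
    sorter: MiniSorterConfig = field(default_factory=MiniSorterConfig)
    curation: MiniCurationConfig = field(default_factory=MiniCurationConfig)

    def override(self, **kwargs):
        """Return a copy with each flat keyword applied to the sub-config that owns it."""
        updated = {}
        for name, value in kwargs.items():
            for group in fields(self):
                # Use the already-updated sub-config if an earlier keyword touched it.
                sub = updated.get(group.name, getattr(self, group.name))
                if name in {f.name for f in fields(sub)}:
                    updated[group.name] = replace(sub, **{name: value})
                    break
            else:
                raise TypeError(f"unknown config field: {name!r}")
        return replace(self, **updated)


base = MiniPipelineConfig()
docker = base.override(use_docker=True, snr_min=4.0)
```

Because the stand-in dataclasses are frozen, override() leaves the preset untouched and returns a new object, which is one way to keep module-level presets safe from accidental mutation.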

Backend Registry

Spike sorter backend registry.

Maps sorter names to their backend classes. Backends are imported lazily to avoid requiring all sorter dependencies at import time.

spikelab.spike_sorting.backends.get_backend_class(sorter_name)[source]

Look up and import the backend class for a sorter name.

Parameters:

sorter_name (str) – Registered sorter name (e.g. "kilosort2").

Returns:

The SorterBackend subclass.

Return type:

cls

Raises:

ValueError – If the sorter name is not registered.

spikelab.spike_sorting.backends.list_sorters()[source]

Return the list of registered sorter names.

Returns:

Available sorter names.

Return type:

sorters (list of str)
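The lazy-import pattern described for this registry can be sketched as follows. This is an illustration only: the registry contents are hypothetical, and standard-library classes stand in for real backend classes so the example runs without any sorter installed. Whether the real list_sorters() returns a sorted list is also an assumption.

```python
import importlib

# Hypothetical registry: sorter name -> (module path, class name).
# Standard-library targets stand in for real backend modules.
_REGISTRY = {
    "demo_ordered": ("collections", "OrderedDict"),
    "demo_fraction": ("fractions", "Fraction"),
}


def list_sorters():
    """Return the registered names without importing any backend module."""
    return sorted(_REGISTRY)


def get_backend_class(sorter_name):
    """Import the backend module on first use and return the class object."""
    try:
        module_path, class_name = _REGISTRY[sorter_name]
    except KeyError:
        raise ValueError(
            f"Unknown sorter {sorter_name!r}; available: {list_sorters()}"
        ) from None
    module = importlib.import_module(module_path)
    return getattr(module, class_name)
```

Storing module paths as strings means a missing optional dependency only surfaces when that specific sorter is requested, not when the registry module is imported.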

class spikelab.spike_sorting.backends.base.SorterBackend(config)[source]

Bases: ABC

Interface that each spike sorter backend must implement.

Parameters:

config (SortingPipelineConfig) – Full pipeline configuration. Backends read their relevant sub-configs (config.recording, config.sorter, config.waveform, config.execution).

__init__(config)[source]
abstractmethod load_recording(rec_path)[source]

Load and preprocess a single recording.

Handles format-specific loading (Maxwell .h5, NWB, etc.), gain/offset scaling, and bandpass filtering.

Parameters:

rec_path (Any) – Path to a recording file, a directory of files to concatenate, or a pre-loaded BaseRecording object.

Returns:

A SpikeInterface BaseRecording ready for sorting (scaled, filtered, single-segment).

Return type:

recording

abstractmethod sort(recording, rec_path, recording_dat_path, output_folder)[source]

Run the spike sorter on a preprocessed recording.

Parameters:
  • recording – SpikeInterface BaseRecording from load_recording.

  • rec_path – Original recording file path (for binary conversion or metadata).

  • recording_dat_path (Path) – Path for the binary .dat file (used by sorters that require pre-converted input).

  • output_folder (Path) – Directory for sorter output files.

Returns:

A SpikeInterface BaseSorting with detected units and spike trains.

Return type:

sorting

abstractmethod extract_waveforms(recording, sorting, waveforms_folder, curation_folder, rec_path=None, rng=None)[source]

Extract per-unit waveforms and compute templates.

Parameters:
  • recording – SpikeInterface BaseRecording.

  • sorting – SpikeInterface BaseSorting from sort.

  • waveforms_folder (Path) – Root directory for waveform storage.

  • curation_folder (Path) – Directory for initial unit list and metadata.

Returns:

An object providing at minimum:

  • sorting — the sorting object (possibly with centered spike times)

  • recording — the recording object

  • sampling_frequency — float

  • peak_ind — int (peak sample index in template)

  • chans_max_all — dict or array mapping unit_id to max-amplitude channel index

  • use_pos_peak — dict or array mapping unit_id to bool (polarity)

  • get_computed_template(unit_id, mode) — returns (n_samples, n_channels) template array

  • ms_to_samples(ms) — time conversion

  • root_folder — Path to waveform files

This can be the custom WaveformExtractor (Kilosort2 backend) or a wrapper around SpikeInterface’s WaveformExtractor (future backends).

Return type:

waveform_extractor
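The minimum interface listed above can be written down as a structural type. The sketch below is an assumption-laden illustration: the name WaveformExtractorLike, the DummyExtractor class, the "average" mode string, and the rounding used in ms_to_samples are all hypothetical and not taken from spikelab.

```python
from pathlib import Path
from typing import Any, Mapping, Protocol, runtime_checkable


@runtime_checkable
class WaveformExtractorLike(Protocol):  # hypothetical name for the documented interface
    sorting: Any
    recording: Any
    sampling_frequency: float
    peak_ind: int
    chans_max_all: Mapping[Any, int]
    use_pos_peak: Mapping[Any, bool]
    root_folder: Path

    def get_computed_template(self, unit_id: Any, mode: str) -> Any: ...

    def ms_to_samples(self, ms: float) -> int: ...


class DummyExtractor:
    """Toy object that satisfies the interface with placeholder data."""

    def __init__(self, sampling_frequency=20000.0):
        self.sorting = None
        self.recording = None
        self.sampling_frequency = sampling_frequency
        self.peak_ind = self.ms_to_samples(2.0)  # 2 ms before the peak
        self.chans_max_all = {0: 3}
        self.use_pos_peak = {0: False}
        self.root_folder = Path("waveforms")

    def get_computed_template(self, unit_id, mode="average"):
        # Nested lists shaped (n_samples, n_channels), filled with zeros.
        n_samples = 2 * self.peak_ind + 1
        return [[0.0] * 4 for _ in range(n_samples)]

    def ms_to_samples(self, ms):
        # Assumed conversion: round to the nearest sample.
        return int(round(ms * self.sampling_frequency / 1000.0))


extractor = DummyExtractor()
```

A runtime-checkable protocol lets downstream curation code verify with isinstance() that a backend's return value exposes the required names, without requiring a shared base class.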

write_recording(recording, dat_path)[source]

Convert a recording to the binary format needed by the sorter.

Not all sorters need this (some read recordings directly via SpikeInterface). The default implementation is a no-op.

Parameters:
  • recording (Any) – SpikeInterface BaseRecording.

  • dat_path (Path) – Output binary file path.

Return type:

None

scale_oom_params(factor)[source]

Mutate self.config to scale down the parameter that bounds GPU memory use (a factor of 0.5 halves it).

Each backend overrides this to adjust the parameter most directly responsible for GPU memory consumption — typically the per-batch sample count. The default implementation does nothing and reports failure so callers know retry-on-OOM is not supported for that backend.

Parameters:

factor (float) – Multiplicative factor in (0, 1] to apply. 0.5 halves the parameter.

Returns:

True when at least one parameter was changed; False when no scaling was applied. Callers should skip the retry when False is returned.

Return type:

scaled (bool)

snapshot_oom_params()[source]

Return a snapshot of OOM-bound config fields for restore.

Used by the per-recording OOM-retry loop so a scale-down applied for one recording does not silently persist into the next. The returned dict is opaque — only restore_oom_params() is expected to read it.

Returns:

Backend-specific snapshot. Default implementation returns an empty dict.

Return type:

snapshot (dict)

restore_oom_params(snapshot)[source]

Restore the OOM-bound config fields from a prior snapshot.

Default implementation is a no-op. Backends that override scale_oom_params() should also override this so the retry loop can reset the config between recordings.

Parameters:

snapshot (dict) – Object returned by snapshot_oom_params().

Return type:

None
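The three OOM hooks (snapshot_oom_params, scale_oom_params, restore_oom_params) are meant to be used together by a per-recording retry loop. The sketch below shows one plausible shape for that loop. ToyBackend, sort_with_oom_retry, and fake_sort are hypothetical; MemoryError stands in for a classified GPU out-of-memory failure; and the toy backend keeps its knob as a plain attribute rather than inside a config object. The default values of oom_retry_max and oom_retry_factor mirror the ExecutionConfig reprs shown earlier.

```python
class ToyBackend:
    """Hypothetical backend whose only OOM-bound knob is a batch size."""

    def __init__(self, batch_size=60000):
        self.batch_size = batch_size

    def snapshot_oom_params(self):
        return {"batch_size": self.batch_size}

    def restore_oom_params(self, snapshot):
        self.batch_size = snapshot["batch_size"]

    def scale_oom_params(self, factor):
        new_size = max(1, int(self.batch_size * factor))
        if new_size == self.batch_size:
            return False  # nothing changed, so a retry would be pointless
        self.batch_size = new_size
        return True


def sort_with_oom_retry(backend, run_sort, oom_retry_max=1, oom_retry_factor=0.5):
    """Run run_sort(backend); on MemoryError shrink the knob and retry."""
    snapshot = backend.snapshot_oom_params()
    try:
        for attempt in range(oom_retry_max + 1):
            try:
                return run_sort(backend)
            except MemoryError:  # stand-in for a classified GPU OOM failure
                out_of_retries = attempt == oom_retry_max
                if out_of_retries or not backend.scale_oom_params(oom_retry_factor):
                    raise
    finally:
        # Undo any scale-down so the next recording starts from the original values.
        backend.restore_oom_params(snapshot)


def fake_sort(backend):
    """Pretend the GPU only fits batches of 40000 samples or fewer."""
    if backend.batch_size > 40000:
        raise MemoryError("simulated GPU OOM")
    return backend.batch_size


backend = ToyBackend()
used_batch = sort_with_oom_retry(backend, fake_sort)
```

Placing the restore in a finally clause keeps a scale-down from leaking into the next recording whether the retry succeeds or fails.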

Classified Exceptions

When a sort fails, SpikeLab can classify the failure into one of three categories so that callers can implement skip/retry/stop policies without parsing generic error messages.

Classified spike-sorting exceptions shared across runners and curation.

Failures from Kilosort2, Kilosort4, and the downstream curation/waveform code are grouped into three categories so callers can implement retry / skip / hard-stop policies without parsing generic Exception messages:

  • BiologicalSortFailure — the recording itself cannot be sorted (too silent, all channels bad, no waveforms to compute metrics on). Recommended policy: mark the target as not-sortable, move on, do not retry.

  • EnvironmentSortFailure — the host environment or container runtime is misconfigured. Recommended policy: hard stop and surface to the operator; retrying without intervention will loop.

  • ResourceSortFailure — the job exhausted a machine resource (GPU memory today; disk/CPU in future). Recommended policy: retry with reduced parameters rather than skip or hard-stop.

Classifiers in _classifier inspect sorter logs and exception chains to re-raise generic failures as one of the specific types below. The classes are also usable directly from non-classifier paths (e.g. curation code that already knows the exact condition).

exception spikelab.spike_sorting._exceptions.SpikeSortingClassifiedError[source]

Bases: RuntimeError

Base class for all classified sort-pipeline failures.

Catch this when you want to treat any identified failure uniformly. Prefer catching the more specific categorical bases (BiologicalSortFailure, EnvironmentSortFailure, ResourceSortFailure) when the policy differs by category.

exception spikelab.spike_sorting._exceptions.BiologicalSortFailure[source]

Bases: SpikeSortingClassifiedError

Failure caused by the recording itself (too little signal).

exception spikelab.spike_sorting._exceptions.EnvironmentSortFailure[source]

Bases: SpikeSortingClassifiedError

Failure caused by host or container environment misconfiguration.

exception spikelab.spike_sorting._exceptions.ResourceSortFailure[source]

Bases: SpikeSortingClassifiedError

Failure caused by exhausting a machine resource.
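A batch driver can map the three categories onto skip, retry, and stop actions by catching the category bases. The sketch below defines local stand-in classes that mirror the documented hierarchy so that it runs without spikelab; with the package installed, the classes would instead be imported from spikelab.spike_sorting._exceptions. The handle_recording function and its return strings are hypothetical.

```python
# Stand-ins mirroring the documented hierarchy (not the library's own classes).
class SpikeSortingClassifiedError(RuntimeError):
    """Base class for classified failures."""


class BiologicalSortFailure(SpikeSortingClassifiedError):
    """The recording itself cannot be sorted."""


class EnvironmentSortFailure(SpikeSortingClassifiedError):
    """Host or container misconfiguration."""


class ResourceSortFailure(SpikeSortingClassifiedError):
    """A machine resource was exhausted."""


def handle_recording(sort_one, target):
    """Return the action a batch driver might take for one recording."""
    try:
        sort_one(target)
    except BiologicalSortFailure:
        return "skip"           # not sortable; do not retry
    except ResourceSortFailure:
        return "retry_reduced"  # try again with smaller parameters
    except EnvironmentSortFailure:
        return "stop"           # operator must fix the host or container
    except SpikeSortingClassifiedError:
        return "stop"           # classified, but of an unrecognized category
    return "done"


def raiser(exc_type):
    """Build a fake sort function that always raises exc_type."""
    def _sort(target):
        raise exc_type(f"simulated failure on {target}")
    return _sort
```

Unclassified exceptions are deliberately not caught here, so unexpected bugs still propagate with their original traceback.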

exception spikelab.spike_sorting._exceptions.InsufficientActivityError(message, *, sorter, threshold_crossings=None, units_at_failure=None, nspks_at_failure=None, log_path=None)[source]

Bases: BiologicalSortFailure

Sorting crashed because the recording has too little spiking activity.

Kilosort2, Kilosort4, and RT-Sort all fail on near-silent recordings, but in different ways:

  • Kilosort2: mex kernels launch with degenerate grid/block configurations when template counts and per-batch spike counts approach zero. Pre-Blackwell GPUs tolerated these launches; newer architectures (compute capability ≥ 12) reject them with CUDA error: invalid configuration argument.

  • Kilosort4: sklearn’s TruncatedSVD rejects an empty feature matrix, or KMeans fails the n_samples >= n_clusters check, when the initial spike-detection pass finds essentially no events.

  • RT-Sort: detect_sequences produces zero propagation sequences when the recording lacks sufficient spiking activity for clustering. It then returns None, which causes an AttributeError when sort_offline is subsequently called.

threshold_crossings

KS2 only; count of detected threshold crossings parsed from kilosort2.log. None for KS4 / RT-Sort.

units_at_failure

KS2 template count at the crash, or KS4 n_samples when KMeans complained. None when the log did not expose the value.

nspks_at_failure

KS2 only; spikes-per-batch at the failing template-optimization step.

log_path

Sorter log file carrying the full trace when located.

sorter

Short identifier of the sorter that raised ("kilosort2", "kilosort4", "rt_sort").

__init__(message, *, sorter, threshold_crossings=None, units_at_failure=None, nspks_at_failure=None, log_path=None)[source]
exception spikelab.spike_sorting._exceptions.NoGoodChannelsError(message, *, sorter, total_channels=None, bad_channels=None, log_path=None)[source]

Bases: BiologicalSortFailure

All channels were flagged as bad by the sorter’s good-channel check.

Distinct from InsufficientActivityError: the signal may be present (possibly noisy), but no channel passes the sorter’s minfr_goodchannels (or equivalent) firing-rate threshold.

total_channels

Total channel count in the recording, when parsed.

bad_channels

Channels flagged as bad.

log_path

Sorter log file carrying the full trace when located.

sorter

Short identifier of the sorter that raised.

__init__(message, *, sorter, total_channels=None, bad_channels=None, log_path=None)[source]
exception spikelab.spike_sorting._exceptions.SaturatedSignalError(message, *, channels_saturated=None, total_channels=None)[source]

Bases: BiologicalSortFailure

Recording appears flat or rail-saturated across all channels.

Typical causes: disconnected electrodes, loss of fluid contact, broken amplifier front-end, or a saved recording that never received real data. Distinct from InsufficientActivityError because it reflects a hardware/acquisition fault rather than biology.

The sort-time log signatures are ambiguous with near-silent biology, so this class is currently intended to be raised by dedicated pre-sort validators (e.g. per-channel variance / rail-clip checks) rather than by the post-failure classifiers. Callers that already know the condition may raise it directly.

channels_saturated

Number of channels identified as saturated, when the caller provides this.

total_channels

Total channel count in the recording.

__init__(message, *, channels_saturated=None, total_channels=None)[source]
exception spikelab.spike_sorting._exceptions.EmptyWaveformMetricsError(message, *, metric_name=None)[source]

Bases: BiologicalSortFailure, ValueError

Waveform metrics (SNR, std-norm) cannot be computed.

Raised when curation requests a waveform-based metric but no precomputed values exist and raw_data on the SpikeData is empty, so there is nothing to extract waveforms from.

This is biology-adjacent: it typically means the upstream sorter produced units that have no usable waveform evidence attached, or that the pipeline skipped the waveform-extraction stage. Callers should treat it as “cannot curate this target” rather than retry.

Inherits from both BiologicalSortFailure (for category-aware handling) and ValueError (for backward compatibility with callers that historically caught ValueError from this site).

metric_name

The metric that could not be computed.

__init__(message, *, metric_name=None)[source]
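The dual inheritance described above means that both older except ValueError handlers and newer category-aware handlers catch the same exception. A small sketch with stand-in classes follows; the real BiologicalSortFailure derives from SpikeSortingClassifiedError, which is simplified to RuntimeError here, and the two curate_* functions are hypothetical.

```python
class BiologicalSortFailure(RuntimeError):
    """Stand-in for the documented category base."""


class EmptyWaveformMetricsError(BiologicalSortFailure, ValueError):
    """Stand-in with the documented signature and attribute."""

    def __init__(self, message, *, metric_name=None):
        super().__init__(message)
        self.metric_name = metric_name


def curate_legacy():
    """Older call site that only knows about ValueError."""
    try:
        raise EmptyWaveformMetricsError("no waveforms", metric_name="snr")
    except ValueError as exc:
        return exc.metric_name


def curate_by_category():
    """Newer call site that applies the per-category policy."""
    try:
        raise EmptyWaveformMetricsError("no waveforms", metric_name="snr")
    except BiologicalSortFailure:
        return "skip"
```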
exception spikelab.spike_sorting._exceptions.ConcurrentSortError(message, *, lock_path=None, holder_pid=None, holder_hostname=None, started_at=None)[source]

Bases: EnvironmentSortFailure

Another sort is already in progress on the same intermediate folder.

Raised by spikelab.spike_sorting.guards.acquire_sort_lock() when a pre-existing lock file points at an alive PID on the same host. Two concurrent sorts against the same intermediate folder would corrupt each other’s binary artefacts (KS2 .dat file, RT-Sort scaled traces, curation cache), so the second sort fails fast rather than racing.

Recommended remediation: wait for the running sort to finish, or point the second sort at a different intermediate_folders path. If you believe the holder is dead but the lock persists, delete <inter_path>/.spikelab_sort.lock by hand.

lock_path

Path to the lock file that triggered the abort.

holder_pid

PID listed in the lock file (when readable).

holder_hostname

Hostname listed in the lock file (when readable).

started_at

ISO timestamp recorded when the holder acquired the lock.

__init__(message, *, lock_path=None, holder_pid=None, holder_hostname=None, started_at=None)[source]
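The lock check described above (a lock file naming a live PID on the same host blocks a second sort) can be sketched as follows. Everything except the lock file name is an assumption: the JSON layout, the acquire_lock function, and the liveness probe are hypothetical and are not the body of spikelab.spike_sorting.guards.acquire_sort_lock(). RuntimeError stands in for ConcurrentSortError. The default probe is POSIX-only; do not call os.kill(pid, 0) on Windows, where it does not act as a harmless existence check.

```python
import json
import os
import socket
from datetime import datetime, timezone
from pathlib import Path

LOCK_NAME = ".spikelab_sort.lock"  # file name taken from the text above


def pid_alive_posix(pid):
    """POSIX-only probe: signal 0 checks existence without delivering a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def acquire_lock(inter_path, pid_alive=pid_alive_posix):
    """Write a lock file, or raise if a live process on this host already holds it."""
    lock_path = Path(inter_path) / LOCK_NAME
    if lock_path.exists():
        holder = json.loads(lock_path.read_text())
        pid = holder.get("pid")
        same_host = holder.get("hostname") == socket.gethostname()
        if same_host and isinstance(pid, int) and pid > 0 and pid_alive(pid):
            raise RuntimeError(f"another sort holds {lock_path}: pid={pid}")
    # Stale or absent lock: record the current process as the holder.
    record = {
        "pid": os.getpid(),
        "hostname": socket.gethostname(),
        "started_at": datetime.now(timezone.utc).isoformat(),
    }
    lock_path.write_text(json.dumps(record))
    return lock_path
```

This sketch is not race-free: two processes could both pass the exists() check before either writes. A production version would need atomic file creation or an OS-level file lock.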
exception spikelab.spike_sorting._exceptions.HDF5PluginMissingError(message, *, configured_path=None)[source]

Bases: EnvironmentSortFailure

HDF5 filter plugin is missing or the plugin path is misconfigured.

Typical signatures in the underlying exception chain: h5py / HDF5 errors about being unable to open a compressed dataset, or the inherited HDF5_PLUGIN_PATH environment variable pointing to a non-existent directory.

Recommended remediation (operator, not the library): set HDF5_PLUGIN_PATH to a directory containing the compression plugin required by the recording’s HDF5 build before any h5py import. The exact directory and plugin name are deployment-specific.

configured_path

The value of HDF5_PLUGIN_PATH at failure time, if known.

__init__(message, *, configured_path=None)[source]
exception spikelab.spike_sorting._exceptions.DockerEnvironmentError(message, *, reason)[source]

Bases: EnvironmentSortFailure

Docker daemon, client library, or image is unusable for sorting.

The reason string narrows the failure mode so callers can render better diagnostics or choose different remediations without catching sub-exceptions.

Recognized reason values:

  • "daemon_down" — Cannot connect to the Docker daemon.

  • "client_missing" — The Python docker client library is not installed in the sorting env.

  • "image_pull_failed" — Image pull returned an error (network, auth, or manifest-not-found).

  • "permission_denied" — Socket permission denied; user not in the docker group or equivalent.

  • "other" — Docker is broken in a way that did not match any known signature; inspect __cause__ for details.

reason

One of the strings above.

__init__(message, *, reason)[source]
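Because the failure mode is carried in reason rather than in sub-exceptions, callers can dispatch on the string. The sketch below uses a stand-in exception class with the documented signature; the HINTS texts and the hint_for helper are illustrative suggestions, not messages produced by spikelab.

```python
class DockerEnvironmentError(RuntimeError):
    """Stand-in with the documented keyword-only reason argument."""

    def __init__(self, message, *, reason):
        super().__init__(message)
        self.reason = reason


# Hypothetical operator hints keyed by the documented reason values.
HINTS = {
    "daemon_down": "start the Docker daemon and retry",
    "client_missing": "install the Python docker client in the sorting environment",
    "image_pull_failed": "check network access, registry credentials, and the image name",
    "permission_denied": "grant the user access to the Docker socket",
    "other": "inspect the exception's __cause__ for details",
}


def hint_for(exc):
    """Return a hint for the failure, falling back to the generic entry."""
    return HINTS.get(exc.reason, HINTS["other"])


err = DockerEnvironmentError("cannot connect", reason="daemon_down")
hint = hint_for(err)
```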
exception spikelab.spike_sorting._exceptions.ModelLoadingError(message, *, sorter='rt_sort', model_path=None)[source]

Bases: EnvironmentSortFailure

Detection model could not be loaded or is unusable.

Raised when RT-Sort’s ModelSpikeSorter.load() fails — typically because PyTorch is missing, weights are corrupt, the model folder does not exist, or the architecture parameters do not match the saved state dict.

model_path

Path that was attempted, when known.

sorter

Short identifier of the sorter that raised.

__init__(message, *, sorter='rt_sort', model_path=None)[source]
exception spikelab.spike_sorting._exceptions.GPUOutOfMemoryError(message, *, sorter, log_path=None)[source]

Bases: ResourceSortFailure

The sorter exhausted GPU memory.

Raised when either a PyTorch CUDA out of memory error (KS4) or a MATLAB/mex CUDA_ERROR_OUT_OF_MEMORY diagnostic (KS2) appears in the exception chain or sorter log.

Recommended remediation: reduce batch size / NT / nPCs, split the recording into shorter segments, or run on a larger-memory GPU. Retrying the same command unchanged will loop.

sorter

Short identifier of the sorter that raised.

log_path

Sorter log file carrying the full trace when located.

__init__(message, *, sorter, log_path=None)[source]
exception spikelab.spike_sorting._exceptions.SorterTimeoutError(message, *, sorter, inactivity_s=None, log_path=None)[source]

Bases: ResourceSortFailure

The sorter subprocess produced no output for too long.

Raised by spikelab.spike_sorting.guards.LogInactivityWatchdog when the sorter’s log file has not been updated within the configured inactivity tolerance. Distinct from a hard wall-clock timeout: this fires only when the sort has stopped making progress (no log writes), so legitimate long sorts on dense MEAs / multi-hour recordings are not falsely killed.

Recommended remediation: skip the recording and continue. Retrying without intervention will likely hang again at the same stage. Investigate the sorter log up to the inactivity point for the proximate cause (CUDA hang, MATLAB JVM deadlock, mex kernel failure mode, disk-full stall).

sorter

Short identifier of the sorter that hung.

inactivity_s

Configured inactivity tolerance at the time of the trip, in seconds.

log_path

Path to the sorter log file the watchdog was polling, when known.

__init__(message, *, sorter, inactivity_s=None, log_path=None)[source]
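An inactivity check of this kind can be sketched from the log file's modification time. Both functions below are hypothetical. In particular, the way the tolerance is computed is a guess based only on the ExecutionConfig field names sorter_inactivity_base_s, sorter_inactivity_per_min_s, and sorter_inactivity_max_s (a base plus a per-recording-minute allowance, capped at a maximum); the actual formula used by LogInactivityWatchdog is not stated in this reference.

```python
import os
import time


def inactivity_tolerance_s(recording_minutes, base_s=600.0, per_min_s=30.0, max_s=7200.0):
    """Assumed formula: base allowance plus a per-minute term, capped at max_s."""
    return min(base_s + per_min_s * recording_minutes, max_s)


def log_is_stalled(log_path, tolerance_s, now=None):
    """Return True when the log file has not been modified within tolerance_s."""
    now = time.time() if now is None else now
    return (now - os.path.getmtime(log_path)) > tolerance_s


# Under the assumed formula, a 10-minute recording gets 600 + 30 * 10 seconds.
tolerance = inactivity_tolerance_s(10.0)
```

Keying the check to log writes rather than wall-clock time is what allows long but healthy sorts to continue, as described above.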
exception spikelab.spike_sorting._exceptions.DiskExhaustionError(message, *, folder=None, free_gb_at_trip=None, abort_threshold_gb=None, report=None)[source]

Bases: ResourceSortFailure

Free disk space crossed the watchdog abort threshold mid-sort.

Raised by spikelab.spike_sorting.guards.DiskUsageWatchdog when shutil.disk_usage(folder).free drops below the configured abort threshold while a sort is in progress. RT-Sort especially can fill a volume mid-run by writing scaled traces, model traces, and model outputs as large .npy files.

The exception carries a DiskExhaustionReport describing free space, projected need, top disk consumers in the watched folder, and suggested operator actions.

Recommended remediation: free disk space (or shorten the recording window via RTSortConfig.recording_window_ms / first_n_mins) and rerun. The report’s top_consumers field flags the largest existing files in the watched folder so the operator can clean up safely.

folder

The folder whose free space crossed the threshold.

free_gb_at_trip

Free space (GB) at the moment of the trip.

abort_threshold_gb

Configured abort threshold (GB).

report

Optional DiskExhaustionReport with the full diagnostic payload. None only when the report could not be assembled (e.g. os.walk failed).

__init__(message, *, folder=None, free_gb_at_trip=None, abort_threshold_gb=None, report=None)[source]
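The free-space test named above, shutil.disk_usage(folder).free compared against a threshold, can be sketched as a small polling function. The disk_state helper is hypothetical; the default thresholds mirror disk_warn_free_gb and disk_abort_free_gb from the ExecutionConfig reprs; and treating 1 GB as 10^9 bytes is an assumption, since the unit convention is not stated here.

```python
import shutil


def disk_state(folder, warn_free_gb=5.0, abort_free_gb=1.0, free_bytes=None):
    """Classify free space on the volume holding folder as ok, warn, or abort."""
    if free_bytes is None:
        free_bytes = shutil.disk_usage(folder).free
    free_gb = free_bytes / 1e9  # assumed decimal gigabytes
    if free_gb < abort_free_gb:
        return "abort", free_gb
    if free_gb < warn_free_gb:
        return "warn", free_gb
    return "ok", free_gb


# One poll of the current working directory's volume.
state, free_gb = disk_state(".")
```

The optional free_bytes argument exists only so the thresholds can be exercised without actually filling a disk.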
exception spikelab.spike_sorting._exceptions.GpuMemoryWatchdogError(message, *, device_index=None, used_pct_at_trip=None, abort_pct=None)[source]

Bases: ResourceSortFailure

GPU VRAM crossed the watchdog abort threshold mid-sort.

Raised by spikelab.spike_sorting.guards.GpuMemoryWatchdog when free VRAM on the device-in-use drops below the configured abort threshold (or used VRAM crosses the abort percentage). Sharp GPU OOMs typically come from PyTorch allocator fragmentation rather than a clean cudaMalloc failure, so a percentage-based early warning lets the pipeline trigger the existing OOM-retry path with a reduced batch before the next allocation hits the wall.

Recommended remediation: rerun with reduced sorter batch parameters. The existing OOM-retry path handles this automatically: this exception is treated symmetrically with GPUOutOfMemoryError (it does not subclass it; both derive from ResourceSortFailure), and both surface as oom_gpu status.

device_index

Index of the GPU device that crossed the threshold.

used_pct_at_trip

GPU memory used percentage at the moment of the trip.

abort_pct

Configured abort percentage threshold.

__init__(message, *, device_index=None, used_pct_at_trip=None, abort_pct=None)[source]
exception spikelab.spike_sorting._exceptions.GpuThermalWatchdogError(message, *, device_index=None, temperature_c_at_trip=None, abort_temp_c=None)[source]

Bases: ResourceSortFailure

GPU temperature crossed the watchdog abort threshold mid-sort.

Raised by spikelab.spike_sorting.guards.GpuMemoryWatchdog when the device’s reported temperature crosses the configured abort threshold. Sustained operation above the GPU’s thermal junction limit risks driver-level throttling that produces silently degraded output, or in extreme cases a hardware shutdown that loses the in-progress sort.

Recommended remediation: pause the batch until the GPU cools (check airflow, ambient temperature, dust on the heatsink), then rerun. A persistent thermal trip across reboots indicates a cooling failure that needs operator attention.

device_index

Index of the GPU device that crossed the threshold.

temperature_c_at_trip

Reported device temperature in degrees Celsius at the moment of the trip.

abort_temp_c

Configured abort temperature threshold.

__init__(message, *, device_index=None, temperature_c_at_trip=None, abort_temp_c=None)[source]
exception spikelab.spike_sorting._exceptions.IOStallError(message, *, device=None, stall_s=None)[source]

Bases: ResourceSortFailure

Disk I/O stalled mid-sort.

Raised by spikelab.spike_sorting.guards.IOStallWatchdog when psutil.disk_io_counters() for the watched volume shows no byte-counter movement for the configured tolerance — typical of a hung NFS / SMB / S3-fuse mount that’s still accepting file handles but not actually reading or writing.

The inactivity watchdog catches some I/O stalls (no log output → trip), but a sorter that keeps logging while waiting for I/O can defeat that signal. The I/O stall watchdog adds a second layer specifically targeting kernel-level read/write progress.

device

Volume identifier (e.g. "sda1", "C:").

stall_s

Configured stall tolerance at the time of the trip.

__init__(message, *, device=None, stall_s=None)[source]
exception spikelab.spike_sorting._exceptions.HostMemoryWatchdogError(message, *, percent_at_trip=None, abort_pct=None)[source]

Bases: ResourceSortFailure

Host RAM pressure exceeded the watchdog abort threshold.

Raised by spikelab.spike_sorting.guards.HostMemoryWatchdog when psutil.virtual_memory().percent crosses the configured abort percentage. Distinct from a Python MemoryError (which fires on a failed allocation): this signals impending host-level thrash before any individual allocation has hit a wall, so the pipeline can skip the current recording and let the workstation recover.

Recommended remediation: skip the current recording, free references and call gc.collect()/torch.cuda.empty_cache(), then continue with the next recording. Investigate the recording that tripped the trigger — long durations, very high unit counts, or oversized intermediate buffers are common causes.

percent_at_trip

psutil system memory percentage at the moment the watchdog tripped.

abort_pct

Configured abort threshold.

__init__(message, *, percent_at_trip=None, abort_pct=None)[source]
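
The remediation described for HostMemoryWatchdogError (free references, then call gc.collect() and torch.cuda.empty_cache()) could be wrapped in a small helper. This is a minimal sketch, not part of spikelab: the helper name release_memory_between_recordings is hypothetical, and PyTorch is treated as an optional dependency.

```python
import gc


def release_memory_between_recordings() -> int:
    """Hypothetical helper: best-effort cleanup after a host-memory trip.

    Returns the number of unreachable objects found by the garbage collector.
    """
    # Collect unreachable Python objects first so their buffers can be freed.
    collected = gc.collect()
    try:
        import torch  # optional dependency; skipped when not installed
    except ImportError:
        return collected
    # Release cached, unused CUDA blocks held by PyTorch's caching allocator.
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    return collected
```

Call it after dropping your own references to the failed recording's objects; memory that is still referenced cannot be reclaimed by either call.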

Post-Failure Classifiers

The classifier module inspects sorter logs and exception chains to produce specific SpikeSortingClassifiedError subclasses from generic failures.

spikelab.spike_sorting._classifier.classify_ks2_failure(output_folder, exc)[source]

Return a classified exception for a Kilosort2 failure, or None.

Priority: environment → resource → biology. Environment and resource errors can appear on any recording, so they take precedence over biology signatures that would otherwise be consistent with them.

Return type:

Optional[SpikeSortingClassifiedError]
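
The priority order above (environment, then resource, then biology) can be illustrated with a first-match chain. This is a sketch under stated assumptions: the three checker functions and the log substrings they look for are invented for illustration and are not the signatures the library actually matches.

```python
from typing import Callable, Optional, Sequence

Classifier = Callable[[str], Optional[str]]


def _environment(log_text: str) -> Optional[str]:
    # Hypothetical signature for a missing tool on the host.
    return "environment" if "command not found" in log_text.lower() else None


def _resource(log_text: str) -> Optional[str]:
    # Hypothetical signature for an out-of-memory failure.
    return "resource" if "out of memory" in log_text.lower() else None


def _biology(log_text: str) -> Optional[str]:
    # Hypothetical signature for a recording with too little activity.
    return "biology" if "no spikes detected" in log_text.lower() else None


# Order encodes priority: earlier classifiers win when several match.
PRIORITY: Sequence[Classifier] = (_environment, _resource, _biology)


def classify(log_text: str, classifiers: Sequence[Classifier] = PRIORITY) -> Optional[str]:
    """Return the label of the first classifier that matches, or None."""
    for check in classifiers:
        label = check(log_text)
        if label is not None:
            return label
    return None
```

With this ordering, a log that contains both an out-of-memory message and a low-activity message is reported as a resource failure, which matches the rationale given above.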

spikelab.spike_sorting._classifier.classify_ks4_failure(output_folder, exc)[source]

Return a classified exception for a Kilosort4 failure, or None.

Priority mirrors KS2. KS4 does not expose a distinct “all channels bad” diagnostic the same way KS2 does, so only the generic biology classifier (insufficient activity) is applied.

Return type:

Optional[SpikeSortingClassifiedError]

spikelab.spike_sorting._classifier.classify_rt_sort_failure(output_folder, exc)[source]

Return a classified exception for an RT-Sort failure, or None.

Priority: environment → resource → biology. RT-Sort does not use Docker, but the HDF5 plugin check applies because it reads HDF5 recordings. GPU OOM is possible during model inference.

Parameters:
  • output_folder (Path) – RT-Sort output directory (may contain rt_sort.log).

  • exc (BaseException) – The caught exception.

Returns:

A classified exception if a known signature was found, otherwise None.

Return type:

classified (SpikeSortingClassifiedError or None)

Sort Run Reports

sort_recording can populate a structured per-run report, passed in via the out_report= keyword argument, capturing per-recording status, timings, and any classified failure.

class spikelab.spike_sorting.pipeline.SortRunReport(records=<factory>)[source]

Bases: object

Per-batch summary of a sort_recording() invocation.

Records a RecordingResult for each input recording — both successes and failures — so callers can inspect the outcome programmatically without parsing the per-recording log files.

The report is also serialised to disk:

  • Per-recording: <results_folder>/recording_report.json (always written).

  • Per-batch: optional, see out_report parameter on sort_recording().

Parameters:

records (list[RecordingResult]) – Per-recording outcomes in the order they were processed. Use the convenience properties for filtered views.

records: List[RecordingResult]
add(record)[source]

Append a per-recording result.

Parameters:

record (RecordingResult) – Outcome of one recording.

Return type:

None

property succeeded: List[RecordingResult]

All successful recordings, in run order.

property failed: List[RecordingResult]

All non-successful recordings, in run order.

property all_succeeded: bool

True if every recording in the batch succeeded.

to_dict()[source]

Return a JSON-friendly dict representation.

Return type:

dict

__init__(records=<factory>)
class spikelab.spike_sorting.pipeline.RecordingResult(rec_name, rec_path, results_folder, status, wall_time_s, n_curated_units=None, error_class=None, error_message=None, retries_used=0, log_path=None, peak_host_ram_pct=None, peak_gpu_used_pct=None, min_disk_free_gb=None)[source]

Bases: object

Outcome of sorting a single recording within a batch.

Parameters:
  • rec_name (str) – Short recording identifier (the file’s basename).

  • rec_path (str) – Original recording path as a string.

  • results_folder (str) – Per-recording results folder.

  • status (str) – One of "success", "failed", "oom_gpu", "oom_host_ram", "oom_memoryerror", "sorter_timeout", "disk_exhausted", "gpu_thermal", "io_stall", "concurrent_sort".

  • wall_time_s (float) – Wall-clock time spent on this recording (including OOM retries).

  • n_curated_units (int or None) – Number of curated units when successful, otherwise None.

  • error_class (str or None) – type(exc).__name__ on failure, otherwise None.

  • error_message (str or None) – str(exc) on failure (first 500 chars), otherwise None.

  • retries_used (int) – OOM-retry attempts consumed.

  • log_path (str or None) – Path to the per-recording Tee log file (sorting_<timestamp>.log). Populated by sort_recording so the batch summary can point users at the log for failure diagnosis.

rec_name: str
rec_path: str
results_folder: str
status: str
wall_time_s: float
n_curated_units: int | None = None
error_class: str | None = None
error_message: str | None = None
retries_used: int = 0
log_path: str | None = None
peak_host_ram_pct: float | None = None
peak_gpu_used_pct: float | None = None
min_disk_free_gb: float | None = None
__init__(rec_name, rec_path, results_folder, status, wall_time_s, n_curated_units=None, error_class=None, error_message=None, retries_used=0, log_path=None, peak_host_ram_pct=None, peak_gpu_used_pct=None, min_disk_free_gb=None)
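
A batch summary can be derived from the documented fields alone. The sketch below assumes each record is available as a mapping keyed by the RecordingResult field names (for example after JSON serialisation); that layout is an assumption, the helper names are hypothetical, and the sample rows in the usage note are invented.

```python
from collections import Counter
from typing import Dict, Iterable, List, Mapping


def summarise_statuses(records: Iterable[Mapping[str, object]]) -> Dict[str, int]:
    """Count how many recordings ended in each status."""
    return dict(Counter(str(rec["status"]) for rec in records))


def names_needing_attention(records: Iterable[Mapping[str, object]]) -> List[str]:
    """List recording names whose status is anything other than "success"."""
    return [str(rec["rec_name"]) for rec in records if rec["status"] != "success"]
```

For example, given [{"rec_name": "a.h5", "status": "success"}, {"rec_name": "b.h5", "status": "oom_gpu"}], the first helper yields one entry per status and the second returns ["b.h5"].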

After a successful sort, the pipeline writes a human-readable sorting_report.md next to the results. The functions below let you regenerate it manually or extract its components programmatically.

spikelab.spike_sorting.report.generate_sorting_report(results_folder, *, log_path=None, recording_report_path=None, curated_pkl_path=None, config_used_path=None, output_path=None)[source]

Generate a Markdown sorting report for a single recording.

Reads the per-recording Tee log, recording_report.json, config_used.json, and the curated SpikeData pickle (each auto-detected from results_folder when its argument is None), then writes a structured Markdown report describing the run.

The report is the input the spikelab-spikesorter agent skill consumes — it replaces the manual report-writing instructions with a deterministic, testable artefact.

Parameters:
  • results_folder (path-like) – The per-recording results directory. All other paths default to standard names inside this folder when their argument is None.

  • log_path (path-like or None) – Path to the Tee log file (sorting_<timestamp>.log). None auto-picks the most recent matching file in results_folder.

  • recording_report_path (path-like or None) – Path to recording_report.json. Default: <results_folder>/recording_report.json.

  • curated_pkl_path (path-like or None) – Path to the curated SpikeData pickle. Default: <results_folder>/sorted_spikedata_curated.pkl.

  • config_used_path (path-like or None) – Path to config_used.json. Default: <results_folder>/config_used.json.

  • output_path (path-like or None) – Where to write the report. Default: <results_folder>/sorting_report.md.

Returns:

The written file’s path, or None on best-effort failure (the surrounding pipeline never lets a report failure abort the batch).

Return type:

path (Path or None)

spikelab.spike_sorting.report.parse_sorting_log(log_text)[source]

Extract structured fields from a Tee-mirrored sorting log.

The sort_recording pipeline writes per-recording stdout to a sorting_<timestamp>.log via Tee. That log includes a structured banner block, ISO-stamped stage banners, the “Curation: N -> M units” line, a closing summary, and any Python traceback on failure. This function pulls those pieces out into a dict suitable for templating into Markdown.

Parameters:

log_text (str) – Full text of the Tee log file.

Returns:

Keys include environment (dict), run (dict), stage_timings (list of {name, timestamp} dicts), curation_line (str or None), closing_summary (dict), warnings (list[str]), traceback (str or None), last_lines_before_traceback (list[str]).

Return type:

info (dict)
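
As an illustration of the kind of extraction described above, the “Curation: N -> M units” line can be picked out with a regular expression. The pattern below is inferred from the quoted line only; the library's own parser may accept other spacing or wording, and the helper name is hypothetical.

```python
import re
from typing import Optional, Tuple

# Pattern inferred from the documented "Curation: N -> M units" line (assumption).
CURATION_RE = re.compile(r"Curation:\s*(\d+)\s*->\s*(\d+)\s*units")


def parse_curation_line(log_text: str) -> Optional[Tuple[int, int]]:
    """Return (units_before, units_after) from the first curation line, or None."""
    match = CURATION_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```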

spikelab.spike_sorting.report.extract_unit_quality_stats(curated_pkl_path)[source]

Read the curated SpikeData pickle and return per-metric summary stats.

Reads attributes from sd.neuron_attributes for SNR, std_norm, amplitude. Computes firing rate from sd.train lengths and sd.length. Returns {} when the pickle cannot be loaded or is empty.

Parameters:

curated_pkl_path (Path) – Path to sorted_spikedata_curated.pkl.

Returns:

Dict of metric name → summary stats dict.

Return type:

stats (dict)
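
The firing-rate part of this computation (spike count per unit divided by recording length) can be sketched as follows. The choice of summary statistics and the helper name are assumptions, and the duration is passed explicitly in seconds because the unit of sd.length is not stated here.

```python
import statistics
from typing import Dict, Sequence


def firing_rate_summary(trains: Sequence[Sequence[float]], duration_s: float) -> Dict[str, float]:
    """Summarise per-unit firing rates (Hz) from spike trains and a duration."""
    # Mirror the documented behaviour of returning {} for empty input.
    if not trains or duration_s <= 0:
        return {}
    # One rate per unit: number of spikes divided by recording duration.
    rates = [len(train) / duration_s for train in trains]
    return {
        "min": min(rates),
        "median": statistics.median(rates),
        "mean": statistics.fmean(rates),
        "max": max(rates),
    }
```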

Resource Guards

The pipeline ships with a set of preflight checks and live watchdogs that run automatically during a sort. Most users never need to touch these directly — they are configured via ExecutionConfig and surface as classified exceptions when triggered. The pieces below are exposed for advanced users who want to run preflight checks standalone or inspect watchdog state.

spikelab.spike_sorting.guards.run_preflight(config, recording_files, intermediate_folders, results_folders)[source]

Run pre-loop resource checks; return all findings.

Findings are not raised by this function — the caller decides whether to escalate based on ExecutionConfig.preflight_strict.

Parameters:
  • config (SortingPipelineConfig) – Pipeline configuration. Reads thresholds from config.execution; sorter selection from config.sorter; recording-side overrides from config.recording; RT-Sort device + probe from config.rt_sort.

  • recording_files (sequence) – Recording inputs (used for length sanity in future checks; currently unused but kept in the signature for forward compatibility).

  • intermediate_folders (sequence of path-like) – Per-recording intermediate folders. Disk free space is checked at each folder’s nearest existing ancestor.

  • results_folders (sequence of path-like) – Per-recording results folders. Disk free space is checked similarly.

Returns:

All findings produced by the checks. May be empty when the host has plenty of headroom.

Return type:

findings (list[PreflightFinding])

Raises:

ValueError – If any of config.execution.preflight_min_*_gb is None. The thresholds must be numeric.

Notes

  • Empty recording_files, intermediate_folders, or results_folders produce a fail-level “environment” finding (codes no_recordings, no_intermediate_folders, no_results_folders) but do not short-circuit — the host and dependency checks still run.
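
The "nearest existing ancestor" rule for disk checks can be sketched with the standard library. This is an illustration rather than the library's code: both helper names are hypothetical, and GB is taken to mean 10**9 bytes, which is an assumption.

```python
import shutil
from pathlib import Path
from typing import Union


def nearest_existing_ancestor(path: Union[str, Path]) -> Path:
    """Walk up from path until a directory entry that exists is found."""
    current = Path(path).absolute()
    while not current.exists():
        if current.parent == current:
            break  # reached the filesystem root
        current = current.parent
    return current


def free_gb(path: Union[str, Path]) -> float:
    """Free space (GB, 10**9 bytes) on the volume that would hold path."""
    usage = shutil.disk_usage(nearest_existing_ancestor(path))
    return usage.free / 1e9
```

Checking the ancestor lets a preflight run before the per-recording folders have been created, since shutil.disk_usage requires an existing path.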

class spikelab.spike_sorting.guards.PreflightFinding(level, code, message, remediation=None, category='resource')[source]

A single resource-check finding from run_preflight().

Parameters:
  • level (str) – Either "warn" or "fail".

  • code (str) – Short stable identifier (e.g. "low_disk_inter", "low_vram").

  • message (str) – One-line description of what was observed.

  • remediation (str or None) – Suggested action for the operator.

  • category (str) – One of "resource" or "environment" — controls which exception subclass is raised when the finding is escalated.

level: str
code: str
message: str
remediation: str | None = None
category: str = 'resource'
__init__(level, code, message, remediation=None, category='resource')
class spikelab.spike_sorting.guards.HostMemoryWatchdog(warn_pct=85.0, abort_pct=92.0, poll_interval_s=2.0, warn_repeat_s=30.0, kill_grace_s=5.0)[source]

Daemon-thread watchdog that aborts the sort on host RAM pressure.

Use as a context manager. While the context is active a daemon thread polls system memory; on abort it terminates registered subprocesses and injects a KeyboardInterrupt into the main thread.

Parameters:
  • warn_pct (float) – System memory percentage at which the watchdog prints a (rate-limited) warning. Defaults to 85.0.

  • abort_pct (float) – System memory percentage at which the watchdog terminates registered subprocesses and aborts the main thread. Defaults to 92.0.

  • poll_interval_s (float) – Seconds between polls. Defaults to 2.0.

  • warn_repeat_s (float) – Minimum seconds between repeated warnings at the same level. Defaults to 30.0.

  • kill_grace_s (float) – Default seconds between terminate() and kill() for registered subprocesses. Per-subprocess overrides are accepted in register_subprocess(). Defaults to 5.0.

Notes

  • Degrades to a no-op when psutil is missing.

  • Safe to nest: the inner context is the active one for the duration of its body, and the outer context resumes on exit.
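
The general shape of a context-managed, daemon-thread watchdog can be sketched as below. This is not the library's implementation: PollingWatchdog, probe, and on_trip are hypothetical names, and the sketch calls a plain callback on trip instead of terminating subprocesses and interrupting the main thread.

```python
import threading
from typing import Callable


class PollingWatchdog:
    """Hypothetical one-shot watchdog that polls a probe on a daemon thread."""

    def __init__(self, probe: Callable[[], bool], on_trip: Callable[[], None],
                 poll_interval_s: float = 2.0) -> None:
        self._probe = probe
        self._on_trip = on_trip
        self._poll_interval_s = poll_interval_s
        self._stop = threading.Event()
        self._tripped = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self) -> None:
        # Event.wait returns True once stop is requested, False on timeout.
        while not self._stop.wait(self._poll_interval_s):
            if self._probe():
                self._tripped.set()
                self._on_trip()
                return  # trip only once, then let the thread exit

    def tripped(self) -> bool:
        return self._tripped.is_set()

    def __enter__(self) -> "PollingWatchdog":
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self._stop.set()
        self._thread.join()
        return False  # never swallow exceptions from the body
```

A probe for host memory could, for instance, compare psutil.virtual_memory().percent against an abort percentage; that wiring is left out so the sketch runs without psutil.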

__init__(warn_pct=85.0, abort_pct=92.0, poll_interval_s=2.0, warn_repeat_s=30.0, kill_grace_s=5.0)[source]
register_subprocess(popen, *, kill_grace_s=None)[source]

Track a subprocess for termination on watchdog abort.

Parameters:
  • popen (subprocess.Popen) – The child process handle. The watchdog calls terminate() first, then kill() after kill_grace_s seconds if the process is still alive.

  • kill_grace_s (float or None) – Override the default grace period for this subprocess. None uses the watchdog’s kill_grace_s.

Return type:

None
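
The terminate-then-kill sequence described for registered subprocesses follows a common standard-library pattern. A minimal sketch, assuming a plain subprocess.Popen handle; the function name is hypothetical and this is not the watchdog's own code.

```python
import subprocess
import sys


def terminate_then_kill(popen: subprocess.Popen, kill_grace_s: float = 5.0) -> int:
    """Ask a child to exit, then force-kill it if it outlives the grace period."""
    if popen.poll() is None:
        popen.terminate()  # polite request (SIGTERM on POSIX)
        try:
            popen.wait(timeout=kill_grace_s)
        except subprocess.TimeoutExpired:
            popen.kill()  # forceful stop after the grace period
            popen.wait()
    return popen.returncode
```

Usage: child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"]) followed by terminate_then_kill(child, kill_grace_s=5.0).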

unregister_subprocess(popen)[source]

Stop tracking a previously registered subprocess.

Parameters:

popen (subprocess.Popen) – Handle previously passed to register_subprocess(). No-op if not registered.

Return type:

None

register_kill_callback(callback)[source]

Track a zero-arg callable to invoke on watchdog abort.

Used for kill targets that are not subprocess.Popen objects — Docker containers, kubernetes pods, custom cleanup hooks. The callback runs after any registered subprocesses have been terminated. Exceptions raised by a callback are logged but do not prevent other callbacks from running.

Parameters:

callback (Callable[[], None]) – Zero-arg function. Should be idempotent and tolerate being called on an already-stopped target — the watchdog cannot tell whether the kill target is still alive.

Return type:

None

Notes

  • To allow the kill target to be garbage-collected even while registered, build the callback with a weakref to the target rather than capturing it directly. See docker_utils.patched_container_client for the container-kill pattern.
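
The weakref pattern mentioned in the note can be sketched as follows. FakeContainer and make_weak_kill_callback are hypothetical names used only for illustration; the real container-kill pattern lives in docker_utils.patched_container_client and may differ.

```python
import weakref
from typing import Callable


class FakeContainer:
    """Hypothetical stand-in for a kill target such as a container handle."""

    def __init__(self) -> None:
        self.killed = False

    def kill(self) -> None:
        self.killed = True


def make_weak_kill_callback(target: FakeContainer) -> Callable[[], None]:
    """Build a zero-arg callback that does not keep target alive."""
    target_ref = weakref.ref(target)

    def _kill() -> None:
        live = target_ref()
        if live is None:
            return  # target already garbage-collected; nothing to do
        live.kill()

    return _kill
```

Because the closure holds only the weak reference, the callback stays safe to call after the target is gone, which also makes it idempotent in the sense the parameter description asks for.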

unregister_kill_callback(callback)[source]

Stop tracking a previously registered kill callback.

Parameters:

callback (Callable[[], None]) – Callable previously passed to register_kill_callback(). No-op if not registered. Identity comparison is used.

Return type:

None

tripped()[source]

Return True if the watchdog has fired its abort path.

Return type:

bool

interrupt_delivery_failed()[source]

Return True if the trip fired but _thread.interrupt_main raised.

When True, host protection ran successfully (subprocesses terminated, kill callbacks invoked) but the main thread did not receive a KeyboardInterrupt. The pipeline’s catch site checks this to reclassify a downstream BrokenPipeError / RuntimeError (caused by the now-dead subprocess) as the appropriate watchdog error.

Returns:

True only when the watchdog tripped and the interrupt delivery raised.

Return type:

failed (bool)

percent_at_trip()[source]

Return the memory percent at the trip moment, or None.

Return type:

Optional[float]

make_error(message=None)[source]

Build a HostMemoryWatchdogError from the trip state.

Parameters:

message (str or None) – Override the default message.

Returns:

Exception ready to raise.

Return type:

err (HostMemoryWatchdogError)

class spikelab.spike_sorting.guards.GpuMemoryWatchdog(device_index=0, *, warn_pct=85.0, abort_pct=95.0, poll_interval_s=2.0, warn_repeat_s=30.0, kill_grace_s=5.0, warn_temp_c=85.0, abort_temp_c=92.0, monitor_throttle_reasons=True)[source]

Daemon-thread watchdog that aborts on GPU VRAM or thermal pressure.

Use as a context manager around the per-recording sort. Each poll inspects three signals:

  • VRAM usage — crossing warn_pct prints a rate-limited warning; crossing abort_pct builds a GpuMemoryWatchdogError, terminates registered subprocesses, runs kill callbacks, and raises into the main thread.

  • Device temperature — crossing warn_temp_c prints a rate-limited warning; crossing abort_temp_c aborts with a GpuThermalWatchdogError. Sustained operation above the GPU’s thermal junction limit risks driver-level throttling that silently degrades sort output.

  • Active throttle reasons — when the device reports SW/HW power-cap or thermal slowdown, prints a rate-limited warning (no abort: the device is already protecting itself).

Parameters:
  • device_index (int) – GPU index to monitor. Use resolve_active_device() to pick from the config.

  • warn_pct (float) – Used-memory percentage at which to warn. Defaults to 85.0.

  • abort_pct (float) – Used-memory percentage at which to abort. Defaults to 95.0.

  • poll_interval_s (float) – Seconds between polls. Defaults to 2.0.

  • warn_repeat_s (float) – Minimum seconds between repeated warnings. Defaults to 30.0.

  • kill_grace_s (float) – Seconds between terminate() and kill() on registered subprocesses.

  • warn_temp_c (float or None) – Temperature in degrees Celsius at which to warn. None disables the warn-stage temp check. Defaults to 85.0.

  • abort_temp_c (float or None) – Temperature at which to abort. None disables thermal aborts. Defaults to 92.0.

  • monitor_throttle_reasons (bool) – When True, surface NVML throttle reasons (SW power cap, HW thermal slowdown, HW power brake) as rate-limited warnings. Defaults to True.

Notes

  • Thermal monitoring requires pynvml; the nvidia-smi-only fallback path used by read_gpu_memory() does not surface temperature. When pynvml is missing, thermal/throttle checks silently degrade while VRAM monitoring continues via nvidia-smi.

  • Disabled (no-op context manager) when no usable GPU info source is available.
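
The warn/abort thresholds and the rate-limited warnings can be sketched independently of any GPU library. The names below are hypothetical, and treating "crossing" as greater-than-or-equal is an assumption about the comparison the watchdog uses.

```python
import time
from typing import Callable, Optional


def classify_usage(used_pct: float, warn_pct: float = 85.0, abort_pct: float = 95.0) -> str:
    """Map a used-memory percentage to "ok", "warn", or "abort"."""
    if used_pct >= abort_pct:
        return "abort"
    if used_pct >= warn_pct:
        return "warn"
    return "ok"


class RateLimitedWarner:
    """Allow at most one warning per warn_repeat_s seconds."""

    def __init__(self, warn_repeat_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._warn_repeat_s = warn_repeat_s
        self._clock = clock
        self._last_warn: Optional[float] = None

    def should_warn(self) -> bool:
        now = self._clock()
        if self._last_warn is None or now - self._last_warn >= self._warn_repeat_s:
            self._last_warn = now
            return True
        return False
```

The injectable clock is a design choice for testability; with the defaults it uses time.monotonic, which is unaffected by wall-clock adjustments.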

__init__(device_index=0, *, warn_pct=85.0, abort_pct=95.0, poll_interval_s=2.0, warn_repeat_s=30.0, kill_grace_s=5.0, warn_temp_c=85.0, abort_temp_c=92.0, monitor_throttle_reasons=True)[source]
tripped()[source]

Return True once the watchdog has fired its abort path.

Return type:

bool

interrupt_delivery_failed()[source]

Return True if the trip fired but _thread.interrupt_main raised.

When True, GPU protection ran successfully (subprocesses terminated, kill callbacks invoked) but the main thread did not receive a KeyboardInterrupt. The pipeline’s catch site checks this to reclassify a downstream exception caused by the now-dead subprocess.

Returns:

True only when the watchdog tripped and the interrupt delivery raised.

Return type:

failed (bool)

used_pct_at_trip()[source]

Return the used-memory percent at the trip moment, or None.

Return type:

Optional[float]

temperature_c_at_trip()[source]

Return the device temperature at the trip moment, or None.

Return type:

Optional[float]

trip_kind()[source]

Return "memory", "thermal", or None if not tripped.

Return type:

Optional[str]

make_error(message=None)[source]

Build the trip-kind-appropriate watchdog error.

Parameters:

message (str or None) – Override the default message.

Returns:

GpuMemoryWatchdogError for VRAM trips, GpuThermalWatchdogError for temperature trips. Falls back to a memory-shaped error when the trip kind is unset.

Return type:

err

register_subprocess(popen, *, kill_grace_s=None)[source]

Track a subprocess for termination on watchdog abort.

Return type:

None

unregister_subprocess(popen)[source]

Stop tracking a previously registered subprocess.

Return type:

None

register_kill_callback(callback)[source]

Track a zero-arg callable to invoke on watchdog abort.

Return type:

None

unregister_kill_callback(callback)[source]

Stop tracking a previously registered kill callback.

Return type:

None

class spikelab.spike_sorting.guards.DiskUsageWatchdog(folder, *, warn_free_gb=5.0, abort_free_gb=1.0, poll_interval_s=10.0, warn_repeat_s=30.0, sorter='sort', projected_need_gb=None, popen=None, kill_callback=None, kill_grace_s=5.0)[source]

Daemon watchdog that aborts the sort on low free disk space.

Use as a context manager around the per-recording sort. While active, a daemon thread polls free space on folder every poll_interval_s seconds. When free space drops below warn_free_gb, the watchdog prints a rate-limited warning; when it drops below abort_free_gb, the watchdog builds a DiskExhaustionReport, terminates any registered subprocess, and runs an optional kill callback (mirroring the in-process kill path used by LogInactivityWatchdog).

Parameters:
  • folder (Path) – The folder to monitor (typically the per-recording intermediate folder).

  • warn_free_gb (float) – Free-disk threshold at which to print a warning. Defaults to 5.0.

  • abort_free_gb (float) – Free-disk threshold at which to abort the sort. Defaults to 1.0.

  • poll_interval_s (float) – Seconds between polls. Defaults to 10.0.

  • warn_repeat_s (float) – Minimum seconds between repeated warnings. Defaults to 30.0.

  • sorter (str) – Short identifier used in diagnostic prints and in the resulting DiskExhaustionError.

  • projected_need_gb (float or None) – Optional sorter-specific disk projection; included verbatim in the trip report when present.

  • popen (subprocess.Popen or None) – Subprocess to terminate on trip (e.g. KS2 MATLAB child).

  • kill_callback (Callable[[], None] or None) – Optional zero-arg callable invoked on trip — used by in-process sorters to install a two-stage interrupt-then-os._exit fallback.

  • kill_grace_s (float) – Seconds between terminate() and kill() on a registered subprocess.

Notes

  • The watchdog only trips once. After trip the polling thread exits.

  • Disabled (no-op) when abort_free_gb is non-positive or when neither a popen nor a kill_callback is provided.

__init__(folder, *, warn_free_gb=5.0, abort_free_gb=1.0, poll_interval_s=10.0, warn_repeat_s=30.0, sorter='sort', projected_need_gb=None, popen=None, kill_callback=None, kill_grace_s=5.0)[source]
tripped()[source]

Return True once the watchdog has fired its abort path.

Return type:

bool

report()[source]

Return the DiskExhaustionReport if the watchdog tripped.

Return type:

Optional[DiskExhaustionReport]

make_error(message=None)[source]

Build a DiskExhaustionError from the trip state.

Parameters:

message (str or None) – Override the default message.

Returns:

Exception ready to raise.

Return type:

err (DiskExhaustionError)

class spikelab.spike_sorting.guards.LogInactivityWatchdog(log_path, popen, inactivity_s, *, sorter, poll_interval_s=5.0, kill_grace_s=5.0, kill_callback=None)[source]

Daemon watchdog that kills a subprocess on sorter-log inactivity.

Use as a context manager around the call that waits for the sorter subprocess. While the context is active a daemon thread polls log_path (via os.stat().st_mtime) every poll_interval_s. If the file’s mtime has not advanced for inactivity_s seconds the watchdog terminates the registered subprocess and records the trip; the wait then returns and the runner can detect the kill via tripped() and raise SorterTimeoutError.

Parameters:
  • log_path (Path) – Path to the sorter’s log file. The file does not need to exist when the watchdog starts — it’s polled for first appearance, and the watchdog is forgiving about “no log yet” until the file shows up. The pre-existing mtime (from a previous run, if any) is recorded at start so an old stale log doesn’t trip immediately.

  • popen (subprocess.Popen or None) – Subprocess handle to terminate on trip. Pass None when the sort runs in-process — see kill_callback instead.

  • inactivity_s (float or None) – Inactivity tolerance in seconds. None disables the watchdog. Use compute_inactivity_timeout_s() to derive a sensible value from recording duration.

  • sorter (str) – Short identifier of the sorter (used for logging and the resulting SorterTimeoutError).

  • poll_interval_s (float) – Seconds between mtime polls. Defaults to 5.0.

  • kill_grace_s (float) – Seconds between terminate() and kill() if the subprocess does not exit. Defaults to 5.0.

  • kill_callback (Callable[[], None] or None) – Optional callback invoked after the subprocess termination step. Used by in-process backends (KS4 host, RT-Sort) to install a two-stage kill: _thread.interrupt_main first, then os._exit if Python is unresponsive. See make_in_process_kill_callback().

Notes

  • When inactivity_s is None, or when neither popen nor kill_callback is provided, the watchdog is a no-op context manager. This makes it safe to drop in unconditionally: pass inactivity_s=None to disable.

  • The watchdog only trips once. After trip, the polling thread exits.
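
The core mtime check can be sketched with os.stat alone. This is a simplification under stated assumptions: the helper names are hypothetical, and unlike the watchdog described above the sketch compares against the current time rather than tracking a baseline mtime recorded at start.

```python
import os
import time
from typing import Optional


def seconds_since_last_write(log_path: str, now: Optional[float] = None) -> Optional[float]:
    """Seconds since the log's mtime, or None while the log does not exist."""
    try:
        mtime = os.stat(log_path).st_mtime
    except FileNotFoundError:
        return None  # "no log yet" is not treated as inactivity
    current = time.time() if now is None else now
    return max(0.0, current - mtime)


def is_inactive(log_path: str, inactivity_s: float, now: Optional[float] = None) -> bool:
    """True when the log exists and has been idle for at least inactivity_s."""
    idle = seconds_since_last_write(log_path, now)
    return idle is not None and idle >= inactivity_s
```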

__init__(log_path, popen, inactivity_s, *, sorter, poll_interval_s=5.0, kill_grace_s=5.0, kill_callback=None)[source]
tripped()[source]

Return True once the watchdog has fired its terminate path.

Return type:

bool

make_error(message=None)[source]

Build a SorterTimeoutError from the trip state.

Parameters:

message (str or None) – Override the default message.

Returns:

Exception ready to raise.

Return type:

err (SorterTimeoutError)

class spikelab.spike_sorting.guards.IOStallWatchdog(folder=None, *, pids=None, include_descendants=True, stall_s=300.0, poll_interval_s=10.0, warn_repeat_s=60.0, kill_grace_s=5.0)[source]

Daemon-thread watchdog that aborts the sort on I/O stalls.

Use as a context manager around the per-recording sort. Operates in one of two modes (chosen at construction):

  • Device mode — pass folder: polls read_bytes + write_bytes for the volume holding the folder every poll_interval_s. Catches kernel-wide I/O hangs but is sensitive to ambient I/O on the same disk.

  • Process mode — pass pids: polls psutil.Process(pid).io_counters() summed across the registered PIDs (and their descendants by default). Detects stalls in the sort process tree specifically; immune to ambient I/O from unrelated processes on the same device.

Either folder or pids (or both) must be provided. When both are given, process mode is used. Additional PIDs can be registered after construction via register_pid() — useful for catching e.g. a Docker container PID after the container actually starts.

On stall, the watchdog builds an IOStallError, runs the registered kill callbacks, and raises into the main thread via _thread.interrupt_main.

Parameters:
  • folder (Path or None) – A path on the volume to monitor (typically the per-recording intermediate folder). Provide for device-mode monitoring. None to skip device monitoring entirely.

  • pids (Sequence[int] or None) – Process IDs to monitor in process mode. Defaults to None (device mode). The watchdog sums I/O bytes across these processes and (if include_descendants) their entire descendant trees on every poll.

  • include_descendants (bool) – When in process mode, recurse into each registered PID’s children on every poll so subprocesses spawned by the sort (e.g. spikeinterface workers, KS2 MATLAB child) are accounted for. Defaults to True. Set False if you want to detect a stall in only the registered PIDs without their descendants — rare; mostly useful for debugging.

  • stall_s (float) – Inactivity tolerance for the byte counter, in seconds. Defaults to 300 (5 min) — long enough to span normal write bursts and quiet stretches, short enough to flag genuinely hung mounts.

  • poll_interval_s (float) – Seconds between polls. Defaults to 10.0.

  • warn_repeat_s (float) – Minimum seconds between repeated warnings.

  • kill_grace_s (float) – Seconds between terminate() and kill() for registered subprocesses.

Notes

  • Process mode requires psutil. Device mode is also disabled when psutil is missing or when no device can be resolved for folder. To skip the I/O-stall check intentionally, omit any register_kill_callback calls — the watchdog still polls but has nothing to abort.

  • Unlike HostMemoryWatchdog, this watchdog does not accept subprocess registrations — only kill callbacks. A Docker-backed sort whose container is registered with the host watchdog will not have its container killed when the I/O stall watchdog trips.

  • Docker container processes are visible to the host’s psutil but are NOT children of the orchestrating Python process — Docker daemon is the parent. To monitor a Docker-backed sort in process mode, register the container’s main PID explicitly via register_pid() once it’s known (docker inspect --format '{{.State.Pid}}' <id>).
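
Both modes reduce to the same test: has a cumulative byte counter moved within stall_s seconds? A minimal sketch of that test follows. StallTracker is a hypothetical name, and feeding it real counters (for example read_bytes + write_bytes from psutil) is left to the caller so the sketch runs without psutil.

```python
import time
from typing import Callable, Optional


class StallTracker:
    """Report a stall when a byte counter stops changing for stall_s seconds."""

    def __init__(self, stall_s: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._stall_s = stall_s
        self._clock = clock
        self._last_bytes: Optional[int] = None
        self._last_progress = clock()

    def update(self, total_bytes: int) -> bool:
        """Record one poll; return True when the counter has stalled."""
        now = self._clock()
        if self._last_bytes is None or total_bytes != self._last_bytes:
            # First sample or counter movement: reset the inactivity timer.
            self._last_bytes = total_bytes
            self._last_progress = now
            return False
        return now - self._last_progress >= self._stall_s
```

In process mode the value passed to update would be the sum over all monitored PIDs, which is why ambient I/O from unrelated processes does not reset the timer.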

__init__(folder=None, *, pids=None, include_descendants=True, stall_s=300.0, poll_interval_s=10.0, warn_repeat_s=60.0, kill_grace_s=5.0)[source]
tripped()[source]

Return True once the watchdog has fired its abort path.

Return type:

bool

interrupt_delivery_failed()[source]

Return True if the trip fired but _thread.interrupt_main raised.

When True, host I/O protection ran successfully (kill callbacks invoked) but the main thread did not receive a KeyboardInterrupt. The pipeline’s catch site checks this to reclassify a downstream exception.

Returns:

True only when the watchdog tripped and the interrupt delivery raised.

Return type:

failed (bool)

device()[source]

Return the resolved device identifier (e.g. "sda1").

Return type:

Optional[str]

mode()[source]

Return the active polling mode: "device" or "process".

Return type:

str

pids()[source]

Snapshot of the currently registered PIDs (process mode).

Return type:

List[int]

make_error(message=None)[source]

Build an IOStallError from the trip state.

Return type:

IOStallError

register_kill_callback(callback)[source]

Track a zero-arg callable to invoke on watchdog abort.

Return type:

None

unregister_kill_callback(callback)[source]

Stop tracking a previously registered kill callback.

Return type:

None

register_pid(pid)[source]

Add a PID to the process-mode poll set.

Useful for tracking processes that don’t exist yet at watchdog construction — e.g. registering the Docker container’s main PID once the container has actually started, or registering a sorter subprocess after Popen returns.

No-op when called in device mode (the watchdog isn’t polling per-PID counters there). The PID is added atomically; the next poll picks it up.

Parameters:

pid (int) – The PID to monitor. Must be a positive integer.

Raises:

ValueError – If pid is not a positive integer.

Return type:

None

unregister_pid(pid)[source]

Remove a PID from the process-mode poll set.

No-op when pid is not currently registered or when called in device mode.

Return type:

None

class spikelab.spike_sorting.guards.DiskExhaustionReport(folder, free_gb_at_trip, abort_threshold_gb, projected_need_gb=None, bytes_consumed_during_sort=0.0, top_consumers=<factory>, suggested_actions=<factory>)[source]

Diagnostic payload built when the disk watchdog trips.

Parameters:
  • folder (str) – The folder whose free space crossed the abort threshold.

  • free_gb_at_trip (float) – Free disk space (GB) at the trip moment.

  • abort_threshold_gb (float) – Configured abort threshold (GB).

  • projected_need_gb (float or None) – Sorter-specific projected on-disk footprint in GB when known (e.g. RT-Sort’s estimate_rt_sort_intermediate_gb value).

  • bytes_consumed_during_sort (float) – Bytes consumed inside folder since the watchdog started — i.e. how much this sort has written. Useful for distinguishing “I started near full and crossed the line” vs “I wrote everything”.

  • top_consumers (list[tuple[str, float]]) – Up to 10 largest files inside folder (depth-bounded os.walk) as (path, gb) tuples, sorted descending. Helps the operator identify what to clean up.

  • suggested_actions (list[str]) – Free-form text hints. The watchdog seeds these from the trip context; callers can extend.

folder: str
free_gb_at_trip: float
abort_threshold_gb: float
projected_need_gb: float | None = None
bytes_consumed_during_sort: float = 0.0
top_consumers: List[Tuple[str, float]]
suggested_actions: List[str]

to_dict()[source]

Return a JSON-friendly dict representation of the report.

Return type:

dict

__init__(folder, free_gb_at_trip, abort_threshold_gb, projected_need_gb=None, bytes_consumed_during_sort=0.0, top_consumers=<factory>, suggested_actions=<factory>)
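As a rough illustration of the report's shape, the sketch below mirrors the documented fields in a plain dataclass and shows one way a depth-bounded os.walk could collect the largest files. DiskReportSketch and largest_files are hypothetical names; the depth limit of 3 and the decimal GB conversion (1e9 bytes) are assumptions, not values taken from the library.

```python
import os
from dataclasses import asdict, dataclass, field
from typing import List, Optional, Tuple


@dataclass
class DiskReportSketch:
    """Hypothetical mirror of the documented report fields."""

    folder: str
    free_gb_at_trip: float
    abort_threshold_gb: float
    projected_need_gb: Optional[float] = None
    bytes_consumed_during_sort: float = 0.0
    top_consumers: List[Tuple[str, float]] = field(default_factory=list)
    suggested_actions: List[str] = field(default_factory=list)

    def to_dict(self):
        # asdict() recurses into lists and tuples, so the result can be
        # passed to json.dumps directly.
        return asdict(self)


def largest_files(folder, max_depth=3, limit=10):
    """Return up to `limit` (path, gb) tuples, largest first."""
    root_depth = os.path.abspath(folder).rstrip(os.sep).count(os.sep)
    sizes = []
    for dirpath, dirnames, filenames in os.walk(folder):
        depth = os.path.abspath(dirpath).rstrip(os.sep).count(os.sep) - root_depth
        if depth >= max_depth:
            dirnames[:] = []  # prune: do not descend any further
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                sizes.append((path, os.path.getsize(path) / 1e9))
            except OSError:
                continue  # file vanished or is unreadable; skip it
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes[:limit]
```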

Pipeline Canary

spikelab.spike_sorting.canary.run_canary(config, recording, rec_path, inter_path, *, sorter_name=None, rec_name='canary', rng=None)[source]

Run a short-window smoke test of the configured backend.

Builds a canary clone of config (see _build_canary_config()), spins up a fresh backend instance against that clone, and invokes spikelab.spike_sorting.pipeline.process_recording() against a <inter_path>/_canary/ subdirectory.

Parameters:
  • config (SortingPipelineConfig) – Live pipeline configuration. Read but never mutated.

  • recording (Any) – Pre-loaded BaseRecording for the canary, or None when only a path is available.

  • rec_path (Any) – Path to the recording on disk. Used by the backend loader when recording is None.

  • inter_path (Any) – The recording’s intermediate folder. The canary writes under a _canary sub-folder so the real sort’s artefacts are untouched.

  • sorter_name (str or None) – Override the sorter resolved from config.sorter.sorter_name. Mostly used by tests.

  • rec_name (str) – Short identifier for the canary in log output.

  • rng (np.random.Generator or None) – Optional RNG passed through to process_recording for reproducibility.

Returns:

result – A classified exception when the canary discovered a failure the full sort would also have hit; None when the canary succeeded or when the canary itself hit an unexpected non-classified failure (which the live watchdogs are responsible for during the real run).

Return type:

BaseException or None
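Because the canary returns the exception instead of raising it, the caller decides what to do with it. The sketch below shows one plausible way to act on that return value; sort_after_canary and its two callable arguments are hypothetical, and raising the returned exception is an assumption about how a caller might react, not a description of the pipeline's own behaviour.

```python
def sort_after_canary(run_canary_fn, run_full_sort_fn):
    """Run the full sort only when the canary reports no classified failure.

    Both arguments are zero-argument callables supplied by the caller
    (hypothetical wrappers around the canary and the real sort).
    """
    failure = run_canary_fn()
    if failure is not None:
        # Classified failure: the full sort would be expected to hit it too,
        # so surface it now instead of spending time on the real run.
        raise failure
    # None covers both "canary passed" and "canary hit an unclassified
    # problem"; in either case the real run proceeds.
    return run_full_sort_fn()
```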

Stimulation Sorting

Helpers for spike-sorting recordings with electrical stimulation: artifact removal, alignment recentering (single-pulse and multi-pulse), and the end-to-end stim-aware pipeline. See the Stimulation Artifact Removal section of the guide for usage examples.

spikelab.spike_sorting.stim_sorting.sort_stim_recording(stim_recording, rt_sort, stim_times_ms, pre_ms, post_ms, fs_Hz=None, *, artifact_method='polynomial', artifact_window_ms=10.0, saturation_threshold=None, baseline_threshold=None, poly_order=3, artifact_window_only=True, max_stim_offset_ms=50.0, peak_mode='abs_max', n_reference_channels=8, prewindow_ms=5.0, multi_peak=False, multi_peak_select='first', multi_peak_threshold=0.6, multi_peak_min_separation_ms=2.0, model=None, model_path=None, recording_window_ms=None, verbose=True)[source]

Sort spikes in a stimulation recording using pre-trained RT-Sort sequences.

Takes a raw stimulation recording and a trained RTSort object (or path to a saved one produced by sort_recording(..., sorter="rt_sort")), removes stimulation artifacts, runs offline spike sorting, and returns a SpikeSliceStack of sorted spikes aligned to the corrected stimulation event times.

Memory model. When stim_recording is a path or a lazy SpikeInterface recording, the pipeline processes one per-event time chunk at a time (peak RAM ≈ one chunk’s working set, typically 100-200 MB on MaxOne — independent of recording duration). When stim_recording is a pre-materialised np.ndarray, the full-recording path is used instead (caller has already paid the memory cost).

Parameters:
  • stim_recording –

    The stimulation recording. Can be:
    • str or Path to a recording file (Maxwell .h5 or NWB). Chunked path.

    • A SpikeInterface BaseRecording object. Chunked path.

    • np.ndarray of shape (channels, samples). Full-recording path (no chunking possible).

  • rt_sort – The trained RT-Sort object or path to its pickle.

  • stim_times_ms (array-like) – Logged stimulation event times in milliseconds.

  • pre_ms (float) – Output peri-event window radius before each stim event, in milliseconds.

  • post_ms (float) – Output peri-event window radius after each stim event, in milliseconds.

  • fs_Hz (float or None) – Sampling frequency in Hz. Required for ndarray input; inferred from the recording object otherwise.

  • artifact_method (str) – "polynomial" (default) or "blank". Passed to remove_stim_artifacts.

  • artifact_window_ms (float) – Max artifact tail duration after the last desaturation. Default 10.0.

  • saturation_threshold (float or None) – Saturation voltage threshold. None auto-detects (gain-anchored from recording metadata if available).

  • baseline_threshold (float or None) – Baseline envelope threshold. None auto-detects from pre-stim MAD.

  • poly_order (int) – Polynomial order for detrend. Default 3.

  • artifact_window_only (bool) – Only process around stim events. Default True.

  • multi_peak (bool) – When True, enables multi-pulse-aware recentering — the search window is interpreted as potentially containing multiple pulses (a stim train), and the alignment target is the first or last qualifying pulse rather than the strongest. Default False. When False, behaviour is identical to the pre-multi-peak implementation. See recenter_stim_times() for details.

  • multi_peak_select (str) – When multi_peak=True, which qualifying peak to lock onto. "first" (default) / "last".

  • multi_peak_threshold (float) – When multi_peak=True, peaks below this fraction of the largest peak in the search window are ignored. Default 0.6.

  • multi_peak_min_separation_ms (float) – When multi_peak=True, minimum spacing between candidate peaks. Default 2.0.

  • max_stim_offset_ms (float) – Search window radius for stim time recentering. Default 50.0.

  • peak_mode (str) – Alignment target for recenter_stim_times. One of "abs_max" (default), "pos_peak", "neg_peak", "down_edge", "up_edge". For biphasic anodic-first pulses where the AP is triggered at the up→down current reversal, use "down_edge".

  • n_reference_channels (int) – Top-K highest-amplitude channels summed to form the signed reference trace for non-abs_max peak modes. Default 8.

  • prewindow_ms (float) – For down_edge / up_edge, radius of the pre-window before the primary peak. Default 5.0.

  • model (ModelSpikeSorter or None) – Detection model instance for load_rt_sort when rt_sort is a path.

  • model_path (str or Path or None) – Path to a detection model folder for load_rt_sort when rt_sort is a path.

  • recording_window_ms (tuple or None) – (start_ms, end_ms) sub-window to restrict processing to. Only events whose peri-event window falls entirely within this range are sorted. None processes the full recording.

  • verbose (bool) – Print progress messages. Default True.

Returns:

stim_slices – Event-aligned spike slice stack with one slice per (corrected) stim event. Each slice spans [-pre_ms, +post_ms] relative to the stim time.

Return type:

SpikeSliceStack
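The recording_window_ms rule ("only events whose peri-event window falls entirely within this range are sorted") can be written as a boolean mask over the stim times. The helper below is a hypothetical sketch; treating both window boundaries as inclusive is an assumption.

```python
import numpy as np


def events_within_window(stim_times_ms, pre_ms, post_ms, recording_window_ms=None):
    """Return a boolean mask of events whose [t - pre_ms, t + post_ms]
    window lies entirely inside recording_window_ms."""
    stim_times_ms = np.asarray(stim_times_ms, dtype=float)
    if recording_window_ms is None:
        # No restriction: every event is kept.
        return np.ones(stim_times_ms.shape, dtype=bool)
    start_ms, end_ms = recording_window_ms
    starts_inside = stim_times_ms - pre_ms >= start_ms
    ends_inside = stim_times_ms + post_ms <= end_ms
    return starts_inside & ends_inside
```

For example, with pre_ms=10, post_ms=20 and a window of (0, 100), an event at 5 ms is dropped because its window starts at -5 ms, and an event at 95 ms is dropped because its window ends at 115 ms.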

spikelab.spike_sorting.stim_sorting.preprocess_stim_artifacts(recording, stim_times_ms, output_path=None, *, method='polynomial', artifact_window_ms=10.0, recenter=True, max_offset_ms=50.0, poly_order=3, saturation_threshold=None, baseline_threshold=None, artifact_window_only=True, return_scaled=False, dtype='float32')[source]

Remove stim artifacts and return a new SpikeInterface recording.

Materialises recording.get_traces() to an ndarray, optionally recenters the stim times to their artifact peaks, runs remove_stim_artifacts(), and wraps the cleaned traces in either a BinaryRecordingExtractor (when output_path is given) or a NumpyRecording. Channel IDs, locations, gains, and offsets are copied from the input recording.

Parameters:
  • recording (BaseRecording) – SpikeInterface recording to clean. Single-segment only.

  • stim_times_ms (array-like) – Logged stim event times in milliseconds. May be empty, in which case recentering and artifact removal are skipped and the recording is returned unchanged, apart from the BinaryRecordingExtractor wrap when output_path is given.

  • output_path (str or Path, optional) – When provided, cleaned traces are written as a float32 binary (interleaved channels, i.e. shape (num_samples, num_channels) on disk) and a BinaryRecordingExtractor is returned. Parent directories are created as needed. When None (default), a NumpyRecording is returned — NOT dumpable for Docker-based sorters.

  • method (str) – "polynomial" (default) or "blank" — see remove_stim_artifacts(). Polynomial detrend preserves spikes in the 0–10 ms post-stim window (the smooth fit can’t capture a ~1 ms spike feature) and is safe by default thanks to poly_clamp_factor — divergent fits at extreme stim amplitudes are caught and downgraded to blank automatically, with one summary warning per call. Use "blank" only when the post-stim window is genuinely irrelevant to the analysis, or when the clamp warning fires on a non-trivial fraction of events (in which case a uniform blank is cleaner than mixing per-event polynomial subtraction with per-event clamp blanks).

  • artifact_window_ms (float) – Length of the post-stim artifact window in ms. Default 10.0.

  • recenter (bool) – When True (default), align logged stim times to the actual artifact peaks via recenter_stim_times() before artifact removal. Set False when the supplied times are already peak-aligned.

  • max_offset_ms (float) – Maximum recentering shift, passed to recenter_stim_times(). Default 50.0.

  • poly_order (int) – Polynomial order for method="polynomial". Default 3.

  • saturation_threshold (float, optional) – Override the auto-detected saturation threshold used by remove_stim_artifacts().

  • baseline_threshold (float, optional) – Override the auto-detected baseline threshold used by remove_stim_artifacts().

  • artifact_window_only (bool) – When True (default), only the windows around stim events are processed; when False, a global sliding-window detrend is applied (useful for very frequent stim protocols).

  • return_scaled (bool) – Whether to materialise µV-scaled traces from recording. Default False — match the recording’s native dtype/units. Set True to force a µV-scaled float output when the recording exposes gains/offsets. Forwarded as return_in_uV on newer SpikeInterface versions and return_scaled on older ones.

  • dtype (str) – dtype of the cleaned output (both for in-memory and on-disk representations). Default "float32".

Return type:

Tuple[BaseRecording, dict]

Returns:

  • cleaned_recording (BaseRecording) – New SpikeInterface recording with artifacts removed. Channel IDs, locations, gains, and offsets are inherited from the input.

  • metadata (dict) –

    Artifact-removal metadata. Keys:
    • stim_times_ms_logged: original stim times as passed in

    • stim_times_ms_corrected: recentered stim times (equals stim_times_ms_logged when recenter=False)

    • recenter_offsets_ms: corrected - logged offsets

    • blanked_fraction: overall fraction of samples blanked

    • blanked_fraction_per_channel: per-channel blanked fractions, shape (num_channels,)
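The metadata keys listed above are simple functions of the logged times, the corrected times, and a per-sample blanked mask. The sketch below shows how such a dictionary could be assembled; summarize_artifact_removal is a hypothetical helper, and the mask is assumed to be a boolean array of shape (channels, samples) like the one remove_stim_artifacts() returns.

```python
import numpy as np


def summarize_artifact_removal(logged_ms, corrected_ms, blanked_mask):
    """Assemble the documented metadata keys from their raw ingredients."""
    logged = np.asarray(logged_ms, dtype=float)
    corrected = np.asarray(corrected_ms, dtype=float)
    mask = np.asarray(blanked_mask, dtype=bool)  # (channels, samples)
    if mask.size:
        overall = float(mask.mean())
        per_channel = mask.mean(axis=1)
    else:
        # Guard against an empty recording to avoid NaN from mean().
        overall = 0.0
        per_channel = np.zeros(mask.shape[0])
    return {
        "stim_times_ms_logged": logged,
        "stim_times_ms_corrected": corrected,
        "recenter_offsets_ms": corrected - logged,
        "blanked_fraction": overall,
        "blanked_fraction_per_channel": per_channel,
    }
```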

spikelab.spike_sorting.stim_sorting.recenter_stim_times(traces, stim_times_ms, fs_Hz, max_offset_ms=50.0, *, peak_mode='abs_max', n_reference_channels=8, prewindow_ms=5.0, warn_offset_ms=3.0, multi_peak=False, multi_peak_select='first', multi_peak_threshold=0.6, multi_peak_min_separation_ms=2.0)[source]

Find actual stimulation artifact times near logged stim times.

For each logged stim time, searches a window of ±max_offset_ms in the raw voltage traces and returns the sample at the alignment point selected by peak_mode. This corrects for timing offsets between the stimulation hardware trigger log and the artifact in the recording.

Parameters:
  • traces (np.ndarray) – Raw voltage traces, shape (channels, samples).

  • stim_times_ms (array-like) – Logged stimulation event times in milliseconds. Need not be sorted.

  • fs_Hz (float) – Sampling frequency in Hz.

  • max_offset_ms (float) – Radius of the search window around each logged stim time, in milliseconds. Default 50.0.

  • peak_mode (str) –

    Alignment target. One of:
    • "abs_max" (default): largest |voltage| across channels. Backward-compatible with the pre-peak_mode API.

    • "pos_peak": largest positive voltage in the top-K summed reference trace.

    • "neg_peak": most negative voltage in the top-K summed reference.

    • "down_edge": up→down transition for biphasic anodic-first pulses (see module docstring).

    • "up_edge": down→up transition for biphasic cathodic-first pulses.

  • n_reference_channels (int) – Number of highest-amplitude channels summed to build the signed reference trace for non-abs_max modes. Default 8. Ignored for abs_max.

  • prewindow_ms (float) – For down_edge / up_edge, radius of the pre-window in which to search for the preceding opposite-polarity peak. Default 5.0.

  • warn_offset_ms (float or None) – When the median |corrected - logged| shift exceeds this threshold, emit a UserWarning. A large systematic shift usually means a fixed hardware-vs-log delay, a wrong time column in the stim log, or a unit mismatch (ms vs s vs samples) rather than genuine jitter. Set to None to silence. Default 3.0 ms — well above one-sample jitter at 20–30 kHz.

  • multi_peak (bool) – Opt-in support for multi-pulse stim trains. When True, the search window is treated as potentially containing multiple stimulation pulses (e.g. a 100 Hz train), and the alignment target is the first or last qualifying pulse rather than the strongest one. Default False, which preserves the backward-compatible single-peak behavior.

  • multi_peak_select (str) – When multi_peak=True, which qualifying peak to lock onto. "first" (default) = first pulse onset (matches “first-pulse alignment” used for train PSTHs). "last" = last pulse onset (useful for studying after-train rebound). Ignored when multi_peak=False.

  • multi_peak_threshold (float) – When multi_peak=True, only peaks whose amplitude is at least this fraction of the largest peak in the search window are considered “real pulses”. Default 0.6 — accepts pulses up to 40% weaker than the strongest while still rejecting noise.

  • multi_peak_min_separation_ms (float) – When multi_peak=True, the minimum spacing between candidate peaks. Prevents multi-sample peaks of a single pulse from being counted as separate pulses. Default 2.0 ms — well below any sensible inter-pulse interval (5 ms = 200 Hz; 10 ms = 100 Hz).

Returns:

corrected_ms – Corrected stim times in milliseconds, same length as stim_times_ms. Events whose search window extends outside the recording are clipped to the recording boundary.

Return type:

np.ndarray
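For the default "abs_max" mode, the search described above amounts to taking the largest |voltage| across channels inside a ±max_offset_ms window around each logged time. The function below is a simplified sketch of that idea under stated assumptions: recenter_abs_max is a hypothetical name, windows are clipped to the recording, and an event whose window lies entirely outside the recording keeps its logged time. It does not reproduce the library's other peak modes.

```python
import warnings

import numpy as np


def recenter_abs_max(traces, stim_times_ms, fs_Hz, max_offset_ms=50.0, warn_offset_ms=3.0):
    """Move each logged stim time to the sample with the largest |voltage|."""
    traces = np.asarray(traces)
    logged = np.asarray(stim_times_ms, dtype=float)
    n_samples = traces.shape[1]
    radius = int(round(max_offset_ms * fs_Hz / 1000.0))
    corrected = np.empty_like(logged)
    for i, t_ms in enumerate(logged):
        center = int(round(t_ms * fs_Hz / 1000.0))
        lo = max(center - radius, 0)  # clip to the recording start
        hi = min(center + radius + 1, n_samples)  # clip to the recording end
        if hi <= lo:
            corrected[i] = t_ms  # window entirely outside the recording
            continue
        # Per-sample maximum of |voltage| over all channels.
        envelope = np.abs(traces[:, lo:hi]).max(axis=0)
        corrected[i] = (lo + int(np.argmax(envelope))) * 1000.0 / fs_Hz
    if warn_offset_ms is not None and logged.size:
        median_shift = float(np.median(np.abs(corrected - logged)))
        if median_shift > warn_offset_ms:
            warnings.warn(
                f"median recentering shift {median_shift:.2f} ms exceeds "
                f"{warn_offset_ms} ms; check the stim log units and columns",
                UserWarning,
            )
    return corrected
```

The median check at the end mirrors the warn_offset_ms rationale: a large shift shared by most events points at a systematic logging problem rather than per-event jitter.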

Notes

  • When multiple stim events have overlapping search windows, each is recentered independently.

  • For monophasic pulses the *_edge modes degrade gracefully: the pre-window search returns the opposite polarity’s noise peak and the zero-crossing fallback lands near the onset of the single artifact — but pos_peak / neg_peak will give cleaner results in that case.

  • For single-pulse stim, multi_peak=True degrades to the original single-peak behavior (only one peak in the window is above threshold; first==last). Set it always-on if you mix single-pulse and train conditions in one recording.
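The multi-pulse selection rule (threshold relative to the strongest peak, minimum separation, then first or last) can be illustrated on a one-dimensional, non-negative envelope of the search window. select_train_peak is a hypothetical helper; grouping neighbouring above-threshold samples and keeping the strongest sample per group is one simple way to honour the minimum separation, and may differ from the library's peak finder.

```python
import numpy as np


def select_train_peak(envelope, fs_Hz, select="first", threshold=0.6, min_separation_ms=2.0):
    """Return the index of the first or last qualifying pulse in `envelope`.

    `envelope` is assumed to be non-negative (e.g. |voltage|).
    """
    if select not in ("first", "last"):
        raise ValueError("select must be 'first' or 'last'")
    envelope = np.asarray(envelope, dtype=float)
    if envelope.size == 0:
        raise ValueError("envelope is empty")
    min_sep = max(int(round(min_separation_ms * fs_Hz / 1000.0)), 1)
    # Samples that reach the required fraction of the strongest peak.
    candidates = np.flatnonzero(envelope >= threshold * envelope.max())
    peaks = []
    group = [candidates[0]]
    for idx in candidates[1:]:
        if idx - group[-1] < min_sep:
            group.append(idx)  # still part of the same pulse
        else:
            # Close the current pulse: keep its strongest sample.
            peaks.append(group[int(np.argmax(envelope[group]))])
            group = [idx]
    peaks.append(group[int(np.argmax(envelope[group]))])
    return int(peaks[0] if select == "first" else peaks[-1])
```

With a single pulse in the window only one group forms, so "first" and "last" return the same index, which matches the single-pulse note above.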

spikelab.spike_sorting.stim_sorting.remove_stim_artifacts(traces, stim_times_ms, fs_Hz, method='polynomial', artifact_window_ms=10.0, saturation_threshold=None, baseline_threshold=None, poly_order=3, artifact_window_only=True, copy=True, *, recording=None, raw_traces=None, poly_clamp_factor=10.0)[source]

Remove stimulation artifacts from multi-channel voltage traces.

Processes each stim event independently per channel. Saturated samples are always blanked (zeroed). For the "polynomial" method, a low-order polynomial is fit to the post-saturation artifact tail and subtracted, preserving neural spikes (which are too fast for the smooth polynomial to capture).

When multiple stim events occur in rapid succession and the signal re-saturates before reaching baseline levels, the blanking region is extended dynamically and the polynomial fit is deferred until after the final desaturation in the burst.

The polynomial detrend is conceptually related to SALPA (Wagenaar & Potter 2002, J Neurosci Methods), adapted for offline processing where look-ahead past saturation is available — see the module docstring for details.

Parameters:
  • traces (np.ndarray) – Raw voltage traces, shape (channels, samples).

  • stim_times_ms (array-like) – Corrected stim times in milliseconds (e.g. from recenter_stim_times).

  • fs_Hz (float) – Sampling frequency in Hz.

  • method (str) – "polynomial" (default) or "blank".

  • artifact_window_ms (float) –

    Maximum duration in milliseconds of the artifact tail after the last desaturation point. The polynomial is fit over this window. Default 10.0.

    Note: when the post-stim window contains a clear descent from the recentered stim time to a subsequent negative peak (typical for biphasic anodic-first pulses sorted with peak_mode="down_edge"), the fit is automatically split into two independent polynomials at the negative peak — one for [stim_time, neg_peak] (the descent) and one for [neg_peak, baseline_recovery] (the tail). When the recentered stim time IS the negative peak (e.g. peak_mode="abs_max" or "neg_peak"), no descent exists and a single fit is used. This is automatic; no user knob.

  • saturation_threshold (float or None) – Absolute voltage value above which a sample is considered saturated. When None, auto-detected — preferring gain-anchored detection from recording metadata when supplied (see recording kwarg below), falling back to the 99.9th percentile of |traces| otherwise.

  • raw_traces (np.ndarray or None) – Optional pre-bandpass traces, same shape as traces, used as the source of truth for saturation detection. Bandpass filtering of a stim artifact produces ringing whose filtered amplitude can exceed the raw ADC rail even on unsaturated samples, so auto-detection from traces (filtered) both over-reports (ringing overshoot) and under-reports (group-delay smoothing) clips. When provided, the threshold is derived from raw_traces and the clip mask is built from np.abs(raw_traces) >= threshold; the filtered traces are blanked at those same sample indices and polynomial-detrended around them.

  • baseline_threshold (float or None) – Absolute voltage envelope below which the signal is considered to have returned to baseline. When None, auto-detected from pre-stim MAD.

  • poly_order (int) – Polynomial order for the detrend. Default 3 (cubic). Higher orders risk fitting spike-like features; lower orders may not capture the artifact decay shape.

  • artifact_window_only (bool) – If True (default), only process windows around stim events. If False, apply a global polynomial detrend to the entire trace (for recordings with very frequent stimulation).

  • copy (bool) – If True (default), return a copy; if False, modify traces in-place.

  • poly_clamp_factor (float or None) – Sanity-clamp factor for the "polynomial" method. After each polynomial subtraction, if any post-subtract sample exceeds poly_clamp_factor * saturation_threshold in absolute value, the segment is treated as a divergent fit (extrapolated wildly across saturated samples), blanked instead of left in place, and counted toward a one-shot warning emitted at the end of the call. Default 10.0 — well above any plausible neural amplitude (~100 µV) when saturation_threshold is in the multi-thousand-µV range. Set to None to disable. Has no effect when saturation_threshold is +inf (no clipping detected) or method="blank".

Returns:

  • cleaned (np.ndarray) – Cleaned traces, shape (channels, samples).

  • blanked_mask (np.ndarray) – Boolean array, shape (channels, samples). True for samples that were blanked (zeroed) because they fell within a saturation region.

Return type:

Tuple[np.ndarray, np.ndarray]
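The core of the "polynomial" method (blank the saturated samples, fit a low-order polynomial to the tail after the last desaturation, subtract it) can be sketched for a single channel and a single event. clean_event is a hypothetical, heavily simplified illustration: it works in samples rather than milliseconds, blanks everything from the stim sample through the last saturated sample, and omits the two-segment split, baseline detection, and burst handling described above.

```python
import numpy as np


def clean_event(trace, stim_idx, window, saturation_threshold,
                poly_order=3, poly_clamp_factor=10.0):
    """Blank saturation and polynomial-detrend the tail of one stim event."""
    cleaned = np.array(trace, dtype=float)
    blanked = np.zeros(cleaned.size, dtype=bool)
    search_stop = min(stim_idx + window, cleaned.size)
    saturated = np.flatnonzero(
        np.abs(cleaned[stim_idx:search_stop]) >= saturation_threshold
    )
    # The tail starts right after the last saturated sample (if any).
    tail_start = stim_idx if saturated.size == 0 else stim_idx + int(saturated[-1]) + 1
    tail_stop = min(tail_start + window, cleaned.size)
    blanked[stim_idx:tail_start] = True
    n_tail = tail_stop - tail_start
    if n_tail > poly_order:
        x = np.linspace(0.0, 1.0, n_tail)  # normalised for a well-conditioned fit
        coeffs = np.polyfit(x, cleaned[tail_start:tail_stop], poly_order)
        residual = cleaned[tail_start:tail_stop] - np.polyval(coeffs, x)
        limit = None
        if poly_clamp_factor is not None and np.isfinite(saturation_threshold):
            limit = poly_clamp_factor * saturation_threshold
        if limit is not None and np.any(np.abs(residual) > limit):
            blanked[tail_start:tail_stop] = True  # divergent fit: blank instead
        else:
            cleaned[tail_start:tail_stop] = residual
    cleaned[blanked] = 0.0
    return cleaned, blanked
```

Because a cubic cannot follow a feature that lasts only a sample or two, a brief spike riding on the decaying tail survives the subtraction almost unchanged while the slow artifact is removed.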