Spike Sorting
The spikelab.spike_sorting sub-package provides a full spike-sorting
pipeline: loading raw recordings, running a sorter backend (Kilosort2,
Kilosort4, or RT-Sort), extracting waveforms, curating units, and compiling
results into SpikeData objects.
See the Spike Sorting and Curation guide for usage examples and environment setup instructions.
Entry Points
- spikelab.spike_sorting.sort_recording(recording_files, config=None, sorter='kilosort2', intermediate_folders=None, results_folders=None, *, out_report=None, **kwargs)
Run spike sorting on one or more recordings using any registered backend.
This is the primary entry point for the modular sorting pipeline.
- Parameters:
recording_files (list) – Paths to recording files or directories. Each entry is sorted independently. Directories have their contents concatenated before sorting and split back into per-file SpikeData afterward.
config (SortingPipelineConfig or None) – Pre-built configuration. When provided, **kwargs are applied as overrides via config.override(). When None, a fresh config is built from sorter + **kwargs. Preset configs are available in spikelab.spike_sorting.config (e.g. KILOSORT2).
sorter (str) – Registered sorter backend name. Only used when config is None. Available: "kilosort2", "kilosort4".
intermediate_folders (list or None) – Intermediate result directories, one per recording. Auto-generated if None.
results_folders (list or None) – Output directories, one per recording. Auto-generated if None.
out_report (SortRunReport or None) – Optional report instance populated in-place with one RecordingResult per input recording. The same information is always written per-recording to <results_folder>/recording_report.json regardless of this argument; out_report only adds a programmatic accessor for the batch.
**kwargs – Override individual config fields (e.g. snr_min=5.0, use_docker=True, fr_min=0.05). See spikelab.spike_sorting.config for all available parameters, grouped by: RecordingConfig, SorterConfig, WaveformConfig, CurationConfig, CompilationConfig, FigureConfig, ExecutionConfig.
- Returns:
One SpikeData per original recording file. For directory inputs, the concatenated recording is split back into per-file SpikeData objects.
- Return type:
list of SpikeData
Notes
Pickle files (sorted_spikedata_curated.pkl and optionally sorted_spikedata.pkl) are saved to each results folder.
hdf5_plugin_path (passed via config or kwargs) sets os.environ['HDF5_PLUGIN_PATH'] before any recording is loaded. This is needed for Maxwell .h5 files and applies to all backends.
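As a rough illustration of the call pattern described above, the sketch below collects the flat overrides quoted in the parameter list and forwards them to sort_recording. The wrapper name run_sort and the recording path are hypothetical, and the import is deferred so the snippet can be loaded without the package installed.

```python
# Flat overrides quoted in the **kwargs description above.
OVERRIDES = {"snr_min": 5.0, "use_docker": True, "fr_min": 0.05}


def run_sort(recording_files, sorter="kilosort2", **overrides):
    """Hypothetical wrapper: sort each recording with one backend."""
    # Deferred import so this module loads even where spikelab is absent.
    from spikelab.spike_sorting import sort_recording

    # config is left as None, so a fresh config is built from
    # `sorter` plus the keyword overrides.
    return sort_recording(list(recording_files), sorter=sorter, **overrides)


# Usage (placeholder path, not executed here):
# spike_data = run_sort(["/data/recording_01.raw.h5"], **OVERRIDES)
```

Passing a preset such as KILOSORT2 through config= instead of sorter= would, per the parameter description, apply the same overrides via config.override().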
- spikelab.spike_sorting.sort_multistream(recording, stream_ids, config=None, sorter='kilosort2', **kwargs)
Sort a multi-stream recording across multiple stream IDs.
Calls sort_recording once per stream ID, routing each stream to its own intermediate and results folders. Validates that the requested stream IDs exist in the recording file before sorting.
- Parameters:
recording (str or Path) – Path to a single multi-stream recording file (e.g. MaxTwo .raw.h5) or a directory of such files. When a directory is given, all files are concatenated per stream.
stream_ids (list of str) – Stream identifiers to sort, e.g. ["well000", "well001", "well002"].
config (SortingPipelineConfig or None) – Pre-built configuration. When provided, **kwargs are applied as overrides.
sorter (str) – Registered sorter backend name (default "kilosort2"). Only used when config is None.
**kwargs – Override individual config fields. The following must not be provided: intermediate_folders and results_folders (auto-generated per stream) and stream_id (set automatically per iteration).
- Returns:
{stream_id: list[SpikeData]}.
- Return type:
results (dict)
Notes
Stream ID validation uses SpikeInterface’s extractor for the recording format. Currently supports Maxwell .h5 files. For other formats, validation is skipped and invalid stream IDs will produce errors at loading time.
When recording is a directory of files, the files are concatenated per stream before sorting. Channel count and sampling frequency must match across files (a mismatch raises ValueError); mismatched channel IDs or locations produce warnings.
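The compatibility rule in the note above can be pictured with a small stand-alone check. This is only a sketch of the described behavior, not the package's implementation: the function name and the metadata dictionary keys are hypothetical, and plain dicts stand in for recording objects.

```python
import warnings


def check_concat_compatible(files):
    """Validate per-file metadata before concatenating one stream.

    `files` is a list of dicts with the hypothetical keys
    'num_channels', 'sampling_frequency' and 'channel_ids'.
    """
    first = files[0]
    for i, meta in enumerate(files[1:], start=1):
        # Hard requirements: a mismatch aborts with ValueError.
        for key in ("num_channels", "sampling_frequency"):
            if meta[key] != first[key]:
                raise ValueError(
                    f"file {i}: {key}={meta[key]!r} differs from {first[key]!r}"
                )
        # Soft requirement: mismatched channel IDs only warn.
        if list(meta["channel_ids"]) != list(first["channel_ids"]):
            warnings.warn(f"file {i}: channel IDs differ from file 0")


a = {"num_channels": 2, "sampling_frequency": 20000.0, "channel_ids": ["0", "1"]}
b = {"num_channels": 2, "sampling_frequency": 20000.0, "channel_ids": ["0", "2"]}
c = {"num_channels": 2, "sampling_frequency": 10000.0, "channel_ids": ["0", "1"]}

# Same shape and rate, different channel IDs: accepted with a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_concat_compatible([a, b])
n_warnings = len(caught)

# Different sampling frequency: rejected.
try:
    check_concat_compatible([a, c])
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)
```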
Configuration
Configuration dataclass for the spike sorting pipeline.
Replaces the ~80 module-level globals in kilosort2.py with a single typed, inspectable configuration object that is passed explicitly to every pipeline function.
- class spikelab.spike_sorting.config.RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=<factory>, rec_chunks_s=<factory>, start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000)
Bases: object
Parameters for recording loading and preprocessing.
- __init__(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=<factory>, rec_chunks_s=<factory>, start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000)
- class spikelab.spike_sorting.config.SorterConfig(sorter_name='kilosort2', sorter_path=None, sorter_params=None, use_docker=False)
Bases: object
Parameters for the spike sorter itself.
- __init__(sorter_name='kilosort2', sorter_path=None, sorter_params=None, use_docker=False)
- class spikelab.spike_sorting.config.RTSortConfig(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None)
Bases: object
Parameters for the RT-Sort detection and sorting backend.
RT-Sort is an action-potential-propagation-based spike sorter using a deep learning detection model followed by codetection clustering and template matching. See van der Molen, Lim et al. 2024 (PLOS ONE, DOI: 10.1371/journal.pone.0312438) for algorithmic details.
- Parameters:
model_path (str or None) – Path to a folder containing init_dict.json and state_dict.pt for a pretrained ModelSpikeSorter. When None, the bundled model corresponding to probe is loaded.
probe (str) – Which bundled pretrained model to use when model_path is None. "mea" or "neuropixels".
device (str) – PyTorch device for inference. "cuda" or "cpu".
num_processes (int or None) – Number of worker processes for parallel detection/clustering stages. None selects an automatic value based on CPU count.
recording_window_ms (tuple or None) – (start_ms, end_ms) window of the recording to process. None processes the entire recording.
save_rt_sort_pickle (bool) – If True, serialize the final RTSort object to the sorter output folder so the trained sequences can be re-used in Phase 2 stim-aware sorting.
delete_inter (bool) – If True, delete the intermediate cache directory after sorting completes.
verbose (bool) – Print progress messages during sorting.
params (dict or None) – Override dictionary merged into the RT-Sort parameter set. Takes precedence over the preset defaults; useful for one-off tuning without editing a preset. Keys must match detect_sequences parameter names.
detection_window_s (float or None) – If set, run sequence detection on only the first detection_window_s seconds of the recording (the heavy GPU + clustering phase), then apply the resulting sequences to the full recording during sort_offline. Decouples the detection-phase memory ceiling from total recording length. None uses the full window for both phases (legacy behavior).
- __init__(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None)
- class spikelab.spike_sorting.config.WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True)
Bases: object
Parameters for waveform extraction and template computation.
Memory-budget note: with streaming=False, the extractor pre-allocates one (n_spikes, nsamples, num_channels) .npy memmap per unit before extraction begins. For high-unit-count sorters on high-density MEAs this grows to tens of GB (e.g. 400 units × 1018 channels = ~39 GB). When that exceeds host RAM, set streaming=True (the default in the signature above) to use a one-unit-at-a-time path that discards each unit’s waveforms after templates and metrics are computed; peak RAM then becomes one unit’s buffer (~100 MB for MaxOne) regardless of total unit count. Waveform files are only written when save_waveform_files=True.
- __init__(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True)
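The figures in the memory-budget note can be reproduced with a few lines of arithmetic. The sketch below uses the default max_waveforms_per_unit, ms_before and ms_after from the signature; the 20 kHz sampling rate and 4-byte samples are assumptions chosen here because they match the quoted numbers, not values stated in this reference.

```python
def waveform_buffer_bytes(n_spikes, ms_before, ms_after, num_channels,
                          sampling_hz=20_000, bytes_per_sample=4):
    """Size of one unit's (n_spikes, nsamples, num_channels) buffer.

    sampling_hz and bytes_per_sample are assumed values (20 kHz, float32).
    """
    nsamples = int(round((ms_before + ms_after) * sampling_hz / 1000))
    return n_spikes * nsamples * num_channels * bytes_per_sample


# Defaults from WaveformConfig: 300 waveforms, 2 ms before and after the peak.
per_unit = waveform_buffer_bytes(300, 2.0, 2.0, num_channels=1018)
all_units = 400 * per_unit

print(f"per unit : {per_unit / 1e6:.1f} MB")   # prints "per unit : 97.7 MB"
print(f"400 units: {all_units / 1e9:.1f} GB")  # prints "400 units: 39.1 GB"
```

Under these assumptions the pre-allocated total scales linearly with unit count, while the streaming path stays at the per-unit figure.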
- class spikelab.spike_sorting.config.CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0)
Bases: object
Parameters for unit quality-control curation.
- __init__(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0)
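To give a feel for how thresholds like these are typically applied, here is a minimal sketch of a threshold filter over per-unit metrics. It is not the package's curation code: the comparison directions (inclusive lower bound for *_min, inclusive upper bound for *_max), the metric dictionary keys, and the unit in which the ISI metric is expressed are all assumptions, and only the first-stage spike count is modeled.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    # Values copied from the CurationConfig defaults above.
    fr_min: float = 0.05
    isi_viol_max: float = 0.01
    snr_min: float = 5.0
    spikes_min: int = 30          # spikes_min_first
    std_norm_max: float = 1.0


def passes(unit, t):
    """Return True if a unit's metrics clear every threshold.

    `unit` is a plain dict of metrics with hypothetical key names.
    """
    return (
        unit["firing_rate"] >= t.fr_min
        and unit["isi_violation"] <= t.isi_viol_max
        and unit["snr"] >= t.snr_min
        and unit["n_spikes"] >= t.spikes_min
        and unit["std_norm"] <= t.std_norm_max
    )


# Made-up metrics for three units.
units = {
    "u0": {"firing_rate": 1.2, "isi_violation": 0.0, "snr": 8.0,
           "n_spikes": 500, "std_norm": 0.4},
    "u1": {"firing_rate": 0.9, "isi_violation": 0.0, "snr": 3.0,
           "n_spikes": 400, "std_norm": 0.5},   # fails snr_min
    "u2": {"firing_rate": 0.1, "isi_violation": 0.0, "snr": 9.0,
           "n_spikes": 12, "std_norm": 0.3},    # fails spikes_min
}

kept = [uid for uid, metrics in units.items() if passes(metrics, Thresholds())]
```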
- class spikelab.spike_sorting.config.CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False)
Bases: object
Parameters for result compilation and export.
- __init__(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False)
- class spikelab.spike_sorting.config.FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=<factory>, scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)')
Bases: object
Parameters for QC figure generation.
- __init__(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=<factory>, scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)')
- class spikelab.spike_sorting.config.ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True, host_ram_watchdog=True, host_ram_warn_pct=85.0, host_ram_abort_pct=92.0, host_ram_poll_interval_s=2.0, preflight=True, preflight_strict=False, preflight_min_free_inter_gb=20.0, preflight_min_free_results_gb=2.0, preflight_min_available_ram_gb=4.0, preflight_min_free_vram_gb=2.0, sorter_inactivity_timeout=True, sorter_inactivity_base_s=600.0, sorter_inactivity_per_min_s=30.0, sorter_inactivity_max_s=7200.0, sorter_inactivity_in_process_grace_s=10.0, oom_retry_max=1, oom_retry_factor=0.5, canary_first_n_s=0.0, canary_min_recording_s=120.0, docker_image_expected_digest=None, disk_watchdog=True, disk_warn_free_gb=5.0, disk_abort_free_gb=1.0, disk_poll_interval_s=10.0, io_stall_watchdog=True, io_stall_s=300.0, io_stall_poll_interval_s=10.0, io_stall_mode='process', io_stall_include_descendants=True, cleanup_temp_files=True, prevent_system_sleep=True, gpu_watchdog=True, gpu_warn_pct=85.0, gpu_abort_pct=95.0, gpu_poll_interval_s=2.0, gpu_warn_temp_c=85.0, gpu_abort_temp_c=92.0, gpu_monitor_throttle_reasons=True, tee_log_policy='delete_on_success', generate_sorting_report=True)
Bases: object
Parameters for pipeline execution control.
Includes safety knobs for the host-memory watchdog and the pre-loop preflight checks under spikelab.spike_sorting.guards. Defaults are tuned for a 32–64 GB workstation; adjust the GB thresholds on smaller hosts.
- __init__(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True, host_ram_watchdog=True, host_ram_warn_pct=85.0, host_ram_abort_pct=92.0, host_ram_poll_interval_s=2.0, preflight=True, preflight_strict=False, preflight_min_free_inter_gb=20.0, preflight_min_free_results_gb=2.0, preflight_min_available_ram_gb=4.0, preflight_min_free_vram_gb=2.0, sorter_inactivity_timeout=True, sorter_inactivity_base_s=600.0, sorter_inactivity_per_min_s=30.0, sorter_inactivity_max_s=7200.0, sorter_inactivity_in_process_grace_s=10.0, oom_retry_max=1, oom_retry_factor=0.5, canary_first_n_s=0.0, canary_min_recording_s=120.0, docker_image_expected_digest=None, disk_watchdog=True, disk_warn_free_gb=5.0, disk_abort_free_gb=1.0, disk_poll_interval_s=10.0, io_stall_watchdog=True, io_stall_s=300.0, io_stall_poll_interval_s=10.0, io_stall_mode='process', io_stall_include_descendants=True, cleanup_temp_files=True, prevent_system_sleep=True, gpu_watchdog=True, gpu_warn_pct=85.0, gpu_abort_pct=95.0, gpu_poll_interval_s=2.0, gpu_warn_temp_c=85.0, gpu_abort_temp_c=92.0, gpu_monitor_throttle_reasons=True, tee_log_policy='delete_on_success', generate_sorting_report=True)
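The paired warn/abort percentages (host_ram_warn_pct and host_ram_abort_pct, gpu_warn_pct and gpu_abort_pct) suggest a two-level threshold check. The sketch below shows one plausible reading of that pattern; the function name, the inclusive comparisons and the returned labels are assumptions, and it does not reproduce the code under spikelab.spike_sorting.guards.

```python
def classify_usage(used_pct, warn_pct=85.0, abort_pct=92.0):
    """Map a usage percentage to 'ok', 'warn' or 'abort'.

    Defaults mirror host_ram_warn_pct / host_ram_abort_pct above;
    treating the thresholds as inclusive is an assumption.
    """
    if used_pct >= abort_pct:
        return "abort"
    if used_pct >= warn_pct:
        return "warn"
    return "ok"


# Host RAM samples, as a watchdog might read them once per poll interval.
ram_states = [classify_usage(p) for p in (50.0, 85.0, 91.9, 92.0)]

# The same helper with the GPU defaults (gpu_warn_pct=85, gpu_abort_pct=95).
gpu_state = classify_usage(93.0, warn_pct=85.0, abort_pct=95.0)
```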
- class spikelab.spike_sorting.config.SortingPipelineConfig(recording=<factory>, sorter=<factory>, rt_sort=<factory>, waveform=<factory>, curation=<factory>, compilation=<factory>, figures=<factory>, execution=<factory>)
Bases: object
Complete configuration for a spike sorting pipeline run.
Groups all parameters into typed sub-configs. Passed explicitly to every pipeline function, replacing module-level globals.
- Parameters:
recording (RecordingConfig) – Recording loading and preprocessing.
sorter (SorterConfig) – Spike sorter selection and parameters.
rt_sort (RTSortConfig) – RT-Sort specific parameters (only used when sorter.sorter_name == "rt_sort").
waveform (WaveformConfig) – Waveform extraction and templates.
curation (CurationConfig) – Unit quality-control filters.
compilation (CompilationConfig) – Result export options.
figures (FigureConfig) – QC figure generation.
execution (ExecutionConfig) – Pipeline control and parallelism.
- recording: RecordingConfig
- sorter: SorterConfig
- rt_sort: RTSortConfig
- waveform: WaveformConfig
- curation: CurationConfig
- compilation: CompilationConfig
- figures: FigureConfig
- execution: ExecutionConfig
- classmethod from_kwargs(**kwargs)
Build a config from flat keyword arguments.
Maps the flat parameter names used by sort_with_kilosort2() to the nested sub-config fields. Unknown keys raise TypeError.
- Parameters:
**kwargs – Flat keyword arguments matching sort_with_kilosort2() parameter names.
- Returns:
Populated configuration.
- Return type:
config (SortingPipelineConfig)
- override(**kwargs)
Return a copy of this config with selected fields overridden.
Accepts the same flat keyword arguments as from_kwargs(). Unspecified fields retain their current values.
- Parameters:
**kwargs – Flat keyword arguments to override.
- Returns:
New config with overrides.
- Return type:
config (SortingPipelineConfig)
- __init__(recording=<factory>, sorter=<factory>, rt_sort=<factory>, waveform=<factory>, curation=<factory>, compilation=<factory>, figures=<factory>, execution=<factory>)
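The flat-to-nested mapping that from_kwargs() and override() describe can be sketched with standard-library dataclasses. The classes below are small stand-ins, not the real configs, and the sketch assumes every flat name belongs to exactly one sub-config. The real class cannot rely on that, since delete_inter appears in both RTSortConfig and ExecutionConfig, and how it resolves such names is not shown here.

```python
from dataclasses import dataclass, field, fields, replace


@dataclass
class Curation:               # stand-in for CurationConfig
    snr_min: float = 5.0
    fr_min: float = 0.05


@dataclass
class Sorter:                 # stand-in for SorterConfig
    sorter_name: str = "kilosort2"
    use_docker: bool = False


@dataclass
class Pipeline:               # stand-in for SortingPipelineConfig
    curation: Curation = field(default_factory=Curation)
    sorter: Sorter = field(default_factory=Sorter)

    @classmethod
    def from_kwargs(cls, **kwargs):
        """Build a config from flat keyword arguments."""
        return cls().override(**kwargs)

    def override(self, **kwargs):
        """Return a copy with the given flat fields replaced."""
        remaining = dict(kwargs)
        updates = {}
        for group in fields(self):
            sub = getattr(self, group.name)
            names = {f.name for f in fields(sub)}
            picked = {k: remaining.pop(k) for k in list(remaining) if k in names}
            # replace() copies the sub-config, so `self` is never mutated.
            updates[group.name] = replace(sub, **picked)
        if remaining:
            raise TypeError(f"unknown config keys: {sorted(remaining)}")
        return replace(self, **updates)


base = Pipeline()
docker = base.override(use_docker=True, snr_min=6.0)

# A misspelled key is rejected rather than silently ignored.
try:
    Pipeline.from_kwargs(snr_mim=4.0)
    unknown_key_error = None
except TypeError as exc:
    unknown_key_error = str(exc)
```

Returning a new object from override() keeps shared presets safe to reuse, which matches the "Return a copy" wording above.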
- spikelab.spike_sorting.config.KILOSORT2 = SortingPipelineConfig(recording=RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=[], rec_chunks_s=[], start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000), sorter=SorterConfig(sorter_name='kilosort2', sorter_path=None, sorter_params=None, use_docker=False), rt_sort=RTSortConfig(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None), waveform=WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True), curation=CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0), compilation=CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False), figures=FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=['#f74343', '#fccd56', '#74fc56', '#56fcf6', '#1e1efa', '#fa1ed2'], scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. 
STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)'), execution=ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True, host_ram_watchdog=True, host_ram_warn_pct=85.0, host_ram_abort_pct=92.0, host_ram_poll_interval_s=2.0, preflight=True, preflight_strict=False, preflight_min_free_inter_gb=20.0, preflight_min_free_results_gb=2.0, preflight_min_available_ram_gb=4.0, preflight_min_free_vram_gb=2.0, sorter_inactivity_timeout=True, sorter_inactivity_base_s=600.0, sorter_inactivity_per_min_s=30.0, sorter_inactivity_max_s=7200.0, sorter_inactivity_in_process_grace_s=10.0, oom_retry_max=1, oom_retry_factor=0.5, canary_first_n_s=0.0, canary_min_recording_s=120.0, docker_image_expected_digest=None, disk_watchdog=True, disk_warn_free_gb=5.0, disk_abort_free_gb=1.0, disk_poll_interval_s=10.0, io_stall_watchdog=True, io_stall_s=300.0, io_stall_poll_interval_s=10.0, io_stall_mode='process', io_stall_include_descendants=True, cleanup_temp_files=True, prevent_system_sleep=True, gpu_watchdog=True, gpu_warn_pct=85.0, gpu_abort_pct=95.0, gpu_poll_interval_s=2.0, gpu_warn_temp_c=85.0, gpu_abort_temp_c=92.0, gpu_monitor_throttle_reasons=True, tee_log_policy='delete_on_success', generate_sorting_report=True))
Default configuration for Kilosort2. Parameters are compatible with Maxwell MEA and other probe types. Hardware-specific presets can be created by overriding parameters.
- spikelab.spike_sorting.config.KILOSORT2_DOCKER = SortingPipelineConfig(recording=RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=[], rec_chunks_s=[], start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000), sorter=SorterConfig(sorter_name='kilosort2', sorter_path=None, sorter_params=None, use_docker=True), rt_sort=RTSortConfig(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None), waveform=WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True), curation=CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0), compilation=CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False), figures=FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=['#f74343', '#fccd56', '#74fc56', '#56fcf6', '#1e1efa', '#fa1ed2'], scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. 
STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)'), execution=ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True, host_ram_watchdog=True, host_ram_warn_pct=85.0, host_ram_abort_pct=92.0, host_ram_poll_interval_s=2.0, preflight=True, preflight_strict=False, preflight_min_free_inter_gb=20.0, preflight_min_free_results_gb=2.0, preflight_min_available_ram_gb=4.0, preflight_min_free_vram_gb=2.0, sorter_inactivity_timeout=True, sorter_inactivity_base_s=600.0, sorter_inactivity_per_min_s=30.0, sorter_inactivity_max_s=7200.0, sorter_inactivity_in_process_grace_s=10.0, oom_retry_max=1, oom_retry_factor=0.5, canary_first_n_s=0.0, canary_min_recording_s=120.0, docker_image_expected_digest=None, disk_watchdog=True, disk_warn_free_gb=5.0, disk_abort_free_gb=1.0, disk_poll_interval_s=10.0, io_stall_watchdog=True, io_stall_s=300.0, io_stall_poll_interval_s=10.0, io_stall_mode='process', io_stall_include_descendants=True, cleanup_temp_files=True, prevent_system_sleep=True, gpu_watchdog=True, gpu_warn_pct=85.0, gpu_abort_pct=95.0, gpu_poll_interval_s=2.0, gpu_warn_temp_c=85.0, gpu_abort_temp_c=92.0, gpu_monitor_throttle_reasons=True, tee_log_policy='delete_on_success', generate_sorting_report=True))
Kilosort2 with Docker (no local MATLAB needed).
- spikelab.spike_sorting.config.KILOSORT4 = SortingPipelineConfig(recording=RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=[], rec_chunks_s=[], start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000), sorter=SorterConfig(sorter_name='kilosort4', sorter_path=None, sorter_params=None, use_docker=False), rt_sort=RTSortConfig(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None), waveform=WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True), curation=CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0), compilation=CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False), figures=FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=['#f74343', '#fccd56', '#74fc56', '#56fcf6', '#1e1efa', '#fa1ed2'], scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. 
STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)'), execution=ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True, host_ram_watchdog=True, host_ram_warn_pct=85.0, host_ram_abort_pct=92.0, host_ram_poll_interval_s=2.0, preflight=True, preflight_strict=False, preflight_min_free_inter_gb=20.0, preflight_min_free_results_gb=2.0, preflight_min_available_ram_gb=4.0, preflight_min_free_vram_gb=2.0, sorter_inactivity_timeout=True, sorter_inactivity_base_s=600.0, sorter_inactivity_per_min_s=30.0, sorter_inactivity_max_s=7200.0, sorter_inactivity_in_process_grace_s=10.0, oom_retry_max=1, oom_retry_factor=0.5, canary_first_n_s=0.0, canary_min_recording_s=120.0, docker_image_expected_digest=None, disk_watchdog=True, disk_warn_free_gb=5.0, disk_abort_free_gb=1.0, disk_poll_interval_s=10.0, io_stall_watchdog=True, io_stall_s=300.0, io_stall_poll_interval_s=10.0, io_stall_mode='process', io_stall_include_descendants=True, cleanup_temp_files=True, prevent_system_sleep=True, gpu_watchdog=True, gpu_warn_pct=85.0, gpu_abort_pct=95.0, gpu_poll_interval_s=2.0, gpu_warn_temp_c=85.0, gpu_abort_temp_c=92.0, gpu_monitor_throttle_reasons=True, tee_log_policy='delete_on_success', generate_sorting_report=True))
Default configuration for Kilosort4. Kilosort4 is pure Python (PyTorch) — no MATLAB required. Default parameters are tuned for Neuropixels probes but work for other probe types. Hardware-specific presets (e.g. for Maxwell MEAs) can be created by overriding detection/filtering parameters.
- spikelab.spike_sorting.config.KILOSORT4_DOCKER = SortingPipelineConfig(recording=RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=[], rec_chunks_s=[], start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000), sorter=SorterConfig(sorter_name='kilosort4', sorter_path=None, sorter_params=None, use_docker=True), rt_sort=RTSortConfig(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None), waveform=WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True), curation=CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0), compilation=CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False), figures=FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=['#f74343', '#fccd56', '#74fc56', '#56fcf6', '#1e1efa', '#fa1ed2'], scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. 
STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)'), execution=ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True, host_ram_watchdog=True, host_ram_warn_pct=85.0, host_ram_abort_pct=92.0, host_ram_poll_interval_s=2.0, preflight=True, preflight_strict=False, preflight_min_free_inter_gb=20.0, preflight_min_free_results_gb=2.0, preflight_min_available_ram_gb=4.0, preflight_min_free_vram_gb=2.0, sorter_inactivity_timeout=True, sorter_inactivity_base_s=600.0, sorter_inactivity_per_min_s=30.0, sorter_inactivity_max_s=7200.0, sorter_inactivity_in_process_grace_s=10.0, oom_retry_max=1, oom_retry_factor=0.5, canary_first_n_s=0.0, canary_min_recording_s=120.0, docker_image_expected_digest=None, disk_watchdog=True, disk_warn_free_gb=5.0, disk_abort_free_gb=1.0, disk_poll_interval_s=10.0, io_stall_watchdog=True, io_stall_s=300.0, io_stall_poll_interval_s=10.0, io_stall_mode='process', io_stall_include_descendants=True, cleanup_temp_files=True, prevent_system_sleep=True, gpu_watchdog=True, gpu_warn_pct=85.0, gpu_abort_pct=95.0, gpu_poll_interval_s=2.0, gpu_warn_temp_c=85.0, gpu_abort_temp_c=92.0, gpu_monitor_throttle_reasons=True, tee_log_policy='delete_on_success', generate_sorting_report=True))
Kilosort4 with Docker.
- spikelab.spike_sorting.config.RT_SORT_MEA = SortingPipelineConfig(recording=RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=[], rec_chunks_s=[], start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000), sorter=SorterConfig(sorter_name='rt_sort', sorter_path=None, sorter_params=None, use_docker=False), rt_sort=RTSortConfig(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None), waveform=WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True), curation=CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0), compilation=CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False), figures=FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=['#f74343', '#fccd56', '#74fc56', '#56fcf6', '#1e1efa', '#fa1ed2'], scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. 
STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)'), execution=ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True, host_ram_watchdog=True, host_ram_warn_pct=85.0, host_ram_abort_pct=92.0, host_ram_poll_interval_s=2.0, preflight=True, preflight_strict=False, preflight_min_free_inter_gb=20.0, preflight_min_free_results_gb=2.0, preflight_min_available_ram_gb=4.0, preflight_min_free_vram_gb=2.0, sorter_inactivity_timeout=True, sorter_inactivity_base_s=600.0, sorter_inactivity_per_min_s=30.0, sorter_inactivity_max_s=7200.0, sorter_inactivity_in_process_grace_s=10.0, oom_retry_max=1, oom_retry_factor=0.5, canary_first_n_s=0.0, canary_min_recording_s=120.0, docker_image_expected_digest=None, disk_watchdog=True, disk_warn_free_gb=5.0, disk_abort_free_gb=1.0, disk_poll_interval_s=10.0, io_stall_watchdog=True, io_stall_s=300.0, io_stall_poll_interval_s=10.0, io_stall_mode='process', io_stall_include_descendants=True, cleanup_temp_files=True, prevent_system_sleep=True, gpu_watchdog=True, gpu_warn_pct=85.0, gpu_abort_pct=95.0, gpu_poll_interval_s=2.0, gpu_warn_temp_c=85.0, gpu_abort_temp_c=92.0, gpu_monitor_throttle_reasons=True, tee_log_policy='delete_on_success', generate_sorting_report=True))
RT-Sort with the bundled MEA detection model. Uses the propagation-based RT-Sort algorithm (van der Molen, Lim et al. 2024, PLOS ONE) with the pretrained model tuned for Maxwell multi-electrode arrays.
- spikelab.spike_sorting.config.RT_SORT_NEUROPIXELS = SortingPipelineConfig(recording=RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=[], rec_chunks_s=[], start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000), sorter=SorterConfig(sorter_name='rt_sort', sorter_path=None, sorter_params=None, use_docker=False), rt_sort=RTSortConfig(model_path=None, probe='neuropixels', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params={'stringent_thresh': 0.175, 'loose_thresh': 0.075, 'inference_scaling_numerator': 15.4, 'min_amp_dist_p': 0.1, 'max_latency_diff_spikes': 2.5, 'max_amp_median_diff_spikes': 0.45, 'max_latency_diff_sequences': 2.5, 'max_amp_median_diff_sequences': 0.45, 'max_root_amp_median_std_sequences': 2.5}, detection_window_s=None), waveform=WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True), curation=CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0), compilation=CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False), figures=FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=['#f74343', '#fccd56', 
'#74fc56', '#56fcf6', '#1e1efa', '#fa1ed2'], scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)'), execution=ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True, host_ram_watchdog=True, host_ram_warn_pct=85.0, host_ram_abort_pct=92.0, host_ram_poll_interval_s=2.0, preflight=True, preflight_strict=False, preflight_min_free_inter_gb=20.0, preflight_min_free_results_gb=2.0, preflight_min_available_ram_gb=4.0, preflight_min_free_vram_gb=2.0, sorter_inactivity_timeout=True, sorter_inactivity_base_s=600.0, sorter_inactivity_per_min_s=30.0, sorter_inactivity_max_s=7200.0, sorter_inactivity_in_process_grace_s=10.0, oom_retry_max=1, oom_retry_factor=0.5, canary_first_n_s=0.0, canary_min_recording_s=120.0, docker_image_expected_digest=None, disk_watchdog=True, disk_warn_free_gb=5.0, disk_abort_free_gb=1.0, disk_poll_interval_s=10.0, io_stall_watchdog=True, io_stall_s=300.0, io_stall_poll_interval_s=10.0, io_stall_mode='process', io_stall_include_descendants=True, cleanup_temp_files=True, prevent_system_sleep=True, gpu_watchdog=True, gpu_warn_pct=85.0, gpu_abort_pct=95.0, gpu_poll_interval_s=2.0, gpu_warn_temp_c=85.0, gpu_abort_temp_c=92.0, gpu_monitor_throttle_reasons=True, tee_log_policy='delete_on_success', generate_sorting_report=True))
RT-Sort with the bundled Neuropixels detection model. Uses Neuropixels-tuned detection thresholds and merge parameters.
Backend Registry
Spike sorter backend registry.
Maps sorter names to their backend classes. Backends are imported lazily to avoid requiring all sorter dependencies at import time.
- spikelab.spike_sorting.backends.get_backend_class(sorter_name)[source]
Look up and import the backend class for a sorter name.
- Parameters:
sorter_name (str) – Registered sorter name (e.g. "kilosort2").
- Returns:
The SorterBackend subclass.
- Return type:
cls
- Raises:
ValueError – If the sorter name is not registered.
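The lazy-import pattern described above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the registry contents and the name `lookup_backend_class` are hypothetical, and standard-library classes stand in for real backend modules so the sketch runs without any sorter installed.

```python
import importlib

# Hypothetical registry: name -> (module path, class name).
# Stdlib targets stand in for real backend modules.
_REGISTRY = {
    "ordered": ("collections", "OrderedDict"),
    "fraction": ("fractions", "Fraction"),
}


def lookup_backend_class(sorter_name):
    """Import and return the class registered under ``sorter_name``."""
    try:
        module_path, class_name = _REGISTRY[sorter_name]
    except KeyError:
        known = ", ".join(sorted(_REGISTRY))
        raise ValueError(
            f"Unknown sorter {sorter_name!r}; registered: {known}"
        ) from None
    # The import happens only here, so unused backends cost nothing at startup.
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


cls = lookup_backend_class("ordered")
print(cls.__name__)
```

Keeping only strings in the registry is what makes the lookup cheap: a missing optional dependency surfaces when that backend is requested, not when the package is imported.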
- class spikelab.spike_sorting.backends.base.SorterBackend(config)[source]
Bases: ABC
Interface that each spike sorter backend must implement.
- Parameters:
config (SortingPipelineConfig) – Full pipeline configuration. Backends read their relevant sub-configs (config.recording, config.sorter, config.waveform, config.execution).
- abstractmethod load_recording(rec_path)[source]
Load and preprocess a single recording.
Handles format-specific loading (Maxwell .h5, NWB, etc.), gain/offset scaling, and bandpass filtering.
- abstractmethod sort(recording, rec_path, recording_dat_path, output_folder)[source]
Run the spike sorter on a preprocessed recording.
- Parameters:
recording – SpikeInterface BaseRecording from load_recording.
rec_path – Original recording file path (for binary conversion or metadata).
recording_dat_path (Path) – Path for the binary .dat file (used by sorters that require pre-converted input).
output_folder (Path) – Directory for sorter output files.
- Returns:
A SpikeInterface BaseSorting with detected units and spike trains.
- Return type:
sorting
- abstractmethod extract_waveforms(recording, sorting, waveforms_folder, curation_folder, rec_path=None, rng=None)[source]
Extract per-unit waveforms and compute templates.
- Parameters:
recording – SpikeInterface BaseRecording.
sorting – SpikeInterface BaseSorting from sort.
waveforms_folder (Path) – Root directory for waveform storage.
curation_folder (Path) – Directory for initial unit list and metadata.
- Returns:
An object providing at minimum:
sorting – the sorting object (possibly with centered spike times)
recording – the recording object
sampling_frequency – float
peak_ind – int (peak sample index in template)
chans_max_all – dict or array mapping unit_id to max-amplitude channel index
use_pos_peak – dict or array mapping unit_id to bool (polarity)
get_computed_template(unit_id, mode) – returns (n_samples, n_channels) template array
ms_to_samples(ms) – time conversion
root_folder – Path to waveform files
This can be the custom WaveformExtractor (Kilosort2 backend) or a wrapper around SpikeInterface’s WaveformExtractor (future backends).
- Return type:
waveform_extractor
- write_recording(recording, dat_path)[source]
Convert a recording to the binary format needed by the sorter.
Not all sorters need this (some read recordings directly via SpikeInterface). The default implementation is a no-op.
- scale_oom_params(factor)[source]
Mutate self.config to halve (or scale) the OOM-bound knob.
Each backend overrides this to adjust the parameter most directly responsible for GPU memory consumption, typically the per-batch sample count. The default implementation does nothing and reports failure so callers know retry-on-OOM is not supported for that backend.
- snapshot_oom_params()[source]
Return a snapshot of OOM-bound config fields for restore.
Used by the per-recording OOM-retry loop so a scale-down applied for one recording does not silently persist into the next. The returned dict is opaque: only restore_oom_params() is expected to read it.
- Returns:
Backend-specific snapshot. Default implementation returns an empty dict.
- Return type:
snapshot (dict)
- restore_oom_params(snapshot)[source]
Restore the OOM-bound config fields from a prior snapshot.
Default implementation is a no-op. Backends that override scale_oom_params() should also override this so the retry loop can reset the config between recordings.
- Parameters:
snapshot (dict) – Object returned by snapshot_oom_params().
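How the three OOM hooks could cooperate in a retry loop is sketched below. Everything here is an assumption made for illustration: `ToyBackend`, its `batch_size` knob, the boolean return of `scale_oom_params`, and the use of `MemoryError` as the simulated failure are stand-ins, not the pipeline's actual code. The default arguments reuse the `oom_retry_max=1` and `oom_retry_factor=0.5` values shown in the preset configs.

```python
class ToyBackend:
    """Stand-in backend with one OOM-bound knob (hypothetical ``batch_size``)."""

    def __init__(self, batch_size):
        self.config = {"batch_size": batch_size}

    def snapshot_oom_params(self):
        return dict(self.config)

    def scale_oom_params(self, factor):
        new_size = int(self.config["batch_size"] * factor)
        if new_size < 1:
            return False  # cannot shrink further
        self.config["batch_size"] = new_size
        return True

    def restore_oom_params(self, snapshot):
        self.config.update(snapshot)

    def sort(self, gpu_budget):
        # Simulated sorter: fails when the batch does not fit the budget.
        if self.config["batch_size"] > gpu_budget:
            raise MemoryError("simulated GPU OOM")
        return self.config["batch_size"]


def sort_with_oom_retry(backend, gpu_budget, oom_retry_max=1, oom_retry_factor=0.5):
    """Retry a failed sort with scaled-down params, then restore the config."""
    snapshot = backend.snapshot_oom_params()
    try:
        for attempt in range(oom_retry_max + 1):
            try:
                return backend.sort(gpu_budget), attempt
            except MemoryError:
                out_of_retries = attempt == oom_retry_max
                if out_of_retries or not backend.scale_oom_params(oom_retry_factor):
                    raise
    finally:
        # Undo the scale-down so the next recording starts from the original knob.
        backend.restore_oom_params(snapshot)


backend = ToyBackend(batch_size=64000)
used_batch, retries = sort_with_oom_retry(backend, gpu_budget=40000)
print(used_batch, retries, backend.config["batch_size"])
```

The `finally` clause is the point of the snapshot/restore pair: the reduced batch size applies to the recording that needed it and does not leak into the next one.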
Classified Exceptions
When a sort fails, SpikeLab can classify the failure into one of three categories so that callers can implement skip/retry/stop policies without parsing generic error messages.
Classified spike-sorting exceptions shared across runners and curation.
Failures from Kilosort2, Kilosort4, and the downstream curation/waveform
code are grouped into three categories so callers can implement retry /
skip / hard-stop policies without parsing generic Exception messages:
BiologicalSortFailure – the recording itself cannot be sorted (too silent, all channels bad, no waveforms to compute metrics on). Recommended policy: mark the target as not-sortable, move on, do not retry.
EnvironmentSortFailure – the host environment or container runtime is misconfigured. Recommended policy: hard stop and surface to the operator; retrying without intervention will loop.
ResourceSortFailure – the job exhausted a machine resource (GPU memory today; disk/CPU in future). Recommended policy: retry with reduced parameters rather than skip or hard-stop.
Classifiers in _classifier inspect sorter logs and exception
chains to re-raise generic failures as one of the specific types below.
The classes are also usable directly from non-classifier paths (e.g.
curation code that already knows the exact condition).
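A caller-side policy built on the three categories might look like the sketch below. The exception classes are local stand-ins that mirror the documented hierarchy so the example runs without SpikeLab installed; `run_batch`, `fake_sort`, and the `scale` argument are hypothetical names introduced for illustration.

```python
# Local stand-ins mirroring the documented hierarchy.
class SpikeSortingClassifiedError(RuntimeError):
    pass


class BiologicalSortFailure(SpikeSortingClassifiedError):
    pass


class EnvironmentSortFailure(SpikeSortingClassifiedError):
    pass


class ResourceSortFailure(SpikeSortingClassifiedError):
    pass


def run_batch(recordings, sort_one, max_retries=1):
    """Apply skip / retry / hard-stop policies per failure category."""
    outcomes = {}
    for rec in recordings:
        scale = 1.0
        for attempt in range(max_retries + 1):
            try:
                sort_one(rec, scale)
                outcomes[rec] = "success"
                break
            except BiologicalSortFailure:
                outcomes[rec] = "skipped"  # not sortable: move on, no retry
                break
            except ResourceSortFailure:
                if attempt == max_retries:
                    outcomes[rec] = "resource_exhausted"
                else:
                    scale *= 0.5  # retry with reduced parameters
            # EnvironmentSortFailure is deliberately not caught: hard stop.
    return outcomes


def fake_sort(rec, scale):
    # Simulated failures keyed on made-up file names.
    if rec == "silent.h5":
        raise BiologicalSortFailure("too little activity")
    if rec == "dense.h5" and scale > 0.5:
        raise ResourceSortFailure("simulated GPU OOM")


outcomes = run_batch(["ok.h5", "silent.h5", "dense.h5"], fake_sort)
print(outcomes)
```

Leaving `EnvironmentSortFailure` uncaught lets it propagate out of the batch loop, which matches the hard-stop recommendation: a misconfigured host would fail every remaining recording the same way.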
- exception spikelab.spike_sorting._exceptions.SpikeSortingClassifiedError[source]
Bases: RuntimeError
Base class for all classified sort-pipeline failures.
Catch this when you want to treat any identified failure uniformly. Prefer catching the more specific categorical bases (BiologicalSortFailure, EnvironmentSortFailure, ResourceSortFailure) when the policy differs by category.
- exception spikelab.spike_sorting._exceptions.BiologicalSortFailure[source]
Bases: SpikeSortingClassifiedError
Failure caused by the recording itself (too little signal).
- exception spikelab.spike_sorting._exceptions.EnvironmentSortFailure[source]
Bases: SpikeSortingClassifiedError
Failure caused by host or container environment misconfiguration.
- exception spikelab.spike_sorting._exceptions.ResourceSortFailure[source]
Bases: SpikeSortingClassifiedError
Failure caused by exhausting a machine resource.
- exception spikelab.spike_sorting._exceptions.InsufficientActivityError(message, *, sorter, threshold_crossings=None, units_at_failure=None, nspks_at_failure=None, log_path=None)[source]
Bases: BiologicalSortFailure
Sorting crashed because the recording has too little spiking activity.
Kilosort2, Kilosort4, and RT-Sort all fail on near-silent recordings, but in different ways:
Kilosort2: mex kernels launch with degenerate grid/block configurations when template counts and per-batch spike counts approach zero. Pre-Blackwell GPUs tolerated these launches; newer architectures (compute capability ≥ 12) reject them with CUDA error: invalid configuration argument.
Kilosort4: sklearn’s TruncatedSVD rejects an empty feature matrix, or KMeans fails the n_samples >= n_clusters check, when the initial spike-detection pass finds essentially no events.
RT-Sort: detect_sequences produces zero propagation sequences when the recording lacks sufficient spiking activity for clustering. It returns None, which causes an AttributeError when sort_offline is subsequently called.
- threshold_crossings
KS2 only; count of detected threshold crossings parsed from kilosort2.log. None for KS4 / RT-Sort.
- units_at_failure
KS2 template count at the crash, or KS4 n_samples when KMeans complained. None when the log did not expose the value.
- nspks_at_failure
KS2 only; spikes-per-batch at the failing template-optimization step.
- log_path
Sorter log file carrying the full trace when located.
- sorter
Short identifier of the sorter that raised ("kilosort2", "kilosort4", "rt_sort").
- exception spikelab.spike_sorting._exceptions.NoGoodChannelsError(message, *, sorter, total_channels=None, bad_channels=None, log_path=None)[source]
Bases: BiologicalSortFailure
All channels were flagged as bad by the sorter’s good-channel check.
Distinct from InsufficientActivityError: the signal may be noisy/present but no channel passes the sorter’s minfr_goodchannels (or equivalent) firing-rate threshold.
- total_channels
Total channel count in the recording, when parsed.
- bad_channels
Channels flagged as bad.
- log_path
Sorter log file carrying the full trace when located.
- sorter
Short identifier of the sorter that raised.
- exception spikelab.spike_sorting._exceptions.SaturatedSignalError(message, *, channels_saturated=None, total_channels=None)[source]
Bases: BiologicalSortFailure
Recording appears flat or rail-saturated across all channels.
Typical causes: disconnected electrodes, loss of fluid contact, broken amplifier front-end, or a saved recording that never received real data. Distinct from InsufficientActivityError because it reflects a hardware/acquisition fault rather than biology.
The sort-time log signatures are ambiguous with near-silent biology, so this class is currently intended to be raised by dedicated pre-sort validators (e.g. per-channel variance / rail-clip checks) rather than by the post-failure classifiers. Callers that already know the condition may raise it directly.
- channels_saturated
Number of channels identified as saturated, when the caller provides this.
- total_channels
Total channel count in the recording.
- exception spikelab.spike_sorting._exceptions.EmptyWaveformMetricsError(message, *, metric_name=None)[source]
Bases: BiologicalSortFailure, ValueError
Waveform metrics (SNR, std-norm) cannot be computed.
Raised when curation requests a waveform-based metric but no precomputed values exist and raw_data on the SpikeData is empty, so there is nothing to extract waveforms from.
This is biology-adjacent: it typically means the upstream sorter produced units that have no usable waveform evidence attached, or that the pipeline skipped the waveform-extraction stage. Callers should treat it as “cannot curate this target” rather than retry.
Inherits from both BiologicalSortFailure (for category-aware handling) and ValueError (for backward compatibility with callers that historically caught ValueError from this site).
- metric_name
The metric that could not be computed.
- exception spikelab.spike_sorting._exceptions.ConcurrentSortError(message, *, lock_path=None, holder_pid=None, holder_hostname=None, started_at=None)[source]
Bases: EnvironmentSortFailure
Another sort is already in progress on the same intermediate folder.
Raised by spikelab.spike_sorting.guards.acquire_sort_lock() when a pre-existing lock file points at an alive PID on the same host. Two concurrent sorts against the same intermediate folder would corrupt each other’s binary artefacts (KS2 .dat file, RT-Sort scaled traces, curation cache), so the second sort fails fast rather than racing.
Recommended remediation: wait for the running sort to finish, or point the second sort at a different intermediate_folders path. If you believe the holder is dead but the lock persists, delete <inter_path>/.spikelab_sort.lock by hand.
- lock_path
Path to the lock file that triggered the abort.
- holder_pid
PID listed in the lock file (when readable).
- holder_hostname
Hostname listed in the lock file (when readable).
- started_at
ISO timestamp recorded when the holder acquired the lock.
- exception spikelab.spike_sorting._exceptions.HDF5PluginMissingError(message, *, configured_path=None)[source]
Bases: EnvironmentSortFailure
HDF5 filter plugin is missing or the plugin path is misconfigured.
Typical signatures in the underlying exception chain: h5py / HDF5 errors about being unable to open a compressed dataset, or the inherited HDF5_PLUGIN_PATH environment variable pointing to a non-existent directory.
Recommended remediation (operator, not the library): set HDF5_PLUGIN_PATH to a directory containing the compression plugin required by the recording’s HDF5 build before any h5py import. The exact directory and plugin name are deployment-specific.
- configured_path
The value of HDF5_PLUGIN_PATH at failure time, if known.
- exception spikelab.spike_sorting._exceptions.DockerEnvironmentError(message, *, reason)[source]
Bases: EnvironmentSortFailure
Docker daemon, client library, or image is unusable for sorting.
The reason string narrows the failure mode so callers can render better diagnostics or choose different remediations without catching sub-exceptions.
Recognized reason values:
"daemon_down" – Cannot connect to the Docker daemon.
"client_missing" – The Python docker client library is not installed in the sorting env.
"image_pull_failed" – Image pull returned an error (network, auth, or manifest-not-found).
"permission_denied" – Socket permission denied; user not in the docker group or equivalent.
"other" – Docker is broken in a way that did not match any known signature; inspect __cause__ for details.
- reason
One of the strings above.
- exception spikelab.spike_sorting._exceptions.ModelLoadingError(message, *, sorter='rt_sort', model_path=None)[source]
Bases: EnvironmentSortFailure
Detection model could not be loaded or is unusable.
Raised when RT-Sort’s ModelSpikeSorter.load() fails, typically because PyTorch is missing, weights are corrupt, the model folder does not exist, or the architecture parameters do not match the saved state dict.
- model_path
Path that was attempted, when known.
- sorter
Short identifier of the sorter that raised.
- exception spikelab.spike_sorting._exceptions.GPUOutOfMemoryError(message, *, sorter, log_path=None)[source]
Bases: ResourceSortFailure
The sorter exhausted GPU memory.
Raised when either a PyTorch CUDA out of memory error (KS4) or a MATLAB/mex CUDA_ERROR_OUT_OF_MEMORY diagnostic (KS2) appears in the exception chain or sorter log.
Recommended remediation: reduce batch size / NT / nPCs, split the recording into shorter segments, or run on a larger-memory GPU. Retrying the same command unchanged will loop.
- sorter
Short identifier of the sorter that raised.
- log_path
Sorter log file carrying the full trace when located.
- exception spikelab.spike_sorting._exceptions.SorterTimeoutError(message, *, sorter, inactivity_s=None, log_path=None)[source]
Bases: ResourceSortFailure
The sorter subprocess produced no output for too long.
Raised by spikelab.spike_sorting.guards.LogInactivityWatchdog when the sorter’s log file has not been updated within the configured inactivity tolerance. Distinct from a hard wall-clock timeout: this fires only when the sort has stopped making progress (no log writes), so legitimate long sorts on dense MEAs / multi-hour recordings are not falsely killed.
Recommended remediation: skip the recording and continue. Retrying without intervention will likely hang again at the same stage. Investigate the sorter log up to the inactivity point for the proximate cause (CUDA hang, MATLAB JVM deadlock, mex kernel failure mode, disk-full stall).
- sorter
Short identifier of the sorter that hung.
- inactivity_s
Configured inactivity tolerance at the time of the trip, in seconds.
- log_path
Path to the sorter log file the watchdog was polling, when known.
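The difference between an inactivity check and a wall-clock timeout comes down to comparing against the log's last write rather than the job's start time. The sketch below assumes the check is based on file modification time; `log_is_stalled` is a hypothetical helper, not the watchdog's actual code, and the 600-second tolerance simply reuses the `sorter_inactivity_base_s` value shown in the preset configs.

```python
import os
import tempfile
import time


def log_is_stalled(log_path, tolerance_s, now=None):
    """Return True when the log's mtime is older than ``tolerance_s`` seconds."""
    now = time.time() if now is None else now
    try:
        last_write = os.path.getmtime(log_path)
    except OSError:
        return False  # no log yet: treated as "not started", not as stalled
    return (now - last_write) > tolerance_s


with tempfile.TemporaryDirectory() as tmp:
    log = os.path.join(tmp, "sorter.log")
    with open(log, "w") as fh:
        fh.write("batch 1/10\n")
    fresh = log_is_stalled(log, tolerance_s=600.0)

    # Pretend the last write happened 15 minutes ago.
    old = time.time() - 900
    os.utime(log, (old, old))
    stale = log_is_stalled(log, tolerance_s=600.0)

print(fresh, stale)
```

A sort that runs for hours but keeps appending to its log never trips this check, whereas one that stops writing trips it once the tolerance elapses, regardless of how long it has been running.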
- exception spikelab.spike_sorting._exceptions.DiskExhaustionError(message, *, folder=None, free_gb_at_trip=None, abort_threshold_gb=None, report=None)[source]
Bases: ResourceSortFailure
Free disk space crossed the watchdog abort threshold mid-sort.
Raised by spikelab.spike_sorting.guards.DiskUsageWatchdog when shutil.disk_usage(folder).free drops below the configured abort threshold while a sort is in progress. RT-Sort especially can fill a volume mid-run by writing scaled traces, model traces, and model outputs as large .npy files.
The exception carries a DiskExhaustionReport describing free space, projected need, top disk consumers in the watched folder, and suggested operator actions.
Recommended remediation: free disk space (or shorten the recording window via RTSortConfig.recording_window_ms / first_n_mins) and rerun. The report’s top_consumers field flags the largest existing files in the watched folder so the operator can clean up safely.
- folder
The folder whose free space crossed the threshold.
- free_gb_at_trip
Free space (GB) at the moment of the trip.
- abort_threshold_gb
Configured abort threshold (GB).
- report
Optional DiskExhaustionReport with the full diagnostic payload. None only when the report could not be assembled (e.g. os.walk failed).
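The free-space test itself is a single `shutil.disk_usage` call. The sketch below shows one way a warn/abort decision could be derived from it; `disk_state` is a hypothetical helper, the defaults reuse the `disk_warn_free_gb=5.0` and `disk_abort_free_gb=1.0` values from the preset configs, and treating "GB" as GiB (1024³ bytes) is an assumption.

```python
import shutil

GIB = 1024 ** 3  # assumption: thresholds are interpreted as GiB


def disk_state(folder, warn_free_gb=5.0, abort_free_gb=1.0):
    """Classify free space on the volume holding ``folder``."""
    free_gb = shutil.disk_usage(folder).free / GIB
    if free_gb < abort_free_gb:
        return "abort", free_gb
    if free_gb < warn_free_gb:
        return "warn", free_gb
    return "ok", free_gb


state, free_gb = disk_state(".")
print(f"{state}: {free_gb:.1f} GiB free")
```

Because `shutil.disk_usage` reports on the whole volume, watching the intermediate folder also catches space consumed by unrelated processes writing to the same disk.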
- exception spikelab.spike_sorting._exceptions.GpuMemoryWatchdogError(message, *, device_index=None, used_pct_at_trip=None, abort_pct=None)[source]
Bases: ResourceSortFailure
GPU VRAM crossed the watchdog abort threshold mid-sort.
Raised by spikelab.spike_sorting.guards.GpuMemoryWatchdog when free VRAM on the device-in-use drops below the configured abort threshold (or used VRAM crosses the abort percentage). Sharp GPU OOMs typically come from PyTorch allocator fragmentation rather than a clean cudaMalloc failure, so a percentage-based early warning lets the pipeline trigger the existing OOM-retry path with a reduced batch before the next allocation hits the wall.
Recommended remediation: rerun with reduced sorter batch params. The existing OOM-retry path handles this automatically through GPUOutOfMemoryError classification, which this exception parallels: both surface as oom_gpu status.
- device_index
Index of the GPU device that crossed the threshold.
- used_pct_at_trip
GPU memory used percentage at the moment of the trip.
- abort_pct
Configured abort percentage threshold.
- exception spikelab.spike_sorting._exceptions.GpuThermalWatchdogError(message, *, device_index=None, temperature_c_at_trip=None, abort_temp_c=None)[source]
Bases: ResourceSortFailure
GPU temperature crossed the watchdog abort threshold mid-sort.
Raised by spikelab.spike_sorting.guards.GpuMemoryWatchdog when the device’s reported temperature crosses the configured abort threshold. Sustained operation above the GPU’s thermal junction limit risks driver-level throttling that produces silently degraded output, or in extreme cases a hardware shutdown that loses the in-progress sort.
Recommended remediation: pause the batch until the GPU cools (check airflow, ambient temperature, dust on the heatsink), then rerun. A persistent thermal trip across reboots indicates a cooling failure that needs operator attention.
- device_index
Index of the GPU device that crossed the threshold.
- temperature_c_at_trip
Reported device temperature in degrees Celsius at the moment of the trip.
- abort_temp_c
Configured abort temperature threshold.
- exception spikelab.spike_sorting._exceptions.IOStallError(message, *, device=None, stall_s=None)[source]
Bases: ResourceSortFailure
Disk I/O stalled mid-sort.
Raised by spikelab.spike_sorting.guards.IOStallWatchdog when psutil.disk_io_counters() for the watched volume shows no byte-counter movement for the configured tolerance, which is typical of a hung NFS / SMB / S3-fuse mount that’s still accepting file handles but not actually reading or writing.
The inactivity watchdog catches some I/O stalls (no log output → trip), but a sorter that keeps logging while waiting for I/O can defeat that signal. The I/O stall watchdog adds a second layer specifically targeting kernel-level read/write progress.
- device
Volume identifier (e.g. "sda1", "C:").
- stall_s
Configured stall tolerance at the time of the trip.
- exception spikelab.spike_sorting._exceptions.HostMemoryWatchdogError(message, *, percent_at_trip=None, abort_pct=None)[source]
Bases: ResourceSortFailure
Host RAM pressure exceeded the watchdog abort threshold.
Raised by spikelab.spike_sorting.guards.HostMemoryWatchdog when psutil.virtual_memory().percent crosses the configured abort percentage. Distinct from a Python MemoryError (which fires on a failed allocation): this signals impending host-level thrash before any individual allocation has hit a wall, so the pipeline can skip the current recording and let the workstation recover.
Recommended remediation: skip the current recording, free references and call gc.collect() / torch.cuda.empty_cache(), then continue with the next recording. Investigate the recording that tripped the trigger: long durations, very high unit counts, or oversized intermediate buffers are common causes.
- percent_at_trip
psutil system memory percentage at the moment the watchdog tripped.
- abort_pct
Configured abort threshold.
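A percentage watchdog of this kind can be reduced to a reading function plus two thresholds. The sketch below is illustrative only: `HostMemoryTrip` is a local stand-in that mirrors the documented `percent_at_trip` and `abort_pct` attributes, `check_host_ram` is a hypothetical helper, and the readings are injected through a callable so the example runs without psutil. In a real watchdog the callable would presumably wrap `psutil.virtual_memory().percent`. Whether the comparison is `>=` or `>` is also an assumption.

```python
class HostMemoryTrip(RuntimeError):
    """Local stand-in mirroring the documented keyword-only attributes."""

    def __init__(self, message, *, percent_at_trip=None, abort_pct=None):
        super().__init__(message)
        self.percent_at_trip = percent_at_trip
        self.abort_pct = abort_pct


def check_host_ram(read_percent, warn_pct=85.0, abort_pct=92.0, warnings=None):
    """Take one reading; record a warning or raise when a threshold is crossed."""
    percent = read_percent()
    if percent >= abort_pct:
        raise HostMemoryTrip(
            f"host RAM at {percent:.1f}% (abort threshold {abort_pct:.1f}%)",
            percent_at_trip=percent,
            abort_pct=abort_pct,
        )
    if percent >= warn_pct and warnings is not None:
        warnings.append(percent)
    return percent


# Made-up readings standing in for successive polls.
readings = iter([70.0, 88.0, 93.5])
warned = []
tripped_at = None
try:
    while True:
        check_host_ram(lambda: next(readings), warnings=warned)
except HostMemoryTrip as trip:
    tripped_at = trip.percent_at_trip

print(warned, tripped_at)
```

Carrying the reading and the threshold on the exception lets the caller log both values without re-polling after the trip.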
Post-Failure Classifiers
The classifier module inspects sorter logs and exception chains to produce
specific SpikeSortingClassifiedError
subclasses from generic failures.
- spikelab.spike_sorting._classifier.classify_ks2_failure(output_folder, exc)[source]
Return a classified exception for a Kilosort2 failure, or None.
Priority: environment → resource → biology. Environment and resource errors can appear on any recording, so they take precedence over biology signatures that would otherwise be consistent with them.
- spikelab.spike_sorting._classifier.classify_ks4_failure(output_folder, exc)[source]
Return a classified exception for a Kilosort4 failure, or None.
Priority mirrors KS2. KS4 does not expose a distinct “all channels bad” diagnostic the same way KS2 does, so only the generic biology classifier (insufficient activity) is applied.
- spikelab.spike_sorting._classifier.classify_rt_sort_failure(output_folder, exc)[source]
Return a classified exception for an RT-Sort failure, or None.
Priority: environment → resource → biology. RT-Sort does not use Docker, but the HDF5 plugin check applies because it reads HDF5 recordings. GPU OOM is possible during model inference.
- Parameters:
output_folder (Path) – RT-Sort output directory (may contain rt_sort.log).
exc (BaseException) – The caught exception.
- Returns:
A classified exception if a known signature was found, otherwise None.
- Return type:
classified (SpikeSortingClassifiedError or None)
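The priority ordering can be illustrated with a substring-based classifier that scans both the log text and the exception chain. This is a simplified sketch, not the module's implementation: the `SIGNATURES` table, `iter_exception_chain`, and `classify` are hypothetical, and the only signatures used are message fragments quoted elsewhere on this page.

```python
# (category, substring) pairs; list order encodes priority:
# environment first, then resource, then biology.
SIGNATURES = [
    ("environment", "Cannot connect to the Docker daemon"),
    ("resource", "CUDA out of memory"),
    ("resource", "CUDA_ERROR_OUT_OF_MEMORY"),
    ("biology", "invalid configuration argument"),
]


def iter_exception_chain(exc):
    """Yield ``exc`` and every exception reachable via __cause__/__context__."""
    seen = set()
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        yield exc
        exc = exc.__cause__ or exc.__context__


def classify(log_text, exc):
    """Return the highest-priority matching category, or None."""
    chain_text = "\n".join(str(e) for e in iter_exception_chain(exc))
    haystack = log_text + "\n" + chain_text
    for category, needle in SIGNATURES:
        if needle in haystack:
            return category
    return None


try:
    try:
        raise RuntimeError("CUDA out of memory. Tried to allocate 2.00 GiB")
    except RuntimeError as inner:
        raise RuntimeError("sorter failed") from inner
except RuntimeError as outer:
    # The log carries a biology-style signature, the chain a resource one.
    category = classify("CUDA error: invalid configuration argument", outer)

print(category)
```

Both signatures are present in this example, and the resource match wins because it appears earlier in the table, which is the behaviour the priority rule describes.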
Sort Run Reports
sort_recording can populate a structured per-run report passed in via the
out_report= keyword argument, capturing per-recording status, timings,
and any classified failure.
- class spikelab.spike_sorting.pipeline.SortRunReport(records=<factory>)[source]
Bases: object
Per-batch summary of a sort_recording() invocation.
Records a RecordingResult for each input recording (both successes and failures) so callers can inspect the outcome programmatically without parsing the per-recording log files.
The report is also serialised to disk:
Per-recording: <results_folder>/recording_report.json (always written).
Per-batch: optional, see the out_report parameter on sort_recording().
- Parameters:
records (list[RecordingResult]) – Per-recording outcomes in the order they were processed. Use the convenience properties for filtered views.
- records: List[RecordingResult]
- add(record)[source]
Append a per-recording result.
- Parameters:
record (RecordingResult) – Outcome of one recording.
- property succeeded: List[RecordingResult]
All successful recordings, in run order.
- property failed: List[RecordingResult]
All non-successful recordings, in run order.
- __init__(records=<factory>)
- class spikelab.spike_sorting.pipeline.RecordingResult(rec_name, rec_path, results_folder, status, wall_time_s, n_curated_units=None, error_class=None, error_message=None, retries_used=0, log_path=None, peak_host_ram_pct=None, peak_gpu_used_pct=None, min_disk_free_gb=None)[source]
Bases: object
Outcome of sorting a single recording within a batch.
- Parameters:
rec_name (str) – Short recording identifier (the file’s basename).
rec_path (str) – Original recording path as a string.
results_folder (str) – Per-recording results folder.
status (str) – One of "success", "failed", "oom_gpu", "oom_host_ram", "oom_memoryerror", "sorter_timeout", "disk_exhausted", "gpu_thermal", "io_stall", "concurrent_sort".
wall_time_s (float) – Wall-clock time spent on this recording (including OOM retries).
n_curated_units (int or None) – Number of curated units when successful, otherwise None.
error_class (str or None) – type(exc).__name__ on failure, otherwise None.
error_message (str or None) – str(exc) on failure (first 500 chars), otherwise None.
retries_used (int) – OOM-retry attempts consumed.
log_path (str or None) – Path to the per-recording Tee log file (sorting_<timestamp>.log). Populated by sort_recording so the batch summary can point users at the log for failure diagnosis.
- __init__(rec_name, rec_path, results_folder, status, wall_time_s, n_curated_units=None, error_class=None, error_message=None, retries_used=0, log_path=None, peak_host_ram_pct=None, peak_gpu_used_pct=None, min_disk_free_gb=None)
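The shape of the report and its filtered views can be mimicked with two small dataclasses. The sketch below uses local stand-ins (`Result`, `Report`) with a subset of the documented fields so it runs without SpikeLab; the recording names and numbers are made up, and the JSON layout is only an illustration, not the format of recording_report.json.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Result:
    """Stand-in with a subset of the documented RecordingResult fields."""
    rec_name: str
    status: str
    wall_time_s: float
    n_curated_units: Optional[int] = None
    error_class: Optional[str] = None


@dataclass
class Report:
    """Stand-in for the batch report: a list plus filtered views."""
    records: List[Result] = field(default_factory=list)

    def add(self, record):
        self.records.append(record)

    @property
    def succeeded(self):
        return [r for r in self.records if r.status == "success"]

    @property
    def failed(self):
        return [r for r in self.records if r.status != "success"]


report = Report()
report.add(Result("a.h5", "success", 812.4, n_curated_units=37))
report.add(Result("b.h5", "oom_gpu", 95.0, error_class="GPUOutOfMemoryError"))

# Summarise failures by status, e.g. to decide which recordings to rerun.
failure_summary = {r.rec_name: r.status for r in report.failed}
payload = json.dumps([asdict(r) for r in report.records], indent=2)
print(failure_summary)
```

Treating every non-"success" status as a failure keeps the two views complementary, so each record appears in exactly one of them.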
After a successful sort, the pipeline writes a human-readable
sorting_report.md next to the results. The functions below let you
regenerate it manually or extract its components programmatically.
- spikelab.spike_sorting.report.generate_sorting_report(results_folder, *, log_path=None, recording_report_path=None, curated_pkl_path=None, config_used_path=None, output_path=None)[source]
Generate a Markdown sorting report for a single recording.
Reads the per-recording Tee log, recording_report.json, config_used.json, and the curated SpikeData pickle (each auto-detected from results_folder when its argument is None), then writes a structured Markdown report describing the run.
The report is the input that the spikelab-spikesorter agent skill consumes; it replaces the manual report-writing instructions with a deterministic, testable artefact.
- Parameters:
results_folder (path-like) – The per-recording results directory. All other paths default to standard names inside this folder when their argument is None.
log_path (path-like or None) – Path to the Tee log file (sorting_<timestamp>.log). None auto-picks the most recent matching file in results_folder.
recording_report_path (path-like or None) – Path to recording_report.json. Default: <results_folder>/recording_report.json.
curated_pkl_path (path-like or None) – Path to the curated SpikeData pickle. Default: <results_folder>/sorted_spikedata_curated.pkl.
config_used_path (path-like or None) – Path to config_used.json. Default: <results_folder>/config_used.json.
output_path (path-like or None) – Where to write the report. Default: <results_folder>/sorting_report.md.
- Returns:
The written file’s path, or None on best-effort failure (the surrounding pipeline never lets a report failure abort the batch).
- Return type:
path (Path or None)
- spikelab.spike_sorting.report.parse_sorting_log(log_text)[source]
Extract structured fields from a Tee-mirrored sorting log.
The sort_recording pipeline writes per-recording stdout to a sorting_<timestamp>.log via Tee. That log includes a structured banner block, ISO-stamped stage banners, the “Curation: N -> M units” line, a closing summary, and any Python traceback on failure. This function pulls those pieces out into a dict suitable for templating into Markdown.
- Parameters:
log_text (str) – Full text of the Tee log file.
- Returns:
Keys include environment (dict), run (dict), stage_timings (list of {name, timestamp} dicts), curation_line (str or None), closing_summary (dict), warnings (list[str]), traceback (str or None), last_lines_before_traceback (list[str]).
- Return type:
info (dict)
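The “Curation: N -> M units” line described above lends itself to a small regular-expression parser. The sketch below is illustrative only and is not the implementation of parse_sorting_log; the helper name parse_curation_line and the whitespace tolerance of the pattern are assumptions.

```python
import re

# Hypothetical helper, not part of spikelab. It matches the
# "Curation: N -> M units" line quoted in the documentation; the
# tolerance for extra whitespace is an assumption.
_CURATION_RE = re.compile(r"Curation:\s*(\d+)\s*->\s*(\d+)\s*units")


def parse_curation_line(log_text):
    """Return (n_before, n_after) from the first curation line, or None."""
    match = _CURATION_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    sample = "stage banner\nCuration: 120 -> 45 units\nclosing summary\n"
    print(parse_curation_line(sample))
```

Returning None when the line is absent mirrors the documented curation_line (str or None) key, which may be missing in a log from a failed run.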
- spikelab.spike_sorting.report.extract_unit_quality_stats(curated_pkl_path)[source]
Read the curated SpikeData pickle and return per-metric summary stats.
Reads attributes from sd.neuron_attributes for SNR, std_norm, amplitude. Computes firing rate from sd.train lengths and sd.length. Returns {} when the pickle cannot be loaded or is empty.
- Parameters:
curated_pkl_path (Path) – Path to sorted_spikedata_curated.pkl.
- Returns:
Dict of metric name → summary stats dict.
- Return type:
stats (dict)
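As a rough sketch of the firing-rate step described above (spike count per unit divided by recording length), the following uses plain lists in place of a SpikeData object. It assumes the recording length is given in milliseconds and that each train is a sequence of spike times; the summary keys (n, mean, median, min, max) are hypothetical and may differ from what extract_unit_quality_stats actually returns.

```python
from statistics import mean, median


def firing_rates_hz(trains, length_ms):
    """Firing rate per unit: spike count divided by length in seconds.

    Assumes length_ms is the recording length in milliseconds.
    """
    if length_ms <= 0:
        return []
    return [len(train) / (length_ms / 1000.0) for train in trains]


def summarize(values):
    """Summary stats for one metric; the key names are hypothetical."""
    if not values:
        return {}
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


if __name__ == "__main__":
    rates = firing_rates_hz([[1, 2, 3, 4], [10, 20]], length_ms=2000.0)
    print(summarize(rates))
```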
Resource Guards
The pipeline ships with a set of preflight checks and live watchdogs that
run automatically during a sort. Most users never need to touch these
directly — they are configured via ExecutionConfig
and surface as classified exceptions when triggered. The pieces below are
exposed for advanced users who want to run preflight checks standalone or
inspect watchdog state.
- spikelab.spike_sorting.guards.run_preflight(config, recording_files, intermediate_folders, results_folders)[source]
Run pre-loop resource checks; return all findings.
Findings are not raised by this function; the caller decides whether to escalate based on ExecutionConfig.preflight_strict.
- Parameters:
config (SortingPipelineConfig) – Pipeline configuration. Reads thresholds from config.execution; sorter selection from config.sorter; recording-side overrides from config.recording; RT-Sort device + probe from config.rt_sort.
recording_files (sequence) – Recording inputs (used for length sanity in future checks; currently unused but kept in the signature for forward compatibility).
intermediate_folders (sequence of path-like) – Per-recording intermediate folders. Disk free space is checked at each folder’s nearest existing ancestor.
results_folders (sequence of path-like) – Per-recording results folders. Disk free space is checked similarly.
- Returns:
All findings produced by the checks. May be empty when the host has plenty of headroom.
- Return type:
findings (list[PreflightFinding])
- Raises:
ValueError – If any of config.execution.preflight_min_*_gb is None. The thresholds must be numeric.
Notes
Empty recording_files, intermediate_folders, or results_folders produce a fail-level “environment” finding (codes no_recordings, no_intermediate_folders, no_results_folders) but do not short-circuit: the host and dependency checks still run.
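The parameter descriptions above say that disk free space is checked at each folder’s nearest existing ancestor, which matters because auto-generated intermediate and results folders may not exist yet at preflight time. A minimal sketch of that idea using only the standard library follows; the function names are hypothetical, and treating 1 GB as 10**9 bytes is an assumption.

```python
import shutil
from pathlib import Path


def nearest_existing_ancestor(path):
    """Walk up from path until a directory that exists is found."""
    current = Path(path).resolve()
    while not current.exists():
        parent = current.parent
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return current


def free_gb(path):
    """Free space at the nearest existing ancestor, in GB (10**9 bytes, assumed)."""
    usage = shutil.disk_usage(nearest_existing_ancestor(path))
    return usage.free / 1e9


if __name__ == "__main__":
    print(free_gb("results/not_created_yet/recording_01"))
```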
- class spikelab.spike_sorting.guards.PreflightFinding(level, code, message, remediation=None, category='resource')[source]
A single resource-check finding from run_preflight().
- Parameters:
level (str) – Either "warn" or "fail".
code (str) – Short stable identifier (e.g. "low_disk_inter", "low_vram").
message (str) – One-line description of what was observed.
remediation (str or None) – Suggested action for the operator.
category (str) – One of "resource" or "environment"; controls which exception subclass is raised when the finding is escalated.
- __init__(level, code, message, remediation=None, category='resource')
- class spikelab.spike_sorting.guards.HostMemoryWatchdog(warn_pct=85.0, abort_pct=92.0, poll_interval_s=2.0, warn_repeat_s=30.0, kill_grace_s=5.0)[source]
Daemon-thread watchdog that aborts the sort on host RAM pressure.
Use as a context manager. While the context is active a daemon thread polls system memory; on abort it terminates registered subprocesses and injects a KeyboardInterrupt into the main thread.
- Parameters:
warn_pct (float) – System memory percentage at which the watchdog prints a (rate-limited) warning. Defaults to 85.0.
abort_pct (float) – System memory percentage at which the watchdog terminates registered subprocesses and aborts the main thread. Defaults to 92.0.
poll_interval_s (float) – Seconds between polls. Defaults to 2.0.
warn_repeat_s (float) – Minimum seconds between repeated warnings at the same level. Defaults to 30.0.
kill_grace_s (float) – Default seconds between terminate() and kill() for registered subprocesses. Per-subprocess overrides are accepted in register_subprocess(). Defaults to 5.0.
Notes
Degrades to a no-op when psutil is missing.
Safe to nest: the inner context is the active one for the duration of its body, and the outer context resumes on exit.
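The terminate()-then-kill() sequence governed by kill_grace_s can be sketched with the standard subprocess module. This is an illustration of the pattern, not the watchdog’s own code; terminate_with_grace is a hypothetical helper name.

```python
import subprocess
import sys


def terminate_with_grace(popen, kill_grace_s=5.0):
    """Ask the child to exit, then force-kill it if it outlives the grace period."""
    if popen.poll() is not None:
        return popen.returncode  # already exited
    popen.terminate()
    try:
        return popen.wait(timeout=kill_grace_s)
    except subprocess.TimeoutExpired:
        popen.kill()
        return popen.wait()


if __name__ == "__main__":
    # A child that would otherwise sleep for a minute.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(terminate_with_grace(child, kill_grace_s=2.0))
```

A short grace period gives a well-behaved child time to flush and exit on its own, while the kill() fallback bounds how long an unresponsive child can hold memory.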
- __init__(warn_pct=85.0, abort_pct=92.0, poll_interval_s=2.0, warn_repeat_s=30.0, kill_grace_s=5.0)[source]
- register_subprocess(popen, *, kill_grace_s=None)[source]
Track a subprocess for termination on watchdog abort.
- Parameters:
popen (subprocess.Popen) – The child process handle. The watchdog calls terminate() first, then kill() after kill_grace_s seconds if the process is still alive.
kill_grace_s (float or None) – Override the default grace period for this subprocess. None uses the watchdog’s kill_grace_s.
- unregister_subprocess(popen)[source]
Stop tracking a previously registered subprocess.
- Parameters:
popen (subprocess.Popen) – Handle previously passed to register_subprocess(). No-op if not registered.
- register_kill_callback(callback)[source]
Track a zero-arg callable to invoke on watchdog abort.
Used for kill targets that are not subprocess.Popen objects, such as Docker containers, Kubernetes pods, and custom cleanup hooks. The callback runs after any registered subprocesses have been terminated. Exceptions raised by a callback are logged but do not prevent other callbacks from running.
- Parameters:
callback (Callable[[], None]) – Zero-arg function. Should be idempotent and tolerate being called on an already-stopped target, because the watchdog cannot tell whether the kill target is still alive.
Notes
To allow the kill target to be garbage-collected even while registered, build the callback with a weakref to the target rather than capturing it directly. See docker_utils.patched_container_client for the container-kill pattern.
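The weakref advice in the note above can be sketched as follows. FakeContainer and make_kill_callback are hypothetical names used only for illustration; they are not part of docker_utils, and the real container-kill pattern may differ.

```python
import weakref


class FakeContainer:
    """Hypothetical stand-in for a kill target such as a container handle."""

    def __init__(self):
        self.killed = False

    def kill(self):
        self.killed = True


def make_kill_callback(target):
    """Build a zero-arg callback that holds only a weak reference to target."""
    ref = weakref.ref(target)

    def _callback():
        obj = ref()
        if obj is None:
            return  # target already garbage-collected: nothing to do
        obj.kill()  # assumed idempotent on the target's side

    return _callback


if __name__ == "__main__":
    container = FakeContainer()
    callback = make_kill_callback(container)
    callback()
    print(container.killed)
```

Because the closure captures only the weak reference, registering the callback does not keep the target alive, and calling it after the target is gone is a harmless no-op.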
- unregister_kill_callback(callback)[source]
Stop tracking a previously registered kill callback.
- Parameters:
callback (Callable[[], None]) – Callable previously passed to register_kill_callback(). No-op if not registered. Identity comparison is used.
- interrupt_delivery_failed()[source]
Return True if the trip fired but _thread.interrupt_main raised.
When True, host protection ran successfully (subprocesses terminated, kill callbacks invoked) but the main thread did not receive a KeyboardInterrupt. The pipeline’s catch site checks this to reclassify a downstream BrokenPipeError / RuntimeError (caused by the now-dead subprocess) as the appropriate watchdog error.
- make_error(message=None)[source]
Build a HostMemoryWatchdogError from the trip state.
- Parameters:
message (str or None) – Override the default message.
- Returns:
Exception ready to raise.
- Return type:
err (HostMemoryWatchdogError)
- class spikelab.spike_sorting.guards.GpuMemoryWatchdog(device_index=0, *, warn_pct=85.0, abort_pct=95.0, poll_interval_s=2.0, warn_repeat_s=30.0, kill_grace_s=5.0, warn_temp_c=85.0, abort_temp_c=92.0, monitor_throttle_reasons=True)[source]
Daemon-thread watchdog that aborts on GPU VRAM or thermal pressure.
Use as a context manager around the per-recording sort. Each poll inspects three signals:
- VRAM usage: crossing warn_pct prints a rate-limited warning; crossing abort_pct builds a GpuMemoryWatchdogError, terminates registered subprocesses, runs kill callbacks, and raises into the main thread.
- Device temperature: crossing warn_temp_c prints a rate-limited warning; crossing abort_temp_c aborts with a GpuThermalWatchdogError. Sustained operation above the GPU’s thermal junction limit risks driver-level throttling that silently degrades sort output.
- Active throttle reasons: when the device reports SW/HW power-cap or thermal slowdown, prints a rate-limited warning (no abort: the device is already protecting itself).
- Parameters:
device_index (int) – GPU index to monitor. Use resolve_active_device() to pick from the config.
warn_pct (float) – Used-memory percentage at which to warn. Defaults to 85.0.
abort_pct (float) – Used-memory percentage at which to abort. Defaults to 95.0.
poll_interval_s (float) – Seconds between polls. Defaults to 2.0.
warn_repeat_s (float) – Minimum seconds between repeated warnings. Defaults to 30.0.
kill_grace_s (float) – Seconds between terminate() and kill() on registered subprocesses.
warn_temp_c (float or None) – Temperature in degrees Celsius at which to warn. None disables the warn-stage temp check. Defaults to 85.0.
abort_temp_c (float or None) – Temperature at which to abort. None disables thermal aborts. Defaults to 92.0.
monitor_throttle_reasons (bool) – When True, surface NVML throttle reasons (SW power cap, HW thermal slowdown, HW power brake) as rate-limited warnings. Defaults to True.
Notes
Thermal monitoring requires pynvml; the nvidia-smi-only fallback path used by read_gpu_memory() does not surface temperature. When pynvml is missing, thermal/throttle checks silently degrade while VRAM monitoring continues via nvidia-smi.
Disabled (no-op context manager) when no usable GPU info source is available.
- __init__(device_index=0, *, warn_pct=85.0, abort_pct=95.0, poll_interval_s=2.0, warn_repeat_s=30.0, kill_grace_s=5.0, warn_temp_c=85.0, abort_temp_c=92.0, monitor_throttle_reasons=True)[source]
- interrupt_delivery_failed()[source]
Return True if the trip fired but _thread.interrupt_main raised.
When True, GPU protection ran successfully (subprocesses terminated, kill callbacks invoked) but the main thread did not receive a KeyboardInterrupt. The pipeline’s catch site checks this to reclassify a downstream exception caused by the now-dead subprocess.
- make_error(message=None)[source]
Build the trip-kind-appropriate watchdog error.
- Parameters:
message (str or None) – Override the default message.
- Returns:
GpuMemoryWatchdogError for VRAM trips, GpuThermalWatchdogError for temperature trips. Falls back to a memory-shaped error when the trip kind is unset.
- Return type:
err
- register_subprocess(popen, *, kill_grace_s=None)[source]
Track a subprocess for termination on watchdog abort.
- unregister_subprocess(popen)[source]
Stop tracking a previously registered subprocess.
- class spikelab.spike_sorting.guards.DiskUsageWatchdog(folder, *, warn_free_gb=5.0, abort_free_gb=1.0, poll_interval_s=10.0, warn_repeat_s=30.0, sorter='sort', projected_need_gb=None, popen=None, kill_callback=None, kill_grace_s=5.0)[source]
Daemon watchdog that aborts the sort on low free disk space.
Use as a context manager around the per-recording sort. While active, a daemon thread polls free space on folder every poll_interval_s seconds. Crossing warn_free_gb prints a rate-limited warning; crossing abort_free_gb builds a DiskExhaustionReport, terminates any registered subprocess, and runs an optional kill callback (mirroring the in-process kill path used by LogInactivityWatchdog).
- Parameters:
folder (Path) – The folder to monitor (typically the per-recording intermediate folder).
warn_free_gb (float) – Free-disk threshold at which to print a warning. Defaults to 5.0.
abort_free_gb (float) – Free-disk threshold at which to abort the sort. Defaults to 1.0.
poll_interval_s (float) – Seconds between polls. Defaults to 10.0.
warn_repeat_s (float) – Minimum seconds between repeated warnings. Defaults to 30.0.
sorter (str) – Short identifier used in diagnostic prints and in the resulting DiskExhaustionError.
projected_need_gb (float or None) – Optional sorter-specific disk projection; included verbatim in the trip report when present.
popen (subprocess.Popen or None) – Subprocess to terminate on trip (e.g. KS2 MATLAB child).
kill_callback (Callable[[], None] or None) – Optional zero-arg callable invoked on trip; used by in-process sorters to install a two-stage interrupt-then-os._exit fallback.
kill_grace_s (float) – Seconds between terminate() and kill() on a registered subprocess.
Notes
The watchdog only trips once. After trip the polling thread exits.
Disabled (no-op) when abort_free_gb is non-positive or when neither a popen nor a kill_callback is provided.
- __init__(folder, *, warn_free_gb=5.0, abort_free_gb=1.0, poll_interval_s=10.0, warn_repeat_s=30.0, sorter='sort', projected_need_gb=None, popen=None, kill_callback=None, kill_grace_s=5.0)[source]
- report()[source]
Return the DiskExhaustionReport if the watchdog tripped.
- make_error(message=None)[source]
Build a DiskExhaustionError from the trip state.
- Parameters:
message (str or None) – Override the default message.
- Returns:
Exception ready to raise.
- Return type:
err (DiskExhaustionError)
- class spikelab.spike_sorting.guards.LogInactivityWatchdog(log_path, popen, inactivity_s, *, sorter, poll_interval_s=5.0, kill_grace_s=5.0, kill_callback=None)[source]
Daemon watchdog that kills a subprocess on sorter-log inactivity.
Use as a context manager around the call that waits for the sorter subprocess. While the context is active a daemon thread polls log_path (via os.stat().st_mtime) every poll_interval_s. If the file’s mtime has not advanced for inactivity_s seconds the watchdog terminates the registered subprocess and records the trip; the wait then returns and the runner can detect the kill via tripped() and raise SorterTimeoutError.
- Parameters:
log_path (Path) – Path to the sorter’s log file. The file does not need to exist when the watchdog starts: it is polled for first appearance, and the watchdog is forgiving about “no log yet” until the file shows up. The pre-existing mtime (from a previous run, if any) is recorded at start so an old stale log doesn’t trip immediately.
popen (subprocess.Popen or None) – Subprocess handle to terminate on trip. Pass None when the sort runs in-process; see kill_callback instead.
inactivity_s (float) – Inactivity tolerance in seconds. Use compute_inactivity_timeout_s() to derive a sensible value from recording duration.
sorter (str) – Short identifier of the sorter (used for logging and the resulting SorterTimeoutError).
poll_interval_s (float) – Seconds between mtime polls. Defaults to 5.0.
kill_grace_s (float) – Seconds between terminate() and kill() if the subprocess does not exit. Defaults to 5.0.
kill_callback (Callable[[], None] or None) – Optional callback invoked after the subprocess termination step. Used by in-process backends (KS4 host, RT-Sort) to install a two-stage kill: _thread.interrupt_main first, then os._exit if Python is unresponsive. See make_in_process_kill_callback().
Notes
When inactivity_s is None, or when neither popen nor kill_callback is provided, the watchdog is a no-op context manager. This makes it safe to drop in unconditionally; pass inactivity_s=None to disable.
The watchdog only trips once. After trip, the polling thread exits.
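The mtime-based inactivity check described above can be sketched as a single poll step. This is an illustration under stated assumptions, not the watchdog’s implementation: log_is_inactive is a hypothetical helper, the caller is assumed to carry the last seen mtime and the time of the last observed change between polls, and a missing log file is treated as “no log yet” rather than as inactivity.

```python
import os
import time


def log_is_inactive(log_path, baseline_mtime, last_change_t, inactivity_s, now=None):
    """Poll log_path once; return (inactive, mtime, last_change_t).

    baseline_mtime is the last mtime seen (None if the file has not
    appeared yet). A missing file never counts as inactive here.
    """
    now = time.monotonic() if now is None else now
    try:
        mtime = os.stat(log_path).st_mtime
    except FileNotFoundError:
        return False, baseline_mtime, now
    if baseline_mtime is None or mtime > baseline_mtime:
        return False, mtime, now  # the log advanced: reset the clock
    return (now - last_change_t) >= inactivity_s, mtime, last_change_t


if __name__ == "__main__":
    print(log_is_inactive("sorter.log", None, time.monotonic(), 60.0))
```

Seeding baseline_mtime with the file’s mtime at start corresponds to the documented behaviour that a stale log from a previous run does not trip the watchdog immediately.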
- __init__(log_path, popen, inactivity_s, *, sorter, poll_interval_s=5.0, kill_grace_s=5.0, kill_callback=None)[source]
- make_error(message=None)[source]
Build a SorterTimeoutError from the trip state.
- Parameters:
message (str or None) – Override the default message.
- Returns:
Exception ready to raise.
- Return type:
err (SorterTimeoutError)
- class spikelab.spike_sorting.guards.IOStallWatchdog(folder=None, *, pids=None, include_descendants=True, stall_s=300.0, poll_interval_s=10.0, warn_repeat_s=60.0, kill_grace_s=5.0)[source]
Daemon-thread watchdog that aborts the sort on I/O stalls.
Use as a context manager around the per-recording sort. Operates in one of two modes (chosen at construction):
- Device mode (pass folder): polls read_bytes + write_bytes for the volume holding the folder every poll_interval_s. Catches kernel-wide I/O hangs but is sensitive to ambient I/O on the same disk.
- Process mode (pass pids): polls psutil.Process(pid).io_counters() summed across the registered PIDs (and their descendants by default). Detects stalls in the sort process tree specifically; immune to ambient I/O from unrelated processes on the same device.
Either folder or pids (or both) must be provided. When both are given, process mode is used. Additional PIDs can be registered after construction via register_pid(), which is useful for catching e.g. a Docker container PID after the container actually starts.
On stall, the watchdog builds an IOStallError, terminates registered subprocesses, runs kill callbacks, and raises into the main thread via _thread.interrupt_main.
- Parameters:
folder (Path or None) – A path on the volume to monitor (typically the per-recording intermediate folder). Provide for device-mode monitoring. None to skip device monitoring entirely.
pids (Sequence[int] or None) – Process IDs to monitor in process mode. Defaults to None (device mode). The watchdog sums I/O bytes across these processes and (if include_descendants) their entire descendant trees on every poll.
include_descendants (bool) – When in process mode, recurse into each registered PID’s children on every poll so subprocesses spawned by the sort (e.g. spikeinterface workers, KS2 MATLAB child) are accounted for. Defaults to True. Set False if you want to detect a stall in only the registered PIDs without their descendants; this is rare and mostly useful for debugging.
stall_s (float) – Inactivity tolerance for the byte counter, in seconds. Defaults to 300 (5 min): long enough to span normal write bursts and quiet stretches, short enough to flag genuinely hung mounts.
poll_interval_s (float) – Seconds between polls. Defaults to 10.0.
warn_repeat_s (float) – Minimum seconds between repeated warnings.
kill_grace_s (float) – Seconds between terminate() and kill() for registered subprocesses.
Notes
Process mode requires psutil. Device mode is also disabled when psutil is missing or when no device can be resolved for folder. To skip the I/O-stall check intentionally, omit any register_kill_callback calls; the watchdog still polls but has nothing to abort.
Unlike HostMemoryWatchdog, this watchdog does not accept subprocess registrations, only kill callbacks. A Docker-backed sort whose container is registered with the host watchdog will not have its container killed when the I/O stall watchdog trips.
Docker container processes are visible to the host’s psutil but are NOT children of the orchestrating Python process, because the Docker daemon is the parent. To monitor a Docker-backed sort in process mode, register the container’s main PID explicitly via register_pid() once it’s known (docker inspect --format '{{.State.Pid}}' <id>).
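The stall criterion itself (a byte counter that has not changed for stall_s seconds) can be sketched independently of psutil. In the hypothetical StallDetector below, the caller supplies the summed read_bytes + write_bytes value and a timestamp on each poll; how the watchdog actually samples and stores these values is not documented here, so this is only an illustration of the logic.

```python
class StallDetector:
    """Track a byte counter over time and flag stalls (hypothetical sketch)."""

    def __init__(self, stall_s=300.0):
        self.stall_s = stall_s
        self._last_bytes = None
        self._last_progress_t = None

    def poll(self, total_bytes, now):
        """Feed one sample; return True when no progress was seen for stall_s."""
        if self._last_bytes is None or total_bytes != self._last_bytes:
            # First sample, or the counter moved: record progress.
            self._last_bytes = total_bytes
            self._last_progress_t = now
            return False
        return (now - self._last_progress_t) >= self.stall_s


if __name__ == "__main__":
    detector = StallDetector(stall_s=30.0)
    for t, counter in [(0, 100), (10, 100), (30, 100), (40, 150)]:
        print(t, detector.poll(counter, now=t))
```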
- __init__(folder=None, *, pids=None, include_descendants=True, stall_s=300.0, poll_interval_s=10.0, warn_repeat_s=60.0, kill_grace_s=5.0)[source]
- interrupt_delivery_failed()[source]
Return True if the trip fired but _thread.interrupt_main raised.
When True, host I/O protection ran successfully (kill callbacks invoked) but the main thread did not receive a KeyboardInterrupt. The pipeline’s catch site checks this to reclassify a downstream exception.
- register_kill_callback(callback)[source]
Track a zero-arg callable to invoke on watchdog abort.
- register_pid(pid)[source]
Add a PID to the process-mode poll set.
Useful for tracking processes that don’t exist yet at watchdog construction, e.g. registering the Docker container’s main PID once the container has actually started, or registering a sorter subprocess after Popen returns.
No-op when called in device mode (the watchdog isn’t polling per-PID counters there). The PID is added atomically; the next poll picks it up.
- Parameters:
pid (int) – The PID to monitor. Must be a positive integer.
- Raises:
ValueError – If pid is not a positive integer.
- class spikelab.spike_sorting.guards.DiskExhaustionReport(folder, free_gb_at_trip, abort_threshold_gb, projected_need_gb=None, bytes_consumed_during_sort=0.0, top_consumers=<factory>, suggested_actions=<factory>)[source]
Diagnostic payload built when the disk watchdog trips.
- Parameters:
folder (str) – The folder whose free space crossed the abort threshold.
free_gb_at_trip (float) – Free disk space (GB) at the trip moment.
abort_threshold_gb (float) – Configured abort threshold (GB).
projected_need_gb (float or None) – Sorter-specific projected on-disk footprint in GB when known (e.g. RT-Sort’s estimate_rt_sort_intermediate_gb value).
bytes_consumed_during_sort (float) – Bytes consumed inside folder since the watchdog started, i.e. how much this sort has written. Useful for distinguishing “I started near full and crossed the line” vs “I wrote everything”.
top_consumers (list[tuple[str, float]]) – Up to 10 largest files inside folder (depth-bounded os.walk) as (path, gb) tuples, sorted descending. Helps the operator identify what to clean up.
suggested_actions (list[str]) – Free-form text hints. The watchdog seeds these from the trip context; callers can extend.
- __init__(folder, free_gb_at_trip, abort_threshold_gb, projected_need_gb=None, bytes_consumed_during_sort=0.0, top_consumers=<factory>, suggested_actions=<factory>)
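The top_consumers field is described as a depth-bounded os.walk that yields the largest files as (path, gb) tuples. A minimal sketch of such a scan is shown below; the function name, the max_depth value of 3, and the use of 10**9 bytes per GB are assumptions, since the documentation does not state the actual depth bound or unit convention.

```python
import os


def top_consumers(folder, limit=10, max_depth=3):
    """Return up to `limit` (path, gb) tuples for the largest files under folder."""
    base_depth = os.path.abspath(folder).rstrip(os.sep).count(os.sep)
    sizes = []
    for root, dirs, files in os.walk(folder):
        depth = os.path.abspath(root).rstrip(os.sep).count(os.sep) - base_depth
        if depth >= max_depth:
            dirs[:] = []  # stop descending below the depth bound
        for name in files:
            path = os.path.join(root, name)
            try:
                sizes.append((path, os.path.getsize(path) / 1e9))
            except OSError:
                continue  # file vanished or is unreadable: skip it
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes[:limit]


if __name__ == "__main__":
    for path, gb in top_consumers("."):
        print(f"{gb:.6f} GB  {path}")
```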
Pipeline Canary
- spikelab.spike_sorting.canary.run_canary(config, recording, rec_path, inter_path, *, sorter_name=None, rec_name='canary', rng=None)[source]
Run a short-window smoke test of the configured backend.
Builds a canary clone of config (see _build_canary_config()), spins up a fresh backend instance against that clone, and invokes spikelab.spike_sorting.pipeline.process_recording() against a <inter_path>/_canary/ subdirectory.
- Parameters:
config (SortingPipelineConfig) – Live pipeline configuration. Read but never mutated.
recording (Any) – Pre-loaded BaseRecording for the canary, or None when only a path is available.
rec_path (Any) – Path to the recording on disk. Used by the backend loader when recording is None.
inter_path (Any) – The recording’s intermediate folder. The canary writes under a _canary sub-folder so the real sort’s artefacts are untouched.
sorter_name (str or None) – Override the sorter resolved from config.sorter.sorter_name. Mostly used by tests.
rec_name (str) – Short identifier for the canary in log output.
rng (np.random.Generator or None) – Optional RNG passed through to process_recording for reproducibility.
- Returns:
A classified exception when the canary discovered a failure the full sort would also have hit; None when the canary succeeded or when the canary itself hit an unexpected non-classified failure (which the live watchdogs are responsible for during the real run).
- Return type:
result (BaseException or None)
Stimulation Sorting
Helpers for spike-sorting recordings with electrical stimulation: artifact removal, alignment recentering (single-pulse and multi-pulse), and the end-to-end stim-aware pipeline. See the Stimulation Artifact Removal section of the guide for usage examples.
- spikelab.spike_sorting.stim_sorting.sort_stim_recording(stim_recording, rt_sort, stim_times_ms, pre_ms, post_ms, fs_Hz=None, *, artifact_method='polynomial', artifact_window_ms=10.0, saturation_threshold=None, baseline_threshold=None, poly_order=3, artifact_window_only=True, max_stim_offset_ms=50.0, peak_mode='abs_max', n_reference_channels=8, prewindow_ms=5.0, multi_peak=False, multi_peak_select='first', multi_peak_threshold=0.6, multi_peak_min_separation_ms=2.0, model=None, model_path=None, recording_window_ms=None, verbose=True)[source]
Sort spikes in a stimulation recording using pre-trained RT-Sort sequences.
Takes a raw stimulation recording and a trained RTSort object (or path to a saved one produced by sort_recording(..., sorter="rt_sort")), removes stimulation artifacts, runs offline spike sorting, and returns a SpikeSliceStack of sorted spikes aligned to the corrected stimulation event times.
Memory model. When stim_recording is a path or a lazy SpikeInterface recording, the pipeline processes one per-event time chunk at a time (peak RAM ≈ one chunk’s working set, typically 100-200 MB on MaxOne, independent of recording duration). When stim_recording is a pre-materialised np.ndarray, the full-recording path is used instead (caller has already paid the memory cost).
- Parameters:
stim_recording – The stimulation recording. Can be:
- str or Path to a recording file (Maxwell .h5 or NWB). Chunked path.
- A SpikeInterface BaseRecording object. Chunked path.
- np.ndarray of shape (channels, samples). Full-recording path (no chunking possible).
rt_sort – The trained RT-Sort object or path to its pickle.
stim_times_ms (array-like) – Logged stimulation event times in milliseconds.
pre_ms (float) – Output peri-event window radius before each stim event, in milliseconds.
post_ms (float) – Output peri-event window radius after each stim event, in milliseconds.
fs_Hz (float or None) – Sampling frequency in Hz. Required for ndarray input; inferred from the recording object otherwise.
artifact_method (str) – "polynomial" (default) or "blank". Passed to remove_stim_artifacts.
artifact_window_ms (float) – Max artifact tail duration after the last desaturation. Default 10.0.
saturation_threshold (float or None) – Saturation voltage threshold. None auto-detects (gain-anchored from recording metadata if available).
baseline_threshold (float or None) – Baseline envelope threshold. None auto-detects from pre-stim MAD.
poly_order (int) – Polynomial order for detrend. Default 3.
artifact_window_only (bool) – Only process around stim events. Default True.
multi_peak (bool) – When True, enables multi-pulse-aware recentering: the search window is interpreted as potentially containing multiple pulses (a stim train), and the alignment target is the first or last qualifying pulse rather than the strongest. Default False. When False, behaviour is identical to the pre-multi-peak implementation. See recenter_stim_times() for details.
multi_peak_select (str) – When multi_peak=True, which qualifying peak to lock onto: "first" (default) or "last".
multi_peak_threshold (float) – When multi_peak=True, peaks below this fraction of the largest peak in the search window are ignored. Default 0.6.
multi_peak_min_separation_ms (float) – When multi_peak=True, minimum spacing between candidate peaks. Default 2.0.
max_stim_offset_ms (float) – Search window radius for stim time recentering. Default 50.0.
peak_mode (str) – Alignment target for recenter_stim_times. One of "abs_max" (default), "pos_peak", "neg_peak", "down_edge", "up_edge". For biphasic anodic-first pulses where the AP is triggered at the up→down current reversal, use "down_edge".
n_reference_channels (int) – Top-K highest-amplitude channels summed to form the signed reference trace for non-abs_max peak modes. Default 8.
prewindow_ms (float) – For down_edge / up_edge, radius of the pre-window before the primary peak. Default 5.0.
model (ModelSpikeSorter or None) – Detection model instance for load_rt_sort when rt_sort is a path.
model_path (str or Path or None) – Path to a detection model folder for load_rt_sort when rt_sort is a path.
recording_window_ms (tuple or None) – (start_ms, end_ms) sub-window to restrict processing to. Only events whose peri-event window falls entirely within this range are sorted. None processes the full recording.
verbose (bool) – Print progress messages. Default True.
- Returns:
Event-aligned spike slice stack with one slice per (corrected) stim event. Each slice spans [-pre_ms, +post_ms] relative to the stim time.
- Return type:
stim_slices (SpikeSliceStack)
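The recording_window_ms rule above (only events whose peri-event window falls entirely within the range are sorted) amounts to a simple interval test. The sketch below illustrates it with plain Python lists; events_within_window is a hypothetical helper, and treating the window boundaries as inclusive is an assumption.

```python
def events_within_window(stim_times_ms, pre_ms, post_ms, recording_window_ms=None):
    """Keep events whose [t - pre_ms, t + post_ms] window lies inside the range."""
    if recording_window_ms is None:
        return list(stim_times_ms)  # None means: use the full recording
    start_ms, end_ms = recording_window_ms
    return [
        t for t in stim_times_ms
        if t - pre_ms >= start_ms and t + post_ms <= end_ms
    ]


if __name__ == "__main__":
    kept = events_within_window([5, 50, 95, 150], pre_ms=10, post_ms=10,
                                recording_window_ms=(0, 100))
    print(kept)
```

In this example the events at 5 ms and 95 ms are dropped because their windows extend past the range edges, even though the event times themselves fall inside it.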
- spikelab.spike_sorting.stim_sorting.preprocess_stim_artifacts(recording, stim_times_ms, output_path=None, *, method='polynomial', artifact_window_ms=10.0, recenter=True, max_offset_ms=50.0, poly_order=3, saturation_threshold=None, baseline_threshold=None, artifact_window_only=True, return_scaled=False, dtype='float32')[source]
Remove stim artifacts and return a new SpikeInterface recording.
Materialises recording.get_traces() to an ndarray, optionally recenters the stim times to their artifact peaks, runs remove_stim_artifacts(), and wraps the cleaned traces in either a BinaryRecordingExtractor (when output_path is given) or a NumpyRecording. Channel IDs, locations, gains, and offsets are copied from the input recording.
- Parameters:
recording (BaseRecording) – SpikeInterface recording to clean. Single-segment only.
stim_times_ms (array-like) – Logged stim event times in milliseconds (len(stim_times_ms) may be 0, in which case recentering/artifact removal are skipped and the recording is returned unchanged aside from the BinaryRecordingExtractor wrap when output_path is given).
output_path (str or Path, optional) – When provided, cleaned traces are written as a float32 binary (interleaved channels, i.e. shape (num_samples, num_channels) on disk) and a BinaryRecordingExtractor is returned. Parent directories are created as needed. When None (default), a NumpyRecording is returned, which is NOT dumpable for Docker-based sorters.
method (str) – "polynomial" (default) or "blank"; see remove_stim_artifacts(). Polynomial detrend preserves spikes in the 0–10 ms post-stim window (the smooth fit can’t capture a ~1 ms spike feature) and is safe by default thanks to poly_clamp_factor: divergent fits at extreme stim amplitudes are caught and downgraded to blank automatically, with one summary warning per call. Use "blank" only when the post-stim window is genuinely irrelevant to the analysis, or when the clamp warning fires on a non-trivial fraction of events (in which case a uniform blank is cleaner than mixing per-event polynomial subtraction with per-event clamp blanks).
artifact_window_ms (float) – Length of the post-stim artifact window in ms. Default 10.0.
recenter (bool) – When True (default), align logged stim times to the actual artifact peaks via recenter_stim_times() before artifact removal. Set False when the supplied times are already peak-aligned.
max_offset_ms (float) – Maximum recentering shift, passed to recenter_stim_times(). Default 50.0.
poly_order (int) – Polynomial order for method="polynomial". Default 3.
saturation_threshold (float, optional) – Override the auto-detected thresholds used by remove_stim_artifacts().
baseline_threshold (float, optional) – Override the auto-detected thresholds used by remove_stim_artifacts().
artifact_window_only (bool) – When True (default), only the windows around stim events are processed; when False, a global sliding-window detrend is applied (useful for very frequent stim protocols).
return_scaled (bool) – Whether to materialise µV-scaled traces from recording. Default False, which matches the recording’s native dtype/units. Set True to force a µV-scaled float output when the recording exposes gains/offsets. Forwarded as return_in_uV on newer SpikeInterface versions and return_scaled on older ones.
dtype (str) – dtype of the cleaned output (both for in-memory and on-disk representations). Default "float32".
- Returns:
cleaned_recording (BaseRecording) – New SpikeInterface recording with artifacts removed. Channel IDs, locations, gains, and offsets are inherited from the input.
metadata (dict) – Artifact-removal metadata. Keys:
- stim_times_ms_logged: original stim times as passed in
- stim_times_ms_corrected: recentered stim times (equals stim_times_ms_logged when recenter=False)
- recenter_offsets_ms: corrected - logged offsets
- blanked_fraction: overall fraction of samples blanked
- blanked_fraction_per_channel: per-channel blanked fractions, shape (num_channels,)
- spikelab.spike_sorting.stim_sorting.recenter_stim_times(traces, stim_times_ms, fs_Hz, max_offset_ms=50.0, *, peak_mode='abs_max', n_reference_channels=8, prewindow_ms=5.0, warn_offset_ms=3.0, multi_peak=False, multi_peak_select='first', multi_peak_threshold=0.6, multi_peak_min_separation_ms=2.0)[source]
Find actual stimulation artifact times near logged stim times.
For each logged stim time, searches a window of ±max_offset_ms in the raw voltage traces and returns the sample at the alignment point selected by peak_mode. This corrects for timing offsets between the stimulation hardware trigger log and the artifact in the recording.
- Parameters:
traces (np.ndarray) – Raw voltage traces, shape (channels, samples).
stim_times_ms (array-like) – Logged stimulation event times in milliseconds. Need not be sorted.
fs_Hz (float) – Sampling frequency in Hz.
max_offset_ms (float) – Radius of the search window around each logged stim time, in milliseconds. Default 50.0.
peak_mode (str) – Alignment target. One of:
"abs_max" (default): largest |voltage| across channels. Backward-compatible with the pre-peak_mode API.
"pos_peak": largest positive voltage in the top-K summed reference trace.
"neg_peak": most negative voltage in the top-K summed reference trace.
"down_edge": up→down transition for biphasic anodic-first pulses (see module docstring).
"up_edge": down→up transition for biphasic cathodic-first pulses.
n_reference_channels (int) – Number of highest-amplitude channels summed to build the signed reference trace for non-abs_max modes. Default 8. Ignored for abs_max.
prewindow_ms (float) – For down_edge/up_edge, radius of the pre-window in which to search for the preceding opposite-polarity peak. Default 5.0.
warn_offset_ms (float or None) – When the median |corrected - logged| shift exceeds this threshold, emit a UserWarning. A large systematic shift usually means a fixed hardware-vs-log delay, a wrong time column in the stim log, or a unit mismatch (ms vs s vs samples) rather than genuine jitter. Set to None to silence. Default 3.0 ms, well above one-sample jitter at 20–30 kHz.
multi_peak (bool) – Opt-in support for multi-pulse stim trains. When True, the search window is treated as potentially containing multiple stimulation pulses (e.g. a 100 Hz train), and the alignment target is the first or last qualifying pulse rather than the strongest one. Default False, which preserves the backward-compatible single-peak behavior.
multi_peak_select (str) – When multi_peak=True, which qualifying peak to lock onto. "first" (default) = first pulse onset (matches “first-pulse alignment” used for train PSTHs). "last" = last pulse onset (useful for studying after-train rebound). Ignored when multi_peak=False.
multi_peak_threshold (float) – When multi_peak=True, only peaks whose amplitude is at least this fraction of the largest peak in the search window are considered “real pulses”. Default 0.6, which accepts pulses up to 40% weaker than the strongest while still rejecting noise.
multi_peak_min_separation_ms (float) – When multi_peak=True, the minimum spacing between candidate peaks. Prevents multi-sample peaks of a single pulse from being counted as separate pulses. Default 2.0 ms, well below any sensible inter-pulse interval (5 ms = 200 Hz; 10 ms = 100 Hz).
- Returns:
corrected_ms (np.ndarray) – Corrected stim times in milliseconds, same length as stim_times_ms. Events whose search window extends outside the recording are clipped to the recording boundary.
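The window search described for the default "abs_max" mode can be sketched in plain NumPy as follows. This is an illustrative reimplementation under simplifying assumptions, not the library's code: the function name recenter_abs_max is hypothetical, only the "abs_max" target is covered, and the warning and multi-peak logic are omitted.

```python
import numpy as np

def recenter_abs_max(traces, stim_times_ms, fs_Hz, max_offset_ms=50.0):
    """Illustrative only: snap each logged time to the largest |voltage| nearby."""
    traces = np.asarray(traces, dtype=float)
    n_samples = traces.shape[1]
    radius = int(round(max_offset_ms * fs_Hz / 1000.0))
    corrected = []
    for t_ms in np.asarray(stim_times_ms, dtype=float):
        center = int(round(t_ms * fs_Hz / 1000.0))
        # Keep the window inside the recording (assumed clipping behavior).
        center = min(max(center, 0), n_samples - 1)
        lo = max(center - radius, 0)
        hi = min(center + radius + 1, n_samples)
        # Collapse channels to the per-sample maximum absolute voltage.
        envelope = np.abs(traces[:, lo:hi]).max(axis=0)
        peak = lo + int(np.argmax(envelope))
        corrected.append(peak * 1000.0 / fs_Hz)
    return np.asarray(corrected)

# Synthetic check: artifact at sample 1030 (51.5 ms), logged at 50.0 ms.
fs_Hz = 20000.0
traces = np.zeros((2, 2000))
traces[1, 1030] = -500.0
corrected_ms = recenter_abs_max(traces, [50.0], fs_Hz, max_offset_ms=5.0)
offsets_ms = corrected_ms - 50.0
```

Each event is handled in its own loop iteration, so events with overlapping windows do not influence one another in this sketch.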
Notes
When multiple stim events have overlapping search windows, each is recentered independently.
For monophasic pulses the *_edge modes degrade gracefully: the pre-window search returns the opposite polarity’s noise peak and the zero-crossing fallback lands near the onset of the single artifact. However, pos_peak/neg_peak will give cleaner results in that case.
For single-pulse stim, multi_peak=True degrades to the original single-peak behavior (only one peak in the window is above threshold; first==last). It can therefore be left enabled when a recording mixes single-pulse and train conditions.
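To make the multi_peak parameters concrete, here is a minimal sketch of first/last pulse selection on a one-dimensional amplitude envelope. It assumes a simple greedy scheme (strongest candidates first, rejecting anything closer than the minimum separation to an already accepted peak); the function name select_pulse and this particular scheme are assumptions for illustration, not the library's algorithm.

```python
import numpy as np

def select_pulse(envelope, fs_Hz, select="first", threshold=0.6,
                 min_separation_ms=2.0):
    """Illustrative only: pick the first or last qualifying peak in a window."""
    envelope = np.asarray(envelope, dtype=float)
    min_sep = max(int(round(min_separation_ms * fs_Hz / 1000.0)), 1)
    cutoff = threshold * envelope.max()
    kept = []
    # Visit samples strongest-first and stop once below the amplitude cutoff.
    for idx in np.argsort(envelope)[::-1]:
        if envelope[idx] < cutoff:
            break
        # Reject samples too close to an already accepted peak.
        if all(abs(int(idx) - k) >= min_sep for k in kept):
            kept.append(int(idx))
    kept.sort()
    return kept[0] if select == "first" else kept[-1]

# A 100 Hz train at 20 kHz: pulses 200 samples apart, plus one noise blip.
fs_Hz = 20000.0
train = np.zeros(1000)
train[[100, 300, 500]] = [800.0, 1000.0, 700.0]
train[700] = 200.0  # below 0.6 * 1000, so it is ignored
first_idx = select_pulse(train, fs_Hz, select="first")
last_idx = select_pulse(train, fs_Hz, select="last")

# A single pulse: "first" and "last" coincide.
single = np.zeros(1000)
single[250] = 900.0
single_first = select_pulse(single, fs_Hz, select="first")
single_last = select_pulse(single, fs_Hz, select="last")
```

In the train example the strongest pulse sits at sample 300, yet "first" and "last" lock onto the outer pulses instead, which is the behavior the parameter descriptions call for.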
- spikelab.spike_sorting.stim_sorting.remove_stim_artifacts(traces, stim_times_ms, fs_Hz, method='polynomial', artifact_window_ms=10.0, saturation_threshold=None, baseline_threshold=None, poly_order=3, artifact_window_only=True, copy=True, *, recording=None, raw_traces=None, poly_clamp_factor=10.0)[source]
Remove stimulation artifacts from multi-channel voltage traces.
Processes each stim event independently per channel. Saturated samples are always blanked (zeroed). For the "polynomial" method, a low-order polynomial is fit to the post-saturation artifact tail and subtracted, preserving neural spikes (which are too fast for the smooth polynomial to capture).
When multiple stim events occur in rapid succession and the signal re-saturates before reaching baseline levels, the blanking region is extended dynamically and the polynomial fit is deferred until after the final desaturation in the burst.
The polynomial detrend is conceptually related to SALPA (Wagenaar & Potter 2002, J Neurosci Methods), adapted for offline processing where look-ahead past saturation is available — see the module docstring for details.
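The blank-then-detrend idea can be sketched for a single channel and a single saturation run as follows. This is a simplified illustration using np.polyfit, assuming the saturation mask is already known; the function name detrend_tail is hypothetical, and the split fit, dynamic blanking extension, and clamping described elsewhere in this entry are not reproduced.

```python
import numpy as np

def detrend_tail(trace, sat_mask, fs_Hz, artifact_window_ms=10.0, poly_order=3):
    """Illustrative only: blank saturated samples, subtract a polynomial tail fit."""
    cleaned = np.array(trace, dtype=float)
    sat_idx = np.flatnonzero(sat_mask)
    if sat_idx.size == 0:
        return cleaned
    cleaned[sat_mask] = 0.0  # saturated samples are always zeroed
    start = int(sat_idx[-1]) + 1  # first sample after the last desaturation
    window = int(round(artifact_window_ms * fs_Hz / 1000.0))
    stop = min(start + window, cleaned.size)
    if stop - start > poly_order:
        # Fit on a normalized axis for better numerical conditioning.
        t = np.linspace(-1.0, 1.0, stop - start)
        coeffs = np.polyfit(t, cleaned[start:stop], poly_order)
        cleaned[start:stop] -= np.polyval(coeffs, t)
    return cleaned

# Synthetic channel: 10 saturated samples, a slow decay, and a brief spike.
fs_Hz = 20000.0
trace = np.zeros(400)
trace[50:60] = 3000.0
trace[60:260] = 500.0 * np.exp(-np.arange(200) / 200.0)
trace[160:162] -= 80.0  # two-sample "spike" riding on the artifact tail
sat_mask = np.abs(trace) >= 3000.0
cleaned = detrend_tail(trace, sat_mask, fs_Hz)
```

Because the cubic can follow the slow decay but not a two-sample deflection, the decay is largely removed while the spike remains close to its original amplitude.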
- Parameters:
traces (np.ndarray) – Raw voltage traces, shape (channels, samples).
stim_times_ms (array-like) – Corrected stim times in milliseconds (e.g. from recenter_stim_times).
fs_Hz (float) – Sampling frequency in Hz.
method (str) – "polynomial" (default) or "blank".
artifact_window_ms (float) – Maximum duration in milliseconds of the artifact tail after the last desaturation point. The polynomial is fit over this window. Default 10.0.
Note: when the post-stim window contains a clear descent from the recentered stim time to a subsequent negative peak (typical for biphasic anodic-first pulses sorted with peak_mode="down_edge"), the fit is automatically split into two independent polynomials at the negative peak: one for [stim_time, neg_peak] (the descent) and one for [neg_peak, baseline_recovery] (the tail). When the recentered stim time is itself the negative peak (e.g. peak_mode="abs_max" or "neg_peak"), no descent exists and a single fit is used. This is automatic; there is no user-facing setting.
saturation_threshold (float or None) – Absolute voltage value above which a sample is considered saturated. When None, auto-detected, preferring gain-anchored detection from recording metadata when supplied (see recording kwarg below) and falling back to the 99.9th percentile of |traces| otherwise.
raw_traces (np.ndarray or None) – Optional pre-bandpass traces, same shape as traces, used as the source of truth for saturation detection. Bandpass filtering of a stim artifact produces ringing whose filtered amplitude can exceed the raw ADC rail even on unsaturated samples, so auto-detection from traces (filtered) both over-reports (ringing overshoot) and under-reports (group-delay smoothing) clips. When provided, the threshold is derived from raw_traces and the clip mask is built from np.abs(raw_traces) >= threshold; the filtered traces are blanked at those same sample indices and polynomial-detrended around them.
baseline_threshold (float or None) – Absolute voltage envelope below which the signal is considered to have returned to baseline. When None, auto-detected from pre-stim MAD.
poly_order (int) – Polynomial order for the detrend. Default 3 (cubic). Higher orders risk fitting spike-like features; lower orders may not capture the artifact decay shape.
artifact_window_only (bool) – If True (default), only process windows around stim events. If False, apply a global polynomial detrend to the entire trace (for recordings with very frequent stimulation).
copy (bool) – If True (default), return a copy; if False, modify traces in-place.
poly_clamp_factor (float or None) – Sanity-clamp factor for the "polynomial" method. After each polynomial subtraction, if any post-subtract sample exceeds poly_clamp_factor * saturation_threshold in absolute value, the segment is treated as a divergent fit (extrapolated wildly across saturated samples), blanked instead of left in place, and counted toward a one-shot warning emitted at the end of the call. Default 10.0, well above any plausible neural amplitude (~100 µV) when saturation_threshold is in the multi-thousand-µV range. Set to None to disable. Has no effect when saturation_threshold is +inf (no clipping detected) or method="blank".
- Returns:
cleaned (np.ndarray) – Cleaned traces, shape (channels, samples).
blanked_mask (np.ndarray) – Boolean array, shape (channels, samples). True for samples that were blanked (zeroed) because they fell within a saturation region.
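Three of the threshold rules described in the parameters above (the percentile fallback, the raw-trace clip mask, and the poly_clamp_factor sanity check) are simple enough to sketch directly. The helper names below are hypothetical and the gain-anchored detection path is not covered; treat this as a reading aid rather than the library's logic.

```python
import numpy as np

def auto_saturation_threshold(traces, percentile=99.9):
    """Percentile fallback: 99.9th percentile of |traces|."""
    return float(np.percentile(np.abs(traces), percentile))

def clip_mask_from_raw(raw_traces, threshold):
    """Clip mask built on the unfiltered signal, as described for raw_traces."""
    return np.abs(raw_traces) >= threshold

def is_divergent(segment, saturation_threshold, poly_clamp_factor=10.0):
    """True when a detrended segment exceeds the sanity clamp."""
    if poly_clamp_factor is None or not np.isfinite(saturation_threshold):
        return False  # clamp disabled, or no clipping was detected
    limit = poly_clamp_factor * saturation_threshold
    return bool(np.any(np.abs(segment) > limit))

# One channel, 10000 samples, 20 of them pinned at the rail (3000).
raw = np.full((1, 10000), 5.0)
raw[0, 4000:4020] = 3000.0
threshold = auto_saturation_threshold(raw)
mask = clip_mask_from_raw(raw, threshold)

ok_segment = is_divergent(np.array([100.0, -250.0]), threshold)
bad_segment = is_divergent(np.array([1.0, 4.0e4]), threshold)
no_clip = is_divergent(np.array([1.0, 4.0e4]), np.inf)
```

The percentile fallback only recovers the rail value when more than about 0.1% of samples are saturated, which may be one reason a gain-anchored estimate is preferred when recording metadata is available.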