Sound: To TextGrid (speech activity, Silero)...

A command that creates, for every selected Sound object, a TextGrid with one interval tier. The tier contains non-speech and speech intervals whose boundaries are determined by the Silero VAD model (for the algorithm and settings, see speech activity detection with Silero VAD). The labels of the intervals are specified by the settings Non-speech interval label and Speech interval label.

Settings

Speech probability threshold (0-1) (standard value: 0.5)
see speech activity detection with Silero VAD.
Min. gap between speech segments (s) (standard value: 0.1)
see speech activity detection with Silero VAD.
Min. speech segment (s) (standard value: 0.25)
see speech activity detection with Silero VAD.
Padding around speech segments (s) (standard value: 0.03)
see speech activity detection with Silero VAD.
Non-speech interval label (standard value: “”)
the label assigned to intervals classified as non-speech in the resulting TextGrid; the standard value is the empty string, so these intervals are left unlabelled.
Speech interval label (standard value: “speech”)
the label assigned to intervals classified as speech in the resulting TextGrid.
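
The way the two label settings shape the resulting tier can be illustrated with a small Python sketch. This is not Praat's implementation: the function name build_interval_tier, the tuple representation of intervals, and the example segment times are all hypothetical, and the sketch assumes that the detector has already produced a sorted, non-overlapping list of speech segments lying within the time domain of the Sound.

```python
from typing import List, Tuple

# (start time in s, end time in s, label); a hypothetical stand-in for a TextGrid interval
Interval = Tuple[float, float, str]


def build_interval_tier(
    xmin: float,
    xmax: float,
    speech_segments: List[Tuple[float, float]],
    non_speech_label: str = "",
    speech_label: str = "speech",
) -> List[Interval]:
    """Cover [xmin, xmax] with labelled intervals.

    Assumes speech_segments is sorted, non-overlapping and inside [xmin, xmax].
    """
    intervals: List[Interval] = []
    cursor = xmin
    for start, end in speech_segments:
        if start > cursor:
            # The stretch before this speech segment is non-speech.
            intervals.append((cursor, start, non_speech_label))
        intervals.append((start, end, speech_label))
        cursor = end
    if cursor < xmax:
        # Trailing non-speech up to the end of the sound.
        intervals.append((cursor, xmax, non_speech_label))
    return intervals


if __name__ == "__main__":
    # Hypothetical detector output for a 3-second sound.
    for interval in build_interval_tier(0.0, 3.0, [(0.5, 1.2), (2.0, 3.0)]):
        print(interval)
```

With the standard values, the non-speech intervals in this sketch carry an empty label and the speech intervals carry the label “speech”; passing other strings changes only the labels, not the boundaries.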

© Anastasia Shchupak 2026-06-01