SpeechGraph is a computational pipeline for analyzing the structure of transcribed speech through directed lexical graphs. It extracts graph-theoretic and recurrence-based features from discourse and evaluates their association with impulsivity dimensions measured by the Barratt Impulsiveness Scale (BIS-11).

The pipeline comprises the following stages:
- Parsing of orthographic transcripts into token sequences.
- Graph construction: directed lexical transition graphs where each node is a unique token and each directed edge represents a transition between consecutive tokens within the same discourse segment.
- Sliding-window metrics quantifying topology, connectivity, recurrence, and path structure of the graph.
- Z-scores via permutation testing, comparing observed metrics against a null distribution generated by shuffling tokens within discourse segments.
- Spearman correlations (simple and partial) between graph metrics and impulsivity dimensions.
- Predictive regression using Monte Carlo cross-validation to assess how well graph metrics predict impulsivity scores.
- Hyperparameter optimization with Optuna, exploring multiple regression algorithms and RFE-based feature selection.
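
The graph construction and sliding-window stages listed above can be illustrated with a minimal sketch. It is not the project's implementation: the function names, the whitespace tokenizer, the window size of 30 tokens, and the particular metrics (node count, edge count, self-loops, density) are all assumptions chosen for illustration.

```python
def tokenize(text):
    """Hypothetical tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def build_transition_graph(segments):
    """Map each unique token to the set of tokens that directly follow it.

    `segments` is a list of token lists. Edges never cross segment
    boundaries because each segment is processed on its own.
    """
    graph = {}
    for tokens in segments:
        for tok in tokens:
            graph.setdefault(tok, set())  # register nodes without out-edges
        for src, dst in zip(tokens, tokens[1:]):
            graph[src].add(dst)
    return graph


def basic_metrics(graph):
    """A few simple topology metrics (an illustrative subset only)."""
    nodes = len(graph)
    edges = sum(len(succ) for succ in graph.values())
    self_loops = sum(1 for src, succ in graph.items() if src in succ)
    # Density over ordered pairs of distinct nodes, so self-loops are excluded.
    density = (edges - self_loops) / (nodes * (nodes - 1)) if nodes > 1 else 0.0
    return {"nodes": nodes, "edges": edges,
            "self_loops": self_loops, "density": density}


def sliding_window_metrics(tokens, window=30, step=1):
    """Compute metrics on the graph of each window of `window` tokens."""
    results = []
    last_start = max(len(tokens) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = tokens[start:start + window]
        results.append(basic_metrics(build_transition_graph([chunk])))
    return results
```

Averaging the per-window values gives one number per metric and transcript, which keeps transcripts of different lengths comparable.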
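The transcript is represented as a directed graph where each vertex is a unique token and each directed edge links two consecutive tokens within the same discourse segment.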
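For an observed metric, the z-score is the difference between the observed value and the mean of the null distribution, divided by the standard deviation of the null distribution.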
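
A rough sketch of the permutation z-score follows, assuming the null distribution is built by shuffling tokens independently inside each discourse segment, as the pipeline overview describes. The metric used here (number of distinct transitions), the function names, the default of 1000 permutations, and the fixed seed are illustrative assumptions, not the project's settings.

```python
import random
import statistics


def unique_edge_count(segments):
    """Example order-sensitive metric: number of distinct token transitions."""
    edges = set()
    for tokens in segments:
        edges.update(zip(tokens, tokens[1:]))
    return len(edges)


def permutation_z_score(segments, metric, n_permutations=1000, seed=0):
    """Z-score of `metric` against a within-segment shuffling null."""
    rng = random.Random(seed)
    observed = metric(segments)
    null_values = []
    for _ in range(n_permutations):
        # Shuffle each segment separately so tokens never change segment.
        shuffled = [rng.sample(tokens, len(tokens)) for tokens in segments]
        null_values.append(metric(shuffled))
    mu = statistics.fmean(null_values)
    sigma = statistics.pstdev(null_values)
    if sigma == 0:
        return float("nan")  # metric is insensitive to token order here
    return (observed - mu) / sigma
```

Because shuffling preserves the vocabulary and token frequencies of each segment, a z-score far from zero indicates structure that comes from word order rather than from word choice alone.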
For each target
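
Monte Carlo cross-validation for a single target (one impulsivity dimension) can be sketched as repeated random train/test splits. The example below is a simplified stand-in written with the standard library only: it fits a one-feature least-squares line instead of the regression algorithms the pipeline explores, and the split count, test fraction, and the use of mean absolute error are assumptions made for illustration.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x (placeholder model)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0  # constant feature: predict the training mean
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def monte_carlo_cv(xs, ys, n_splits=100, test_fraction=0.2, seed=0):
    """Return mean and standard deviation of test MAE over random splits."""
    rng = random.Random(seed)
    n = len(xs)
    n_test = max(1, round(n * test_fraction))
    errors = []
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)  # a fresh random split on every repetition
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        intercept, slope = fit_line([xs[i] for i in train_idx],
                                    [ys[i] for i in train_idx])
        abs_errors = [abs(ys[i] - (intercept + slope * xs[i])) for i in test_idx]
        errors.append(statistics.fmean(abs_errors))
    return statistics.fmean(errors), statistics.pstdev(errors)
```

Unlike k-fold cross-validation, the test sets of different repetitions may overlap, so the spread of the scores reflects sensitivity to the split rather than independent estimates.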
Performance is evaluated via:
All results (cross-experiment comparisons, distributions, feature analysis, optimization, and SHAP analysis) are available on the live dashboard: