MATClus is a framework for learning fixed-length deep representations of Multiple Aspect Trajectories (MATs) that lead to the discovery of effective MAT clusters. The pipeline processes raw trajectory data, extracts behavioral features over sliding windows, converts them into normalized behavior sequences, learns compact trajectory embeddings, and groups trajectories with similar movement patterns.
The repository contains a full pipeline for trajectory processing and clustering:
- completeTrajectories() – Converts raw trajectory records into differential trajectory components (distance, heading change, etc.).
- stwindow_interval() – Splits trajectories into sliding temporal windows and calculates spatial–temporal interaction scores.
- computeFeas() – Computes motion features such as:
  - movement rate
  - directional change
  - curvature change
- generate_behavior_sequences() – Converts trajectory features into window-based behavior sequences.
- generate_normal_behavior_sequence() – Applies quantile normalization to the behavior vectors.
- trajectory2Vec_torch() – Learns trajectory embeddings using a PyTorch LSTM autoencoder.
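To make the first stages more concrete, here is a minimal sketch of how per-step differential components and sliding-window motion features could be derived from (timestamp, x, y) points. It is not the repository's implementation: the function names `differential_components` and `window_features`, the window size and step, and the exact feature definitions (mean speed, mean absolute heading change) are assumptions chosen for illustration.

```python
import math


def differential_components(points):
    """points: list of (t, x, y) tuples sorted by t.

    Returns one dict per consecutive pair of points (hypothetical layout).
    """
    comps = []
    prev_heading = None
    for (t0, x0, y0), (t1, x1, y1) in zip(points, points[1:]):
        dt = t1 - t0
        dist = math.hypot(x1 - x0, y1 - y0)
        heading = math.atan2(y1 - y0, x1 - x0)
        if prev_heading is None:
            turn = 0.0  # no previous segment to compare against
        else:
            # Wrap the heading difference into [-pi, pi).
            turn = (heading - prev_heading + math.pi) % (2 * math.pi) - math.pi
        speed = dist / dt if dt > 0 else 0.0
        comps.append({"dt": dt, "dist": dist, "speed": speed, "turn": turn})
        prev_heading = heading
    return comps


def window_features(comps, size=2, step=1):
    """Slide a window of `size` segments and average speed and |turn| in it."""
    feats = []
    for start in range(0, len(comps) - size + 1, step):
        win = comps[start:start + size]
        mean_speed = sum(c["speed"] for c in win) / size
        mean_turn = sum(abs(c["turn"]) for c in win) / size
        feats.append((mean_speed, mean_turn))
    return feats


points = [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0), (2.0, 3.0, 8.0), (3.0, 3.0, 12.0)]
comps = differential_components(points)
features = window_features(comps, size=2, step=1)
```

Each trajectory yields one feature tuple per window, so trajectories of different lengths produce sequences of different lengths; this is the kind of variable-length input a sequence autoencoder can compress into a fixed-length vector.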
The framework supports multiple clustering approaches:
- matric_kmeans_clustering() – KMeans clustering
- dbscan_clustering() – density-based clustering
- spatiotemporal_hac() – hierarchical clustering
Embeddings can be projected to 2D for clustering using:
- PCA
- t-SNE
The system expects trajectory data in a CSV-style file:
synthetic_vector.out
Each row represents a trajectory observation:
| Field | Description |
|---|---|
| trajectory_id | ID of the trajectory |
| timestamp | observation time |
| x | x coordinate |
| y | y coordinate |
| rk | rotation / heading |
| kx | x component of movement vector |
| ky | y component of movement vector |
Example:

    traj_id, timestamp, x, y, rk, kx, ky
    1, 10.0, 52.1, 13.2, 0.1, 0.02, 0.01
    1, 20.0, 52.2, 13.3, 0.15, 0.03, 0.02
    2, 11.0, 51.9, 13.0, 0.2, 0.01, 0.04
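Rows in this layout can be grouped into per-trajectory observation lists with the standard library alone. The sketch below is only an illustration: `load_trajectories` is a hypothetical helper, not a MATClus function, and it assumes that any row whose first field is not an integer (such as a header line) can be skipped.

```python
import csv
import io
from collections import defaultdict

sample = """\
traj_id, timestamp, x, y, rk, kx, ky
1, 10.0, 52.1, 13.2, 0.1, 0.02, 0.01
1, 20.0, 52.2, 13.3, 0.15, 0.03, 0.02
2, 11.0, 51.9, 13.0, 0.2, 0.01, 0.04
"""


def load_trajectories(stream):
    """Group rows by trajectory id; each observation is a 6-tuple of floats
    (timestamp, x, y, rk, kx, ky)."""
    trajectories = defaultdict(list)
    for row in csv.reader(stream, skipinitialspace=True):
        if not row:
            continue  # blank line
        try:
            traj_id = int(row[0])
        except ValueError:
            continue  # assumption: non-numeric first field means a header row
        trajectories[traj_id].append(tuple(float(v) for v in row[1:7]))
    # Order each trajectory by timestamp (first element of the tuple).
    for observations in trajectories.values():
        observations.sort(key=lambda obs: obs[0])
    return dict(trajectories)


trajectories = load_trajectories(io.StringIO(sample))
```

To read the real input, replace the `io.StringIO(sample)` stream with an open file handle.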
The pipeline stores intermediate results in the data directory:

    simulated_data/
    │
    ├─ sim_trajectories_complete
    ├─ sim_trajectories_feas
    ├─ sim_behavior_sequences
    ├─ sim_normal_behavior_sequences
    ├─ trajectory_distances
    └─ encodings/
       └─ traj_vec_lstm_final_state_torch.pkl
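Example minimal pipeline:

    from MATClus import *

    data_dir = "./simulated_data"

    # Preprocessing: differential components, sliding windows, motion features
    completeTrajectories(data_dir)
    stwindow_interval(data_dir)
    computeFeas(data_dir)

    # Behavior sequences and quantile normalization
    generate_behavior_sequences(data_dir)
    generate_normal_behavior_sequence(data_dir)

    # Learn trajectory embeddings with the LSTM autoencoder
    trajectory2Vec_torch(data_dir)

    # Cluster the learned embeddings
    result = matric_kmeans_clustering(
        data_dir,
        n_clusters=3
    )
    print(result["labels"])

This will: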
- Load trajectory data
- Generate behavioral windows
- Normalize features
- Train the LSTM autoencoder
- Produce trajectory embeddings
- Cluster trajectories
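The normalization step in the list above can be pictured with a rank-based sketch. The README does not specify which variant of quantile normalization is used, so the following is an assumption: each feature column is mapped to its empirical quantile in [0, 1], with tied values sharing their average rank. The names `quantile_normalize` and `normalize_features` are hypothetical.

```python
def quantile_normalize(column):
    """Map each value to its empirical quantile in [0, 1].

    Tied values receive the average of the ranks they span.
    """
    n = len(column)
    if n == 1:
        return [0.5]  # a single value has no spread; place it mid-range
    order = sorted(range(n), key=lambda i: column[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values equal to column[order[i]].
        while j + 1 < n and column[order[j + 1]] == column[order[i]]:
            j += 1
        average_rank = (i + j) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return [r / (n - 1) for r in ranks]


def normalize_features(rows):
    """Apply quantile_normalize independently to every feature column."""
    columns = list(zip(*rows))
    normalized = [quantile_normalize(list(col)) for col in columns]
    return [list(row) for row in zip(*normalized)]


# Four behavior vectors with two features each (illustrative values only).
behavior = [[4.5, 0.32], [4.0, 0.10], [9.0, 0.10], [1.0, 0.90]]
normalized = normalize_features(behavior)
```

A rank-based mapping like this puts features with very different scales, such as speed and heading change, on a common range and is insensitive to outliers, which is a typical reason for normalizing before training an autoencoder.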
Clustering functions return a dictionary:

    {
        "method": "kmeans",
        "labels": array([...]),
        "coords": array([...]),      # 2D projection (optional)
        "embeddings": array([...]),
        "meta": {...}
    }

- labels → cluster assignment for each trajectory
- coords → 2D projection for visualization
- embeddings → learned trajectory vectors
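A returned dictionary of this shape can be summarized with a few lines of standard-library code. The sketch below uses a hand-written stand-in for `result` with plain lists instead of arrays, and it assumes that position `i` in `labels` and `coords` refers to the same trajectory; both are illustrative assumptions rather than documented guarantees.

```python
from collections import Counter, defaultdict

# Hypothetical stand-in for a clustering result; real results hold arrays.
result = {
    "method": "kmeans",
    "labels": [0, 2, 0, 1, 2, 0],
    "coords": [[0.1, 0.2], [3.0, 3.1], [0.0, 0.3],
               [1.5, -2.0], [2.9, 3.3], [0.2, 0.1]],
}

# Number of trajectories assigned to each cluster.
cluster_sizes = Counter(result["labels"])

# Positions of the trajectories that belong to each cluster.
members = defaultdict(list)
for index, label in enumerate(result["labels"]):
    members[label].append(index)

# "coords" is documented as optional, so guard before using it.
coords = result.get("coords")
centroids_2d = {}
if coords is not None:
    for label, indices in members.items():
        xs = [coords[i][0] for i in indices]
        ys = [coords[i][1] for i in indices]
        centroids_2d[label] = (sum(xs) / len(xs), sum(ys) / len(ys))
```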
This dataset contains 2D trajectory points grouped into trajectories (tracks). Each row represents one observation (one point) along a trajectory with time, coordinates, rating, keyword coordinates and keywords.
- One row = one timestep for one trajectory
- Rows belonging to the same trajectory share the same trajectory_id
- Within a trajectory, time is expected to be monotonic (typically increasing)
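The ordering expectation can be checked before running the pipeline. This is a small sketch under the assumption that "monotonic" means non-decreasing time within each trajectory; `non_monotonic_trajectories` is a hypothetical helper, not part of MATClus.

```python
def non_monotonic_trajectories(rows):
    """rows: iterable of (trajectory_id, time) pairs in file order.

    Returns the set of trajectory ids whose time value ever decreases.
    """
    last_time = {}
    bad = set()
    for traj_id, time in rows:
        if traj_id in last_time and time < last_time[traj_id]:
            bad.add(traj_id)
        last_time[traj_id] = time
    return bad


# Trajectory 1 goes back in time (10 -> 7), trajectory 0 does not.
rows = [(0, 29), (0, 54), (1, 10), (0, 97), (1, 7)]
bad = non_monotonic_trajectories(rows)
```

Rows of different trajectories may be interleaved in the file; the check only compares each observation with the previous one of the same trajectory.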
Each line has 8 comma-separated fields:
- trajectory_id (int) Identifier of the trajectory / track (e.g., 0).
- time (int) Time index or frame number within the trajectory (e.g., 29, 54, 97).
- x (float) X coordinate in the dataset’s coordinate system.
- y (float) Y coordinate in the dataset’s coordinate system.
- rating (float) Social rating.
- kx (float) Keyword X coordinate.
- ky (float) Keyword Y coordinate.
- keyword (string) Related keyword (e.g., industry, complex, …).
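The eight fields map naturally onto a typed record. The following sketch is one possible way to parse a line; the `Observation` class and `parse_observation` function are hypothetical names, and the parser assumes that keywords never contain commas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    trajectory_id: int
    time: int
    x: float
    y: float
    rating: float
    kx: float
    ky: float
    keyword: str


def parse_observation(line):
    """Parse one comma-separated line into an Observation."""
    fields = [field.strip() for field in line.split(",")]
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    return Observation(
        int(fields[0]),
        int(fields[1]),
        *map(float, fields[2:7]),  # x, y, rating, kx, ky
        fields[7],
    )


obs = parse_observation(
    "0,29,34037.70761806609,9833.138607576757,8.4,"
    "3.206948757171631,34.005638122558594,industry"
)
```

Rejecting lines with an unexpected field count surfaces malformed rows early instead of letting them shift values into the wrong columns.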
Example:

    trajectory_id,time,x,y,rating,kx,ky,keyword
    0,29,34037.70761806609,9833.138607576757,8.4,3.206948757171631,34.005638122558594,industry
    0,54,34782.25592839948,9270.427745288769,9.6,-9.44599723815918,16.407695770263672,complex
    0,97,36103.710020125865,8309.520759541527,9.1,-19.66252326965332,-37.67576599121094,delicious

If you use this code or reproduce these results, please cite:

F. Gryllakis, N. Pelekis, C. Doulkeridis and Y. Theodoridis, "Multiple Aspect Trajectory Clustering using Deep Learning Techniques", 2026. (Submitted for publication.)