fix: Resolve ValueError in distance matrix pivot & correct speed grid filtering #2

Open
pdml422 wants to merge 2 commits into CityScope:main from pdml422:fix/speed-filtering-and-pivot-collision
@pdml422 pdml422 commented Jun 30, 2026

This PR addresses two bugs encountered when evaluating a multi-modal GTFS dataset (Bus + Metro/Subway with constant speeds) in Hanoi.
Bug 1: Constant-speed routes skipped in headway calculation

  • Issue: In the speed loop, the check if gtfs_length_i == gtfs_length: continue is a premature optimization. A route with a perfectly constant speed (e.g., a Metro line running at exactly 35 km/h) passes each speed_grid threshold without reducing gtfs_length, so the loop skips computation for those thresholds and constant-speed modes are incorrectly assigned the lowest speed bin (e.g., 5 km/h). In addition, overwriting gtfs_selection inside the loop breaks the filtering logic for subsequent iterations.
  • Fix: Removed the gtfs_length_i == gtfs_length check so that computation runs for all valid thresholds. Introduced gtfs_selection_i so that each iteration filters the original DataFrame instead of overwriting it.
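The skip behaviour can be illustrated with a minimal sketch. This is not the repository's code: the function names, the speed_kmh column, the toy speed_grid values, and the rule that an empty selection is not computed are all assumptions made for illustration. Only gtfs_selection, gtfs_selection_i, gtfs_length, gtfs_length_i, and speed_grid come from the description above.

```python
import pandas as pd


def thresholds_computed_old(gtfs_selection, speed_grid):
    """Hypothetical reconstruction of the old loop pattern."""
    computed = {}
    gtfs_length = len(gtfs_selection)
    for speed in speed_grid:
        # The input frame is overwritten on every iteration.
        gtfs_selection = gtfs_selection[gtfs_selection["speed_kmh"] >= speed]
        gtfs_length_i = len(gtfs_selection)
        if gtfs_length_i == gtfs_length:
            continue  # skipped whenever the threshold removed no rows
        gtfs_length = gtfs_length_i
        if gtfs_length_i > 0:  # assumption: nothing to compute on an empty selection
            computed[speed] = gtfs_length_i
    return computed


def thresholds_computed_new(gtfs_selection, speed_grid):
    """Hypothetical reconstruction of the fixed loop pattern."""
    computed = {}
    for speed in speed_grid:
        # A per-iteration variable leaves the original frame untouched.
        gtfs_selection_i = gtfs_selection[gtfs_selection["speed_kmh"] >= speed]
        if len(gtfs_selection_i) > 0:  # same assumption as above
            computed[speed] = len(gtfs_selection_i)
    return computed


# Toy feed: one Metro route running at a constant 35 km/h.
metro = pd.DataFrame({"route_id": ["M1", "M1", "M1"],
                      "speed_kmh": [35.0, 35.0, 35.0]})
speed_grid = [5, 15, 25, 35, 45]

old = thresholds_computed_old(metro, speed_grid)
new = thresholds_computed_new(metro, speed_grid)
print(old)
print(new)
```

Under these assumptions, the old pattern never runs the computation for the constant-speed route, because no threshold up to 35 km/h changes the row count, while the fixed pattern runs it at every threshold the route actually reaches.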

Bug 2: ValueError: Index contains duplicate entries, cannot reshape during .pivot()

  • Issue: With a multi-modal dataset (e.g., mode_factor for Bus=0.8, Metro=1.0), rounding collisions occur. Two stops from different modes can end up with exactly the same stop_quality_grid value (e.g., 0.47) but slightly different final quality_grid values after distance multiplication. The previous drop_duplicates logic included 'quality_grid' in its subset, so these overlapping pairs were not treated as duplicates and survived, which crashed the subsequent .pivot() call.
  • Fix: Excluded 'quality_grid' from the subset in drop_duplicates(). The rows are sorted with .sort_values('quality_grid') before de-duplication and keep='last' is used, so that in a rounding collision the highest accessibility score is retained for that grid intersection.
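A small, self-contained reproduction of the collision and the fix might look like the following. The distance_grid column name and all values are made up for illustration; only stop_quality_grid, quality_grid, and the pandas calls named above come from the description.

```python
import pandas as pd

# Two stops from different modes land on the same (distance_grid, stop_quality_grid)
# cell but carry slightly different quality_grid values.
df = pd.DataFrame({
    "distance_grid": [100, 100, 200],        # hypothetical column name
    "stop_quality_grid": [0.47, 0.47, 0.60],
    "quality_grid": [0.41, 0.43, 0.55],
})

# Old pattern: quality_grid is part of the subset, so both colliding rows survive.
old = df.drop_duplicates(
    subset=["distance_grid", "stop_quality_grid", "quality_grid"]
)
try:
    old.pivot(index="distance_grid", columns="stop_quality_grid",
              values="quality_grid")
    pivot_failed = False
except ValueError:
    # "Index contains duplicate entries, cannot reshape"
    pivot_failed = True

# Fixed pattern: sort ascending, then keep the last (highest) row per cell.
new = (
    df.sort_values("quality_grid")
      .drop_duplicates(subset=["distance_grid", "stop_quality_grid"], keep="last")
)
matrix = new.pivot(index="distance_grid", columns="stop_quality_grid",
                   values="quality_grid")
print(pivot_failed)
print(matrix)
```

Because sort_values sorts in ascending order by default, keep='last' selects the largest quality_grid within each colliding pair.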

Testing: Tested successfully on a custom Hanoi GTFS feed containing both street-level buses and grade-separated Metro lines. The fix prevents the ValueError and accurately colors Metro lines based on their true speed.
