SP Transit Node Classifier

Classifies bus stops in São Paulo's transit network as Hub, Intermediate, or Peripheral based on graph features and geographic coordinates.

The goal is to predict a stop's betweenness centrality class without computing betweenness itself, which is computationally expensive on large networks.

How to Use

import joblib
import numpy as np
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="cintia-shinoda/sp-transit-node-classifier",
    filename="model.joblib",
)
model = joblib.load(path)

# Input: one row per stop, with columns in this order:
# [degree, degree_centrality, closeness_centrality, lat, lon]
node = np.array([[8, 0.00036, 0.018, -23.55, -46.63]])
pred = model.predict(node)
# Predicted class per row: 0 = Peripheral, 1 = Intermediate, 2 = Hub
print(pred)
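The integer output can be turned into readable labels with a small lookup. This is a minimal sketch: the mapping comes from the comment in the snippet above, while `CLASS_NAMES` and `decode_predictions` are hypothetical names introduced here for illustration.

```python
# Class ids as documented in the usage snippet (0/1/2).
CLASS_NAMES = {0: "Peripheral", 1: "Intermediate", 2: "Hub"}


def decode_predictions(pred):
    """Map an iterable of integer class ids to readable class names."""
    # int() also accepts NumPy integer scalars, so `model.predict(...)`
    # output can be passed in directly.
    return [CLASS_NAMES[int(p)] for p in pred]


# Example with hand-written class ids (not real model output).
print(decode_predictions([2, 0, 1]))
```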

Features

| Feature | Description |
|---|---|
| degree | Number of direct connections |
| degree_centrality | Normalized degree centrality |
| closeness_centrality | Closeness centrality |
| lat | Latitude (decimal degrees) |
| lon | Longitude (decimal degrees) |
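The model card does not say how these features were computed, so the following is only a sketch of one plausible way to build the input row for a single stop. It assumes an undirected, unweighted graph, degree centrality defined as degree / (n - 1), and closeness scaled by the reachable fraction of the graph (the Wasserman-Faust variant). The usage example's values (degree 8, degree_centrality 0.00036, 21,892 stops) are consistent with the degree / (n - 1) convention, but that is an inference. `node_features`, `adjacency`, `coords`, and the stop ids are hypothetical.

```python
from collections import deque


def node_features(adjacency, coords, node):
    """Return [degree, degree_centrality, closeness_centrality, lat, lon].

    adjacency: dict mapping each stop id to a set of neighbouring stop ids
               (assumed undirected and unweighted).
    coords:    dict mapping each stop id to a (lat, lon) tuple.
    """
    n = len(adjacency)
    degree = len(adjacency[node])
    degree_centrality = degree / (n - 1) if n > 1 else 0.0

    # Breadth-first search gives hop distances from `node`.
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)

    reachable = len(dist) - 1          # nodes reached, excluding `node`
    total = sum(dist.values())         # sum of hop distances
    closeness = 0.0
    if total > 0 and n > 1:
        # Inverse mean distance, scaled by the fraction of nodes reached.
        closeness = (reachable / total) * (reachable / (n - 1))

    lat, lon = coords[node]
    return [degree, degree_centrality, closeness, lat, lon]


# Toy path graph A - B - C - D with made-up coordinates.
adjacency = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
coords = {"A": (-23.55, -46.63), "B": (-23.56, -46.64),
          "C": (-23.57, -46.65), "D": (-23.58, -46.66)}
print(node_features(adjacency, coords, "B"))
```

On the real network, a graph library would normally do this work; the pure-Python version is only meant to make the definitions explicit.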

Metrics

| Metric | Value |
|---|---|
| F1 Macro (test) | 0.59 |
| Accuracy (test) | 0.68 |
| F1 Macro (5-fold CV) | 0.43 |
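Macro F1 averages the per-class F1 scores with equal weight, so it can sit well below accuracy when classes are imbalanced. The sketch below spells out both definitions in plain Python; the labels in the example are made up to illustrate the effect and are not the model's predictions. `macro_f1` and `accuracy` are hypothetical helper names.

```python
def macro_f1(y_true, y_pred, labels=(0, 1, 2)):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up example: a classifier that always predicts class 0 (Peripheral).
y_true = [0] * 7 + [1] * 2 + [2]
y_pred = [0] * 10
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

In this toy case accuracy is high because the majority class dominates, while macro F1 is pulled down by the two classes that are never predicted.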

Feature Importance

| Feature | Importance |
|---|---|
| lat | 0.2793 |
| lon | 0.2604 |
| closeness_centrality | 0.2566 |
| degree | 0.1061 |
| degree_centrality | 0.0976 |

Key Finding

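Geographic position is the strongest predictor of centrality class: lat and lon together account for about 54% of total feature importance (0.2793 + 0.2604), more than the three graph features combined. This suggests that high-centrality stops concentrate in specific corridors of São Paulo.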
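The split between geographic and graph features can be checked directly from the importance table. The values below are copied from that table; the grouping into two families is only an illustration.

```python
# Importances as reported in the Feature Importance table.
importances = {
    "lat": 0.2793,
    "lon": 0.2604,
    "closeness_centrality": 0.2566,
    "degree": 0.1061,
    "degree_centrality": 0.0976,
}

geographic_keys = ("lat", "lon")
geographic = sum(importances[k] for k in geographic_keys)
graph = sum(v for k, v in importances.items() if k not in geographic_keys)

print(f"geographic: {geographic:.4f}")  # geographic: 0.5397
print(f"graph:      {graph:.4f}")       # graph:      0.4603
```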

Limitations

  • Labels are derived from betweenness centrality quantiles, so the three classes are a simplification of a continuous measure
  • Trained on a single GTFS snapshot, so it may not generalize after the network changes
  • Does not consider temporal patterns (peak vs. off-peak)
  • Class imbalance: 66% Peripheral, 24% Intermediate, 10% Hub
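To make the first and last points concrete, here is a sketch of how quantile-based labels could be produced. The exact cut points are not stated in this card; the 0.66 and 0.90 quantiles used here are an assumption inferred from the reported class shares (66% / 24% / 10%), and `quantile_labels` is a hypothetical helper.

```python
def quantile_labels(betweenness, lower_q=0.66, upper_q=0.90):
    """Assign 0 (Peripheral), 1 (Intermediate) or 2 (Hub) by betweenness rank.

    lower_q and upper_q are assumed quantile cut points, not values
    confirmed by the model card.
    """
    ranked = sorted(betweenness)
    n = len(ranked)
    # Value at each cut point (index clamped to the last element).
    lower_cut = ranked[min(int(lower_q * n), n - 1)]
    upper_cut = ranked[min(int(upper_q * n), n - 1)]

    labels = []
    for b in betweenness:
        if b >= upper_cut:
            labels.append(2)
        elif b >= lower_cut:
            labels.append(1)
        else:
            labels.append(0)
    return labels


# Toy betweenness scores 0..99 stand in for real values.
labels = quantile_labels(list(range(100)))
print([labels.count(c) for c in (0, 1, 2)])
```

Real betweenness scores often contain many ties (for example, many stops with a value of zero), in which case the resulting class shares can drift from the nominal quantiles.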

Dataset

SP Transit Network Centrality — 21,892 bus stops with graph centrality metrics.

Citation

@misc{shinoda2026sp-classifier,
  author = {Cintia Shinoda},
  title = {SP Transit Node Classifier},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/cintia-shinoda/sp-transit-node-classifier}
}