The raw data contains the following information:

id_sa               time                 visitors  pct_female  pct_male  pct_agegroup1  pct_agegroup2  pct_agegroup3  pct_agegroup4  pct_agegroup5  pct_agegroup6  pct_business
250mN280625E479025  2019-08-11 00:00:00       242       49.13     50.87          15.78          20.88          16.18          18.90          15.09          13.18          5.81
250mN280625E479025  2019-08-11 00:30:00       241       49.31     50.69          15.44          21.52          16.51          18.22          15.45          12.86          5.87
250mN280625E479025  2019-08-11 01:00:00       235       50.07     49.93          15.30          20.70          16.73          18.59          16.01          12.67          5.84
250mN280625E479025  2019-08-11 01:30:00       234       49.05     50.95          15.35          20.12          16.71          19.60          16.00          12.22          5.93
250mN280625E479025  2019-08-11 02:00:00       232       47.74     52.26          14.51          21.10          17.08          19.18          15.60          12.52          5.73

Spatial extent

## [1] "Number of grid cells: 276"
## [1] "Area of grid cells in m2: 62500"
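
The reported cell area matches the `250m` prefix of `id_sa`. The following is a minimal sketch, assuming that the prefix encodes the cell edge length in metres and that the `N…` and `E…` parts are grid coordinates; neither is confirmed by the data description, and `parse_cell_id` is a hypothetical helper.

```python
import re

# Assumed layout of id_sa: "<edge length>m" + "N<northing>" + "E<easting>".
CELL_ID = re.compile(r"^(\d+)mN(\d+)E(\d+)$")

def parse_cell_id(cell_id):
    """Split a cell id into edge length (m), northing and easting."""
    match = CELL_ID.match(cell_id)
    if match is None:
        raise ValueError(f"unexpected cell id: {cell_id!r}")
    edge, north, east = (int(g) for g in match.groups())
    return {"edge_m": edge, "north": north, "east": east}

cell = parse_cell_id("250mN280625E479025")
area_m2 = cell["edge_m"] ** 2  # 250 m x 250 m
print(area_m2)
```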

Temporal extent

## [1] "First timestamp: 2019-08-11 00:00:00"
## [1] "Last timestamp: 2019-09-10 23:30:00"
## <interval[1]>
## [1] 30m
## [1] "Time series is regular: TRUE"
## [1] "Time series has gaps: FALSE"
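
For a regular, gap-free 30-minute series, the number of timestamps follows directly from the first and last timestamp. The sketch below uses only Python's standard library; the variable and function names are illustrative and not taken from the project code.

```python
from datetime import datetime, timedelta

first = datetime(2019, 8, 11, 0, 0)
last = datetime(2019, 9, 10, 23, 30)
step = timedelta(minutes=30)

# Number of timestamps in a regular series without gaps (both ends included).
n_timestamps = (last - first) // step + 1

def has_gaps(timestamps, step):
    """True if any two consecutive sorted timestamps are not exactly one step apart."""
    ordered = sorted(timestamps)
    return any(b - a != step for a, b in zip(ordered, ordered[1:]))

expected = [first + k * step for k in range(n_timestamps)]
print(n_timestamps, has_gaps(expected, step))
```

The count corresponds to 31 days with 48 half-hour slots each.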

Spatio-Temporal Patterns

We can already identify some spatio-temporal patterns in the data. For example, we extracted spatially coherent clusters of cells with similar daily visitor patterns.

For more exploratory spatio-temporal data analysis, see the Jupyter Notebooks on Exploratory Data Analysis and Time Series Clustering.

Visitor numbers in the city-center cluster peak during the daytime, while those in the cluster at the edge of the grid peak during the nighttime.
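
One simple way to see such a difference is to average the visitor counts of a cell by time of day and check where the profile peaks. The sketch below does this for made-up numbers; it is not the clustering procedure used in the notebooks, and the daytime window of 06:00 to 18:00 is an arbitrary assumption.

```python
from collections import defaultdict
from datetime import datetime

def daily_profile(rows):
    """Mean visitors per time of day; rows are (timestamp, visitors) pairs."""
    totals = defaultdict(lambda: [0.0, 0])
    for ts, visitors in rows:
        slot = totals[ts.time()]
        slot[0] += visitors
        slot[1] += 1
    return {tod: s / n for tod, (s, n) in totals.items()}

def peak_period(profile, day_start=6, day_end=18):
    """Label the profile by whether its maximum falls in [day_start, day_end)."""
    peak_time = max(profile, key=profile.get)
    return "daytime" if day_start <= peak_time.hour < day_end else "nighttime"

# Illustrative values only, not taken from the data set.
centre = [(datetime(2019, 8, 12, h), v) for h, v in [(2, 80), (12, 400), (22, 150)]]
edge = [(datetime(2019, 8, 12, h), v) for h, v in [(2, 300), (12, 120), (22, 280)]]

print(peak_period(daily_profile(centre)), peak_period(daily_profile(edge)))
```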

Extracting Movement

We want to estimate the number of people moving from each cell \(i\) at timestamp \(t_0\) to each cell \(j\) at timestamp \(t_1\), using only the visitor counts in each cell at each timestamp as input.

This problem is ill-posed (underdetermined): the number of parameters to be estimated (the probabilities of movement between cells) is \(N^2\), while the number of observations per pair of timestamps is only \(2N\), where \(N\) is the number of cells in the grid.
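
For the grid at hand, \(N = 276\) gives \(276^2 = 76176\) movement probabilities against \(2 \cdot 276 = 552\) observed counts. The toy example below illustrates the consequence for two cells: two very different movement patterns reproduce exactly the same counts. The numbers and names are invented for illustration.

```python
def counts_after(counts_t0, transition):
    """Counts at t1 given counts at t0 and a row-stochastic transition matrix."""
    n = len(counts_t0)
    return [sum(counts_t0[i] * transition[i][j] for i in range(n)) for j in range(n)]

counts_t0 = [100, 100]

stay = [[1.0, 0.0],
        [0.0, 1.0]]   # nobody moves
swap = [[0.0, 1.0],
        [1.0, 0.0]]   # everybody changes cell

# Both matrices predict the same counts at t1, so the counts alone
# cannot tell the two scenarios apart.
print(counts_after(counts_t0, stay), counts_after(counts_t0, swap))

n_cells = 276
print(n_cells ** 2, 2 * n_cells)
```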

We are currently testing different methods, based on recent research in the fields of graph modelling and optimal transport.
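
As an illustration of the optimal transport idea, and not necessarily one of the methods under test, the sketch below uses entropy-regularised optimal transport (Sinkhorn iterations) to find a flow matrix whose row sums match the counts at \(t_0\) and whose column sums match the counts at \(t_1\), while favouring short moves. The cost matrix, the regularisation strength `eps`, the iteration count and the visitor counts are all assumptions chosen for the example, and the total count is assumed to be equal at both timestamps.

```python
import math

def sinkhorn_flows(counts_t0, counts_t1, cost, eps=1.0, n_iter=500):
    """Entropy-regularised OT plan with row sums counts_t0 and column sums counts_t1."""
    n, m = len(counts_t0), len(counts_t1)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the two marginals.
        u = [counts_t0[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [counts_t1[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

# Three cells on a line; cost = distance between cell indices (hypothetical).
cost = [[abs(i - j) for j in range(3)] for i in range(3)]
counts_t0 = [120, 60, 20]
counts_t1 = [100, 70, 30]   # same total as counts_t0

flows = sinkhorn_flows(counts_t0, counts_t1, cost)
for row in flows:
    print([round(x, 1) for x in row])
```

Entry `flows[i][j]` is the estimated number of people moving from cell `i` to cell `j`. The result depends strongly on the chosen cost and on `eps`, which is one reason why additional information on actual movement would help.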

More details can be found in the Jupyter Notebooks in the project's GitHub repo.

Remarks

More and better data

  • Finer time intervals
  • Information on how many people actually move during a time interval
  • Test data!
  • Auxiliary data to better estimate transition probabilities

Other use-cases

  • Movement on a coarser spatio-temporal scale, e.g. commuting flows