The raw data contains the following information:
| id_sa | time | visitors | pct_female | pct_male | pct_agegroup1 | pct_agegroup2 | pct_agegroup3 | pct_agegroup4 | pct_agegroup5 | pct_agegroup6 | pct_business |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 250mN280625E479025 | 2019-08-11 00:00:00 | 242 | 49.13 | 50.87 | 15.78 | 20.88 | 16.18 | 18.90 | 15.09 | 13.18 | 5.81 |
| 250mN280625E479025 | 2019-08-11 00:30:00 | 241 | 49.31 | 50.69 | 15.44 | 21.52 | 16.51 | 18.22 | 15.45 | 12.86 | 5.87 |
| 250mN280625E479025 | 2019-08-11 01:00:00 | 235 | 50.07 | 49.93 | 15.30 | 20.70 | 16.73 | 18.59 | 16.01 | 12.67 | 5.84 |
| 250mN280625E479025 | 2019-08-11 01:30:00 | 234 | 49.05 | 50.95 | 15.35 | 20.12 | 16.71 | 19.60 | 16.00 | 12.22 | 5.93 |
| 250mN280625E479025 | 2019-08-11 02:00:00 | 232 | 47.74 | 52.26 | 14.51 | 21.10 | 17.08 | 19.18 | 15.60 | 12.52 | 5.73 |
## [1] "Number of grid cells: 276"
## [1] "Area of grid cells in m2: 62500"
## [1] "First timestamp: 2019-08-11 00:00:00"
## [1] "Last timestamp: 2019-09-10 23:30:00"
## <interval[1]>
## [1] 30m
## [1] "Time series is regular: TRUE"
## [1] "Time series has gaps: FALSE"
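The summary above could be reproduced along the following lines. This is a minimal sketch in pandas, not the code that produced the output shown: the function name `summarize_grid` and the `step` parameter are hypothetical, and it assumes only the `id_sa` and `time` columns from the table above.

```python
import numpy as np
import pandas as pd


def summarize_grid(df: pd.DataFrame, step: str = "30min") -> dict:
    """Summarize spatial and temporal coverage of a visitor-count table.

    Assumes the columns `id_sa` (cell id) and `time` (timestamp).
    `step` is the expected sampling interval (an assumption of this sketch).
    """
    times = pd.to_datetime(df["time"])
    stamps = pd.DatetimeIndex(times.unique()).sort_values()
    delta = pd.Timedelta(step)

    # Regular: every timestamp lies a whole number of steps after the first one.
    offsets = np.asarray((stamps - stamps[0]) / delta, dtype=float)
    regular = bool(np.allclose(offsets, np.round(offsets)))

    # Gaps: at least one cell is missing a timestamp of the full range.
    expected = int(round((stamps[-1] - stamps[0]) / delta)) + 1
    per_cell = times.groupby(df["id_sa"]).nunique()
    has_gaps = bool((per_cell < expected).any())

    return {
        "n_cells": int(df["id_sa"].nunique()),
        "first": stamps[0],
        "last": stamps[-1],
        "regular": regular,
        "has_gaps": has_gaps,
    }
```

Checking gaps per cell rather than over the pooled timestamps matters here, because a timestamp missing from one cell is hidden as long as any other cell still reports it.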
We can already identify some spatio-temporal patterns in the data. For example, we extracted spatially coherent clusters of similar daily visitor patterns.
For more exploratory spatio-temporal data analysis, see the Jupyter Notebooks on Exploratory Data Analysis and Time Series Clustering.
The cluster in the city center has its peak of visitors during daytime, while the cluster at the edge of the grid has its peak of visitors during nighttime.
We want to estimate the number of people moving from each cell \(i\) at timestamp \(t_0\) to each cell \(j\) at timestamp \(t_1\), using only the visitor counts in each cell at each timestamp as input.
This problem is ill-posed: the number of parameters to estimate (the probabilities of movement between cells) is \(N^2\), while the number of observations is only \(2N\), where \(N\) is the number of cells in the grid.
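To make the mismatch concrete, the short sketch below counts unknowns and observations for the 276-cell grid. It then builds a toy two-cell case in which two very different flow matrices reproduce exactly the same visitor counts. The toy matrices `stay` and `swap` are invented for illustration and are not taken from the data.

```python
import numpy as np

n_cells = 276                    # number of grid cells reported above
n_unknowns = n_cells ** 2        # one flow per ordered cell pair (i, j)
n_observations = 2 * n_cells     # one count per cell at t0 and one at t1
print(n_unknowns, n_observations)

# Toy case with two cells (illustrative values only).
# Row sums are the counts at t0, column sums are the counts at t1.
stay = np.array([[10, 0], [0, 10]])   # everyone stays in their cell
swap = np.array([[0, 10], [10, 0]])   # everyone moves to the other cell

for flows in (stay, swap):
    counts_t0 = flows.sum(axis=1)
    counts_t1 = flows.sum(axis=0)
    print(counts_t0, counts_t1)
```

Both matrices yield 10 visitors per cell at both timestamps, so the counts alone cannot tell them apart. Some additional structure, such as a preference for short moves, is needed to single out one solution.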
We are currently testing different methods, based on recent research in the fields of graph modelling and optimal transport.
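The text does not say which optimal transport formulation is being tested. Purely as an illustration of the general idea, the sketch below uses entropy-regularized optimal transport solved with Sinkhorn iterations. Everything specific in it is an assumption: the function name `sinkhorn_flows`, the squared-distance cost, the regularization strength `eps`, and the toy counts.

```python
import numpy as np


def sinkhorn_flows(v0, v1, cost, eps=0.5, n_iter=500):
    """Entropy-regularized optimal transport plan between two count vectors.

    v0, v1 : visitor counts per cell at t0 and t1
    cost   : (N, N) matrix, cost of moving one person from cell i to cell j
    eps    : regularization strength (hypothetical tuning parameter);
             larger values give more diffuse flows

    The counts are rescaled to a common total, so net inflow to or outflow
    from the grid is ignored (a simplifying assumption of this sketch).
    """
    v0 = np.asarray(v0, dtype=float)
    v1 = np.asarray(v1, dtype=float)
    a = v0 / v0.sum()
    b = v1 / v1.sum()
    kernel = np.exp(-np.asarray(cost, dtype=float) / eps)  # Gibbs kernel

    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (kernel.T @ u)   # scale columns towards the t1 distribution
        u = a / (kernel @ v)     # scale rows towards the t0 distribution

    plan = u[:, None] * kernel * v[None, :]
    return plan * v0.sum()       # back to visitor units


# Toy example: three cells on a line, cost = squared distance scaled to [0, 1].
x = np.array([0.0, 1.0, 2.0])
cost = (x[:, None] - x[None, :]) ** 2
cost = cost / cost.max()
flows = sinkhorn_flows([100, 50, 50], [50, 50, 100], cost)
print(flows.round(1))
```

The cost matrix is what resolves the ambiguity: among all flow matrices that match the observed counts, the regularized solution favors cheap (here, short) moves. A different cost or regularizer would select a different solution.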
More details can be found in the Jupyter Notebooks in the project's GitHub repo: