The use of information technology (IT) in traffic systems is becoming a hot topic within the traffic research community. It is within this context that IT and traffic blend together into intelligent transportation systems (ITS). The benefits of employing ITS are ample, ranging from increased safety, improved operational performance, enhanced mobility and environmental benefits to productivity boosts leading to economic and employment growth (Ezell, 2010). One practical outcome, for example, is that all actors in the transportation system can inform themselves and make better-informed decisions. This type of utilization of ITS leans on a key ingredient: a robust, complete and accurate picture of the traffic state in the network. This picture is generated in a process called traffic state estimation, which generally goes hand in hand with traffic state prediction, the process that produces future pictures of the traffic state.
This research towards urban traffic state estimation and prediction faces two challenges. Firstly, there is a mismatch between the amount of data and the traffic flow variables needed: a traffic network is in practice never fully covered by traffic information sources, requiring techniques to extrapolate and utilize the traffic data that is available. The second challenge results from choosing an urban environment as the subject of study. Compared to a freeway-only network, an urban environment comes with higher complexity due to, for example, lower traffic volumes, lower speed limits, more variability in velocities, a higher density of intersections, traffic signals, roundabouts and priority junctions, and dynamic interactions with other modes of transport.
Van Hinsbergen et al. (2007) describe how the vast number of traffic state estimation and prediction models used in the literature can be fitted into four categories: naïve models, parametric models, non-parametric models, and hybrid blends of two or more of these. Van Lint (2011) emphasizes that the key difficulty for traffic state estimation and prediction is therefore to find a balance between sophisticated, complex models on the one side and smooth, fast, generally applicable models on the other, in order to make valid estimations and forecasts given the data available.
The naïve category represents models in which only traffic data is used and direct relations are calculated from it. Examples are instantaneous travel time models and historical averaging models. The advantage of these models is their low computational complexity and ease of implementation; the downside is that, because traffic theory is lacking, results are often illogical and inaccurate. The parametric category represents models in which the principles of the Lighthill–Whitham–Richards model (Lighthill & Whitham, 1955) are applied. Classical examples are Newell's simplified kinematic wave model (Newell, 1993) and the cell transmission model (Daganzo, 1994). The advantage of these models is that they implement sound real-world traffic theory, at the cost of extensive parameter calibration. Additionally, because the case here is an urban network where traffic flow fundamentals such as flow conservation may not be applicable, accuracy is negatively affected. The non-parametric category represents models in which relations in traffic data are considered but no traffic flow parameters are estimated; examples are mostly based on simple regression. These models have in common that, while their complexity is low and they can easily run at real-time speed, their accuracy is, according to Van Lint (2011), generally fairly low.
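To illustrate the parametric category, the core update of a cell transmission model (Daganzo, 1994) can be sketched in a few lines for a single homogeneous road stretch. All parameter values below (free-flow speed, backward wave speed, jam density, capacity) are illustrative assumptions, not taken from the cited works:

```python
# Minimal cell transmission model update step for one road stretch.
# Cells exchange flow limited by a triangular fundamental diagram;
# closed boundaries are assumed for simplicity.

def ctm_step(densities, dt=1.0, dx=100.0, v_f=15.0, w=5.0,
             rho_jam=0.15, q_max=0.5):
    """Advance cell densities (veh/m) by one time step of dt seconds.

    v_f: free-flow speed (m/s), w: backward wave speed (m/s),
    rho_jam: jam density (veh/m), q_max: capacity (veh/s).
    """
    # Sending (demand) and receiving (supply) functions per cell
    send = [min(v_f * rho, q_max) for rho in densities]
    recv = [min(w * (rho_jam - rho), q_max) for rho in densities]
    # Flow across each internal cell boundary is the minimum of
    # upstream demand and downstream supply
    flows = [min(send[i], recv[i + 1]) for i in range(len(densities) - 1)]
    # Conservation of vehicles: move flow between adjacent cells
    new = densities[:]
    for i, q in enumerate(flows):
        new[i] -= q * dt / dx
        new[i + 1] += q * dt / dx
    return new
```

The update conserves the total number of vehicles by construction, which is exactly the flow-conservation assumption that, as noted above, may break down in an urban network with sources and sinks at intersections.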
Theoretically more suitable for urban traffic state estimation and prediction are hybrid model types, which take elements from non-parametric, parametric and naïve methods to produce more accurate estimations and predictions. The most famous example is Kalman filtering (Kalman & Bucy, 1961), which again assumes all traffic flow fundamentals to hold. Of all the other hybrid models considered in this research, the hybrid black-box approaches seem theoretically, practically and intuitively the most suitable for the urban environment chosen as the subject of this study. Along these lines, Morita (2011) and Esaway (2012) developed two interesting frameworks that exploit patterns in historical traffic data, allowing the current traffic state of links to be used as an indicator for the traffic state on neighbouring links. The goal of this research is to design a well-performing traffic state estimation and prediction framework which, by utilizing both floating car data and inductive loop detector data, delivers real-time and future link velocities, densities and flows within an urban traffic environment. The methods of Morita (2011) and Esaway (2012) serve as the starting point from which the newly developed neighbourhood link method (NLM) is created. The aim is to answer the question of how to design a neighbourhood link framework that delivers both a traffic state estimation and a traffic state prediction of all relevant traffic flow variables within an urban network, and to assess both the performance and the accuracy of the traffic states output by this NLM framework.
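The Kalman filter named above as the most famous hybrid example reduces, in its simplest scalar form, to blending a model prediction with a new measurement according to their respective certainties. The random-walk process model and the noise variances below are illustrative assumptions, not part of the cited work:

```python
# Scalar Kalman filter sketch for tracking a single link speed from
# noisy measurements; Q and R are assumed noise variances.

def kalman_update(x, P, z, Q=0.5, R=4.0):
    """One predict + update cycle.

    x, P: prior speed estimate and its variance
    z: new noisy speed measurement
    Q, R: process and measurement noise variances (assumed)
    """
    # Predict: a random-walk model, so the state carries over
    # while its uncertainty grows
    x_pred, P_pred = x, P + Q
    # Update: the Kalman gain weighs prediction against measurement
    K = P_pred / (P_pred + R)
    x_new = x_pred + K * (z - x_pred)
    P_new = (1.0 - K) * P_pred
    return x_new, P_new
```

The resulting estimate always lies between the prediction and the measurement, and its variance shrinks with every update, which is what makes the filter attractive for fusing sparse detector data with a model.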
The research method applied is divided into three parts: 1) the urban traffic state ground truth, 2) the urban traffic state estimation, and 3) the urban traffic state prediction. In the first part, a microsimulation programme (PARAMICS) is used to generate a 100% accurate and 100% covered ground truth for the Sioux Falls network. This ground truth remains unchanged throughout the research. The second part presents the extension of the ground truth framework with the NLM for traffic state estimation. Additionally, the steps of performance assessment, evaluation and synthesis are included to complete the design cycle. With the ground truth available for comparison, the estimations are assessed on accuracy and correlation, leading to the design of the best-performing NLM variant for both traffic state estimation and prediction.
The newly developed NLM framework can be summarized as follows. Initially, traffic data is stored in a database. Next, for each link it is determined from this database which links behave similarly and can be considered neighbours, based on correlation in the traffic data. When new data arrives, an estimation is generated from the traffic data of the neighbouring links alone, using linear regression. This neighbourhood estimation is subsequently fused, weighted by reliability, with the traffic data from the link itself, producing the final traffic state estimation. For the prediction part, an extra time dimension is included to incorporate the prediction horizon.
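The steps above can be sketched for a single target link using scalar speeds. The correlation threshold, the per-neighbour regression with averaged estimates, and the fixed fusion weight are simplifying assumptions for illustration, not the exact choices of the NLM framework:

```python
# Sketch of the neighbourhood link method (NLM) steps for one target
# link: neighbour selection by correlation, linear-regression
# estimation from neighbours, and reliability-weighted fusion.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def nlm_estimate(history, target, current, own_obs=None,
                 own_weight=0.5, r_min=0.7):
    """history: {link: [past speeds]}; current: {link: latest speed};
    own_obs: the link's own (possibly missing) measurement, fused with
    weight own_weight when present."""
    # Select neighbours whose history correlates with the target link
    neighbours = [l for l in history if l != target
                  and abs(pearson(history[l], history[target])) >= r_min]
    # Regress the target on each neighbour, average the estimates
    preds = []
    for l in neighbours:
        a, b = fit_line(history[l], history[target])
        preds.append(a + b * current[l])
    if not preds:
        return own_obs  # no correlated neighbours: fall back to own data
    est = sum(preds) / len(preds)
    # Fuse with the link's own measurement, weighted by reliability
    if own_obs is not None:
        est = own_weight * own_obs + (1 - own_weight) * est
    return est
```

For the prediction variant, the regressions would be fitted against the target's history shifted by the prediction horizon, adding the extra time dimension mentioned above.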
The results reveal that the NLM estimation framework, at 5% FCD, is able to estimate on average 60% of the urban links in the network within 3 mi/h of the ground truth during rush hour; in free-flow traffic this percentage drops to 50%. For density, 80% of all links are estimated within 12 veh/mi/lane, and for flow, 60% of the links are estimated within 100 veh/h/lane during rush hour, with corresponding correlation values of over 0.91 for velocity, 0.93 for density and 0.75 for flow. The NLM prediction framework, at 5% FCD and a prediction horizon of 5 minutes, again shows promising results, with only a 20% drop in accuracy for velocity. However, density predictions are up to 50% less accurate, and flow predictions up to 300% worse. The correlations of velocity and density predictions drop only marginally, but the correlation of flow predictions lowers by 10% to 15%. For a prediction horizon of 15 minutes the degradation continues: prediction accuracy decreases further and the correlations for density and flow drop well below 0.75.
This research reveals that the NLM framework can yield very reasonable traffic state estimation results in a modelled and simulated environment. Because it is simple in essence and algorithmically not very complex, NLM can also be transferred easily to a real-world scenario, and other traffic data sources can be incorporated into the process with little effort. There are, however, areas for further research, especially as the predictive ability of NLM is currently unsatisfactory. Advanced bagging of historical traffic data could provide additional accuracy, as could a different approach to finding the neighbourhood space for each link. The time lag inherent in NLM (the first link to experience congestion cannot be predicted by its neighbours) has not been overcome; a plug-in that incorporates historical traffic data differently might provide a solution here. Further work is also needed to improve and test the current implementation in more complex and realistic urban traffic networks, as not all traits that typically describe an urban environment were included in the case study used (e.g. user interaction, mode interaction, a heterogeneous vehicle mix and dynamic traffic lights).