Deliverable D2.3, “State of the Art on Transport and Mobility Data Protection Technologies,” offers an overview of the privacy risks related to mobility and transportation data and describes several techniques from the literature that aim to limit or eliminate such risks.
In their simplest form, mobility data are data about individuals that include their locations at specific times. Sources of real-time raw individual location data include, but are not limited to, cell towers, Wi-Fi access points, RFID tag readers, location-based services, and credit card payments. Historical location data, in the form of data sets in which each record corresponds to an individual and includes that individual's location data for some time periods, are referred to as trajectory microdata sets.
Such trajectory microdata sets are often of interest to transport authorities, operators, and other stakeholders for evaluating and improving their services, assessing the state of traffic, and similar purposes, and thus are often publicly released or shared. Mobility data are also occasionally shared as aggregates (e.g., heat maps) instead of at the individual level.
Whatever their form, mobility data share some statistical characteristics that make sharing them a potential privacy risk: they are highly unique and highly regular.
Unicity means that the data of different individuals are easily distinguishable from one another, particularly at some specific locations. The starting and ending locations of users' trajectories are often their home and work locations, which, again, are highly unique and can lead to reidentification. Studies show that a user's full trajectory can be uniquely recovered from knowledge of only two locations.
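The notion of unicity can be illustrated with a minimal Python sketch. The toy data set, the user and location names, and the helper functions `matching_users` and `unicity` are all hypothetical and are not taken from the deliverable. The sketch averages over every possible set of k known points, a simplification of the random sampling that published studies typically use.

```python
from itertools import combinations

# Invented toy trajectory microdata: user -> set of (location, hour) points.
trajectories = {
    "u1": {("home_A", 8), ("office_X", 9), ("gym", 18)},
    "u2": {("home_B", 8), ("office_X", 9), ("gym", 18)},
    "u3": {("home_A", 8), ("office_Y", 9), ("cafe", 18)},
}


def matching_users(known_points, data):
    """Return the users whose trajectory contains every known point."""
    known = set(known_points)
    return sorted(user for user, points in data.items() if known <= points)


def unicity(data, k):
    """Fraction of (user, k-point subset) pairs that single out one user.

    Assumption: all k-subsets are enumerated exhaustively, which is only
    feasible for tiny data sets like this one.
    """
    unique = 0
    total = 0
    for points in data.values():
        for subset in combinations(sorted(points), k):
            total += 1
            if len(matching_users(subset, data)) == 1:
                unique += 1
    return unique / total if total else 0.0


if __name__ == "__main__":
    # Knowing a home and a work location singles out user u1 here.
    print(matching_users([("home_A", 8), ("office_X", 9)], trajectories))
    for k in (1, 2):
        print(f"unicity with {k} known point(s): {unicity(trajectories, k):.2f}")
```

In this toy data set, one known point isolates a user in only a third of the cases, whereas two known points do so in most cases, which mirrors the qualitative behaviour described above.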
The regularity of trajectories means that the data of a single individual follow periodic patterns. Namely, individuals tend to follow the same trajectories on workdays: from home to work and back home.
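As a rough illustration of regularity, the following Python sketch measures how often one individual repeats their most common daily sequence of locations. The week of data and the function name `regularity` are invented for this example, and the share of days matching the dominant pattern is only one simple way to quantify regularity, not a measure defined in the deliverable.

```python
from collections import Counter

# Invented toy data: one individual's sequence of visited places per workday.
week = {
    "Mon": ("home", "work", "home"),
    "Tue": ("home", "work", "home"),
    "Wed": ("home", "work", "gym", "home"),
    "Thu": ("home", "work", "home"),
    "Fri": ("home", "work", "home"),
}


def regularity(daily_sequences):
    """Return the most common daily sequence and the share of days it covers."""
    counts = Counter(daily_sequences.values())
    pattern, occurrences = counts.most_common(1)[0]
    return pattern, occurrences / len(daily_sequences)


if __name__ == "__main__":
    pattern, share = regularity(week)
    print(" -> ".join(pattern), f"on {share:.0%} of workdays")
```

A high share means that an observer who learns the dominant pattern can predict where the individual will be on most days, which is what makes regular trajectories sensitive.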
The deliverable introduces and describes the legal and regulatory implications of collecting and processing mobility data in the context of the GDPR. It then studies the risks of reidentification for location-based services, for trajectory microdata sharing and release, and for aggregated data sharing and release, along with techniques from the literature that aim to reduce such risks.