Most analyses of travel patterns are based on the assumption of isolated individuals and ignore interpersonal relationships between travelers. In this paper, we develop a straightforward method, based on proxemics theory, to identify group travel behavior (GTB) from public transit smart card data. GTB is defined as two or more persons intentionally traveling together from a single origin to a single destination. We apply our method to Beijing to reveal the patterns of GTB, using all records generated by the subway system during a one-week period in 2010. Our data and method do not allow a reliable estimate of the GTB share in overall travel, but they do enable a description of the characteristics and the spatiotemporal pattern of GTB. The results reveal that group size and GTB frequency follow a long-tailed distribution: far more people travel in small groups than in large groups, and far more group travelers make only one group trip than make multiple group trips. Group trips tend to occur on weekends, in the afternoon, and during public holidays. Furthermore, stations and lines serving leisure destinations show the highest GTB scores. We conclude that the GTB pattern is distinctly different from the pattern of individual travel in terms of both time and space, and is essentially influenced by the urban land uses surrounding subway stations.
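The abstract does not give the matching rules used to detect groups, so the following is only a minimal sketch of one plausible reading: riders who share an origin and a destination and who tap in and tap out within a few seconds of one another are chained into a group. The record layout, the function name `find_groups` and the `max_gap` threshold of 5 seconds are assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict

def find_groups(trips, max_gap=5):
    """Chain riders with the same origin/destination and near-simultaneous taps.

    trips: list of (card_id, origin, dest, t_in, t_out), times in seconds.
    max_gap: hypothetical threshold (seconds) between consecutive taps.
    Returns a list of ((origin, dest), [card_ids]) with at least two riders.
    """
    by_od = defaultdict(list)
    for card, origin, dest, t_in, t_out in trips:
        by_od[(origin, dest)].append((t_in, t_out, card))

    groups = []
    for od, recs in by_od.items():
        recs.sort()  # order by tap-in time
        current = [recs[0]]
        for rec in recs[1:]:
            prev = current[-1]
            close_in = rec[0] - prev[0] <= max_gap
            close_out = abs(rec[1] - prev[1]) <= max_gap
            if close_in and close_out:
                current.append(rec)
            else:
                if len(current) >= 2:
                    groups.append((od, [c for _, _, c in current]))
                current = [rec]
        if len(current) >= 2:
            groups.append((od, [c for _, _, c in current]))
    return groups

trips = [
    ("a", "S1", "S2", 100, 1000),
    ("b", "S1", "S2", 103, 1002),  # taps right after "a" at both ends
    ("c", "S1", "S2", 300, 1200),  # same OD pair, but much later
    ("d", "S1", "S3", 101, 900),   # different destination
]
groups = find_groups(trips)
```

A rule of this kind cannot separate intentional companions from strangers who happen to tap together, which is consistent with the abstract's caution about estimating the overall GTB share.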
Unlike the data from traditional sources, there have not been standard ways to validate the quality and reliability of information derived from big data. This article argues that the theory of urban formation can be used to do the validation. In addition, the information derived from big data can be used to verify and even extend existing theories or hypotheses of urban formation. It proposes a general framework regarding how the theory of urban formation can be employed to validate information derived from smart card data and how the validated information can supplement other data to reveal spatial patterns of economic agglomeration or human settlements. Through a case study of Beijing, it demonstrates the usefulness of the framework. Additionally, it utilizes smart card data to delineate characteristics of subcenters defined by bus commuters of Beijing.
In developing urban public transport systems, bus station coverage is an essential indicator of the service level of public transportation. Based on detailed bus station data, this study calculates the bus station coverage ratio of the urban built-up area for 313 major cities. Among them, the 281 cities at or above the prefecture level have an average coverage ratio of 64.4%. The study also reveals significant correlations within two groups of variables: first, between the bus station coverage ratio and both population density and bus station density; second, between public transportation trips per ten thousand people and both the number of public buses per ten thousand people and the per capita GDP of the urban district. According to the spatial features of the bus station coverage ratio, the study divides the 313 cities into five categories and seeks general patterns and rules of the Chinese public transportation service system. A further exploratory analysis examines human activities and facilities within the 500 m service area of bus stations, based on Flickr photos, Weibo positions and POI data. It indicates that 94.4% of facilities and more than 92% of human activities fall within this service area, which suggests that the layout of bus stations in Chinese cities meets most of the demand from human activities and facilities. Most existing studies of bus station coverage focus on a single city, and research on general patterns across the majority of Chinese cities is rare. On the one hand, nationwide studies are constrained by the large volume of base data required; on the other hand, studying most cities at the micro scale is held back by the difficulty of moving between scales. This study is an experiment in addressing these micro-scale problems by considering both macro-scale and micro-scale analysis units and taking advantage of detailed data and analysis.
Therefore, the results of this study can provide evidence for optimizing urban public transportation services and advancing urban public transportation planning.
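To make the coverage ratio concrete, here is a minimal sketch that approximates it on a grid: the built-up area is represented by the centres of equal-sized cells, and a cell counts as covered if a bus station lies within the 500 m service radius mentioned above. The grid approximation, the projected coordinates in metres and the function name `coverage_ratio` are assumptions for illustration; the study itself may well use polygon buffers in a GIS.

```python
import math

def coverage_ratio(cells, stops, radius=500.0):
    """Share of built-up cells within `radius` metres of any bus station.

    cells: list of (x, y) cell centres of the built-up area, in metres.
    stops: list of (x, y) bus station locations, in metres.
    """
    if not cells:
        return 0.0
    covered = sum(
        1
        for cx, cy in cells
        if any(math.hypot(cx - sx, cy - sy) <= radius for sx, sy in stops)
    )
    return covered / len(cells)

# Four built-up cells and a single station at the origin.
cells = [(0, 0), (400, 0), (1000, 0), (0, 600)]
stops = [(0, 0)]
ratio = coverage_ratio(cells, stops)  # two of four cells are within 500 m
```

The same function can be pointed at facility or activity locations instead of cell centres to obtain the share of points falling inside the service area.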
This paper has been published and attached here.
The wide application of smart card-based automated fare collection systems for public transportation has produced large quantities of spatiotemporal data at the individual level. Such data not only record the mobility behavior of cardholders, but also reveal the usage patterns of cities. Owing to their high spatiotemporal resolution, low cost and large quantity, transit smart card data attract more and more attention from urban and transport planners and play an increasingly important role in urban studies. This article presents a comprehensive review of the latest developments in quantitative urban studies empowered by smart card data, from both China and overseas. The review covers the following four aspects: (1) data processing and origin-destination inference, (2) transit operation and management, (3) spatial structure of cities, and (4) mobility behavior and social networks. Finally, the review summarizes existing studies, gives a brief introduction to privacy protection and information extraction, and presents potential future avenues of research.
For details, please see our working paper 64 at the Working Papers channel of BCL (http://www.beijingcitylab.com/working-papers-1/).
Public transportation in big cities is a crucial part of urban transportation infrastructure. Exploring the spatiotemporal patterns of public transit trips can help us understand dynamic transportation patterns and complex urban systems, thus supporting better urban planning and design. The availability of large-scale smart card data (SCD), one type of urban Big Data collected by public transportation operations and management institutions, offers new opportunities to study intra-urban structure and spatial interaction dynamics. Previous research has used such data to investigate jobs-housing relationships and commuting patterns, and has compared the results with those of the traditional, high-cost travel survey approach. In this research, we are interested in extracting origin-destination (OD) flow matrices aggregated at the scale of traffic analysis zones (TAZs) and analyzing the intra-urban spatial interaction patterns revealed by human movements among TAZs using public transportation. Traditional spatial clustering approaches are not sufficient to explore the network structure of spatial interactions between different regions. We applied community detection methods from the study of complex networks to examine the dynamic spatial structures of public transportation communities in the Beijing Metropolitan Area (16,410 km²). These methods can help to find the ground-truth community structure of TAZs strongly connected by public transportation, which may yield insights for urban planners on land use patterns or for transportation engineers on traffic congestion.
In Beijing, most bus and metro passengers use smart cards to pay their fares when boarding and alighting. Thus, individual OD trips connecting bus stops (or metro stations) can be extracted directly from the detailed SCD records. The collected SCD consist of 97.9 million trips made by 10.9 million anonymized smart card users during a one-week period from April 5 to April 11, 2010. To create the public transportation OD flow matrices at the TAZ level, we first georeferenced all bus stops and metro stations with latitude/longitude coordinates, and then spatially joined them to the 1,911 Beijing TAZ boundaries. A directed, weighted link between two TAZs represents the total number of public transit trips from the origin TAZ to the destination TAZ in a given time interval. Regarding the temporal dynamics, we aggregated the data into different hourly and daily periods to study the spatiotemporal patterns in public transportation, as well as variations between weekdays and weekends.
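The aggregation step described above can be sketched as follows. This is a minimal illustration that assumes the spatial join has already been done and is available as a stop-to-TAZ lookup; the record layout and the names `od_flows` and `stop_to_taz` are hypothetical.

```python
from collections import Counter

def od_flows(trips, stop_to_taz):
    """Count trips per (hour, origin TAZ, destination TAZ).

    trips: iterable of (origin_stop, dest_stop, hour) records.
    stop_to_taz: dict mapping a stop or station id to its TAZ id
                 (the result of the spatial join).
    Trips whose stops fall outside every TAZ are skipped.
    """
    flows = Counter()
    for origin_stop, dest_stop, hour in trips:
        origin_taz = stop_to_taz.get(origin_stop)
        dest_taz = stop_to_taz.get(dest_stop)
        if origin_taz is None or dest_taz is None:
            continue
        flows[(hour, origin_taz, dest_taz)] += 1
    return flows

stop_to_taz = {"stop_1": "TAZ_A", "stop_2": "TAZ_A", "stop_3": "TAZ_B"}
trips = [
    ("stop_1", "stop_3", 8),
    ("stop_2", "stop_3", 8),   # different stop, same origin TAZ
    ("stop_3", "stop_1", 18),
    ("stop_1", "stop_9", 8),   # unknown stop, dropped
]
flows = od_flows(trips, stop_to_taz)
```

Summing the counter over hours gives the daily directed, weighted links; keeping the hour in the key preserves the hourly breakdown.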
In the study of complex networks, a community is defined as a subset (group) of the whole network whose nodes are densely connected internally. The identification of such densely connected nodes is called community detection. We first converted the TAZ-scale OD flow matrices for the seven consecutive days into seven undirected, weighted graphs, in which each TAZ is a node and each OD flow is a weighted edge linking two TAZs. Then, the widely used Newman modularity maximization method was applied to find the daily public transportation communities. The modularity measure compares a proposed graph division with a null model in which connections between nodes are random. Modularity is defined as the sum, over communities, of the difference between the fraction of edges falling within a community and the expected value of the same quantity under the null model. In practice, a bottom-up fast greedy algorithm was adopted to search for a graph partition that maximizes the modularity measure. First, each TAZ starts as its own community, and the modularity values for all pairs of communities are calculated. Second, the pair whose OD flow exceeds the null-model expectation by the largest margin is merged into one community. Third, the modularity of the new graph is recalculated, and the procedure repeats until the maximum modularity is found. A larger modularity value indicates a more robust community structure.
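The three steps above can be sketched in a few lines. This is a deliberately naive version for illustration: it recomputes the modularity of every candidate merge from scratch rather than using the incremental data structures of the fast greedy algorithm, and it assumes an undirected, weighted graph without self-loops (intra-TAZ flows dropped). The function names and the toy graph are hypothetical.

```python
from itertools import combinations

def modularity(edges, community_of):
    """Weighted modularity of a partition.

    edges: {(u, v): weight}, undirected, no self-loops.
    community_of: {node: community label}.
    """
    total = sum(edges.values())
    inside, strength = {}, {}
    for (u, v), w in edges.items():
        cu, cv = community_of[u], community_of[v]
        strength[cu] = strength.get(cu, 0.0) + w
        strength[cv] = strength.get(cv, 0.0) + w
        if cu == cv:
            inside[cu] = inside.get(cu, 0.0) + w
    # Fraction of weight inside each community minus its null-model expectation.
    return sum(inside.get(c, 0.0) / total - (s / (2 * total)) ** 2
               for c, s in strength.items())

def greedy_communities(edges):
    """Merge the pair of communities with the largest gain until none helps."""
    nodes = sorted({n for edge in edges for n in edge})
    community_of = {n: n for n in nodes}  # step 1: one community per node
    best_q = modularity(edges, community_of)
    while True:
        labels = sorted(set(community_of.values()))
        best_merge = None
        for a, b in combinations(labels, 2):  # step 2: try every merge
            trial = {n: (a if c == b else c) for n, c in community_of.items()}
            q = modularity(edges, trial)
            if q > best_q + 1e-12:
                best_q, best_merge = q, trial
        if best_merge is None:  # step 3: stop at the modularity maximum
            return community_of, best_q
        community_of = best_merge

# Two triangles of zones joined by a single weak link.
edges = {(1, 2): 1, (2, 3): 1, (1, 3): 1,
         (4, 5): 1, (5, 6): 1, (4, 6): 1,
         (3, 4): 1}
communities, q = greedy_communities(edges)
```

On this toy graph the procedure recovers the two triangles as separate communities. For 1,911 TAZs the repeated recomputation would be slow, so a graph library's implementation is the practical choice.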
The community detection results showed that weekday public transportation patterns identified some geographically cohesive regions that correspond remarkably well with administrative districts in Beijing, while some unexpected spatial structures might reveal hidden urban structure that needs further investigation. Suburban public transportation communities usually contain more TAZs than those in the urban center. There are strong public transit connections among TAZs located along the central west-to-east corridor that includes Chang'an Avenue, along which metro line 1 also runs. Surprisingly, most of the southern TAZs were aggregated into one large transportation community. This indicates that public transit trips within the southern community are more frequent than trips between it and other sub-regions of the metropolitan area. In addition, a southwest TAZ was aggregated into a large, spatially separate community in the northeast region on weekends but not on weekdays. This suggests that a recreational place of interest in the southwest TAZ attracts a large share of public transit trips from the large northeast community. We also found that the daily community detection results based on SCD differ from those based on household travel surveys. A comparative study might help to identify the characteristics or biases of the different data sources. Moreover, the community detection results for hourly aggregated trips, especially commuting trips at peak hours, yield insights into overall job-related mobility patterns and intra-TAZ spatial interactions using public transportation.
This paper appears in the Springer book "Geospatial Analysis for Supporting Urban Planning in Beijing" as a chapter.
Location Based Services (LBS) provide a new perspective for spatiotemporally analyzing dynamic urban systems. Research has investigated urban dynamics using GSM (Global System for Mobile Communications), GPS (Global Positioning System), SNS (Social Networking Services) and Wi-Fi techniques. However, less attention has been paid to the analysis of urban structure (especially commuting pattern) using smart card data (SCD), which are widely available in most cities. Additionally, ubiquitous LBS data, although providing rich spatial and temporal information, lacks rich information on the social dimension, which limits its in-depth application. To bridge this gap, this paper combines bus SCD for a one-week period with a one-day household travel survey, as well as a parcel-level land use map to identify job-housing locations and commuting trip routes in Beijing. Two data forms (TRIP and PTD) are proposed, with PTD used for jobs-housing identification and TRIP used for commuting trip route identification. The results of the identification are aggregated in the bus stop and traffic analysis zone (TAZ) scales, respectively. Particularly, commuting trips from three typical residential communities to six main business zones are mapped and compared to analyze commuting patterns in Beijing. The identified commuting trips are validated on three levels by comparison with those from the survey in terms of commuting time and distance, and the positive validation results prove the applicability of our approach. Our experiment, as a first step toward enriching LBS data using conventional survey and urban GIS data, can obtain solid identification results based on rules extracted from existing surveys or censuses.
This study has been published in Computers, Environment and Urban Systems.
Discovering functional zones using bus smart card data and points of interest in Beijing
Cities comprise various functional zones, including residential, educational and commercial zones. It is important for urban planners to identify different functional zones and understand their spatial structure within the city in order to make better urban plans. In this research, we used 77,976,010 bus smart card records from Beijing covering one week in April 2008 and converted them into two-dimensional time series data for each bus platform. Then, through data mining in the big database system and drawing on previous studies of citizens’ trip behavior, we established the DZoF (Discovering Zones of different Functions) model based on SCD (Smart Card Data) and POIs (Points of Interest), and pooled the results at the TAZ (traffic analysis zone) level. The results suggest that the DZoF model, together with cluster analysis based on dimensionality reduction and the EM (expectation-maximization) algorithm, can identify functional zones that match the actual land uses in Beijing well. The methodology can help urban planners and the public understand the complex urban spatial structure and contributes to research in urban geography and urban planning.
This paper appears in the Springer book "Geospatial Analysis for Supporting Urban Planning in Beijing".
This paper seeks to understand extreme public transit riders in Beijing using both a traditional household survey and emerging new data sources such as Smart Card Data (SCD). We focus on four types of extreme transit behavior: public transit riders who (1) travel significantly earlier than average riders (the ‘early birds’); (2) ride at unusually late hours (the ‘night owls’); (3) commute excessively long distances (the ‘tireless itinerants’); and (4) travel unusually many times in a day (the ‘recurring itinerants’). SCD are used to identify the spatiotemporal patterns of these four extreme transit behaviors. In addition, household survey data are employed to supplement the socioeconomic background and provide a tentative profile of extreme travelers. While the research findings are useful for guiding urban governance and planning in Beijing, the methods developed in this paper can be applied to understand travel patterns elsewhere.
For details, see BCL working paper 57 "Early Birds, Night Owls, and Tireless/Recurring Itinerants: An Exploratory Analysis of Extreme Transit Behaviors in Beijing, China"
Media coverage by The Paper: http://m.thepaper.cn/newsDetail_forward_1302674
By Prof SHEN Hao
Maps download: https://www.dropbox.com/s/dozi6feuxv2uzxc/BUS_SHENHAO.rar
Data: BCL Data 18 Bus routes and stops of Beijing (http://longy.jimdo.com/data-released/)
The mobility of economically underprivileged residents in China has seldom been well profiled, owing to privacy issues and the particular characteristics of poverty in China. In this paper, we identify and characterize underprivileged residents in Beijing using widely available public transport smart card transactions from 2008 and 2010. We regard frequent bus/metro riders (FRs) in China, especially in Beijing, as economically underprivileged residents, which is supported by (1) the household travel survey in 2010, (2) a small-scale survey in 2012, and (3) our interviews with local residents in Beijing. Cardholders' places of residence and work are identified using the SCD from 2008 and 2010, across which each cardholder’s unique card ID remained the same. We then profile all identified FRs into 20 groups in terms of their place of residence variation (change, no change), workplace variation (change, no change, find a job, lose a job, and jobless all the time) during 2008-2010, and housing place in 2010 (within the fourth ring road or not), in order to derive policy implications for practitioners. The degree to which each FR is underprivileged is then evaluated using the 2014 SCD. Finally, the potential biases and contributions are summarized. To the best of our knowledge, this is the first study to use immediate “big data” to understand long- or mid-term urban dynamics, and also the first to profile underprivileged residents in Beijing at a fine scale.
This paper appears in the Springer book "Geospatial Analysis for Supporting Urban Planning in Beijing" as a chapter.
Existing studies have extensively used spatiotemporal data to mine the mobility patterns of different kinds of travelers. Smart Card Data (SCD) collected by Automated Fare Collection (AFC) systems can give a general view of the mobility patterns of all bus and metro riders in an urban area. Because mobility and stability are temporally and spatially dynamic and therefore difficult to measure, few studies focus on how travel patterns change over a long time interval. In this paper, an overview of the relation between the stability and regularity of public transit riders, based on the SCD of Beijing, is presented first. To analyze the temporal travel patterns of urban residents, travelers are classified into two categories: extreme and non-extreme travelers. We profile cardholders along two lines: a rule-based approach for extreme travelers and an improved density-based clustering method for non-extreme travelers. Similar clusters are aggregated according to their features of regularity and occasionality. By combining the transition matrix of passengers' temporal travel patterns with socioeconomic data of Beijing for 2010 and 2014, several analyses of residents' temporal mobility and stability are presented to shed light on the interdependence between stability and mobility in the time dimension. The results indicate that passengers' regularity is hard to predict, that extreme travel patterns are more vulnerable, and that non-extreme travel patterns overall remain nearly the same.
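A transition matrix of the kind mentioned above can be built once each cardholder has a pattern label in both years. The sketch below assumes such labels already exist (from the rule-based and clustering steps) and uses the hypothetical labels "regular", "occasional" and "extreme"; the function name `transition_matrix` is also an assumption for illustration.

```python
from collections import Counter

def transition_matrix(pattern_2010, pattern_2014):
    """Row-normalised transition shares between two years.

    pattern_2010, pattern_2014: {card_id: pattern label}.
    Only cards observed in both years are counted.
    Returns {(label_2010, label_2014): share of the 2010 label's cards}.
    """
    counts = Counter()
    for card, before in pattern_2010.items():
        after = pattern_2014.get(card)
        if after is not None:
            counts[(before, after)] += 1

    row_totals = Counter()
    for (before, _), n in counts.items():
        row_totals[before] += n
    return {pair: n / row_totals[pair[0]] for pair, n in counts.items()}

pattern_2010 = {"a": "regular", "b": "regular", "c": "extreme", "d": "extreme"}
pattern_2014 = {"a": "regular", "b": "occasional", "c": "regular"}  # "d" not seen
matrix = transition_matrix(pattern_2010, pattern_2014)
```

Each row then reads as the share of cardholders with a given 2010 pattern who ended up in each 2014 pattern, which is how persistence and vulnerability of a pattern can be compared.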
For more, please see the BCL working paper 74.
When the overall commuting pattern of a city or region is optimized, there are often winners and losers among commuters at the subdivision level. Losers are those who are burdened with longer commutes than before the optimization. Knowing who or where the losers are is of interest to both researchers and policy-makers, because the information would help them locate losers efficiently and compensate them. Few studies, however, pay attention to such losers. By revisiting “excess commuting” within an economic framework, we show that optimizing the commuting pattern is comparable to restoring Pareto optimality in commuting. Using Beijing as a case study, we identify and geo-visualize the losers when the city’s bus commuting pattern is optimized. We examine the severity of the loss among the losers, the spatial pattern of the losers and their influencing factors. We find that most losers are located around the epicenter. The severity of the loss is independent of the jobs/housing ratio but is associated with the commute distance before the optimization. Workers whose commute distance is less than the global average are more likely to become losers. The areas where losers reside have significantly lower employment density in a few industries than the areas where non-losers reside. A low jobs/housing ratio in an individual subarea does not necessarily increase the average trip length of commuters therein. A low jobs/housing ratio in one or several subareas, however, could influence the average trip length of all commuters in the area. Locating diverse job and housing opportunities around or along transit corridors could compensate the losers and reduce the overall commuting cost.
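A toy example makes the notion of a loser concrete. The sketch below reassigns a handful of workers to the existing jobs so that total commute distance is minimal, then lists the workers whose own commute got longer. It uses brute-force enumeration, which only works for tiny inputs; excess commuting studies normally solve the equivalent transportation problem on zone-level flows with linear programming. The distances and the function name are hypothetical.

```python
from itertools import permutations

def optimize_and_find_losers(dist, current):
    """Minimise total commute distance and report who ends up worse off.

    dist[i][j]: commute distance from worker i's home to job j.
    current[i]: job currently held by worker i (each job held once).
    Returns (optimal assignment, loser indices, total before, total after).
    """
    n = len(current)
    jobs = sorted(current)
    best = min(permutations(jobs),
               key=lambda p: sum(dist[i][p[i]] for i in range(n)))
    losers = [i for i in range(n) if dist[i][best[i]] > dist[i][current[i]]]
    before = sum(dist[i][current[i]] for i in range(n))
    after = sum(dist[i][best[i]] for i in range(n))
    return best, losers, before, after

# Worker 0 has a short commute today; worker 1 has a very long one.
dist = [
    [1, 2],    # worker 0: distance to job 0, job 1
    [1, 10],   # worker 1: distance to job 0, job 1
]
current = [0, 1]
best, losers, before, after = optimize_and_find_losers(dist, current)
```

Swapping the two jobs cuts the total distance from 11 to 3, yet worker 0, whose original commute was below average, now travels farther. That is the pattern the abstract describes.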
Universities are places where innovation, face-to-face interaction and social capital are commonplace. But universities, often regarded as “the ivory tower”, cannot be separated from the social and economic transformations taking place outside them. Traffic, information and financial flows between universities and other locations can be used to reveal connections between the ivory tower and other locales. Therefore, we use weekday public transit smart card records from April 6 to April 9, 2010 (158,262 transit trips in total, including bus-only, bus-plus-subway and subway-only trips) to identify and profile the most popular destinations of student riders from the 985 universities (a short list of top universities designated by the Chinese Central Government in 1999) and the associated transit trip flows in Beijing. We identify destination hotspots for students of the 985 universities in Beijing, allocate traffic volume to major roads and delineate the transit trips of students from each campus. Our results indicate that there are only weak ties between the top universities and the most disadvantaged areas.