Synthetic populations are computer-generated groups of people designed to resemble real populations. They are built from public census information on people’s characteristics, such as age, gender, and job, combined with statistical algorithms. Their main application is in so-called social simulations, which assess possible solutions to social problems in areas such as transportation, health, and housing. During the COVID-19 pandemic, for example, scientists around the world ran social simulations to estimate the number of cases in each country.
In Japan, researchers have been carrying out such simulations on supercomputers since 2020 under the COVID-19 AI & Simulation Project, led by the Cabinet Secretariat of the Japanese government. The results were given significant consideration in deciding policy measures such as PCR testing policies, immigration limits, domestic tourism support, and vaccination programs. These simulations were made possible by a synthetic population that has been prepared and updated since 2017 under the Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures (JHPCN) project.
However, this Japanese synthetic population had a significant limitation: although a home address was one of the attributes assigned to each individual, a workplace location was not. As a result, the synthetic population represented the night-time distribution of people well, but not their day-time distribution or the relationship between the two.
To tackle this problem, a trio of Japanese researchers including Assistant Professor Takuya Harada of Shibaura Institute of Technology, as well as Dr. Tadahiko Murata and Mr. Daiki Iwase of the Faculty of Informatics at Kansai University, recently devised a method to assign a workplace attribute to each worker in synthetic populations. Their study was published in IEEE Transactions on Computational Social Systems and was supported by both JHPCN and the Japan Science and Technology Agency (JST).
The main challenge the researchers had to overcome was the lack of statistical information linking people’s home and workplace locations. In Japan, only local governments whose area has over 200,000 residents release complete origin-destination-industry (ODI) statistics, which provide details about the movement of workers as well as their industry type (such as retail, construction, or manufacturing). For cities, towns, or villages with fewer than 200,000 residents, the available ODI data is less specific and indicates only whether a person works in the same city, in another city within the same prefecture, or in a city in a different prefecture. Unfortunately, approximately 48% of workers in Japan reside in cities with fewer than 200,000 residents.
Thus, the research team combined the available ODI data with origin-destination (OD) data and developed an innovative workplace assignment method that works for all cities, towns, and villages in Japan. To validate the method, they used it to assign workplaces to people in cities with more than 200,000 residents and compared the results with the complete ODI data available for those cities. For the city of Takatsuki in Osaka Prefecture, which the researchers showcased as an example in their paper, the proposed method assigned the correct city as the workplace for 88.2% of workers.
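To make the idea of combining coarse ODI data with OD data more concrete, the following is a minimal sketch and not the researchers' actual algorithm, which the article does not describe in detail. It assumes a two-step sampling scheme: first draw one of the three coarse ODI categories according to a worker's industry, then pick a destination city within that category in proportion to OD commuter counts. All city names, shares, counts, and function names here are hypothetical and chosen purely for illustration.

```python
import random

# Hypothetical toy data, not taken from the study.
# Prefecture of each city (cities A, B, C share a prefecture; D does not).
PREFECTURE = {"A": "P1", "B": "P1", "C": "P1", "D": "P2"}

# Coarse ODI shares for workers living in city "A", by industry:
# only three destination categories are known for small cities.
ODI_COARSE = {
    "retail": {"same_city": 0.6, "same_prefecture": 0.3, "other_prefecture": 0.1},
    "manufacturing": {"same_city": 0.2, "same_prefecture": 0.5, "other_prefecture": 0.3},
}

# OD commuter counts from city "A" to each destination, all industries pooled.
OD_COUNTS = {"A": 500, "B": 300, "C": 100, "D": 100}


def category_of(home, dest):
    """Map a home/destination pair to one of the three coarse ODI categories."""
    if dest == home:
        return "same_city"
    if PREFECTURE[dest] == PREFECTURE[home]:
        return "same_prefecture"
    return "other_prefecture"


def assign_workplace(home, industry, rng):
    """Assumed two-step draw: coarse category first, then a city within it."""
    shares = ODI_COARSE[industry]
    categories = list(shares)
    category = rng.choices(categories, weights=[shares[c] for c in categories])[0]
    # Restrict OD destinations to those that fall in the drawn category.
    candidates = {d: n for d, n in OD_COUNTS.items() if category_of(home, d) == category}
    destinations = list(candidates)
    return rng.choices(destinations, weights=[candidates[d] for d in destinations])[0]


def match_rate(assigned, reference):
    """Share of workers whose assigned city equals the reference city."""
    hits = sum(1 for a, r in zip(assigned, reference) if a == r)
    return hits / len(reference)


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [assign_workplace("A", "retail", rng) for _ in range(10)]
    print(sample)
```

A comparison like the one described for Takatsuki could then, under the same assumptions, be expressed with `match_rate`, passing the assigned cities and the cities implied by the complete ODI statistics. The sketch assumes every coarse category has at least one candidate destination in the OD data; a real implementation would need to handle empty categories.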
The possible applications for detailed social simulations using synthetic populations are manifold, as Professor Murata of Kansai University remarks: “Real-scale social simulations can be used for estimating the efficiency of urban developments, including housing and transportation projects, as well as the influence of social programs conducted by national or local governments. They can also be employed for rescue and relief programs when facing disasters such as earthquakes, tsunamis, floods, typhoons, and pandemics.” Put simply, social simulations can help decision-makers accurately envision various possible futures.
Another important aspect of synthetic populations is that they are free from data privacy concerns. “Synthetic populations are a secure technology because no private information is used,” explains Assistant Professor Harada. “Because we synthesize multiple sets of populations that have the same statistical characteristics, third parties cannot identify whether real information is included or not.” Notably, this study produced the world’s first synthetic populations with workplace information to be publicly released for engineers and researchers.
The research team is already working on using their new workplace assignment method to estimate the day-time population distribution throughout Japan.