BACKGROUND
Point of Interest (POI) data (e.g., the locations people visit, derived from mobile device movement and self-reports) is increasingly studied in spatial analyses of pressing healthcare issues, such as substance use prevention and treatment. However, retrieving accurate healthcare POI information remains difficult, in part because little POI data is available outside proprietary sources. Given the abundance of open-source projects and commercial geographical databases, POI conflation, which merges POI records from different sources, may be a useful method for enriching spatial data attributes and coverage.
OBJECTIVE
This study proposes a multi-step framework for healthcare POI conflation and spatial enrichment.
METHODS
The framework comprises the following steps: POI data collection from Open-source Location-Based Services (OLBS), collection of geographic and spatial attributes, similarity calculation across datasets, POI matching, spatial attribute enrichment, manual labeling, Quality Control (QC), and deployment of an enriched database. We tested the viability of the framework on a drug and substance abuse use case in California, USA.
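The similarity calculation and POI matching steps can be illustrated with a minimal sketch. It is not the study's implementation: it assumes that each POI is a record with "name", "lat", and "lon" fields, that two records refer to the same place when they lie within a distance threshold and have similar names, and that matched records are merged by filling in missing attributes. The function names, the 100 m and 0.8 thresholds, and the use of Python's difflib for name similarity are all illustrative assumptions.

```python
import math
from difflib import SequenceMatcher


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 coordinates."""
    r = 6371000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def conflate(source_a, source_b, max_dist_m=100.0, min_name_sim=0.8):
    """Merge two POI lists; thresholds are hypothetical, not from the study."""
    merged = [dict(p) for p in source_a]  # copy so inputs are not mutated
    for cand in source_b:
        match = None
        for poi in merged:
            close = haversine_m(poi["lat"], poi["lon"], cand["lat"], cand["lon"]) <= max_dist_m
            similar = name_similarity(poi["name"], cand["name"]) >= min_name_sim
            if close and similar:
                match = poi
                break
        if match is None:
            merged.append(dict(cand))  # unmatched record extends coverage
        else:
            for key, value in cand.items():
                match.setdefault(key, value)  # enrich with missing attributes only
    return merged
```

In this sketch, records that match contribute new attributes to an existing POI, while unmatched records are appended as new POIs; this is how conflation can increase both attribute richness and coverage. A production pipeline would likely add spatial indexing and the manual labeling and QC steps listed above.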
RESULTS
Our automated approach detected 11,936 unique POIs related to this healthcare tag. In a proprietary commercial dataset (SafeGraph), 38% of healthcare POIs were substance use-related (n = 104). In contrast, in our framework, which included a much larger number of healthcare POIs due to our conflation matrix, 33% of healthcare POIs were substance use-related (n = 559).
CONCLUSIONS
We conclude that the proposed framework can provide a larger number of POIs with relevance and spatial attributes similar to those of proprietary commercial datasets, making it a low-cost method for studying substance use-related POIs.