Abstract
Network analysis provides powerful tools to learn about a variety of social systems. However, most analyses implicitly assume that the considered relational data is error-free and reliable, and that it accurately reflects the system to be analysed.
Especially if the network consists of multiple groups (e.g., genders, races), this assumption conflicts with a range of systematic biases, measurement errors and other inaccuracies that are well documented in the literature.
To investigate the effects of such errors, we introduce a framework for simulating systematic bias in attributed networks. Our framework enables us to model erroneous edge observations that are driven by external node attributes or errors arising from the (hidden) network structure itself. We exemplify how systematic inaccuracies distort conclusions drawn from network analyses on the task of minority representations in degree-based rankings. By analysing synthetic and real networks with varying homophily levels and group sizes, we find that the effect of introducing systematic edge errors depends on both the type of edge error and the level of homophily in the system: in heterophilic networks, minority representations in rankings are very sensitive to the type of systematic edge error. In contrast, in homophilic networks we find that minorities are at a disadvantage regardless of the type of error present. We thus conclude that the implications of systematic bias in edge data depend on an interplay between network topology and type of systematic error. This emphasises the need for an error model framework as developed here, which provides a first step towards studying the effects of systematic edge-uncertainty for various network analysis tasks.
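The abstract does not specify the error model itself, so the following is only a minimal sketch of the kind of experiment it describes, under stated assumptions: a two-group random graph whose homophily is set by a single parameter `h`, an attribute-driven error that misses each edge with a probability depending on the groups of its endpoints, and the minority share among the top-`k` nodes by degree as the outcome. All function names, parameters, and numeric values are hypothetical and are not taken from the paper.

```python
import random

def make_network(n, minority_frac, h, p, rng):
    """Assumed generator: an edge forms with probability p*h within a group
    and p*(1-h) across groups (h > 0.5 is homophilic, h < 0.5 heterophilic)."""
    n_min = int(n * minority_frac)
    group = [1 if i < n_min else 0 for i in range(n)]  # 1 marks the minority
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            prob = p * (h if group[i] == group[j] else 1 - h)
            if rng.random() < prob:
                edges.add((i, j))
    return group, edges

def drop_edges(edges, group, drop_prob, rng):
    """Assumed attribute-driven error: each true edge is missed with a
    probability looked up by the sorted pair of its endpoints' groups."""
    observed = set()
    for i, j in edges:
        key = tuple(sorted((group[i], group[j])))
        if rng.random() >= drop_prob[key]:
            observed.add((i, j))
    return observed

def minority_share_top_k(edges, group, k, rng):
    """Fraction of minority nodes among the k highest-degree nodes.
    Nodes are shuffled first so that degree ties are broken at random."""
    deg = [0] * len(group)
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    order = list(range(len(group)))
    rng.shuffle(order)
    order.sort(key=lambda v: -deg[v])  # stable sort keeps the random tie order
    return sum(group[v] for v in order[:k]) / k

rng = random.Random(0)
group, true_edges = make_network(n=200, minority_frac=0.2, h=0.8, p=0.1, rng=rng)
# Hypothetical error rates: edges between two minority nodes are missed most often.
rates = {(0, 0): 0.1, (0, 1): 0.2, (1, 1): 0.5}
observed_edges = drop_edges(true_edges, group, rates, rng)
print("true share:    ", minority_share_top_k(true_edges, group, 20, rng))
print("observed share:", minority_share_top_k(observed_edges, group, 20, rng))
```

Varying `h` and the entries of `rates` in such a sketch is one way to explore how homophily and error type interact. The structure-driven errors mentioned in the abstract would need a different `drop_edges` rule, for example one that depends on node degree rather than group membership.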
Funder
Bundesministerium für Bildung und Forschung
Ministry of Culture and Science (MKW) of the German State of North Rhine-Westphalia
RWTH Aachen University
Publisher
Springer Science and Business Media LLC
Subject
Computational Mathematics, Computer Networks and Communications, Multidisciplinary