Abstract
Purpose
– The purpose of this paper is to present a Big Data solution as a methodological approach to the automated collection, cleaning, collation, and mapping of multimodal, longitudinal data sets from social media. The paper constructs social information landscapes (SIL).
Design/methodology/approach
– The research presented here adopts a Big Data methodological approach for mapping user-generated content in social media. The methodology and algorithms are generic and can be applied to diverse types of social media or user-generated content involving user interactions, such as blogs, comments on product pages, and other forms of media, so long as the formal data structure proposed here can be constructed.
Findings
– The sequential content listings of social media and Web 2.0 pages, as viewed in web browsers or on mobile devices, do not reveal or make obvious a little-known property of the medium: every participant, from content producers to consumers to followers and subscribers, together with the content they produce or subscribe to, is intrinsically connected in a hidden but massive network. Such networks, once mapped, can be analysed quantitatively using social network analysis (e.g. centralities), and their semantics and sentiments can equally reveal valuable information given appropriate analytics. The difficulty lies in the traditional approach to collecting, cleaning, collating, and mapping such data into a sample large enough to yield important insights into community structure and into the direction and polarity of interaction on diverse topics. This research solves this particular strand of the problem.
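To make the idea of mapping interactions into a network and then measuring centrality concrete, the following is a minimal sketch in Python. It is not the paper's SIL algorithm or data structure: the record layout `(source_user, target_user, polarity)`, the user names, and the helper functions `build_graph` and `in_degree_centrality` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical interaction records: (source_user, target_user, polarity).
# Polarity is +1 (positive) or -1 (negative); all names and values are made up.
interactions = [
    ("alice", "bob", 1),
    ("carol", "bob", -1),
    ("dave", "bob", 1),
    ("bob", "alice", 1),
    ("carol", "alice", 1),
]


def build_graph(records):
    """Map interaction records onto a directed, signed adjacency structure."""
    graph = defaultdict(dict)
    nodes = set()
    for source, target, polarity in records:
        nodes.update((source, target))
        # Sum polarity over repeated interactions between the same pair.
        graph[source][target] = graph[source].get(target, 0) + polarity
    return nodes, graph


def in_degree_centrality(nodes, graph):
    """In-degree of each node divided by (n - 1), a common normalisation."""
    counts = {node: 0 for node in nodes}
    for targets in graph.values():
        for target in targets:
            counts[target] += 1
    denom = max(len(nodes) - 1, 1)
    return {node: count / denom for node, count in counts.items()}


nodes, graph = build_graph(interactions)
centrality = in_degree_centrality(nodes, graph)
```

In this toy data set every other participant interacts with `bob`, so `bob` receives the highest in-degree centrality, while the signed edge weights keep the polarity of each interaction available for later analysis.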
Research limitations/implications
– The automated mapping of extremely large networks involving hundreds of thousands to millions of nodes, encapsulating high-resolution and contextual information over a long period of time, could help to prove or even disprove theories. The goal of this paper is to demonstrate the feasibility of using automated approaches for acquiring massive, connected data sets for academic inquiry in the social sciences.
Practical implications
– The methods presented in this paper, together with the Big Data architecture, can give individuals and institutions with limited budgets practical approaches to constructing SIL. The integrated software-hardware architecture uses open source software; furthermore, the SIL mapping algorithms are easy to implement.
Originality/value
– The majority of research in the literature uses traditional approaches for collecting social network data. Traditional approaches can be slow and tedious, and they do not yield sample sizes adequate to be of significant value for research. Whilst traditional approaches collect only a small percentage of the data, the original methods presented here are able to collect and collate entire data sets in social media, owing to their automated and scalable mapping techniques.
Subject
Industrial and Manufacturing Engineering, Strategy and Management, Computer Science Applications, Industrial Relations, Management Information Systems