TY - JOUR

T1 - Analyzing a large and unobtainable relationship graph using a streaming activity graph

AU - Bartal, Alon

AU - Ravid, Gilad

N1 - Publisher Copyright:
© 2020 Elsevier Inc.

PY - 2021/2/6

Y1 - 2021/2/6

N2 - The availability of large-scale data about interactions of social media users allows the study of complex human behavior. Graphs are typically employed to represent user interactions, but several algorithms become impractical for analyzing large graphs. Hence, it can be useful to analyze a small sub-graph instead in a practice known as graph sampling. However, if the graph is unobtainable, for example, due to privacy limitations, graph sampling is impossible. We introduce an innovative algorithm for representing a large unobtainable graph of user relationships such as Facebook friendships, using a streaming graph of user activity that can include, for example, wall posts on Facebook. We applied different methods of the proposed algorithm to two large datasets. The results show that averages and distribution statistics of nodes in a large, unobtainable relationship graph are well represented by a graph of about 20% of the size of the unobtainable graph. Finally, we apply the proposed algorithm to identify influencers in an unobtainable graph by analyzing a representative graph. We find that 63% to 76% of identified influencers in the representative graph act as influencers in the unobtainable graph, suggesting that the developed algorithm can effectively capture properties of the unobtainable graph.

AB - The availability of large-scale data about interactions of social media users allows the study of complex human behavior. Graphs are typically employed to represent user interactions, but several algorithms become impractical for analyzing large graphs. Hence, it can be useful to analyze a small sub-graph instead in a practice known as graph sampling. However, if the graph is unobtainable, for example, due to privacy limitations, graph sampling is impossible. We introduce an innovative algorithm for representing a large unobtainable graph of user relationships such as Facebook friendships, using a streaming graph of user activity that can include, for example, wall posts on Facebook. We applied different methods of the proposed algorithm to two large datasets. The results show that averages and distribution statistics of nodes in a large, unobtainable relationship graph are well represented by a graph of about 20% of the size of the unobtainable graph. Finally, we apply the proposed algorithm to identify influencers in an unobtainable graph by analyzing a representative graph. We find that 63% to 76% of identified influencers in the representative graph act as influencers in the unobtainable graph, suggesting that the developed algorithm can effectively capture properties of the unobtainable graph.

KW - Activity graph

KW - Large relationship graph

KW - Streaming graph

KW - Unobtainable graph representation

UR - http://www.scopus.com/inward/record.url?scp=85092518054&partnerID=8YFLogxK

U2 - 10.1016/j.ins.2020.09.063

DO - 10.1016/j.ins.2020.09.063

M3 - Article

AN - SCOPUS:85092518054

SN - 0020-0255

VL - 546

SP - 1097

EP - 1112

JO - Information Sciences

JF - Information Sciences

ER -