TY - JOUR
T1 - Analyzing a large and unobtainable relationship graph using a streaming activity graph
AU - Bartal, Alon
AU - Ravid, Gilad
N1 - Publisher Copyright:
© 2020 Elsevier Inc.
PY - 2021/2/6
Y1 - 2021/2/6
N2 - The availability of large-scale data about interactions of social media users allows the study of complex human behavior. Graphs are typically employed to represent user interactions, but several algorithms become impractical for analyzing large graphs. Hence, it can be useful to analyze a small sub-graph instead in a practice known as graph sampling. However, if the graph is unobtainable, for example, due to privacy limitations, graph sampling is impossible. We introduce an innovative algorithm for representing a large unobtainable graph of user relationships such as Facebook friendships, using a streaming graph of user activity that can include, for example, wall posts on Facebook. We applied different methods of the proposed algorithm to two large datasets. The results show that averages and distribution statistics of nodes in a large, unobtainable relationship graph are well represented by a graph of about 20% of the size of the unobtainable graph. Finally, we apply the proposed algorithm to identify influencers in an unobtainable graph by analyzing a representative graph. We find that 63% to 76% of identified influencers in the representative graph act as influencers in the unobtainable graph, suggesting that the developed algorithm can effectively capture properties of the unobtainable graph.
AB - The availability of large-scale data about interactions of social media users allows the study of complex human behavior. Graphs are typically employed to represent user interactions, but several algorithms become impractical for analyzing large graphs. Hence, it can be useful to analyze a small sub-graph instead in a practice known as graph sampling. However, if the graph is unobtainable, for example, due to privacy limitations, graph sampling is impossible. We introduce an innovative algorithm for representing a large unobtainable graph of user relationships such as Facebook friendships, using a streaming graph of user activity that can include, for example, wall posts on Facebook. We applied different methods of the proposed algorithm to two large datasets. The results show that averages and distribution statistics of nodes in a large, unobtainable relationship graph are well represented by a graph of about 20% of the size of the unobtainable graph. Finally, we apply the proposed algorithm to identify influencers in an unobtainable graph by analyzing a representative graph. We find that 63% to 76% of identified influencers in the representative graph act as influencers in the unobtainable graph, suggesting that the developed algorithm can effectively capture properties of the unobtainable graph.
KW - Activity graph
KW - Large relationship graph
KW - Streaming graph
KW - Unobtainable graph representation
UR - http://www.scopus.com/inward/record.url?scp=85092518054&partnerID=8YFLogxK
U2 - 10.1016/j.ins.2020.09.063
DO - 10.1016/j.ins.2020.09.063
M3 - Article
AN - SCOPUS:85092518054
SN - 0020-0255
VL - 546
SP - 1097
EP - 1112
JO - Information Sciences
JF - Information Sciences
ER -