TY - GEN
T1 - AutoMAP
T2 - 29th International World Wide Web Conference, WWW 2020
AU - Ma, Meng
AU - Xu, Jingmin
AU - Wang, Yuan
AU - Chen, Pengfei
AU - Zhang, Zonghua
AU - Wang, Ping
N1 - Publisher Copyright:
© 2020 ACM.
PY - 2020/4/20
Y1 - 2020/4/20
N2 - The high complexity and dynamics of the microservice architecture make its application diagnosis extremely challenging. Static troubleshooting approaches may fail to obtain reliable model applies for frequently changing situations. Even if we know the calling dependency of services, we lack a more dynamic diagnosis mechanism due to the existence of indirect fault propagation. Besides, algorithm based on single metric usually fail to identify the root cause of anomaly, as single type of metric is not enough to characterize the anomalies occur in diverse services. In view of this, we design a novel tool, named AutoMAP, which enables dynamic generation of service correlations and automated diagnosis leveraging multiple types of metrics. In AutoMAP, we propose the concept of anomaly behavior graph to describe the correlations between services associated with different types of metrics. Two binary operations, as well as a similarity function on behavior graph are defined to help AutoMAP choose appropriate diagnosis metric in any particular scenario. Following the behavior graph, we design a heuristic investigation algorithm by using forward, self, and backward random walk, with an objective to identify the root cause services. To demonstrate the strengths of AutoMAP, we develop a prototype and evaluate it in both simulated environment and real-work enterprise cloud system. Experimental results clearly indicate that AutoMAP achieves over 90% precision, which significantly outperforms other selected baseline methods. AutoMAP can be quickly deployed in a variety of microservice-based systems without any system knowledge. It also supports introduction of various expert knowledge to improve accuracy.
AB - The high complexity and dynamics of the microservice architecture make its application diagnosis extremely challenging. Static troubleshooting approaches may fail to obtain reliable model applies for frequently changing situations. Even if we know the calling dependency of services, we lack a more dynamic diagnosis mechanism due to the existence of indirect fault propagation. Besides, algorithm based on single metric usually fail to identify the root cause of anomaly, as single type of metric is not enough to characterize the anomalies occur in diverse services. In view of this, we design a novel tool, named AutoMAP, which enables dynamic generation of service correlations and automated diagnosis leveraging multiple types of metrics. In AutoMAP, we propose the concept of anomaly behavior graph to describe the correlations between services associated with different types of metrics. Two binary operations, as well as a similarity function on behavior graph are defined to help AutoMAP choose appropriate diagnosis metric in any particular scenario. Following the behavior graph, we design a heuristic investigation algorithm by using forward, self, and backward random walk, with an objective to identify the root cause services. To demonstrate the strengths of AutoMAP, we develop a prototype and evaluate it in both simulated environment and real-work enterprise cloud system. Experimental results clearly indicate that AutoMAP achieves over 90% precision, which significantly outperforms other selected baseline methods. AutoMAP can be quickly deployed in a variety of microservice-based systems without any system knowledge. It also supports introduction of various expert knowledge to improve accuracy.
KW - Microservice architecture
KW - anomaly diagnosis
KW - cloud computing
KW - root cause
KW - web application
UR - https://www.scopus.com/pages/publications/85086570534
U2 - 10.1145/3366423.3380111
DO - 10.1145/3366423.3380111
M3 - Conference contribution
AN - SCOPUS:85086570534
T3 - The Web Conference 2020 - Proceedings of the World Wide Web Conference, WWW 2020
SP - 246
EP - 258
BT - The Web Conference 2020 - Proceedings of the World Wide Web Conference, WWW 2020
PB - Association for Computing Machinery, Inc
Y2 - 20 April 2020 through 24 April 2020
ER -