With the development of big data application, the demand of large-scale structured/unstructured data fusion management and analysis is becoming increasingly prominent. However, the differences in management, process, retrieval of structured/unstructured data brings challenges for fusion management and analysis. This study proposes an extended property graph model for heterogeneous data fusion management and semantic computing, and defines related property operators and query syntax. Based on the intelligent property graph model, this study implements PandaDB, an intelligent fusion management system for heterogeneous data. This study depicts the architecture, storage mechanism, query mechanism, property co-storage, AI algorithm scheduling, and distributed architecture of PandaDB. Test experiments and cases show that the co-storage mechanism and distributed architecture of PandaDB have good performance acceleration effects, and can be applied in some scenarios of fusion data intelligent management such as entity disambiguation of academic knowledge graph.
Zhihong Shen, Zihao Zhao, Huajin Wang, Zhongxin Liu, Chuan Hu, Chunyuan Zhou. PandaDB: Intelligent Management System for Heterogeneous Data. International Journal of Software and Informatics, 2021,11(1):69~90Copy