CSKB: A Cyber Security Knowledge Base Based on Knowledge Graph


Abstract. The access of massive terminal devices has brought new security risks to the existing Internet, and traditional cybersecurity data sets struggle to reflect the modern, complex network attack environment. How to standardize and integrate cybersecurity data, so as to continuously store and update malicious traffic information under massively connected terminals, has therefore become a critical and urgent issue. Based on the knowledge graph, we build a standardized cybersecurity ontology and present the implementation process of the cybersecurity knowledge base (CSKB) in five stages: knowledge acquisition, knowledge fusion/extraction, knowledge storage, knowledge inference, and knowledge update. The aim is to provide a reliable basis for real-time cybersecurity protection solutions. Experiments show that the knowledge stored in CSKB effectively realizes the standardization and integration of security data.

Keywords: Cyber security data · Knowledge graph · Security ontology · Cyber security knowledge base

© Springer Nature Singapore Pte Ltd. 2020 S. Yu et al. (Eds.): SPDE 2020, CCIS 1268, pp. 100–113, 2020. https://doi.org/10.1007/978-981-15-9129-7_8

1 Introduction

With the rapid development of 5G communication technology, the access of massive terminal devices has brought new security risks to the existing Internet, which in turn threatens users' privacy and the security of critical information infrastructure [1, 2]. In the field of cybersecurity, a series of data sets have been designed, such as KDDCup99 [3], NSL-KDD [4], UNSW-NB15 [5], and CICDDoS2019 [6]. They are stored in CSV files as two-dimensional tables and aim to reflect modern, complex attack environments through comprehensive data covering both normal and abnormal behavior. However, they still have some shortcomings. Firstly, cybersecurity data sets capture and analyze traffic in the form of data packets and put all the characteristics of the traffic into data rows, so they lose the clear relationship between cyber entities and their various features; it is difficult to preserve the logical structure of existing data through data sets alone. Secondly, each security data set uses its own rules to count traffic and design feature values, resulting in a lack of effective correlation between data sets, which hinders data mining and knowledge extraction. Finally, each security data set is collected and analyzed under a specific network environment; when faced with traffic information from multiple sources, it cannot be updated and expanded under its original rules. Therefore, how to effectively use the large amount of existing knowledge and historical accumulation in the field of cybersecurity to achieve the standardization and integration of security data, to continuously store and update malicious traffic information under massively connected terminals