Research on Text Mining Based on Domain Ontology
1 Agricultural Information Institute of Chinese Academy of Agricultural Sciences, Beijing, 100081
2 Key Lab of Agricultural Information Service Technology of Ministry of Agriculture, Beijing, 100081
3 Institute of Agriculture Resources and Regional Planning of Chinese Academy of Agricultural Sciences, Beijing, 100081
Abstract. This paper addresses a limitation of traditional text mining technology, which cannot understand the semantics of text. The authors discuss ontology-based text mining methods and put forward a text mining model based on domain ontology. The ontology structure is built first and a “concept-concept” similarity matrix is introduced; a concept vector space model based on the domain ontology then takes the place of the traditional vector space model for representing documents, in order to realize text mining. Finally, the authors present a case study and draw some conclusions.
Keywords: Ontology, text mining, domain ontology, vector space model.
1 Introduction
Natural language is the main tool for communication and for expressing thought in today’s economy and society. Although it has been studied for a long time, the ability to understand and use it is still limited. Statistics-based data mining technology had matured and been applied successfully to large-scale relational databases by the early 1990s. Naturally, scholars had the idea of applying data mining technology to analyze blocks of text written in natural language, and called this text mining or knowledge discovery in text. Unlike traditional natural language processing, which focuses on understanding words and sentences, the main goal of text mining is to find unknown and valuable knowledge, or relationships among such knowledge, in large-scale text collections. However, we found that most text mining applications lack semantic considerations: they analyze the text grammatically but not its content, so the results are rarely satisfactory.
* Corresponding author.
D. Li and Y. Chen (Eds.): CCTA 2013, Part II, IFIP AICT 420, pp. 361–369, 2014. © IFIP International Federation for Information Processing 2014
L.-h. Jiang, N.-f. Xie, and H.-b. Zhang
2 Text Mining Based on Ontology
Text mining, or knowledge discovery in text databases, is the process of finding unknown, useful, and understandable knowledge in large-scale text databases. The objects of text mining are semi-structured or unstructured, and they often contain multiple layers of ambiguity, which causes many of the difficulties of text mining. The traditional text mining method based on the vector space model converts each text into a word-frequency vector. The major defect of this method is that it neglects the semantic role of words, which leads to unsatisfactory mining results. Therefore, semantic analysis and processing technology should be combined with text mining technology in order to develop more effective mining methods and realize mining at a deep semantic level. Applying ontology to text mining provides theoretical support and a feasible approach to solving the above problems. At present, the representa
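The defect of word-frequency vectors described in this section, and the way a concept-based representation might address it, can be illustrated with a small sketch. It is not the model from this paper, whose details are not shown in this excerpt. The term-to-concept mapping, the concept names, the similarity value 0.5, and the use of a "soft cosine" to apply the concept-concept similarity matrix are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def term_frequency(text):
    """Word-frequency vector: counts of lowercase whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(key, 0) for key, weight in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy ontology: maps surface terms to domain concepts.
TERM_TO_CONCEPT = {
    "maize": "Maize",
    "corn": "Maize",
    "wheat": "Wheat",
    "yield": "Yield",
    "harvest": "Yield",
}

# Hypothetical "concept-concept" similarity entries (symmetric).
# Pairs not listed are treated as unrelated (similarity 0).
CONCEPT_SIMILARITY = {frozenset(("Maize", "Wheat")): 0.5}


def concept_frequency(text):
    """Concept vector: term counts aggregated by ontology concept."""
    counts = Counter()
    for token, n in term_frequency(text).items():
        concept = TERM_TO_CONCEPT.get(token)
        if concept is not None:
            counts[concept] += n
    return counts


def concept_similarity(c1, c2):
    """Look up one entry of the concept-concept similarity matrix."""
    if c1 == c2:
        return 1.0
    return CONCEPT_SIMILARITY.get(frozenset((c1, c2)), 0.0)


def bilinear(a, b):
    """Compute a^T S b, where S is the concept-concept similarity matrix."""
    return sum(
        wa * wb * concept_similarity(ca, cb)
        for ca, wa in a.items()
        for cb, wb in b.items()
    )


def soft_cosine(a, b):
    """Cosine similarity that also credits related but distinct concepts."""
    denom = sqrt(bilinear(a, a) * bilinear(b, b))
    return bilinear(a, b) / denom if denom else 0.0


doc_a = "maize yield"
doc_b = "corn harvest"   # same meaning as doc_a, no shared words
doc_c = "wheat yield"    # related crop, one shared word

word_sim_ab = cosine(term_frequency(doc_a), term_frequency(doc_b))
concept_sim_ab = cosine(concept_frequency(doc_a), concept_frequency(doc_b))
plain_sim_ac = cosine(concept_frequency(doc_a), concept_frequency(doc_c))
soft_sim_ac = soft_cosine(concept_frequency(doc_a), concept_frequency(doc_c))

print(word_sim_ab, concept_sim_ab, plain_sim_ac, soft_sim_ac)
```

In this toy setup, the word-frequency vectors of `doc_a` and `doc_b` share no terms, so their similarity is zero even though the two texts mean the same thing. Mapping terms to concepts makes them coincide, and the similarity matrix additionally raises the score between `doc_a` and `doc_c` because the two crop concepts are declared related.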