Representing unstructured text semantics for reasoning purpose


Zohre Moteshakker Arani¹ · Ahmad Abdollahzadeh Barforoush² · Hossein Shirazi¹

Received: 27 January 2020 / Revised: 9 September 2020 / Accepted: 10 September 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

To interpret a natural language text by machine, we need to convert its semantics into structured information. In the field of Natural Language Processing, multiple tasks have been designed and developed to interpret the semantics of unstructured text and turn words into meanings. However, the output of these tasks is difficult to use directly in subsequent applications such as logical inference. In recent years there has been growing interest in building and enhancing state-of-the-art semantic representation systems, but most of these systems rely on supervised models trained on manually annotated data, which is not available for a wide range of languages.

This paper presents a new framework for modeling text in order to extract its information and, through an inference system, obtain new information that is not explicitly stated in the text but can be logically inferred. The framework is based on Open Information Extraction and Semantic Web techniques for machine reading. We translate the text into a machine-readable representation by using Semantic Types Identification and Question-based Semantic Role Labeling, which can be used in low-resource languages. We integrate the extracted information into the background knowledge by using existing Semantic Web standards. The proposed framework could increase the generalization of labelling and reduce ambiguities; it is therefore an appropriate solution for preparing text for reasoning systems.

Keywords Text analysis · Open information extraction · Rule-based inference · Semantic role labeling · Semantic representation
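As a minimal illustration of the kind of inference the abstract describes, the following Python sketch merges facts extracted from text with background knowledge and forward-chains a single rule to derive a fact that is not stated explicitly. It is not the framework proposed in this paper: the `Triple` type, the `transitive_closure` function, the `located_in` predicate, and the example facts are all assumptions chosen for illustration, and a real system would use Semantic Web standards rather than plain Python sets.

```python
from itertools import product
from typing import NamedTuple, Set


class Triple(NamedTuple):
    """A (subject, predicate, object) fact; a hypothetical stand-in for an extracted relation."""
    subject: str
    predicate: str
    obj: str


def transitive_closure(facts: Set[Triple], predicate: str) -> Set[Triple]:
    """Forward-chain one rule until nothing new appears: p(x, y) and p(y, z) implies p(x, z)."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        links = [t for t in known if t.predicate == predicate]
        for a, b in product(links, links):
            if a.obj == b.subject:
                derived = Triple(a.subject, predicate, b.obj)
                if derived not in known:
                    known.add(derived)
                    changed = True
    return known


# Illustrative facts only: one as if extracted from text, one from background knowledge.
extracted = {Triple("Amirkabir University of Technology", "located_in", "Tehran")}
background = {Triple("Tehran", "located_in", "Iran")}

all_facts = transitive_closure(extracted | background, "located_in")
inferred = all_facts - extracted - background
print(inferred)
```

Here the only derived fact is that the university is located in Iran, which neither input set states directly. The loop repeats until a full pass adds no new triple, so longer chains of the same predicate are also closed.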

1 Faculty of Electrical and Computer Engineering, Malek Ashtar University of Technology, Tehran, Iran

2 Faculty of Computer and IT Engineering, Amirkabir University of Technology, Tehran, Iran

Journal of Intelligent Information Systems

1 Introduction

A vast amount of the information produced and exchanged on the internet and virtual networks is in the form of unstructured text. The need to process and analyze these texts automatically has led to the design of multiple tasks in the field of Natural Language Processing, with the aim of automating the semantic analysis of large volumes of text. For the purpose of reasoning, we seek to extract complete and unambiguous information from text, and we need this information to be integrated with the background knowledge. However, reading texts and preparing them for a decision-making system in a non-restricted domain is still an open problem. Open Information Extraction (Open IE) systems extract entities and their relationships from open-dom
