Annals of Computer Science and Information Systems, Volume 8

Proceedings of the 2016 Federated Conference on Computer Science and Information Systems

Semantic Knowledge Extraction from Research Documents


Abstract. In this paper, we design a knowledge-support software system that extracts sentences and keywords from a large-scale document database. The system is built on a semantic representation scheme for natural language processing of the document database. Documents originally in PDF form are pre-processed and broken into triple-store data. The semantic representation is a hyper-graph consisting of collections of binary relations, or 'triples'. Following rules based on the user's interests, the system identifies sentences and words of interest. The relationships among the extracted sentences are visualized as a network graph. A user of the system can introduce new rules to create additional relationships between sentences and words. As a practical example, we chose a set of research papers related to IoT for the analysis. Applying several rules concerning the authors' indicated keywords as well as the system's specified discourse words, the system extracts significant sentences from the papers.
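
To make the pipeline described in the abstract more concrete, the following is a minimal sketch, not the system's actual implementation. It assumes that each sentence is stored as (subject, predicate, object) triples, that a rule is simply a set of terms of interest (author keywords or discourse words), and that two sentences are linked in the network graph when they share such a term. The predicate names `hasText` and `hasWord`, the function names, and the whitespace tokenization are all hypothetical choices made for illustration.

```python
from collections import namedtuple
from itertools import combinations

# A triple is a binary relation: subject --predicate--> object.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])


def sentence_to_triples(sent_id, text):
    """Break one sentence into triples (hypothetical schema)."""
    triples = [Triple(sent_id, "hasText", text)]
    for word in text.lower().split():
        word = word.strip(".,;:()\"'")  # crude punctuation removal (assumption)
        if word:
            triples.append(Triple(sent_id, "hasWord", word))
    return triples


def apply_rule(triples, terms):
    """Return ids of sentences containing at least one term of interest."""
    terms = {t.lower() for t in terms}
    hits = []
    for t in triples:
        if t.predicate == "hasWord" and t.object in terms and t.subject not in hits:
            hits.append(t.subject)
    return hits


def shared_term_edges(triples, terms):
    """Link pairs of sentences that share a term (edges of a network graph)."""
    terms = {t.lower() for t in terms}
    by_term = {}
    for t in triples:
        if t.predicate == "hasWord" and t.object in terms:
            by_term.setdefault(t.object, set()).add(t.subject)
    edges = set()
    for term, sent_ids in by_term.items():
        for a, b in combinations(sorted(sent_ids), 2):
            edges.add((a, b, term))
    return sorted(edges)


# Toy input standing in for sentences extracted from pre-processed PDFs.
sentences = {
    "s1": "IoT devices generate sensor data.",
    "s2": "However, sensor data is noisy.",
    "s3": "We thank the reviewers.",
}
store = []
for sent_id, text in sentences.items():
    store.extend(sentence_to_triples(sent_id, text))

keyword_hits = apply_rule(store, ["IoT", "sensor"])    # author-keyword rule
discourse_hits = apply_rule(store, ["however"])        # discourse-word rule
edges = shared_term_edges(store, ["IoT", "sensor"])
```

In this sketch, introducing a new rule amounts to passing a different term set to `apply_rule` or `shared_term_edges`; the triple store itself is left unchanged.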


