KR: A Kanji Text Retrieval System (漢字テクスト検索システムKR)

KR: A retrieval system for Japanese texts

Matsuo Masatsugu

Abstract

This paper is an interim report on KR, a simple concordance and word-counting program for microcomputers that handles full Japanese texts, that is, texts represented in kanji, kana, and other 2-byte symbols. The program was developed as part of a research project on the 'Full Text Data Base of Documents of Atomic Bomb Damages', a report on which also appears in this issue. Written in the C language for portability, the program is intended, first of all, as an easy and quick tool for searching and retrieving the parts of a text where a given word or string appears. To this end, the usually time-consuming preparation of Japanese texts, which requires delimiting every word in the texts and either lemmatizing manually or explicitly specifying the rules of lemmatization, is drastically simplified: users are expected only to prepare MS-DOS text (or ASCII) files. The search process, conducted through a menu, is also intended to be simple and quick. It cannot be entirely simple, however, because the program must satisfy the many different research needs of its users. The program therefore offers the following facilities as options: searching for a pair of terms, and reordering, merging, and pairing of the results of a search; limiting of the range of a search to part(s) or subtext(s); and simultaneous counting and/or searching of more than one term.
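
The kind of string search over 2-byte text described in the abstract can be illustrated with a minimal C sketch. It is not KR's actual code: the abstract does not name an encoding, so Shift-JIS is assumed here (a common encoding for Japanese MS-DOS text files), and the names `is_lead_byte` and `kr_find_all` are hypothetical. The sketch advances one character at a time so that a match can never begin on the second byte of a 2-byte character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: the text is Shift-JIS, where a byte in 0x81-0x9F or
 * 0xE0-0xFC starts a 2-byte character. */
static int is_lead_byte(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Hypothetical helper: counts occurrences of `term` in `text`, testing
 * only positions that fall on character boundaries. Byte offsets of the
 * first `max_hits` matches are stored in `hits` (which may be NULL).
 * Returns the total number of matches found. */
size_t kr_find_all(const char *text, const char *term,
                   size_t *hits, size_t max_hits)
{
    size_t text_len, term_len;
    size_t i = 0, count = 0;

    assert(text != NULL && term != NULL);
    text_len = strlen(text);
    term_len = strlen(term);
    if (term_len == 0)
        return 0;

    while (i + term_len <= text_len) {
        if (memcmp(text + i, term, term_len) == 0) {
            if (hits != NULL && count < max_hits)
                hits[count] = i;
            count++;
        }
        /* Step over one whole character: 2 bytes after a lead byte,
         * otherwise 1 byte. */
        if (is_lead_byte((unsigned char)text[i]) && i + 1 < text_len)
            i += 2;
        else
            i += 1;
    }
    return count;
}
```

For example, the Shift-JIS katakana "ア" is the byte pair 0x83 0x41, whose second byte equals the ASCII letter "A". A plain byte-wise search for "A" would report a false match inside that character; the boundary-aware loop above skips it and reports only a genuine "A".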

About This Article

The Center for Peace, Hiroshima University