RCaBoChaMx {RCaBoCha}    R Documentation

RCaBoChaMx

Description

Creates a document-term matrix from a single file or from all text files in a given directory.

Usage

RCaBoChaMx(directory, rmT= c("記号"), str2 = "ない",  minFreq = 1, weight = "no")

Arguments

directory path to a directory, or a file name (which may include a path).
rmT specifies which parts of speech should be removed (default "記号", i.e. symbols).
str2 a Japanese word (default "ない").
minFreq words appearing fewer than minFreq times within a document are ignored for that document.
weight weighting option used to calculate a weighted document-term matrix (default "no").

Details

All text files in the specified directory are read in and combined into a single matrix. Each cell of the matrix holds the raw frequency of a word in a document.
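
The counting described above can be illustrated in base R without CaBoCha. The following is only a minimal sketch, not the package's implementation: the helper build_dtm and the pre-tokenized docs list are hypothetical, the tokens stand in for words that a morphological analysis would produce, and the layout (documents as rows, terms as columns) is an assumption, since this page does not state the orientation of the returned matrix.

```r
# Hypothetical pre-tokenized documents; real input would be text files
# analysed into words before counting.
docs <- list(
  doc1 = c("cat", "eats", "fish", "cat"),
  doc2 = c("dog", "eats", "meat")
)

# Hypothetical helper: build a raw-frequency document-term matrix.
# Words occurring fewer than minFreq times in a document are dropped
# for that document, mirroring the minFreq description above.
build_dtm <- function(docs, minFreq = 1) {
  counts <- lapply(docs, function(tokens) {
    tab <- table(tokens)
    tab[tab >= minFreq]
  })
  terms <- sort(unique(unlist(lapply(counts, names))))
  # Assumed layout: one row per document, one column per term.
  m <- matrix(0L, nrow = length(docs), ncol = length(terms),
              dimnames = list(names(docs), terms))
  for (d in names(counts)) {
    m[d, names(counts[[d]])] <- as.integer(counts[[d]])
  }
  m
}

dtm <- build_dtm(docs)
dtm_min2 <- build_dtm(docs, minFreq = 2)
```

With minFreq = 2 only words that occur at least twice in some document remain as columns, so dtm_min2 is much narrower than dtm.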

Value

RCaBoChaMx the document-term matrix

Author(s)

Motohiro ISHIDA ishida.motohiro@gmail.com


[Package RCaBoCha version 0.11 Index]