docMatrix {RMeCab}  R Documentation

docMatrix

Description

Creates a document-term matrix from all text files in a given directory.

Usage

docMatrix(mydir, pos = c("名詞","動詞","形容詞"), minFreq = 1, weight = "no", sym = 0)
docVector(filename, pos, minFreq, sym)

Arguments

filename the name of a text file (may include a path).
mydir the path to the directory containing the text files.
pos specifies which parts of speech should be extracted. The default is c("名詞","動詞","形容詞") (nouns, verbs, adjectives).
minFreq words appearing fewer than minFreq times within a document are ignored for that document. The default is 1.
weight the weighting applied to the document-term matrix. The default "no" leaves the raw frequencies unweighted.
sym set sym = 1 if the totals should include the number of symbols. The default is 0.
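
The effect of minFreq can be illustrated with a minimal base R sketch. It does not call RMeCab: the token vector and the min_freq variable are hypothetical stand-ins for the morphemes MeCab would return and for the minFreq argument, and the filtering rule is assumed from the description above.

```r
# Hypothetical tokens for one document; in practice these would come from MeCab.
tokens <- c("cat", "run", "cat", "white", "cat", "run")

# Stand-in for the minFreq argument (assumption: the threshold is inclusive).
min_freq <- 2

# Count how often each word occurs in this document.
freq <- table(tokens)

# Keep only the words that occur at least min_freq times.
kept <- freq[freq >= min_freq]
print(kept)
```

With min_freq = 1, as in the default, every word would be kept.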

Details

All text files in the specified directory are read in and combined into a single matrix. Each cell of the matrix holds the raw frequency of a word in a document.

docVector() is a supporting function that creates a document-term frequency list.
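
The structure described above can be sketched in base R without RMeCab. The file names and tokens below are hypothetical, and placing words in rows and documents in columns is an assumption made for this illustration, not a statement about the layout docMatrix() returns.

```r
# Hypothetical, already tokenized documents (one character vector per file).
docs <- list(
  doc1.txt = c("cat", "run", "cat"),
  doc2.txt = c("dog", "run")
)

# The vocabulary is the sorted set of all words seen in any document.
terms <- sort(unique(unlist(docs)))

# For each document, count every vocabulary word (zero if it is absent).
dtm <- vapply(
  docs,
  function(tokens) as.vector(table(factor(tokens, levels = terms))),
  integer(length(terms))
)
rownames(dtm) <- terms

print(dtm)
```

Each column here plays the role of the per-document frequency list, and binding the columns together gives the matrix of raw counts.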

Value

docMatrix the document-term matrix

Author(s)

Motohiro ISHIDA ishida.motohiro@gmail.com


[Package RMeCab version 0.60 Index]