docMatrix2 {RMeCab}    R Documentation

docMatrix2

Description

Creates a document-term matrix from a single file or from all text files in a given directory.

Usage

docMatrix2(directory, pos= c("名詞","形容詞"),  minFreq = 1, sym = 0, weight = "no", kigo = "記号")

Arguments

directory directory path or a filename (may include path).
pos specifies which parts of speech should be extracted. The default is c("名詞","形容詞") (nouns and adjectives).
sym set sym = 1 if the total should include the number of symbols. The default is 0.
minFreq words that appear fewer than minFreq times within a document are ignored for that document.
weight specifies the weighting applied to the document-term matrix. The default, "no", applies no weighting.
kigo the part-of-speech label used to identify symbols. The default is "記号" ("symbol").
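
The following base R sketch illustrates one possible reading of the minFreq description: counts below the threshold are dropped per document, and terms left with no counts disappear. It is not taken from the RMeCab source; the toy counts, the term and document names, and the rows-as-terms layout are all assumptions made for illustration.

```r
# Toy counts (made up). Assumed layout: rows are terms, columns are documents.
counts <- matrix(c(3, 1, 0,   # doc1
                   1, 2, 1),  # doc2
                 nrow = 3,
                 dimnames = list(c("term_a", "term_b", "term_c"),
                                 c("doc1", "doc2")))

min_freq <- 2  # plays the role of the minFreq argument

# Ignore a term within a document when it occurs fewer than min_freq times there.
filtered <- counts
filtered[filtered < min_freq] <- 0

# Drop terms that no longer occur in any document.
filtered <- filtered[rowSums(filtered) > 0, , drop = FALSE]

print(filtered)
```

Under this reading, term_c is removed entirely because it never reaches the threshold in either document.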

Details

All text files in the specified directory are read in and combined into a single matrix. Each cell of the matrix holds the frequency of a word in a document; with weight = "no" this is the raw count.
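
A minimal sketch of a call, assuming RMeCab and a working MeCab installation with a UTF-8 dictionary are available. The directory name, file names, and sample sentences are hypothetical; the arguments are the ones listed under Usage.

```r
# Build a small throwaway corpus in a temporary directory (names are made up).
corpus_dir <- file.path(tempdir(), "docmatrix2_demo")
dir.create(corpus_dir, showWarnings = FALSE)

writeLines(enc2utf8("今日は良い天気です。"),
           file.path(corpus_dir, "doc1.txt"), useBytes = TRUE)
writeLines(enc2utf8("明日は寒い天気です。"),
           file.path(corpus_dir, "doc2.txt"), useBytes = TRUE)

# Only call docMatrix2 when the package is installed, so the sketch runs either way.
if (requireNamespace("RMeCab", quietly = TRUE)) {
  # pos and minFreq are set to the defaults shown under Usage.
  dtm <- RMeCab::docMatrix2(corpus_dir,
                            pos = c("名詞", "形容詞"),
                            minFreq = 1)
  print(dim(dtm))
  print(dtm)
}
```

Passing the path of a single file instead of corpus_dir should work the same way, since the directory argument accepts either.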

Value

docMatrix2 the document-term matrix

Author(s)

Motohiro ISHIDA ishida.motohiro@gmail.com


[Package RMeCab version 0.67 Index]