RCaBoChaDF {RCaBoCha}    R Documentation

RCaBoChaDF

Description

Creates a document-term matrix from a column of strings in a data frame.

Usage

RCaBoChaDF(charVec = c("CaBoCha"), str2 = "", pos = "DEFAULT", minFreq = 1, weight = "no", mecabrc = "", cabocharc = "")

Arguments

charVec A column of strings (a character vector).
str2 A Japanese word.
pos Specifies which parts of speech should be extracted.
minFreq Minimum within-document frequency. Words that appear fewer than minFreq times within a document are ignored.
weight Specifies the weighting applied to the document-term matrix. Several options are available; the default is "no".
mecabrc Specifies the MeCab resource file (mecabrc).
cabocharc Specifies the CaBoCha resource file (cabocharc).

Details

The strings in the specified data frame column are read in and a matrix is composed. Each cell of the matrix holds the raw frequency of a word in a document.
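
The following base R sketch illustrates the kind of matrix described above. It does not call RCaBoChaDF and is not the package's implementation. The data frame `dat`, the threshold `min_freq`, and the use of `strsplit()` on space-separated words are assumptions made for illustration; in the package, word segmentation is performed by CaBoCha. The orientation (documents as rows, terms as columns) and the choice to zero out counts below the threshold are also assumptions.

```r
# Hypothetical input: a data frame whose text column is already split into
# space-separated words. strsplit() is only a stand-in for the
# morphological analysis that CaBoCha would perform on Japanese text.
dat <- data.frame(
  id = 1:3,
  text = c("neko inu neko", "inu tori", "neko tori tori tori"),
  stringsAsFactors = FALSE
)

min_freq <- 2  # hypothetical threshold, playing the role of minFreq

tokens <- strsplit(dat$text, " ", fixed = TRUE)
terms <- sort(unique(unlist(tokens)))

# Count each term per document; vapply() yields terms x documents,
# so transpose to get one row per document and one column per term.
dtm <- t(vapply(
  tokens,
  function(words) as.integer(table(factor(words, levels = terms))),
  integer(length(terms))
))
dimnames(dtm) <- list(doc = as.character(dat$id), term = terms)

# Assumed reading of the threshold: counts below min_freq within a
# document are treated as zero.
dtm_filtered <- dtm
dtm_filtered[dtm_filtered < min_freq] <- 0L

print(dtm)
print(dtm_filtered)
```

With `min_freq` set to 1, `dtm_filtered` is identical to `dtm`, which corresponds to the default minFreq = 1 keeping every word.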

Value

RCaBoChaDF the document-term matrix

Author(s)

Motohiro ISHIDA ishida.motohiro@gmail.com

References

Ishida, Motohiro (2008). Rによるテキストマイニング入門 [Introduction to Text Mining with R]. Morikita Publishing.


[Package RCaBoCha version 0.26 Index]