docDF {RMeCab}    R Documentation

docDF

Description

docDF returns a data frame of N-grams extracted from a single file, from all files in a given directory, or from a column of a data.frame. Each word of an N-gram makes one column.
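To picture the "one column per word" layout, here is a minimal base R sketch that builds bigrams from a token vector that has already been split. It does not call RMeCab. The toy tokens and the column names N1 and N2 are assumptions chosen for illustration, not output copied from docDF.

```r
# Toy tokens; in practice MeCab would produce these from Japanese text.
tokens <- c("I", "like", "green", "tea")
n <- 2  # size of the N-gram

# Starting position of every N-gram that fits in the token vector.
starts <- seq_len(length(tokens) - n + 1)

# One row per N-gram, one column per word within the N-gram.
ngram_matrix <- do.call(rbind, lapply(starts, function(i) tokens[i:(i + n - 1)]))
ngrams <- as.data.frame(ngram_matrix, stringsAsFactors = FALSE)
names(ngrams) <- paste0("N", seq_len(n))  # illustrative column names

print(ngrams)
```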

Usage

  docDF(target, column = 0, type = 0, pos = NULL, minFreq = 1, N = 1,
        Genkei = 0, weight = "", nDF = 0, co = 0, dic = "", mecabrc = "",
        etc = "")

Arguments

target

a directory path, a file name (which may include a path), or a data.frame.

column

name or index of the column to analyze when target is a data.frame. Default is 0.

type

unit of analysis: character (0) or term (1). Default is 0.

pos

vector of parts of speech to extract. Default is NULL, in which case all words are extracted.

minFreq

minimum frequency. Words appearing fewer than minFreq times within a document are ignored for that document. Default is 1.

N

size of the N-gram. Default is 1.

Genkei

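whether to reduce each word to its base (dictionary) form (0) or to keep the form that appears in the text (1). Default is 0.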

weight

term weighting by tf and/or idf. See weight.

nDF

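whether to return the N-gram as a data frame, with each word of the N-gram in its own column (1), or not (0). Default is 0.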

co

whether to make a co-occurrence matrix. Default is 0.

dic

path to a user dictionary, e.g. ishida.dic

mecabrc

path to a MeCab resource file.

etc

other options to pass to MeCab.
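As background for the weight argument, the following base R sketch shows what tf-idf weighting does to a term-by-document count matrix. It does not call RMeCab. The toy counts and the idf variant used here, log2(N / df) + 1, are assumptions made for illustration and may not match the formula that docDF applies internally.

```r
# Toy term-by-document count matrix (rows: terms, columns: documents).
counts <- matrix(
  c(2, 0, 1,
    1, 1, 1,
    0, 3, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("cat", "the", "dog"), c("doc1", "doc2", "doc3"))
)

n_docs   <- ncol(counts)
doc_freq <- rowSums(counts > 0)         # number of documents containing each term
idf      <- log2(n_docs / doc_freq) + 1 # assumed idf variant
tfidf    <- counts * idf                # multiplies each row (term) by its idf

print(round(tfidf, 3))
```

A term that occurs in every document ("the" in this toy matrix) keeps its raw counts, while a term concentrated in one document ("dog") is weighted up.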

Value

returns a data frame.
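A minimal usage sketch, assuming RMeCab and MeCab are installed with a working Japanese dictionary. The data frame reviews and its contents are hypothetical, and the part-of-speech labels assume MeCab's standard Japanese tag names. The calls use only arguments listed under Usage; the shape of the result is best inspected with str() rather than assumed.

```r
# Hypothetical input: a small data frame with one Japanese text column.
reviews <- data.frame(
  id = 1:2,
  text = c("今日は良い天気です。", "明日は雨が降るでしょう。"),
  stringsAsFactors = FALSE
)

# Run the analysis only when RMeCab is available on this machine.
if (requireNamespace("RMeCab", quietly = TRUE)) {
  library(RMeCab)

  # Term counts (type = 1) for the "text" column, nouns and verbs only.
  unigrams <- docDF(reviews, column = "text", type = 1,
                    pos = c("名詞", "動詞"))

  # Term bigrams (N = 2), with each word of the bigram in its own column.
  bigrams <- docDF(reviews, column = "text", type = 1, N = 2, nDF = 1)

  str(unigrams)
  str(bigrams)
}
```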

Author(s)

Motohiro ISHIDA ishida.motohiro@gmail.com

References

Motohiro Ishida (石田基広), Rによるテキストマイニング入門 [Introduction to Text Mining with R], Morikita Publishing (森北出版), 2008.


[Package RMeCab version 0.97 Index]