This repository has been archived by the owner on Jul 10, 2019. It is now read-only.

GATE module

butlermh edited this page Jun 2, 2011 · 2 revisions

GATE commands are found in behemoth-gate.job.

usage: com.digitalpebble.behemoth.gate.GATECorpusGenerator -i <input> -o <output> [-h]
-i, --input           Behemoth corpus
-o, --output          GATE corpus directory
-h, --help            Print the help message

Converts a Behemoth corpus into an XML corpus for GATE. This is mostly used to display the documents in the GATE GUI. It is not a map-reduce job.
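A minimal sketch of an invocation, assuming the job file is launched with the standard `hadoop jar` command. Both paths are hypothetical placeholders, and the script only prints the command so that it can be checked before running.

```shell
# Hypothetical paths (assumptions for illustration); replace with your own.
INPUT=/data/behemoth-corpus
OUTPUT=/tmp/gate-corpus

# Assemble the command line. -i and -o are the options listed in the usage message;
# launching through "hadoop jar" is an assumption, not something this page states.
CMD="hadoop jar behemoth-gate.job com.digitalpebble.behemoth.gate.GATECorpusGenerator -i $INPUT -o $OUTPUT"

# Print the command for inspection. To execute it, run: eval "$CMD"
echo "$CMD"
```

The output directory should then contain the XML documents to open in the GATE GUI.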

usage: com.digitalpebble.behemoth.gate.GATEDriver <in> <out> <path_gate_file>
<in>                  The input path on HDFS
<out>                 The output path on HDFS
<path_gate_file>      The path to the zip file on HDFS containing the GATE application.

This processes a Behemoth corpus using a zipped GATE application.
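The three positional arguments could be supplied as in the sketch below. It assumes the job is launched with `hadoop jar` and that the zipped GATE application has already been copied to HDFS (for example with `hadoop fs -put`). All paths are hypothetical, and the script only prints the command.

```shell
# Hypothetical HDFS paths (assumptions for illustration); replace with your own.
IN=/user/me/behemoth-corpus
OUT=/user/me/behemoth-corpus-gate
GATE_ZIP=/user/me/apps/gate-app.zip

# Arguments are positional, in the order <in> <out> <path_gate_file>.
# Launching through "hadoop jar" is an assumption, not something this page states.
CMD="hadoop jar behemoth-gate.job com.digitalpebble.behemoth.gate.GATEDriver $IN $OUT $GATE_ZIP"

# Print the command for inspection. To execute it, run: eval "$CMD"
echo "$CMD"
```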

