- All Implemented Interfaces:
- Configurable, Tool
public class CustomFields
- extends Configured
- implements Tool
Creates custom FieldWritable objects from a text file containing field
information: the field name, its value, and an optional boost and field type
(as needed by FieldWritable objects).
An input text file for CustomFields is tab separated.
The only required fields are url, name and value. Custom fields are
configured through the custom-fields.xml file in the classpath. The config
file allows you to set defaults for whether a field is indexed, stored, and
tokenized, for the boost on a field, and for whether a field can output
multiple values under the same key.
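As a rough illustration of the input format described above, the following sketch parses one tab-separated line. It is not the parser used by the CustomFields job. The column order (url, name, value, then optional boost and type) is an assumption, and the class and method names (CustomFieldLineSketch, Entry, parse) are hypothetical.

```java
import java.util.Optional;

/** Illustrative only: not the parser used by the CustomFields job. */
final class CustomFieldLineSketch {

    /** One parsed input line; boost and type are the optional columns. */
    static final class Entry {
        final String url;
        final String name;
        final String value;
        final Optional<Float> boost;
        final Optional<String> type;

        Entry(String url, String name, String value,
              Optional<Float> boost, Optional<String> type) {
            this.url = url;
            this.name = name;
            this.value = value;
            this.boost = boost;
            this.type = type;
        }
    }

    /**
     * Parses one tab-separated line.
     * ASSUMED column order: url, name, value, [boost], [type].
     */
    static Entry parse(String line) {
        // A limit of -1 keeps trailing empty columns so they can be detected.
        String[] cols = line.split("\t", -1);
        if (cols.length < 3) {
            throw new IllegalArgumentException(
                "expected at least url, name and value: " + line);
        }
        for (int i = 0; i < 3; i++) {
            if (cols[i].isEmpty()) {
                throw new IllegalArgumentException(
                    "required column " + i + " is empty: " + line);
            }
        }
        // Optional columns are treated as absent when missing or empty.
        Optional<Float> boost = (cols.length > 3 && !cols[3].isEmpty())
            ? Optional.of(Float.valueOf(cols[3]))
            : Optional.empty();
        Optional<String> type = (cols.length > 4 && !cols[4].isEmpty())
            ? Optional.of(cols[4])
            : Optional.empty();
        return new Entry(cols[0], cols[1], cols[2], boost, type);
    }
}
```

For example, CustomFieldLineSketch.parse("http://example.com/\tlang\ten") would yield an entry with no boost and no type, while a line with fewer than three columns is rejected.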
The purpose of the CustomFields job is to allow better integration with
technologies such as Hadoop Streaming. Streaming jobs can be created in any
programming language, can output the text file needed by the CustomFields
job, and those fields can then be included in the index.
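A producer on the other side of this integration only has to emit tab-separated lines. The sketch below shows one way an external job written in Java might format such a line. The class name CustomFieldLineWriterSketch and its format method are hypothetical, and the column order (url, name, value, optional boost) is an assumption rather than something this page specifies.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: formats one line for a CustomFields input file. */
final class CustomFieldLineWriterSketch {

    /**
     * Builds one tab-separated line.
     * ASSUMED column order: url, name, value, [boost].
     * Pass null for boost to omit the optional column.
     */
    static String format(String url, String name, String value, Float boost) {
        List<String> cols = new ArrayList<>();
        cols.add(clean(url));
        cols.add(clean(name));
        cols.add(clean(value));
        if (boost != null) {
            cols.add(Float.toString(boost));
        }
        return String.join("\t", cols);
    }

    /** Rejects values that would break the one-line, tab-separated layout. */
    private static String clean(String s) {
        if (s == null || s.isEmpty()) {
            throw new IllegalArgumentException("url, name and value are required");
        }
        if (s.indexOf('\t') >= 0 || s.indexOf('\n') >= 0) {
            throw new IllegalArgumentException("value contains a tab or newline: " + s);
        }
        return s;
    }
}
```

Each formatted line can be written to a text file in a directory that is later passed to the CustomFields tool.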
The concept of custom fields requires two separate pieces: the indexing piece
and the query piece. The indexing piece is handled by the CustomFields job.
The query piece is handled by the query-custom plugin.
Currently, because of the way the query plugin architecture works, custom
field names must be added to the fields parameter in the plugin.xml file of
the query-custom plugin before they can be queried.
The CustomFields tool accepts one or more directories containing text files
in the appropriate custom field format. These files are then turned into
FieldWritable objects to be included in the index.
public static final org.apache.commons.logging.Log LOG
public static void main(String[] args)
public int run(String[] args)
- Runs the CustomFields job.
- Specified by:
run in interface Tool
Copyright © 2006 The Apache Software Foundation