Class RegexFieldValueExtractor

java.lang.Object
com.norconex.commons.lang.text.RegexFieldValueExtractor

public class RegexFieldValueExtractor extends Object

Simplifies extraction of field/value pairs (or "key/value" pairs) from text using regular expressions. Match groups can be used to identify the fields and values. Field matching is optional; the field can be set explicitly instead. If both a "toField" and a "fieldGroup" are provided, the toField acts as a default when no field could be obtained from matching. At least one of "toField" or "fieldGroup" must be specified. If a fieldGroup is specified without a "toField" and it matches no field, the matched value is ignored. If no value group is provided, the entire regex match is used as the value. If more than one value is extracted for a given field, the values are available as a list.
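The rules above can be illustrated with plain java.util.regex. This is only a sketch of the described semantics, not the implementation of this class: the FieldValueSketch class, its extract method, and the use of -1 to mean "group not set" are hypothetical names and conventions chosen for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the documented rules; NOT the Norconex implementation.
public class FieldValueSketch {

    // A group index of -1 means "not set" (an assumption made for this sketch).
    static Map<String, List<String>> extract(
            String text, String regex,
            String toField, int fieldGroup, int valueGroup) {
        if (toField == null && fieldGroup < 0) {
            throw new IllegalArgumentException(
                    "At least one of toField or fieldGroup must be specified.");
        }
        Map<String, List<String>> result = new LinkedHashMap<>();
        Matcher m = Pattern.compile(
                regex, Pattern.CASE_INSENSITIVE | Pattern.DOTALL).matcher(text);
        while (m.find()) {
            // Field comes from the match group if set, else from toField.
            String field = fieldGroup >= 0 ? m.group(fieldGroup) : null;
            if (field == null || field.isEmpty()) {
                field = toField; // toField acts as the default
            }
            if (field == null) {
                continue; // no field could be resolved: ignore this match
            }
            // Without a value group, the entire match is the value.
            String value = valueGroup >= 0 ? m.group(valueGroup) : m.group();
            if (value == null) {
                continue; // the value group did not participate in the match
            }
            // Repeated fields accumulate their values in a list.
            result.computeIfAbsent(field, k -> new ArrayList<>()).add(value);
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "color=red; size=large; color=blue";
        // Field from group 1, value from group 2.
        System.out.println(extract(text, "(\\w+)=(\\w+)", null, 1, 2));
        // prints {color=[red, blue], size=[large]}
        // No field group: every value goes to the explicit "value" field.
        System.out.println(extract(text, "=(\\w+)", "value", -1, 1));
        // prints {value=[red, large, blue]}
    }
}
```

In the first call the repeated "color" field ends up with two values, which corresponds to the "available as a list" behavior described above.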

When initialized with only a "pattern", rather than by passing or configuring a Regex instance, a default Regex is created that is case-insensitive and in which dots match any character.
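In java.util.regex terms, these defaults roughly correspond to the CASE_INSENSITIVE and DOTALL flags. The sketch below shows their effect on matching; the DefaultFlagsSketch class and its method names are hypothetical, and whether the default Regex sets exactly these two flags and no others is an assumption.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of the default flags using the JDK only.
public class DefaultFlagsSketch {

    // Assumed equivalent of the defaults: ignore case, dot matches newlines.
    static Pattern defaultPattern(String regex) {
        return Pattern.compile(
                regex, Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    }

    // Plain JDK behavior with no flags, for comparison.
    static Pattern strictPattern(String regex) {
        return Pattern.compile(regex);
    }

    public static void main(String[] args) {
        String text = "TITLE: first line\nsecond line";
        // Case is ignored and ".*" spans the line break.
        System.out.println(
                defaultPattern("title: .*").matcher(text).matches()); // true
        // Without flags, "title" does not match "TITLE".
        System.out.println(
                strictPattern("title: .*").matcher(text).matches()); // false
    }
}
```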

Since:
2.0.0 (moved from Norconex Importer RegexKeyValueExtractor)