Package com.norconex.commons.lang.io
Class TextReader
java.lang.Object
java.io.Reader
com.norconex.commons.lang.io.TextReader
- All Implemented Interfaces:
Closeable, AutoCloseable, Readable
Reads text from an input stream, splitting it wisely whenever the text
is too large. First tries to split after the last paragraph. If there
are no paragraphs, it tries to split after the last sentence. If no sentence
can be detected, it splits on the last word. If no words are found,
it returns all it could read up to the maximum read size in characters.
The default maximum number of characters to be read before splitting
is 10 million. Passing -1 as the maxReadSize
disables batch reading and reads the entire text all at once.
- Since:
- 1.6.0
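The fallback order described above (paragraph, then sentence, then word, then a hard cut) can be sketched in plain Java. This is an illustrative sketch only, not the TextReader implementation: the class name ChunkSplitSketch and its methods are hypothetical, it works on an in-memory String rather than a Reader, and the delimiter rules (a blank line for paragraphs, '.', '!' or '?' followed by whitespace for sentences, any whitespace for words) are assumptions that this page does not specify. Trailing-delimiter removal is not modeled.

```java
// Hypothetical helper illustrating the split order described in the class
// documentation. It is NOT the library's code; delimiter rules are assumed.
final class ChunkSplitSketch {

    private ChunkSplitSketch() {
    }

    // Returns the index just past the best split point found in the buffer.
    static int splitIndex(String buffer) {
        // 1. Paragraph: cut after the last blank line (assumed to be "\n\n").
        int paragraph = buffer.lastIndexOf("\n\n");
        if (paragraph >= 0) {
            return paragraph + 2;
        }
        // 2. Sentence: cut after the last '.', '!' or '?' followed by whitespace.
        for (int i = buffer.length() - 2; i >= 0; i--) {
            char c = buffer.charAt(i);
            if ((c == '.' || c == '!' || c == '?')
                    && Character.isWhitespace(buffer.charAt(i + 1))) {
                return i + 2;
            }
        }
        // 3. Word: cut after the last whitespace character.
        for (int i = buffer.length() - 1; i >= 0; i--) {
            if (Character.isWhitespace(buffer.charAt(i))) {
                return i + 1;
            }
        }
        // 4. No delimiter found: return everything that was read.
        return buffer.length();
    }

    // Splits the text into chunks of at most maxReadSize characters.
    // A negative maxReadSize (such as -1) returns the whole text as one chunk.
    static java.util.List<String> chunks(String text, int maxReadSize) {
        if (maxReadSize == 0) {
            throw new IllegalArgumentException("maxReadSize must not be 0");
        }
        java.util.List<String> result = new java.util.ArrayList<>();
        if (maxReadSize < 0) {
            result.add(text);
            return result;
        }
        int pos = 0;
        while (pos < text.length()) {
            int end = Math.min(pos + maxReadSize, text.length());
            // Only look for a split point when more text follows the buffer.
            if (end < text.length()) {
                end = pos + splitIndex(text.substring(pos, end));
            }
            result.add(text.substring(pos, end));
            pos = end;
        }
        return result;
    }
}
```

In this sketch, calling ChunkSplitSketch.chunks("aaa bbb ccc", 5) cuts on word boundaries, while a text with no whitespace at all is simply cut every maxReadSize characters, matching the last fallback in the description.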
-
Field Summary
Fields
static final int DEFAULT_MAX_READ_SIZE
-
Constructor Summary
Constructors
Constructor - Description
TextReader(Reader reader) - Create a new text reader, reading a maximum of 10 million characters at a time when readText() is called.
TextReader(Reader reader, int maxReadSize) - Constructor.
TextReader(Reader reader, int maxReadSize, boolean removeTrailingDelimiter) - Constructor.
-
Method Summary
Methods inherited from class java.io.Reader
mark, markSupported, nullReader, read, read, read, ready, reset, skip, transferTo
-
Field Details
-
DEFAULT_MAX_READ_SIZE
public static final int DEFAULT_MAX_READ_SIZE
-
-
Constructor Details
-
TextReader
Create a new text reader, reading a maximum of 10 million characters at a time when readText() is called.
- Parameters:
reader - a Reader
-
TextReader
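Creates a new text reader with the given maximum read size.
- Parameters:
reader - a Reader
maxReadSize - maximum number of characters to read at once with readText(); -1 reads the entire text at once.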
-
TextReader
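Creates a new text reader with the given maximum read size, optionally removing the trailing delimiter.
- Parameters:
reader - a Reader
maxReadSize - maximum number of characters to read at once with readText(); -1 reads the entire text at once.
removeTrailingDelimiter - whether to remove the trailing delimiter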
-
-
Method Details
-
read
- Specified by:
read in class Reader
- Throws:
IOException
-
readText
Reads the next chunk of text, up to the maximum read size specified. It tries as much as possible to break long text into paragraphs, sentences or words before returning. See class documentation.
- Returns:
- text read
- Throws:
IOException - problem reading text.
-
close
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Specified by:
close in class Reader
- Throws:
IOException
-