Class CrnnPostProcessor
java.lang.Object
com.itextpdf.pdfocr.onnxtr.recognition.CrnnPostProcessor
- All Implemented Interfaces:
-
IRecognitionPostProcessor
Implementation of a text recognition predictor post-processor, used for OnnxTR CRNN model outputs.
Notably, the model output has no end-of-string token. The only token besides the vocabulary entries is the blank token, which is either skipped or acts as a character separator. Consecutive occurrences of the same label are collapsed into one.
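The decoding rule described above (skip blanks, collapse consecutive repeats) resembles what is commonly called greedy CTC decoding. The following is a minimal standalone sketch of that rule, not the actual implementation of this class. The class name GreedyCtcSketch, the plain float[][] input, the per-step argmax, and the placement of the blank label right after the last vocabulary character are all assumptions made for illustration.

```java
public class GreedyCtcSketch {
    /**
     * Hypothetical helper, not part of the pdfOCR API.
     * scores[t][k] is the score of label k at time step t.
     * Assumption: labels 0..vocabulary.length()-1 map to vocabulary characters,
     * and the blank label sits at index vocabulary.length().
     */
    static String decode(float[][] scores, String vocabulary) {
        final int blank = vocabulary.length();
        StringBuilder result = new StringBuilder();
        int previous = -1; // no label seen yet
        for (float[] step : scores) {
            // Pick the highest-scoring label for this time step (argmax).
            int best = 0;
            for (int k = 1; k < step.length; k++) {
                if (step[k] > step[best]) {
                    best = k;
                }
            }
            // Emit a character only when the label changes and is not blank.
            if (best != previous && best != blank) {
                result.append(vocabulary.charAt(best));
            }
            previous = best;
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // Per-step labels: a, a, blank, a, b, b, blank (vocabulary "abc", blank = 3).
        float[][] scores = {
            {0.9f, 0.05f, 0.03f, 0.02f},
            {0.8f, 0.1f, 0.05f, 0.05f},
            {0f, 0f, 0f, 1f},
            {1f, 0f, 0f, 0f},
            {0f, 1f, 0f, 0f},
            {0f, 1f, 0f, 0f},
            {0f, 0f, 0f, 1f}
        };
        System.out.println(decode(scores, "abc")); // prints "aab"
    }
}
```

In this sketch the blank between the two runs of "a" is what keeps them from being merged, which illustrates the "character separator" role mentioned above.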
-
Constructor Summary
Constructors
Constructor                                 Description
CrnnPostProcessor()                         Creates a new post-processor with the default vocabulary.
CrnnPostProcessor(Vocabulary vocabulary)    Creates a new post-processor.
-
Method Summary
Modifier and Type    Method                                Description
int                  labelDimension()                      Returns the size of the output character label vector.
String               process(FloatBufferMdArray output)    Process ML model output and return recognized characters as string.
-
Constructor Details
-
CrnnPostProcessor
public CrnnPostProcessor(Vocabulary vocabulary)
Creates a new post-processor.
- Parameters:
-
vocabulary
- vocabulary used for the model output (without special tokens)
-
CrnnPostProcessor
public CrnnPostProcessor()
Creates a new post-processor with the default vocabulary.
-
-
Method Details
-
process
public String process(FloatBufferMdArray output)
Process ML model output and return recognized characters as string.
- Specified by:
-
process
in interface IRecognitionPostProcessor
- Parameters:
-
output
- raw output of the ML model
- Returns:
- recognized characters as string
-
labelDimension
public int labelDimension()
Returns the size of the output character label vector, i.e. how many distinct tokens/characters the model recognizes.
- Specified by:
-
labelDimension
in interface IRecognitionPostProcessor
- Returns:
- the size of the output character label vector
-