Class BasicDetectionPostProcessor
java.lang.Object
com.itextpdf.pdfocr.onnx.detection.BasicDetectionPostProcessor
All Implemented Interfaces:
- IDetectionPostProcessor
Direct Known Subclasses:
- EasyOcrDetectionPostProcessor, OnnxDetectionPostProcessor, PaddleOcrDetectionPostProcessor
Base implementation of a text detection predictor post-processor. It serves as the basis for the post-processors that handle OnnxTR, EasyOCR and PaddleOCR model outputs.
The base implementation works roughly as follows:
- Model output is binarized to create a predictions mask.
- Sufficiently large contours are extracted from that mask.
- Contours whose certainty score is below the score threshold are discarded.
- Remaining contours are wrapped into boxes with relative [0, 1] coordinates.
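The four steps above can be sketched in plain Java without OpenCV. This is a minimal illustration under stated assumptions, not the library's code: the class name `DetectionPipelineSketch`, the `minArea` parameter and the returned `double[]` layout are hypothetical, 4-connected region growing stands in for contour extraction, and the score is assumed to be the mean prediction over the region. The `maxCandidates` limit is not modelled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the iText implementation. */
final class DetectionPipelineSketch {
    private static final int[][] NEIGHBOURS = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};

    /** Returns boxes as {x, y, width, height, score}, relative to the map size. */
    static List<double[]> process(float[][] preds, float binarizationThreshold,
                                  float scoreThreshold, int minArea) {
        int rows = preds.length;
        int cols = preds[0].length;
        boolean[][] visited = new boolean[rows][cols];
        List<double[]> boxes = new ArrayList<>();
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                // Step 1: binarize on the fly (value >= threshold counts as text).
                if (visited[r][c] || preds[r][c] < binarizationThreshold) {
                    continue;
                }
                // Step 2: grow one 4-connected region (a stand-in for a contour).
                int minR = r, maxR = r, minC = c, maxC = c, area = 0;
                double sum = 0;
                ArrayDeque<int[]> queue = new ArrayDeque<>();
                queue.add(new int[] {r, c});
                visited[r][c] = true;
                while (!queue.isEmpty()) {
                    int[] p = queue.poll();
                    area++;
                    sum += preds[p[0]][p[1]];
                    minR = Math.min(minR, p[0]);
                    maxR = Math.max(maxR, p[0]);
                    minC = Math.min(minC, p[1]);
                    maxC = Math.max(maxC, p[1]);
                    for (int[] d : NEIGHBOURS) {
                        int nr = p[0] + d[0];
                        int nc = p[1] + d[1];
                        if (nr >= 0 && nr < rows && nc >= 0 && nc < cols
                                && !visited[nr][nc]
                                && preds[nr][nc] >= binarizationThreshold) {
                            visited[nr][nc] = true;
                            queue.add(new int[] {nr, nc});
                        }
                    }
                }
                // Steps 2-3: drop regions that are too small or too uncertain.
                double score = sum / area;
                if (area < minArea || score < scoreThreshold) {
                    continue;
                }
                // Step 4: wrap the region into a box with relative [0, 1] coordinates.
                boxes.add(new double[] {
                        (double) minC / cols, (double) minR / rows,
                        (double) (maxC - minC + 1) / cols,
                        (double) (maxR - minR + 1) / rows, score});
            }
        }
        return boxes;
    }
}
```

In this sketch a region can pass binarization and still be dropped by the score threshold, which is why the two thresholds are separate constructor parameters.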
Constructor Summary
- protected BasicDetectionPostProcessor(float binarizationThreshold, float scoreThreshold, int maxCandidates)
  Creates a new post-processor.
Method Summary
- protected org.bytedeco.opencv.opencv_core.Mat buildTextContourPredictionMask(org.bytedeco.opencv.opencv_core.Mat contour, org.bytedeco.opencv.opencv_core.Rect contourBox)
  Builds and returns a mask for calculating the prediction score for the provided contour.
- protected double calcTextBoxEnlargement(double width, double height)
  Calculates by how much the dimensions of a text box should be enlarged compared to those obtained from the model output.
- protected IScoreCalculator createScoreCalculator()
  Creates a new score calculator for calculating the score over a text contour.
- protected org.bytedeco.opencv.opencv_core.MatVector findTextContours(org.bytedeco.opencv.opencv_core.Mat mask)
  Extracts text contours from the provided mask with values in the 0-255 range.
- protected FloatBufferMdArray getMaskSourceArray(FloatBufferMdArray output)
  Returns the array to be used when building a mask for contour detection.
- protected FloatBufferMdArray getPredsArray(FloatBufferMdArray output)
  Returns the preds array from the output buffer.
- protected boolean isValidContour(org.bytedeco.opencv.opencv_core.Mat contour, org.bytedeco.opencv.opencv_core.Rect contourBox)
  Returns whether the contour is good enough to be a text box.
- protected float mapPredToSample(float pred)
  Calculates the score sample value based on a prediction value from the buffer.
- process(BufferedImage input, FloatBufferMdArray output)
  Processes ML model output for the specified image and returns a list of detected objects.
Constructor Details
BasicDetectionPostProcessor
protected BasicDetectionPostProcessor(float binarizationThreshold, float scoreThreshold, int maxCandidates)
Creates a new post-processor.
Parameters:
- binarizationThreshold - threshold used when binarizing a monochromatic image. A pixel value greater than or equal to the threshold is mapped to 1; any other value is mapped to 0
- scoreThreshold - score threshold for a detected box. A box whose score is lower than this value is discarded
- maxCandidates - maximum number of text box contours that the post-processor will handle
Method Details
process
Processes ML model output for the specified image and returns a list of detected objects.
Specified by:
- process in interface IDetectionPostProcessor
Parameters:
- input - input image that was used to produce the inputs to the ML model
- output - output of the ML model
Returns:
- a list of detected objects. See the interface documentation for more information
getPredsArray
protected FloatBufferMdArray getPredsArray(FloatBufferMdArray output)
Returns the preds array from the output buffer.
Parameters:
- output - output buffer from the model
Returns:
- the preds array
getMaskSourceArray
protected FloatBufferMdArray getMaskSourceArray(FloatBufferMdArray output)
Returns the array to be used when building a mask for contour detection.
Parameters:
- output - output buffer from the model
Returns:
- the array to build the mask from
findTextContours
protected org.bytedeco.opencv.opencv_core.MatVector findTextContours(org.bytedeco.opencv.opencv_core.Mat mask)
Extracts text contours from the provided mask with values in the 0-255 range.
Parameters:
- mask - mask to find contours in; it can be modified, but should not be closed
Returns:
- found text contours
isValidContour
protected boolean isValidContour(org.bytedeco.opencv.opencv_core.Mat contour, org.bytedeco.opencv.opencv_core.Rect contourBox)
Returns whether the contour is good enough to be a text box. Called before score calculations.
Parameters:
- contour - contour to check
- contourBox - bounding box of the contour to check
Returns:
- whether the contour is good enough to be a text box
buildTextContourPredictionMask
protected org.bytedeco.opencv.opencv_core.Mat buildTextContourPredictionMask(org.bytedeco.opencv.opencv_core.Mat contour, org.bytedeco.opencv.opencv_core.Rect contourBox)
Builds and returns a mask for calculating the prediction score for the provided contour.
The mask should adhere to the following requirements:
- The mask should have the same dimensions as the contour box.
- The data type should be CV_8U.
- Pixels that should count towards the score should have a non-zero value in the mask.
Parameters:
- contour - contour to build the mask for
- contourBox - bounding box of the contour to build the mask for
Returns:
- the built mask
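To make the mask requirements concrete, here is a small sketch of how such a mask could be consumed when scoring a contour. It is an assumption-laden illustration: `MaskedScoreSketch` and `meanScore` are hypothetical names, a `byte[][]` stands in for a CV_8U `Mat`, and averaging is only one possible aggregation; the actual `IScoreCalculator` may combine samples differently.

```java
/** Illustrative sketch only; a byte[][] stands in for a CV_8U mask. */
final class MaskedScoreSketch {
    /**
     * Averages prediction values inside the contour box, counting only
     * the pixels whose mask value is non-zero.
     *
     * @param preds prediction map, indexed as preds[row][column]
     * @param boxX  left column of the contour box within preds
     * @param boxY  top row of the contour box within preds
     * @param mask  mask with the same dimensions as the contour box
     */
    static float meanScore(float[][] preds, int boxX, int boxY, byte[][] mask) {
        double sum = 0;
        int count = 0;
        for (int y = 0; y < mask.length; y++) {
            for (int x = 0; x < mask[y].length; x++) {
                if (mask[y][x] != 0) { // any non-zero value counts towards the score
                    sum += preds[boxY + y][boxX + x];
                    count++;
                }
            }
        }
        return count == 0 ? 0f : (float) (sum / count);
    }
}
```

Because the mask has the same dimensions as the contour box, the box origin is the only offset needed to look up the matching prediction values.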
createScoreCalculator
protected IScoreCalculator createScoreCalculator()
Creates a new score calculator for calculating the score over a text contour.
Returns:
- a new score calculator
mapPredToSample
protected float mapPredToSample(float pred)
Calculates the score sample value based on a prediction value from the buffer.
Parameters:
- pred - prediction value to map
Returns:
- mapped score
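As an illustration of why a subclass might override this hook, the sketch below contrasts two hypothetical mappings: an identity mapping for a model that already outputs probabilities, and a sigmoid for a model that outputs raw logits. Neither is confirmed to be what the base class or any subclass does; `PredMappingSketch` and its method names are made up for the example.

```java
/** Illustrative sketch only; the mappings used by the real classes are not documented here. */
final class PredMappingSketch {
    /** For a model whose output is already a probability in [0, 1]. */
    static float identity(float pred) {
        return pred;
    }

    /** For a model whose output is a raw logit: squashes it into (0, 1). */
    static float sigmoid(float pred) {
        return (float) (1.0 / (1.0 + Math.exp(-pred)));
    }
}
```

With either mapping the resulting samples land on a scale that can be compared against `scoreThreshold`.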
calcTextBoxEnlargement
protected double calcTextBoxEnlargement(double width, double height)
Calculates by how much the dimensions of a text box should be enlarged compared to those obtained from the model output.
Parameters:
- width - original width of the text box
- height - original height of the text box
Returns:
- value to enlarge the dimensions by
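For intuition, detectors in the DB family (including PaddleOCR's post-processing) commonly grow a shrunken text region by an offset of area * ratio / perimeter. The sketch below applies that heuristic to an axis-aligned box. Whether this class uses the same formula is an assumption, and `TextBoxEnlargementSketch`, `enlargement` and `unclipRatio` are hypothetical names.

```java
/** Illustrative sketch only; the formula used by the real class is not documented here. */
final class TextBoxEnlargementSketch {
    /**
     * DB-style "unclip" offset for a width x height box:
     * offset = area * unclipRatio / perimeter.
     */
    static double enlargement(double width, double height, double unclipRatio) {
        double perimeter = 2.0 * (width + height);
        if (perimeter <= 0.0) {
            return 0.0; // degenerate box: nothing to enlarge
        }
        return width * height * unclipRatio / perimeter;
    }
}
```

For a 100 x 20 box and a ratio of 1.5 this gives 2000 * 1.5 / 240 = 12.5, so long, thin boxes grow by roughly a constant fraction of their height.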