java.lang.Object

com.itextpdf.pdfocr.onnx.detection.BasicDetectionPostProcessor

All Implemented Interfaces: IDetectionPostProcessor

Direct Known Subclasses: EasyOcrDetectionPostProcessor, OnnxDetectionPostProcessor, PaddleOcrDetectionPostProcessor

public abstract class BasicDetectionPostProcessor extends Object implements IDetectionPostProcessor

Base implementation of a text detection predictor post-processor. It serves as the basis for post-processors that handle OnnxTR, EasyOCR and PaddleOCR model outputs.

The base implementation works roughly as follows:

1. The model output is binarized to create a predictions mask.
2. Sufficiently large contours are extracted from that mask.
3. Contours whose certainty score is below the score threshold are discarded.
4. The remaining contours are wrapped into boxes with relative [0, 1] coordinates.
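The steps above can be sketched in plain Java without any OpenCV dependency. This is only an illustration under stated assumptions, not the library's implementation: the class `DetectionPipelineSketch`, the `minPixels` size limit, the use of a 4-connected flood fill in place of real contour extraction, and the use of the mean prediction as the certainty score are all hypothetical choices made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, library-free sketch of the four post-processing steps. */
final class DetectionPipelineSketch {

    /** Returns boxes as {x0, y0, x1, y1, score}, with coordinates relative to [0, 1]. */
    static List<float[]> process(float[][] preds, float binThreshold,
                                 float scoreThreshold, int minPixels) {
        int h = preds.length;
        int w = h == 0 ? 0 : preds[0].length;
        boolean[][] visited = new boolean[h][w];
        List<float[]> boxes = new ArrayList<>();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Step 1: binarize on the fly (value >= threshold counts as text).
                if (visited[y][x] || preds[y][x] < binThreshold) {
                    continue;
                }
                // Step 2: flood-fill one connected region (stand-in for contour finding).
                int minX = x, maxX = x, minY = y, maxY = y, count = 0;
                double sum = 0;
                ArrayDeque<int[]> queue = new ArrayDeque<>();
                queue.add(new int[] {x, y});
                visited[y][x] = true;
                while (!queue.isEmpty()) {
                    int[] p = queue.poll();
                    int px = p[0], py = p[1];
                    count++;
                    sum += preds[py][px];
                    minX = Math.min(minX, px);
                    maxX = Math.max(maxX, px);
                    minY = Math.min(minY, py);
                    maxY = Math.max(maxY, py);
                    int[][] neighbors = {{px + 1, py}, {px - 1, py}, {px, py + 1}, {px, py - 1}};
                    for (int[] n : neighbors) {
                        int nx = n[0], ny = n[1];
                        if (nx >= 0 && nx < w && ny >= 0 && ny < h
                                && !visited[ny][nx] && preds[ny][nx] >= binThreshold) {
                            visited[ny][nx] = true;
                            queue.add(n);
                        }
                    }
                }
                // Assumption: the certainty score is the mean prediction over the region.
                float score = (float) (sum / count);
                // Steps 2 and 3: drop regions that are too small or too uncertain.
                if (count < minPixels || score < scoreThreshold) {
                    continue;
                }
                // Step 4: wrap the region into a box with relative coordinates.
                boxes.add(new float[] {
                    (float) minX / w, (float) minY / h,
                    (float) (maxX + 1) / w, (float) (maxY + 1) / h, score
                });
            }
        }
        return boxes;
    }
}
```

In the actual class, each of these stages is exposed as a protected method (for example `findTextContours`, `isValidContour` and `createScoreCalculator`), so a subclass can replace one stage without rewriting the whole pipeline.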

Constructor Summary

Constructors

| Modifier | Constructor | Description |
| --- | --- | --- |
| protected | BasicDetectionPostProcessor(float binarizationThreshold, float scoreThreshold, int maxCandidates) | Creates a new post-processor. |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| protected org.bytedeco.opencv.opencv_core.Mat | buildTextContourPredictionMask(org.bytedeco.opencv.opencv_core.Mat contour, org.bytedeco.opencv.opencv_core.Rect contourBox) | Builds and returns a mask for calculating the prediction score for the provided contour. |
| protected double | calcTextBoxEnlargement(double width, double height) | Calculates by how much the dimensions of a text box should be enlarged compared to the ones obtained from the model output. |
| protected IScoreCalculator | createScoreCalculator() | Creates a new score calculator for calculating the score over a text contour. |
| protected org.bytedeco.opencv.opencv_core.MatVector | findTextContours(org.bytedeco.opencv.opencv_core.Mat mask) | Extracts text contours from the provided 0-255 mask. |
| protected FloatBufferMdArray | getMaskSourceArray(FloatBufferMdArray output) | Returns the array to be used when building a mask for contour detection. |
| protected FloatBufferMdArray | getPredsArray(FloatBufferMdArray output) | Returns the preds array from the output buffer. |
| protected boolean | isValidContour(org.bytedeco.opencv.opencv_core.Mat contour, org.bytedeco.opencv.opencv_core.Rect contourBox) | Returns whether the contour is good enough to be a text box. |
| protected float | mapPredToSample(float pred) | Calculates the score sample value, based on a prediction value from the buffer. |
| List | process(BufferedImage input, FloatBufferMdArray output) | Processes the ML model output for the specified image and returns a list of detected objects. |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- BasicDetectionPostProcessor
  
  protected BasicDetectionPostProcessor (float binarizationThreshold, float scoreThreshold, int maxCandidates)
  
  Creates a new post-processor.
  
  Parameters:
  
  binarizationThreshold - threshold value used when binarizing a monochromatic image. If a pixel value is greater than or equal to the threshold, it is mapped to 1; otherwise it is mapped to 0
  
  scoreThreshold - score threshold for a detected box. If the score is lower than this value, the box is discarded
  
  maxCandidates - maximum number of text box contours that will be handled by the post-processor
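
The boundary behaviour of the two thresholds is easy to get wrong, so here is a minimal sketch of the semantics as described above. The class `ThresholdSemanticsSketch` and its method names are hypothetical and are not part of the library. How `maxCandidates` selects which contours to keep is not stated here, so it is left out.

```java
/** Hypothetical helper mirroring the documented threshold semantics; not library code. */
final class ThresholdSemanticsSketch {

    /** Maps each value to 1 if it is greater than or equal to the threshold, otherwise to 0. */
    static byte[] binarize(float[] pixels, float binarizationThreshold) {
        byte[] mask = new byte[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            mask[i] = (byte) (pixels[i] >= binarizationThreshold ? 1 : 0);
        }
        return mask;
    }

    /** A box is discarded only when its score is lower than the threshold. */
    static boolean isKept(float score, float scoreThreshold) {
        return score >= scoreThreshold;
    }
}
```

Note that a value exactly equal to a threshold passes in both cases: the pixel is mapped to 1 and the box is kept.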
Method Details
- process
  
  public List process (BufferedImage input, FloatBufferMdArray output)
  
  Processes the ML model output for the specified image and returns a list of detected objects.
  
  Specified by:
  
  process in interface IDetectionPostProcessor
  
  Parameters:
  
  input - input image, which was used to produce the inputs to the ML model
  
  output - output of the ML model
  
  Returns:
  
  a list of detected objects. See interface documentation for more information
- getPredsArray
  
  protected FloatBufferMdArray getPredsArray (FloatBufferMdArray output)
  
  Returns the preds array from the output buffer.
  
  Parameters:
  
  output - output buffer from the model
  
  Returns:
  
  the preds array
- getMaskSourceArray
  
  protected FloatBufferMdArray getMaskSourceArray (FloatBufferMdArray output)
  
  Returns the array to be used when building a mask for contour detection.
  
  Parameters:
  
  output - output buffer from the model
  
  Returns:
  
  the array to build the mask from
- findTextContours
  
  protected org.bytedeco.opencv.opencv_core.MatVector findTextContours (org.bytedeco.opencv.opencv_core.Mat mask)
  
  Extracts text contours from the provided 0-255 mask.
  
  Parameters:
  
  mask - mask to find contours in; it may be modified, but should not be closed
  
  Returns:
  
  found text contours
- isValidContour
  
  protected boolean isValidContour (org.bytedeco.opencv.opencv_core.Mat contour, org.bytedeco.opencv.opencv_core.Rect contourBox)
  
  Returns whether the contour is good enough to be a text box. Called before score calculations.
  
  Parameters:
  
  contour - contour to check
  
  contourBox - bounding box of the contour to check
  
  Returns:
  
  whether the contour is good enough to be a text box
- buildTextContourPredictionMask
  
  protected org.bytedeco.opencv.opencv_core.Mat buildTextContourPredictionMask (org.bytedeco.opencv.opencv_core.Mat contour, org.bytedeco.opencv.opencv_core.Rect contourBox)
  
  Builds and returns a mask for calculating the prediction score for the provided contour.
  
  The mask should adhere to the following requirements:
  
  - The mask should have the same dimensions as the contour box.
  - The data type should be CV_8U.
  - Pixels that should count towards the score should have a non-zero value in the mask.
  
  Parameters:
  
  contour - contour to build mask for
  
  contourBox - bounding box of the contour to build mask for
  
  Returns:
  
  the built mask
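
To illustrate how a mask meeting these requirements could be consumed, the following sketch averages prediction values over the non-zero mask pixels. It is an assumption-laden illustration: the class `MaskedScoreSketch`, the plain Java arrays standing in for `Mat` data, and the choice of a mean as the score are hypothetical, and the score calculator returned by `createScoreCalculator` may work differently.

```java
/** Hypothetical sketch of scoring a contour box through an 8-bit mask; not library code. */
final class MaskedScoreSketch {

    /**
     * Averages the predictions at positions where the mask is non-zero.
     * Both arrays cover the contour box only and must have equal dimensions.
     */
    static float meanScore(float[][] boxPreds, byte[][] mask) {
        if (boxPreds.length != mask.length) {
            throw new IllegalArgumentException("mask height must match the contour box");
        }
        double sum = 0;
        int count = 0;
        for (int y = 0; y < mask.length; y++) {
            if (boxPreds[y].length != mask[y].length) {
                throw new IllegalArgumentException("mask width must match the contour box");
            }
            for (int x = 0; x < mask[y].length; x++) {
                // Java bytes are signed, so an unsigned 255 reads as -1; compare against 0 only.
                if (mask[y][x] != 0) {
                    sum += boxPreds[y][x];
                    count++;
                }
            }
        }
        // Assumption: an empty mask yields a score of 0.
        return count == 0 ? 0f : (float) (sum / count);
    }
}
```

Comparing against zero, instead of testing for a specific value such as 1 or 255, matches the requirement that any non-zero mask pixel counts towards the score.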
- createScoreCalculator
  
  protected IScoreCalculator createScoreCalculator()
  
  Creates a new score calculator for calculating score over a text contour.
  
  Returns:
  
  a new score calculator
- mapPredToSample
  
  protected float mapPredToSample (float pred)
  
  Calculates the score sample value, based on a prediction value from the buffer.
  
  Parameters:
  
  pred - prediction value to map
  
  Returns:
  
  mapped score
- calcTextBoxEnlargement
  
  protected double calcTextBoxEnlargement (double width, double height)
  
  Calculates by how much the dimensions of a text box should be enlarged compared to the ones obtained from the model output.
  
  Parameters:
  
  width - original width of the text box
  
  height - original height of the text box
  
  Returns:
  
  value to enlarge the dimensions by
