Class OnnxDetectionPostProcessor

java.lang.Object
com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPostProcessor
All Implemented Interfaces:
IDetectionPostProcessor

public class OnnxDetectionPostProcessor extends Object implements IDetectionPostProcessor
Implementation of a text detection predictor post-processor, used for OnnxTR model outputs.

The current implementation works roughly as follows:

  1. Model output is binarized and then cleaned up via erosion and dilation.
  2. Sufficiently large contours are found in the image produced by the previous step.
  3. Contours whose certainty score is below the score threshold are discarded.
  4. Remaining contours are wrapped into boxes with relative [0, 1] coordinates.
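
The steps above can be sketched in plain Java. This is only an illustration under stated assumptions, not the library's code: the class name DetectionPostProcessSketch, the minPixels size filter, the use of 4-connected flood fill in place of contour finding, and the use of the mean probability as the certainty score are all hypothetical, and the erosion/dilation cleanup is omitted for brevity.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of the four steps; not the iText implementation.
final class DetectionPostProcessSketch {

    /** Step 1 (erosion/dilation omitted): values >= threshold map to 1 (true), others to 0. */
    static boolean[][] binarize(float[][] probMap, float binarizationThreshold) {
        int h = probMap.length, w = probMap[0].length;
        boolean[][] mask = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                mask[y][x] = probMap[y][x] >= binarizationThreshold;
            }
        }
        return mask;
    }

    /** Steps 2-4: returns boxes as {xMin, yMin, xMax, yMax} in relative [0, 1] coordinates. */
    static List<float[]> detect(float[][] probMap, float binarizationThreshold,
                                float scoreThreshold, int minPixels) {
        int h = probMap.length, w = probMap[0].length;
        boolean[][] mask = binarize(probMap, binarizationThreshold);
        boolean[][] seen = new boolean[h][w];
        List<float[]> boxes = new ArrayList<>();
        for (int sy = 0; sy < h; sy++) {
            for (int sx = 0; sx < w; sx++) {
                if (!mask[sy][sx] || seen[sy][sx]) {
                    continue;
                }
                // Flood-fill one 4-connected region (a stand-in for contour finding).
                int minX = sx, maxX = sx, minY = sy, maxY = sy, count = 0;
                double sum = 0;
                Deque<int[]> stack = new ArrayDeque<>();
                stack.push(new int[] {sx, sy});
                seen[sy][sx] = true;
                while (!stack.isEmpty()) {
                    int[] p = stack.pop();
                    int x = p[0], y = p[1];
                    count++;
                    sum += probMap[y][x];
                    minX = Math.min(minX, x);
                    maxX = Math.max(maxX, x);
                    minY = Math.min(minY, y);
                    maxY = Math.max(maxY, y);
                    int[][] neighbors = {{x + 1, y}, {x - 1, y}, {x, y + 1}, {x, y - 1}};
                    for (int[] n : neighbors) {
                        int nx = n[0], ny = n[1];
                        if (nx >= 0 && nx < w && ny >= 0 && ny < h
                                && mask[ny][nx] && !seen[ny][nx]) {
                            seen[ny][nx] = true;
                            stack.push(new int[] {nx, ny});
                        }
                    }
                }
                if (count < minPixels) {
                    continue; // step 2: region is too small
                }
                if (sum / count < scoreThreshold) {
                    continue; // step 3: score (assumed: mean probability) is too low
                }
                // Step 4: wrap the region into a box with relative coordinates.
                boxes.add(new float[] {
                    (float) minX / w, (float) minY / h,
                    (float) (maxX + 1) / w, (float) (maxY + 1) / h
                });
            }
        }
        return boxes;
    }

    public static void main(String[] args) {
        float[][] probMap = {
            {0.0f, 0.0f, 0.0f, 0.0f},
            {0.0f, 0.9f, 0.8f, 0.0f},
            {0.0f, 0.9f, 0.8f, 0.0f},
            {0.0f, 0.0f, 0.0f, 0.2f},
        };
        for (float[] box : detect(probMap, 0.1f, 0.5f, 1)) {
            System.out.println(Arrays.toString(box)); // prints [0.25, 0.25, 0.75, 0.75]
        }
    }
}
```

In this toy input the isolated 0.2 pixel survives binarization but is dropped in step 3 because its score is below the score threshold, so only the central 2x2 region is reported.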
  • Constructor Details

    • OnnxDetectionPostProcessor

      public OnnxDetectionPostProcessor(float binarizationThreshold, float scoreThreshold)
      Creates a new post-processor.
      Parameters:
      binarizationThreshold - threshold value used when binarizing a monochromatic image. If a pixel value is greater than or equal to the threshold, it is mapped to 1; otherwise, it is mapped to 0
      scoreThreshold - score threshold for a detected box. If the score is lower than this value, the box is discarded
    • OnnxDetectionPostProcessor

      public OnnxDetectionPostProcessor()
      Creates a new post-processor with the default threshold values.
  • Method Details

    • process

      public List process(BufferedImage input, FloatBufferMdArray output)
      Processes the ML model output for the specified image and returns a list of detected objects.
      Specified by:
      process in interface IDetectionPostProcessor
      Parameters:
      input - input image that was used to produce the inputs to the ML model
      output - normalized output of the ML model
      Returns:
      a list of detected objects. See interface documentation for more information
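
Because the detected boxes use relative [0, 1] coordinates, a caller usually needs to scale them by the dimensions of the input image to get pixel positions. The sketch below shows that conversion. The element type of the returned list is not stated on this page, so the float[] layout {xMin, yMin, xMax, yMax} and the RelativeBoxSketch helper are assumptions made purely for illustration.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

// Hypothetical helper; the box layout {xMin, yMin, xMax, yMax} is an assumption.
final class RelativeBoxSketch {

    /** Scales a relative [0, 1] box to pixel coordinates of the given image. */
    static Rectangle toPixels(float[] relativeBox, BufferedImage image) {
        int w = image.getWidth();
        int h = image.getHeight();
        int x0 = Math.round(relativeBox[0] * w);
        int y0 = Math.round(relativeBox[1] * h);
        int x1 = Math.round(relativeBox[2] * w);
        int y1 = Math.round(relativeBox[3] * h);
        return new Rectangle(x0, y0, x1 - x0, y1 - y0);
    }

    public static void main(String[] args) {
        BufferedImage image = new BufferedImage(200, 100, BufferedImage.TYPE_INT_RGB);
        Rectangle r = toPixels(new float[] {0.25f, 0.25f, 0.75f, 0.75f}, image);
        // For a 200x100 image this yields x=50, y=25, width=100, height=50.
        System.out.println(r.x + " " + r.y + " " + r.width + " " + r.height);
    }
}
```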