Class OnnxDetectionPostProcessor

java.lang.Object
com.itextpdf.pdfocr.onnx.detection.BasicDetectionPostProcessor
com.itextpdf.pdfocr.onnx.detection.OnnxDetectionPostProcessor
All Implemented Interfaces:
IDetectionPostProcessor

public class OnnxDetectionPostProcessor extends BasicDetectionPostProcessor
Implementation of a text detection predictor post-processor, used for OnnxTR model outputs.

The current implementation works roughly as follows:

  1. Model output is binarized and then cleaned up via erosion and dilation.
  2. Large-enough contours are found in the image produced by the previous step.
  3. Contours with a certainty score below the score threshold are discarded.
  4. Remaining contours are wrapped into boxes with relative [0, 1] coordinates.
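
The four steps above can be illustrated with a minimal, library-free sketch. It is not the actual implementation: the class name DetectionPostProcessSketch, the process method and the minPixels parameter are hypothetical, erosion and dilation are omitted, and 4-connected regions with axis-aligned bounding boxes stand in for the OpenCV contours the real class works with. The region score is assumed here to be the mean prediction value inside the region.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Illustrative sketch only; all names here are hypothetical, not library API. */
final class DetectionPostProcessSketch {

    /** Returns boxes as {x, y, width, height} in relative [0, 1] coordinates. */
    static List<float[]> process(float[][] probMap, float binarizationThreshold,
            float scoreThreshold, int minPixels) {
        int h = probMap.length;
        int w = probMap[0].length;

        // Step 1: binarize; a value >= threshold maps to 1 (true).
        // The erosion/dilation clean-up is left out of this sketch.
        boolean[][] mask = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                mask[y][x] = probMap[y][x] >= binarizationThreshold;
            }
        }

        List<float[]> boxes = new ArrayList<>();
        boolean[][] seen = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (!mask[y][x] || seen[y][x]) {
                    continue;
                }
                // Step 2: flood-fill one 4-connected region (stand-in for a contour).
                int minX = x, maxX = x, minY = y, maxY = y, count = 0;
                double sum = 0;
                Deque<int[]> stack = new ArrayDeque<>();
                stack.push(new int[] {x, y});
                seen[y][x] = true;
                while (!stack.isEmpty()) {
                    int[] p = stack.pop();
                    int px = p[0], py = p[1];
                    count++;
                    sum += probMap[py][px];
                    minX = Math.min(minX, px);
                    maxX = Math.max(maxX, px);
                    minY = Math.min(minY, py);
                    maxY = Math.max(maxY, py);
                    int[][] neighbours = {{px + 1, py}, {px - 1, py}, {px, py + 1}, {px, py - 1}};
                    for (int[] n : neighbours) {
                        int nx = n[0], ny = n[1];
                        if (nx >= 0 && nx < w && ny >= 0 && ny < h
                                && mask[ny][nx] && !seen[ny][nx]) {
                            seen[ny][nx] = true;
                            stack.push(n);
                        }
                    }
                }
                if (count < minPixels) {
                    continue; // region is not large enough
                }
                // Step 3: discard regions whose mean score is below the threshold.
                if (sum / count < scoreThreshold) {
                    continue;
                }
                // Step 4: wrap the region into a box with relative coordinates.
                boxes.add(new float[] {
                        (float) minX / w, (float) minY / h,
                        (float) (maxX - minX + 1) / w, (float) (maxY - minY + 1) / h});
            }
        }
        return boxes;
    }
}
```

In this sketch, a region whose score equals the score threshold is kept, which matches the "lower than this value" wording of the scoreThreshold parameter.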
  • Constructor Details

    • OnnxDetectionPostProcessor

      public OnnxDetectionPostProcessor (float binarizationThreshold, float scoreThreshold)
      Creates a new post-processor.
      Parameters:
      binarizationThreshold - threshold value used when binarizing a monochromatic image. If a pixel value is greater than or equal to the threshold, it is mapped to 1; otherwise, it is mapped to 0
      scoreThreshold - score threshold for a detected box. If the score is lower than this value, the box is discarded
    • OnnxDetectionPostProcessor

      public OnnxDetectionPostProcessor()
      Creates a new post-processor with the default threshold values.
  • Method Details

    • findTextContours

      protected org.bytedeco.opencv.opencv_core.MatVector findTextContours (org.bytedeco.opencv.opencv_core.Mat mask)
      Extracts text contours from the provided mask, whose values are in the 0-255 range.
      Overrides:
      findTextContours in class BasicDetectionPostProcessor
      Parameters:
      mask - mask to find contours in; it may be modified, but should not be closed
      Returns:
      found text contours
    • mapPredToSample

      protected float mapPredToSample (float pred)
      Calculates the score sample value, based on a prediction value from the buffer.
      Overrides:
      mapPredToSample in class BasicDetectionPostProcessor
      Parameters:
      pred - prediction value to map
      Returns:
      mapped score
    • calcTextBoxEnlargement

      protected double calcTextBoxEnlargement (double width, double height)
      Calculates how much the dimensions of a text box should be enlarged relative to those obtained from the model output.
      Overrides:
      calcTextBoxEnlargement in class BasicDetectionPostProcessor
      Parameters:
      width - original width of the text box
      height - original height of the text box
      Returns:
      value to enlarge the dimensions by