OnnxDetectionPostProcessor (pdfOCR 4.1.0 API)

java.lang.Object

com.itextpdf.pdfocr.onnxtr.detection.OnnxDetectionPostProcessor

All Implemented Interfaces: IDetectionPostProcessor

public class OnnxDetectionPostProcessor extends Object implements IDetectionPostProcessor

Implementation of a text detection predictor post-processor, used for OnnxTR model outputs.

The current implementation works roughly as follows:

1. The model output is binarized and then cleaned up via erosion and dilation.
2. Sufficiently large contours are found in the image produced by the previous step.
3. Contours whose certainty score is below the score threshold are discarded.
4. The remaining contours are wrapped into boxes with relative [0, 1] coordinates.
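
The steps above can be illustrated with a minimal, self-contained sketch. It is not the pdfOCR implementation: it skips erosion and dilation, replaces contour finding with a 4-connected flood fill, and assumes that the score of a region is the mean of its probabilities. The class name `DetectionSketch`, the `minPixels` parameter, and the `float[] {x0, y0, x1, y1}` box layout are hypothetical and chosen only for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the pdfOCR implementation. */
final class DetectionSketch {

    /**
     * Finds regions in a probability map and returns them as boxes
     * {x0, y0, x1, y1} in relative [0, 1] coordinates (assumed layout).
     */
    static List<float[]> detect(float[][] prob, float binarizationThreshold,
                                float scoreThreshold, int minPixels) {
        int h = prob.length;
        int w = prob[0].length;
        boolean[][] seen = new boolean[h][w];
        List<float[]> boxes = new ArrayList<>();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Binarization: pixels below the threshold count as background.
                if (seen[y][x] || prob[y][x] < binarizationThreshold) {
                    continue;
                }
                // Flood-fill one 4-connected region (stand-in for contour finding).
                int minX = x, maxX = x, minY = y, maxY = y, count = 0;
                double sum = 0;
                ArrayDeque<int[]> queue = new ArrayDeque<>();
                queue.add(new int[] {x, y});
                seen[y][x] = true;
                while (!queue.isEmpty()) {
                    int[] p = queue.poll();
                    int px = p[0], py = p[1];
                    sum += prob[py][px];
                    count++;
                    minX = Math.min(minX, px);
                    maxX = Math.max(maxX, px);
                    minY = Math.min(minY, py);
                    maxY = Math.max(maxY, py);
                    int[][] next = {{px + 1, py}, {px - 1, py}, {px, py + 1}, {px, py - 1}};
                    for (int[] n : next) {
                        int nx = n[0], ny = n[1];
                        if (nx >= 0 && nx < w && ny >= 0 && ny < h
                                && !seen[ny][nx] && prob[ny][nx] >= binarizationThreshold) {
                            seen[ny][nx] = true;
                            queue.add(n);
                        }
                    }
                }
                // Discard regions that are too small or score too low (assumed: mean probability).
                if (count < minPixels || sum / count < scoreThreshold) {
                    continue;
                }
                // Wrap the region into an axis-aligned box with relative coordinates.
                boxes.add(new float[] {
                    (float) minX / w, (float) minY / h,
                    (float) (maxX + 1) / w, (float) (maxY + 1) / h});
            }
        }
        return boxes;
    }

    public static void main(String[] args) {
        float[][] prob = new float[4][4];
        prob[1][1] = 0.9f;
        prob[1][2] = 0.9f;
        prob[2][1] = 0.9f;
        prob[2][2] = 0.9f;
        for (float[] box : detect(prob, 0.3f, 0.5f, 1)) {
            System.out.println(java.util.Arrays.toString(box));
        }
    }
}
```

In this simplified scoring, every pixel of a region is already at or above the binarization threshold, so the score threshold only has an effect when it is higher than the binarization threshold.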

Constructor Summary

Constructors

OnnxDetectionPostProcessor() - Creates a new post-processor with the default threshold values.

OnnxDetectionPostProcessor(float binarizationThreshold, float scoreThreshold) - Creates a new post-processor.

Method Summary

List process(BufferedImage input, FloatBufferMdArray output) - Process ML model output for a specified image and return a list of detected objects.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- OnnxDetectionPostProcessor
  
  public OnnxDetectionPostProcessor(float binarizationThreshold, float scoreThreshold)
  
  Creates a new post-processor.
  
  Parameters:
  
  binarizationThreshold - threshold value used when binarizing a monochromatic image. If a pixel value is greater than or equal to the threshold, it is mapped to 1; otherwise it is mapped to 0
  
  scoreThreshold - score threshold for a detected box. If the score is lower than this value, the box is discarded
- OnnxDetectionPostProcessor
  
  public OnnxDetectionPostProcessor()
  
  Creates a new post-processor with the default threshold values.
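
The boundary behavior of the two constructor parameters can be written out as a small sketch. It only restates the parameter descriptions above: a pixel equal to the binarization threshold maps to 1, and a box whose score equals the score threshold is kept. The class `ThresholdSketch` and its method names are hypothetical and are not part of the pdfOCR API.

```java
/** Illustrative sketch of the documented threshold semantics; not library code. */
final class ThresholdSketch {

    /** Maps a pixel to 1 if it is greater than or equal to the threshold, otherwise to 0. */
    static int binarize(float pixel, float binarizationThreshold) {
        return pixel >= binarizationThreshold ? 1 : 0;
    }

    /** A box is discarded only if its score is strictly lower than the threshold. */
    static boolean keepBox(float score, float scoreThreshold) {
        return !(score < scoreThreshold);
    }

    public static void main(String[] args) {
        // The threshold values here are arbitrary examples, not the library defaults.
        System.out.println(binarize(0.3f, 0.3f));
        System.out.println(keepBox(0.49f, 0.5f));
    }
}
```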

Method Details
- process
  
  public List process(BufferedImage input, FloatBufferMdArray output)
  
  Process ML model output for a specified image and return a list of detected objects.
  
  Specified by:
  
  process in interface IDetectionPostProcessor
  
  Parameters:
  
  input - the input image that was used to produce the inputs to the ML model
  
  output - normalized output of the ML model
  
  Returns:
  
  a list of detected objects. See the IDetectionPostProcessor interface documentation for more information
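
Because the detected boxes use relative [0, 1] coordinates, a caller will typically scale them by the dimensions of the input image. The sketch below shows that conversion under stated assumptions: the element type of the returned list is not shown on this page, so the box is represented here as a hypothetical `float[] {x0, y0, x1, y1}`, and the class `RelativeBoxSketch` is not part of the pdfOCR API.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

/** Illustrative sketch; the float[] box layout is an assumption. */
final class RelativeBoxSketch {

    /** Scales a relative box {x0, y0, x1, y1} to pixel coordinates of the given image. */
    static Rectangle toPixels(float[] box, BufferedImage image) {
        int x0 = Math.round(box[0] * image.getWidth());
        int y0 = Math.round(box[1] * image.getHeight());
        int x1 = Math.round(box[2] * image.getWidth());
        int y1 = Math.round(box[3] * image.getHeight());
        return new Rectangle(x0, y0, x1 - x0, y1 - y0);
    }

    public static void main(String[] args) {
        BufferedImage image = new BufferedImage(200, 100, BufferedImage.TYPE_INT_RGB);
        System.out.println(toPixels(new float[] {0.25f, 0.5f, 0.75f, 1.0f}, image));
    }
}
```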