public class TesseractHelper extends Object
Modifier and Type | Method and Description |
---|---|
static Map<Integer,List<TextInfo>> | parseHocrFile(List<File> inputFiles, TextPositioning textPositioning): Parses each hocr file from the provided list, retrieves text, and returns data in the format described below. |
public static Map<Integer,List<TextInfo>> parseHocrFile(List<File> inputFiles, TextPositioning textPositioning) throws IOException
Parameters:
inputFiles - list of input files
textPositioning - the TextPositioning mode to apply when retrieving text
Returns:
Map where the key is an Integer representing the page number and the value is a List of TextInfo elements; each TextInfo element contains a word or a line and the 4 coordinates of its bounding box (bbox)
Throws:
IOException - if an error occurs while reading one of the provided files
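Example (illustrative only): the sketch below shows one way a caller might walk a result shaped like the return value of parseHocrFile, that is, a map from page number to a list of text elements with 4 bbox coordinates each. It does not call the library. TextInfoStub, pageText, and sampleResult are hypothetical names invented for this sketch, and the page key 1 and the bbox coordinate order are assumptions, not documented behavior.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class HocrResultSketch {

    // Hypothetical stand-in for TextInfo: a word or line plus its 4 bbox coordinates.
    static final class TextInfoStub {
        final String text;
        final float[] bbox; // 4 coordinates; their order is an assumption of this sketch

        TextInfoStub(String text, float... bbox) {
            if (bbox.length != 4) {
                throw new IllegalArgumentException("bbox needs exactly 4 coordinates");
            }
            this.text = text;
            this.bbox = bbox.clone();
        }
    }

    // Joins the text of every element on one page with single spaces.
    // Returns an empty string when the map has no entry for that page.
    static String pageText(Map<Integer, List<TextInfoStub>> pages, int pageNumber) {
        List<TextInfoStub> items = pages.get(pageNumber);
        if (items == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (TextInfoStub item : items) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(item.text);
        }
        return sb.toString();
    }

    // Builds made-up sample data in the documented shape (page number -> elements).
    static Map<Integer, List<TextInfoStub>> sampleResult() {
        Map<Integer, List<TextInfoStub>> pages = new LinkedHashMap<>();
        List<TextInfoStub> firstPage = new ArrayList<>();
        firstPage.add(new TextInfoStub("Hello", 10f, 20f, 60f, 35f));
        firstPage.add(new TextInfoStub("world", 65f, 20f, 120f, 35f));
        pages.put(1, firstPage); // page key 1 is an assumption
        return pages;
    }

    public static void main(String[] args) {
        Map<Integer, List<TextInfoStub>> pages = sampleResult();
        for (Map.Entry<Integer, List<TextInfoStub>> entry : pages.entrySet()) {
            System.out.println("page " + entry.getKey() + ": "
                    + pageText(pages, entry.getKey()));
        }
    }
}
```

With the real API, the map would come from parseHocrFile and the elements would be TextInfo instances. Check the TextInfo documentation for the actual accessor names before adapting this.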
Copyright © 1998–2020 iText Group NV. All rights reserved.