Module NewOCR

Interface Actions

  • All Known Implementing Classes:
    OCRActions

    public interface Actions
    An interface that provides general actions for the OCR.
    Since:
    April 25, 2019
    • Method Detail

      • getLetters

        void getLetters(SearchImage searchImage,
                        java.util.List<SearchCharacter> searchCharacters)
        Gets the SearchCharacters found in the given SearchImage. The image is first divided into lines, and each line is then split horizontally into sections. Vertical padding is removed from each section, and any 'characters' that are 2x2 pixels or smaller are discarded. More information on this method can be found on page 55 of this paper.
        Parameters:
        searchImage - The image to scan
        searchCharacters - The list to which all found SearchCharacters are added
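
        The steps described for getLetters can be illustrated with a minimal, self-contained sketch. It is not the NewOCR implementation: the class SegmentationSketch, the boolean[][] image representation (true = ink), and the int[] bounding boxes are assumptions made for illustration, and "2x2 pixels or less" is interpreted here as both width and height being at most 2.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the segmentation steps; not part of NewOCR. */
final class SegmentationSketch {

    /**
     * Splits a binarized image (true = ink) into character bounding boxes.
     * Each box is {left, top, right, bottom}, all bounds inclusive.
     */
    static List<int[]> segment(boolean[][] ink) {
        List<int[]> boxes = new ArrayList<>();
        int height = ink.length;
        int width = height == 0 ? 0 : ink[0].length;

        int y = 0;
        while (y < height) {
            // Step 1: a text line is a run of consecutive rows that contain ink.
            if (!rowHasInk(ink, y, 0, width - 1)) { y++; continue; }
            int lineTop = y;
            while (y < height && rowHasInk(ink, y, 0, width - 1)) y++;
            int lineBottom = y - 1;

            // Step 2: within the line, a section is a run of columns that contain ink.
            int x = 0;
            while (x < width) {
                if (!columnHasInk(ink, x, lineTop, lineBottom)) { x++; continue; }
                int left = x;
                while (x < width && columnHasInk(ink, x, lineTop, lineBottom)) x++;
                int right = x - 1;

                // Step 3: remove vertical padding above and below the section.
                int top = lineTop;
                while (!rowHasInk(ink, top, left, right)) top++;
                int bottom = lineBottom;
                while (!rowHasInk(ink, bottom, left, right)) bottom--;

                // Step 4: discard specks that are 2x2 pixels or smaller (assumed meaning).
                int boxWidth = right - left + 1;
                int boxHeight = bottom - top + 1;
                if (boxWidth > 2 || boxHeight > 2) {
                    boxes.add(new int[] {left, top, right, bottom});
                }
            }
        }
        return boxes;
    }

    /** True if row y has ink anywhere in columns x0..x1 (inclusive). */
    private static boolean rowHasInk(boolean[][] ink, int y, int x0, int x1) {
        for (int x = x0; x <= x1; x++) {
            if (ink[y][x]) return true;
        }
        return false;
    }

    /** True if column x has ink anywhere in rows y0..y1 (inclusive). */
    private static boolean columnHasInk(boolean[][] ink, int x, int y0, int y1) {
        for (int y = y0; y <= y1; y++) {
            if (ink[y][x]) return true;
        }
        return false;
    }
}
```

        Calling SegmentationSketch.segment(ink) on a binarized image returns the boxes in reading order (top line first, left to right within a line). A real implementation would populate SearchCharacter objects rather than raw int[] boxes.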
      • getLettersDuringTraining

        java.util.List<CharacterLine> getLettersDuringTraining(SearchImage searchImage)
        Gets the SearchCharacters found in a training image. Unlike getLetters, this assumes that the image contains whole lines, which helps group characters.
        Parameters:
        searchImage - The training image to scan
        Returns:
        A list of CharacterLines, each containing the characters in one line
      • getCharacterFor

        java.util.Optional<ImageLetter> getCharacterFor(SearchCharacter searchCharacter,
                                                        IntPair lineBounds)
        Matches the SearchCharacter to a real character from the database, using the line bounds for improved accuracy.
        Parameters:
        searchCharacter - The input SearchCharacter to match
        lineBounds - The line bounds (the key is the top Y value, the value is the bottom Y value), used for improved accuracy
        Returns:
        The ImageLetter object with the DatabaseCharacter inside it containing the found character
      • getCharacterFor

        java.util.Optional<ImageLetter> getCharacterFor(SearchCharacter searchCharacter,
                                                        it.unimi.dsi.fastutil.objects.Object2DoubleMap<ImageLetter> diffs,
                                                        IntPair lineBounds)
        Matches the SearchCharacter to a real character from the database, using the line bounds for improved accuracy.
        Parameters:
        searchCharacter - The input SearchCharacter to match
        diffs - The potential ImageLetters
        lineBounds - The line bounds (the key is the top Y value, the value is the bottom Y value), used for improved accuracy
        Returns:
        The ImageLetter object with the DatabaseCharacter inside it containing the found character
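
        The page does not say how the diffs map is used to pick a result. As a rough illustration only, the sketch below assumes that each candidate maps to a difference score where lower means a closer match, and returns the candidate with the smallest score. It uses a plain java.util.Map in place of fastutil's Object2DoubleMap, ignores lineBounds entirely, and the class BestMatchSketch is hypothetical.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of selecting a match from candidate scores; not part of NewOCR. */
final class BestMatchSketch {

    /**
     * Returns the candidate with the lowest difference score,
     * or an empty Optional when there are no candidates.
     * Assumption: a lower score means a closer match.
     */
    static <T> Optional<T> bestMatch(Map<T, Double> diffs) {
        return diffs.entrySet().stream()
                .min(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }
}
```

        Returning an Optional mirrors the signatures above: a caller has to handle the case where no candidate is found, for example with ifPresent or orElse.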
      • getFontSize

        java.util.OptionalDouble getFontSize(ImageLetter imageLetter)
        Gets the estimated font size for the given ImageLetter.
        Parameters:
        imageLetter - The ImageLetter to get the font size of
        Returns:
        The font size in pixels
      • getLineBoundsForTraining

        java.util.List<IntPair> getLineBoundsForTraining(SearchImage image)
        Gets the top and bottom bounds of each line found in the image's 2D value array. This is used when getting characters for training data.
        Parameters:
        image - The image to get the line bounds from
        Returns:
        A list of the absolute top and bottom Y values of each line