Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.1
    • Fix Version/s: 4.2
    • Component/s: Core/Parsing
    • Labels:
      None
    • Environment:
      any

      Description

      It appears that we are incorrectly substituting the fonts for the OCR layer with a font that doesn't have the same width as the one used to generate the PDF. I've attached a screenshot that introduces an alpha value into the rendering stack so you can see the OCR text behind the image text.

        Activity

        Patrick Corless created issue -
        Patrick Corless made changes -
        Field Original Value New Value
        Salesforce Case []
        Fix Version/s 4.1.1 [ 10244 ]
        Patrick Corless made changes -
        Salesforce Case []
        Fix Version/s 4.2 [ 10243 ]
        Fix Version/s 4.1.1 [ 10244 ]
        Patrick Corless made changes -
        Salesforce Case []
        Assignee Priority P1
        Repository Revision Date User Message
        ICEsoft Public SVN Repository #24026 Fri Feb 25 15:05:28 MST 2011 patrick.corless PDF-200 adjusted how we apply horizontal text scaling on glyphs, instead of being accumulative it is now a direct set of the text transforms horizontal scale. Previously we would over scale text if more then one sz was specified per text block.
        Files Changed
        MODIFY /icepdf/trunk/icepdf/core/src/org/icepdf/core/util/ContentParser.java
        Patrick Corless added a comment -

        Finally had a chance to take a close look at this selection issue. The PDF in question exposes a small bug in the content parser, where we were multiplying each new horizontal text scaling value against the previous one. So if more than one "Tz" was specified per text block, we would gradually shrink the text.

        For example

        81 Tz
        65 Tz

        The first scale is 81% of the font width, and the second was then applied as 65% of the previous value. The correct handling is to treat each as a separate scale. Once the logic was adjusted, the text selection seems to correspond more directly with the original graphic/OCR capture.

        Took a while to find this one.
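
        The difference between the two behaviours described in this comment can be illustrated with a minimal Java sketch. It is not the ICEpdf code from ContentParser.java; the class and method names (HorizontalScalingSketch, applyCumulative, applyDirect) are hypothetical and chosen only for this illustration. It assumes the usual PDF convention that the Tz operand is a percentage, with 100 meaning normal width.

```java
import java.util.Locale;

// Illustrative sketch only; names are hypothetical and not taken from ICEpdf.
public class HorizontalScalingSketch {

    // Buggy behaviour described above: each Tz operand is multiplied
    // against the scale that is already in effect.
    static double applyCumulative(double currentScale, double tzOperand) {
        return currentScale * (tzOperand / 100.0);
    }

    // Corrected behaviour: each Tz operand replaces the current scale.
    static double applyDirect(double tzOperand) {
        return tzOperand / 100.0;
    }

    public static void main(String[] args) {
        double[] operands = {81, 65}; // the "81 Tz" / "65 Tz" sequence from the comment

        double cumulative = 1.0; // scale tracked the old, accumulating way
        double direct = 1.0;     // scale tracked the corrected way
        for (double tz : operands) {
            cumulative = applyCumulative(cumulative, tz);
            direct = applyDirect(tz);
        }

        // The accumulating scale ends up narrower than the direct one.
        System.out.println(String.format(Locale.ROOT, "cumulative: %.4f", cumulative));
        System.out.println(String.format(Locale.ROOT, "direct:     %.4f", direct));
    }
}
```

        With the two operands from the example, the accumulating version ends at 0.81 × 0.65 of the font width, while the direct version ends at 0.65, which is why selection boxes drifted away from the scanned text as more Tz operators appeared in a block.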

        Patrick Corless made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Repository Revision Date User Message
        ICEsoft Public SVN Repository #24151 Tue Mar 15 11:52:26 MDT 2011 patrick.corless PDF-200 reworded how we apply hScale, as we need ot apply negative values to flip the layout, width are not applied otherwise.
        Files Changed
        MODIFY /icepdf/trunk/icepdf/core/src/org/icepdf/core/util/ContentParser.java
        Ken Fyten made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Patrick Corless
            Reporter:
            Patrick Corless
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated:
              Resolved: