ICEpdf / PDF-317

Page.getText() is not returning all page text

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.2.1
    • Fix Version/s: 4.2.2
    • Component/s: Core/Parsing
    • Labels:
      None
    • Environment:
      any

      Description

      A forum user has identified a bug where text is missing from the result of page.getText(). This method is used in the RI by the TextExtractionTask. It turns out that the "optimized" text extraction call does not initialize PDF XForm objects, so it misses quite a bit of content during extraction.

      The content parser method parseTextBlocks() needs to be updated to ensure that XForm objects are correctly initialized and parsed.
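
The failure mode described above can be illustrated with a small, self-contained sketch. This is not ICEpdf code: the ContentStream class, the "Tj"/"Do" operator tuples, and the extractShallow/extractDeep helpers are hypothetical stand-ins chosen for illustration. It only shows why a text pass that skips form XObject invocations drops content, and how descending into each invoked form recovers it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class XFormTextSketch {

    // Hypothetical model of a content stream: an ordered operator list plus
    // the form XObjects that its resources make available by name.
    static final class ContentStream {
        final List<String[]> operators = new ArrayList<>();
        final Map<String, ContentStream> forms = new HashMap<>();

        // Append a text-showing operator.
        ContentStream text(String s) {
            operators.add(new String[] {"Tj", s});
            return this;
        }

        // Register a form XObject and append the operator that paints it.
        ContentStream invoke(String name, ContentStream form) {
            forms.put(name, form);
            operators.add(new String[] {"Do", name});
            return this;
        }
    }

    // Mirrors the reported defect: "Do" operators are ignored, so any text
    // that lives inside a form XObject never reaches the result.
    static List<String> extractShallow(ContentStream cs) {
        List<String> out = new ArrayList<>();
        for (String[] op : cs.operators) {
            if (op[0].equals("Tj")) {
                out.add(op[1]);
            }
        }
        return out;
    }

    // Sketch of the fix: when "Do" names a form XObject, parse that form's
    // content in place so its text appears in document order.
    static List<String> extractDeep(ContentStream cs) {
        List<String> out = new ArrayList<>();
        collect(cs, out);
        return out;
    }

    private static void collect(ContentStream cs, List<String> out) {
        for (String[] op : cs.operators) {
            if (op[0].equals("Tj")) {
                out.add(op[1]);
            } else if (op[0].equals("Do")) {
                ContentStream form = cs.forms.get(op[1]);
                if (form != null) {
                    collect(form, out); // forms may themselves invoke forms
                }
            }
        }
    }

    // Sample page: top-level text plus two nested forms.
    static ContentStream samplePage() {
        ContentStream footer = new ContentStream().text("Page 1 of 3");
        ContentStream body = new ContentStream()
                .text("Nested paragraph")
                .invoke("Fm2", footer);
        return new ContentStream()
                .text("Title")
                .invoke("Fm1", body)
                .text("Closing line");
    }

    public static void main(String[] args) {
        ContentStream page = samplePage();
        System.out.println("shallow: " + extractShallow(page));
        System.out.println("deep:    " + extractDeep(page));
    }
}
```

On the sample page, the shallow pass returns only "Title" and "Closing line", while the deep pass also returns "Nested paragraph" and "Page 1 of 3". A real parser would additionally need to guard against forms that reference themselves; the sketch omits that check.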

        Activity

        Patrick Corless created issue
        Patrick Corless made changes:
          Field        Original Value    New Value
          Status       Open [ 1 ]        Resolved [ 5 ]
          Resolution                     Fixed [ 1 ]
        Ken Fyten made changes:
          Field        Original Value    New Value
          Status       Resolved [ 5 ]    Closed [ 6 ]

          People

          • Assignee:
            Patrick Corless
            Reporter:
            Patrick Corless
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated:
              Resolved: