ICEpdf / PDF-110

Demonstrate capturing to multi-page fax TIFF

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.0 - Beta
    • Fix Version/s: 4.0
    • Component/s: Examples
    • Labels:
      None
    • Environment:
      JAI ImageIO Tools

      Description

      We should build on our existing page to image capturing example, by showing:
      - Using black and white image, instead of RGB image
      - Writing to TIFF, which ImageIO doesn't support out of the box
      - Writing to multi-page TIFF, instead of just one page per output image file
      - Calculating zoom level from page resolution, to target a specific DPI and image quality level
      - Specifying output image compression type

        Activity

        Mark Collette added a comment -

        This capture example opens a PDF whose path is given on the command line, goes through every page in it, draws the page onto a black and white image, and feeds that to ImageIO, which writes a multi-page CCITT fax group 4 compressed TIFF file. The zoom level used on each page is calculated to give 300 (print) or 200 (fax) DPI of resolution.

        Subversion 20023
        icepdf\examples\captureMultiple\MultiPageCapture.java

        Subversion 22150
        icepdf-pro\resources\lib\jai_imageio.jar
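
        For readers without the example source at hand, the ImageIO sequence-writing calls that produce a multi-page TIFF can be sketched roughly as below. This is a minimal sketch and not the code in MultiPageCapture.java: the class name MultiPageTiffSketch and its write method are hypothetical, and the "CCITT T.6" compression name is assumed to be what the installed TIFF plugin uses for fax group 4, so the sketch only requests it when the plugin advertises it.

```java
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of ICEpdf.
final class MultiPageTiffSketch {

    // Assumed plugin-specific name for fax group 4 compression.
    static final String GROUP4 = "CCITT T.6";

    /** Writes every image in pages as one page of a single TIFF file. */
    static void write(List<BufferedImage> pages, File out) throws IOException {
        Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("tiff");
        if (!writers.hasNext()) {
            throw new IllegalStateException("No ImageIO TIFF writer plugin available");
        }
        ImageWriter writer = writers.next();
        ImageWriteParam param = writer.getDefaultWriteParam();

        // Only ask for group 4 if this plugin lists it; otherwise keep its default.
        if (param.canWriteCompressed()) {
            String[] types = param.getCompressionTypes();
            if (types != null && Arrays.asList(types).contains(GROUP4)) {
                param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
                param.setCompressionType(GROUP4);
            }
        }

        // Start from a fresh file so no stale bytes trail the new TIFF.
        Files.deleteIfExists(out.toPath());
        try (ImageOutputStream ios = ImageIO.createImageOutputStream(out)) {
            writer.setOutput(ios);
            writer.prepareWriteSequence(null);      // begin the multi-page sequence
            for (BufferedImage page : pages) {
                writer.writeToSequence(new IIOImage(page, null, null), param);
            }
            writer.endWriteSequence();              // finish the file
        } finally {
            writer.dispose();
        }
    }
}
```

        The pages passed in are expected to be black and white (1-bit) images, since group 4 compression applies to bilevel data.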

        Mark Collette added a comment -

        I was able to find several examples online of using JAI, and of using ImageIO, to write a multi-page TIFF. These provided a good starting point.

        I originally looked into using JAI directly, but found the ImageIO API easier to use. As well, there's no need for JAI dependencies, or for reflection, when coding to the ImageIO interface, and alternate implementations can be plugged in. The compression type parameter for group 4 seems to be specific to JAI's implementation, though.

        I tried using Document.getPageImage, but found that the TIFF writer would not automatically convert the RGB source image into black and white. It required a black and white source image, which meant creating the image myself and using the slightly more involved Document.paintPage API.
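
        The "create the image myself" step can be illustrated with plain Java 2D. This is a sketch under assumptions: BilevelPageImageSketch and its render method are hypothetical names, and the painter callback stands in for the call to Document.paintPage, which is not reproduced here.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.function.Consumer;

// Hypothetical helper, not part of ICEpdf.
final class BilevelPageImageSketch {

    /**
     * Creates a 1-bit black and white image, fills it with white, and hands
     * its Graphics2D to the painter, which is where the page would be drawn.
     */
    static BufferedImage render(int widthPx, int heightPx, Consumer<Graphics2D> painter) {
        BufferedImage image =
                new BufferedImage(widthPx, heightPx, BufferedImage.TYPE_BYTE_BINARY);
        Graphics2D g = image.createGraphics();
        try {
            // A new TYPE_BYTE_BINARY image starts out all black, so paint the paper first.
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, widthPx, heightPx);
            painter.accept(g);
        } finally {
            g.dispose();
        }
        return image;
    }
}
```

        In the real example the painter would be the place to call Document.paintPage with the page index and zoom, so the page content lands directly in the black and white image rather than in an RGB one.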

        I was going to use a fixed zoom level, assuming a standard 72 DPI input, but reconsidered: once in a while we are likely to get higher resolution inputs, and I did not want to waste space by blowing those up, so I switched to calculating the zoom level needed to reach a target DPI. Note that the calculation is still specific to US Letter paper for printing. If someone wanted to output 300 DPI on a differently sized page, the calculation would need to change. That is part of the trade-off of capturing images to a file: you have to know ahead of time what resolution will be sufficient when trying to minimise file size.
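
        One way such a zoom calculation could look is sketched below. The formula is an assumption, not taken from MultiPageCapture.java: it treats the output as a US Letter sheet (8.5 x 11 inches) at the target DPI and scales the page to fit that pixel size. LetterZoomSketch and zoomFor are hypothetical names.

```java
// Hypothetical helper, not part of ICEpdf.
final class LetterZoomSketch {

    static final double LETTER_WIDTH_IN = 8.5;
    static final double LETTER_HEIGHT_IN = 11.0;

    /**
     * Returns the zoom factor that scales a page of the given size (in the
     * page's own units at zoom 1.0) to fit a US Letter sheet at targetDpi.
     */
    static float zoomFor(double pageWidth, double pageHeight, int targetDpi) {
        if (pageWidth <= 0 || pageHeight <= 0 || targetDpi <= 0) {
            throw new IllegalArgumentException("page size and DPI must be positive");
        }
        double targetWidthPx = LETTER_WIDTH_IN * targetDpi;    // 2550 px at 300 DPI
        double targetHeightPx = LETTER_HEIGHT_IN * targetDpi;  // 3300 px at 300 DPI
        // Fit both dimensions, so a page with a different aspect ratio is not clipped.
        return (float) Math.min(targetWidthPx / pageWidth, targetHeightPx / pageHeight);
    }
}
```

        For a standard 612 x 792 page this gives 300 / 72, about 4.17, at 300 DPI and 200 / 72, about 2.78, at 200 DPI. A page that is already 2550 x 3300 units gets a zoom of 1.0 at 300 DPI, so higher resolution inputs are not blown up further.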

        Mark Collette added a comment -

        An image of a signature is showing up in the viewer window, but not in the extracted TIFF image. The details of the image are:

        Decode=[0.0, 1.0]
        Filter=CCITTFaxDecode
        Intent=RelativeColorimetric
        DecodeParms={K=-1, Columns=564, Rows=156}
        BitsPerComponent=1
        ImageMask=true

        Mark Collette added a comment -

        At first I assumed that the viewer environment was different than the extraction environment, because of either classpath, or some encryption callback, or other hook that the viewer does outside of the core, in the RI. I tracked down every possible JAR and callback, and tried either adding them to the extraction, or removing them from the viewer. Nothing made a difference.

        Then, I tracked into the Stream and Shapes classes, to examine the resulting parsed image, thinking that this would show whether the problem was parsing the image or displaying it. Those classes showed that they had the signature image in a valid state, properly parsed.

        So I then examined what could be causing the image to not be displayed properly. I thought that maybe the clipping was working for one and not the other, maybe due to some round-off error. When disabling the clipping and transforming, I saw that the black area then moved and grew, which indicated that the image was being drawn, just somehow as all black, even though it showed up as a signature when painted on the viewer frame and in my test JFrame. Commenting out the image drawing command made the black disappear. So the black rectangle was not the image being omitted; it was how the image was coming across. Somehow it was being drawn differently in the two contexts.

        This made me think that I should draw it into a regular RGB image, instead of directly into the black and white image, since maybe something about the destination image made it lose information, leaving only black. But the source image was black and white too, so that didn't add up. Then it occurred to me to check whether the source image had any alpha-related attributes. It did, being an image mask. The destination image was opaque and did not support alpha. So I experimented with making the destination image non-opaque, using either a bit mask or translucency. Both worked, in Java 1.6 and Java 1.5, so I went with the bit mask.

        Subversion 20098
        icepdf\examples\captureMultiple\MultiPageCapture.java
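
        To show what a non-opaque black and white destination image can look like in Java 2D, here is a minimal sketch. It is not the committed fix: BitmaskBilevelSketch is a hypothetical name, and marking the white palette entry as the transparent one is an arbitrary choice made only for illustration, since the comment above does not say how the bit mask was set up.

```java
import java.awt.image.BufferedImage;
import java.awt.image.IndexColorModel;

// Hypothetical helper, not part of ICEpdf.
final class BitmaskBilevelSketch {

    /** Creates a 1-bit image whose color model reports bit mask transparency. */
    static BufferedImage create(int widthPx, int heightPx) {
        // Two-entry palette: index 0 is black, index 1 is white.
        byte[] r = {0, (byte) 0xFF};
        byte[] g = {0, (byte) 0xFF};
        byte[] b = {0, (byte) 0xFF};
        int transparentIndex = 1; // assumption: illustrative choice only

        // Passing a valid transparent index makes the color model non-opaque.
        IndexColorModel colorModel = new IndexColorModel(1, 2, r, g, b, transparentIndex);
        return new BufferedImage(widthPx, heightPx, BufferedImage.TYPE_BYTE_BINARY, colorModel);
    }
}
```

        Calling getTransparency() on the result returns Transparency.BITMASK, whereas an image built with plain new BufferedImage(w, h, BufferedImage.TYPE_BYTE_BINARY) reports Transparency.OPAQUE, which matches the opaque destination described above.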


          People

          • Assignee:
            Mark Collette
            Reporter:
            Mark Collette
          • Votes:
            0
            Watchers:
            0
