[PDF-110] Demonstrate capturing to multi-page fax TIFF - ICEsoft JIRA Issue Tracker

Details

Type: New Feature
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 4.0 - Beta
Fix Version/s: 4.0
Component/s: Examples
Labels: None
Environment: JAI ImageIO Tools

Description

We should build on our existing page to image capturing example, by showing:
- Using black and white image, instead of RGB image
- Writing to TIFF, which ImageIO doesn't support out of the box
- Writing to multi-page TIFF, instead of just one page per output image file
- Calculating zoom level from page resolution, to target a specific DPI and image quality level
- Specifying output image compression type

Activity

Mark Collette added a comment - 18/Dec/09 1:03 PM

This capture example opens a PDF whose path is given on the command line, goes through every page in it, draws the page onto a black and white image, and feeds that to ImageIO, which writes a multi-page CCITT fax group 4 compressed TIFF file. The zoom level used on each page is calculated to give 300 (print) or 200 (fax) DPI of resolution.

Subversion 20023
icepdf\examples\captureMultiple\MultiPageCapture.java

Subversion 22150
icepdf-pro\resources\lib\jai_imageio.jar
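
The write loop described above can be sketched with the standard ImageIO sequence API. This is a minimal sketch, not the contents of MultiPageCapture.java: the class name, the helper names, and the placeholder pages (drawn with Java2D instead of being rendered from a PDF) are all assumptions made for illustration. It also assumes a TIFF writer plug-in is registered (JAI ImageIO Tools, or the writer bundled with Java 9 and later) and that the plug-in names the group 4 compression type "CCITT T.6".

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageInputStream;
import javax.imageio.stream.ImageOutputStream;

// Hypothetical sketch class; not part of ICEpdf.
class MultiPageTiffSketch {

    // Stand-in for a rendered PDF page: white background with a black bar
    // whose length depends on the page number.
    static BufferedImage placeholderPage(int width, int height, int pageNumber) {
        BufferedImage image =
                new BufferedImage(width, height, BufferedImage.TYPE_BYTE_BINARY);
        Graphics2D g = image.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, width, height);
        g.setColor(Color.BLACK);
        g.fillRect(10, 10, 20 * pageNumber, 20);
        g.dispose();
        return image;
    }

    // Writes pageCount bilevel pages into a single multi-page TIFF file.
    static void writePages(File out, int pageCount) {
        Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("tiff");
        if (!writers.hasNext()) {
            throw new IllegalStateException("No TIFF writer plug-in is registered");
        }
        ImageWriter writer = writers.next();
        ImageWriteParam param = writer.getDefaultWriteParam();
        param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
        param.setCompressionType("CCITT T.6"); // assumed plug-in name for group 4
        out.delete(); // the output stream does not truncate an existing file
        try (ImageOutputStream ios = ImageIO.createImageOutputStream(out)) {
            writer.setOutput(ios);
            writer.prepareWriteSequence(null);
            for (int page = 1; page <= pageCount; page++) {
                BufferedImage image = placeholderPage(200, 100, page);
                writer.writeToSequence(new IIOImage(image, null, null), param);
            }
            writer.endWriteSequence();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            writer.dispose();
        }
    }

    // Reads the file back and reports how many images it contains.
    static int countPages(File in) {
        try (ImageInputStream iis = ImageIO.createImageInputStream(in)) {
            Iterator<ImageReader> readers = ImageIO.getImageReaders(iis);
            if (!readers.hasNext()) {
                throw new IllegalStateException("No reader for " + in);
            }
            ImageReader reader = readers.next();
            reader.setInput(iis);
            int count = reader.getNumImages(true);
            reader.dispose();
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Reading the file back with countPages is a quick way to confirm that all pages landed in one file and not just the last one.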

Mark Collette added a comment - 18/Dec/09 1:17 PM

I found several examples online of using JAI, and of using ImageIO, to write a multi-page TIFF. These provided a good starting point.

I originally looked into using JAI directly, but found the ImageIO API easier to use. In addition, coding to the ImageIO interface needs no JAI dependencies and no reflection, and alternate implementations can be plugged in. The compression type parameter for group 4 seems to be specific to JAI's implementation, though.
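
Because the compression type name may vary between plug-ins, one option is to ask whichever TIFF writer is registered what it supports before setting the parameter. The following is a sketch only: the class and method names are hypothetical, and "CCITT T.6" is assumed to be the group 4 name a plug-in would report.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;

// Hypothetical sketch class; not part of ICEpdf or JAI.
class TiffPluginProbe {

    // Compression type names reported by the first registered TIFF writer,
    // or an empty list when no TIFF plug-in is available.
    static List<String> tiffCompressionTypes() {
        Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("tiff");
        if (!writers.hasNext()) {
            return Collections.emptyList();
        }
        ImageWriter writer = writers.next();
        try {
            ImageWriteParam param = writer.getDefaultWriteParam();
            if (!param.canWriteCompressed()) {
                return Collections.emptyList();
            }
            String[] types = param.getCompressionTypes();
            return types == null ? Collections.<String>emptyList() : Arrays.asList(types);
        } finally {
            writer.dispose();
        }
    }

    // Returns the group 4 type name if the plug-in offers it, else null.
    // "CCITT T.6" is an assumption about how the plug-in names it.
    static String group4Type(List<String> availableTypes) {
        return availableTypes.contains("CCITT T.6") ? "CCITT T.6" : null;
    }
}
```

A caller could fall back to another compression type, or fail with a clear message, when group4Type returns null.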

Tried using Document.getPageImage, but found that the TIFF writer would not automatically convert the RGB source image into black and white. Instead it required a black and white source image, which necessitated creating the image myself, and using the slightly more involved Document.paintPage API.
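
Creating the black and white image yourself and painting into its Graphics2D could look roughly like this. It is a sketch under assumptions: the class name and the painter callback are hypothetical, and the callback stands in for the ICEpdf Document.paintPage call, whose arguments are not given in this issue.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.function.Consumer;

// Hypothetical sketch class; not part of ICEpdf.
class BilevelPaintSketch {

    // Creates a 1-bit black and white image, clears it to white, then lets
    // the caller paint into it. In the real example the painter would be the
    // place to call Document.paintPage with this Graphics2D.
    static BufferedImage paintBilevel(int width, int height,
                                      Consumer<Graphics2D> painter) {
        BufferedImage image =
                new BufferedImage(width, height, BufferedImage.TYPE_BYTE_BINARY);
        Graphics2D g = image.createGraphics();
        try {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, width, height);
            painter.accept(g);
        } finally {
            g.dispose();
        }
        return image;
    }
}
```

For example, `BilevelPaintSketch.paintBilevel(10, 10, g -> { g.setColor(Color.BLACK); g.fillRect(0, 0, 5, 5); })` yields an image the TIFF writer can take as is, since the pixels are already one bit deep.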

I was going to use a fixed zoom level, assuming a standard 72 DPI input, but reconsidered: higher resolution inputs will turn up once in a while, and blowing those up would waste space, so I switched to calculating the DPI. Note that the calculation is still specific to printing on US Letter paper. If someone wanted to output 300 DPI on a differently sized page, the calculation would need to change. That is part of the trade-off of capturing images to a file: you have to know ahead of time what resolution will be sufficient when trying to minimise file size.
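
The idea behind the zoom calculation can be sketched as follows. This is not the formula from MultiPageCapture.java, which is not shown in this issue; it is a simplified version that assumes the page is meant to fill the 8.5 inch width of US Letter paper, and the class, constant, and method names are hypothetical.

```java
// Hypothetical sketch class; not part of ICEpdf.
class ZoomSketch {

    // Assumption: output is sized for US Letter, 8.5 inches wide.
    static final double LETTER_WIDTH_INCHES = 8.5;

    // Effective input resolution if the page width spans the paper width.
    static double inputDpi(double pageWidthUnits) {
        return pageWidthUnits / LETTER_WIDTH_INCHES;
    }

    // Zoom needed to reach targetDpi. Inputs that already meet or exceed the
    // target are left at 1.0 so they are not blown up further.
    static double zoomFor(double pageWidthUnits, double targetDpi) {
        double dpi = inputDpi(pageWidthUnits);
        return dpi >= targetDpi ? 1.0 : targetDpi / dpi;
    }
}
```

A 612 unit wide page works out to 72 DPI, so it needs a zoom of 300 / 72 (about 4.17) for print or 200 / 72 (about 2.78) for fax, while a page that is already 2550 units wide stays at a zoom of 1.0.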

Mark Collette added a comment - 31/Dec/09 3:00 PM

An image of a signature is showing up in the viewer window, but not in the extracted TIFF image. The details of the image are:

Decode=[0.0, 1.0]
Filter=CCITTFaxDecode
Intent=RelativeColorimetric
DecodeParms={K=-1, Columns=564, Rows=156}

BitsPerComponent=1
ImageMask=true

Mark Collette added a comment - 31/Dec/09 3:14 PM

At first I assumed that the viewer environment was different than the extraction environment, because of either classpath, or some encryption callback, or other hook that the viewer does outside of the core, in the RI. I tracked down every possible JAR and callback, and tried either adding them to the extraction, or removing them from the viewer. Nothing made a difference.

Then, I tracked into the Stream and Shapes classes, to examine the resulting parsed image, thinking that this would show whether the problem was parsing the image or displaying it. Those classes showed that they had the signature image in a valid state, properly parsed.

So I then examined what could be causing the image to not be displayed properly. I thought that maybe the clipping was working for one and not the other, maybe due to some round-off error. When disabling the clipping and transforming, I saw that the black area then moved and grew, which indicated that the image was being drawn, just somehow as all black, even though it showed up as a signature when painted on the viewer frame and in my test JFrame. Commenting out the image drawing command made the black disappear. So the black rectangle was not the omission of the image, it was how the image was coming across. So, somehow it was being drawn differently in the two contexts.

This made me think that I should draw it into a regular RGB image, instead of right into the black and white image, since maybe something about the destination image made it lose information, leaving only black. But the source image was black and white too, so that didn't add up. Then it occurred to me to check whether the source image had any alpha-related attributes. It did, being an image mask. The destination image was opaque, and did not support alpha. So I experimented with making the destination image non-opaque, using either a bit mask or translucency. Both worked, in Java 1.6 and Java 1.5, so I went with the bit mask.

Subversion 20098
icepdf\examples\captureMultiple\MultiPageCapture.java
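
The difference between an opaque and a bit mask destination can be checked on the image's colour model. The sketch below shows one way to build a 1-bit image whose colour model reports Transparency.BITMASK. Whether MultiPageCapture.java constructs its image this way is not stated in this issue, so the class name, the method names, and the choice of making the white palette entry transparent are all assumptions.

```java
import java.awt.Transparency;
import java.awt.image.BufferedImage;
import java.awt.image.IndexColorModel;

// Hypothetical sketch class; not part of ICEpdf.
class BitmaskBilevelSketch {

    // Default 1-bit image: its black and white palette has no alpha.
    static BufferedImage opaqueBilevel(int width, int height) {
        return new BufferedImage(width, height, BufferedImage.TYPE_BYTE_BINARY);
    }

    // 1-bit image with a two-entry palette in which one entry is fully
    // transparent, so the colour model reports a bit mask.
    static BufferedImage bitmaskBilevel(int width, int height) {
        byte[] levels = {0, (byte) 255};  // index 0 = black, index 1 = white
        byte[] alpha = {(byte) 255, 0};   // black opaque, white transparent
        IndexColorModel cm =
                new IndexColorModel(1, 2, levels, levels, levels, alpha);
        return new BufferedImage(width, height,
                BufferedImage.TYPE_BYTE_BINARY, cm);
    }

    // Human-readable name of the image's transparency mode.
    static String transparencyName(BufferedImage image) {
        switch (image.getColorModel().getTransparency()) {
            case Transparency.OPAQUE:
                return "OPAQUE";
            case Transparency.BITMASK:
                return "BITMASK";
            default:
                return "TRANSLUCENT";
        }
    }
}
```

Printing transparencyName for the destination image is a cheap diagnostic when a mask image renders as a solid block in one context but not in another.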


People

Assignee: Mark Collette
Reporter: Mark Collette
Votes: 0
Watchers: 0

Dates

Created: 17/Dec/09 6:16 PM
Updated: 29/Mar/12 11:56 AM
Resolved: 31/Dec/09 5:59 PM