[PDF-215] Handle malformed postscript stream - ICEsoft JIRA Issue Tracker

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 4.1.1
Fix Version/s: 4.2.2
Component/s: Core/Parsing
Labels: None
Environment: any
Assignee Priority: P2

Description

The PDF in question is malformed, and I have never seen a file that was broken in such a simple way. The offending content reads:

q1 0 0 1 0 792 cm

but should read

q 1 0 0 1 0 792 cm

The error is simple to see but not so simple to fix. I'll have to think a little more about how to fix it without slowing down our parser too much.
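To illustrate why the missing space matters, here is a minimal sketch of a whitespace-delimited tokenizer. It is not the ICEpdf parser; the class and method names are hypothetical. It only shows that the fused "q1" arrives as a single token, so the q operator is never seen and cm is left with five operands instead of the six it requires.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not ICEpdf code.
final class NaiveContentTokenizer {

    // Splits a content stream fragment on runs of whitespace and nothing else.
    static List<String> tokenize(String content) {
        return Arrays.asList(content.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        // Malformed: "q1" comes back as one token.
        System.out.println(tokenize("q1 0 0 1 0 792 cm"));  // [q1, 0, 0, 1, 0, 792, cm]
        // Well formed: "q" is its own operator, followed by six operands and "cm".
        System.out.println(tokenize("q 1 0 0 1 0 792 cm")); // [q, 1, 0, 0, 1, 0, 792, cm]
    }
}
```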

Activity

Patrick Corless added a comment - 09/Mar/11 12:40 PM

I did quite a bit of research into a new lexer for ICEpdf when writing the Type4 function lexer. I think a new lexer for the content parser would be fairly low risk, would fix the issue in question, and would significantly speed up content parsing. As time permits I'll see if one can be created on a branch of 4.2, with the intention of a 5.0 release upon completion.

Patrick Corless added a comment - 24/Jun/11 1:25 PM

I have a relatively straightforward fix to address the malformed PDF content stream. The test checks that a content stream token (not a String or a Name) is not "d0" or "d1", which are the only non-number content stream tokens that mix letters and digits. With a little luck this will be the extent of the corrupt PDF content streams from the PDF4NET 2.7.0.3 generator.
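A check along these lines could be sketched as follows. This is an assumption about the general shape of such a test, not the actual ICEpdf change: the class name, method name, and regular expression are hypothetical, and the caller is assumed to have already excluded String and Name tokens.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the code committed for this issue.
final class MixedTokenSplitter {

    // Leading ASCII letters followed by a plain PDF-style number (optional sign, optional decimal point).
    private static final Pattern FUSED =
            Pattern.compile("([A-Za-z]+)([+-]?(?:\\d+\\.?\\d*|\\.\\d+))");

    // Returns the token unchanged, or as [operator, number] when the two were fused.
    // Assumes the caller never passes String or Name tokens.
    static List<String> split(String token) {
        // d0 and d1 are genuine operators that mix letters and digits, so leave them alone.
        if (token.equals("d0") || token.equals("d1")) {
            return Collections.singletonList(token);
        }
        Matcher m = FUSED.matcher(token);
        if (m.matches()) {
            return Arrays.asList(m.group(1), m.group(2));
        }
        return Collections.singletonList(token);
    }

    public static void main(String[] args) {
        System.out.println(split("q1")); // [q, 1]
        System.out.println(split("d0")); // [d0]
        System.out.println(split("cm")); // [cm]
    }
}
```

Because the d0/d1 comparison and the pattern match only run on tokens that are not already recognised, a check of this kind adds little work for well-formed streams.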

People

Assignee: Patrick Corless
Reporter: Patrick Corless
Votes: 0
Watchers: 1

Dates

Created: 13/Oct/10 7:53 AM
Updated: 29/Mar/12 11:42 AM
Resolved: 24/Jun/11 1:25 PM