Character index sequencing counts extra characters

11 years 4 months ago #7652 by mossdude
I'm developing an Android app that exchanges PDF text highlights via database objects with similar apps using other PDF parsers in Flash and iOS. To that end I need to be able to extract the following information about a selection:

The highlighted text
The pixel coordinates of the highlight rectangles
The start and end indices of the selected text.

So far so good: I have figured out how to extract all three. However, the character indices (start, end) don't match those derived by the other two PDF parsers. Specifically, the parser in the Radaee component seems to be counting extra characters: except for the first few characters in the sequence, ObjsGetCharIndex() returns a value greater than the character's actual position in the text stream. I can't think of a way to correct for this, and without matching indices this app won't be compatible with the other apps.

I can attach a single page example PDF and a log of the character sequencing that I expect to see, if useful.
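
One way to narrow down where the indices start to drift is to compare the text each parser extracts and find the first position at which the two strings differ. The sketch below is only an illustration: the class and method names are hypothetical, it does not call any Radaee API, and it assumes both extracted strings are already available as plain Java strings.

```java
/** Hypothetical helper for comparing the text produced by two PDF parsers. */
public final class IndexDiff {
    private IndexDiff() {}

    /**
     * Returns the index of the first character at which the two strings differ,
     * the shorter length if one string is a prefix of the other,
     * or -1 if the strings are identical.
     */
    public static int firstDivergence(String expected, String actual) {
        int n = Math.min(expected.length(), actual.length());
        for (int i = 0; i < n; i++) {
            if (expected.charAt(i) != actual.charAt(i)) {
                return i;
            }
        }
        return expected.length() == actual.length() ? -1 : n;
    }
}
```

Inspecting the characters around the returned index in both strings may show what kind of extra characters one parser emits (for example inserted line breaks or spaces, which is a guess and not confirmed in this thread).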
11 years 4 months ago #7654 by radaee
This is not a bug.
Different PDF libraries may produce different results when extracting text.
The algorithm for extracting text is not (and should not be) defined in the PDF reference.
You can store the highlights in the database as rectangles instead, obtained via [PDFAnnot getMarkupRects];
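
A minimal sketch of the rectangle-based approach on the Android side, assuming each highlight rectangle is already available as four floats (left, top, right, bottom). The class name, the text format, and the choice of two decimal places are all hypothetical and not part of the Radaee API; the sketch also assumes every app stores the values in the same page coordinate system rather than in device pixels.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical codec: stores highlight rectangles as "l,t,r,b;l,t,r,b;...". */
public final class HighlightRects {
    private HighlightRects() {}

    /** Encodes each 4-element rectangle with two decimals, separated by ';'. */
    public static String encode(List<float[]> rects) {
        StringBuilder sb = new StringBuilder();
        for (float[] r : rects) {
            if (r.length != 4) {
                throw new IllegalArgumentException("rect needs 4 values");
            }
            if (sb.length() > 0) {
                sb.append(';');
            }
            // Locale.ROOT keeps '.' as the decimal separator on every device.
            sb.append(String.format(Locale.ROOT, "%.2f,%.2f,%.2f,%.2f",
                    r[0], r[1], r[2], r[3]));
        }
        return sb.toString();
    }

    /** Parses a string produced by encode() back into rectangles. */
    public static List<float[]> decode(String s) {
        List<float[]> out = new ArrayList<>();
        if (s.isEmpty()) {
            return out;
        }
        for (String part : s.split(";")) {
            String[] f = part.split(",");
            if (f.length != 4) {
                throw new IllegalArgumentException("bad rect: " + part);
            }
            out.add(new float[] {
                Float.parseFloat(f[0]), Float.parseFloat(f[1]),
                Float.parseFloat(f[2]), Float.parseFloat(f[3])
            });
        }
        return out;
    }
}
```

Because the stored value depends only on geometry, it does not matter how each parser counts characters; each app can map the rectangles back to its own text selection.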