TOPIC:

OnPDFSelectEnd return text without white spaces 4 years 5 months ago #15405

radaee
Offline
Moderator
Posts: 1123
Thank you received: 73

Dear user,
This can be handled in the Java layer, provided you deal with UTF-16 yourself.
Page.ObjsGetString(0, objsCharCount) returns the whole text of the page as a string, and the index of each char in that string is in UTF-16 order.
Another method, Page.ObjsGetCharRect(), returns the rectangle of each char.
That means you can decide for yourself where the spaces go, using the char rectangles and the gap between two adjacent chars.
With this approach, though, you have to handle the UTF-16 code units yourself.
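
To make the idea of judging spaces from char rectangles concrete, here is a minimal sketch. It does not call the PDF library: it assumes you have already collected one rectangle per char as [left, top, right, bottom] (for instance with Page.ObjsGetCharRect()) and that all chars sit on one line. The class name GapSpacing, the method insertSpaces and the gapRatio parameter are hypothetical names for this illustration only.

```java
final class GapSpacing {
    /**
     * Inserts a space wherever the horizontal gap between two adjacent
     * chars is larger than gapRatio times the width of the previous char.
     * rects[i] is assumed to be {left, top, right, bottom} for text.charAt(i).
     * Assumption: every char is on the same line (line breaks are not handled).
     */
    static String insertSpaces(String text, float[][] rects, float gapRatio) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (i > 0) {
                float[] prev = rects[i - 1];
                float[] cur = rects[i];
                float gap = cur[0] - prev[2];        // left of current minus right of previous
                float prevWidth = prev[2] - prev[0];
                char last = out.charAt(out.length() - 1);
                // Do not double up if the text already contains a space here.
                if (gap > gapRatio * prevWidth && last != ' ' && c != ' ') {
                    out.append(' ');
                }
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Four chars of width 10; the gap between B and C is 9, the others are 1.
        float[][] rects = {
            {0, 0, 10, 12}, {11, 0, 21, 12}, {30, 0, 40, 12}, {41, 0, 51, 12}
        };
        System.out.println(insertSpaces("ABCD", rects, 0.5f));
        System.out.println(insertSpaces("ABCD", rects, 1.0f));
    }
}
```

The result depends on the threshold: with the sample rectangles, a ratio of 0.5 splits the text between B and C, while a ratio of 1.0 keeps it as one word. Letter-spaced text, where every gap is equally wide, needs a threshold tuned to that document.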


OnPDFSelectEnd return text without white spaces 3 years 9 months ago #15622

thiagopelikan
Topic Author
Offline
Junior Member
Posts: 36
Thank you received: 0

Could you give me an example of how to do this?
For example, in the attached file, the ObjsGetString method gives me:
"TORCIDA T U M U L T U A AMBIENTE"
As far as I can see, the 3 lines in the example have exactly the same spacing between the chars.
How can I use ObjsGetCharRect to extract the word from a RECT?
Tks,
Thiago

OnPDFSelectEnd return text without white spaces 3 years 9 months ago #15623

radaee
Offline
Moderator
Posts: 1123
Thank you received: 73

Dear user,
I'm still not sure what you really want.
Here is some code.
For example, to get the char code and the rectangle of each char by index:

String sval = page.ObjsGetString(start, end);
float[] rect = new float[4];
for (int i = start; i < end; i++)
{
   char code = sval.charAt(i - start); // one UTF-16 code unit
   page.ObjsGetCharRect(i, rect); // fills rect as [left, top, right, bottom]
   // TODO: process each char.
}
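
Since the char indices are in UTF-16 order, a character outside the Basic Multilingual Plane takes up two indices (a surrogate pair). The sketch below shows one way to find the UTF-16 index at which each full character starts, using only the standard Java String API. The thread does not say how the library reports rectangles for surrogate pairs, so which index to pass to ObjsGetCharRect in that case is an assumption you would need to verify. Utf16Walk and codePointStarts are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class Utf16Walk {
    /**
     * Returns the UTF-16 index at which each Unicode code point of text starts.
     * A surrogate pair contributes a single entry (the index of its first unit).
     */
    static List<Integer> codePointStarts(String text) {
        List<Integer> starts = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            starts.add(i);
            // charCount is 1 for BMP chars and 2 for surrogate pairs.
            i += Character.charCount(text.codePointAt(i));
        }
        return starts;
    }

    public static void main(String[] args) {
        // "a", one emoji encoded as a surrogate pair, then "b".
        System.out.println(codePointStarts("a\uD83D\uDE00b"));
    }
}
```

Each returned index, plus the start offset of the extracted range, is a candidate argument for a per-char call such as the one in the loop above.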


OnPDFSelectEnd return text without white spaces 3 years 9 months ago #15624

support
Offline
Administrator
Posts: 696
Thank you received: 59

Dear Thiago,
your issue seems to be related to how your test PDF file was generated and how the chars are spaced on the page. Could you provide the file, please?

OnPDFSelectEnd return text without white spaces 3 years 9 months ago #15625

thiagopelikan
Topic Author
Offline
Junior Member
Posts: 36
Thank you received: 0

Hi,
I had to upload the file to my repository because it is 6 MB.
itgames-my.sharepoint.com/:b:/g/personal...zre9kwNyt5w?e=3dSTOv
Tks,
Thiago

OnPDFSelectEnd return text without white spaces 3 years 9 months ago #15626

radaee
Offline
Moderator
Posts: 1123
Thank you received: 73

Hi,
We checked this PDF file and tested it in the latest version of Edge (Chrome core). The extracted text is:
TORCIDA T U M U LT U A AMBIENTE
We also tested it in PDF XChange pro, and the result is the same as in Edge.
This depends on how each PDF application judges where the blanks are.
