TOPIC:

Identify a searchable PDF 9 years 1 month ago #8567

arcmobile.div
Topic Author
Offline
Premium Member
Posts: 80
Thank you received: 0

How can I identify whether a PDF is searchable or not?
Kindly provide an API for this.

A searchable PDF is essentially a PDF image file. Unlike static image formats such as TIFF, JPEG and BMP, every PDF document can contain several layers of information, i.e. an image layer and a text layer. The image layer carries the actual image together with its resolution, compression method, color depth, etc. Similarly, the text layer holds the actual ASCII text and the location of that text on the page. In simple terms, the text portions of the scanned document are stored in the text layer, allowing the user to easily search for and locate any keyword within the scanned document.


Identify a searchable PDF 9 years 1 month ago #8568

arcmobile.div
Topic Author
Offline
Premium Member
Posts: 80
Thank you received: 0

Any update?

Identify a searchable PDF 9 years 1 month ago #8573

arcmobile.div
Topic Author
Offline
Premium Member
Posts: 80
Thank you received: 0

Kindly update.

Identify a searchable PDF 9 years 1 month ago #8631

support
Offline
Administrator
Posts: 692
Thank you received: 59

Dear user, we evaluated the request and concluded that this kind of API is not of interest to our user base.
The request has been marked and added to our request list with low priority.

You can easily reproduce it on your side by extracting each page's text and checking for a non-empty string.

Note: please do not add new empty posts to your thread.
Even if your aim is to push us to provide an answer, you get the opposite result: the thread appears to contain replies and disappears from the "thread without answer" list, so we may miss the request.


Identify a searchable PDF 9 years 1 month ago #8633

Davide
Offline
User is blocked
Posts: 814
Thank you received: 65

Hi, to extract text from a PDF you can use ObjsStart(), which loads the page's text objects into memory, and then ObjsGetString(int from, int to), which returns the string for the given range. For more info check this: www.radaeepdf.com/documentation/javadocs...e.html#ObjsGetString(int, int)

Identify a searchable PDF 9 years 1 month ago #8680

arcmobile.div
Topic Author
Offline
Premium Member
Posts: 80
Thank you received: 0

I am using the following code to determine whether the PDF is searchable.
I call the method after the document is loaded, but even with a searchable PDF the String text comes back null.

private boolean checkSearchablePDF() {
    boolean isSearchablePDF = false;
    if (mDoc != null) {
        int totalPageCount = mDoc.GetPageCount();
        for (int pageCount = 0; pageCount < totalPageCount; pageCount++) {
            Page page = mDoc.GetPage(pageCount + 1);
            page.ObjsStart();
            String text = page.ObjsGetString(0, page.ObjsGetCharCount() - 1);
            if (text != null && text.trim().length() > 0) {
                isSearchablePDF = true;
                break;
            }
        }
    }
    return isSearchablePDF;
}

Kindly provide your inputs.
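
For readers landing on this thread, here is a minimal sketch of the check discussed in the posts above. It is not the SDK's own code. PdfDocument, PdfPage and InMemoryDocument are hypothetical stand-ins that mirror only the calls named in this thread (GetPageCount, GetPage, ObjsStart, ObjsGetCharCount, ObjsGetString) plus an assumed Close, so the example compiles without the SDK. It also assumes that GetPage takes a zero-based page number and that the end index of ObjsGetString is inclusive; both assumptions should be verified against the RadaeePDF javadoc.

```java
// Hypothetical stand-ins for the SDK's Page and Document classes.
// Only the calls mentioned in this thread are mirrored; Close() is assumed.
interface PdfPage {
    void ObjsStart();
    int ObjsGetCharCount();
    String ObjsGetString(int from, int to);
    void Close();
}

interface PdfDocument {
    int GetPageCount();
    PdfPage GetPage(int pageNo); // ASSUMPTION: zero-based page number
}

final class SearchableCheck {
    private SearchableCheck() {}

    /** Returns true as soon as any page yields non-blank text. */
    static boolean isSearchable(PdfDocument doc) {
        if (doc == null) {
            return false;
        }
        int pageCount = doc.GetPageCount();
        for (int pageNo = 0; pageNo < pageCount; pageNo++) {
            PdfPage page = doc.GetPage(pageNo);
            if (page == null) {
                continue; // page could not be opened, try the next one
            }
            try {
                page.ObjsStart(); // load the page's text objects
                int charCount = page.ObjsGetCharCount();
                if (charCount <= 0) {
                    continue; // no text layer on this page
                }
                // ASSUMPTION: 'to' is inclusive, as in the posted code.
                String text = page.ObjsGetString(0, charCount - 1);
                if (text != null && !text.trim().isEmpty()) {
                    return true;
                }
            } finally {
                page.Close(); // runs on continue and on return as well
            }
        }
        return false;
    }
}

/** In-memory stand-in, used only to exercise the sketch without the SDK. */
final class InMemoryDocument implements PdfDocument {
    private final String[] pageTexts;

    InMemoryDocument(String... pageTexts) {
        this.pageTexts = pageTexts;
    }

    public int GetPageCount() {
        return pageTexts.length;
    }

    public PdfPage GetPage(int pageNo) {
        if (pageNo < 0 || pageNo >= pageTexts.length) {
            return null; // out-of-range page numbers yield no page
        }
        final String text = pageTexts[pageNo];
        return new PdfPage() {
            public void ObjsStart() {}
            public int ObjsGetCharCount() { return text.length(); }
            public String ObjsGetString(int from, int to) {
                if (from < 0 || to < from || to >= text.length()) {
                    return null;
                }
                return text.substring(from, to + 1); // inclusive end index
            }
            public void Close() {}
        };
    }
}
```

With the stand-ins, SearchableCheck.isSearchable(new InMemoryDocument("", "Invoice 42")) reports a searchable document, while a document whose pages hold only empty or whitespace text does not. Compared with the posted method, the sketch passes the loop index straight to GetPage instead of adding 1, skips pages that are null or report zero characters, and closes each page once it has been inspected.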
