Knowledge Base - Get custom page labels using advanced properties

Custom page labels can be read from the PDF catalog in two ways: with the new interface Document.GetPageLabel(int pageno), or through the advanced properties. Both approaches are shown in the code below.

Note: a premium license is needed for advanced properties.

The PDF Reference 1.7 describes page labels as follows:

A document’s labeling ranges are defined by the PageLabels entry in the document catalog (see Section 3.6.1, “Document Catalog”). The value of this entry is a number tree (Section 3.8.6, “Number Trees”), each of whose keys is the page index of the first page in a labeling range. The corresponding value is a page label dictionary defining the labeling characteristics for the pages in that range. The tree must include a value for page index 0. Table 8.10 shows the contents of a page label dictionary. (See implementation note 76 in Appendix H.)

Example 8.5 shows a document with pages labeled
i, ii, iii, iv, 1, 2, 3, A−8, A−9, …
Example 8.5
1 0 obj
<< /Type /Catalog
   /PageLabels << /Nums [ 0 << /S /r >>   % A number tree containing
                          4 << /S /D >>   % three page label dictionaries
                          7 << /S /D
                               /P ( A− )
                               /St 8
                            >>
                        ]
               >>

>>
endobj
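
To make Example 8.5 concrete, here is a minimal sketch of how a page index maps to a labeling range and to the numeric portion of its label. The names LabelRanges, resolve and example85 are hypothetical and are not part of the RadaeePDF SDK. The TreeMap stands in for the /Nums number tree: each key is the first page index of a range, and each value is that range's /St (1 when /St is absent).

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of the RadaeePDF SDK.
final class LabelRanges {
    /**
     * Returns {rangeStart, numericValue} for a 0-based page index.
     * starts maps the first page index of each range to its /St value.
     */
    static int[] resolve(TreeMap<Integer, Integer> starts, int pageIndex) {
        // The applicable range is the one with the greatest key <= pageIndex.
        Map.Entry<Integer, Integer> range = starts.floorEntry(pageIndex);
        if (range == null) {
            throw new IllegalArgumentException("no labeling range covers page " + pageIndex);
        }
        // Pages are numbered sequentially from /St within the range.
        int numeric = range.getValue() + (pageIndex - range.getKey());
        return new int[] { range.getKey(), numeric };
    }

    /** The three ranges of Example 8.5: keys 0, 4 and 7, with /St 8 on the last. */
    static TreeMap<Integer, Integer> example85() {
        TreeMap<Integer, Integer> starts = new TreeMap<>();
        starts.put(0, 1); // lowercase roman, /St defaults to 1
        starts.put(4, 1); // decimal, /St defaults to 1
        starts.put(7, 8); // decimal with prefix, /St 8
        return starts;
    }
}
```

With these ranges, page index 3 resolves to range 0 with numeric value 4 (label iv), and page index 8 resolves to range 7 with numeric value 9 (label A−9).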

  

TABLE 8.10 Entries in a page label dictionary
KEY   TYPE         VALUE
Type  name         (Optional) The type of PDF object that this dictionary describes; if present, must be PageLabel for a page label dictionary.
S     name         (Optional) The numbering style to be used for the numeric portion of each page label:
                       D  Decimal arabic numerals
                       R  Uppercase roman numerals
                       r  Lowercase roman numerals
                       A  Uppercase letters (A to Z for the first 26 pages, AA to ZZ for the next 26, and so on)
                       a  Lowercase letters (a to z for the first 26 pages, aa to zz for the next 26, and so on)
                   There is no default numbering style; if no S entry is present, page labels consist solely of a label prefix with no numeric portion. For example, if the P entry (below) specifies the label prefix Contents, each page is simply labeled Contents with no page number. (If the P entry is also missing or empty, the page label is an empty string.)
P     text string  (Optional) The label prefix for page labels in this range.
St    integer      (Optional) The value of the numeric portion for the first page label in the range. Subsequent pages are numbered sequentially from this value, which must be greater than or equal to 1. Default value: 1.
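
The entries of Table 8.10 can be turned into a display label roughly as follows. This is a sketch under stated assumptions: the class PageLabelFormat and its methods are hypothetical and not part of the RadaeePDF SDK, and roman numerals use the standard subtractive notation.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of the RadaeePDF SDK.
final class PageLabelFormat {
    /** Builds a label from the P (prefix), S (style) and numeric value of a page. */
    static String format(String prefix, String style, int n) {
        String p = prefix == null ? "" : prefix;
        if (style == null) {
            return p; // no S entry: the label is the prefix only
        }
        if (n < 1) {
            throw new IllegalArgumentException("numeric portion must be >= 1");
        }
        switch (style) {
            case "D": return p + n;
            case "R": return p + roman(n);
            case "r": return p + roman(n).toLowerCase(Locale.ROOT);
            case "A": return p + letters(n, 'A');
            case "a": return p + letters(n, 'a');
            default: throw new IllegalArgumentException("unknown style: " + style);
        }
    }

    /** Uppercase roman numeral in subtractive notation. */
    static String roman(int n) {
        int[] values = {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
        String[] symbols = {"M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I"};
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            while (n >= values[i]) {
                sb.append(symbols[i]);
                n -= values[i];
            }
        }
        return sb.toString();
    }

    /** 1..26 gives A..Z, 27..52 gives AA..ZZ, and so on. */
    static String letters(int n, char first) {
        int repeat = (n - 1) / 26 + 1;
        char c = (char) (first + (n - 1) % 26);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < repeat; i++) {
            sb.append(c);
        }
        return sb.toString();
    }
}
```

For instance, format(null, "r", 4) gives iv, and format("Contents", null, 3) gives Contents because no numbering style is set.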

 

The Document.GetPageLabel interface:

/**
 * Gets the label of a page.
 * @param pageno 0-based page index
 * @return a JSON string or plain text. For JSON, the name is the numbering style and the value is the number.<br/>
 * For example:<br/>
 * {"D":2} is "2"<br/>
 * {"R":3} is "III"<br/>
 * {"r":4} is "iv"<br/>
 * {"A":5} is "E"<br/>
 * {"a":6} is "f"<br/>
 * For plain text, the text itself is the label.
 */
public String GetPageLabel(int pageno)
{
    return getPageLabel(hand_val, pageno);
}
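
The string returned by GetPageLabel has to be told apart as either the JSON form or plain text. The sketch below does that with a regular expression. It assumes the JSON form is exactly a single-entry object like the Javadoc examples (for instance {"R":3}); the exact output of the SDK beyond those examples is not documented here, so treat the pattern as an assumption. The class PageLabelResult is hypothetical and not part of the RadaeePDF SDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the RadaeePDF SDK.
final class PageLabelResult {
    // Matches the one-entry JSON form shown in the Javadoc, e.g. {"R":3}.
    private static final Pattern JSON_FORM =
            Pattern.compile("^\\s*\\{\\s*\"([DRrAa])\"\\s*:\\s*(\\d{1,9})\\s*\\}\\s*$");

    final String style; // "D", "R", "r", "A" or "a"; null for a plain-text label
    final int number;   // numeric portion; 0 for a plain-text label
    final String text;  // the label itself for plain text; null for the JSON form

    private PageLabelResult(String style, int number, String text) {
        this.style = style;
        this.number = number;
        this.text = text;
    }

    /** Classifies a returned label string as JSON form or plain text. */
    static PageLabelResult parse(String raw) {
        if (raw == null) {
            return new PageLabelResult(null, 0, ""); // assumption: treat null as an empty label
        }
        Matcher m = JSON_FORM.matcher(raw);
        if (m.matches()) {
            return new PageLabelResult(m.group(1), Integer.parseInt(m.group(2)), null);
        }
        return new PageLabelResult(null, 0, raw);
    }
}
```

A caller would pass the result of doc.GetPageLabel(pageno) to parse and then render style and number according to the numbering styles of Table 8.10.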

Alternatively, using the advanced properties:

// Obj type codes used below: 2 = int, 6 = array, 7 = dictionary, 8 = indirect reference
Ref ref = doc.Advance_GetRef(); // reference to the document catalog
if (ref != null) {
    Obj catalog = doc.Advance_GetObj(ref);
    if (catalog != null) {
        int count = catalog.DictGetItemCount();
        for (int cur = 0; cur < count; cur++) {
            String tag = catalog.DictGetItemTag(cur);
            Obj item = catalog.DictGetItem(cur);
            int itemType = item.GetType();
            // PageLabels may be stored directly (dictionary) or as an indirect reference
            if ("PageLabels".equals(tag) && (itemType == 7 || itemType == 8)) {
                Obj labels = itemType == 8 ? doc.Advance_GetObj(item.GetReference()) : item;
                // This sample assumes the first entry of the PageLabels dictionary is the Nums array
                if (labels != null && labels.DictGetItemCount() > 0 && labels.DictGetItem(0).GetType() == 6) {
                    Obj nums = labels.DictGetItem(0);
                    int arrayCount = nums.ArrayGetItemCount();
                    // From here, handle each entry according to Table 8.10
                    // (Entries in a page label dictionary) in the PDF Reference 1.7.
                    for (int i = 0; i < arrayCount; i++) {
                        Obj entry = nums.ArrayGetItem(i);
                        if (entry.GetType() == 2) // first page index of a labeling range
                            Log.i("--ADV--", "array item " + i + ": value = " + entry.GetInt());
                        else if (entry.GetType() == 7) // page label dictionary
                            handleDictionary(entry, doc);
                        else if (entry.GetType() == 8) { // reference to a page label dictionary
                            entry = doc.Advance_GetObj(entry.GetReference());
                            if (entry != null && entry.GetType() == 7)
                                handleDictionary(entry, doc);
                        }
                    }
                }
                break;
            }
        }
    }
}

---
// Logs every entry of a dictionary object, recursing into nested dictionaries.
private void handleDictionary(Obj obj, Document doc) {
    try {
        int count = obj.DictGetItemCount();
        for (int cur = 0; cur < count; cur++) {
            String tag = obj.DictGetItemTag(cur);
            Obj item = obj.DictGetItem(cur);
            int type = item.GetType();
            // get_type_name is a helper that maps a type code to a readable name
            String type_name = get_type_name(type);

            Log.i("--ADV--", "tag:" + cur + "---" + tag + ":" + type_name + " ->");

            if (type == 1) // boolean
                Log.i("--ADV--", " value = " + item.GetBoolean());
            else if (type == 2) // int
                Log.i("--ADV--", " value = " + item.GetInt());
            else if (type == 3) // real
                Log.i("--ADV--", " value = " + item.GetReal());
            else if (type == 4) // string
                Log.i("--ADV--", " value = " + item.GetTextString());
            else if (type == 5) // name
                Log.i("--ADV--", " value = " + item.GetName());
            else if (type == 6) { // array
                int arraycount = item.ArrayGetItemCount();
                for (int k = 0; k < arraycount; k++) {
                    Obj array_obj = item.ArrayGetItem(k);
                    // Array items are logged as reals; this is only meaningful for numeric items
                    Log.i("--ADV--", "array item " + k + ": value = " + array_obj.GetReal());
                }
            } else if (type == 7) // dictionary
                handleDictionary(item, doc);
            else if (type == 8) { // indirect reference
                Obj target = doc.Advance_GetObj(item.GetReference());
                // Recurse only when the reference resolves to a dictionary
                if (target != null && target.GetType() == 7)
                    handleDictionary(target, doc);
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
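
The get_type_name helper called in handleDictionary is not listed in this article. A possible reconstruction is sketched below. Codes 1 to 7 follow the comments in the sample; code 8 is inferred from the GetReference() calls; the wrapper class ObjTypeNames and the fallback string are assumptions.

```java
// Hypothetical reconstruction for illustration only; the SDK sample does not show this helper.
final class ObjTypeNames {
    /** Maps an Obj type code, as returned by GetType(), to a readable name. */
    static String get_type_name(int type) {
        switch (type) {
            case 1: return "boolean";
            case 2: return "int";
            case 3: return "real";
            case 4: return "string";
            case 5: return "name";
            case 6: return "array";
            case 7: return "dictionary";
            case 8: return "reference"; // inferred from the GetReference() usage
            default: return "unknown(" + type + ")"; // assumption: any other code
        }
    }
}
```

In the sample, the method body could be placed directly in the same class as handleDictionary rather than in a separate class.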
Applies To

RadaeePDF SDK for Android

Details

Created : 2018-06-08 11:05:27, Last Modified : 2018-06-11 16:17:44