Hi, I have the requirement to search a PDF and display the matches in a listview with the associated page numbers and the sentence in which the text was found.
To do this I am using some code I found elsewhere on the support forums here:
- (void)searchPDFPages:(NSString *)textToFind {
    NSCharacterSet *charactersToRemove = [[NSCharacterSet alphanumericCharacterSet] invertedSet];
    for (NSInteger i = 0; i < totalPDFPages; i++) {
        PDFPage *currentPage = [m_doc page:i];
        // load the page's text objects so they can be searched
        [currentPage objsStart];
        PDFFinder *mFinder = [currentPage find:textToFind :FALSE :FALSE];
        if (mFinder) {
            NSInteger finds = [mFinder count];
            for (NSInteger j = 0; j < finds; j++) {
                NSInteger foundIndex = [mFinder objsIndex:j];
                // clamp the context window to the valid index range of the page
                NSInteger phraseStartIndex = MAX(0, foundIndex - charsBeforeMatch);
                NSInteger matchEnd = foundIndex + (NSInteger)[textToFind length] + charsAfterMatch;
                NSInteger phraseEndIndex = MIN(matchEnd, (NSInteger)[currentPage objsCount] - 1);
                // adjust phrase start and end so that they observe word boundaries (no partial words)
                phraseStartIndex = [currentPage objsAlignWord:phraseStartIndex :-1];
                phraseEndIndex = [currentPage objsAlignWord:phraseEndIndex :1];
                // get the matching string
                NSString *phrase = [currentPage objsString:phraseStartIndex :phraseEndIndex + 1];
                // clean the matching string
                phrase = [phrase stringByTrimmingCharactersInSet:charactersToRemove];
                phrase = [phrase stringByReplacingOccurrencesOfString:@"\r\n" withString:@" "];
                // NSInteger is 64-bit on current devices, so cast and use %ld
                NSLog(@"page: %ld phrase: %@", (long)(i + 1), phrase);
                // create a search match object
                PDFSearchMatch *match = [[PDFSearchMatch alloc] init];
                match.matchType = MatchTypePhrase;
                match.matchingPhrase = phrase;
                match.matchingPage = i + 1;
                [searchResults addObject:match];
            }
        }
        mFinder = nil;
        currentPage = nil;
    }
}
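To make the phrase window explicit: before aligning to word boundaries, the code above clamps the start and end indices to the page's valid object range. Here is a minimal standalone sketch of just that clamping in plain C. The names Window and clamp_window are made up for illustration and are not part of the PDF SDK, and the numbers in the note below are only sample values.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the PDF SDK: holds the first and last
   index of the context window around a match. */
typedef struct {
    long start;
    long end;
} Window;

/* Clamp the window [found - before, found + match_len + after]
   to the valid index range [0, count - 1]. */
static Window clamp_window(long found, long match_len,
                           long before, long after, long count) {
    Window w;
    w.start = found - before;
    if (w.start < 0) {
        w.start = 0;            /* window would begin before the page text */
    }
    w.end = found + match_len + after;
    if (w.end > count - 1) {
        w.end = count - 1;      /* window would run past the page text */
    }
    return w;
}
```

For example, with a 4-character match at index 3, 10 characters of context on each side, and 100 objects on the page, the window is clamped to 0 through 17. With the same match at index 95 it becomes 85 through 99.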
There are 2 things about this:
1. the use of [currentPage objsStart] causes all objects for the current page to be loaded into memory, so memory usage goes up. Is there a way to release the loaded objects once the search has completed?
and
2. it's quite slow: approx. 6 seconds to search a 20+ page PDF
Is there a better search method I should use on iOS that will still allow me to get the page number the text was found on AND the sentence that the text was found in (this is just the 10 characters either side of the matching text)?
Many thanks
Russell