Questions about iOS development and PDF

TOPIC:

Search PDF - is there a faster way? 10 years 3 months ago #5078

  • rhill
Hi, I need to search a PDF and display the matches in a list view, showing for each match the page number and the sentence in which the text was found.

To do this I am using some code I found elsewhere on the support forums here:
- (void)searchPDFPages:(NSString*)textToFind {
    NSCharacterSet *charactersToRemove = [[NSCharacterSet alphanumericCharacterSet] invertedSet];
    
    for (NSInteger i = 0 ; i < totalPDFPages ; i++) {
        PDFPage *currentPage = [m_doc page:i];
    
        [currentPage objsStart];
    
        PDFFinder *mFinder = [currentPage find:textToFind :FALSE :FALSE];
        
        if (mFinder) {
            NSInteger finds = [mFinder count];
            for (NSInteger j = 0 ; j < finds ; j++) {
                NSInteger foundIndex = [mFinder objsIndex:j];
                // clamp the context window to the page's character range; the casts keep the arithmetic signed
                NSInteger phraseStartIndex = MAX(foundIndex - (NSInteger)charsBeforeMatch, 0);
                NSInteger phraseEndIndex = MIN(foundIndex + (NSInteger)[textToFind length] + (NSInteger)charsAfterMatch, (NSInteger)[currentPage objsCount] - 1);
                
                // adjust phrase start and end so that they observe word boundaries (no partial words)
                phraseStartIndex = [currentPage objsAlignWord:phraseStartIndex :-1];
                phraseEndIndex = [currentPage objsAlignWord:phraseEndIndex :1];
                
                // get the matching string
                NSString *phrase = [currentPage objsString:phraseStartIndex :phraseEndIndex + 1];

                // clean the matching string
                phrase = [phrase stringByTrimmingCharactersInSet:charactersToRemove];
                phrase = [phrase stringByReplacingOccurrencesOfString:@"\r\n" withString:@" "];
                
                NSLog(@"page: %ld phrase: %@", (long)(i + 1), phrase);

                // create a search match object
                PDFSearchMatch *match = [[PDFSearchMatch alloc] init];
                match.matchType = MatchTypePhrase;
                match.matchingPhrase = phrase;
                match.matchingPage = i+1;
                [searchResults addObject:match];
            }
        }
        mFinder = nil;
        currentPage = nil;
    }
}

I have two issues with this:

1. Calling [currentPage objsStart] loads all of the objects for the current page into memory, so memory usage goes up as the loop runs. Is there a way to remove the loaded objects from memory once a page has been searched?

2. It is quite slow: approximately 6 seconds to search a PDF of 20+ pages.

Is there a better search method I should use on iOS that will still give me the page number the text was found on AND the text around the match (at the moment this is just the 10 characters either side of the matching text)?
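
To make the "characters either side" part concrete, the window calculation from the loop can be looked at in isolation. This is only a minimal sketch in plain C (which also compiles inside an Objective-C file). ContextWindow and context_window are hypothetical names of my own, not part of the PDF library, and the sketch assumes character indexes run from 0 to count - 1, as the loop above does.

```c
#include <assert.h>

/* Hypothetical helper type: an inclusive [start, end] range of character indexes. */
typedef struct {
    long start;
    long end;
} ContextWindow;

/*
 * Compute the context range around a match, clamped to the page.
 *   matchIndex  - index of the first matched character
 *   matchLength - number of characters in the search text
 *   before      - characters of context wanted before the match
 *   after       - characters of context wanted after the match
 *   count       - total number of characters on the page (assumed > 0)
 */
static ContextWindow context_window(long matchIndex, long matchLength,
                                    long before, long after, long count)
{
    ContextWindow w;

    assert(count > 0 && matchIndex >= 0 && matchIndex < count);

    /* Do not run off the start of the page. */
    w.start = matchIndex - before;
    if (w.start < 0) {
        w.start = 0;
    }

    /* Do not run off the end of the page. */
    w.end = matchIndex + matchLength + after;
    if (w.end > count - 1) {
        w.end = count - 1;
    }

    return w;
}
```

For example, a 3-character match at index 5 on a page of 100 characters, with 10 characters of context either side, gives the range 0 to 18. The word-boundary alignment would still be applied to that range afterwards.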

Many thanks

Russell
The topic has been locked.

Search PDF - is there a faster way? 10 years 3 months ago #5085

  • rhill
Any help on this, guys?