Ticket T942135

How to extract PDF pages with highlighted text

created 4 years ago

How would one go about searching a document for highlights on a page (preferably whether highlighted using a DevEx control or Acrobat Pro, but at a minimum using a DevEx control), then extracting those pages to a new document programmatically (without changing the original document), and then laying text onto the bottom of each of the new pages?

  • 200 page original
  • 20 pages have highlights somewhere
  • extract 20 pages
  • apply text to bottom of extracted pages
  • save as new document

Thank you!

Answers approved by DevExpress Support

created 4 years ago

Hello Greg,

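To accomplish the task, iterate through the pages in the source document and check whether each page's PdfPage.Annotations collection contains PdfTextMarkupAnnotation objects. If it does, copy that page to a new document using the PdfDocumentProcessor.Document.Pages.Add method. Note that text markup annotations also cover underline, strikeout, and squiggly markup, so this check matches those as well as highlights.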
Here is a sample code snippet that demonstrates this solution:

C#
// Requires the System.Collections.Generic, System.Linq, and DevExpress.Pdf namespaces.
using (PdfDocumentProcessor source = new PdfDocumentProcessor()) {
    source.LoadDocument(sourceFile);

    // Collect the zero-based indexes of pages that contain text markup annotations.
    List<int> highlightedPageNumbers = new List<int>();
    var pages = source.Document.Pages;
    for (int i = 0; i < pages.Count; i++) {
        PdfPage page = pages[i];
        var textMarkupAnnotations = page.Annotations.OfType<PdfTextMarkupAnnotation>();
        if (textMarkupAnnotations.Any())
            highlightedPageNumbers.Add(i);
    }
    if (highlightedPageNumbers.Count == 0)
        return;

    // Copy the matching pages to a new file; the source file is never saved, so it stays unchanged.
    using (PdfDocumentProcessor target = new PdfDocumentProcessor()) {
        target.CreateEmptyDocument("HighlightedPages.pdf");
        foreach (int pageIndex in highlightedPageNumbers)
            target.Document.Pages.Add(pages[pageIndex]);
    }
}

If something is unclear or further explanation is required, please leave a reply in the comment section below.