Package com.logicaldoc.core.util
Class PDFImageExtractor
java.lang.Object
com.logicaldoc.core.util.PDFImageExtractor
- All Implemented Interfaces:
AutoCloseable
Utility class for extracting raster images from a PDF document.
- Since:
- 1.0.0
- Author:
- Marco Meschieri - LogicalDOC
-
Constructor Summary
Constructors
PDFImageExtractor(File pdfFile): Creates a new PDF reader for the given PDF file
Method Summary
Modifier and Type, Method: Description
void close(): Closes the PDF and releases the resources used
BufferedImage extractImage(int pageIndex, org.apache.pdfbox.cos.COSName imageKey): Extracts the imageKey image from the given page
extractImages(): Extracts all images of the entire document
Set<org.apache.pdfbox.cos.COSName> getImageKeys(int pageIndex): Gets the set of image identifiers inside the given page
int getNumberOfPages(): Returns the total number of pages in the PDF
getPageAsImage(int pageIndex): Renders the specified page as a buffered image
-
Constructor Details
-
PDFImageExtractor
Creates a new PDF reader for the given PDF file
- Parameters:
pdfFile - the PDF file
-
-
Method Details
-
close
Closes the PDF and releases the resources used
- Specified by:
close in interface AutoCloseable
- Throws:
IOException - if the PDF file cannot be read
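Because the class implements AutoCloseable, it can be managed by a try-with-resources statement, which calls close() even when extraction fails. The sketch below illustrates that pattern with a hypothetical stand-in class, StubExtractor, so that it runs without LogicalDOC or PDFBox on the classpath. With the real class, the resource would presumably be created with new PDFImageExtractor(pdfFile) instead.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for PDFImageExtractor: it only mirrors the documented
// close() and getNumberOfPages() signatures and does not read any PDF.
class StubExtractor implements AutoCloseable {
    private final int pages;
    private boolean closed = false;

    StubExtractor(int pages) {
        this.pages = pages;
    }

    int getNumberOfPages() {
        return pages;
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() throws IOException {
        closed = true; // the real class would release the PDF resources here
    }
}

class TryWithResourcesSketch {
    // Uses the stand-in inside try-with-resources and returns it afterwards,
    // so the caller can observe that close() has already been invoked.
    static StubExtractor countAndClose(int pages) {
        StubExtractor extractor = new StubExtractor(pages);
        try (StubExtractor resource = extractor) {
            System.out.println("Pages: " + resource.getNumberOfPages());
        } catch (IOException e) {
            // close() declares IOException, so it must be handled or rethrown
            throw new UncheckedIOException(e);
        }
        return extractor;
    }

    public static void main(String[] args) {
        StubExtractor extractor = countAndClose(3);
        System.out.println("Closed: " + extractor.isClosed());
    }
}
```

Wrapping the IOException in an UncheckedIOException is only one possible choice; callers that can handle I/O failures may prefer to declare the checked exception.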
-
getNumberOfPages
public int getNumberOfPages()
Returns the total number of pages in the PDF
- Returns:
- the total number of pages
-
getPageAsImage
Renders the specified page as a buffered image
- Parameters:
pageIndex - zero-based page index, i.e., the first page is page 0
- Returns:
- object representation of the image
- Throws:
IOException - if the PDF file cannot be read
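Since page indexes are zero-based, rendering every page means looping from 0 up to getNumberOfPages() - 1. The following sketch shows that loop against a hypothetical PageRenderer interface that mirrors the two documented methods. BlankRenderer is an illustrative stand-in, and the java.awt.image.BufferedImage return type of getPageAsImage is an assumption based on the method description.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface mirroring two documented methods; the BufferedImage
// return type of getPageAsImage is assumed, not confirmed by the reference.
interface PageRenderer {
    int getNumberOfPages();

    BufferedImage getPageAsImage(int pageIndex) throws IOException;
}

// Stand-in that returns blank images; the width encodes the page index so the
// ordering of the results can be checked.
class BlankRenderer implements PageRenderer {
    private final int pages;

    BlankRenderer(int pages) {
        this.pages = pages;
    }

    @Override
    public int getNumberOfPages() {
        return pages;
    }

    @Override
    public BufferedImage getPageAsImage(int pageIndex) throws IOException {
        if (pageIndex < 0 || pageIndex >= pages) {
            throw new IOException("No such page: " + pageIndex);
        }
        return new BufferedImage(10 + pageIndex, 10, BufferedImage.TYPE_INT_RGB);
    }
}

class RenderAllPagesSketch {
    // Renders pages 0 .. getNumberOfPages() - 1 in order.
    static List<BufferedImage> renderAll(PageRenderer renderer) {
        List<BufferedImage> images = new ArrayList<>();
        try {
            for (int i = 0; i < renderer.getNumberOfPages(); i++) {
                images.add(renderer.getPageAsImage(i));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return images;
    }

    public static void main(String[] args) {
        List<BufferedImage> images = renderAll(new BlankRenderer(3));
        System.out.println("Rendered pages: " + images.size());
    }
}
```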
-
extractImage
public BufferedImage extractImage(int pageIndex, org.apache.pdfbox.cos.COSName imageKey) throws IOException
Extracts the imageKey image from the given page
- Parameters:
pageIndex - zero-based page index, i.e., the first page is page 0
imageKey - identifier of the image
- Returns:
- object representation of the image
- Throws:
IOException - if the PDF file cannot be read
-
rotate90SX
-
getImageKeys
Gets the set of image identifiers inside the given page
- Parameters:
pageIndex - zero-based page index, i.e., the first page is page 0
- Returns:
- set of image identifiers
- Throws:
IOException - if the PDF file cannot be read
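The identifiers returned by getImageKeys are the values that extractImage expects for the same page, so the two methods can be combined to collect images page by page. The sketch below shows that combination against a hypothetical generic interface, KeyedImageSource<K>, where K stands in for org.apache.pdfbox.cos.COSName so the example needs no PDFBox dependency. InMemoryImageSource and its addPage helper are illustrative only.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical interface mirroring three documented methods. K stands in for
// org.apache.pdfbox.cos.COSName.
interface KeyedImageSource<K> {
    int getNumberOfPages();

    Set<K> getImageKeys(int pageIndex) throws IOException;

    BufferedImage extractImage(int pageIndex, K imageKey) throws IOException;
}

// In-memory stand-in: each page is a map from image key to image.
class InMemoryImageSource implements KeyedImageSource<String> {
    private final List<Map<String, BufferedImage>> pages = new ArrayList<>();

    // Adds a page holding blank 1x1 images under the given keys.
    void addPage(String... keys) {
        Map<String, BufferedImage> images = new LinkedHashMap<>();
        for (String key : keys) {
            images.put(key, new BufferedImage(1, 1, BufferedImage.TYPE_INT_RGB));
        }
        pages.add(images);
    }

    @Override
    public int getNumberOfPages() {
        return pages.size();
    }

    @Override
    public Set<String> getImageKeys(int pageIndex) {
        return pages.get(pageIndex).keySet();
    }

    @Override
    public BufferedImage extractImage(int pageIndex, String imageKey) throws IOException {
        BufferedImage image = pages.get(pageIndex).get(imageKey);
        if (image == null) {
            throw new IOException("Unknown image key: " + imageKey);
        }
        return image;
    }
}

class ExtractByKeySketch {
    // Visits every page (zero-based) and extracts each image by its key.
    static <K> List<BufferedImage> extractAllByKey(KeyedImageSource<K> source) {
        List<BufferedImage> images = new ArrayList<>();
        try {
            for (int page = 0; page < source.getNumberOfPages(); page++) {
                for (K key : source.getImageKeys(page)) {
                    images.add(source.extractImage(page, key));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return images;
    }

    public static void main(String[] args) {
        InMemoryImageSource source = new InMemoryImageSource();
        source.addPage("Im0", "Im1");
        source.addPage("Im2");
        System.out.println("Images found: " + extractAllByKey(source).size());
    }
}
```

A page without images simply yields an empty key set, so the inner loop is skipped for it.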
-
extractImages
Extracts all images of the entire document
- Returns:
- the list of images
- Throws:
IOException - if the PDF file cannot be read
-