java.lang.Object
- com.logicaldoc.ocr.PDFImageExtractor

```
public class PDFImageExtractor
extends Object
```
This utility class extracts raster images from a PDF document.

Since:

1.0.0

Author:

Marco Meschieri - LogicalDOC

Constructor Summary

Constructors
Constructor Description

PDFImageExtractor(File pdfFile)
Creates a new PDF reader for the given PDF file

Method Summary

Modifier and Type	Method	Description
`void`	`close()`	Closes the PDF and releases resources used
`BufferedImage`	`extractImage(int pageIndex, org.apache.pdfbox.cos.COSName imageKey)`	Extracts the image identified by imageKey from the given page
`List<BufferedImage>`	`extractImages()`	Extracts all images of the entire document
`Set<org.apache.pdfbox.cos.COSName>`	`getImageKeys(int pageIndex)`	Gets the set of image identifiers on the given page
`int`	`getNumberOfPages()`	Returns the total number of pages in the PDF
`BufferedImage`	`getPageAsImage(int pageIndex)`	Renders the specified page as a buffered image
`BufferedImage`	`rotate90SX(BufferedImage bi)`

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - PDFImageExtractor
```
public PDFImageExtractor(File pdfFile)
```
    Creates a new PDF reader for the given PDF file
    
    Parameters:
    
    pdfFile - the PDF file to read
- Method Detail
  - close
```
public void close()
           throws IOException
```
    Closes the PDF and releases resources used
    
    Throws:
    
    IOException - if the PDF file cannot be read
  - getNumberOfPages
```
public int getNumberOfPages()
```
    Returns the total number of pages in the PDF
    
    Returns:
    
    the total number of pages
  - getPageAsImage
```
public BufferedImage getPageAsImage(int pageIndex)
                             throws IOException
```
    Renders the specified page as a buffered image
    
    Parameters:
    
    pageIndex - zero based page index, i.e., the first page is page 0
    
    Returns:
    
    the rendered page as a BufferedImage
    
    Throws:
    
    IOException - if the PDF file cannot be read
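
    Because the rendered page is a plain BufferedImage, it can be saved with the standard javax.imageio API. The following is a minimal sketch, not part of this class: PageImageWriterSketch and writePng are hypothetical names, and the image passed in is assumed to come from getPageAsImage.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of PDFImageExtractor.
final class PageImageWriterSketch {

    /** Writes a rendered page (for example the result of getPageAsImage) as a PNG file. */
    static void writePng(BufferedImage page, File target) throws IOException {
        // ImageIO.write returns false when no writer is registered for the format name
        if (!ImageIO.write(page, "png", target)) {
            throw new IOException("No PNG writer available for " + target);
        }
    }
}
```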
  - extractImage
```
public BufferedImage extractImage(int pageIndex,
                                  org.apache.pdfbox.cos.COSName imageKey)
                           throws IOException
```
    Extracts the image identified by imageKey from the given page
    
    Parameters:
    
    pageIndex - zero based page index, i.e., the first page is page 0
    
    imageKey - identifier of the image
    
    Returns:
    
    the extracted image as a BufferedImage
    
    Throws:
    
    IOException - if the PDF file cannot be read
  - rotate90SX
```
public BufferedImage rotate90SX(BufferedImage bi)
```
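    The source gives no description for this method. Judging only by its name, it may rotate the image by 90 degrees to the left (counter-clockwise); that reading is an assumption. The sketch below shows one way such a rotation can be done with the standard library. RotateLeftSketch and rotate90Left are hypothetical names, and the code is not the actual implementation.

```java
import java.awt.image.BufferedImage;

// Hypothetical illustration; the real rotate90SX implementation is not documented.
final class RotateLeftSketch {

    /** Returns a copy of src rotated 90 degrees counter-clockwise. */
    static BufferedImage rotate90Left(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        // TYPE_CUSTOM cannot be passed to the constructor, so fall back to ARGB
        int type = src.getType() == BufferedImage.TYPE_CUSTOM
                ? BufferedImage.TYPE_INT_ARGB
                : src.getType();
        // Width and height swap after a quarter turn
        BufferedImage dst = new BufferedImage(h, w, type);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Pixel (x, y) moves to (y, w - 1 - x) under a counter-clockwise quarter turn
                dst.setRGB(y, w - 1 - x, src.getRGB(x, y));
            }
        }
        return dst;
    }
}
```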
  - getImageKeys
```
public Set<org.apache.pdfbox.cos.COSName> getImageKeys(int pageIndex)
                                                throws IOException
```
    Gets the set of image identifiers on the given page
    
    Parameters:
    
    pageIndex - zero based page index, i.e., the first page is page 0
    
    Returns:
    
    set of image identifiers
    
    Throws:
    
    IOException - if the PDF file cannot be read
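
    The identifiers returned here are the keys that extractImage expects, so the two methods can be combined to pull images from selected pages only. The sketch below illustrates that loop under stated assumptions: PageImageSource is a hypothetical stand-in interface that mirrors the documented signatures (with K in place of org.apache.pdfbox.cos.COSName) so the example compiles without LogicalDOC or PDFBox on the classpath, and extractRange is a hypothetical helper. Since the class declaration shows no AutoCloseable interface, close() is called in a finally block.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/**
 * Hypothetical stand-in mirroring the documented PDFImageExtractor methods.
 * K plays the role of org.apache.pdfbox.cos.COSName.
 */
interface PageImageSource<K> {
    int getNumberOfPages();
    Set<K> getImageKeys(int pageIndex) throws IOException;
    BufferedImage extractImage(int pageIndex, K imageKey) throws IOException;
    void close() throws IOException;
}

final class PerPageExtractionSketch {

    /** Collects the images of pages [fromPage, toPage) and always closes the source. */
    static <K> List<BufferedImage> extractRange(PageImageSource<K> source,
                                                int fromPage, int toPage) throws IOException {
        List<BufferedImage> images = new ArrayList<>();
        try {
            // Clamp the range to the pages that actually exist (indexes are zero based)
            int last = Math.min(toPage, source.getNumberOfPages());
            for (int page = Math.max(0, fromPage); page < last; page++) {
                for (K key : source.getImageKeys(page)) {
                    images.add(source.extractImage(page, key));
                }
            }
        } finally {
            // Release the underlying PDF even if an extraction fails
            source.close();
        }
        return images;
    }
}
```

    With the real class, the same loop would call the methods directly on a PDFImageExtractor instance and use COSName keys.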
  - extractImages
```
public List<BufferedImage> extractImages()
                                  throws IOException
```
    Extracts all images of the entire document
    
    Returns:
    
    The list of images
    
    Throws:
    
    IOException - if the PDF file cannot be read
