Package org.jpedal.examples.acroform
Class ExtractEmbeddedFiles
java.lang.Object
org.jpedal.examples.BaseExample
org.jpedal.examples.acroform.ExtractEmbeddedFiles
public class ExtractEmbeddedFiles
extends org.jpedal.examples.BaseExample
File Extraction from PDF files
This class provides a simple Java API to extract embedded files and file attachments from a PDF file. It also provides a static convenience method if you just want to dump all files from a PDF file or a directory containing PDF files. All files are extracted to a folder at the given output location, with a name matching the PDF filename.
Example 1 - access API methods
 ExtractEmbeddedFiles extract=new ExtractEmbeddedFiles("C:/pdfs/mypdf.pdf");
 //extract.setPassword("password");
 if (extract.openPDFFile()) {
     if (extract.containsEmbeddedFiles()) {
         extract.extractEmbeddedFiles("C:/output/");
     }
     if (extract.containsFilesAttachments()) {
         extract.extractFileAttachments("C:/output");
     }
 }
 extract.closePDFfile();
Example 2 - convenience static method
Extract all embedded files and file attachments from a PDF:
 ExtractEmbeddedFiles.extractAllFilesFromPdf("C:/pdfs/mypdf.pdf", "C:/output");
Example 3 - Access directly from the Jar
ExtractEmbeddedFiles can be run directly from the jar. The following command extracts all embedded files and file attachments from a PDF file or directory to a defined output directory:
 java -cp libraries_needed org/jpedal/examples/acroform/ExtractEmbeddedFiles inputValues
Where inputValues consists of the following values:
- First value: The PDF filename (including the path if needed) or a directory containing PDF files. If it contains spaces it must be enclosed by double quotes (e.g. "C:/Path with spaces/").
- Second value: The location to write out extracted files from the PDF file or files. If it contains spaces it must be enclosed by double quotes (e.g. "C:/Path with spaces/").
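When the extractor is launched from another Java program, the command above can be assembled as an argument list. The following is a minimal sketch and not part of JPedal: the class ExtractCommand and its build method are hypothetical, the classPath parameter stands in for the libraries_needed placeholder, and the dotted class name is used in place of the slash form. Because ProcessBuilder passes each list element as a separate argument, paths containing spaces should not need the extra double quotes here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JPedal.
final class ExtractCommand {

    // Fully qualified name of the class documented on this page.
    static final String MAIN_CLASS = "org.jpedal.examples.acroform.ExtractEmbeddedFiles";

    /** Builds the argument list for running the extractor in a child JVM. */
    static List<String> build(String classPath, String input, String outputDir) {
        List<String> command = new ArrayList<>();
        command.add("java");
        command.add("-cp");
        command.add(classPath);   // stands in for "libraries_needed"
        command.add(MAIN_CLASS);
        command.add(input);       // first value: PDF file or directory of PDF files
        command.add(outputDir);   // second value: location for the extracted files
        return command;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: ExtractCommand classPath input outputDir");
            return;
        }
        // Run the child JVM and forward its output to this console.
        Process process = new ProcessBuilder(build(args[0], args[1], args[2]))
                .inheritIO()
                .start();
        System.exit(process.waitFor());
    }
}
```

The exit code of the child JVM is passed through, so a calling script can still detect a failed run.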
Constructor Summary
Constructors
- ExtractEmbeddedFiles(byte[] byteArray)
- ExtractEmbeddedFiles(String fileName)
Method Summary
- boolean containsEmbeddedFiles(): Method to flag if the current file contains embedded files.
- boolean containsFilesAttachments(): Method to flag if the current file contains file attachments.
- static Map<String,byte[]> extractAllEmbeddedFilesAsMap(String inputFilename)
- extractAllFileAttachmentFilesOnPage(int page)
- static Map<String,byte[]> extractAllFileAttachmentsAsMap(String inputFilename)
- static Map<String,byte[]> extractAllFileAttachmentsOnPageAsMap(String inputFilename, int page)
- static void extractAllFilesFromPdf(String inputDir, String outputDir): Static method to extract all embedded files and file attachments from a PDF file or a directory of PDF files.
- byte[] extractEmbeddedFile(String requestedFile)
- void extractEmbeddedFiles(String outputDirectory): Extract embedded files and place them in the output directory specified.
- byte[] extractFileAttachment(String requestedFile)
- void extractFileAttachments(String outputDirectory): Extract files from file attachment annotations in the open file and place them in the output directory specified.
- String[] getEmbeddedFileNames
- String[] getFileAttachmentNames
- static void main
- void setPassword(String password)
- void showEmbeddedFilesDetails()
- void showFileAttachmentDetails()
Methods inherited from class org.jpedal.examples.BaseExample
closePDFfile, openPDFFile
Constructor Details
ExtractEmbeddedFiles
ExtractEmbeddedFiles
public ExtractEmbeddedFiles(byte[] byteArray)
Method Details
main
setPassword
Parameters:
- password - the USER or OWNER password for the PDF file
containsFilesAttachments
public boolean containsFilesAttachments()
Method to flag if the current file contains file attachments.
Returns:
- True if file attachments are present, otherwise false.
extractFileAttachments
Extract files from file attachment annotations in the open file and place them in the output directory specified. A directory named after the PDF is created in the given directory and contains all extracted files. When extracting, any existing files of the same name are replaced. This does not extract files contained within the EmbeddedFiles dictionary (such as those found in Portfolios).
Parameters:
- outputDirectory - Path where the extracted files should be saved.
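After an extraction run, the per-PDF folder described above can be inspected with plain java.nio calls. The following is a minimal sketch and not part of JPedal: the class ExtractedFolder and its list method are hypothetical. The description does not say whether the folder name keeps the ".pdf" extension, so the sketch checks both forms rather than assuming one.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper, not part of JPedal.
final class ExtractedFolder {

    /**
     * Returns the sorted names of the regular files in the folder created for
     * the given PDF, or an empty list if no such folder exists.
     */
    static List<String> list(String outputDirectory, String pdfFile) throws IOException {
        String fullName = Paths.get(pdfFile).getFileName().toString();
        int dot = fullName.lastIndexOf('.');
        String baseName = dot > 0 ? fullName.substring(0, dot) : fullName;

        // Assumption: the folder is named either "mypdf" or "mypdf.pdf"; try both.
        for (String candidate : new String[] {baseName, fullName}) {
            Path folder = Paths.get(outputDirectory, candidate);
            if (Files.isDirectory(folder)) {
                List<String> names = new ArrayList<>();
                try (Stream<Path> entries = Files.list(folder)) {
                    entries.filter(Files::isRegularFile)
                           .forEach(p -> names.add(p.getFileName().toString()));
                }
                Collections.sort(names);
                return names;
            }
        }
        return Collections.emptyList();
    }
}
```

For the paths used in Example 1, a call such as ExtractedFolder.list("C:/output", "C:/pdfs/mypdf.pdf") would report whatever was written into the folder for mypdf.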
extractAllEmbeddedFilesAsMap
public static Map<String,byte[]> extractAllEmbeddedFilesAsMap(String inputFilename) throws PdfException
Throws:
- PdfException
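A Map<String,byte[]> such as the one returned here can be written to disk with standard java.nio calls. The following is a minimal sketch and not part of JPedal: the class ExtractedFileWriter and its writeAll method are hypothetical, and it assumes the map keys are file names, which this page does not state. Keys are reduced to their last path element so that an entry cannot be written outside the chosen directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

// Hypothetical helper, not part of JPedal.
final class ExtractedFileWriter {

    /**
     * Writes each entry into outputDir, using the key as the file name
     * (assumption: keys are file names). Returns the number of files written.
     */
    static int writeAll(Map<String, byte[]> files, Path outputDir) throws IOException {
        Files.createDirectories(outputDir);
        Path base = outputDir.normalize();
        int written = 0;
        for (Map.Entry<String, byte[]> entry : files.entrySet()) {
            if (entry.getKey() == null || entry.getValue() == null) {
                continue; // nothing usable in this entry
            }
            // Keep only the last path element of the key, e.g. "../b.bin" becomes "b.bin".
            Path name = Paths.get(entry.getKey()).getFileName();
            if (name == null) {
                continue;
            }
            Path target = base.resolve(name).normalize();
            // Skip names such as "" or ".." that do not resolve to a file inside base.
            if (!target.startsWith(base) || target.equals(base)) {
                continue;
            }
            // With default options, Files.write replaces an existing file of the same name.
            Files.write(target, entry.getValue());
            written++;
        }
        return written;
    }
}
```

A call could then pass the result of extractAllEmbeddedFilesAsMap("C:/pdfs/mypdf.pdf") together with Paths.get("C:/output") to writeAll.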
extractEmbeddedFile
getEmbeddedFileNames
extractAllFileAttachmentsAsMap
public static Map<String,byte[]> extractAllFileAttachmentsAsMap(String inputFilename) throws PdfException
Throws:
- PdfException
extractAllFileAttachmentsOnPageAsMap
public static Map<String,byte[]> extractAllFileAttachmentsOnPageAsMap(String inputFilename, int page) throws PdfException
Throws:
- PdfException
extractFileAttachment
extractAllFileAttachmentFilesOnPage
getFileAttachmentNames
containsEmbeddedFiles
public boolean containsEmbeddedFiles()
Method to flag if the current file contains embedded files.
Returns:
- True if embedded files are present, otherwise false.
extractEmbeddedFiles
Extract embedded files and place them in the output directory specified. A directory named after the PDF is created in the given directory and contains all extracted files. When extracting, any existing files of the same name are replaced. This does not extract files contained within File Attachment annotations.
Parameters:
- outputDirectory - Path where the extracted files should be saved.
extractAllFilesFromPdf
Static method to extract all embedded files and file attachments from a PDF file or a directory of PDF files.
Parameters:
- inputDir - PDF file or directory of PDF files to extract from
- outputDir - directory to write the extracted files to
Throws:
- PdfException
showEmbeddedFilesDetails
public void showEmbeddedFilesDetails()
showFileAttachmentDetails
public void showFileAttachmentDetails()