I finally got it to work. I have outlined the steps I took to get a working example. Hope someone finds this helpful.
Download Java JDK
Download IKVM 0.42.0.6
Download PDFBox 1.6.0-src.zip
Download Apache Ant (useful Ant guide).
I renamed the Ant and PDFBox folders to shorten their names and moved them to my C: drive (C:\apache-ant and C:\pdfbox-1.6.0).
You need to set up environment variables. (Windows 7) Right-click My Computer -> Properties -> Advanced System Settings -> Environment Variables.
I used the settings below, but yours will differ depending on where you installed Java and where you placed the Ant and PDFBox folders.
Variable     Value
ANT_HOME     C:\apache-ant\
JAVA_HOME    C:\Program Files (x86)\Java\jdk1.7.0_01
Path         ;C:\apache-ant\bin\   (append a semicolon and this path to the existing value)
Once this is done, type "ant" in a command window. If everything is configured correctly, you should get the message "build.xml does not exist!".
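If "ant" is not recognized at this point, the usual cause is a typo in one of the variables. As a rough sketch (not part of the original steps), a small console program like the one below can confirm the settings. The class and method names (EnvCheck, PointsToDirectory, ContainsAnt) are made up for illustration, and the check assumes the Windows launcher ant.bat sits in Ant's bin folder.

```csharp
using System;
using System.IO;

static class EnvCheck
{
    // True if the environment variable is set and names an existing directory.
    public static bool PointsToDirectory(string variable)
    {
        string value = Environment.GetEnvironmentVariable(variable);
        return !string.IsNullOrEmpty(value) && Directory.Exists(value);
    }

    // True if any entry of the given Path value contains ant.bat.
    public static bool ContainsAnt(string pathValue)
    {
        string[] entries = (pathValue ?? "").Split(
            new[] { Path.PathSeparator }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string entry in entries)
        {
            // Path entries are sometimes quoted or padded with spaces.
            string dir = entry.Trim().Trim('"');
            if (dir.Length > 0 && File.Exists(Path.Combine(dir, "ant.bat")))
                return true;
        }
        return false;
    }

    static void Main()
    {
        Console.WriteLine("ANT_HOME ok:  {0}", PointsToDirectory("ANT_HOME"));
        Console.WriteLine("JAVA_HOME ok: {0}", PointsToDirectory("JAVA_HOME"));
        Console.WriteLine("ant on Path:  {0}",
            ContainsAnt(Environment.GetEnvironmentVariable("Path")));
    }
}
```

Open a new command window before running it, since windows that were already open do not pick up changed environment variables.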
Modify the build.xml file inside the folder "pdfbox-1.6.0\pdfbox". Find the line that sets the IKVM location to "." and replace the "." with the path to your IKVM folder.
I moved IKVM to "C:\IKVM", so in my file that value is "C:\IKVM".
Open a command window, cd to "C:\pdfbox-1.6.0\pdfbox" and enter "ant".
... and then a miracle happens.
The pdfbox folder should now have many new folders. The necessary DLLs are located in the bin folder. I don't know why, but I have "-SNAPSHOT" at the end of all my file names (pdfbox-1.6.0-SNAPSHOT.dll).
IKVM.GNU.Classpath (also called IKVM.OpenJDK.Classpath) no longer exists; it was modularized in the 0.40 release and is now split into several IKVM.OpenJDK DLLs. You only need a few of them.
Create a new C# project in Visual Studio.
Copy these files from the PDFBox bin folder to the bin folder of the Visual C# project:
pdfbox-1.6.0-SNAPSHOT.dll
fontbox-1.6.0-SNAPSHOT.dll
commons-logging.dll
Copy these files from the IKVM bin folder to the bin folder of the Visual C# project:
IKVM.OpenJDK.Core.dll
IKVM.OpenJDK.SwingAWT.dll
IKVM.OpenJDK.Text.dll
IKVM.OpenJDK.Util.dll
IKVM.Runtime.dll
Add references to the IKVM DLLs above and build the project.
Add references to the PDFBox DLLs and build the project again.
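A missing DLL tends to show up only at run time as a file-load error, so it can help to confirm that everything was copied. The following is a minimal sketch of such a check; the class and method names (DllCheck, FindMissing) are hypothetical, and the file names are simply the ones listed above, so adjust them if your build produced different names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class DllCheck
{
    // File names from the two copy steps above (assumed to match your build output).
    static readonly string[] Required =
    {
        "pdfbox-1.6.0-SNAPSHOT.dll",
        "fontbox-1.6.0-SNAPSHOT.dll",
        "commons-logging.dll",
        "IKVM.OpenJDK.Core.dll",
        "IKVM.OpenJDK.SwingAWT.dll",
        "IKVM.OpenJDK.Text.dll",
        "IKVM.OpenJDK.Util.dll",
        "IKVM.Runtime.dll"
    };

    // Returns the required files that are not present in the given folder.
    public static List<string> FindMissing(string folder)
    {
        var missing = new List<string>();
        foreach (string name in Required)
        {
            if (!File.Exists(Path.Combine(folder, name)))
                missing.Add(name);
        }
        return missing;
    }

    static void Main(string[] args)
    {
        // Check the folder given on the command line, or the folder the program runs from.
        string folder = args.Length > 0 ? args[0] : AppDomain.CurrentDomain.BaseDirectory;
        List<string> missing = FindMissing(folder);
        if (missing.Count == 0)
            Console.WriteLine("All required DLLs found in " + folder);
        foreach (string name in missing)
            Console.WriteLine("Missing: " + name);
    }
}
```

Run it from the project's bin folder, or pass that folder as the first argument.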
Now you are ready to write the code. The simple example below gave me a nice text file from a PDF input.
    using System;
    using System.IO;
    using org.apache.pdfbox.pdmodel;
    using org.apache.pdfbox.util;

    namespace testPDF
    {
        class Program
        {
            static void Main()
            {
                // Extract the text and write it next to the source file.
                PDFtoText pdf = new PDFtoText();
                string pdfText = pdf.parsePDF(@"C:\Sample.pdf");
                using (StreamWriter writer = new StreamWriter(@"C:\Sample.txt"))
                {
                    writer.Write(pdfText);
                }
            }

            class PDFtoText
            {
                // Loads the PDF at filepath and returns its text content.
                public string parsePDF(string filepath)
                {
                    PDDocument document = PDDocument.load(filepath);
                    try
                    {
                        PDFTextStripper stripper = new PDFTextStripper();
                        return stripper.getText(document);
                    }
                    finally
                    {
                        // Release the file even if text extraction fails.
                        document.close();
                    }
                }
            }
        }
    }