Convert Word doc, docx and Excel xls, xlsx to PDF with PHP

I am looking for a way to convert Word and Excel files to PDF using PHP.

I need this because I have to combine files of different formats into one document. If I can convert everything to PDF, I can then merge the PDF files into a single file using PDFMerger (which uses fpdf).

I can already create PDF files from other file types and images, but I'm stuck on Word documents. (I think I may be able to convert the Excel files using the PHPExcel library, which I already use to create Excel files from HTML code.)

I do not use the Zend Framework, so I hope someone can point me in the right direction.

Alternatively, if there is a way to create image files (jpg) from Word documents, that would also work.

Thanks for any help!

+31
php ms-word excel pdf-generation
Apr 04 2018-11-11T00:
12 answers

I found a solution to my problem and, as requested, I am posting it here to help others. Sorry if I have missed any details; it has been a while since I worked on this solution.

First of all, you need to install OpenOffice.org on the server. I asked my hosting provider to install the OpenOffice.org RPM on my VPS. This can be done directly through WHM.

Now that the server has the ability to process MS Office files, you can convert files by executing command-line commands through PHP. To handle this, I found PyODConverter : https://github.com/mirkonasato/pyodconverter

I created a directory on the server and placed the PyODConverter file (DocumentConverter.py) in it. I also created a simple text file above the root of the website (I called it "adocpdf") with the following command-line commands:

    directory=$1
    filename=$2
    extension=$3
    SERVICE='soffice'

    # Start a headless OpenOffice.org listener if one is not already running
    if [ "`ps ax|grep -v grep|grep -c $SERVICE`" -lt 1 ]; then
        unset DISPLAY
        /usr/bin/soffice -headless -accept="socket,host=127.0.0.1,port=8100;urp;" -nofirststartwizard &
        sleep 5s
    fi

    # Convert the document; the PDF is written next to the original
    python /home/website/python/DocumentConverter.py "/home/website/$directory$filename$extension" "/home/website/$directory$filename.pdf"

This checks whether OpenOffice.org (soffice) is already running, starts it in headless mode if it is not, and then calls the PyODConverter script to process the file and output it as a PDF. The three variables on the first three lines are supplied when the script is executed from a PHP file. The delay ("sleep 5s") gives OpenOffice.org enough time to start, if necessary. I have been using this for several months now, and the 5 second gap seems to be enough.
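The fixed "sleep 5s" is a guess at how long soffice needs to start. One alternative, sketched here in PHP, is to poll the listener port until it accepts connections. This assumes the listener is on 127.0.0.1 port 8100, as in the script above; waitForPort is a hypothetical helper name, not part of PyODConverter or OpenOffice.org.

```php
<?php
// Hypothetical helper: wait until something accepts TCP connections on
// $host:$port, or give up after $timeoutSeconds.
function waitForPort(string $host, int $port, float $timeoutSeconds = 5.0): bool
{
    $deadline = microtime(true) + $timeoutSeconds;
    do {
        // Suppress the warning fsockopen() raises while the port is still closed
        $conn = @fsockopen($host, $port, $errno, $errstr, 0.5);
        if ($conn !== false) {
            fclose($conn);
            return true;
        }
        usleep(200000); // pause 0.2 s between attempts
    } while (microtime(true) < $deadline);

    return false;
}
```

With this in place, the PHP side could call waitForPort('127.0.0.1', 8100, 10.0) after launching soffice and only run the conversion once it returns true.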

The script will create a PDF version of the document in the same directory as the original.

Finally, I start the conversion of a Word / Excel file from PHP (I have this inside a function that checks whether the file we're dealing with is a Word / Excel document):

    // use OpenOffice.org
    $output = array();
    $return_var = 0;
    exec("/opt/adocpdf {$directory} {$filename} {$extension}", $output, $return_var);

This PHP function is called after the Word / Excel file has been uploaded to the server. The three variables in the call to exec() map directly to the three variables at the beginning of the plain text script above. Note that the $directory variable does not need a leading forward slash if the file to be converted is within the web root.
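Because the file name usually comes from an upload, it may be worth quoting the arguments before they reach the shell. A minimal sketch, assuming the same /opt/adocpdf script and a Unix-like server; the two function names are hypothetical:

```php
<?php
// Build the command line with every argument quoted for the shell.
function buildAdocpdfCommand(string $directory, string $filename, string $extension): string
{
    return '/opt/adocpdf '
        . escapeshellarg($directory) . ' '
        . escapeshellarg($filename) . ' '
        . escapeshellarg($extension);
}

// Run the script and report success based on its exit status
// (the exit status of the last command in the script, i.e. the python call).
function convertUploadedOfficeFile(string $directory, string $filename, string $extension): bool
{
    $output = array();
    $return_var = 0;
    exec(buildAdocpdfCommand($directory, $filename, $extension), $output, $return_var);

    return $return_var === 0;
}
```

This only protects the PHP side of the call; the shell script itself still has to quote its variables for file names containing spaces to work.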

That's it! I hope this will be useful to someone and spare them the difficulties and learning curve that I went through.

+22
Jul 26 2018-12-12T00:

Well, my 2 cents when it comes to Word 2007 .docx, Word 97-2004 .doc, PDF and all the other MS Office file types that people want to convert from y to z, but that in real life just don't want to be converted. In my experience, conversion with LibreOffice or OpenOffice cannot be fully trusted, although .doc documents are generally better supported than Word 2007 .docx. In general, it is very difficult to convert .docx to .doc without breaking anything.

The .docx format is also very useful for templating, whereas .doc is not, because it is a binary format.

Converting from .doc to PDF was quite reliable in most cases. If you can still influence the design or content of the Word document, this may be satisfactory, but in my situation the documents were supplied by external companies, and even after generating the .docx from templates, in some scenarios the generated .docx had to be modified slightly with additional text before it was turned into a PDF.




THE WINDOWS-BASED SOLUTION!

All these hiccups led me to conclude that the only truly reliable conversion method I found was to use the COM class in PHP and let MS Word or Excel do all the work for you. I will just give an example of converting .docx to .doc and/or PDF. If you do not have MS Office installed, you can download a 60-day trial version, which gives you enough time for testing.

The COM/.NET extension is commented out by default in php.ini; just find the line containing php_com_dotnet.dll and uncomment it, like this:

  extension=php_com_dotnet.dll 

Restart the web server (IIS is not a prerequisite; Apache will work just as well).

The code below is a demonstration of how simple it is.

    $word = new COM("Word.Application") or die("Could not initialise Object.");

    // set it to 1 to see the MS Word window (the actual opening of the document)
    $word->Visible = 0;

    // recommended to set to 0; disables alerts like "Do you want MS Word to be the default .. etc"
    $word->DisplayAlerts = 0;

    // open the Word 2007-2013 document
    $word->Documents->Open('yourdocument.docx');

    // save it as Word 2003
    $word->ActiveDocument->SaveAs('newdocument.doc');

    // convert Word 2007-2013 to PDF
    $word->ActiveDocument->ExportAsFixedFormat('yourdocument.pdf', 17, false, 0, 0, 0, 0, 7, true, true, 2, true, true, false);

    // quit the Word process
    $word->Quit(false);

    // clean up
    unset($word);

This is just a small demonstration. All I can say is that, when it comes to conversion, this was the only really reliable option I could use, and the only one I would recommend.

+16
Nov 17 '13 at 20:27

1) I use WAMP.

2) I installed OpenOffice (from the Apache website http://www.openoffice.org/download/).

3) $output_dir = "C:/wamp/www/projectfolder/"; is my project folder, where I want the output file to be created.

4) I have already placed my input file at C:/wamp/www/projectfolder/wordfile.docx.

Then I run my code (below):

    <?php
    set_time_limit(0);

    function MakePropertyValue($name, $value, $osm) {
        $oStruct = $osm->Bridge_GetStruct("com.sun.star.beans.PropertyValue");
        $oStruct->Name = $name;
        $oStruct->Value = $value;
        return $oStruct;
    }

    function word2pdf($doc_url, $output_url) {
        // Invoke the OpenOffice.org service manager
        $osm = new COM("com.sun.star.ServiceManager") or die("Please be sure that OpenOffice.org is installed.\n");
        // Set the application to remain hidden to avoid flashing the document onscreen
        $args = array(MakePropertyValue("Hidden", true, $osm));
        // Launch the desktop
        $oDesktop = $osm->createInstance("com.sun.star.frame.Desktop");
        // Load the .doc file, and pass in the "Hidden" property from above
        $oWriterDoc = $oDesktop->loadComponentFromURL($doc_url, "_blank", 0, $args);
        // Set up the arguments for the PDF output
        $export_args = array(MakePropertyValue("FilterName", "writer_pdf_Export", $osm));
        // print_r($export_args);
        // Write out the PDF
        $oWriterDoc->storeToURL($output_url, $export_args);
        $oWriterDoc->close(true);
    }

    $output_dir = "C:/wamp/www/projectfolder/";
    $doc_file = "C:/wamp/www/projectfolder/wordfile.docx";
    $pdf_file = "outputfile_name.pdf";

    $output_file = $output_dir . $pdf_file;
    $doc_file = "file:///" . $doc_file;
    $output_file = "file:///" . $output_file;

    word2pdf($doc_file, $output_file);
    ?>
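
The code above builds the file:/// URLs by plain concatenation, which is fine for the paths shown but may break for paths containing spaces or backslashes. A minimal sketch of a helper that normalises and percent-encodes the path first; pathToFileUrl is a hypothetical name, not part of OpenOffice or PHP:

```php
<?php
// Hypothetical helper: turn a local path into a file:/// URL,
// percent-encoding each path segment but leaving a drive letter ("C:") intact.
function pathToFileUrl(string $path): string
{
    $path = str_replace('\\', '/', $path);
    $segments = explode('/', $path);

    $encoded = array_map(function ($segment) {
        return preg_match('/^[A-Za-z]:$/', $segment) ? $segment : rawurlencode($segment);
    }, $segments);

    return 'file:///' . ltrim(implode('/', $encoded), '/');
}
```

The call would then look like word2pdf(pathToFileUrl($doc_file), pathToFileUrl($output_file)), with both variables holding plain paths rather than pre-built URLs.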
+10
Jan 03 '13 at 9:50

I have successfully put a portable version of LibreOffice on my web host's server, which I invoke from PHP to do command-line conversion from .docx, etc. to PDF on the fly. I do not have administrator rights on my host's web server. Here is my blog post about what I did:

http://geekswithblogs.net/robertphyatt/archive/2011/11/19/converting-.docx-to-pdf-or-.doc-to-pdf-or-.doc.aspx

Hurrah! Convert directly from .docx or .odt to .pdf using PHP with LibreOffice (successor to OpenOffice)!

+8
Nov 20 2018-11-11T00:

OpenOffice / LibreOffice-based solutions will do an OK job, but don't expect your PDFs to look like your source files if those were created in MS Office. A PDF that looks 90% like the original is not considered acceptable in many fields.

The only way to make sure your PDF files look exactly like the originals is to use a solution that uses the official MS Office DLLs under the hood. If you run your PHP solution on non-Windows servers, this requires an additional Windows server. That might be a showstopper, but if you really care about the look of your PDF files, you may not have a choice.

Check out this blog post. It shows how to use PHP to convert MS Office files with a high level of fidelity.

Disclaimer: I wrote this blog post and worked on a related commercial product, so I consider myself biased. However, this is a great solution for the PHP people I work with.

+4
Feb 14 '13 at 12:25

Step 1. Install "Apache_OpenOffice_4.1.2" on your system.

Step 2. Download the "unoconv" library from GitHub or elsewhere.

-> C:\Program Files (x86)\OpenOffice 4\program\python.exe = path to the OpenOffice installation directory

-> D:\wamp\www\doc_to_pdf\libobasis4.4-pyuno\unoconv = path to the library folder

-> D:/wamp/www/doc_to_pdf/files/'.$pdf_File_name.' = path and PDF file name

-> D:/wamp/www/doc_to_pdf/files/'.$doc_file_name = path to your document file

If the PDF is not created, as a last step go to Control Panel → All Control Panel Items → Administrative Tools → Services, find "wampapache", right-click it and select Properties, then open the Log On tab and check the box that allows the service to interact with the desktop.

Create a sample .php file, put the code below in it, and run it on your WAMP or XAMPP server:

 $result = exec('"C:\Program Files (x86)\OpenOffice 4\program\python.exe" D:\wamp\www\doc_to_pdf\libobasis4.4-pyuno\unoconv -f pdf -o D:/wamp/www/doc_to_pdf/files/'.$pdf_File_name.' D:/wamp/www/doc_to_pdf/files/'.$doc_file_name); 

This code works for me on Windows 8.

+3
Mar 21 '16 at 13:45

I found a couple of solutions after a lot of searching. You can try them too if you are tired of looking for a good solution.

General approach: use a SOAP API

You need a username and password to make a SOAP request at https://www.livedocx.com

Register using this https://www.livedocx.com/user/account_registration.aspx and follow these steps.

Use the code below in your .php file.

    ini_set('soap.wsdl_cache_enabled', 0);

    // you will get this username and password when you register
    define('USERNAME', 'Username');
    define('PASSWORD', 'Password');

    // SOAP WSDL endpoint
    define('ENDPOINT', 'https://api.livedocx.com/2.1/mailmerge.asmx?wsdl');

    // Define timezone
    date_default_timezone_set('Europe/Berlin');

    $soap = new SoapClient(ENDPOINT);
    $soap->LogIn(
        array(
            'username' => USERNAME,
            'password' => PASSWORD
        )
    );

    $data = file_get_contents('test.doc');
    $soap->SetLocalTemplate(
        array(
            'template' => base64_encode($data),
            'format' => 'doc'
        )
    );

    $soap->CreateDocument();
    $result = $soap->RetrieveDocument(
        array(
            'format' => 'pdf'
        )
    );

    $data = $result->RetrieveDocumentResult;
    file_put_contents('tree.pdf', base64_decode($data));

    $soap->LogOut();
    unset($soap);

Follow this link for more information http://www.phplivedocx.org/

For Ubuntu

Requires installation of OpenOffice and unoconv.

From the command line:

    apt-get remove --purge unoconv
    git clone https://github.com/dagwieers/unoconv
    cd unoconv
    sudo make install

Now add the code below to your PHP script, and make sure the file is executable.

    shell_exec('/usr/bin/unoconv -f pdf folder/test.docx');
    shell_exec('/usr/bin/unoconv -f pdf folder/sachin.png');
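
shell_exec() does not report whether the conversion worked, so a small check on the result can help. The sketch below assumes unoconv's default behaviour of writing the PDF next to the source file with the extension swapped; both function names are hypothetical.

```php
<?php
// Where the PDF should end up if unoconv writes it next to the source file
// (an assumption about the default output location).
function expectedPdfPath(string $sourcePath): string
{
    $info = pathinfo($sourcePath);

    return $info['dirname'] . '/' . $info['filename'] . '.pdf';
}

// Run unoconv and return the PDF path, or null if no PDF was produced.
function convertWithUnoconv(string $sourcePath): ?string
{
    shell_exec('/usr/bin/unoconv -f pdf ' . escapeshellarg($sourcePath));
    $pdf = expectedPdfPath($sourcePath);

    return is_file($pdf) ? $pdf : null;
}
```

For example, convertWithUnoconv('folder/test.docx') would return 'folder/test.pdf' only if that file exists after the call.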

We hope this solution helps you.

+1
Aug 23 '16 at 12:16

Have you tried http://www.phpdocx.com/ ? In addition, it can be hosted on your server.

0
Apr 04 2018-11-11T00:

For a PHP-specific option, you can try PHPWord. This library is written in pure PHP and provides a set of classes for writing to and reading from different document file formats (including .doc and .docx). The main drawback is that the quality of the converted files can be quite variable.

Alternatively, if you want a better option, you can use a file conversion API such as Zamzar. You can use it to convert a wide range of office formats (and others) to PDF, and you can call it from any platform (Windows, Linux, OS X, etc.).

The PHP code to convert the file will look like this:

    <?php
    $endpoint = "https://api.zamzar.com/v1/jobs";
    $apiKey = "API_KEY";
    $sourceFilePath = "/my.doc"; // Or docx/xls/xlsx etc
    $targetFormat = "pdf";

    // Wrap the path in a CURLFile so cURL uploads the file (PHP 5.5+);
    // no CURLOPT_SAFE_UPLOAD tweak is needed with it
    $sourceFile = curl_file_create($sourceFilePath);

    $postData = array(
        "source_file" => $sourceFile,
        "target_format" => $targetFormat
    );

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $endpoint);
    curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'POST');
    curl_setopt($ch, CURLOPT_POSTFIELDS, $postData);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_USERPWD, $apiKey . ":");
    $body = curl_exec($ch);
    curl_close($ch);

    $response = json_decode($body, true);
    print_r($response);
    ?>

Full disclosure: I am the lead developer of the Zamzar API.

0
Nov 10 '17 at 11:51

Another way to do this is to use the --convert-to option of the libreoffice command directly:

 libreoffice --convert-to pdf /path/to/file.{doc,docx} 
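
Note that {doc,docx} relies on shell brace expansion, which bash supports but which the /bin/sh that PHP's exec() uses may not. A minimal sketch that lists the files explicitly instead; buildLibreOfficeCommand is a hypothetical helper, and the --headless and --outdir options are additions to the command shown above:

```php
<?php
// Build a LibreOffice command line that converts every listed file to PDF
// and writes the results into $outDir.
function buildLibreOfficeCommand(array $files, string $outDir, string $binary = 'libreoffice'): string
{
    $parts = array(
        escapeshellcmd($binary),
        '--headless',
        '--convert-to', 'pdf',
        '--outdir', escapeshellarg($outDir),
    );
    foreach ($files as $file) {
        $parts[] = escapeshellarg($file);
    }

    return implode(' ', $parts);
}
```

The resulting string can be passed to exec() together with an output array and a return-code variable, so a non-zero exit status can be detected.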
0
Jun 18 '18 at 14:50

In my opinion, the easiest way to do this is with the free Cloudmersive PHP library; just call convertDocumentDocxToPdf:

    <?php
    require_once(__DIR__ . '/vendor/autoload.php');

    // Configure API key authorization: Apikey
    $config = Swagger\Client\Configuration::getDefaultConfiguration()->setApiKey('Apikey', 'YOUR_API_KEY');

    $apiInstance = new Swagger\Client\Api\ConvertDocumentApi(
        new GuzzleHttp\Client(),
        $config
    );
    $input_file = "/path/to/file.txt"; // \SplFileObject | Input file to perform the operation on.

    try {
        $result = $apiInstance->convertDocumentDocxToPdf($input_file);
        print_r($result);
    } catch (Exception $e) {
        echo 'Exception when calling ConvertDocumentApi->convertDocumentDocxToPdf: ', $e->getMessage(), PHP_EOL;
    }
    ?>

Be sure to replace $input_file with the appropriate file path. You can also configure it to use a byte array if you prefer to do it that way. The result will be the bytes of the converted PDF file.

0
May 16 '19 at 5:37

For anyone who wants to do this on Ubuntu / Linux using PHP:

Ubuntu comes with LibreOffice installed by default, so you can run it headless with a shell command for this.

 shell_exec('/usr/bin/libreoffice --headless --convert-to pdf:writer_pdf_Export --outdir /var/www/html/demo/public_html/src/var/output /var/www/html/demo/public_html/src/var/source/sample.doc'); 

Hope this helps others like me.

0
Aug 21 '19 at 12:49


