How to extract slide notes from a PowerPoint file using ColdFusion

I have a .PPT file (PowerPoint, convertible to ODP or PPTX) with speaker notes on every slide. I want to extract the entire presentation into something dynamic, so that I can build a cheat sheet for speakers to use on a phone or at a desk during a talk (a thumbnail of each slide with its speaker notes). I do this just often enough to HATE doing it by hand.

This is almost easy with <cfpresentation format="html" showNotes="yes">, which breaks the PPT into HTML pages and creates an image for each slide. However, cfpresentation does not carry over the speaker notes; they are lost in translation.

I also tried <cfdocument>, which has no option for preserving slide notes when it converts the file to PDF.

Is there a way to extract the notes from a PowerPoint file from within ColdFusion?

4 answers

The simplest solution:

Convert your PowerPoint presentation to the OpenOffice ODP format. An ODP file is a zip archive: CFML can unzip it, and inside is a content.xml file that contains the slides and the notes, so CFML can extract the notes from it.

Given what CFDOCUMENT can do, maybe ColdFusion can even convert the PPT to ODP for you?
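
A minimal sketch of the unzip-and-parse step, written in Java (which ColdFusion can call), assuming the ODP follows the usual OpenDocument layout in which each draw:page element in content.xml carries its notes in a presentation:notes child. The names OdpNotes, notesFromOdp and notesFromContentXml are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: pulls speaker notes out of an ODP file.
class OdpNotes {
    // OpenDocument namespace URIs for the elements we look at.
    static final String DRAW = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0";
    static final String PRES = "urn:oasis:names:tc:opendocument:xmlns:presentation:1.0";
    static final String TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    // Opens the ODP (a zip archive) and parses its content.xml entry.
    static List<String> notesFromOdp(String odpPath) throws Exception {
        try (ZipFile zip = new ZipFile(odpPath)) {
            ZipEntry entry = zip.getEntry("content.xml");
            if (entry == null) {
                throw new IOException("content.xml not found in " + odpPath);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return notesFromContentXml(in);
            }
        }
    }

    // Returns one string per slide; slides without notes yield "".
    static List<String> notesFromContentXml(InputStream in) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(in);

        List<String> result = new ArrayList<>();
        NodeList pages = doc.getElementsByTagNameNS(DRAW, "page");
        for (int i = 0; i < pages.getLength(); i++) {
            Element page = (Element) pages.item(i);
            NodeList notes = page.getElementsByTagNameNS(PRES, "notes");
            StringBuilder sb = new StringBuilder();
            if (notes.getLength() > 0) {
                // Join the paragraphs inside the notes block with newlines.
                NodeList paras = ((Element) notes.item(0)).getElementsByTagNameNS(TEXT, "p");
                for (int j = 0; j < paras.getLength(); j++) {
                    if (sb.length() > 0) {
                        sb.append('\n');
                    }
                    sb.append(paras.item(j).getTextContent());
                }
            }
            result.add(sb.toString());
        }
        return result;
    }
}
```

Calling OdpNotes.notesFromOdp with the path to the converted file should give a list you can loop over from CFML. The sketch keeps plain paragraph text only; tabs, line breaks and other inline markup inside the notes are not translated.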


I originally wrote that there is no way to do this directly in CF and that you would have to drop down to the underlying Java. I stand corrected: the showNotes attribute on the <cfpresentation> tag should add the notes to the HTML.

Alternatively, or if that does not work for any reason, you can use Apache POI to do this, although you may need a newer version of POI than the one shipped with your version of ColdFusion, which may require some additional work.

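    // Written against the older POI 3.x HSLF API; later POI releases renamed these classes.
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.util.LinkedList;
    import org.apache.poi.hslf.model.Notes;
    import org.apache.poi.hslf.model.Slide;
    import org.apache.poi.hslf.model.TextRun;
    import org.apache.poi.hslf.usermodel.SlideShow;

    public static LinkedList<String> getNotes(String filePath) throws IOException {
        LinkedList<String> results = new LinkedList<String>();

        // read the PowerPoint file, closing the stream even if parsing fails
        FileInputStream fis = new FileInputStream(filePath);
        SlideShow slideShow;
        try {
            slideShow = new SlideShow(fis);
        } finally {
            fis.close();
        }

        // loop over the slides
        for (Slide slide : slideShow.getSlides()) {
            // get the notes for this slide; this is null when the slide has none
            Notes notes = slide.getNotesSheet();

            // build a string with the text from all the "text runs" in the notes
            StringBuilder sb = new StringBuilder();
            if (notes != null) {
                for (TextRun textRun : notes.getTextRuns()) {
                    sb.append(textRun.getRawText());
                }
            }

            // add the resulting string (empty if there were no notes) to the results
            results.add(sb.toString());
        }
        return results;
    }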

Carrying over complex formatting (bullet points, bold, italics, links, colors, etc.) can be a problem, since you will have to dig deeper into the TextRuns and the related APIs and work out how to generate the HTML yourself.
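
The HTML-generation half of that problem can be sketched separately from POI. The following is a minimal sketch, assuming you have already copied each run's text and its bold/italic flags out of the POI objects into a small holder class. NotesHtml, StyledRun and toHtml are hypothetical names, not part of POI, and bullets, links and colors are left out.

```java
import java.util.List;

// Hypothetical helper: turns a list of styled text runs into simple HTML.
class NotesHtml {
    // Hypothetical holder for one run of text and its styling flags.
    static final class StyledRun {
        final String text;
        final boolean bold;
        final boolean italic;

        StyledRun(String text, boolean bold, boolean italic) {
            this.text = text;
            this.bold = bold;
            this.italic = italic;
        }
    }

    // Escapes the characters that would otherwise be read as markup.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Wraps each run in <b>/<i> as needed and the whole note in one <p>.
    static String toHtml(List<StyledRun> runs) {
        StringBuilder sb = new StringBuilder("<p>");
        for (StyledRun run : runs) {
            if (run.bold) sb.append("<b>");
            if (run.italic) sb.append("<i>");
            sb.append(escape(run.text));
            if (run.italic) sb.append("</i>");
            if (run.bold) sb.append("</b>");
        }
        return sb.append("</p>").toString();
    }
}
```

For example, a plain run "Hello " followed by a bold run "world" comes out as `<p>Hello <b>world</b></p>`.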


CFPRESENTATION (at least since version 9) has the showNotes attribute, but you still have to parse the output. Depending on how the output is laid out, jQuery will make short work of grabbing what you want.


I felt bad that my answer above did not work out, so I did a little digging. It is a bit dated, but it works: PPTUtils, which is based on the Apache library that @Antony suggested. I updated this function to do what you want. You may need to tweak it a bit, but I like that this utility returns the data to you as data rather than as HTML that you have to parse.

And just in case, here is a link to the POI API reference. I used the getNotes() function.

    <cffunction name="extractText" access="public" returntype="array" output="true" hint="I extract text from a PPT as an array of structs, with one array element for each slide in the PowerPoint">
        <cfargument name="pathToPPT" required="true" hint="the full path to the PowerPoint file to convert" />
        <cfset var hslf = instance.loader.create("org.apache.poi.hslf.HSLFSlideShow").init(arguments.pathToPPT) />
        <cfset var slideshow = instance.loader.create("org.apache.poi.hslf.usermodel.SlideShow").init(hslf) />
        <cfset var slides = slideshow.getSlides() />
        <cfset var notes = slideshow.getNotes() />
        <cfset var retArr = arrayNew(1) />
        <cfset var slide = structNew() />
        <cfset var i = "" />
        <cfset var j = "" />
        <cfset var k = "" />
        <cfset var thisSlide = "" />
        <cfset var thisSlideText = "" />
        <cfset var thisSlideRichText = "" />
        <cfset var rawText = "" />
        <cfset var slideText = "" />
        <cfloop from="1" to="#arrayLen(slides)#" index="i">
            <cfset slide.slideText = structNew() />
            <!--- guard the index: there may be fewer notes sheets than slides, and a sheet may have no text runs --->
            <cfif i LTE arrayLen(notes) AND arrayLen(notes[i].getTextRuns())>
                <cfset slide.notes = notes[i].getTextRuns()[1].getRawText() />
            <cfelse>
                <cfset slide.notes = "" />
            </cfif>
            <cfset thisSlide = slides[i] />
            <cfset slide.slideTitle = thisSlide.getTitle() />
            <cfset thisSlideText = thisSlide.getTextRuns() />
            <cfset slideText = "" />
            <!--- concatenate the text of every rich text run on the slide --->
            <cfloop from="1" to="#arrayLen(thisSlideText)#" index="j">
                <cfset thisSlideRichText = thisSlideText[j].getRichTextRuns() />
                <cfloop from="1" to="#arrayLen(thisSlideRichText)#" index="k">
                    <cfset rawText = thisSlideRichText[k].getText() />
                    <cfset slideText = slideText & rawText />
                </cfloop>
            </cfloop>
            <cfset slide.slideText = duplicate(slideText) />
            <cfset arrayAppend(retArr, duplicate(slide)) />
        </cfloop>
        <cfreturn retArr />
    </cffunction>

Source: https://habr.com/ru/post/1411404/

