As ever, the vanilla MI/BI tool never does quite what the business requires, so it's time to draw on those wrangling skills to get the job done.
The problem statement: convert the PDF report to a PowerPoint presentation. For guidance on PowerPoint, I turned to Microsoft wrangler-extraordinaire Mark Townsend, currently crushing out solutions for Deutsche Asset Management!
Just a few catches: the content holds a 'RESTRICTED' classification (it cannot be shared externally), the converted content MUST be non-editable on the slide, and only software on the client's approved list can be used to get the job done.
Can we run it on our own computers too...
OK!
After hassling a few SMEs internally, it turns out we have Nuance (Power PDF) and Adobe Acrobat Standard to play with.
Mark quickly turned around this outline solution using the Office interop libraries, a great foundation for the final deliverable.
using System;
using Microsoft.Office.Interop.PowerPoint;

namespace ConsoleApplication4
{
    class Program
    {
        static void Main(string[] args)
        {
            // Start PowerPoint and open an existing presentation.
            var ap = new Application();
            var pp = ap.Presentations.Open(@"\\PathTo\MyDummyPresentation.pptx");
            var sl = pp.Slides[1];
            // Embed the picture (not linked, saved with the document) at position 100, 100.
            var sp = sl.Shapes.AddPicture(@"\\PathTo\MyPicture.jpg", Microsoft.Office.Core.MsoTriState.msoFalse, Microsoft.Office.Core.MsoTriState.msoTrue, 100, 100);
            pp.SaveAs(@"\\PathTo\MyNewPresentation.pptx");
            pp.Close();
            // there’s some clean up missing here. COM cleanup is tricky so if you use this let me know and I’ll spend a bit of time getting the cleanup correct
        }
    }
}
After a little more wrangling with the two PDF tools, we ended up with the two solutions below.
1. Adobe Acrobat Standard (v11.0).
function Convert-PDF
{
<#
.SYNOPSIS
Convert PDF file(s) into PowerPoint presentation(s) containing non-editable graphics on each slide.
.DESCRIPTION
Function converts PDF file(s) into PowerPoint presentation(s).
Function takes an array of files, testing each file is a PDF type, then converts each PDF to PowerPoint. This is
managed by converting each page of the PDF to a graphic file; content is non-editable. For each graphic, a new slide
is created in the PowerPoint presentation and the graphic is added.
The new PowerPoint presentation(s) is saved to the same location as the PDF.
.NOTES
File Name : Convert-PDF.ps1
Author : Justin Townsend
Create Date : 17/05/2017
Purpose / Change : Initial version
Prerequisite : Acrobat Standard (v11.0)
.LINK
https://link-to-help-file.com
.EXAMPLE
Convert-PDF -pdfs "C:\test.pdf"
.EXAMPLE
Convert-PDF -pdfs "C:\test_1.pdf", "C:\test_2.pdf" -InfoClass "RESTRICTED"
.PARAMETER pdfs
PDF file(s) for processing (accepts array).
.PARAMETER InfoClass
Information Classification, used for marking sensitive content.
#>
[cmdletbinding()]
param ([Parameter(Mandatory=$true,
Position=0,
HelpMessage='Specify the location(s) of PDF files.',
ValueFromPipeline=$true)]
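# NOTE: the original condition was lost from this listing; the test below (path exists and ends in .pdf) is an assumed reconstruction.
[ValidateScript({ foreach ($pdf in $_) { if (-not (Test-Path -LiteralPath $pdf -PathType Leaf) -or [IO.Path]::GetExtension($pdf) -ne '.pdf') { throw "$($pdf) is an invalid PDF file!" } } return $true })]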
[string[]] $pdfs
,
[Parameter(Position=1,
HelpMessage='Information Classification.',
ValueFromPipeLine=$true)]
[ValidateSet("HIGHLY RESTRICTED","RESTRICTED","INTERNAL","PUBLIC","Not Applicable")]
[string] $InfoClass = "RESTRICTED"
)
:pdfloop foreach ($pdf in $pdfs)
{
$pdf = get-childitem $pdf
$out_dir = $pdf.DirectoryName
$out_dir = $out_dir + "\" + $pdf.Basename
$out_dir += "_PROC"
$out_file = $out_dir + "\" + $pdf.Basename
new-item $out_dir -type directory -force
# Adobe Acrobat Standard (convert to graphic files)
$adobeApp = New-Object -ComObject AcroExch.AVDoc;
$adobeApp.Open($pdf.Fullname, "") | Out-Null;
$pdfDoc = $adobeApp.GetPDDoc();
$pdfJSObject = $pdfDoc.GetJSObject();
$TypeExt="jpeg";
$closeDocParam = $true;
$T = $pdfJSObject.GetType();
$T.InvokeMember("SaveAs",
[Reflection.BindingFlags]::InvokeMethod -bor `
[Reflection.BindingFlags]::Public -bor `
[Reflection.BindingFlags]::Instance,
$null,
$pdfJSObject,
@([IO.Path]::ChangeExtension($out_file, $TypeExt), ("com.adobe.acrobat."+$TypeExt)));
$T.InvokeMember("closeDoc",
[Reflection.BindingFlags]::InvokeMethod -bor `
[Reflection.BindingFlags]::Public -bor `
[Reflection.BindingFlags]::Instance,
$null,
$pdfJSObject,
$closeDocParam) | Out-Null;
$pdfDoc.Close() | Out-Null;
$adobeApp.Close(1) | Out-Null;
# Microsoft PowerPoint creation
Add-type -AssemblyName Office;
Add-Type -AssemblyName Microsoft.Office.Interop.PowerPoint;
$msoappPPT = New-Object -ComObject powerpoint.application;
$msoappPPT.visible = [Microsoft.Office.Core.MsoTriState]::msoTrue;
$slideType = "microsoft.office.interop.powerpoint.ppSlideLayout" -as [type];
$slideSize = "microsoft.office.interop.powerpoint.ppSlideSizeType" -as [type];
$msoSendToBack = 1;
$out_ppt = $pdf.DirectoryName + "\" + $pdf.Basename
$pptPres = $msoappPPT.Presentations
$pptPres = $pptPres.add()
$pptPres.PageSetup.slideSize = $slideSize::ppSlideSizeA4Paper;
get-childitem -path $out_dir | sort-object -Property CreationTime | ForEach-Object { `
$pic = $_.fullname
$add_slide = $pptPres.Slides.Add($pptPres.Slides.Count + 1, 15);
$add_slide.layout = $slideType::ppLayoutBlank;
$add_slide.HeadersFooters.Footer.Visible = [Microsoft.Office.Core.MsoTriState]::msoTrue;
$add_slide.HeadersFooters.Footer.text = $InfoClass;
$add_slide.Shapes.Range("Footer Placeholder 2").Left = -100;
$shape = $add_slide.Shapes.AddPicture($pic, $false, $true, 0, 0, -1, -1);
$shape.ZOrder($msoSendToBack);
}
$pptPres.SaveAs($out_ppt)
$pptPres.Close()
$msoappPPT.quit()
$msoappPPT = $null;
# Remove this PDF's temporary image folder before moving on to the next file.
Remove-Item $out_dir -recurse
}
}
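The help text promises that each input is tested as a PDF before conversion. As a rough sketch of what a stricter test could look like, the helper below checks the extension and then looks for the "%PDF-" signature at the start of the file. Test-PdfFile is a made-up name, not part of the original script, and the check itself is an assumption about what "a PDF type" should mean here.

```powershell
# Hypothetical helper (not part of the original script): returns $true only when
# the path is an existing file with a .pdf extension whose first bytes are "%PDF-".
function Test-PdfFile
{
    param ([Parameter(Mandatory=$true)] [string] $Path)

    if (-not (Test-Path -LiteralPath $Path -PathType Leaf)) { return $false }
    if ([IO.Path]::GetExtension($Path) -ne '.pdf') { return $false }

    # Read the first five bytes and compare them with the PDF signature.
    $stream = [IO.File]::OpenRead((Resolve-Path -LiteralPath $Path).ProviderPath)
    try
    {
        $buffer = New-Object byte[] 5
        $read = $stream.Read($buffer, 0, 5)
        if ($read -lt 5) { return $false }
        return ([Text.Encoding]::ASCII.GetString($buffer) -eq '%PDF-')
    }
    finally
    {
        $stream.Dispose()
    }
}
```

Something like Test-PdfFile -Path "C:\test.pdf" could then be called from inside the ValidateScript block, throwing when it returns $false.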
2. Nuance Power PDF Advanced (v1.2).
function Convert-PDF
{
<#
.SYNOPSIS
Convert PDF file(s) into PowerPoint presentation(s) containing non-editable graphics on each slide.
.DESCRIPTION
Function converts PDF file(s) into PowerPoint presentation(s).
Function takes an array of files, testing each file is a PDF type, then converts each PDF to PowerPoint. This is
managed by converting each page of the PDF to a graphic file; content is non-editable. For each graphic, a new slide
is created in the PowerPoint presentation and the graphic is added.
The new PowerPoint presentation(s) is saved to the same location as the PDF.
.NOTES
File Name : Convert-PDF.ps1
Author : Justin Townsend
Create Date : 17/05/2017
Purpose / Change : Initial version
Prerequisite : Nuance Power PDF Advanced (v1.2)
.LINK
https://link-to-help-file.com
.EXAMPLE
Convert-PDF -pdfs "C:\test.pdf"
.EXAMPLE
Convert-PDF -pdfs "C:\test_1.pdf", "C:\test_2.pdf" -InfoClass "RESTRICTED"
.PARAMETER pdfs
PDF file(s) for processing (accepts array).
.PARAMETER InfoClass
Information Classification, used for marking sensitive content.
#>
[cmdletbinding()]
param ([Parameter(Mandatory=$true,
Position=0,
HelpMessage='Specify the location(s) of PDF files.',
ValueFromPipeline=$true)]
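# NOTE: the original condition was lost from this listing; the test below (path exists and ends in .pdf) is an assumed reconstruction.
[ValidateScript({ foreach ($pdf in $_) { if (-not (Test-Path -LiteralPath $pdf -PathType Leaf) -or [IO.Path]::GetExtension($pdf) -ne '.pdf') { throw "$($pdf) is an invalid PDF file!" } } return $true })]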
[string[]] $pdfs
,
[Parameter(Position=1,
HelpMessage='Information Classification.',
ValueFromPipeLine=$true)]
[ValidateSet("HIGHLY RESTRICTED","RESTRICTED","INTERNAL","PUBLIC","Not Applicable")]
[string] $InfoClass = "RESTRICTED"
)
:pdfloop foreach ($pdf in $pdfs)
{
# Nuance batch conversion
$pdf = get-childitem $pdf
$outExt= "jpg"
$out_dir = $pdf.DirectoryName
$out_dir = $out_dir + "\" + $pdf.Basename
$out_dir += "_PROC"
$out_file = $out_dir + "\" + $pdf.Basename + "." + $outExt
new-item $out_dir -type directory -force
& "C:\Program Files\Nuance\Power PDF\batchconverter" -I"$pdf" -O"$out_file" -TTIF -CcJpegMax -Q
# Microsoft PowerPoint creation
Add-type -AssemblyName Office;
Add-Type -AssemblyName Microsoft.Office.Interop.PowerPoint;
$msoappPPT = New-Object -ComObject powerpoint.application;
$msoappPPT.visible = [Microsoft.Office.Core.MsoTriState]::msoTrue;
$slideType = "microsoft.office.interop.powerpoint.ppSlideLayout" -as [type];
$slideSize = "microsoft.office.interop.powerpoint.ppSlideSizeType" -as [type];
$msoSendToBack = 1;
$out_ppt = $pdf.DirectoryName + "\" + $pdf.Basename
$pptPres = $msoappPPT.Presentations
$pptPres = $pptPres.add()
$pptPres.PageSetup.slideSize = $slideSize::ppSlideSizeA4Paper;
get-childitem -path $out_dir | sort-object -Property CreationTime | ForEach-Object { `
$pic = $_.fullname
$add_slide = $pptPres.Slides.Add($pptPres.Slides.Count + 1, 15);
$add_slide.layout = $slideType::ppLayoutBlank;
$add_slide.HeadersFooters.Footer.Visible = [Microsoft.Office.Core.MsoTriState]::msoTrue;
$add_slide.HeadersFooters.Footer.text = $InfoClass;
$add_slide.Shapes.Range("Footer Placeholder 2").Left = -100;
$shape = $add_slide.Shapes.AddPicture($pic, $false, $true, 0, 0, -1, -1);
$shape.ZOrder($msoSendToBack);
}
$pptPres.SaveAs($out_ppt)
$pptPres.Close()
$msoappPPT.quit()
$msoappPPT = $null;
# Remove this PDF's temporary image folder before moving on to the next file.
Remove-Item $out_dir -recurse
}
}
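Both functions finish by calling quit() and setting $msoappPPT to $null, and Mark's outline already warned that COM cleanup is tricky. A minimal sketch of a more explicit cleanup is below. Remove-ComReference is a hypothetical helper name, and I have not verified that this alone is enough to make PowerPoint exit in every case, so treat it as a starting point.

```powershell
# Hypothetical helper (not in the original scripts): releases COM references
# and skips anything that is $null or not a COM object.
function Remove-ComReference
{
    param ([object[]] $InputObject)

    $released = 0
    foreach ($obj in $InputObject)
    {
        if ($null -ne $obj -and [System.Runtime.InteropServices.Marshal]::IsComObject($obj))
        {
            [void][System.Runtime.InteropServices.Marshal]::FinalReleaseComObject($obj)
            $released++
        }
    }

    # Encourage collection of any remaining runtime callable wrappers.
    [GC]::Collect()
    [GC]::WaitForPendingFinalizers()

    # Report how many references were actually released.
    return $released
}
```

In the functions above it could be called straight after $msoappPPT.quit(), for example Remove-ComReference -InputObject $pptPres, $msoappPPT | Out-Null.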
If the sequence of the files is important to you, try not to rely on the standard naming convention of the output. As per the requirement, we've ensured the sequence is correct by sorting the output.
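To show why the file names alone can mislead, here is a small demo. The file names and timestamps are made up for illustration and are not what either PDF tool necessarily produces. Setting CreationTime by hand like this assumes a Windows file system.

```powershell
# Illustration only: the file names and timestamps below are made up for the demo.
$demo_dir = Join-Path ([IO.Path]::GetTempPath()) ("sortdemo_" + [Guid]::NewGuid().ToString("N"))
New-Item $demo_dir -ItemType Directory -Force | Out-Null

# Create twelve empty "page" files and give each a creation time one second apart.
$base = (Get-Date).AddMinutes(-5)
foreach ($i in 1..12)
{
    $file = New-Item (Join-Path $demo_dir "report_Page_${i}.jpg") -ItemType File
    $file.CreationTime = $base.AddSeconds($i)
}

# Sorting by name compares text, so "report_Page_10" lands ahead of "report_Page_2".
$by_name = Get-ChildItem -Path $demo_dir | Sort-Object -Property Name | ForEach-Object { $_.BaseName }

# Sorting by creation time follows the order in which the pages were written.
$by_time = Get-ChildItem -Path $demo_dir | Sort-Object -Property CreationTime | ForEach-Object { $_.BaseName }

"By name: " + ($by_name -join ", ")
"By time: " + ($by_time -join ", ")

# Tidy up the demo folder.
Remove-Item $demo_dir -Recurse
```

The functions above take the creation-time route: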
get-childitem -path $out_dir | sort-object -Property CreationTime
Hope you find this useful. You can always get in touch.
