<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Upthrust &#187; pyPdf</title>
	<atom:link href="http://blog.mpathirage.com/tag/pypdf/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.mpathirage.com</link>
	<description>This is the weblog of Milinda Pathirage</description>
	<lastBuildDate>Mon, 29 Aug 2011 02:00:01 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Merging PDF Files In Linux Using PyPDF</title>
		<link>http://blog.mpathirage.com/2010/01/16/merging-pdf-files-in-linux-using-pypdf/</link>
		<comments>http://blog.mpathirage.com/2010/01/16/merging-pdf-files-in-linux-using-pypdf/#comments</comments>
		<pubDate>Sat, 16 Jan 2010 16:26:58 +0000</pubDate>
		<dc:creator>Milinda Lakmal</dc:creator>
				<category><![CDATA[Personal]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[pdf]]></category>
		<category><![CDATA[pyPdf]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[ubuntu]]></category>

		<guid isPermaLink="false">http://blog.mpathirage.com/?p=263</guid>
		<description><![CDATA[PyPDF is a handy and valuable Python library for merging and splitting PDF files in Linux. It&#8217;s a pure Python library built as a PDF toolkit. It is capable of: extracting document information (title, author, &#8230;), splitting documents page by page, merging documents page by page, cropping pages, merging multiple pages into a single page, encrypting [...]]]></description>
			<content:encoded><![CDATA[<p><strong><a rel="nofollow" href="http://pybrary.net/pyPdf/" target="_blank">PyPDF</a></strong> is a handy and valuable Python library for merging and splitting PDF files in Linux. It&#8217;s a pure Python library built as a PDF toolkit. It is capable of:</p>
<ul>
<li>extracting document information (title, author, &#8230;),</li>
<li>splitting documents page by page,</li>
<li>merging documents page by page,</li>
<li>cropping pages,</li>
<li>merging multiple pages into a single page,</li>
<li>encrypting and decrypting PDF files.</li>
</ul>
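<p>As a rough illustration of the &#8220;splitting documents page by page&#8221; item, here is a minimal sketch. It relies only on the <code>pages</code>, <code>addPage</code> and <code>write</code> members that the merge script in this post also uses. The function names <code>split_pages</code> and <code>split_pdf</code> and the <code>page-%d.pdf</code> naming pattern are assumptions made for this example, not part of PyPDF.</p>

```python
def split_pages(reader, make_writer, name_pattern):
    """Write every page of `reader` to its own file and return the file names."""
    written = []
    for number, page in enumerate(reader.pages, 1):
        writer = make_writer()          # one fresh writer per output file
        writer.addPage(page)
        name = name_pattern % number    # "page-%d.pdf" gives "page-1.pdf", ...
        stream = open(name, "wb")
        try:
            writer.write(stream)
        finally:
            stream.close()
        written.append(name)
    return written


def split_pdf(path, name_pattern="page-%d.pdf"):
    """Split the PDF at `path` with pyPdf (hypothetical helper for this sketch)."""
    # Imported lazily so that split_pages can be tried without pyPdf installed.
    from pyPdf import PdfFileReader, PdfFileWriter
    source = open(path, "rb")
    try:
        return split_pages(PdfFileReader(source), PdfFileWriter, name_pattern)
    finally:
        source.close()
```

<p>The source file stays open until every page has been written, because the reader may fetch page data from it on demand. Later forks of the library have renamed parts of this API, so treat the calls above as specific to the pyPdf release discussed here.</p>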
<p>PyPDF is a great Python library used by many Python applications that handle PDF files directly. <strong>PDF-Shuffler</strong> is one of the tools built on PyPDF, and you can use it to merge PDF files easily in Linux. In Ubuntu you can install it using the following command.</p>
<div class="codecolorer-container bash default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #c20cb9; font-weight: bold;">sudo</span> <span style="color: #c20cb9; font-weight: bold;">apt-get install</span> pdfshuffler</div></div>
<p>Here is sample code that merges PDF files using the PyPDF library. It uses the PdfFileWriter and PdfFileReader classes from the pyPdf module to read each input file and append its pages to a single output. This sample doesn&#8217;t contain complete error handling for file operations.</p>
<div class="codecolorer-container python default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;height:350px;"><div class="python codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #808080; font-style: italic;"># Copyright (C) 2010 Milinda Pathirage</span><br />
<br />
<span style="color: #808080; font-style: italic;"># Licensed under the Apache License, Version 2.0 (the &quot;License&quot;);</span><br />
<span style="color: #808080; font-style: italic;"># you may not use this file except in compliance with the License.</span><br />
<span style="color: #808080; font-style: italic;"># You may obtain a copy of the License at</span><br />
<span style="color: #808080; font-style: italic;">#</span><br />
<span style="color: #808080; font-style: italic;"># &nbsp; &nbsp;http://www.apache.org/licenses/LICENSE-2.0</span><br />
<span style="color: #808080; font-style: italic;">#</span><br />
<span style="color: #808080; font-style: italic;"># Unless required by applicable law or agreed to in writing, software</span><br />
<span style="color: #808080; font-style: italic;"># distributed under the License is distributed on an &quot;AS IS&quot; BASIS,</span><br />
<span style="color: #808080; font-style: italic;"># WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.</span><br />
<span style="color: #808080; font-style: italic;"># See the License for the specific language governing permissions and</span><br />
<span style="color: #808080; font-style: italic;"># limitations under the License.</span><br />
<br />
<span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">sys</span><br />
<span style="color: #ff7700;font-weight:bold;">from</span> pyPdf <span style="color: #ff7700;font-weight:bold;">import</span> PdfFileWriter<span style="color: #66cc66;">,</span> PdfFileReader<br />
<br />
<span style="color: #ff7700;font-weight:bold;">def</span> mergePDFFiles<span style="color: black;">&#40;</span>outputFile<span style="color: #66cc66;">,</span> filesToBeMerged<span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; output <span style="color: #66cc66;">=</span> PdfFileWriter<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; <br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span><span style="color: black;">&#40;</span><span style="color: #008000;">len</span><span style="color: black;">&#40;</span>filesToBeMerged<span style="color: black;">&#41;</span> <span style="color: #66cc66;">==</span> <span style="color: #ff4500;">0</span><span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'Empty Input File List'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">return</span><span style="color: #66cc66;">;</span><br />
&nbsp; &nbsp; <br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> inFile <span style="color: #ff7700;font-weight:bold;">in</span> filesToBeMerged:<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'Adding file'</span> + inFile + <span style="color: #483d8b;">' to the out put'</span> <br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># Read the input PDF file</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #008000;">input</span> <span style="color: #66cc66;">=</span> PdfFileReader<span style="color: black;">&#40;</span><span style="color: #008000;">file</span><span style="color: black;">&#40;</span>inFile<span style="color: #66cc66;">,</span> <span style="color: #483d8b;">&quot;rb&quot;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># Add every page in input PDF file to output</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> page <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">input</span>.<span style="color: black;">pages</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; output.<span style="color: black;">addPage</span><span style="color: black;">&#40;</span>page<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'Writing the final out put to file system'</span><br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># Out put stream for output file &nbsp; &nbsp; &nbsp; &nbsp;</span><br />
&nbsp; &nbsp; outputStream <span style="color: #66cc66;">=</span> <span style="color: #008000;">file</span><span style="color: black;">&#40;</span>outputFile<span style="color: #66cc66;">,</span> <span style="color: #483d8b;">&quot;wb&quot;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; output.<span style="color: black;">write</span><span style="color: black;">&#40;</span>outputStream<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; outputStream.<span style="color: black;">close</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; <br />
<br />
<span style="color: #ff7700;font-weight:bold;">if</span> __name__ <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'__main__'</span>:<br />
&nbsp; &nbsp; i <span style="color: #66cc66;">=</span> <span style="color: #ff4500;">0</span><br />
&nbsp; &nbsp; outputFile <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">''</span><br />
&nbsp; &nbsp; inputFiles <span style="color: #66cc66;">=</span> <span style="color: black;">&#91;</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> arg <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #dc143c;">sys</span>.<span style="color: black;">argv</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; i <span style="color: #66cc66;">=</span> i + <span style="color: #ff4500;">1</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># Getting out file</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> arg <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'-o'</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; outputFile <span style="color: #66cc66;">=</span> <span style="color: #dc143c;">sys</span>.<span style="color: black;">argv</span><span style="color: black;">&#91;</span>i<span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'Output File: '</span> + outputFile<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># Extracting Input files</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> arg <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'-i'</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># Input files run up to '-o' when it follows '-i', otherwise to the end</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #483d8b;">'-o'</span> <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #dc143c;">sys</span>.<span style="color: black;">argv</span><span style="color: black;">&#91;</span>i:<span style="color: black;">&#93;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; inputFiles <span style="color: #66cc66;">=</span> <span style="color: #dc143c;">sys</span>.<span style="color: black;">argv</span><span style="color: black;">&#91;</span>i: <span style="color: #dc143c;">sys</span>.<span style="color: black;">argv</span>.<span style="color: black;">index</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'-o'</span><span style="color: #66cc66;">,</span> i<span style="color: black;">&#41;</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; inputFiles <span style="color: #66cc66;">=</span> <span style="color: #dc143c;">sys</span>.<span style="color: black;">argv</span><span style="color: black;">&#91;</span>i:<span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'Input Files: '</span> + <span style="color: #483d8b;">&quot; &quot;</span>.<span style="color: black;">join</span><span style="color: black;">&#40;</span>inputFiles<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># Merging PDF files</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'Merging PDF Files......'</span> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<br />
&nbsp; &nbsp; mergePDFFiles<span style="color: black;">&#40;</span>outputFile<span style="color: #66cc66;">,</span> inputFiles<span style="color: black;">&#41;</span></div></div>
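<p>The script above reads the <code>-i</code> and <code>-o</code> options with a hand-written loop over <code>sys.argv</code>. As an alternative, here is a minimal sketch of the same option handling using the standard <code>argparse</code> module, which ships with Python 2.7 and later. The function name <code>parse_args</code> and the file names are assumptions made for this example, and the sketch is not the parser used in the script above.</p>

```python
import argparse


def parse_args(argv):
    """Parse the -i and -o options described in this post (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Merge PDF files into one.")
    # -i takes one or more input files and keeps them in the order given.
    parser.add_argument("-i", dest="input_files", nargs="+", required=True,
                        metavar="INPUT", help="PDF files to merge, in order")
    # -o takes exactly one output path.
    parser.add_argument("-o", dest="output_file", required=True,
                        metavar="OUTPUT", help="path of the merged PDF")
    return parser.parse_args(argv)


# Pass sys.argv[1:] in a real script; a fixed list keeps the sketch self-contained.
options = parse_args(["-i", "a.pdf", "b.pdf", "-o", "merged.pdf"])
```

<p>The parsed values could then be handed to the merge function as <code>mergePDFFiles(options.output_file, options.input_files)</code>. With this approach a missing <code>-i</code> or <code>-o</code> produces a usage message and a non-zero exit instead of a traceback.</p>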
<p>You can learn more about how PyPDF is used by exploring open source projects that build on it. Here are some of them.</p>
<ul>
<li><a href="http://code.google.com/p/flaxcode/" target="_blank">Flax: Fast, feature-rich, flexible enterprise search</a></li>
<li><a href="http://pybrary.net/" target="_blank">Pybrary</a></li>
<li><a href="http://code.google.com/p/xhtml2pdf-base/" target="_blank">xhtml2pdf-base</a></li>
<li><a href="http://code.google.com/p/nglib/" target="_blank">nglib</a></li>
</ul>
<p>Related Resources:</p>
<ul>
<li><a rel="nofollow" href="http://code.activestate.com/recipes/511465/" target="_blank">Pure Python PDF to text converter</a></li>
<li><a rel="nofollow" href="http://sourceforge.net/projects/pypdfgui/" target="_blank">PyPDF GUI</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blog.mpathirage.com/2010/01/16/merging-pdf-files-in-linux-using-pypdf/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

