Compressing a PDF lets you shrink the file as much as possible while maintaining the quality of the media in it. As a result, the file becomes much easier to store and share.

In this tutorial, you will learn how to compress PDF files using the PDFTron library in Python.

PDFNetPython3 is a wrapper for PDFTron SDK. With PDFTron components, you can build reliable and speedy applications that can view, create, print, edit, and annotate PDFs across various operating systems. Developers use PDFTron SDK to read, write, and edit PDF documents compatible with all published versions of the PDF specification (including the latest ISO 32000). It offers two licenses depending on whether you're developing an external/commercial product or an in-house solution. We will use the free trial version of this SDK for this tutorial.

This tutorial aims to develop a lightweight command-line utility that compresses PDF files using Python modules only, without relying on external utilities outside the Python ecosystem (e.g., Ghostscript).

Note that this tutorial only works for compressing PDF files, not arbitrary files. Compressing and archiving files in general is covered in a separate tutorial.

Read also: How to Compress Images in Python.

To get started, let's install the Python wrapper using pip:

    $ pip install PDFNetPython3==8.1.0

Open up a new Python file and import the necessary modules:

    # Import Libraries
    import os
    import sys
    from PDFNetPython3.PDFNetPython import PDFDoc, Optimizer, SDFDoc, PDFNet

Next, let's define a function that prints the file size in the appropriate format (grabbed from this tutorial):

    def get_size_format(b, factor=1024, suffix="B"):
        # Scale a byte count to a readable unit, e.g. 1253656 => '1.20MB'
        for unit in ["", "K", "M", "G", "T", "P", "E", "Z"]:
            if b < factor:
                return f"{b:.2f}{unit}{suffix}"
            b /= factor
        return f"{b:.2f}Y{suffix}"

Now let's define our core function:

    def compress_file(input_file: str, output_file: str):
        initial_size = os.path.getsize(input_file)
        # Assumed setup: initialize the SDK (demo mode without a license key)
        # and open the input document
        PDFNet.Initialize()
        doc = PDFDoc(input_file)
        # Reduce PDF size by removing redundant information and compressing data streams
        Optimizer.Optimize(doc)
        doc.Save(output_file, SDFDoc.e_linearized)
        doc.Close()
        compressed_size = os.path.getsize(output_file)
        ratio = 1 - (compressed_size / initial_size)
        summary = {
            "Input File": input_file,
            "Initial Size": get_size_format(initial_size),
            "Output File": output_file,
            "Compressed Size": get_size_format(compressed_size),
            "Compression Ratio": "{0:.3%}".format(ratio),
        }
        # Print the summary
        print("# Summary #")
        print("\n".join("{}: {}".format(i, j) for i, j in summary.items()))

This function compresses a PDF file by removing redundant information and compressing the data streams. It then prints a summary showing the compression ratio and the size of the file after compression. It takes the PDF input_file and produces the compressed PDF output_file.

Now let's define our main code:

    if __name__ == "__main__":
        # Parsing command line arguments entered by user
        input_file = sys.argv[1]
        output_file = sys.argv[2]
        compress_file(input_file, output_file)

We get the input and output files from the command-line arguments and then use our defined compress_file() function to compress the PDF file.

Let's test it out:

    $ python pdf_compressor.py bert-paper.pdf bert-paper-min.pdf

The following is the output:

    PDFNet is running in demo mode.
    # Summary #

As you can see, the new compressed PDF file is 498KB instead of 757KB.

I hope you enjoyed the tutorial and found this PDF compressor helpful for your tasks.

Here are some other related PDF tutorials:

- How to Highlight and Redact Text in PDF Files with Python.
- How to Extract Images from PDF in Python.
- How to Extract All PDF Links in Python.
- How to Extract Tables from PDF in Python.
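
The test run in this tutorial reports a 757KB file shrinking to 498KB. As a rough check, the sketch below applies the same ratio formula the tutorial uses, 1 - (compressed_size / initial_size), to those two figures. The helper name compression_ratio is hypothetical and not part of the tutorial's script, and because the sizes are rounded to whole kilobytes the result is only approximate.

```python
def compression_ratio(initial_size: float, compressed_size: float) -> float:
    """Return the fraction of the original size that was saved."""
    if initial_size <= 0:
        raise ValueError("initial_size must be positive")
    return 1 - (compressed_size / initial_size)


# Rounded sizes quoted in the tutorial's test run, in kilobytes
# (assumption: the exact byte counts are not given, so these are approximate)
ratio = compression_ratio(757, 498)
print("Compression Ratio: {:.3%}".format(ratio))
```

With these figures the saving comes to roughly a third of the original file size. The same units must be used for both arguments, since only their quotient matters.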