Assalamualaikum, and welcome back to triyonos.com for another programming tutorial. This time triyonos.com covers building a web-based PDF image extractor that uses Flask, PyMuPDF, and Pillow on the back end, with Bootstrap 4 and Lightbox on the front end. I am still using PyCharm as the IDE for developing the application.
As the title of this tutorial suggests, the web application I am going to build extracts the images embedded in a PDF file.
Here is a demo of the application:
Steps to Create the Project
-
Create New Project
Open PyCharm, then create a new project named FlaskPdfImage.
-
Install Packages
The packages that need to be installed are:
- Flask
- PyMuPDF
- Pillow
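Once the packages are installed, a quick check like the sketch below can confirm that they are importable from the project's interpreter. The helper name `missing_modules` is hypothetical and not part of the project; the only project-specific detail is that PyMuPDF is imported as `fitz` and Pillow as `PIL`, which is how app.py imports them.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Import names differ from package names: PyMuPDF -> fitz, Pillow -> PIL.
    missing = missing_modules(["flask", "fitz", "PIL"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All packages are importable.")
```

If anything is reported as missing, install it through PyCharm's package manager or with pip before continuing.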
-
Download & Extract the Supporting Files
-
The static and templates folders
Click the staticNtemplatesFiles.zip link to download the static and templates folders, which contain the project's supporting files. Then extract them into the project's root folder.
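app.py saves uploaded PDFs to static/pdfFiles/ and extracted images to static/pdfImages/. Whether the archive already contains both folders is not stated here, so the following sketch creates any that are missing under the project root. The helper name `ensure_project_folders` is hypothetical; only the two folder paths come from app.py.

```python
import os

# Folder paths taken from UPLOAD_FOLDER and IMAGE_FOLDER in app.py.
REQUIRED_FOLDERS = ("static/pdfFiles", "static/pdfImages")


def ensure_project_folders(root="."):
    """Create any missing folders under `root` and return the ones created."""
    created = []
    for relative in REQUIRED_FOLDERS:
        path = os.path.join(root, relative)
        if not os.path.isdir(path):
            os.makedirs(path)
            created.append(relative)
    return created


if __name__ == "__main__":
    # Run from the project's root folder.
    print("Created:", ensure_project_folders() or "nothing, all folders exist")
```

Running it a second time creates nothing, so it is safe to keep around as a setup check.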
-
Sample PDF files
Click the pdf_file_example.zip link to download the sample PDF files that will be uploaded through the web app.
-
Create a new Python file and name it app.py
Copy the Python code below and paste it into app.py.
from flask import Flask, flash, render_template, request, redirect, url_for
import os
from werkzeug.utils import secure_filename
import fitz  # PyMuPDF
import io
from PIL import Image

app = Flask(__name__)

UPLOAD_FOLDER = 'static/pdfFiles/'
IMAGE_FOLDER = 'static/pdfImages/'

app.secret_key = "rahasia"
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER
app.config['IMAGE_FOLDER'] = IMAGE_FOLDER
# Reject uploads larger than 16 MB.
app.config['MAX_CONTENT_LENGTH'] = 16 * 1024 * 1024

ALLOWED_EXTENSIONS = {'pdf'}


def allowed_file(filename):
    # Accept only names whose extension (after the last dot) is allowed.
    return '.' in filename and filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS


@app.route('/')
def home():
    return render_template('index.html')


@app.route('/', methods=['POST'])
def upload_image():
    if 'file' not in request.files:
        flash('No file part')
        return redirect(request.url)
    file = request.files['file']
    if file.filename == '':
        flash('No PDF file selected for uploading')
        return redirect(request.url)
    if file and allowed_file(file.filename):
        arr_img = []
        tot_img = 0
        filename = secure_filename(file.filename)
        pdf_path = os.path.join(app.config['UPLOAD_FOLDER'], filename)
        file.save(pdf_path)
        # Extracted images go into a folder named after the uploaded PDF.
        result_path = os.path.join(app.config['IMAGE_FOLDER'], filename)
        pdf_file = fitz.open(pdf_path)
        for page_index in range(len(pdf_file)):
            page = pdf_file[page_index]
            image_list = page.get_images()
            if image_list:
                print(f"[+] Found a total of {len(image_list)} images in page {page_index}")
                tot_img = tot_img + len(image_list)
            else:
                print("[!] No images found on page", page_index)
            for image_index, img in enumerate(image_list, start=1):
                # The first item of each entry is the image's xref number.
                xref = img[0]
                base_image = pdf_file.extract_image(xref)
                image_bytes = base_image["image"]
                image_ext = base_image["ext"]
                image = Image.open(io.BytesIO(image_bytes))
                os.makedirs(result_path, exist_ok=True)
                # File names follow the pattern image<page>_<index>.<ext>.
                image_name = f"image{page_index+1}_{image_index}.{image_ext}"
                arr_img.append(image_name)
                image.save(os.path.join(result_path, image_name))
        pdf_file.close()
        flash('PDF file successfully uploaded')
        return render_template('index.html', filename=filename, totimg=tot_img, arrimg=arr_img)
    else:
        flash('Only PDF files are allowed')
        return redirect(request.url)


if __name__ == "__main__":
    app.run(host='127.0.0.1', port=5000, debug=True)
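The upload route processes a file only when `allowed_file` accepts its name. The sketch below copies that check into a standalone script so its behavior can be tried on a few file names without starting Flask; the sample names are made up for illustration.

```python
ALLOWED_EXTENSIONS = {"pdf"}


def allowed_file(filename):
    """Same check as in app.py: the text after the last dot must be 'pdf'."""
    return "." in filename and filename.rsplit(".", 1)[1].lower() in ALLOWED_EXTENSIONS


if __name__ == "__main__":
    # Hypothetical file names, chosen to cover the accept and reject cases.
    for name in ["report.pdf", "SCAN.PDF", "photo.png", "report.pdf.exe", "pdf"]:
        print(f"{name}: {allowed_file(name)}")
```

The check is case-insensitive and looks only at the final extension. It does not inspect the file's contents, so a non-PDF file renamed to end in .pdf would still pass this check.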