Assalamualaikum, and welcome back to triyonos.com for another programming tutorial. This time we will build a web-based PDF image extractor using Flask, PyMuPDF, and Pillow on the back end, with Bootstrap 4 and Lightbox on the front end. As before, I am using PyCharm as the IDE for development.

How the Application Works

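As the title suggests, the web application built here extracts the images embedded in a PDF file. You upload a PDF, the app goes through it page by page, saves every image it finds, and passes the list of saved images back to the page.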


Here is a demo of the application:

[Image: application demo]

Steps to Build the Project

  1. Create a New Project

    Open PyCharm and create a new project named FlaskPdfImage.

  2. Install Packages

    The packages to install are:

    • Flask
    • PyMuPDF
    • Pillow
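The packages can be installed through PyCharm's package manager or with pip. As a quick sanity check afterwards, the minimal sketch below reports which of the three packages cannot be imported yet. The helper name `missing_packages` is hypothetical; the module names `flask`, `fitz`, and `PIL` are the ones app.py imports.

```python
from importlib.util import find_spec

# Maps the name you install to the module name you import.
# app.py imports PyMuPDF as `fitz` and Pillow as `PIL`.
REQUIRED = {"Flask": "flask", "PyMuPDF": "fitz", "Pillow": "PIL"}


def missing_packages(required=REQUIRED):
    """Return the install names whose import module cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Not installed yet:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If anything is listed as missing, install it before moving on to the next step.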

  3. Download & Extract the Supporting Files

    • The static and templates folders

      Click the staticNtemplatesFiles.zip link to download the static and templates folders, which contain the project's supporting files. Then extract the archive into the project's root folder.

    • Sample PDF files

      Click the pdf_file_example.zip link to download sample PDF files that you can upload through the web app.
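If you prefer to unpack the archive from a script rather than a file manager, a minimal sketch with the standard library's `zipfile` module could look like this. It assumes staticNtemplatesFiles.zip was downloaded into the project root; the helper name `extract_archive` is hypothetical.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, target_dir="."):
    """Extract every entry of zip_path into target_dir and return the entry names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        return archive.namelist()


if __name__ == "__main__":
    # Assumption: the archive sits next to this script, in the project root.
    archive_name = "staticNtemplatesFiles.zip"
    if Path(archive_name).exists():
        print(extract_archive(archive_name))
    else:
        print(f"{archive_name} not found, download it first.")
```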

  4. Create a New Python File Named app.py

    Copy the Python code below and paste it into app.py.

    from flask import Flask, flash, render_template, request, redirect, url_for
    import os
    from werkzeug.utils import secure_filename
    import fitz # PyMuPDF
    import io
    from PIL import Image
    
    app = Flask(__name__)
    
    UPLOAD_FOLDER = 'static/pdfFiles/'
    IMAGE_FOLDER = 'static/pdfImages/'
    
    app.secret_key = "rahasia"
    app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER
    app.config['IMAGE_FOLDER'] = IMAGE_FOLDER
    app.config['MAX_CONTENT_LENGTH'] = 16 * 1024 * 1024
    
    ALLOWED_EXTENSIONS = {'pdf'}
    
    def allowed_file(filename):
        return '.' in filename and filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS
    
    @app.route('/')
    def home():
        return render_template('index.html')
    
    @app.route('/', methods=['POST'])
    def upload_image():
        # Reject requests that carry no file field at all.
        if 'file' not in request.files:
            flash('No file part')
            return redirect(request.url)
        file = request.files['file']

        # Browsers submit an empty filename when no file was chosen.
        if file.filename == '':
            flash('No PDF file selected for uploading')
            return redirect(request.url)

        if not allowed_file(file.filename):
            flash('Only PDF files are allowed')
            return redirect(request.url)

        arr_img = []
        tot_img = 0

        # Save the upload under a sanitized name.
        filename = secure_filename(file.filename)
        os.makedirs(app.config['UPLOAD_FOLDER'], exist_ok=True)
        pdf_path = os.path.join(app.config['UPLOAD_FOLDER'], filename)
        file.save(pdf_path)

        # Extracted images go into a folder named after the uploaded PDF.
        result_path = os.path.join(app.config['IMAGE_FOLDER'], filename)
        os.makedirs(result_path, exist_ok=True)

        # The context manager closes the PDF once extraction is done.
        with fitz.open(pdf_path) as pdf_file:
            for page_index in range(len(pdf_file)):
                page = pdf_file[page_index]
                image_list = page.get_images()

                if image_list:
                    print(f"[+] Found a total of {len(image_list)} images on page {page_index + 1}")
                    tot_img += len(image_list)
                else:
                    print(f"[!] No images found on page {page_index + 1}")

                for image_index, img in enumerate(image_list, start=1):
                    xref = img[0]  # the first item of each entry is the image's xref

                    base_image = pdf_file.extract_image(xref)
                    image_bytes = base_image["image"]
                    image_ext = base_image["ext"]

                    # Load the raw bytes with Pillow and save them as page_index.ext.
                    image = Image.open(io.BytesIO(image_bytes))
                    image_name = f"image{page_index + 1}_{image_index}.{image_ext}"
                    arr_img.append(image_name)
                    image.save(os.path.join(result_path, image_name))

        flash('PDF file successfully uploaded')
        return render_template('index.html', filename=filename, totimg=tot_img, arrimg=arr_img)
    
    
    if __name__ == "__main__":
        app.run(host='127.0.0.1', port=5000, debug=True)
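To try the extraction step on its own, outside of Flask, a minimal sketch along the following lines may help. It is not the code from app.py: the function names `image_name` and `extract_images` are hypothetical, and instead of re-saving each image through Pillow it writes the bytes returned by PyMuPDF straight to disk. It uses the same `get_images()` and `extract_image()` calls as app.py and the same `image<page>_<index>.<ext>` naming.

```python
from pathlib import Path


def image_name(page_number, image_index, ext):
    """Build the file name used for an extracted image, e.g. image2_1.png."""
    return f"image{page_number}_{image_index}.{ext}"


def extract_images(pdf_path, out_dir):
    """Save every image embedded in pdf_path into out_dir; return the file names."""
    # PyMuPDF is imported here so image_name() stays usable without it installed.
    import fitz

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    with fitz.open(pdf_path) as doc:
        for page_number, page in enumerate(doc, start=1):
            for image_index, img in enumerate(page.get_images(), start=1):
                info = doc.extract_image(img[0])  # img[0] is the image's xref
                name = image_name(page_number, image_index, info["ext"])
                # Write the original bytes as-is, with no Pillow re-encoding.
                (out / name).write_bytes(info["image"])
                saved.append(name)
    return saved
```

Calling `extract_images("sample.pdf", "out")` with one of the sample PDFs from step 3 (the file name here is only a placeholder) should fill the `out` folder with the extracted images and return their names.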

For a more detailed walkthrough, watch the video from the Kode Erik YouTube channel below. If you have any questions, leave a comment on YouTube, and don't forget to subscribe: