In this tutorial, we’ll guide you through the process of extracting text from a batch of PDF invoices using FileDrop’s OCR (Optical Character Recognition) tools. Whether you have a stack of invoices, documents, or any other PDF files that need text extraction, FileDrop’s bulk text extraction with OCR feature can save you time and effort.
Let’s get started.
Batch Extract Text from Files using the FileDrop Web Platform
Step 1: Log in and click the Bulk OCR tool
Go to the Batch OCR tool on our web platform and upload your files. Here you have several options, including the language of your documents. For most users, English is the preferred language.
You can upload up to 100 files per batch. With our Business+ plan, you can extract text from up to 2000 files per month. If you need more credits, please contact us.
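If you have more than 100 files, you will need to upload them in several batches. Here is a minimal sketch of how you might plan those batches locally before uploading. The function name `plan_batches` and the `used_this_month` parameter are hypothetical and not part of FileDrop; only the two limits come from the text above.

```python
from typing import List

BATCH_LIMIT = 100      # files per batch, as stated in this tutorial
MONTHLY_LIMIT = 2000   # files per month on the Business+ plan, as stated in this tutorial


def plan_batches(files: List[str], used_this_month: int = 0) -> List[List[str]]:
    """Split a list of file names into upload-sized batches.

    Hypothetical helper: it only groups names locally and does not
    talk to FileDrop in any way.
    """
    if used_this_month < 0:
        raise ValueError("used_this_month cannot be negative")

    remaining = MONTHLY_LIMIT - used_this_month
    if len(files) > remaining:
        raise ValueError(
            f"{len(files)} files exceed the {remaining} extractions left this month"
        )

    # Slice the list into consecutive chunks of at most BATCH_LIMIT files.
    return [files[i:i + BATCH_LIMIT] for i in range(0, len(files), BATCH_LIMIT)]
```

For example, 250 invoices would be split into three batches of 100, 100, and 50 files, each of which you would upload as a separate job.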
Depending on the type of files you have, you can choose between the 2 OCR engines we offer. You can test with 1-2 images to see which engine gives the best results.
Another option is how you want the resulting files to be saved. You can choose from these formats:
- TXT
- DOCX
- XLSX
You can have the results in one file or each extraction in its own file.
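If you chose one TXT file per extraction and later want a single file after all, you can merge them on your own computer. The following is a rough sketch, not a FileDrop feature: the function name `combine_text_files`, the `combined.txt` default, the header format, and the UTF-8 encoding are all assumptions.

```python
from pathlib import Path


def combine_text_files(folder: str, output_name: str = "combined.txt") -> Path:
    """Merge every .txt file in a folder into one file, with a header per source file.

    Hypothetical helper; assumes the extracted files are UTF-8 encoded.
    """
    folder_path = Path(folder)
    output_path = folder_path / output_name

    parts = []
    for txt_file in sorted(folder_path.glob("*.txt")):
        if txt_file.name == output_name:
            continue  # do not merge a previous combined file into itself
        text = txt_file.read_text(encoding="utf-8").strip()
        parts.append(f"===== {txt_file.name} =====\n{text}\n")

    # Separate each file's section with a blank line.
    output_path.write_text("\n".join(parts), encoding="utf-8")
    return output_path
```

Files are merged in alphabetical order, so naming your invoices consistently keeps the combined file easy to scan.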
Step 2: Start the text extraction process
After uploading your files and selecting your settings, start the text extraction process by clicking the Start OCR button. Depending on the files you have, this might take anywhere from a few seconds to a few minutes. You will receive an email once the process is done.
Step 3: Download
When you receive the email, or when the job status shows as completed, you can download the files. The files are archived in ZIP format. After downloading, extract the archive on your computer.
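Most operating systems can extract a ZIP archive with a double-click, but if you process results regularly you may prefer to script it. Here is a minimal sketch using Python's standard `zipfile` module. The function name `extract_results` and any paths you pass to it are placeholders, since the actual archive name depends on your download.

```python
import zipfile
from pathlib import Path
from typing import List


def extract_results(archive: str, dest: str) -> List[str]:
    """Extract a downloaded results archive and return the names of the files inside.

    Hypothetical helper; `archive` and `dest` are whatever paths you choose.
    """
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest_dir)
        # Skip directory entries so only actual files are listed.
        return sorted(name for name in zf.namelist() if not name.endswith("/"))
```

The returned list gives you a quick way to confirm that the number of extracted files matches the number of files you uploaded.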
Batch Extract Text from Invoices using the FileDrop Google Sheets Add-on
If you don’t have it already, please install the add-on. You must also be on the Business+ plan to access this tool.
Step 1: Accessing FileDrop’s OCR Tools
First, navigate to the “Tools” menu within FileDrop. Within the “Tools” menu, locate the “Bulk OCR” option. Click on it to begin the process. You will be prompted to provide some necessary information.
Step 3: Adding Folder ID
To proceed, you need to add your folder ID. You can find your folder ID within FileDrop. Simply copy the folder ID and paste it into the designated field.
Step 4: Language Selection
Select the language of the text in your PDF files. If your language isn’t listed, choose English as a default option.
Step 5: Choosing Output Format
Decide on the output format for the extracted text. By default, the tool will extract the text and place it into a Google Sheet, with each file’s content stored in a separate cell. You can also extract the text into a single Google Doc, put each file in its own Google Sheet, or put each file in its own Google Doc.
Step 6: Start the OCR Process
After configuring the settings, initiate the OCR process by clicking the “Start” button. The tool will begin processing all the PDF files in your specified folder.
Depending on the number and size of your PDF files, the OCR process may take a few minutes. Keep in mind that there might be limits on the size and page count of individual PDFs.
Step 7: Receiving Email Confirmation and Results
Once the OCR process is completed, you will receive an email confirmation. This email will contain a link to access the resulting file.
Open the email and click the provided link to access the extracted text. Note that only PDFs and images are supported, and there could be limits on PDF page counts.
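Since only PDFs and images are supported, it can help to check what is in a folder before you run the tool. The sketch below sorts file names into supported and skipped lists. Note the assumption: this tutorial does not say which image formats count as "images", so the extension list here is a guess that you should adjust, and the function name `split_supported` is hypothetical.

```python
from pathlib import Path
from typing import Iterable, List, Tuple

# Assumption: the exact image formats are not listed in this tutorial.
SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg"}


def split_supported(names: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (supported, skipped) file names based on their extension."""
    supported, skipped = [], []
    for name in names:
        # Compare case-insensitively so "SCAN.PDF" is treated like "scan.pdf".
        if Path(name).suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(name)
        else:
            skipped.append(name)
    return supported, skipped
```

Anything that ends up in the skipped list, such as a spreadsheet or a Word document, would need to be converted to PDF or removed from the folder first.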
Conclusion
FileDrop’s bulk OCR feature is a convenient tool for extracting text from PDF files. It’s suitable for various types of PDFs and offers the flexibility to work with different languages.