How duplicate detection works
Datamolino automatically detects duplicates when the same file is uploaded more than once into the same folder. We recognise duplicates on two levels:
File duplicates - an identical file uploaded twice, e.g. the same email forwarded twice. They are caught before extraction and listed in the Import history.
Data duplicates - the same content but NOT an identical file, e.g. first uploaded as a digital file emailed by the supplier and later uploaded as a scan of the same invoice received on paper. They are detected after extraction and marked with a yellow icon inside your folder.
👉 Does duplicate detection work across folders?
No. Duplicate detection runs per folder. The same file uploaded into a different folder is treated as a new document and processed normally.
File duplicates
File duplicates are recognised immediately. With every upload, Datamolino checks whether the identical file was uploaded before. You can find file duplicates in the Import history and choose to delete or process each one. This helps if you accidentally uploaded an invoice to Datamolino twice. You only pay for a duplicate invoice if you choose to process the 'File Duplicate'.
This is how the file duplicates show in the Import history:
👉 I edited my PDF and re-uploaded it - will it be treated as a duplicate?
No. Even small changes to the file create a different fingerprint, so the edited version is treated as a new upload and processed normally.
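The fingerprint idea can be illustrated with a short Python sketch. It is not Datamolino's actual implementation: the use of SHA-256, the ImportHistory class and the folder names are all assumptions made for illustration. It shows why an identical file is caught, while an edited file or the same file in another folder is not.

```python
import hashlib


def file_fingerprint(data: bytes) -> str:
    """Hash the raw bytes of a file (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


class ImportHistory:
    """Hypothetical per-folder record of fingerprints seen so far."""

    def __init__(self) -> None:
        self._seen = {}  # folder name -> set of fingerprints

    def is_file_duplicate(self, folder: str, data: bytes) -> bool:
        fingerprint = file_fingerprint(data)
        seen_in_folder = self._seen.setdefault(folder, set())
        if fingerprint in seen_in_folder:
            return True
        # First time this exact file appears in this folder: remember it.
        seen_in_folder.add(fingerprint)
        return False


history = ImportHistory()
original = b"%PDF-1.7 invoice 2024-001"
edited = original + b" "  # even a one-byte change alters the fingerprint

first_upload = history.is_file_duplicate("Purchases", original)   # new file
second_upload = history.is_file_duplicate("Purchases", original)  # identical file
edited_upload = history.is_file_duplicate("Purchases", edited)    # edited file
other_folder = history.is_file_duplicate("Sales", original)       # different folder
```

In this sketch only second_upload is flagged, because the check is keyed on both the folder and the exact bytes of the file.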
👉 How long are file duplicates kept?
File duplicates are kept for between 46 and 76 days so you can review them before they are deleted automatically. If you want to keep one of them, open the file duplicate in the Import history and click Process anyway before it is removed. You can also delete file duplicates manually at any time from the Import history.
Data duplicates
Data duplicates are recognised after processing is finished. Although the files are visually the same, they might have been created at a different time and in a different format. For example, you upload a digital file that you received in an email from your supplier and later upload a scan of the same invoice that you received on paper by post. In such a case, we first need to process the documents and recognise their content. By comparing the invoice number and supplier name, we can tell you whether the invoice is a duplicate. However, because the file has to be processed first, you will be charged for this invoice.
Data duplicates will always show in the Needs Review tab inside the folder.
Note: If you are using our 'Auto-Export' feature, duplicates are not exported automatically and you need to resolve them manually.
Resolving a "Data duplicate detected" warning
When Datamolino flags a document as a data duplicate, the warning means a previously uploaded document in the same folder has the same supplier name and invoice number. The message always includes an "Original can be found here" link to the matching transaction.
👉 How to fix a "Data duplicate detected" warning?
If you have viewed the original and decided that you need to export the duplicate anyway, simply click Dismiss and export the document as usual.
Enable additional parameters
By default, Datamolino detects data duplicates by matching the supplier name and invoice number. If needed, you can enable two additional parameters to narrow down the duplicate detection results:
Total
Issue date
You can add one of them or both depending on your use case. In practice, this means that all three (or four) conditions need to be met in order to flag the document as a duplicate.
For example, if a supplier re-issues an invoice with different amounts and you do not want such documents to be flagged as duplicates, you can add 'Total'. As long as the amounts differ, the invoice won't be flagged as a duplicate. Or, if you upload receipts where Datamolino captures generic text instead of a specific number (receipts often have no invoice number), you can add one more parameter to recognise duplicates.
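The matching rule described above can be sketched in Python. This is only an illustration under stated assumptions: the Invoice class, the function names and the exact-match comparison are hypothetical and not Datamolino's actual code. It shows how enabling 'Total' stops a re-issued invoice with a different amount from being flagged.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Invoice:
    """Hypothetical extracted invoice data."""
    supplier_name: str
    invoice_number: str
    total: Decimal
    issue_date: date


def duplicate_key(invoice, match_total=False, match_issue_date=False):
    """Build the tuple of fields that must all match to flag a duplicate."""
    key = [invoice.supplier_name, invoice.invoice_number]  # default parameters
    if match_total:
        key.append(invoice.total)
    if match_issue_date:
        key.append(invoice.issue_date)
    return tuple(key)


def is_data_duplicate(candidate, existing, **options):
    """True if any document already in the folder has the same key."""
    wanted = duplicate_key(candidate, **options)
    return any(duplicate_key(doc, **options) == wanted for doc in existing)


original = Invoice("Acme Ltd", "INV-1001", Decimal("100.00"), date(2024, 3, 1))
reissued = Invoice("Acme Ltd", "INV-1001", Decimal("120.00"), date(2024, 3, 1))

# Default setting: supplier name and invoice number match, so it is flagged.
default_result = is_data_duplicate(reissued, [original])

# With 'Total' enabled, the different amount prevents the flag.
with_total = is_data_duplicate(reissued, [original], match_total=True)
```

Every enabled parameter joins the comparison key, so adding parameters can only make detection stricter, never looser.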
To set it up, follow these steps:
Go to Folder menu > Accounting and Automation
Go to Workflow > Invoice Data Duplicate Detection
Add one or both extra parameters and click Save.






