Question 1

How do I train InvoiceNet on my own invoice data?

Accepted Answer

Run trainer.py to open the Trainer GUI, set the data folder to the directory that holds your PDF invoices and their JSON label files, then click Prepare Data followed by Start. Alternatively, use the CLI: run prepare_data.py and then train.py, as detailed in the README.
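Before clicking Prepare Data, it can help to confirm that every PDF has a label file. The following is a minimal sketch, not part of InvoiceNet itself: it assumes each invoice PDF is paired with a JSON file of the same base name in the same folder, and the helper name find_unlabeled_invoices is hypothetical.

```python
from pathlib import Path


def find_unlabeled_invoices(data_dir):
    """Return the sorted names of PDFs in data_dir with no same-named .json file.

    Assumption: labels live next to the PDFs and share their base name,
    e.g. invoice1.pdf and invoice1.json.
    """
    folder = Path(data_dir)
    missing = []
    for pdf in folder.glob("*.pdf"):
        # with_suffix swaps ".pdf" for ".json" while keeping the same stem
        if not pdf.with_suffix(".json").is_file():
            missing.append(pdf.name)
    return sorted(missing)
```

Calling find_unlabeled_invoices("train_data") returns an empty list when every PDF in that (example) folder has a label file.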

Question 2

Should I use InvoiceNet or a commercial service like Amazon Textract for invoice extraction?

Accepted Answer

InvoiceNet is open source and can be customized and run on-premises, which keeps invoice data private, but you have to train it on your own data first. Commercial services offer pre-trained models that work out of the box, at the price of less control and ongoing usage costs.

Question 3

Can InvoiceNet extract data from scanned JPEG invoices?

Accepted Answer

Yes. It reads JPG and PNG images via OCR, but for optimal accuracy you may need to configure the OCR engine in the code, as noted in the predict.py section.

Question 4

How do I add a custom field, such as a purchase order number, in InvoiceNet?

Accepted Answer

Edit invoicenet/__init__.py to add the field under the appropriate type (e.g., general or optional), then retrain the model using the GUI or CLI to incorporate it.
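As a rough illustration of what such an edit amounts to, the sketch below uses a stand-in registry rather than the real module. The names FIELD_TYPES and FIELDS, the numeric codes, the example field keys, and the register_field helper are all assumptions made for this example; check invoicenet/__init__.py in your checkout for the actual structure.

```python
# Stand-in for the field registry in invoicenet/__init__.py.
# FIELD_TYPES, FIELDS and the numeric codes are assumptions for illustration.
FIELD_TYPES = {"general": 0, "optional": 1, "amount": 2, "date": 3}

FIELDS = {
    "vendorname": FIELD_TYPES["general"],
    "invoicedate": FIELD_TYPES["date"],
}


def register_field(name, field_type):
    """Add a field under one of the known types (hypothetical helper)."""
    if field_type not in FIELD_TYPES:
        raise ValueError(f"unknown field type: {field_type!r}")
    if name in FIELDS:
        raise ValueError(f"field already defined: {name!r}")
    FIELDS[name] = FIELD_TYPES[field_type]


# A purchase order number may be absent from some invoices, so "optional"
# is used here; the key "ponumber" is an example name.
register_field("ponumber", "optional")
```

Whatever key you choose would then also need to appear in your JSON labels before retraining.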

Question 5

What are the system requirements for running InvoiceNet on Ubuntu?

Accepted Answer

It has been tested on Ubuntu 20.04 with CUDA 11.8, cuDNN 8.9.7, and TensorFlow 2.13.1. The install.sh script sets up a virtual environment for you. A GPU is recommended for training.
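To compare a machine against these versions, a small check like the one below can be run inside the virtual environment. It is a sketch, not part of InvoiceNet: describe_environment is a hypothetical helper, and it only reports what TensorFlow itself can see.

```python
import platform


def describe_environment():
    """Report the Python version, TensorFlow version, and visible GPUs.

    "tensorflow" stays None and "gpus" stays empty if TensorFlow
    is not installed in the current environment.
    """
    info = {"python": platform.python_version(), "tensorflow": None, "gpus": []}
    try:
        import tensorflow as tf
    except ImportError:
        return info
    info["tensorflow"] = tf.__version__
    # An empty list means TensorFlow found no usable GPU and will run on the CPU
    info["gpus"] = [device.name for device in tf.config.list_physical_devices("GPU")]
    return info


if __name__ == "__main__":
    print(describe_environment())
```

An empty "gpus" list alongside a TensorFlow version usually points to a CUDA or cuDNN mismatch rather than a problem with InvoiceNet.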

Question 6

Does InvoiceNet work without an internet connection?

Accepted Answer

Yes. It is self-hosted and works offline once installed: training and extraction run on the local machine without external API calls. Only the initial setup needs a connection, to download dependencies.
