253 questions
0
votes
0
answers
143
views
Document AI importDocument returns error code 13 when using REST API
I’m currently having an issue with my code: I’m using the REST API to train a Document AI model with the custom extraction type.
I have already completed the following steps:
Called the v1 process ...
1
vote
1
answer
143
views
Why do I only get a RESOURCE_EXHAUSTED error when using the Document AI client library with C++? [duplicate]
I am testing out Document AI's Expense processor using the provided Google Cloud client libraries in a few different languages (JavaScript, Python, and C++). I'm testing with a set of JPEG images ...
0
votes
2
answers
325
views
Google Document AI batch processing failing
Following Google's documentation, I am trying to perform a Document AI OCR batch request (async), and I constantly receive an error.
I tried both with gcs_input_uri and gcs_input_prefix.
I can not ...
1
vote
1
answer
142
views
Error 403, when sending a request to Google Document AI API
I get this error in my log-file each time I am trying to send a request to Google Document AI API:
403 Client Error: Forbidden for url: https://us-documentai.googleapis.com/v1/projects/230636727467/...
-1
votes
1
answer
108
views
DocumentAI detect if image contains non-text visual elements in it
Most of my target images contain only text elements, which is expected, since my main purpose is to extract text from them. But some of the target images contain non-text visual elements (actual ...
4
votes
1
answer
152
views
With Custom Extractor, Python API view of schema does not provide access to EntityTypes; it should according to docs
The API documentation shows that the DocumentSchema has EntityType children which should contain details of all fields in a Custom Extractor. I am able to obtain the DocumentSchema as expected. ...
0
votes
0
answers
2k
views
Document AI - Processor location issue [duplicate]
I'm using a Mac and I have created a simple Document AI processor on the Google Cloud Platform (PDF splitter). This processor was trained, tested and deployed.
I'm now desperately trying to make use ...
0
votes
0
answers
130
views
ProcessDocument API Errors - No remaining quota for ParseDocument
As part of our workflow we invoke the Document AI ProcessDocument API (v1) from our back end; the code has been in place for over 6 months and running without any errors. In the past week we ...
1
vote
0
answers
47
views
custom classifier/splitter dataset test limit
I am currently working on a project that utilizes the docai custom classifier. I have a question regarding the test dataset size limitations.
As I understand, the current limit for the test dataset ...
0
votes
1
answer
929
views
Getting an error when I am trying to use pre-built contract model on AI Document Intelligence Studio. Error code in the body
I was trying to analyze a contract using Microsoft's Document Intelligence Studio. All the pre-built models are working except for the contract pre-built model. I am getting error code:
"...
0
votes
1
answer
747
views
Document AI - Multi-page files performance affect
I’ve noticed that it’s possible to upload multi-page files to Document AI, such that all pages are connected to each other by being associated with the same file.
My use case is invoice files that I ...
0
votes
1
answer
595
views
Auto-Labeling in Document AI with Custom Extractor: Schema Requirement Issue
I am using Document AI with a Custom Extractor. When I create a new Custom Extractor, it offers to manage my dataset.
I expect that doing so will automatically create label names for the documents I ...
0
votes
1
answer
458
views
Google Document AI create labeling instruction
https://cloud.google.com/document-ai/docs/workbench/label-documents#labeling
For Google Document AI, what exactly is a labeling instruction? Is it a PDF where every label is annotated using a box? If ...
0
votes
1
answer
195
views
Document AI adding folders
I'm using Document AI to parse PDF files from one bucket and then save them as JSON in another bucket in GCS. However, Document AI creates a folder with a subfolder in my bucket.
I've read a lot and I ...
0
votes
1
answer
107
views
Does the `Number` type in Google Document AI include decimals?
I've been using the document AI tool for a while and have quite a few documents labeled and just thought of a question: does the Number field type allow for decimals (ex: 0.3456) or does it only allow ...
0
votes
1
answer
126
views
GCP API for AI Documents
I'm having issues with the API: there is no response whatsoever. I have created the service account with the corresponding API key with its JSON file, however, I cannot seem to get any response when ...
0
votes
0
answers
161
views
How can I tell Google Document AI Enterprise OCR to always assume one column?
How can I tell Google Document AI Enterprise OCR to always assume one column?
My texts (scans of old books) are always one column. However, due to layout, (lots of) whitespace, and inline figures, ...
1
vote
1
answer
653
views
How can I use Google Document AI OCR to find the non-text images in a text document?
How can I use Google Document AI OCR to find the non-text images in a text document?
I'm using Google Document AI Enterprise OCR to OCR images (scans of old books), and it works well. The books have ...
0
votes
0
answers
146
views
Will adjusting the value acquired from bounding box annotation train the model to be able to make inferences?
This may be a silly question but I've been annotating quite a few documents with the Google Document AI tool and have had this worry in the back of my mind. My task is to use Doc AI to extract ...
0
votes
0
answers
107
views
Line Ordering Issue with Arabic PDF Text Using Google Cloud Document AI
I have an app that uses Document AI to process PDFs and extract text from them. I use the stable version, but it is still not accurate. The processed text seems to have its lines mixed up, not ...
1
vote
1
answer
294
views
Response from Document AI stored in Google Cloud Storage
I am using a GCP workflow and an Eventarc trigger connected to Cloud Storage to have a document evaluated by Document AI when the Cloud Storage bucket receives it. The issue I'm encountering is, whenever ...
1
vote
1
answer
268
views
Reskewing GCP Document AI Result
GCP's Document AI is pre-processing images to remove things like skew. The bounding boxes it produces correspond to the pre-processed image, not the image sent to the API. I need to reskew them so ...
0
votes
1
answer
513
views
Document AI batch processing timeout using Java
I am trying to batch process a set of documents using Document AI and its Java SDK. My code is derived from the batch processing example for Java (seen here), but I have modified it to add more than ...
0
votes
0
answers
82
views
Impact of Using PDF Training Data and JPG Test Data on Document AI Model Performance
I'm currently working on a document AI project (with Custom Extractor) and have encountered a scenario that I'm unsure how to navigate. My training dataset of Shipping instruction documents consists ...
1
vote
1
answer
680
views
Document AI "400 No valid schema provided for processing" with Cloud Function
I’ve been experiencing an issue with the Google Cloud Document AI API in my Firebase Cloud Function that handles documents uploaded to Google Cloud Storage. The function triggers correctly upon PDF ...
1
vote
0
answers
370
views
(Terraform) BigQuery Job misses IAM permissions, which have been granted
I read this blog post about the recently published Document AI - BigQuery Integration. I want to configure this setup completely using Terraform.
An important step in the blog post is the configuration ...
1
vote
0
answers
185
views
No valid schema provided for processing
I'm trying to make a Node.js project to extract data from invoices using the Cloud Document AI API. I have copied the code provided in the Google docs as follows:
/**
* TODO(developer): Uncomment these ...
0
votes
1
answer
126
views
Cannot Import Google Cloud
I am following Google's official tutorial on setting up Document AI: https://cloud.google.com/document-ai/docs/libraries#client-libraries-install-java
My POM file:
<project>
<...
1
vote
1
answer
287
views
Using Batch Processing Document AI inside the google cloud function
I have a scenario where I am uploading a local file to a Cloud Storage bucket, triggering a Cloud Function (xyz). Within this Cloud Function, I am performing a batch processing task using Google Cloud ...
0
votes
1
answer
172
views
Google Cloud Document AI namespace issue
I'm currently working on a project that involves using the Google Cloud Document AI Client Library in my PHP application.
However, I've encountered an issue with the library's namespace that's been ...
0
votes
1
answer
405
views
Document AI forms processor tags multiple lines as 1 line
I'm trying to use the Document AI forms processor to get the rows of a table. When I upload a document, the forms processor does not get each line separately. It combines multiple lines into a single ...
2
votes
0
answers
123
views
Document AI Custom Processor - annotating across pages
I am currently building a custom OCR extractor with Google's Document AI. My documents are usually around 8-14 pages long and I have created a schema across all possible pages. Using the defined ...
0
votes
2
answers
721
views
Google Document AI Custom Extractor type object 'Property' has no attribute 'OccurenceType'
I have a Google Document AI Custom Extractor model trained and it works great when I test it in the cloud console, but I'm struggling to get a sample Python program to work.
I've taken this sample code ...
0
votes
1
answer
804
views
Remove Headers and Footers
I'm looking for ways to remove header and footer text in a PDF with Document AI; I couldn't find anything in the API documentation here
I'm using OCR_PROCESSOR and tried to enable_native_pdf_parsing but there ...
0
votes
1
answer
187
views
Google Cloud Document AI OCR - Different number of words and tokens
I'm using Google Document AI OCR to extract the text from an image following this guide.
I'm using this image:
Test image
This is what I'm doing:
from google.cloud import documentai_v1 as documentai
...
0
votes
1
answer
441
views
google.api_core.exceptions.InvalidArgument: 400 The resource projects/{my-proj-id}/locations/eu is not located in us
I am trying to use the Google DocAI Warehouse sample Python code and it looks like the location parameter is always ignored and the 'us' location is simply assumed.
My prototype project has 'eu' as ...
0
votes
1
answer
222
views
How do I iterate through JSON files stored in a GCP bucket in different folders? Example: Bucket/Dict/Folder2/file.json, Bucket/Dict/Folder1/file.json
I have dumped JSON files from DOCAI to GCP but each file is stored in an individual folder, although they are in the same bucket on Cloud Storage. I am not able to iterate through the JSON files stored ...
0
votes
1
answer
947
views
Google Document AI Form Parser is not returning entities for all pages
I am trying Google Document AI with a standard Form Parser. I processed a 60-page PDF file and the OCR result returned entities for the first few pages and the rest of the pages do not include the ...
0
votes
0
answers
210
views
Train a custom classifier on Document AI via code
I want to understand how to train via code a document classifier using the Document AI API, but I haven't found relevant information in the documentation or code samples. I have defined an Invoice OCR ...
0
votes
0
answers
505
views
Removing Documents from a Google Document AI Processor Dataset in Python
I am currently working on a project involving Google Document AI, and I need assistance with removing documents from a Processor dataset using Python. I have tried various approaches but haven't been ...
0
votes
1
answer
198
views
DocumentAI - Custom Extractor no entities
I just trained a custom extractor in Document AI and tested it there, and I get the values for the tags that I created. I was following the Sample Request for Python (here's the code sample) but I get no ...
1
vote
0
answers
36
views
Single letter recognition fails consistently
I am running some example custom processors to read values of governmental documents.
Some of the values we are after consist of a single letter, for example on a passport "Gender" has a ...
0
votes
1
answer
588
views
Batch Import and Label Assignment in Google Document AI
We are integrating Google's Document AI into our document management system and require an automated solution to import and label PDF documents for a custom classifier processor's dataset.
Is there an ...
0
votes
1
answer
27k
views
PermissionDenied: 403 Permission denied on resource project XXXXXXX
I am trying to process documents using Google's Document AI, but I am facing some issues. I have created a service account, given all the necessary access, but I am still not able to ...
0
votes
1
answer
189
views
InvoiceParser: errors with uptraining new version after activating invoice_type
We are using gcloud Document AI to parse invoices, and we recently enabled the invoice type feature and relabeled all documents with the labeling feature, so that all invoices will have an invoice_type,...
0
votes
1
answer
2k
views
How to locally process a batch of files using Document AI with the Python client?
I'm trying to use the Python console with the Document OCR processor to locally process a large number of PDF documents (native and scanned) to extract the text and some metadata. The documents are ...
0
votes
1
answer
674
views
Uploading file directly to Google Cloud Document AI
I am trying to upload a file directly to Google Cloud Document AI for processing. I am receiving the error
400 Request contains an invalid argument. [field_violations { field: "raw_document....
-1
votes
1
answer
982
views
Can Google Cloud Document AI be installed and used on local, on-premises hardware?
I had a discussion with a female representative from Google Cloud sales. She claimed that ‘Document AI’ is now available as an on-premise solution. I have doubts about this claim. Can anyone confirm ...
2
votes
1
answer
446
views
Document AI Custom Doc Splitter - Internal error encountered
I created a training set for a custom document splitter with a total of 3803 docs and 158 labels. I checked the document quotas and limits, and all my docs and pages are within the limits. When I run the ...
2
votes
3
answers
724
views
In GCP's DocumentAI, when importing documents via API, is it possible to add a Document Type label?
I am creating a Custom Document Classification Processor in GCP's DocumentAI platform, and am trying to understand whether it is possible to assign a Document Type label to documents when importing ...