Newest 'cloud-document-ai' Questions

0 votes

0 answers

143 views

Document AI importDocument returns error code 13 when using REST API

I’m currently having an issue with my code: I’m using the REST API to train a Document AI model with the custom extraction type. I have already completed the following steps: Called the v1 process ...

Nor3soN

1

asked Nov 11, 2025 at 9:38

1 vote

1 answer

143 views

Why do I only get a RESOURCE_EXHAUSTED error when using the Document AI client library with C++? [duplicate]

I am testing out Document AI's Expense processor using the provided Google Cloud client libraries in a few different languages (JavaScript, Python, and C++). I'm testing with a set of JPEG images ...

Jasen Chan

11

asked Jul 2, 2025 at 22:11

0 votes

2 answers

325 views

Google Document AI batch processing failing

Following Google's documentation, I am trying to perform a Document AI OCR batch request (async), and I constantly receive an error. I tried both gcs_input_uri and gcs_input_prefix. I cannot ...

RanH

852

asked May 27, 2025 at 11:55

1 vote

1 answer

142 views

Error 403, when sending a request to Google Document AI API

I get this error in my log-file each time I am trying to send a request to Google Document AI API: 403 Client Error: Forbidden for url: https://us-documentai.googleapis.com/v1/projects/230636727467/...

Stella Slad

11

asked May 7, 2025 at 23:27

-1 votes

1 answer

108 views

DocumentAI detect if image contains non-text visual elements in it

Most of my target images contain only text elements, which is expected, since my main purpose is to extract text from them. But some of the target images contain non-text visual elements (actual ...

jonah_w

1,030

asked Jan 21, 2025 at 19:25

4 votes

1 answer

152 views

With Custom Extractor, Python API view of schema does not provide access to EntityTypes; it should according to docs

The API documentation shows that the DocumentSchema has EntityType children which should contain details of all fields in a Custom Extractor. I am able to obtain the DocumentSchema as expected. ...

stu2

109

asked Nov 17, 2024 at 9:21

0 votes

0 answers

2k views

Document AI - Processor location issue [duplicate]

I'm using a Mac and I have created a simple Document AI processor on the Google Cloud Platform (PDF splitter). This processor was trained, tested and deployed. I'm now desperately trying to make use ...

AlexCT

35

asked Jul 26, 2024 at 22:29

0 votes

0 answers

130 views

ProcessDocument API Errors - No remaining quota for ParseDocument

As part of our workflow, we invoke the Document AI ProcessDocument API (v1) from our back end. The code has been in place for over 6 months and has been running without any errors. In the past week we ...

Charles

1

asked Jul 3, 2024 at 16:47

1 vote

0 answers

47 views

custom classifier/splitter dataset test limit

I am currently working on a project that utilizes the docai custom classifier. I have a question regarding the test dataset size limitations. As I understand, the current limit for the test dataset ...

Al Monteagudo

11

asked May 16, 2024 at 2:10

0 votes

1 answer

929 views

Getting an error when I am trying to use pre-built contract model on AI Document Intelligence Studio. Error code in the body

I was trying to analyze a contract using Microsoft's Document Intelligence Studio. All the pre-built models are working except for the contract pre-built model. I am getting error code: "...

Harsh Khewal

5

asked Apr 19, 2024 at 9:09

0 votes

1 answer

747 views

Document AI - Multi-page files performance effect

I’ve noticed that it’s possible to upload multi-page files to Document AI, such that all pages are connected to each other by being associated to the same file. My use case is invoice files that I ...

Yaniv Ben-Malka

57

asked Mar 26, 2024 at 15:52

0 votes

1 answer

595 views

Auto-Labeling in Document AI with Custom Extractor: Schema Requirement Issue

I am using Document AI with a Custom Extractor. When I create a new Custom Extractor, it offers to manage my dataset. I expect that doing so will automatically create label names for the documents I ...

tmighty

11.6k

asked Mar 23, 2024 at 2:57

0 votes

1 answer

458 views

Google Document AI create labeling instruction

https://cloud.google.com/document-ai/docs/workbench/label-documents#labeling For Google Document AI, what is a labeling instruction exactly? Is it a PDF where every label is annotated using a box? If ...

Max

1

asked Mar 1, 2024 at 10:42

0 votes

1 answer

195 views

Document AI adding folders

I'm using Document AI to parse PDF files from one bucket and then save them as JSON in another bucket in GCS. However, Document AI creates a folder with a subfolder in my bucket. I've read a lot and I ...

c0nfusion

1

asked Feb 29, 2024 at 12:00

0 votes

1 answer

107 views

Does the `Number` type in Google Document AI include decimals?

I've been using the document AI tool for a while and have quite a few documents labeled and just thought of a question: does the Number field type allow for decimals (ex: 0.3456) or does it only allow ...

pl8nt

49

asked Feb 28, 2024 at 14:40

0 votes

1 answer

126 views

GCP API for AI Documents

I'm having issues with the API: there is no response whatsoever. I have created the service account with the corresponding API key and its JSON file; however, I cannot seem to get any response when ...

Keagan Gilmore

1

asked Feb 22, 2024 at 10:15

0 votes

0 answers

161 views

How can I tell Google Document AI Enterprise OCR to always assume one column?

How can I tell Google Document AI Enterprise OCR to always assume one column? My texts (scans of old books) are always one column. However, due to layout, (lots of) whitespace, and inline figures, ...

SRobertJames

9,457

asked Feb 21, 2024 at 1:18

1 vote

1 answer

653 views

How can I use Google Document AI OCR to find the non-text images in a text document?

How can I use Google Document AI OCR to find the non-text images in a text document? I'm using Google Document AI Enterprise OCR to OCR images (scans of old books), and it works well. The books have ...

SRobertJames

9,457

asked Feb 20, 2024 at 23:24

0 votes

0 answers

146 views

Will adjusting the value acquired from bounding box annotation train the model to be able to make inferences?

This may be a silly question but I've been annotating quite a few documents with the Google Document AI tool and have had this worry in the back of my mind. My task is to use Doc AI to extract ...

pl8nt

49

asked Feb 20, 2024 at 19:31

0 votes

0 answers

107 views

Line Ordering Issue with Arabic PDF Text Using Google Cloud Document AI

I have an app that uses Document AI to process PDFs and extract text from them. I use the stable version, but it is still not accurate. The processed text seems to have its lines mixed up, not ...

Khaled Saleh

168

asked Feb 18, 2024 at 1:51

1 vote

1 answer

294 views

Response from Document AI stored in Google Cloud Storage

I am using a GCP workflow and eventarc trigger connected to cloud storage to have a document evaluated by Document AI when the cloud storage bucket receives it. The issue I'm encountering is, whenever ...

Lofton Gentry

333

asked Feb 17, 2024 at 18:00

1 vote

1 answer

268 views

Reskewing GCP Document AI Result

GCP's Document AI is pre-processing images to remove things like skew. The bounding boxes it produces correspond to the pre-processed image, not the image sent to the API. I need to reskew them so ...

user19213041

11

asked Feb 16, 2024 at 1:24

0 votes

1 answer

513 views

Document AI batch processing timeout using Java

I am trying to batch process a set of documents using Document AI and its Java SDK. My code is derived from the batch processing example for Java (seen here), but I have modified it to add more than ...

Filip Östermark

445

asked Feb 6, 2024 at 17:36

0 votes

0 answers

82 views

Impact of Using PDF Training Data and JPG Test Data on Document AI Model Performance

I'm currently working on a document AI project (with Custom Extractor) and have encountered a scenario that I'm unsure how to navigate. My training dataset of Shipping instruction documents consists ...

lht_18018

64

asked Feb 1, 2024 at 7:58

1 vote

1 answer

680 views

Document AI "400 No valid schema provided for processing" with Cloud Function

I’ve been experiencing an issue with the Google Cloud Document AI API in my Firebase Cloud Function that handles documents uploaded to Google Cloud Storage. The function triggers correctly upon PDF ...

HaZeust

13

asked Jan 31, 2024 at 17:52

1 vote

0 answers

370 views

(Terraform) BigQuery Job misses IAM permissions, which have been granted

I read this blog post about the recently published Document AI - BigQuery Integration. I want to configure this setup completely using Terraform. An important step in the blog post is the configuration ...

Brian

117

asked Jan 26, 2024 at 10:48

1 vote

0 answers

185 views

No valid schema provided for processing

I'm trying to build a Node.js project to extract data from invoices using the Cloud Document AI API. I have copied the code provided in Google's documentation as follows: /** * TODO(developer): Uncomment these ...

Maurizio Liguori

11

asked Jan 15, 2024 at 13:03

0 votes

1 answer

126 views

Cannot Import Google Cloud

I am following Google's official tutorial on setting up Document AI: https://cloud.google.com/document-ai/docs/libraries#client-libraries-install-java My POM file: <project> <...

SolidCloudinc

321

asked Jan 14, 2024 at 18:25

1 vote

1 answer

287 views

Using Batch Processing Document AI inside the google cloud function

I have a scenario where I am uploading a local file to a Cloud Storage bucket, triggering a Cloud Function (xyz). Within this Cloud Function, I am performing a batch processing task using Google Cloud ...

Manish gupta

11

asked Jan 4, 2024 at 17:52

0 votes

1 answer

172 views

Google Cloud Document AI namespace issue

I'm currently working on a project that involves using the Google Cloud Document AI Client Library in my PHP application. However, I've encountered an issue with the library's namespace that's been ...

shoop79

1

asked Dec 31, 2023 at 6:56

0 votes

1 answer

405 views

Document AI forms processor tags multiple lines as 1 line

I'm trying to use the Document AI forms processor to get the rows of a table. When I upload a document, the forms processor does not get each line separately. It combines multiple lines into a single ...

mijaro

1

asked Dec 15, 2023 at 16:43

2 votes

0 answers

123 views

Document AI Custom Processor - annotating across pages

I am currently building a custom OCR extractor with Google's Document AI. My documents are usually around 8-14 pages long and I have created a schema across all possible pages. Using the defined ...

Cookie Monster

21

asked Dec 14, 2023 at 10:23

0 votes

2 answers

721 views

Google Document AI Custom Extractor type object 'Property' has no attribute 'OccurenceType'

I have a trained Google Document AI Custom Extractor model, and it works great when I test it in the Cloud console, but I'm struggling to get a sample Python program to work. I've taken this sample code ...

Matt Reidy

37

asked Dec 13, 2023 at 19:45

0 votes

1 answer

804 views

Remove Headers and Footers

I'm looking for ways to remove header and footer text from a PDF with Document AI. I couldn't find anything in the API documentation here. I'm using OCR_PROCESSOR and tried enable_native_pdf_parsing, but there ...

Prany

2,131

asked Dec 13, 2023 at 18:48

0 votes

1 answer

187 views

Google Cloud Document AI OCR - Different number of words and tokens

I'm using Google Document AI OCR to extract the text from an image following this guide. I'm using this image: Test image. This is what I'm doing: from google.cloud import documentai_v1 as documentai ...

galex

671

asked Dec 13, 2023 at 8:17

0 votes

1 answer

441 views

google.api_core.exceptions.InvalidArgument: 400 The resource projects/{my-proj-id}/locations/eu is not located in us

I am trying to use the Google DocAI Warehouse sample Python code, and it looks like the location parameter is always ignored and the 'us' location is simply assumed. My prototype project has 'eu' as ...

caoimhinmacg

11

asked Dec 8, 2023 at 3:02

0 votes

1 answer

222 views

How do I iterate through JSON files stored in different folders of a GCP bucket? Example: Bucket/Dict/Folder2/file.json, Bucket/Dict/Folder1/file.json

I have dumped JSON files from DOCAI to GCP, but each file is stored in an individual folder, although they are all in the same bucket on Cloud Storage. I am not able to iterate through the JSON files stored ...

Vedant Patil

1

asked Dec 5, 2023 at 17:41

0 votes

1 answer

947 views

Google Document AI Form Parser is not returning entities for all pages

I am trying Google Document AI with a standard Form Parser. I processed a 60-page PDF file, and the OCR result returned entities for the first few pages, while the rest of the pages do not include the ...

Emma

9,633

asked Nov 28, 2023 at 18:19

0 votes

0 answers

210 views

Train a custom classifier on Document AI via code

I want to understand how to train via code a document classifier using the Document AI API, but I haven't found relevant information in the documentation or code samples. I have defined an Invoice OCR ...

Kronchik X

23

asked Nov 28, 2023 at 8:23

0 votes

0 answers

505 views

Removing Documents from a Google Document AI Processor Dataset in Python

I am currently working on a project involving Google Document AI, and I need assistance with removing documents from a Processor dataset using Python. I have tried various approaches but haven't been ...

Mikkel

318

asked Nov 20, 2023 at 11:21

0 votes

1 answer

198 views

DocumentAI - Custom Extractor no entities

I just trained a custom extractor in Document AI. When I test it there, I get the values for the tags that I created, but when I followed the Sample Request for Python (here's the code sample), I get no ...

Alberto Martinez

1

asked Nov 16, 2023 at 11:48

1 vote

0 answers

36 views

Single letter recognition fails consistently

I am running some example custom processors to read values of governmental documents. Some of the values we are after have a single letter as value, for example on a passport "Gender" has a ...

Stefan Walther

71

asked Nov 13, 2023 at 11:42

0 votes

1 answer

588 views

Batch Import and Label Assignment in Google Document AI

We are integrating Google's Document AI into our document management system and require an automated solution to import and label PDF documents for a custom classifier processor's dataset. Is there an ...

Michael Maramzin

1

asked Nov 8, 2023 at 18:37

0 votes

1 answer

27k views

PermissionDenied: 403 Permission denied on resource project XXXXXXX

I am trying to process documents using Google's Document AI, but I am facing some issues. I have created a service account and given it all the necessary access, but I am still not able to ...

Akshay Malik

1

asked Nov 2, 2023 at 17:11

0 votes

1 answer

189 views

InvoiceParser: errors with uptraining new version after activating invoice_type

We are using gcloud Document AI to parse invoices. We recently enabled the invoice type feature and relabeled all documents with the labeling feature, so that all invoices will have an invoice_type,...

Christian Schmitt

922

asked Oct 31, 2023 at 12:15

0 votes

1 answer

2k views

How to locally process a batch of files using Document AI with the Python client?

I'm trying to use the Document OCR processor from the Python console to locally process a large number of PDF documents (native and scanned) to extract the text and some metadata. The documents are ...

Vojta Partík

1

asked Oct 29, 2023 at 23:17

0 votes

1 answer

674 views

Uploading file directly to Google Cloud Document AI

I am trying to upload a file directly to Google Cloud Document AI for processing. I am receiving the error 400 Request contains an invalid argument. [field_violations { field: "raw_document....

SeaSky

1,312

asked Oct 17, 2023 at 8:39

-1 votes

1 answer

982 views

Can Google Cloud Document AI be installed and used on local, on-premises hardware?

I had a discussion with a female representative from Google Cloud sales. She claimed that ‘Document AI’ is now available as an on-premise solution. I have doubts about this claim. Can anyone confirm ...

Seyed Hossein Mirheydari

105

asked Oct 13, 2023 at 8:40

2 votes

1 answer

446 views

Document AI Custom Doc Splitter - Internal error encountered

I created a training set for a custom document splitter with a total of 3803 docs and 158 labels. I checked the document quotas and limits, and all my docs and pages are within the limits. When I run the ...

Services GitHub

23

asked Oct 11, 2023 at 23:02

2 votes

3 answers

724 views

In GCP's DocumentAI, when importing documents via API, is it possible to add a Document Type label?

I am creating a Custom Document Classification Processor in GCP's DocumentAI platform, and am trying to understand whether it is possible to assign a Document Type label to documents when importing ...

J L

440

asked Oct 2, 2023 at 17:06
