Commit 9941ce0

Adds document text detection tutorial.
1 parent 959ca2c commit 9941ce0

6 files changed

Lines changed: 292 additions & 0 deletions

Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
.. This file is automatically generated. Do not edit this file directly.

Google Cloud Vision API Python Samples
===============================================================================

This directory contains samples for Google Cloud Vision API. `Google Cloud Vision API`_ allows developers to easily integrate vision detection features within applications, including image labeling, face and landmark detection, optical character recognition (OCR), and tagging of explicit content.

.. _Google Cloud Vision API: https://cloud.google.com/vision/docs

Setup
-------------------------------------------------------------------------------

Authentication
++++++++++++++

Authentication is typically done through `Application Default Credentials`_,
which means you do not have to change the code to authenticate as long as
your environment has credentials. You have a few options for setting up
authentication:

#. When running locally, use the `Google Cloud SDK`_

    .. code-block:: bash

        gcloud beta auth application-default login

#. When running on App Engine or Compute Engine, credentials are already
   set up. However, you may need to configure your Compute Engine instance
   with `additional scopes`_.

#. You can create a `Service Account key file`_. This file can be used to
   authenticate to Google Cloud Platform services from any environment. To use
   the file, set the ``GOOGLE_APPLICATION_CREDENTIALS`` environment variable to
   the path to the key file, for example:

    .. code-block:: bash

        export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account.json

.. _Application Default Credentials: https://cloud.google.com/docs/authentication#getting_credentials_for_server-centric_flow
.. _additional scopes: https://cloud.google.com/compute/docs/authentication#using
.. _Service Account key file: https://developers.google.com/identity/protocols/OAuth2ServiceAccount#creatinganaccount

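Before running a sample with a service account key file, it can help to confirm that the environment variable described above points at a real file. The following is a minimal sketch using only the standard library; ``describe_credentials`` is a hypothetical helper written for illustration and is not part of these samples or of any Google library. It does not validate the key itself.

```python
import os


def describe_credentials(environ=None):
    """Report whether GOOGLE_APPLICATION_CREDENTIALS names an existing file.

    Hypothetical helper for illustration only. It checks the environment
    variable and the file's existence, nothing more.
    """
    environ = os.environ if environ is None else environ
    key_path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        # Other credential sources (gcloud login, App Engine, Compute
        # Engine) may still apply; this sketch does not inspect them.
        return "unset"
    if os.path.isfile(key_path):
        return "key file found"
    return "key file missing"


if __name__ == "__main__":
    print(describe_credentials())
```

A result of ``unset`` is not necessarily an error, because the other options listed above may still supply credentials.
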
Install Dependencies
++++++++++++++++++++

#. Install `pip`_ and `virtualenv`_ if you do not already have them.

#. Create a virtualenv. Samples are compatible with Python 2.7 and 3.4+.

    .. code-block:: bash

        $ virtualenv env
        $ source env/bin/activate

#. Install the dependencies needed to run the samples.

    .. code-block:: bash

        $ pip install -r requirements.txt

.. _pip: https://pip.pypa.io/
.. _virtualenv: https://virtualenv.pypa.io/

Samples
-------------------------------------------------------------------------------

Document Text tutorial
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

To run this sample:

.. code-block:: bash

    $ python doctext.py

    usage: doctext.py [-h] [-out_file OUT_FILE] detect_file

    positional arguments:
      detect_file         The image for text detection.

    optional arguments:
      -h, --help          show this help message and exit
      -out_file OUT_FILE  Optional output file

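The command-line interface of ``doctext.py`` in this commit takes one positional image path and an optional output path. The sketch below rebuilds just that parser so the two invocation styles can be tried without calling the Vision API. ``build_parser`` is a hypothetical name introduced here; the argument names ``detect_file`` and ``-out_file`` are the ones defined in ``doctext.py``.

```python
import argparse


def build_parser():
    """Return a parser with the same two arguments doctext.py defines."""
    parser = argparse.ArgumentParser(prog="doctext.py")
    parser.add_argument("detect_file", help="The image for text detection.")
    # With no explicit default, argparse stores None when the flag is absent.
    parser.add_argument("-out_file", help="Optional output file")
    return parser


# Only the image path: nothing to save to, so out_file stays None.
show_args = build_parser().parse_args(["menu.jpg"])

# Image path plus an output path for the annotated copy.
save_args = build_parser().parse_args(["menu.jpg", "-out_file", "out.jpg"])
```

Leaving the default as ``None`` lets the caller test for "no output file" with ``is None`` instead of comparing against a sentinel integer.
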
The client library
-------------------------------------------------------------------------------

This sample uses the `Google Cloud Client Library for Python`_.
You can read the documentation for more details on API usage and use GitHub
to `browse the source`_ and `report issues`_.

.. _Google Cloud Client Library for Python:
    https://googlecloudplatform.github.io/google-cloud-python/
.. _browse the source:
    https://github.com/GoogleCloudPlatform/google-cloud-python
.. _report issues:
    https://github.com/GoogleCloudPlatform/google-cloud-python/issues


.. _Google Cloud SDK: https://cloud.google.com/sdk/
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
# This file is used to generate README.rst

product:
  name: Google Cloud Vision API
  short_name: Cloud Vision API
  url: https://cloud.google.com/vision/docs
  description: >
    `Google Cloud Vision API`_ allows developers to easily integrate vision
    detection features within applications, including image labeling, face and
    landmark detection, optical character recognition (OCR), and tagging of
    explicit content.

setup:
- auth
- install_deps

samples:
- name: Document Text tutorial
  file: doctext.py
  show_help: True

cloud_client_library: true
Lines changed: 134 additions & 0 deletions
@@ -0,0 +1,134 @@
#!/usr/bin/env python

# Copyright 2017 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Outlines document text given an image.

Example:
    python doctext.py resources/text_menu.jpg
"""
# [START full_tutorial]
# [START imports]
import argparse
from enum import Enum
import io

from google.cloud import vision
from PIL import Image, ImageDraw
# [END imports]


class FeatureType(Enum):
    PAGE = 1
    BLOCK = 2
    PARA = 3
    WORD = 4
    SYMBOL = 5


def draw_boxes(im, blocks, color, width):
    """Draws an outline around each bounding box on the image."""
    # [START draw_blocks]
    draw = ImageDraw.Draw(im)

    for block in blocks:
        # Each bounding box has four vertices. Connect them in order and
        # close the outline by joining the last vertex to the first.
        draw.line([block.vertices[0].x, block.vertices[0].y,
                   block.vertices[1].x, block.vertices[1].y],
                  fill=color, width=width)
        draw.line([block.vertices[1].x, block.vertices[1].y,
                   block.vertices[2].x, block.vertices[2].y],
                  fill=color, width=width)
        draw.line([block.vertices[2].x, block.vertices[2].y,
                   block.vertices[3].x, block.vertices[3].y],
                  fill=color, width=width)
        draw.line([block.vertices[3].x, block.vertices[3].y,
                   block.vertices[0].x, block.vertices[0].y],
                  fill=color, width=width)

    return im
    # [END draw_blocks]


def get_document_bounds(image_file, feature):
    # [START detect_bounds]
    """Returns document bounds given an image."""
    vision_client = vision.Client()

    bounds = []

    # Use a separate name for the file handle so the path argument
    # is not shadowed.
    with io.open(image_file, 'rb') as image_fh:
        content = image_fh.read()

    image = vision_client.image(content=content)
    document = image.detect_full_text()

    # Collect the bounding boxes of the requested feature type while
    # walking the page > block > paragraph > word > symbol hierarchy.
    for page in document.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    for symbol in word.symbols:
                        if feature == FeatureType.SYMBOL:
                            bounds.append(symbol.bounding_box)

                    if feature == FeatureType.WORD:
                        bounds.append(word.bounding_box)

                if feature == FeatureType.PARA:
                    bounds.append(paragraph.bounding_box)

            if feature == FeatureType.BLOCK:
                bounds.append(block.bounding_box)

        # NOTE: this appends the bounding box of the last block on the
        # page, not a box computed for the page as a whole.
        if feature == FeatureType.PAGE:
            bounds.append(block.bounding_box)

    return bounds
    # [END detect_bounds]


def render_doc_text(filein, fileout):
    # [START render_doc_text]
    """Outlines pages, paragraphs, and words, then saves or shows the image."""
    im = Image.open(filein)
    bounds = get_document_bounds(filein, FeatureType.PAGE)
    draw_boxes(im, bounds, 'blue', 3)
    bounds = get_document_bounds(filein, FeatureType.PARA)
    draw_boxes(im, bounds, 'green', 2)
    bounds = get_document_bounds(filein, FeatureType.WORD)
    draw_boxes(im, bounds, 'yellow', 1)

    # Save when an output path was given; otherwise open an image viewer.
    if fileout is not None:
        im.save(fileout)
    else:
        im.show()
    # [END render_doc_text]


if __name__ == '__main__':
    # [START run_crop]
    parser = argparse.ArgumentParser()
    parser.add_argument('detect_file', help='The image for text detection.')
    # argparse stores None for -out_file when the flag is omitted.
    parser.add_argument('-out_file', help='Optional output file')
    args = parser.parse_args()

    render_doc_text(args.detect_file, args.out_file)
    # [END run_crop]
# [END full_tutorial]
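`render_doc_text` calls `get_document_bounds` once per feature type, and each call sends the image to the API again. One possible alternative, sketched below, walks the page > block > paragraph > word > symbol hierarchy a single time and groups the boxes by level. `collect_bounds` is a hypothetical helper, not part of this commit; it runs on stand-in objects built with `SimpleNamespace` (Python 3 only) that carry just the attribute names used in `doctext.py`, so no API call is made. Pages are left empty in the sketch because the original takes the page outline from a block.

```python
from enum import Enum
from types import SimpleNamespace as NS


class FeatureType(Enum):
    PAGE = 1
    BLOCK = 2
    PARA = 3
    WORD = 4
    SYMBOL = 5


def collect_bounds(document):
    """Walk the hierarchy once and group bounding boxes by level.

    Hypothetical helper: `document` only needs the attributes that
    get_document_bounds reads (pages, blocks, paragraphs, words, symbols).
    """
    bounds = {feature: [] for feature in FeatureType}
    for page in document.pages:
        for block in page.blocks:
            bounds[FeatureType.BLOCK].append(block.bounding_box)
            for paragraph in block.paragraphs:
                bounds[FeatureType.PARA].append(paragraph.bounding_box)
                for word in paragraph.words:
                    bounds[FeatureType.WORD].append(word.bounding_box)
                    for symbol in word.symbols:
                        bounds[FeatureType.SYMBOL].append(symbol.bounding_box)
    return bounds


# Stand-in data in place of a real API response; strings play the role
# of bounding boxes so the grouping is easy to inspect.
symbol = NS(bounding_box="s")
word = NS(bounding_box="w", symbols=[symbol, symbol])
paragraph = NS(bounding_box="p", words=[word])
block = NS(bounding_box="b", paragraphs=[paragraph])
fake_document = NS(pages=[NS(blocks=[block])])
fake_bounds = collect_bounds(fake_document)
```

With a result shaped like this, a renderer could draw every level from one response instead of requesting the same annotation three times.
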
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
# Copyright 2017 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os

import doctext


def test_text(cloud_config, capsys):
    """Checks that the output image with the text outlines is created."""
    doctext.render_doc_text('resources/text_menu.jpg', 'output-text.jpg')
    out, _ = capsys.readouterr()
    assert os.path.isfile('output-text.jpg')
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
google-cloud-vision==0.23.2
pillow==4.0.0