I have applied the following adjustments to the original image:

  • resized
  • changed the colour scale
  • greyscaled
  • thresholded
  • inverted the colours

This results in the following image

enter image description here

Using Tesseract, I'm converting the image to a string, but it only seems to recognise the 4.

Code to convert to text:

print (tess.image_to_string(img, config='--psm 6 -c tessedit_char_whitelist="9876543210"'))
4

I then attempted to sharpen the image with the following code, which produces the next image, but Tesseract still only recognises the 4. Any idea how I can sharpen this further so that Tesseract recognises it as 40?

kernel = np.array([[0,-1,0],[-1,5,-1],[0,-1,0]])
sharpened = cv2.filter2D(img,-1,kernel)
print (tess.image_to_string(sharpened, config='--psm 6 -c tessedit_char_whitelist="9876543210"'))
4

enter image description here

Alternatively, the original image is the following without any resizing.

enter image description here

Tesseract does pick this up as 40, but I need it to work on the larger image. Is there a way I can resize the image but retain its quality/sharpness?

Resizing code:

img = cv2.resize(img,(0,0),fx=5,fy=5)
  • What interpolation method are you using to upscale the image? Try dropping the anti-aliased methods in favour of nearest-neighbor interpolation (cv2.INTER_NEAREST). Alternatively, apply some morphology after thresholding and before inverting the colors. A little dilation with a small kernel might improve the appearance of the 0, which seems a little disconnected at the corners. Commented Nov 3, 2023 at 3:09
  • The image is sharp. The only reason the enlarged image looks blurry is the resizing that you are doing. Why must you enlarge it? What is your end goal in needing it enlarged? Are you trying to do OCR and finding that it won't read the text because the font is too small? Commented Nov 3, 2023 at 4:06
  • This is another of those "tesseract sucks but let's make it OpenCV's problem" questions. Commented Nov 3, 2023 at 11:02
  • @ChristophRackwitz Not a very helpful comment. What alternatives are there? Commented Nov 4, 2023 at 2:11
  • Plain old convolution on a hand-picked set of templates/kernels. You've got a bitmap font there. -- I'm sure you've heard of other OCR packages like easyocr and paddle... yes? I hate to just mention those because I'll never know if you heard of them before. -- What program produces the text you presented? You've presented a single sample. Not much to go on. Commented Nov 4, 2023 at 2:16

1 Answer

If you are able to use ImageMagick:

import subprocess
import cv2
import pytesseract

# Image manipulation
# Commands https://imagemagick.org/script/convert.php
mag_img = r'D:\Programme\ImageMagic\magick.exe'
con_bw = r"D:\Programme\ImageMagic\convert.exe" 

in_file = r'40.png'
out_file = r'40_bw.png'

# Experiment with the threshold and contrast for better results
# ("-resize 100%" keeps the original size)
process = subprocess.run([con_bw, in_file, "-resize", "100%", "-threshold", "60%", out_file])

# Text processing
pytesseract.pytesseract.tesseract_cmd=r'C:\Program Files\Tesseract-OCR\tesseract.exe'
img = cv2.imread(out_file)

# For the parameters, see the Tesseract documentation
custom_config = r'--psm 7 --oem 3 -c tessedit_char_whitelist=0123456789'

tex = pytesseract.image_to_string(img, config=custom_config)
print(tex)

# Save the recognised text to a file
with open("number.txt", 'w') as f:
    f.write(tex)

cv2.imshow('image',img)
cv2.waitKey(12000)
cv2.destroyAllWindows()

Output: enter image description here

Option 2: With OpenCV, which you seem to prefer.

import cv2
import pytesseract

img = cv2.imread("40.png")

# Convert the BGR image to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Convert the grayscale image to binary. THRESH_OTSU (named after its creator, Nobuyuki Otsu)
# picks the threshold automatically and is a good starting point.
(thresh, im_bw) = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

# Optionally, experiment with a manual threshold value
# thresh = 200
# im_bw = cv2.threshold(gray, thresh, 255, cv2.THRESH_BINARY)[1]

# write image to disk
cv2.imwrite('bw_40.png', im_bw)

# For the parameters, see the Tesseract documentation
custom_config = r'--psm 7 --oem 3 -c tessedit_char_whitelist=0123456789'
tex = pytesseract.image_to_string(im_bw, config=custom_config)
print(tex)

cv2.imshow('image',im_bw)
cv2.waitKey(12000)
cv2.destroyAllWindows()

Output: enter image description here

1 Comment

Thanks for the response. With regard to option 2, how are you managing to convert the image to grayscale? In my pre-processing, I had already applied grayscaling to obtain the images in the question, so any further grayscaling fails because the image no longer has any RGB channels.
