
Tuesday

Text recognition (OCR) in Android app using Tesseract example

Vishal Shrestha
In this example, we will recognize text in an Android app using Tesseract in Android Studio.
OCR in an Android app is straightforward with the Tesseract library. Tesseract for Android can be added as a dependency, and you can learn how to set up Tesseract in Android Studio in this tutorial.
This example continues from the previous part, where we detected text in an Android app using OpenCV. Now we will recognize that text, i.e. perform OCR in the Android app using Tesseract.
Using Tesseract for text recognition in Android is pretty simple. Here's how you do it.


Using Tesseract for OCR in Android Studio:

  1. Initialize TessBaseAPI with the path to the traineddata file and a suitable page segmentation mode.
  2. Pass the image you want to read, as a Bitmap, to the TessBaseAPI instance.
  3. Finally, call the getUTF8Text method on the instance. It returns the recognized text as a String.
How to use Tesseract with Android Studio will be properly explained in a post soon. This post only describes how to recognize text from an image using Tesseract.
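
The three steps above can be sketched as follows. To keep the snippet runnable outside an Android project, a small hypothetical `OcrEngine` interface stands in for `TessBaseAPI`. Its method names (`init`, `setPageSegMode`, `setImage`, `getUTF8Text`, `end`) mirror the calls as I understand them from the tess-two library, so check them against the version you depend on. `TessOcrSketch` and `UpperCaseEngine` are made-up names for illustration, and the data path, language code and mode value in the usage note are placeholders.

```java
import java.util.Locale;

/** Hypothetical stand-in for TessBaseAPI so the call order can be shown off-device. */
interface OcrEngine<I> {
    boolean init(String dataPath, String language); // dataPath: folder that contains tessdata/
    void setPageSegMode(int mode);
    void setImage(I image);
    String getUTF8Text();
    void end();
}

/** Wraps an engine behind one getOCRResult call, the method name used later in this post. */
final class TessOcrSketch<I> {
    private final OcrEngine<I> engine;

    TessOcrSketch(OcrEngine<I> engine, String dataPath, String language, int pageSegMode) {
        this.engine = engine;
        // Step 1: initialise with the traineddata location and a page segmentation mode.
        if (!engine.init(dataPath, language)) {
            throw new IllegalStateException("Could not initialise OCR engine from " + dataPath);
        }
        engine.setPageSegMode(pageSegMode);
    }

    String getOCRResult(I image) {
        engine.setImage(image);      // Step 2: hand over the image.
        return engine.getUTF8Text(); // Step 3: read back the recognised text.
    }

    /** Releases the engine when OCR is no longer needed. */
    void close() {
        engine.end();
    }
}

/** Fake engine for off-device runs: "recognises" a String by upper-casing it. */
final class UpperCaseEngine implements OcrEngine<String> {
    private String image = "";
    boolean ended = false;

    public boolean init(String dataPath, String language) { return !dataPath.isEmpty(); }
    public void setPageSegMode(int mode) { /* ignored by the fake */ }
    public void setImage(String image) { this.image = image; }
    public String getUTF8Text() { return image.toUpperCase(Locale.ROOT); }
    public void end() { ended = true; }
}
```

With the fake engine, `new TessOcrSketch<>(new UpperCaseEngine(), "/some/path", "eng", 3).getOCRResult("hello")` exercises the same call order. In the app, the engine would be a thin adapter around `TessBaseAPI` and the image type would be `Bitmap`.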

So, continuing from the previous post on Text Detection in Android using OpenCV, let's make the following change.
In the detect text function, replace the following:

        for (int ind = 0; ind < contour2.size(); ind++) {
            rectan3 = Imgproc.boundingRect(contour2.get(ind));
            if (rectan3.area() > 0.5 * imgsize || rectan3.area() < 100
                    || rectan3.width / rectan3.height < 2) {
                Mat roi = new Mat(morbyte, rectan3);
                roi.setTo(zeos);

            } else
                Imgproc.rectangle(mRgba, rectan3.br(), rectan3.tl(),
                        CONTOUR_COLOR);
        }
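
The filter condition in that loop can be pulled out into a small predicate, which makes it easier to see which regions get discarded. This is only a sketch: the hypothetical `Box` class stands in for OpenCV's `Rect` (whose `width` and `height` are `int` fields), `RegionFilter` is a made-up name, and `imgSize` is assumed to be the frame area in pixels. The thresholds are the ones from the loop above.

```java
/** Hypothetical stand-in for org.opencv.core.Rect, reduced to what the filter needs. */
final class Box {
    final int width;
    final int height;

    Box(int width, int height) {
        this.width = width;
        this.height = height;
    }

    double area() {
        return (double) width * height;
    }
}

final class RegionFilter {
    /**
     * Returns true when a region should be blanked out rather than treated as text:
     * it covers more than half the frame, is smaller than 100 px^2,
     * or is less than twice as wide as it is tall.
     * Assumes height > 0, which holds for a bounding box of a non-empty contour.
     */
    static boolean isRejected(Box r, double imgSize) {
        return r.area() > 0.5 * imgSize
                || r.area() < 100
                || r.width / r.height < 2; // integer division, as in the loop above
    }
}
```

Because the aspect threshold is a whole number, the integer division gives the same decision as a floating-point ratio would. That would no longer hold for a threshold such as 2.5, where a cast to `double` would be needed.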


With:
        for (int ind = 0; ind < contour2.size(); ind++) {
            rectan3 = Imgproc.boundingRect(contour2.get(ind));
            try {
                // Crop the detected region and convert it to a Bitmap for Tesseract.
                Mat croppedPart = mIntermediateMat.submat(rectan3);
                bmp = Bitmap.createBitmap(croppedPart.width(), croppedPart.height(), Bitmap.Config.ARGB_8888);
                Utils.matToBitmap(croppedPart, bmp);
            } catch (Exception e) {
                Log.d(TAG, "cropped part data error " + e.getMessage());
                bmp = null; // do not run OCR on a stale or half-built bitmap
            }
            if (bmp != null) {
                doOCR(bmp);
            }
        }

And add the following method to our MainActivity:

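    private void doOCR(final Bitmap bitmap) {
        // Run Tesseract on the cropped region and log the recognized text.
        String text = mTessOCR.getOCRResult(bitmap);
        Log.d(TAG, "OCR result: " + text);
    }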
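
Recognition on every cropped region can take a noticeable amount of time, so one option is to hand each crop to a background thread instead of running OCR inside the detection loop. The sketch below is plain Java and uses a `Function<I, String>` as a stand-in for `mTessOCR::getOCRResult`; the class name `OcrWorker` and the callback shape are assumptions for illustration, not part of the original app. A single-threaded executor is used on the assumption that one Tesseract instance should not be called from several threads at once.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical helper that runs OCR jobs one at a time on a background thread. */
final class OcrWorker<I> {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final Function<I, String> recognizer;

    OcrWorker(Function<I, String> recognizer) {
        this.recognizer = recognizer;
    }

    /**
     * Queues one image for recognition. The callback runs on the worker thread
     * once the text is available; the returned Future yields the same text.
     */
    Future<String> submit(I image, Consumer<String> onResult) {
        return executor.submit(() -> {
            String text = recognizer.apply(image);
            onResult.accept(text);
            return text;
        });
    }

    /** Stops accepting new jobs; already queued jobs still finish. */
    void shutdown() {
        executor.shutdown();
    }
}
```

In the activity this might be created as `new OcrWorker<Bitmap>(mTessOCR::getOCRResult)` and called from `doOCR`. Since the callback runs off the main thread, any UI update from it would need to go through something like `runOnUiThread`.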

We also need to declare a TessOCR field named mTessOCR and initialize it. Read the NOTE below.
NOTE:
mTessOCR is an instance of a Java class we created to wrap and initialize TessBaseAPI. Check out how to use Tesseract with Android here - Creating android OCR app with Tesseract. Feel free to drop questions below. Happy Coding!

Vishal Shrestha / Author & Founder

A developer by profession, a born adventurer. I mainly do Android but like to get my hands dirty with web development and a little bit of Python. I would rather go on a trek than to a party, and you can find me having a few rounds with the heavy bag to let off steam ;)

For Business info : My Portfolio Site.


Copyright © 2017 | The Code City by Vishal Shrestha