Eren Akbulut's Blog

React Tesseract OCR Tutorial

February 1st, 2021

React OCR Application

Given the current state of web technologies and the power of web applications, many tasks that used to be challenging and computationally intensive can now run in the browser more easily than ever. OCR (Optical Character Recognition) is one of those tasks. It’s extremely demanding in terms of computing power, it takes a great many hours of work to get good results, and the existing solutions mostly represent more than one person’s lifelong work. Luckily, some very powerful examples of those solutions are now easy to reach and truly easy to use.

But why would we want to use OCR in a browser? There could be many reasons. For example, OCR is a perfect fit when you want to extract text from a scanned document, and doing that in the browser can be really beneficial. It is also a go-to choice when we want to automate the detection of unwanted behavior hidden in text. Or, for something simpler: we are often frustrated to see IBAN numbers sent as screenshots, and OCR can be a practical helper in that situation. As you can see, the use cases vary, and it would be possible to fill pages and pages with ideas in one sitting.

In today’s tutorial, I’ll walk you through a React application where you can upload images in various formats and apply OCR to extract their text content easily. So if you are ready, let’s jump into the application right away.

Our desired result is shown below: the application should let us upload images and start the OCR process.

There are 2 options for following along with the project: you can either create your own “create-react-app” project, or you can download the final version of the project from here and use it to follow the tutorial step by step. A guide on how to use “create-react-app” is also here.

If you go with the second option, you can delete the unnecessary files and end up with a folder structure like this. It could still be simplified further, but it is not as cluttered as the default version.

If you downloaded the version I provided, all you need to do is open your terminal in the root folder and run “yarn && yarn start”. The “yarn” command automatically downloads the dependencies and creates a node_modules folder for you. Once that is done your app is ready to take off, and “yarn start” starts the project.

However, if you are using “create-react-app”, you need to install a few packages that we’ll use during the tutorial. We’ll use 3 third-party libraries for different purposes. The first one is “tesseract.js”, which lets us run OCR with just a couple of lines of code. The second one is “react-images-upload”, which lets us upload images in the browser and access their locations easily. The third one is “react-spinners”, which lets us create cool-looking spinners with very little code. It isn’t strictly necessary for the purposes of this tutorial, but showing a spinner while something loads is often considered good practice. So if you used “create-react-app”, you can get all 3 by running “yarn add react-images-upload react-spinners tesseract.js”. After the download is done you should be good to go.

The first thing we are going to do is clean up our App.js file and move our code somewhere else.

Now that we have our App.js set up, we can take a look at what we are adding here. We are adding a div with an h1 tag and another div inside it. Since the default styling doesn’t include the CSS classes we use, we also changed our CSS. Let’s go through them step by step.

We added 2 CSS classes to create a balanced look. In the first class we create a flex layout, set its direction to column, center our items, and add a little margin to the top. We use the second class to turn our header into a block element so it won’t sit side by side with the other elements of the div.

To avoid errors from the App.js file we should create our ImageLoader component. Go ahead and create a scripts folder, and inside it create a folder called imageLoader containing imageLoader.js and imageLoader.css files.

After that step, put that code snippet into your imageLoader.js file and your component is ready.

Since the only 2 files that require further coding are imageLoader.js and imageLoader.css, I’ll paste pictures of the code snippets here and then explain them step by step.

Here we import the libraries mentioned earlier. We also need to import useState, since we’ll use the state hook, and our CSS file.

Here we do a couple of new things, so you may want to follow carefully. The code snippet goes above the return statement. We create 3 states here, each with a different goal. The first state stores the image URLs coming from our ImageUploader component for further processing, the second state holds the texts produced by OCR, and the third one drives the loading logic.

The onDrop function is called whenever a change happens in our ImageUploader component; you’ll see that component further down. So whenever photos are added to or removed from the component, this function fires. The value we need arrives as the second argument of the onChange callback, so we can simply skip the first argument by naming it “_” and take the second one. It contains the link or links for our images, and it always brings the full, latest list of links, so whenever we get a list from the function we can safely store it as our array.
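The callback shape described here can be sketched without React. This is only a minimal sketch, not the original snippet from the project: `createStateStandIn` and `makeOnDrop` are hypothetical helpers that stand in for a `useState` pair, and the sample URLs are made up.

```javascript
// Hypothetical stand-in for React's useState, so the sketch runs outside a component.
function createStateStandIn(initialValue) {
  let value = initialValue;
  return {
    get: () => value,
    set: (next) => {
      value = next;
    },
  };
}

// Builds an onDrop handler around a setter (in the component this would be
// the setter returned by useState).
function makeOnDrop(setPictures) {
  // The first argument is skipped by naming it "_"; the second one carries
  // the full, current list of image URLs.
  return function onDrop(_, pictureUrls) {
    setPictures(pictureUrls);
  };
}

const pictures = createStateStandIn([]);
const onDrop = makeOnDrop(pictures.set);

// Simulate two changes in the uploader: each call delivers the whole list,
// so replacing the stored array is enough.
onDrop([], ["blob:first-image"]);
onDrop([], ["blob:first-image", "blob:second-image"]);
console.log(pictures.get());
```

Because every call brings the complete list, the handler never has to merge or deduplicate anything; removing an image simply results in a shorter list being stored.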

The runOcr function seems tricky but is in fact quite simple: we take our URLs state and run the recognition script for each element. The Tesseract recognize function takes at least 2 arguments: the first is the image URL and the second is the target language; tesseract.js supports over 100 languages. After each request resolves, we take the output text and add it to our current array. Right after we fire runOcr we call setIsLoading with true so we can display our spinner.
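Here is a rough, dependency-free sketch of that flow. It is not the original snippet: the recognizer is passed in as a parameter so the example runs on its own, and I am assuming that in the real component it would be `Tesseract.recognize`, which resolves to an object shaped like `{ data: { text } }`. The `fakeRecognize` stub, the callback parameters, and the file names are all hypothetical.

```javascript
// Runs OCR for every URL. `recognize` is injected so the sketch has no
// dependencies; in the app it is assumed to be Tesseract.recognize.
function runOcr(pictureUrls, recognize, appendText, setIsLoading) {
  const jobs = pictureUrls.map((url) =>
    recognize(url, "eng").then(({ data: { text } }) => {
      // Add each result to the list as soon as its request resolves.
      appendText(text);
    })
  );
  // Switch the spinner on right after the requests have been fired.
  setIsLoading(true);
  return Promise.all(jobs);
}

// Hypothetical stub that imitates the result shape without doing real OCR.
const fakeRecognize = (url, language) =>
  Promise.resolve({ data: { text: `text from ${url} (${language})` } });

const ocrText = [];
let isLoading = false;

const finished = runOcr(
  ["a.png", "b.png"],
  fakeRecognize,
  (text) => ocrText.push(text),
  (value) => {
    isLoading = value;
  }
).then(() => console.log(ocrText));
```

In a component, `appendText` would typically be something like a functional state update that spreads the old array and adds the new text, so results that arrive at different times don’t overwrite each other.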

Now we can return our JSX. Since we want to keep everything under control and get a decent look, we apply some CSS to our parent div like this.

Now we can start adding child elements to our div. Our ImageUploader component has many options, such as setting a maximum image size or limiting the file types. Showing or hiding a preview is also an option.
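To make those options a bit more concrete, here they are written as a plain props object. The prop names follow the react-images-upload README as far as I know it, while the values (button text, extensions, 5 MB limit) are only illustrative assumptions and not necessarily what the project uses.

```javascript
// Options of the kind mentioned above, written as a plain props object.
// The values are illustrative; adjust them to your needs.
const MAX_FILE_SIZE_BYTES = 5 * 1024 * 1024; // 5 MB

const uploaderProps = {
  withIcon: true,
  withPreview: true, // show thumbnails of the chosen images
  buttonText: "Choose images",
  imgExtension: [".jpg", ".png", ".gif"], // allowed file types
  maxFileSize: MAX_FILE_SIZE_BYTES,
};

// In the component these would be spread onto the uploader, for example:
// <ImageUploader {...uploaderProps} onChange={onDrop} />
console.log(uploaderProps);
```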

After our ImageUploader we can add another button to start the OCR process. For that I wanted to create a plain div element and style it, rather than using a button directly.

We set some styles, including a pseudo-class that makes it behave differently when hovered. We copied the color and design approach from our ImageUploader to keep the design consistent.

After our button comes our main logic, and it’s quite straightforward too. We first check whether there are any items in our ocrText state; if so, we create an unordered list that updates whenever a new item is added to the ocrText state. We use the index of each item both as its key and to number each output.
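The double use of the index can be sketched in plain JavaScript. `toListItems` is a hypothetical helper, and the 1-based "1. text" label format is my assumption; the original component may number and format its items differently.

```javascript
// Turns the OCR results into the data each list item needs.
function toListItems(ocrText) {
  return ocrText.map((text, index) => ({
    key: index, // the index doubles as the React key
    label: `${index + 1}. ${text}`, // and as the visible numbering (assumed 1-based)
  }));
}

console.log(toListItems(["first result", "second result"]));
```

Using the index as a key is fine here because items are only ever appended; for lists that get reordered or filtered, a stable identifier is usually the safer choice.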

The styling for that part is nothing more than some coloring and a bit of shaping at the edges.

As you can see, when we don’t have anything in our ocrText state we fall through to the spinner. Since the spinner is only triggered from the runOcr function, it has no visible effect while it sits there before that.
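That decision can be written down as a small pure function. This is only a sketch of the logic, with `chooseView` and its return shape being hypothetical; in the component the same check would be a conditional in the JSX, and I am assuming the spinner from react-spinners is shown or hidden through its `loading` prop.

```javascript
// Decides what the result area shows for a given state.
function chooseView(ocrText, isLoading) {
  if (ocrText.length > 0) {
    // At least one OCR result exists: show the list.
    return { view: "results", items: ocrText };
  }
  // The spinner is always the fallback, but it is only visible while loading.
  return { view: "spinner", visible: isLoading };
}

console.log(chooseView([], false)); // before runOcr: an invisible spinner
console.log(chooseView([], true)); // after runOcr, before the first result
console.log(chooseView(["some text"], true)); // once results arrive
```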

Alright everyone, that's it for today. I hope you enjoyed it. I'll make more educational content like this in the future. Until then, stay safe and take care of yourselves.