Many of us encounter image recognition on a daily basis. Active Facebook users watch AI recognize and tag their friends in photos. Photo enthusiasts use filters to create masterpieces from ordinary images. Entertainment apps let us make our faces look older or younger.
But such software isn’t just for fun. Today, it is actively used across many areas of business. In this post, we will see how image recognition made its way into fintech and which companies use it. We will also shed light on how machines understand what is depicted and describe how we applied image recognition in our project.
What is image recognition?
Image recognition (IR) is a technology designed to capture, analyze, understand, and process images from the real world to convert them into digital information. This area involves data mining, machine learning, pattern recognition, and knowledge base expansion.
IR is often referred to as computer vision (CV) because modern devices can mimic human vision thanks to improved cameras that take high-quality pictures. These pictures are then processed by special software that extracts the necessary information.
How do the devices understand what’s in the picture?
IR algorithms are built on deep neural networks: computer systems designed to recognize patterns. Their structure is loosely modeled on the human brain, hence the name. A neural network consists of three types of layers: the input layer, one or more hidden layers, and the output layer. The first receives the data, the hidden layers process it, and the last produces the conclusion.
When does a neural network become deep? When it stacks many hidden layers: modern image recognition networks can have dozens or even hundreds of them. For comparison, a shallow network has only one or two such layers.
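To make the layer structure concrete, here is a minimal sketch of a forward pass through a tiny network with one hidden layer. The weights, biases, and the three-value input are hand-picked assumptions for illustration only; a real network learns its weights from training data and has far more neurons.

```python
import math

def relu(value):
    """Keep positive signals, drop negative ones."""
    return max(0.0, value)

def sigmoid(value):
    """Squash a score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(weights . inputs + bias) per neuron."""
    outputs = []
    for row, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(activation(total))
    return outputs

# Illustrative values only (assumption): 3 inputs -> 2 hidden neurons -> 1 output.
HIDDEN_WEIGHTS = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
HIDDEN_BIASES = [0.0, 0.1]
OUTPUT_WEIGHTS = [[1.2, -0.7]]
OUTPUT_BIASES = [0.05]

def forward(pixels):
    """Input layer -> hidden layer -> output layer."""
    hidden = dense(pixels, HIDDEN_WEIGHTS, HIDDEN_BIASES, relu)
    output = dense(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES, sigmoid)
    return hidden, output

hidden, output = forward([0.9, 0.1, 0.4])
print("hidden layer:", hidden)
print("output layer:", output)
```

A deep network repeats the hidden step many times, feeding each layer's output into the next.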
Now, let’s clarify how a neural network understands what is in the picture. In a nutshell, it compares the image it receives with a reference. The reference is a set of features that the system has learned to identify during training. Facing a new object, it analyzes every detail and decides whether it is X or Y.
It is worth noting that the whole process takes a fraction of a second and closely resembles how the human brain works. Imagine seeing an animal. One glance is enough to understand what it is: a dog, a cat, or something else. In a moment, your brain estimates the body size, head shape, fur color, etc. Then it compares these details with the images stored in memory and decides what it sees.
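The comparison with a reference can be sketched as a nearest-reference lookup. This is a deliberate simplification and not how a trained network stores knowledge internally; the feature choice and the reference values below are hypothetical.

```python
import math

# Hypothetical reference features (assumption):
# (body length in cm, ear length in cm, snout length in cm)
REFERENCES = {
    "cat": (46.0, 6.0, 3.0),
    "dog": (75.0, 10.0, 9.0),
}

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(features):
    """Return the label whose reference features are closest to the input."""
    return min(REFERENCES, key=lambda label: distance(features, REFERENCES[label]))

print(classify((50.0, 6.5, 3.5)))
print(classify((70.0, 9.0, 8.0)))
```

In a real system the features are not measured by hand: the network learns which features matter during training.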
What is a convolutional neural network?
There are various neural network architectures. A convolutional neural network (CNN) is the one designed for image recognition. It is used to identify objects, people, and signs, and it plays a key role in computer vision for self-driving cars and other robotic technologies. A CNN consists of several layers, each with a limited number of neurons. Each set of neurons perceives a small portion of the picture. Next, the results from each set in the layer are combined to get a complete picture at the given stage. A deeper layer performs the same operations on the resulting picture, allowing the system to learn every detail of the image structure. At the output, we get the class, or the probabilities of the classes, that best describe the image.
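The stages described above can be sketched in plain Python: a small kernel looks at one portion of the picture at a time (convolution), the responses are merged (max pooling), and the final scores are turned into class probabilities (softmax). The 6x6 image, the two hand-written kernels, and the scoring rule are assumptions made for illustration; a real CNN learns its kernels and uses a dedicated library.

```python
import math

def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
            for c in range(cols)
        ]
        for r in range(rows)
    ]

def relu_map(feature_map):
    """Zero out negative responses."""
    return [[max(0.0, value) for value in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Keep the strongest response in each non-overlapping size x size window."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]

# Toy image: dark left half, bright right half (a vertical edge).
IMAGE = [[0, 0, 0, 1, 1, 1] for _ in range(6)]

# Hand-written kernels (assumption); a trained CNN learns these values.
KERNELS = {
    "vertical edge": [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],
    "horizontal edge": [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],
}

def class_probabilities(image):
    """Score each class by the mean pooled response of its kernel."""
    scores = []
    for kernel in KERNELS.values():
        pooled = max_pool(relu_map(convolve2d(image, kernel)))
        flat = [value for row in pooled for value in row]
        scores.append(sum(flat) / len(flat))
    return dict(zip(KERNELS, softmax(scores)))

probabilities = class_probabilities(IMAGE)
print(probabilities)
```

Here each kernel stands in for one set of neurons: it only ever sees a 3x3 patch, and pooling merges the patch-level results into a coarser picture for the next stage.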
The world first heard about CNNs in the early 1980s. However, for a long time, the accuracy of image recognition was far from perfect. Since around 2010, neural networks have been trained and deployed on GPUs, which made systems based on them much faster.
How to apply image recognition in business. Softensy expertise
There are many ways to apply image recognition and give your business new opportunities. Whether you develop an app for a broad range of users or internal use, the technology will deliver an improved UX and better customer relations. Let’s see how we used image recognition in one of our projects – a corporate app for bank personnel.
About the project
Our client is a large Eastern European bank, for which we develop mobile banking and internal corporate services. One such service is the Card Personalization Center, which includes an iPad app for employees. Bank managers use this app at meetings held outside the bank branch.
Here is a real-life example. A firm wants to pay salaries to its employees using cards issued by our client. For that, all the staff members have to register as bank clients and receive their cards. To speed up the process, the bank sends a manager to the firm. The manager takes photos of each employee’s documents using the iPad app. Computer vision reads the necessary information from the documents, and the app fills in the form automatically. Thus, the employees do not need to enter their name, tax ID, and other data manually – recognition does this for them. The data of the issued card is also recognized and recorded. After the meeting, the card is automatically linked to the client.
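The form-filling step can be illustrated with a short sketch that takes text already recognized from a document and maps it to form fields. It is not the code of the iPad app, which is written in Swift; the field labels, patterns, and sample text are hypothetical, since real documents vary by country and document type.

```python
import re

# Hypothetical labels and formats (assumption); adjust to the real documents.
FIELD_PATTERNS = {
    "full_name": re.compile(r"Name:\s*(?P<value>[A-Za-z][A-Za-z .'-]*)"),
    "tax_id": re.compile(r"Tax ID:\s*(?P<value>\d{10})\b"),
    "card_number": re.compile(r"Card:\s*(?P<value>(?:\d{4}[ -]?){3}\d{4})"),
}

def extract_fields(ocr_text):
    """Fill a form dictionary from recognized text; missing fields stay None."""
    form = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        form[field] = match.group("value").strip() if match else None
    return form

# Made-up sample of what a text recognition step might return.
SAMPLE_OCR_TEXT = """Name: Jane Doe
Tax ID: 1234567890
Card: 4000 1234 5678 9010
"""

form = extract_fields(SAMPLE_OCR_TEXT)
print(form)
```

Leaving unrecognized fields as None lets the app highlight them so the manager can enter only the missing values by hand.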
Technologies in use
We used native iOS technologies, namely Swift and Xcode, to write the iPad app for bank staff. To implement image recognition, we chose OpenCV, an open-source computer vision library. It includes more than 2,000 algorithms, both classic and modern, for CV and ML. They cover image interpretation, 3D reconstruction, object segmentation, gesture recognition, removal of optical distortion, similarity detection, and more. OpenCV greatly simplifies embedding IR in an app. In essence, it is a collection of data types, functions, and classes for processing images with computer vision algorithms.
For text recognition, we utilized Tesseract. It is an open-source OCR engine that provides different types of recognition: image as a word, block of text, vertical text, etc. Tesseract has nearly 200 trained language models and supports more than 100 languages. It gives highly accurate results and is easily customizable.
Face recognition. World practice
Today, IR is increasingly used for face authentication. Compared with password login, this method is more reliable and secure. Something the user knows, something the user possesses, something the user is – these are the three pillars of Strong Customer Authentication. While the first two can be forged or stolen, the last one is tied to the person and is much harder to fake. With this in mind, banks and fintech firms invest in IR and CV to protect data and reduce fraud.
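Strong Customer Authentication expects a user to present factors from at least two of these three categories. A minimal sketch of such a check is shown below; the factor names and their mapping to categories are assumptions made for illustration, not a compliance implementation.

```python
# Hypothetical mapping of concrete factors to the three categories (assumption).
FACTOR_CATEGORIES = {
    "password": "knowledge",
    "pin": "knowledge",
    "phone": "possession",
    "card": "possession",
    "face": "inherence",
    "fingerprint": "inherence",
}

def is_strong_authentication(presented_factors):
    """True if the known factors span at least two different categories."""
    categories = {
        FACTOR_CATEGORIES[factor]
        for factor in presented_factors
        if factor in FACTOR_CATEGORIES
    }
    return len(categories) >= 2

print(is_strong_authentication(["password", "pin"]))
print(is_strong_authentication(["phone", "face"]))
```

Two factors from the same category, such as a password and a PIN, do not count as strong, which is why face recognition is a useful second factor.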
Finance giants such as Chase, HSBC, and USAA have implemented biometric authentication with Apple Face ID. It allows users to securely log in to the mobile app by looking into the phone’s front camera.
MasterCard has been successfully using selfie pay since 2016. It helps users quickly confirm a transaction by holding their face in front of the camera. Seeing the usability of this method, Amazon has also filed a patent for selfie pay. So, we expect this technology to be widely used soon.
WeChat Pay and AliPay go beyond the online experience. They have equipped their offline point-of-sale (POS) terminals with face recognition to identify shoppers and let them pay without a card or phone. This is very convenient since you don’t have to take anything with you to the store. Also, you can be sure nobody spies on your password or steals your phone.
Grow your business with IR
Image recognition is a promising technology that is used by companies all over the world. There are various use cases for corporations. In addition to improving the shopping experience, you can optimize workflows and internal business processes. For example, you can identify your employees by face when they enter the building, a conference room, or any other company unit. Staff members can also access the corporate app using a phone camera.
No matter what kind of business you run, CV can bring it to the next level. If you are thinking of investing in this area, share your idea with us. We will guide you through the technical aspects and reveal all the possible pitfalls. For more information, drop us a line and get the first consultation for free.