Envision AI allows visually impaired people to read texts in real time or by recognizing documents, including handwritten texts. It also has options for color identification, scene description and product identification by scanning the barcode.
The application is available for Android, version 5.0 or later, and for iOS, version 10.0 or later.
The following sections analyze the interface and the different functions of the application on both Android and iOS.
The interfaces of the Android and iOS versions differ slightly in the number of tabs, their location and their content.
The first difference is that on Android there are 3 tabs, located at the top of the screen, while on iOS there are 4, located at the bottom. The tab missing from the Android version is called Scan and Find.
Another important difference between the two versions is that on Android you can activate the camera flash to improve the lighting, which is not possible on iOS because that version has no button for this option.
The other significant differences lie in the buttons that each version offers to activate the different functions of the application, and in the tab where each button is located. The first button that appears on Android but not on iOS is Read handwritten text, on the Text tab. Also within the Text tab, iOS has a button called More Actions, which allows the user to recognize text from PDFs, images or multiple pages. The Android version lacks this button.
A very important button for people with visual impairment is Detect Color. However, this function is only present on iOS, within the General tab.
The Scan Barcode and Teach Envision buttons are present on both Android and iOS, although they are located in different tabs. On Android they are located in the General tab, while on iOS they are located within the Scan and Find tab.
Finally, it is necessary to mention the options Find People and Search for objects, which are present on iOS within the Scan and Find tab but are not present on Android.
The following is a detailed analysis of the different functions that Envision AI can perform, in both the Android and iOS versions.
Magnification and text functions
The magnifying glass function allows the user to use the device's camera as a magnifying glass. To activate this function, press the button called "Magnifier". When pressed, a sliding bar is displayed to set the magnification level from 0 to 100. To close the magnifying glass, press the button again.
Start reading instantly
To activate this option, go to the Text tab and press the button labeled "Start reading instantly", located at the bottom left of the screen. Once the button is pressed, Envision immediately begins reading any text that the camera captures. To stop instant reading, press the button again, which now says "Stop reading instantly".
Read handwritten text
To activate this option, available only on Android, go to the Text tab and press the button labeled "Read handwritten text", located at the bottom center of the screen. After it is pressed, the application starts reading the text as soon as it is detected.
This option allows the user to take a picture of a text to be recognized, or lets the application take the photo automatically when it detects a document.
Once the text has been recognized, the application displays it on a new screen. On iOS, several buttons are displayed at the bottom of the screen: to start reading the text automatically and pause the reading, to change the size of the text, and to export it. Android presents similar buttons, with some differences: there is no button to change the text size, and two additional buttons are included, "Next page" and "Previous page".
The More Actions option, located within the Text tab, is only available in the iOS version and allows reading texts with multiple pages, PDFs and images. The button to activate it is located in the lower right part of the screen, and pressing it displays a submenu with the different options. Pressing "Import PDF" opens a browser to search for files on the phone. Pressing "Import image" opens the device's photo library. Pressing "Multiple pages" opens the camera and displays an indicator showing the number of pages captured, along with two more buttons, one to take a picture and one to stop taking pictures.
Describe the scene
To activate this option, located within the General tab, press the button with that name, located at the bottom left of the screen. After activating the function, the user takes a picture of their surroundings and the application describes the scene shown in it.
The Detect Color option is located within the General tab and is available only on iOS. It allows the user to identify the colors of objects by using the camera. Once the corresponding button, located in the lower central part of the screen, is pressed, the application begins to speak the colors it detects. It can do so with a standard accuracy of 30 colors or with a more descriptive accuracy of 950 colors.
Teach Envision
This option allows the application to recognize the faces of friends or family, either by taking photographs of the person or from images stored in the phone's library. It is located in the General tab on Android, while on iOS it is located in the Scan and Find tab. To use it, press the button labeled "Teach Envision", located at the bottom right of the screen. If the option "Show a face" is chosen, the camera opens and you can take up to five photos of the person for Envision to recognize. If the "Open library" option is chosen, the phone's photo library is displayed for the user to select a photo. Once the person has been recognized and assigned a name, whenever the "Describe the scene" option is used and Envision recognizes a stored face, the description names the person or people in the scene.
This option is located on the General tab and allows the user to identify a product by scanning its barcode. To do this, press the button located in the lower right on iOS and in the lower center on Android. After pressing it, the camera is used to scan the barcode and, if the product is identified, its name is displayed along with a button that gives access to more information about it.
Scan and find features
This option is available only on iOS and is located on the Scan and Find tab. To activate it, press the button labeled "Find people", located in the lower left of the screen. After pressing it, the user can move the phone around, and when Envision detects a person, it beeps to signal this.
Search for objects
This function is only available on iOS and is located on the Scan and Find tab. To activate it, press the button labeled "Search for objects", located in the lower central part of the screen. Once the button is pressed, a list of objects is displayed, such as car, cat, toothbrush, bag or fork. The user must select one of the options and then move the phone around. When Envision detects an object of the selected type, the application emits a sound to signal this.
Envision AI allows you to configure different parameters, which depend on the version used. More parameters can be configured on iOS than on Android, because the iOS version has more functionality.
The following are the possible configuration parameters, all of which are located in the tab called Help.
Offline text recognition
This option can be enabled or disabled. When enabled, it allows faster text recognition for texts written in Latin-based languages.
Automatic language detection
This option can be enabled or disabled. When disabled, the application only reads text in the system language.
Text to speech
This option is called "Text to speech" on Android and "Speak" on iOS, and allows you to configure the application's own voice, as distinct from the screen reader's voice.
On Android you can choose the synthesizer that you want to use from those installed in the system.
On iOS this option allows you to adjust the speed of the application's voice (not the VoiceOver voice), as well as choose the language and the "person" who speaks.
This setting is only available on iOS. It allows you to select the accuracy of the color identification when using the Detect Color functionality. You can choose between standard (30 colors) or descriptive (950 colors).
Other options on the help tab
Within the help tab, you can access a series of online tutorials in English by selecting the "Read tutorials" option, send feedback by selecting the "Give opinion" option, or request that the developer contact you by telephone by selecting the "Request a call" option.
The tests were carried out with an iPhone SE running iOS 12.4 and a Huawei Mate P20 Lite running Android 9.0. A Samsung Galaxy J3 running Android 8.0 was also used.
It should be noted that the application does not work correctly when Silent Mode is activated on Android: voice synthesis in functions such as Read instantly does not work, although the screen reader works perfectly.
In order to analyze and evaluate the application, a series of tests of the different options was carried out in both the Android and iOS versions.
The text recognition tests, covering the different options available for this purpose, were very positive in general. In both instant text reading and document recognition, the application showed very satisfactory results, clearly identifying the texts presented. The only difficulty the application encountered was with texts containing multiple boxes and images, where the results did not allow a clear reading of the text. Handwriting recognition was equally satisfactory, and these texts were read without problems. The application had more difficulty recognizing PDFs and images on iOS, which may have been due to the image content as well as the structure and quality of the documents.
A very noteworthy point is that, when reading documents, the application indicates whether the document's edges are visible or not. Keep in mind that this function is designed to scan documents, such as letters or sheets of paper, and that these indications are very useful for people who are completely blind or whose reduced vision does not allow them to see even the document. In the tests carried out, the application identified the edges at all times and correctly indicated when they were within the limits of the camera and when they were not.
The option "Describe the scene" behaved reasonably in the tests, although it showed some failures. These occurred because it did not correctly identify some objects, although in these cases the application indicated that the description might not be accurate. It also used a word that was not the right one, but this was due to a mistranslation of the description. It should also be noted that in the tests performed on Android the scene was initially described in the default language of the phone, but after updating to version 0.9.6 the scenes were described in English.
During the tests performed, color identification was very accurate, giving a wide range of shades when the descriptive mode was activated in the settings.
The tests of the Find people functionality had very positive results. The function does not describe or identify the person or people; it indicates to the user whether any person is present, which is quite useful for people who cannot see this for themselves. In all cases, the application worked correctly, emitting a sound to signal a detection.
The Search for objects function correctly identified most objects, from bags and armchairs to laptops. It was only unable to identify objects in the telephone category, whether mobile or landline.
Teach Envision presented completely satisfactory results. During the tests carried out, the application perfectly recognized the man and the woman who had been registered in it. When scenes were captured, the man and the woman were identified without problems, whether they appeared separately or together.
The negative aspects of the application, as with all applications that make intensive use of the camera and the Internet connection, are its high data and battery consumption. Data consumption is high because most of the functions require sending data over the Internet to be processed on an external server, and the images sent are presumably high resolution to allow good analysis. As for the battery, since the camera is constantly operating, the energy consumption of the device is very high, so the application drains the battery quickly.
Finally, the magnifying glass function was tested and works perfectly, although the quality of the magnification is likely to depend on the quality of the device's camera.
As seen in the tests performed, Envision AI behaved very well, with very remarkable results. This application is designed primarily for people with total blindness or very low vision, as functions such as "Find people" and "Search for objects" show. This does not mean that it is not very useful for other people with low vision, since the text recognition and color identification functions, as well as the magnifying glass, are very useful for this group.
One very noteworthy aspect is that the application indicates whether the edges are within the limits of the camera when recognizing a document, which is very useful for people who are totally blind.
It is worth highlighting the possibility of choosing between the standard mode and the descriptive mode when identifying colors. This is of great interest, since users who are satisfied with a basic range of colors can activate the standard mode, while the most demanding users can activate the descriptive mode for greater accuracy.
The voice settings options are also very interesting. Although iOS appears more configurable within the application, allowing you to select the speed, language and "person", on Android, once the speech synthesizer is selected, it can be configured from the Android settings themselves.
The less positive aspects of the application are its large data consumption, since most of its functions have to connect to the Internet to process the images on an external server, and its high battery consumption, caused by the camera being permanently active.
In general, it can be concluded that this is a great application, very interesting for blind and low vision users, which performs its functions accurately and correctly.
- Recognition of handwritten text.
- Recognition of PDFs and images on iOS.
- High accuracy in text recognition.
- Possibility of recognizing 950 colors.
- Possibility of obtaining additional product information when scanning the barcode.
- Possibility to learn the faces of friends and family members.
- Possibility to configure the voice that is not the screen reader.
- It might be suggested to include the Detect Color, Find people and Search for objects functions in the Android version.
- The possibility of including a button to activate the flash on iOS could be studied.
- Describing scenes in the default language of the phone in the Android version could be studied.
- In future versions we could study how to reduce the battery consumption of the device.
- We could analyze how to reduce data consumption in future versions.