Seeing AI is a Microsoft application for iOS devices that brings together, in a single app, several functions useful for people who are blind or have low vision. Each of these functions is called a channel, and more channels may be added as new functions are developed.
Among other things, the application can recognize text in documents and images, detect light intensity, identify colors and describe scenes.
When the application is opened, the camera preview is shown along with the menu and quick help buttons, as well as the channel selector and a button to pause and resume automatic detection.
All menus, buttons and information are in English, although you can change the recognition language to different languages, including Spanish, as well as predefine the type of currency.
Some of the channels work with automatic detection. Recognition accuracy can be affected by how steadily the user holds the device, by the orientation of the document and by the distance to it.
The application menu allows access to the application settings, the device's photo gallery and different information.
This option allows the user to access the device's photo gallery and recognize the content of a photo, whether it is text or a scene.
During the tests performed, this option has successfully recognized the scenes that appeared in different photographs stored on the device.
This option allows access to the application help.
This option allows the user to contact the developers by email, either to make suggestions or to report an incident.
This option allows the user to configure different aspects of the application, such as the type of currency, the order of the channels and the voice settings, among others.
This option offers information about the application and developers.
This channel identifies short texts in real time, such as those that appear on product labels.
During the tests carried out, the application identified text on packaging, on product surfaces and even on the displays of electronic devices with very good results.
This channel allows the user to point the camera at a text, capture it and recognize it. The application then displays a screen with the text recognized in the document.
In the tests carried out, recognition was very good, although it is influenced by factors such as the orientation of the document, the font size and typeface, and the type of document, among others.
The image on the left shows a photograph of a document. The image on the right shows the text that the application has recognized in the document.
This channel identifies products by their barcode, provided that their information is available. To do this, the user points the camera at the barcode, and the application captures and identifies it.
In the tests performed, the application read the barcodes correctly. However, product identification depends on the product's information being available in the database, as was the case for the Bezoya mineral water bottle, which the application identified correctly.
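The two steps described above (reading the barcode, then finding the product) can be sketched as follows. This is only an illustration and not how Seeing AI is implemented: it assumes the barcode is an EAN-13 code, and the `PRODUCT_DB` dictionary, its single entry and the function names are hypothetical stand-ins for a real product database.

```python
from typing import Optional

# Hypothetical product database; a real application would query a remote service.
PRODUCT_DB = {
    "4006381333931": "Example product (hypothetical entry)",
}


def is_valid_ean13(code: str) -> bool:
    """Return True if `code` is 13 ASCII digits with a correct EAN-13 check digit."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    digits = [int(c) for c in code]
    # The first 12 digits are weighted 1, 3, 1, 3, ... from left to right.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == digits[12]


def lookup_product(code: str) -> Optional[str]:
    """Return the product name, or None if the code is invalid or unknown."""
    if not is_valid_ean13(code):
        return None
    return PRODUCT_DB.get(code)
```

A barcode that is read correctly but missing from the database returns `None`, which mirrors the situation described in the tests: reading the code and identifying the product are separate steps, and only the second one depends on the database.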
This channel identifies how many people are in the image captured by the camera, what they are wearing, their facial features and their age. For this channel to work properly, the people must not be too far from the camera.
During the tests performed, the application identified people correctly in terms of sex and clothing, although its age estimates varied.
In the image on the left you can see a young woman with an English text provided by the application that says "30 years old woman with black hair looking happy". In the image on the right you can see a young man and a woman with a text provided by the application that says "2 people detected. 36 years old man with brown hair looking happy. 27 years old woman with brown hair looking happy".
This channel identifies, in real time, the value of banknotes in the predefined currency.
In the tests carried out, the application identified banknotes correctly, such as the €20 note that can be seen in the image. Once the application has identified the value of the banknote, it reads that value aloud.
This channel describes the scene in the image captured by the camera after the user presses the take picture button. The application reads aloud a description of what is shown in the image.
The image on the left shows a woman sitting at a desk and with a computer in front of her. In the image on the right you can see the same scene after being recognized by the application with an English text that says "A Person sitting at a desk with a computer in an office chair.".
This channel detects the main color or colors of an object or surface. Color identification can be affected by factors such as the shade of the color itself or the ambient lighting. In general, under suitable conditions, the application correctly identifies the colors of the surface the camera is pointed at.
In the tests performed, the application has successfully identified the colors of the objects focused with the camera.
This channel recognizes handwritten text. When the application recognizes the text, it reads it aloud.
The image on the left shows a photograph of a notebook with the following handwritten text: "En Orientatech probamos el reconocimiento de la escritura manual de la aplicación Seeing AI" ("At Orientatech we tested the handwriting recognition of the Seeing AI application"). On the right is the screenshot with the text recognized by the application, which, as can be seen, has been recognized correctly.
This channel detects light intensity. To do this, it uses a musical scale: the higher the intensity of the light, the higher the pitch of the notes it plays.
In the tests performed, the application played the highest notes when the camera was pointed at light-emitting objects, such as the computer screen or the light source that can be seen in the image.
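The idea of mapping light intensity to a musical scale can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's actual method: the review does not say which scale Seeing AI uses, so the C major scale, the normalized brightness range of 0 to 1 and the name `brightness_to_note` are all hypothetical choices.

```python
# Assumed scale: one octave of C major, frequencies in hertz (C4 up to C5).
C_MAJOR_HZ = [261.63, 293.66, 329.63, 349.23, 392.00, 440.00, 493.88, 523.25]


def brightness_to_note(brightness: float) -> float:
    """Map a brightness value in [0, 1] to a note frequency.

    Brighter input never yields a lower note than darker input.
    """
    # Clamp out-of-range readings so the index always stays valid.
    clamped = min(max(brightness, 0.0), 1.0)
    # Split [0, 1] into equal bands, one per note; 1.0 falls in the top band.
    index = min(int(clamped * len(C_MAJOR_HZ)), len(C_MAJOR_HZ) - 1)
    return C_MAJOR_HZ[index]
```

With this mapping, a dark room yields the lowest note and a camera pointed straight at a lamp yields the highest, which matches the behavior observed in the tests.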
Microsoft's Seeing AI application is a great tool for people with some kind of visual functional diversity, especially for those with very low vision or with total blindness. This application brings together in a single app different functionalities that contribute to improving the activities of daily life and favor greater personal autonomy of the group with visual functional diversity.
The highly accurate recognition of handwritten text deserves special mention, as does the identification of scenes and people. The OCR (Optical Character Recognition) is also very useful, both for short texts such as packaging and for documents.
Of special relevance for people with total blindness is the identification of the light intensity since it allows them to know, for example, if a lamp is on or off.
As mentioned earlier, this is an application of great interest to people with visual functional diversity. However, the fact that the interface is only available in English and the high battery consumption on mobile devices are points to consider when using it.
- Handwriting recognition with great precision.
- Precise identification of scenes and people in photographs.
- OCR in real time for short texts.
- High precision OCR for documents.
- Light intensity detection.
- It's free.
- The translation of the interface into other languages could be suggested since it is only available in English at the moment.
- The reduction of battery consumption could be studied for future versions.
- The possibility of increasing the number of products identified by the application through the barcode could be studied.
- The development of a version for Android devices could be analyzed since at the moment it is only available on iOS.