Microsoft’s Azure Cognitive Services made news recently when it announced a new service that lets developers automatically generate captions for images. This latest addition to its cognitive intelligence offering leverages Computer Vision technology and can reportedly generate image captions that are, in many cases, more accurate than descriptions written by humans. A caption generated by a machine, for a machine, can also be more effective for search: it can make your content more relevant in Bing or Google search results, which can help drive organic traffic to your webpage.
Computer Vision-based image captioning is a big milestone because it means AI systems are beginning to detect, understand, and describe an action or motion within the context of everything around it. The system uses deep learning to detect what an item is and what action it is performing, and then uses Natural Language Generation (NLG) to describe it. Imagine what this technology could do for people who are blind or visually impaired: they could see through the eyes of a computer. For example, a blind or visually impaired person walking down a city sidewalk could be notified of an approaching intersection and the remaining distance to it, so they can stop in time to avoid stepping into oncoming traffic.
The Phenomenon of Computer Vision
Computer Vision replicates the “visual” intelligence of the human brain. Humans perceive up to 80% of all impressions by means of sight, and 30% of the human cortex is devoted to vision, compared with only 8% for touch and 3% for hearing. Just as vision is the most important human sense, it is also critical for computers that need a more robust understanding of their environment. If they can learn to understand an action in the context of everything around it, they can become that much more intelligent. Modern Computer Vision relies on deep learning algorithms, also known as neural networks, to understand the objects it sees. These neural networks learn from massive amounts of visual data, finding patterns that let them arrive at a highly educated guess about what a certain object is. The algorithms are inspired by our understanding of how brains function, in particular the interconnections between neurons in the cerebral cortex.
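To make the idea of "learning patterns from examples" concrete, here is a minimal sketch of a single artificial neuron (a perceptron) that learns to tell reddish pixels from greenish ones. It is a toy illustration only: real Computer Vision systems use deep networks with millions of parameters trained on large image datasets. The pixel values and the names `train_perceptron` and `predict` are assumptions made up for this example.

```python
from typing import List, Sequence, Tuple

# A pixel is (red, green, blue), each channel scaled to the range 0..1.
Pixel = Tuple[float, float, float]

# Hypothetical training data: four reddish pixels (label 1) and
# four greenish pixels (label 0).
TRAINING_PIXELS: List[Pixel] = [
    (0.9, 0.1, 0.1), (0.8, 0.2, 0.1), (0.7, 0.1, 0.2), (1.0, 0.3, 0.2),
    (0.1, 0.9, 0.1), (0.2, 0.8, 0.2), (0.3, 0.7, 0.1), (0.2, 1.0, 0.3),
]
TRAINING_LABELS: List[int] = [1, 1, 1, 1, 0, 0, 0, 0]


def predict(weights: Sequence[float], bias: float, pixel: Pixel) -> int:
    """Return 1 ("red") if the weighted sum is positive, otherwise 0 ("green")."""
    score = sum(w * x for w, x in zip(weights, pixel)) + bias
    return 1 if score > 0 else 0


def train_perceptron(
    samples: Sequence[Pixel], labels: Sequence[int], epochs: int = 20
) -> Tuple[List[float], float]:
    """Learn one weight per color channel using the perceptron update rule."""
    weights = [0.0, 0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for pixel, label in zip(samples, labels):
            error = label - predict(weights, bias, pixel)  # -1, 0, or +1
            if error != 0:
                mistakes += 1
                # Nudge the weights toward (or away from) the misclassified pixel.
                weights = [w + error * x for w, x in zip(weights, pixel)]
                bias += error
        if mistakes == 0:
            break  # every training example is classified correctly
    return weights, bias


if __name__ == "__main__":
    weights, bias = train_perceptron(TRAINING_PIXELS, TRAINING_LABELS)
    for pixel in [(0.85, 0.15, 0.1), (0.15, 0.85, 0.2)]:
        label = "red" if predict(weights, bias, pixel) == 1 else "green"
        print(pixel, "->", label)
```

The neuron is never told that "red means a high first channel". It arrives at weights that encode that pattern purely by correcting its own mistakes on labeled examples, which is the same principle deep networks apply at a much larger scale.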
Although Computer Vision, as a subfield of Artificial Intelligence, has been around since the 1960s, recent breakthroughs have led to increased adoption of the technology. Increased processing power from chipmakers such as NVIDIA and Intel has also played a big role. As we continue to harness additional processing power and improve the technologies that Computer Vision relies on, such as 5G networks, we are likely to see widespread adoption and an increased rate of automation. The Fourth Industrial Revolution is, indeed, underway.
Computer Vision Applications
Computer Vision is already advanced enough to be applied in many areas, and open-source software is available so that the public can use it to innovate and drive adoption in this relatively young field. Businesses are leveraging many Computer Vision applications to automate or streamline processes. In healthcare, Computer Vision can, in some studies, detect cancer from CT scans better than doctors. In highly secure environments, retinal and fingerprint scanning can uniquely identify individuals to grant or restrict access. Wind turbines can be inspected for defects using footage from autonomous drones fitted with high-definition cameras. In addition, billions of dollars are being invested in autonomous transportation, where Computer Vision plays a big role in guiding vehicles by identifying obstacles, people, and road signs along the way.
Automating Package Handling with Computer Vision
Technology is best utilized when it eliminates boring, repetitive jobs that humans don’t enjoy doing anyway. That’s why companies are using Computer Vision to automate mundane tasks such as sorting green apples from red apples or separating recyclable items from the trash. This helps automate the supply chain and frees people to perform more complicated tasks, where they can apply human intelligence to problems that have never been solved before.
A good example is Position Imaging, which is applying its Amoeba Computer Vision technology to help multifamily property managers automate package handling so that staff can focus on residents rather than packages. It also provides a better experience for residents, who no longer have to wait in line or contact staff to pick up their packages. Couriers deliver packages directly to the Smart Package Room, where the Amoeba Computer Vision technology virtually tags each package and monitors its location, essentially keeping eyes on the packages 24/7. When a resident comes to pick up a package, the system locates it, a task that used to fall to the property manager, and guides the resident to it. If the resident accidentally picks up the wrong package, built-in audio guidance alerts them immediately.
The Smart Package Room is smart in the sense that it can see and make sense of actions, which allows it to track the location of items in 3D space. It doesn’t just produce a digital map of the room; it builds a 3D replica of the shelves, with a 3D coordinate system recording where each package is located.
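As a rough illustration of the bookkeeping such a room implies, the sketch below keeps a registry that maps each package to a resident and a 3D coordinate, and flags a pickup by the wrong person. This is not Position Imaging's implementation, which is not described in detail here. The class and method names (`PackageRoom`, `tag`, `move`, `locate_for`, `pick_up`) and the coordinates are hypothetical, and the Computer Vision step that would actually observe package positions is assumed to exist elsewhere.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# (x, y, z) position in meters within the room's coordinate system (assumed units).
Point3D = Tuple[float, float, float]


@dataclass
class TrackedPackage:
    package_id: str
    resident: str
    position: Point3D


class PackageRoom:
    """Hypothetical registry of packages and their 3D locations."""

    def __init__(self) -> None:
        self._packages: Dict[str, TrackedPackage] = {}

    def tag(self, package_id: str, resident: str, position: Point3D) -> None:
        """Register a newly delivered package at its observed position."""
        self._packages[package_id] = TrackedPackage(package_id, resident, position)

    def move(self, package_id: str, new_position: Point3D) -> None:
        """Update the position when the vision system sees the package moved."""
        self._packages[package_id].position = new_position

    def locate_for(self, resident: str) -> List[TrackedPackage]:
        """Return every package currently waiting for the given resident."""
        return [p for p in self._packages.values() if p.resident == resident]

    def pick_up(self, package_id: str, resident: str) -> str:
        """Return "ok", "wrong package", or "unknown package" for a pickup attempt."""
        package = self._packages.get(package_id)
        if package is None:
            return "unknown package"
        if package.resident != resident:
            return "wrong package"  # a real system would trigger an audio alert here
        del self._packages[package_id]
        return "ok"


if __name__ == "__main__":
    room = PackageRoom()
    room.tag("PKG-1", "Alice", (0.5, 1.2, 0.8))
    room.tag("PKG-2", "Bob", (1.5, 1.2, 0.4))
    print(room.locate_for("Alice"))
    print(room.pick_up("PKG-2", "Alice"))
    print(room.pick_up("PKG-1", "Alice"))
```

Storing a full coordinate per package, rather than a shelf label, is what lets a system guide a resident to an exact spot and notice when an item has been moved.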
With all these advancements in Computer Vision, we have only just scratched the surface. The possibilities are vast, and the future is bright. Computer Vision technology is likely to become widespread, making access to information that much easier. Machines will perform computation faster than ever before and help us make better and faster decisions. Most importantly, these advancements will help automate mundane tasks like locating a package. As Forbes contributor Rob Toews recently said, “A wave of billion-dollar Computer Vision startups is coming.”