Sunday, February 1, 2015

Oh, patents! Google AR (augmented reality) image recognition

Copyright©Françoise Herrmann

Just take a picture of a real-world object with an AR app that superimposes virtual information, in the form of text or graphics, onto the real-world object for real-time interaction. Voilà! This is Augmented Reality! The real-world object acquires new properties and more depth in the virtual world and, conversely, the user can experience the virtual world as part of the real world, resulting in an augmented impression of the real world.

Easier said than done… but how does it work? How do you get the application to recognize an image (i.e., bring information to the image)? How does the application differentiate among the multiple objects of an image (e.g., a street with all of its different overlapping buildings)? How does the application provide you with information about the parts of an image that are hidden or actually not visible, and for what purposes?

For example, how does the application provide information about the closest restaurant, or hair salon, or post office when the actual shops or businesses are not visible in the image? And even if the objects are visible (e.g., the TransAmerica or the Francis Ford Coppola buildings in San Francisco), how does the application recognize them, and supply relevant information about the sites?

Google patent US 8810599, titled Image recognition in augmented reality, discloses an invention that addresses precisely these issues. Specifically, the invention seeks to match position information attached to an acquired image with stored geo-coded images, with a view to both characterizing the acquired image and matching it against the descriptive information known for the stored images, and finally superimposing Augmented Reality display data (graphic or text) on the captured image for querying by the user.
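
To make the position-matching idea more concrete, here is a minimal sketch in Python. It is not taken from the patent: the function names, the dictionary layout of the stored images and the 200-meter radius are all assumptions chosen for illustration. It simply keeps the stored geo-coded images that lie close to the position reported with the acquired image, using the standard haversine great-circle distance.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def nearby_candidates(device_lat, device_lon, stored_images, radius_m=200.0):
    """Return stored geo-coded images within radius_m of the device, nearest first.

    stored_images is a hypothetical list of dicts with "name", "lat" and "lon" keys.
    """
    hits = []
    for img in stored_images:
        d = haversine_m(device_lat, device_lon, img["lat"], img["lon"])
        if d <= radius_m:
            hits.append((d, img))
    hits.sort(key=lambda pair: pair[0])
    return [img for _, img in hits]

# Illustrative data only: the coordinates are rough approximations.
stored = [
    {"name": "pyramid", "lat": 37.7952, "lon": -122.4028},
    {"name": "bridge", "lat": 37.8199, "lon": -122.4783},
]
candidates = nearby_candidates(37.7950, -122.4030, stored)
print([img["name"] for img in candidates])
```

Narrowing the search by position first means that the more costly image comparison only has to run against a handful of stored images instead of the whole collection.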

The invention includes a number of additional aspects, including: means for adjusting position or location, when for example the image was acquired at a slightly different angle from the geo-coded stored image, or when the coordinates obtained for the position data are slightly different from what was previously stored virtually for that location; means for re-calibrating a compass tool on the computing device, and using the compass for determining aim and direction sensed when the image was taken; means for ordering the search and display of information according to popularity (e.g. coffee vs tobacco shop), among many additional aspects.
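
Two of these aspects, using the compass heading and ordering results by popularity, can be sketched in a few lines of Python. Again, this is only an illustration under stated assumptions, not the patent's method: the function names, the 60-degree field of view and the numeric "popularity" score are all hypothetical. The sketch computes the bearing from the device to each known place, keeps the places that fall within the camera's assumed field of view, and lists the most popular ones first.

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing (0 = north, 90 = east) from point 1 to point 2."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dl = math.radians(lon2 - lon1)
    y = math.sin(dl) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dl)
    return math.degrees(math.atan2(y, x)) % 360.0

def angle_diff(a, b):
    """Smallest absolute difference between two compass angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def places_in_view(lat, lon, heading_deg, places, fov_deg=60.0):
    """Places whose bearing lies within the assumed field of view, most popular first.

    places is a hypothetical list of dicts with "name", "lat", "lon", "popularity".
    """
    visible = [
        p for p in places
        if angle_diff(bearing_deg(lat, lon, p["lat"], p["lon"]), heading_deg) <= fov_deg / 2
    ]
    return sorted(visible, key=lambda p: p["popularity"], reverse=True)

# Illustrative data: the device at the origin points due north (heading 0).
places = [
    {"name": "tobacco shop", "lat": 1.0, "lon": 0.0, "popularity": 1},
    {"name": "coffee shop", "lat": 1.0, "lon": 0.1, "popularity": 5},
    {"name": "post office", "lat": 0.0, "lon": 1.0, "popularity": 10},
]
print([p["name"] for p in places_in_view(0.0, 0.0, 0.0, places)])
```

The post office lies due east, outside the assumed field of view, so it is left out even though it has the highest score. This is also why a badly calibrated compass matters: a wrong heading would annotate the wrong side of the street.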

Below appears the abstract for US 8810599, titled Image recognition in augmented reality, and above, Figure 1 of this patent, depicting the use of GPS compass coordinates for determining the location of the image objects, using a mobile device and front camera.
A computer-implemented augmented reality method includes obtaining an image acquired by a computing device running an augmented reality application, identifying image characterizing data in the obtained image, the data identifying characteristic points in the image, comparing the image characterizing data with image characterizing data for a plurality of geo-coded images stored by a computer server system, identifying locations of items in the obtained image using the comparison, and providing, for display on the computing device at the identified locations, data for textual or graphical annotations that correspond to each of the items in the obtained image, and formatted to be displayed with the obtained image or a subsequently acquired image.
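
The comparison step in the abstract can be pictured with a toy sketch in Python. It is deliberately simplified and is not the patent's algorithm: real systems use high-dimensional feature descriptors, whereas here each "characteristic point" is a made-up two-number tuple, and the names, the distance threshold and the minimum match count are assumptions. The sketch counts how many points of the acquired image have a close counterpart in each stored geo-coded image, and returns the annotations of the best-matching one.

```python
import math

def count_matches(query_desc, stored_desc, max_dist=0.25):
    """Count query descriptors whose nearest stored descriptor is within max_dist."""
    if not stored_desc:
        return 0
    matches = 0
    for q in query_desc:
        nearest = min(math.dist(q, s) for s in stored_desc)
        if nearest <= max_dist:
            matches += 1
    return matches

def best_geocoded_match(query_desc, stored_images, min_matches=2):
    """Return the stored image with the most matching points, or None if too few match.

    stored_images is a hypothetical list of dicts with "descriptors" and "annotations".
    """
    best_img, best_score = None, 0
    for img in stored_images:
        score = count_matches(query_desc, img["descriptors"])
        if score > best_score:
            best_img, best_score = img, score
    return best_img if best_score >= min_matches else None

# Illustrative data only: toy 2-D descriptors stand in for real image features.
stored_images = [
    {"descriptors": [(0.10, 0.88), (0.79, 0.20), (0.0, 0.0)],
     "annotations": ["TransAmerica building"]},
    {"descriptors": [(0.9, 0.9)],
     "annotations": ["Some other landmark"]},
]
query = [(0.1, 0.9), (0.8, 0.2), (0.5, 0.5)]
match = best_geocoded_match(query, stored_images)
print(match["annotations"] if match else "no match")
```

The abstract goes one step further than this sketch: it also identifies where each item sits in the obtained image, so that the text or graphics can be displayed at the right spot.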

Of course, when reading the above-cited abstract, you will have already noticed that in a world of patents where computer programs that transform everyone’s life cannot be patented, US 8810599 is a “computer-implemented augmented reality method…” where “an image is acquired by a computing device, running an augmented reality application…”. 

Consequently, and elsewhere in the specification of the patent, since the coded instructions of a computer program cannot be patented, you will discover that this invention also includes “tangible non-transient recordable computer storage media” that “stores the instructions which once executed make it possible to acquire an image with a device, running an augmented reality program”.

Finally, in addition to the media support for this method, the patent also covers the means for acquiring the image; means for sending the image data wirelessly to a server containing geo-coded image data; means for comparing and matching acquired data with stored data; means for extracting and displaying the virtual data for an augmented impression of reality..., plus much more, in terms of the scope of the invention and its variations: of input modes (voice, stylus, keyboard); of operating systems (Android, iOS, RIM BlackBerry, Microsoft Windows Mobile, Symbian…); of the means for determining the location of the user and the position of the image; of the networks supporting transmission of information, etc.! (The patent runs 26 pages.)

Cheers, and thanks Google! 

