Eye see you – Part II

In the first post of this series, I wrote about the Avatar 3D experience and my thoughts on what would make it better. This post is also inspired by a movie, albeit in an unexpected fashion. I was traveling from Bangalore to Pune on vacation, and the movie Hum Apke Hain Kaun was playing on the in-travel entertainment system. So I settled back in my seat to enjoy the mega marriage fest. Toward the end of the movie there’s a roughly three-minute sequence that set me thinking. Here’s the sequence, to jog the memory of Indian audiences and for the benefit of those who haven’t seen this Sooraj Barjatya magnum opus.

Robot Vision

So the sub-plot being played out is that the leading lady, Nisha (played by Madhuri Dixit), is getting married to Rajesh, who has been widowed and has a newborn child to take care of. She is doing so pretty much against her wishes, but to keep the family happy, blah-blah. Now Nisha loves Prem (Salman Khan, who manages to keep his shirt mostly on), who also loves her. So Nisha is decked up for the pre-marriage ceremony but overwhelmed by sadness. She gets Tuffy, the lovable family dog, to deliver to Prem a note and a necklace that was gifted to her and Prem by her sister as the lasting token of her love.

Now, if you’re still with me, here starts the fun part (and the subject matter of this post). The dog bobs along and reaches the hall where the ceremony is about to begin. He sees Prem happily playing with Rajesh’s kid, while Rajesh is getting ready for the ceremony in which he is to marry Nisha. The camera shows Tuffy looking alternately at the two men, whom the viewers of course know to be Rajesh and Prem. Finally, he jumps up and gives the envelope and the necklace to Rajesh. And then, of course, the film ends happily with Prem and Nisha united in matrimony.

What interested me in this sequence was the visual system, visual understanding, and the role that context plays in it. Of course, it was the dog’s visual system (as opposed to the human visual system) that was in play in the movie, but I feel it has a parallel in primitive or early attempts at machine vision. The canine visual system is a ‘primitive’ visual system that perceives shapes but is not equipped with the sophisticated face recognition that we humans are endowed with. However, context plays an important role in identifying people or understanding situations.

Humanoid Vision

I have a question for you all. But before that, let’s set the stage for our little act: let us replace the dog with a humanoid that has a primitive visual system, but let us say our humanoid can learn from situations and extract meaningful, if simple, concepts from them. So our humanoid sees a human playing with a baby and another human decked up to attend a ceremony. It also sees a female decked up for the same ceremony, who hands it a message to be delivered to a specific human. Now let’s assume that the “World” has just five primary characters: the four humans (Rajesh, Prem, Nisha, and Rajesh’s newborn son) and the humanoid. The humanoid has observed the human characters interacting amongst themselves and learned simple concepts of relations. So it has learned the relations or associations that link the baby and Rajesh, Prem and Nisha. It has also learned the concept of a bond between Prem and Nisha. However, it doesn’t have a sophisticated face recognition engine, nor does it have the very … evolved brain that we humans have. Now, given this context, if the scene I just described played out before it, whom would it pick as Prem? That is, would it really pick the person we know to be Prem, or would it pick Rajesh?

What I was trying to understand is the role of these learned associations in visual understanding. Can we “recognize” people or situations when simple learned concepts and associations aid a (relatively) simple visual processing system? As a computer vision technologist, I’ve studied the role of context in visual scene understanding. However, here’s a real-life (well, reel-life, if you please) example that puts our understanding of the human visual system to the test!
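To make the thought experiment concrete, here is a minimal Python sketch of a recognizer that has no face recognition at all and relies only on learned associations. Everything in it is an assumption made up for illustration: the cue names, the numeric strengths, and in particular the idea that "dressed like the sender" stands in for the learned Prem-Nisha bond. It is one possible toy model, not a claim about how a dog, a humanoid, or any real vision system works.

```python
# Hypothetical association strengths the humanoid might have learned by
# watching the household: identity -> {cue: strength}. All values are invented.
LEARNED = {
    "Rajesh": {"with_baby": 0.9, "dressed_for_ceremony": 0.5, "dressed_like_sender": 0.2},
    "Prem":   {"with_baby": 0.4, "dressed_for_ceremony": 0.5, "dressed_like_sender": 0.8},
}

# What the weak visual system reports: it cannot tell faces apart, only cues.
SCENE = {
    "figure_a": ["with_baby"],                                    # Prem in the film
    "figure_b": ["dressed_for_ceremony", "dressed_like_sender"],  # Rajesh in the film
}

def score(identity, cues):
    """Sum the learned strengths of the observed cues for one identity."""
    return sum(LEARNED[identity].get(cue, 0.0) for cue in cues)

def pick_recipient(target, scene):
    """Pick the figure whose cues best support `target` over the other identities."""
    def margin(cues):
        rivals = [score(other, cues) for other in LEARNED if other != target]
        return score(target, cues) - max(rivals)
    return max(scene, key=lambda figure: margin(scene[figure]))

print(pick_recipient("Prem", SCENE))
```

With these assumed strengths, the sketch hands the note to figure_b, the man dressed for the ceremony, who is Rajesh in the film. That mirrors Tuffy’s choice, but only because of the weights I picked; a different set of learned associations could just as easily tip it the other way, which is exactly the question.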

What do you think about this?

6 Responses

  2. I cannot believe this is true!

  3. Just want to say your article is striking. The clarity in your post is simply striking and I can take for granted you are an expert on this subject. Well with your permission allow me to grab your rss feed to keep up to date with forthcoming post. Thanks a million and please keep up the accomplished work. Excuse my poor English. English is not my mother tongue.

    • Hello,
      Sorry for the late reply, but your comment was flagged as spam. I guess that’s because of the email address you entered.
      Thanks for your comments, and glad you like the blog :)

  4. Great entry. I appreciate you for posting it. Keep up the fine site.

  5. great job keep it up
