Visual and Kinematic Segmentation

The kinematic segmentation algorithm breaks actions at local minima of multi-scale smoothed hand speed estimates. These segments can serve as the units of search for detection, recognition, and learning. They can also be used to summarize scenes for annotation and browsing, as in the following automatically summarized 120-frame sequence of the wearer getting a drink out of the refrigerator. The automatically selected synopsis frames correspond well to reaching for the refrigerator door, opening the refrigerator door, reaching into the refrigerator to grab a drink, reaching for the refrigerator door, and closing the refrigerator door.

video synopsis using kinematic segmentation
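
As a rough illustration of breaking actions at local minima of smoothed hand speed, the following Python sketch computes candidate segment boundaries at several smoothing scales. It is not the system's actual implementation: the function names (gaussian_smooth, local_minima, segment_boundaries), the choice of Gaussian smoothing, and the default scales are all assumptions made for the example, and the page does not say how minima from different scales are combined, so the sketch simply reports the minima found at each scale.

```python
import numpy as np


def gaussian_smooth(signal, sigma):
    """Smooth a 1-D signal with a Gaussian kernel (assumed smoothing method)."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    # Repeat the edge values so the output has the same length as the input.
    padded = np.pad(signal, radius, mode="edge")
    return np.convolve(padded, kernel, mode="valid")


def local_minima(signal):
    """Return indices of interior samples lower than their left neighbor
    and no higher than their right neighbor."""
    s = np.asarray(signal, dtype=float)
    is_min = (s[1:-1] < s[:-2]) & (s[1:-1] <= s[2:])
    return np.flatnonzero(is_min) + 1


def segment_boundaries(positions, sigmas=(1.0, 2.0, 4.0)):
    """Map each smoothing scale to the speed-sample indices of its local minima.

    positions: array of shape (num_frames, dims) with estimated hand locations.
    sigmas: hypothetical smoothing scales, in frames.
    """
    positions = np.asarray(positions, dtype=float)
    # speed[k] is the distance moved between frame k and frame k + 1.
    speed = np.linalg.norm(np.diff(positions, axis=0), axis=1)
    return {s: local_minima(gaussian_smooth(speed, s)) for s in sigmas}
```

Each returned index k refers to the motion between frames k and k + 1. Larger scales suppress brief pauses, so a minimum that persists across scales is a more plausible break between actions than one that appears only at the finest scale.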

The visual segmentation system autonomously initializes visual segments at salient locations around the kinematically estimated hand location, tracks the segments, and filters them. Because of its high computational requirements, this visual processing is performed offline overnight on a small computer cluster. In the following images, the system has drawn colored outlines around the salient segments it has found.

visual segments
visual segments

The following two images show the history of salient segments tracked by the vision system over a short period of time.

visual segments
visual segments

These three images give an example of automatically selecting a series of segments whose appearance matches a previously learned appearance model. In this example, the appearance model describes hand-related segments.

visual segments