US-9646201

Three dimensional (3D) modeling of a complex control object

Published: May 9, 2017
Assignee: not available in USPTO data we have
Inventors: not available in USPTO data we have
Technical Abstract

The disclosed technology automatically (e.g., programmatically) initializes predictive information for tracking a complex control object (e.g., hand, hand and tool combination, robot end effector) based upon information about characteristics of the object determined from sets of collected observed information. Automated initialization techniques obviate the need for special and often bizarre start-up rituals (place your hands on the screen at the places indicated during a full moon, and so forth) required by conventional techniques. In implementations, systems can refine initial predictive information to reflect an observed condition based on comparison of the observed with an analysis of sets of collected observed information.

Patent Claims
28 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 1

Original Legal Text

1. A method of initializing predictive information that models a complex control object in a three dimensional (3D) sensory space, the method including: accessing observed information including a set of contour points corresponding to surface points at along an outline of a complex control object in a three dimensional (3D) sensory space; transforming the set of contour points to a normalized orientation of the control object, including: at training time t0, sensing an actual position of at least one complex control object in a first reference frame of the 3D sensory space; at initialization time t1, sensing, in the 3D sensory space, an apparent position of the complex control object different from the actual position, wherein the complex control object has not moved in the 3D sensory space between t0 and t1; calculating a second reference frame that accounts for apparent position of the complex control object; calculating a transformation that renders the actual position in the first reference frame and the apparent position in the second reference frame into a common reference frame; and transforming the actual and apparent positions of the complex control object into the common reference frame, wherein the common reference frame has a fixed point of reference and an initial orientation of axes, whereby the sensed apparent position is transformed to an actual position; searching a plurality of observed information archetypes that represent poses of the control object in the normalized orientation, the poses including arrangement of features of the complex control object and a perspective of observing the complex control object, and selecting an archetype; and initializing predictive information that models the complex control object from initialization parameters associated with the selected archetype.

Plain English Translation

A method for automatically setting up tracking of a 3D object (like a hand or robot arm) by using observed data to figure out the object's initial state. The method involves: getting observed data as contour points from the object; transforming these points to a standard orientation. This transformation involves capturing the object's position at two different times, calculating a reference frame to account for positional differences, and transforming both positions into a common reference frame. Then, the system searches a database of known object poses (archetypes) to find the best match for the transformed contour points and initializes the object's 3D model using parameters from that matched archetype. This avoids manual setup procedures.
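The frame-reconciliation step can be illustrated with a minimal Python sketch. It assumes each reference frame is described by a rotation matrix and an origin expressed in the common frame; the helper names (`rot_z`, `to_common`) and all numeric values are hypothetical and do not come from the patent.

```python
import math

def rot_z(theta):
    """3x3 rotation matrix for a rotation of theta radians about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def to_common(point, rotation, origin):
    """Map a point given in a local frame into the common frame.

    `rotation` holds the local axes expressed in the common frame and
    `origin` is the local frame's origin in the common frame (assumed model).
    """
    rotated = [sum(rotation[i][j] * point[j] for j in range(3)) for i in range(3)]
    return [rotated[i] + origin[i] for i in range(3)]

# Hypothetical set-up: the sensor turned 90 degrees about z and shifted
# by one unit along x between training time t0 and initialization time t1.
frame1 = (rot_z(0.0), [0.0, 0.0, 0.0])          # first reference frame (t0)
frame2 = (rot_z(math.pi / 2), [1.0, 0.0, 0.0])  # second reference frame (t1)

actual_t0 = [1.0, 2.0, 0.5]    # position sensed in the first frame
apparent_t1 = [2.0, 0.0, 0.5]  # same stationary object, sensed in the second frame

common_actual = to_common(actual_t0, *frame1)
common_apparent = to_common(apparent_t1, *frame2)
```

In this sketch the object has not moved, so both readings land on the same point in the common frame (up to floating-point rounding) once each is mapped through its own frame.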

Claim 2

Original Legal Text

2. The method of claim 1, wherein the common reference frame is a world reference frame that does not change.

Plain English Translation

Building on the method of claim 1, the common reference frame into which the actual and apparent positions are transformed is a fixed world reference frame that does not change.

Claim 3

Original Legal Text

3. The method of claim 1, wherein the transforming the actual and apparent positions of the complex control object into the common reference frame further includes applying an affine transformation.

Plain English Translation

Building on the method of claim 1, the step that transforms the actual and apparent positions into the common reference frame applies an affine transformation. An affine transformation preserves points, straight lines, and planes.

Claim 4

Original Legal Text

4. The method of claim 1, further including determining the orientation of the complex control object at the actual position with respect to the first reference frame.

Plain English Translation

Building on the method of claim 1, the method also determines the object's orientation at its actual position, measured relative to the first reference frame.

Claim 5

Original Legal Text

5. The method of claim 1, further including determining the orientation of the complex control object at the apparent position with respect to the second reference frame.

Plain English Translation

Building on the method of claim 1, the method also determines the object's orientation at its apparent position, measured relative to the second reference frame.

Claim 6

Original Legal Text

6. The method of claim 1, further including determining a position of the complex control object at the actual position by calculating a translation of the complex control object with respect to the common reference frame.

Plain English Translation

Building on the method of claim 1, the method also determines the object's position at the actual position by calculating a translation (a displacement) of the object relative to the common reference frame.

Claim 7

Original Legal Text

7. The method of claim 1, further including determining a position of the complex control object at the apparent position by calculating a translation of the complex control object with respect to the common reference frame.

Plain English Translation

Building on the method of claim 1, the method also determines the object's position at the apparent position by calculating a translation (a displacement) of the object relative to the common reference frame.

Claim 8

Original Legal Text

8. The method of claim 1, wherein the transforming further includes at least one of: applying a vector to the set of contour points; and applying a rotation matrix to the set of contour points.

Plain English Translation

Building on the method of claim 1, the transforming step applies a vector to the set of contour points, applies a rotation matrix to them, or does both.
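As a rough illustration of applying a vector and a rotation matrix to contour points, the Python sketch below centers a small point set on its centroid and then rotates it about the z axis. Choosing the centroid as the translation target and a quarter turn as the rotation are assumptions made for the example; the claim does not say how the vector or matrix is obtained.

```python
import math

def translate(points, vector):
    """Add the same 3D vector to every contour point."""
    return [[p[i] + vector[i] for i in range(3)] for p in points]

def rotate(points, matrix):
    """Multiply every contour point by a 3x3 rotation matrix."""
    return [[sum(matrix[i][j] * p[j] for j in range(3)) for i in range(3)]
            for p in points]

def centroid(points):
    """Mean of the contour points, coordinate by coordinate."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(3)]

# Hypothetical contour: four corner points of a square outline.
contour = [[2.0, 1.0, 0.0], [4.0, 1.0, 0.0], [4.0, 3.0, 0.0], [2.0, 3.0, 0.0]]

# Step 1 (vector): shift the contour so its centroid sits at the origin.
center = centroid(contour)
centered = translate(contour, [-c for c in center])

# Step 2 (rotation matrix): turn the contour by -90 degrees about z.
theta = -math.pi / 2
rz = [[math.cos(theta), -math.sin(theta), 0.0],
      [math.sin(theta), math.cos(theta), 0.0],
      [0.0, 0.0, 1.0]]
normalized = rotate(centered, rz)
```

Translating before rotating keeps the rotation about the contour's own center rather than about the sensor origin, which is one plausible way to reach a normalized orientation.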

Claim 9

Original Legal Text

9. The method of claim 1, wherein the searching further includes traversing a linked data structure including the plurality of observed information archetypes.

Plain English Translation

Building on the method of claim 1, the search for an archetype traverses a linked data structure (such as a graph or tree) that holds the observed information archetypes, that is, the known poses.

Claim 10

Original Legal Text

10. The method of claim 9, wherein the traversing further includes: visiting a node in the data structure; comparing the transformed contour points sets to one or more pluralities of observed information archetypes associated with the node; and selecting, from the pluralities, at least one archetype having highest conformance with the transformed contour points sets of the control object.

Plain English Translation

Building on the method of claim 9, traversing the linked data structure involves visiting a node, comparing the transformed contour point sets with the archetypes associated with that node, and selecting from them at least one archetype that conforms most closely to the transformed contour points.
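The visit, compare, and select steps at a single node might look like the following Python sketch. The conformance measure used here (negative mean distance from each observed point to its nearest archetype point) is an assumed stand-in, since the claim does not define how conformance is computed, and the archetype names are hypothetical.

```python
import math

def conformance(observed, archetype_points):
    """Assumed metric: negative mean nearest-point distance (higher is better)."""
    total = 0.0
    for p in observed:
        total += min(math.dist(p, q) for q in archetype_points)
    return -total / len(observed)

def best_archetype_at_node(observed, node_archetypes):
    """Compare the observed points with every archetype stored at one node
    and return the (name, points) pair with the highest conformance."""
    return max(node_archetypes.items(),
               key=lambda item: conformance(observed, item[1]))

# Hypothetical archetypes attached to the node being visited.
node_archetypes = {
    "open_hand": [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]],
    "fist": [[0.0, 0.0, 0.0], [0.5, 0.5, 0.0], [0.0, 1.0, 0.0]],
}

# Transformed (normalized) contour points of the observed object.
observed = [[0.1, 0.0, 0.0], [1.1, 0.0, 0.0], [1.9, 0.0, 0.0]]

best_name, best_points = best_archetype_at_node(observed, node_archetypes)
```

Any score that grows as the fit improves could replace `conformance` without changing the selection logic.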

Claim 11

Original Legal Text

11. The method of claim 9, wherein the linked data structure includes a plurality of nodes representing observed information archetypes in parent-child relationship and the traversing further includes: visiting a plurality of parent nodes, each parent node in the plurality identifying one or more variants of one or more poses, and calculating a ranked list of parent nodes having highest conformance with the transformed contour points sets of the control object; and visiting a plurality of child nodes related to the parent nodes in the ranked list, each child node identifying one or more variants of one or more poses different from the one or more poses of the parent nodes, and calculating a ranked list of child nodes having highest conformance with the transformed contour points sets of the control object.

Plain English Translation

Building on the method of claim 9, the linked data structure holds archetype nodes in parent-child relationships. The traversal first visits a set of parent nodes, each identifying one or more variants of one or more poses, and ranks them by how closely they conform to the transformed contour points. It then visits the child nodes of the ranked parents, each identifying variants of poses that differ from the parents' poses, and ranks those children by conformance in the same way.
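A two-level ranked traversal of this kind can be sketched in Python as follows. To keep the example short, each archetype is reduced to a single number (`signature`) and conformance is the negative absolute difference from the observed value; both simplifications, the `keep` cut-off, and the node names are assumptions rather than details from the claim.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    signature: float  # hypothetical 1-D stand-in for an archetype's contour data
    children: list = field(default_factory=list)

def score(observed, node):
    """Assumed conformance: closer signatures score higher."""
    return -abs(observed - node.signature)

def ranked(observed, nodes, keep):
    """Return the `keep` best-conforming nodes, best first."""
    return sorted(nodes, key=lambda n: score(observed, n), reverse=True)[:keep]

def search(observed, parents, keep=2):
    """Rank parent nodes, then rank the children of the top-ranked parents."""
    top_parents = ranked(observed, parents, keep)
    candidates = [child for parent in top_parents for child in parent.children]
    top_children = ranked(observed, candidates, keep)
    return top_parents, top_children

parents = [
    Node("open", 1.0, [Node("open_spread", 1.2), Node("open_closed_fingers", 0.8)]),
    Node("fist", 5.0, [Node("fist_thumb_out", 5.3)]),
    Node("point", 2.0, [Node("point_up", 1.8), Node("point_side", 2.4)]),
]

top_parents, top_children = search(1.7, parents)
```

Only children of the ranked parents are scored, so poorly matching branches (here, the `fist` branch) are never visited.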

Claim 12

Original Legal Text

12. The method of claim 1, wherein the initializing predictive information further includes: aligning one or more model portions based at least in part upon one or more initialization parameters associated with the selected archetype.

Plain English Translation

Building on the method of claim 1, initializing the predictive information includes aligning one or more portions of the model based at least in part on initialization parameters associated with the selected archetype.

Claim 13

Original Legal Text

13. The method of claim 1, wherein the complex control object is a hand and the initialization parameters include: edge information for at least fingers of the hand.

Plain English Translation

A method for initializing a 3D predictive model of a hand involves several steps. First, the system accesses observed 3D contour points that define the hand's outline. These points are then normalized to a standard orientation. This normalization includes sensing the hand's actual position at a training time (t0) in a first reference frame, and its apparent position at an initialization time (t1) in a second reference frame, the hand not having moved between the two times. A transformation is calculated to map both positions into a common, fixed reference frame, which corrects the discrepancy and provides a normalized view. Next, the system searches a library of predefined archetypes, which represent various poses and viewing perspectives of a hand in its normalized orientation. The archetype that best matches the normalized contour points is selected. Finally, the predictive 3D model of the hand is initialized using parameters linked to the selected archetype. For a hand, these initialization parameters include edge information describing at least the fingers of the hand.

Claim 14

Original Legal Text

14. The method of claim 1, wherein the complex control object is a hand and the initialization parameters include: edge information for a palm of the hand.

Plain English Translation

Building on the method of claim 1, the control object is a hand, and the initialization parameters include edge information for the palm of the hand.

Claim 15

Original Legal Text

15. The method of claim 1, wherein the complex control object is a hand and the initialization parameters include: finger segment length information for fingers of the hand.

Plain English Translation

Building on the method of claim 1, the control object is a hand, and the initialization parameters include finger segment length information for the fingers of the hand.

Claim 16

Original Legal Text

16. The method of claim 1, wherein the complex control object is a hand and the initialization parameters include: at least one of: one or more joint angles between finger segments of fingers of the hand; a pitch angle between finger segments of fingers of the hand; and a yaw angle between finger segments of fingers of the hand.

Plain English Translation

Building on the method of claim 1, the control object is a hand, and the initialization parameters include at least one of the following: one or more joint angles between finger segments, a pitch angle between finger segments, and a yaw angle between finger segments.
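To make the notion of a joint angle between finger segments concrete, here is a small Python sketch that computes the angle at a joint from three 3D points. Representing a finger by `base`, `joint`, and `tip` points is an assumption for illustration; the patent text quoted here does not specify how the angles are measured, and pitch and yaw are not modeled.

```python
import math

def joint_angle(base, joint, tip):
    """Angle in radians at `joint` between the segments joint->base and joint->tip."""
    u = [b - j for b, j in zip(base, joint)]
    v = [t - j for t, j in zip(tip, joint)]
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, dot / norm)))

# Hypothetical finger: two unit-length segments meeting at (1, 0, 0).
straight = joint_angle([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0])
bent = joint_angle([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0])
```

With this convention a fully extended finger gives an angle of pi radians and a right-angle bend gives pi/2.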

Claim 17

Original Legal Text

17. The method of claim 1, wherein the complex control object is a hand and the initialization parameters include: joint angle and segment orientation information of the hand.

Plain English Translation

Building on the method of claim 1, the control object is a hand, and the initialization parameters include joint angle and segment orientation information for the hand.

Claim 18

Original Legal Text

18. The method of claim 1 , wherein the complex control object is a hand and the initialization parameters include: a distance between adjoining base points of fingers of the hand.

Plain English Translation

A method for initializing the predictive information that models a complex control object in a three-dimensional (3D) sensory space. The method accesses observed information that includes a set of contour points along the object's outline, then transforms those points to a normalized orientation of the object. To do so, it senses the object's actual position in a first reference frame at training time t0; senses an apparent position, different from the actual one, at initialization time t1, although the object has not moved between t0 and t1; calculates a second reference frame that accounts for the apparent position; and calculates and applies a transformation that renders both positions into a common reference frame. The method then searches a set of observed-information archetypes that represent poses of the object in the normalized orientation (each pose covers the arrangement of the object's features and the perspective from which it is observed), selects an archetype, and initializes the predictive information from the initialization parameters associated with that archetype. In this claim, the control object is a hand and the initialization parameters include the distance between the base points of adjoining fingers.

Claim 19

Original Legal Text

19. The method of claim 1 , wherein the complex control object is a hand and the initialization parameters include: a ratio of distance between adjoining base points of fingers of the hand to minimal distance between adjoining base points of the fingers.

Plain English Translation

A method for initializing the predictive information that models a complex control object in a three-dimensional (3D) sensory space. The method accesses observed information that includes a set of contour points along the object's outline, then transforms those points to a normalized orientation of the object. To do so, it senses the object's actual position in a first reference frame at training time t0; senses an apparent position, different from the actual one, at initialization time t1, although the object has not moved between t0 and t1; calculates a second reference frame that accounts for the apparent position; and calculates and applies a transformation that renders both positions into a common reference frame. The method then searches a set of observed-information archetypes that represent poses of the object in the normalized orientation (each pose covers the arrangement of the object's features and the perspective from which it is observed), selects an archetype, and initializes the predictive information from the initialization parameters associated with that archetype. In this claim, the control object is a hand and the initialization parameters include the ratio of the distance between adjoining finger base points to the minimal distance between adjoining base points.
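
The ratio named in claim 19 can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the patent's implementation: the function name `base_point_ratios` and the input layout (a list of 3D base points ordered across the hand) are hypothetical.

```python
import math

def base_point_ratios(base_points):
    """Return (distances, ratios) for adjoining finger base points.

    base_points: list of (x, y, z) tuples ordered across the hand.
    This ordering is an assumption of the sketch, not something the claim states.
    """
    # Distance between each base point and its neighbour.
    distances = [math.dist(a, b) for a, b in zip(base_points, base_points[1:])]
    smallest = min(distances)  # raises ValueError if fewer than two points
    if smallest == 0:
        raise ValueError("adjoining base points coincide")
    # Each distance expressed relative to the minimal adjoining distance.
    ratios = [d / smallest for d in distances]
    return distances, ratios

# Made-up base points, for illustration only.
bases = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (5.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
distances, ratios = base_point_ratios(bases)
```

Dividing by the minimal distance makes the parameter independent of the overall scale of the observed hand, which is one plausible reason to prefer a ratio over a raw distance.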

Claim 20

Original Legal Text

20. The method of claim 1 , wherein the complex control object is a hand and the poses identify: an angle between adjacent fingers of the hand.

Plain English Translation

A method for initializing the predictive information that models a complex control object in a three-dimensional (3D) sensory space. The method accesses observed information that includes a set of contour points along the object's outline, then transforms those points to a normalized orientation of the object. To do so, it senses the object's actual position in a first reference frame at training time t0; senses an apparent position, different from the actual one, at initialization time t1, although the object has not moved between t0 and t1; calculates a second reference frame that accounts for the apparent position; and calculates and applies a transformation that renders both positions into a common reference frame. The method then searches a set of observed-information archetypes that represent poses of the object in the normalized orientation (each pose covers the arrangement of the object's features and the perspective from which it is observed), selects an archetype, and initializes the predictive information from the initialization parameters associated with that archetype. In this claim, the control object is a hand and the poses identify the angle between adjacent fingers.

Claim 21

Original Legal Text

21. The method of claim 1 , wherein the complex control object is a hand and the poses identify: a joint angle between adjacent finger segments of the hand.

Plain English Translation

A method for initializing the predictive information that models a complex control object in a three-dimensional (3D) sensory space. The method accesses observed information that includes a set of contour points along the object's outline, then transforms those points to a normalized orientation of the object. To do so, it senses the object's actual position in a first reference frame at training time t0; senses an apparent position, different from the actual one, at initialization time t1, although the object has not moved between t0 and t1; calculates a second reference frame that accounts for the apparent position; and calculates and applies a transformation that renders both positions into a common reference frame. The method then searches a set of observed-information archetypes that represent poses of the object in the normalized orientation (each pose covers the arrangement of the object's features and the perspective from which it is observed), selects an archetype, and initializes the predictive information from the initialization parameters associated with that archetype. In this claim, the control object is a hand and the poses identify the joint angle between adjacent finger segments.
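
One way to compute a joint angle between adjacent finger segments, as referenced in claim 21, is sketched below in Python. The claim does not say how the angle is obtained; this sketch assumes three 3D joint positions are available, and the function name `joint_angle` is hypothetical.

```python
import math

def joint_angle(p_prev, p_joint, p_next):
    """Angle in degrees at p_joint between the two segments meeting there.

    180 means the segments are collinear (a straight finger in this convention).
    A zero-length segment yields 0.0, because atan2(0, 0) is 0.
    """
    # Vectors pointing from the joint to each neighbouring point.
    u = [a - b for a, b in zip(p_prev, p_joint)]
    v = [a - b for a, b in zip(p_next, p_joint)]
    cross = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    dot = sum(a * b for a, b in zip(u, v))
    # atan2(|u x v|, u . v) stays numerically stable near 0 and 180 degrees.
    return math.degrees(math.atan2(math.hypot(*cross), dot))

# Made-up joint positions, for illustration only.
straight = joint_angle((0, 0, 0), (1, 0, 0), (2, 0, 0))
bent = joint_angle((0, 0, 0), (1, 0, 0), (1, 1, 0))
```

With these inputs, `straight` comes out at about 180 degrees and `bent` at about 90 degrees.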

Claim 22

Original Legal Text

22. The method of claim 1 , wherein the complex control object is a hand and the poses identify: a ratio of hand's fingers' thickness to a maximal finger's thickness.

Plain English Translation

A method for initializing the predictive information that models a complex control object in a three-dimensional (3D) sensory space. The method accesses observed information that includes a set of contour points along the object's outline, then transforms those points to a normalized orientation of the object. To do so, it senses the object's actual position in a first reference frame at training time t0; senses an apparent position, different from the actual one, at initialization time t1, although the object has not moved between t0 and t1; calculates a second reference frame that accounts for the apparent position; and calculates and applies a transformation that renders both positions into a common reference frame. The method then searches a set of observed-information archetypes that represent poses of the object in the normalized orientation (each pose covers the arrangement of the object's features and the perspective from which it is observed), selects an archetype, and initializes the predictive information from the initialization parameters associated with that archetype. In this claim, the control object is a hand and the poses identify the ratio of the fingers' thickness to the thickness of the thickest finger.

Claim 23

Original Legal Text

23. The method of claim 1 , wherein the complex control object is a hand and the poses identify: span lengths between opposing sides of the hand.

Plain English Translation

A method for initializing the predictive information that models a complex control object in a three-dimensional (3D) sensory space. The method accesses observed information that includes a set of contour points along the object's outline, then transforms those points to a normalized orientation of the object. To do so, it senses the object's actual position in a first reference frame at training time t0; senses an apparent position, different from the actual one, at initialization time t1, although the object has not moved between t0 and t1; calculates a second reference frame that accounts for the apparent position; and calculates and applies a transformation that renders both positions into a common reference frame. The method then searches a set of observed-information archetypes that represent poses of the object in the normalized orientation (each pose covers the arrangement of the object's features and the perspective from which it is observed), selects an archetype, and initializes the predictive information from the initialization parameters associated with that archetype. In this claim, the control object is a hand and the poses identify span lengths between opposing sides of the hand.

Claim 24

Original Legal Text

24. The method of claim 1 , wherein the complex control object is a hand and the poses identify: at least one of: finger diameter length fingers of the hand; palm length of palm of the hand; palm to thumb distance of the hand; wrist length of wrist of the hand; and wrist width of wrist of the hand.

Plain English Translation

A method for initializing the predictive information that models a complex control object in a three-dimensional (3D) sensory space. The method accesses observed information that includes a set of contour points along the object's outline, then transforms those points to a normalized orientation of the object. To do so, it senses the object's actual position in a first reference frame at training time t0; senses an apparent position, different from the actual one, at initialization time t1, although the object has not moved between t0 and t1; calculates a second reference frame that accounts for the apparent position; and calculates and applies a transformation that renders both positions into a common reference frame. The method then searches a set of observed-information archetypes that represent poses of the object in the normalized orientation (each pose covers the arrangement of the object's features and the perspective from which it is observed), selects an archetype, and initializes the predictive information from the initialization parameters associated with that archetype. In this claim, the control object is a hand and the poses identify at least one of: finger diameter, palm length, palm-to-thumb distance, wrist length, and wrist width.

Claim 25

Original Legal Text

25. The method of claim 1 , wherein the complex control object is a hand and further including: using the selected archetype to determine at least one of: whether one or more fingers of the hand are extended or non-extended; one or more angles of bend for one or more fingers; a direction to which one or more fingers point; and a configuration indicating a pinch, a grab, an outside pinch, or a pointing finger.

Plain English Translation

A method for initializing the predictive information that models a complex control object in a three-dimensional (3D) sensory space. The method accesses observed information that includes a set of contour points along the object's outline, then transforms those points to a normalized orientation of the object. To do so, it senses the object's actual position in a first reference frame at training time t0; senses an apparent position, different from the actual one, at initialization time t1, although the object has not moved between t0 and t1; calculates a second reference frame that accounts for the apparent position; and calculates and applies a transformation that renders both positions into a common reference frame. The method then searches a set of observed-information archetypes that represent poses of the object in the normalized orientation (each pose covers the arrangement of the object's features and the perspective from which it is observed), selects an archetype, and initializes the predictive information from the initialization parameters associated with that archetype. In this claim, the control object is a hand and the selected archetype is also used to determine at least one of: whether one or more fingers are extended or non-extended; the bend angles of one or more fingers; the direction in which one or more fingers point; and a configuration indicating a pinch, a grab, an outside pinch, or a pointing finger.
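
As a rough illustration of how a selected archetype might yield the "extended or non-extended" determination in claim 25, the Python sketch below reads per-finger bend angles from an archetype record. Everything here is assumed for illustration: the `Archetype` record, its fields, and the 30-degree threshold do not come from the patent.

```python
from dataclasses import dataclass

@dataclass
class Archetype:
    """Hypothetical archetype record; field names are assumptions of this sketch."""
    configuration: str      # e.g. "pinch", "grab", "outside pinch", "pointing"
    bend_angles_deg: dict   # finger name -> bend angle, where 0 means straight

def extended_fingers(archetype, max_bend_deg=30.0):
    """Mark a finger as extended when its bend angle is at most max_bend_deg.

    The threshold is an arbitrary illustrative value.
    """
    return {
        finger: angle <= max_bend_deg
        for finger, angle in archetype.bend_angles_deg.items()
    }

# Made-up archetype, for illustration only.
pointing = Archetype(
    configuration="pointing",
    bend_angles_deg={"thumb": 70.0, "index": 5.0, "middle": 85.0,
                     "ring": 90.0, "pinky": 90.0},
)
flags = extended_fingers(pointing)
```

Here only the index finger falls under the threshold, which matches the "pointing" configuration stored with this made-up archetype.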

Claim 26

Original Legal Text

26. The method of claim 1 , wherein the complex control object is an automobile and the initialization parameters include: at least one of: cabin of the automobile; windshield to rear distance of the automobile; front bumper to rear bumper distance of the automobile; and distance between front of a tire and rear of the tire of the automobile.

Plain English Translation

A method for initializing the predictive information that models a complex control object in a three-dimensional (3D) sensory space. The method accesses observed information that includes a set of contour points along the object's outline, then transforms those points to a normalized orientation of the object. To do so, it senses the object's actual position in a first reference frame at training time t0; senses an apparent position, different from the actual one, at initialization time t1, although the object has not moved between t0 and t1; calculates a second reference frame that accounts for the apparent position; and calculates and applies a transformation that renders both positions into a common reference frame. The method then searches a set of observed-information archetypes that represent poses of the object in the normalized orientation (each pose covers the arrangement of the object's features and the perspective from which it is observed), selects an archetype, and initializes the predictive information from the initialization parameters associated with that archetype. In this claim, the control object is an automobile and the initialization parameters include at least one of: the cabin; the windshield-to-rear distance; the front-bumper-to-rear-bumper distance; and the distance between the front and the rear of a tire.

Claim 27

Original Legal Text

27. A system of initializing predictive information that models a complex control object in a three dimensional (3D) sensory space, the system including: a camera; and one or more processors coupled to the camera and to a non-transitory computer readable medium storing instructions thereon, which instructions when executed by the processors perform: accessing observed information including a set of contour points corresponding to surface points at along an outline of a complex control object in a three dimensional (3D) sensory space; transforming the set of contour points to a normalized orientation of the control object, including: at training time t 0 , sensing an actual position of at least one complex control object in a first reference frame of the 3D sensory space; at initialization time t 1 , sensing, in the 3D sensory space, an apparent position of the complex control object different from the actual position, wherein the complex control object has not moved in the 3D sensory space between t 0 and t 1 ; calculating a second reference frame that accounts for apparent position of the complex control object; calculating a transformation that renders the actual position in the first reference frame and the apparent position in the second reference frame into a common reference frame; and transforming the actual and apparent positions of the complex control object into the common reference frame, wherein the common reference frame has a fixed point of reference and an initial orientation of axes, whereby the sensed apparent position is transformed to an actual position; searching a plurality of observed information archetypes that represent poses of the control object in the normalized orientation, the poses including arrangement of features of the complex control object and a perspective of observing the complex control object, and selecting an archetype; and initializing predictive information that models the complex control object from initialization parameters associated with the selected archetype.

Plain English Translation

A system for automatically setting up tracking of a 3D object (like a hand or robot arm) by using observed data to figure out the object's initial state. The system includes a camera, processors, and memory. The processors execute instructions to: get observed data as contour points from the object; transform these points to a standard orientation. This transformation involves capturing the object's position at two different times, calculating a reference frame to account for positional differences, and transforming both positions into a common reference frame. Then, the system searches a database of known object poses (archetypes) to find the best match for the transformed contour points and initializes the object's 3D model using parameters from that matched archetype. This avoids manual setup procedures.
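
The processing steps in this translation can be sketched in Python. The sketch makes strong simplifying assumptions that the claim does not: the two reference frames are taken to differ by a translation only (no rotation), archetypes are stored as contours with the same point count and ordering as the observation, and matching uses a summed point-to-point distance. All function names and the archetype layout are hypothetical.

```python
import math

def frame_offset(actual_pos, apparent_pos):
    """Translation mapping the apparent position (t1) onto the actual one (t0).

    Assumes translation-only drift between frames; a fuller treatment
    would also estimate rotation.
    """
    return tuple(a - b for a, b in zip(actual_pos, apparent_pos))

def to_common_frame(points, offset):
    """Apply the translation to every observed contour point."""
    return [tuple(c + o for c, o in zip(pt, offset)) for pt in points]

def select_archetype(contour, archetypes):
    """Return (name, init_params) of the archetype closest to the contour.

    archetypes: dict mapping name -> (stored_contour, init_params).
    Contours are assumed to be the same length and in the same order.
    """
    def cost(name):
        stored, _ = archetypes[name]
        return sum(math.dist(p, q) for p, q in zip(contour, stored))
    best = min(archetypes, key=cost)
    return best, archetypes[best][1]

# Made-up data: the object has not moved, but appears shifted by +1 on x.
offset = frame_offset(actual_pos=(0.0, 0.0, 0.0), apparent_pos=(1.0, 0.0, 0.0))
observed = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
normalized = to_common_frame(observed, offset)

archetypes = {
    "flat": ([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], {"joint_angle_deg": 180.0}),
    "bent": ([(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)], {"joint_angle_deg": 90.0}),
}
name, init_params = select_archetype(normalized, archetypes)
```

The returned `init_params` stand in for the initialization parameters that seed the model; in this toy data the "flat" archetype is an exact match for the corrected contour.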

Claim 28

Original Legal Text

28. A non-transitory computer readable memory containing computer program instructions for initializing predictive information that models a complex control object in a three dimensional (3D) sensory space, which instructions when executed by one or more processors perform: accessing observed information including a set of contour points corresponding to surface points at along an outline of a complex control object in a three dimensional (3D) sensory space; transforming the set of contour points to a normalized orientation of the control object, including: at training time t 0 , sensing an actual position of at least one complex control object in a first reference frame of the 3D sensory space; at initialization time t 1 , sensing, in the 3D sensory space, an apparent position of the complex control object different from the actual position, wherein the complex control object has not moved in the 3D sensory space between t 0 and t 1 ; calculating a second reference frame that accounts for apparent position of the complex control object; calculating a transformation that renders the actual position in the first reference frame and the apparent position in the second reference frame into a common reference frame; and transforming the actual and apparent positions of the complex control object into the common reference frame, wherein the common reference frame has a fixed point of reference and an initial orientation of axes, whereby the sensed apparent position is transformed to an actual position; searching a plurality of observed information archetypes that represent poses of the control object in the normalized orientation, the poses including arrangement of features of the complex control object and a perspective of observing the complex control object, and selecting an archetype; and initializing predictive information that models the complex control object from initialization parameters associated with the selected archetype.

Plain English Translation

A computer-readable memory storing instructions to automatically set up tracking of a 3D object (like a hand or robot arm) by using observed data to figure out the object's initial state. The instructions, when executed by processors, cause the system to: get observed data as contour points from the object; transform these points to a standard orientation. This transformation involves capturing the object's position at two different times, calculating a reference frame to account for positional differences, and transforming both positions into a common reference frame. Then, the system searches a database of known object poses (archetypes) to find the best match for the transformed contour points and initializes the object's 3D model using parameters from that matched archetype. This avoids manual setup procedures.

Patent Metadata

Filing Date

June 5, 2015

Publication Date

May 9, 2017

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, FAQs, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “Three dimensional (3D) modeling of a complex control object” (US-9646201). https://patentable.app/patents/US-9646201