US-9679570

Keyword determinations from voice data

Published: June 13, 2017

Assignee: not available in the USPTO data we have

Inventors: not available in the USPTO data we have

Technical Abstract

Topics of potential interest to a user, useful for purposes such as targeted advertising and product recommendations, can be extracted from voice content produced by a user. A computing device can capture voice content, such as when a user speaks into or near the device. One or more sniffer algorithms or processes can attempt to identify trigger words in the voice content, which can indicate a level of interest of the user. For each identified potential trigger word, the device can capture adjacent audio that can be analyzed, on the device or remotely, to attempt to determine one or more keywords associated with that trigger word. The identified keywords can be stored and/or transmitted to an appropriate location accessible to entities such as advertisers or content providers who can use the keywords to attempt to select or customize content that is likely relevant to the user.
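The capture step in the abstract can be illustrated with a minimal sketch. It assumes the speech has already been tokenized into words, which stands in for the raw audio the patent describes. The trigger list, the stopword list, the window size, and the function names are all hypothetical and do not come from the patent.

```python
from typing import List, Tuple

# Hypothetical trigger words and stopwords, chosen only for illustration.
TRIGGER_WORDS = {"love", "want", "need"}
STOPWORDS = {"i", "a", "the", "this", "really", "new"}


def capture_adjacent(tokens: List[str], window: int = 4) -> List[Tuple[str, List[str]]]:
    """Pair each trigger word with the tokens up to `window` positions around it."""
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() in TRIGGER_WORDS:
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            hits.append((token.lower(), tokens[start:end]))
    return hits


def extract_keywords(trigger: str, segment: List[str]) -> List[str]:
    """Keep segment tokens that are neither stopwords nor the trigger itself."""
    return [t.lower() for t in segment
            if t.lower() not in STOPWORDS and t.lower() != trigger]


tokens = "I really want a new mountain bike this summer".split()
hits = capture_adjacent(tokens)
keywords = extract_keywords(*hits[0])
print(hits)
print(keywords)
```

A real device would run the first stage on audio and pass only the captured segment to a speech recognizer. The token window here only mimics that "adjacent audio" idea.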

Patent Claims

20 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 1

Original Legal Text

1. A computer-implemented method, comprising: receiving speech data; determining at least one trigger word in the speech data; generating text data by performing one or more speech recognition processes on a portion of the speech data that includes the at least one trigger word and that satisfies a predetermined factor; determining at least one keyword in the text data; determining information based at least in part upon two or more of past behavior data associated with a user profile, a received update from another source, or another keyword associated with the user profile; and determining content to be provided based at least in part upon the at least one trigger word, the at least one keyword, and the information.

Plain English Translation

A computer-implemented method analyzes spoken input to select content for a user. It receives speech data, identifies "trigger words" that may signal user interest, and runs speech recognition on a portion of the speech that contains a trigger word and satisfies a predetermined factor. The resulting text is analyzed to find keywords. The method also determines supporting information from at least two of the following: past behavior tied to a user profile, an update received from another source, or another keyword associated with the profile. It then selects content (such as an advertisement or a recommendation) based on the trigger words, the keywords, and that information.
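The final two steps of claim 1 (gathering information from two or more sources, then determining content) could be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the `UserProfile` fields, the `POSITIVE_TRIGGERS` set, the tag-overlap scoring rule, and the catalog are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

# Hypothetical set of triggers treated as signalling positive interest.
POSITIVE_TRIGGERS = {"love", "want", "need"}


@dataclass
class UserProfile:
    past_behavior: List[str] = field(default_factory=list)
    other_keywords: List[str] = field(default_factory=list)


def gather_information(profile: UserProfile, update: Optional[str] = None) -> List[str]:
    """Combine the available sources, requiring at least two of the three."""
    sources = [
        profile.past_behavior,
        [update] if update else [],
        profile.other_keywords,
    ]
    available = [s for s in sources if s]
    if len(available) < 2:
        raise ValueError("need at least two information sources")
    return [item for source in available for item in source]


def determine_content(trigger: str, keyword: str, information: List[str],
                      catalog: Dict[str, Set[str]]) -> Optional[str]:
    """Pick the catalog item whose tags best match the keyword and information."""
    if trigger not in POSITIVE_TRIGGERS:
        return None  # assumed policy: only positive triggers select content

    def score(tags: Set[str]) -> int:
        # A keyword match counts double; each matching information item counts once.
        return 2 * (keyword in tags) + len(tags & set(information))

    best = max(catalog, key=lambda name: score(catalog[name]))
    return best if score(catalog[best]) > 0 else None


profile = UserProfile(past_behavior=["cycling"], other_keywords=["helmet"])
info = gather_information(profile)
catalog = {"bike_ad": {"bike", "cycling"}, "car_ad": {"car", "driving"}, "helmet_ad": {"helmet"}}
choice = determine_content("want", "bike", info, catalog)
print(choice)
```

The `ValueError` in `gather_information` reflects the claim's "two or more of" wording. How a real system weighs the three inputs is not specified in the claim.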

Claim 2

Original Legal Text

2. The computer-implemented method of claim 1, further comprising providing the content within an advertisement or a recommendation.

Plain English Translation

Building on the method of claim 1, the content selected from the trigger words, keywords, and profile information is delivered within an advertisement or a recommendation.

Claim 3

Original Legal Text

3. The computer-implemented method of claim 1, further comprising providing the content as audio data with a user-configurable volume.

Plain English Translation

Building on the method of claim 1, the selected content is delivered as audio, and the user can configure the playback volume.

Claim 4

Original Legal Text

4. The computer-implemented method of claim 1, further comprising providing the content as visual data.

Plain English Translation

Building on the method of claim 1, the selected content is presented to the user as visual data (e.g., an image or video).

Claim 5

Original Legal Text

5. The computer-implemented method of claim 1, further comprising: capturing the speech data to during a communication session.

Plain English Translation

This method of content targeting captures spoken data during a communication session, such as a phone call or video conference. The speech is then analyzed to identify trigger words and keywords in order to present relevant content.

Claim 6

Original Legal Text

6. The computer-implemented method of claim 5, further comprising providing the content using a second computing device.

Plain English Translation

Building on claim 5, the content selected from speech captured during a communication session is provided on a second computing device, different from the device that captured the speech. For example, speech from a phone call might trigger an ad on a nearby tablet.

Claim 7

Original Legal Text

7. The computer-implemented method of claim 6, further comprising: determining the user profile based on an association between a computing device used to capture the speech data and the user profile.

Plain English Translation

Building on claim 6, the user profile is identified through its association with the computing device that captured the speech. For example, if a phone is known to be associated with a particular profile, speech captured by that phone is attributed to that profile and used to personalize content.

Claim 8

Original Legal Text

8. The computer-implemented method of claim 1, further comprising causing the at least one trigger word to be stored by a first computing device or at least one remote data store.

Plain English Translation

The trigger words identified in the speech are stored, either on a first computing device or in at least one remote data store (e.g., a database). Stored trigger words could, for example, support tracking of frequently occurring triggers, although the claim itself requires only the storage step.

Claim 9

Original Legal Text

9. The computer-implemented method of claim 8, further comprising comparing the at least one trigger word with a plurality of trigger words and variants of the plurality of trigger words.

Plain English Translation

Building on claim 8, the method compares each identified trigger word against a set of known trigger words and variants of those words. Matching on variants lets the method recognize a trigger even when the exact listed word is not spoken. The claim does not specify what counts as a variant, which could include inflected forms or synonyms.
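One simple way to picture the variant comparison is a lookup table that maps every variant back to its canonical trigger word. The table contents and the names `TRIGGER_VARIANTS`, `build_lookup`, and `match_trigger` are assumptions made for this sketch; the patent does not define a particular data structure or matching rule.

```python
from typing import Dict, List, Optional

# Hypothetical trigger words with hand-picked variants, for illustration only.
TRIGGER_VARIANTS: Dict[str, List[str]] = {
    "love": ["loves", "loved", "adore"],
    "want": ["wants", "wanted", "wanna"],
}


def build_lookup(table: Dict[str, List[str]]) -> Dict[str, str]:
    """Map each canonical trigger and each of its variants to the canonical form."""
    lookup = {}
    for canonical, variants in table.items():
        lookup[canonical] = canonical
        for variant in variants:
            lookup[variant] = canonical
    return lookup


LOOKUP = build_lookup(TRIGGER_VARIANTS)


def match_trigger(word: str) -> Optional[str]:
    """Return the canonical trigger for a spoken word, or None if it is not one."""
    return LOOKUP.get(word.strip().lower())


print(match_trigger("Wanna"))
print(match_trigger("table"))
```

An exact-match table is the simplest option. A production system might instead use stemming or phonetic matching, which this sketch does not attempt.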

Claim 10

Original Legal Text

10. The computer-implemented method of claim 9, further comprising: determining additional information associated with the speech data, the additional information including at least one of a time stamp, geographic coordinates, identity information associated with the user profile, a context of the speech data, or a priority of the at least one keyword.

Plain English Translation

Building on claim 9, the method also determines additional information associated with the speech data, including at least one of: a time stamp, geographic coordinates, identity information associated with the user profile, the context of the speech, or a priority for the keyword. Such context can then help refine which content is selected.
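The additional information could be carried in a small record like the one below. This is a sketch under assumptions: the class name `SpeechAnnotation`, its field names, and its field types are hypothetical, and the check in `__post_init__` simply mirrors the claim's "at least one of" wording.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SpeechAnnotation:
    keyword: str
    time_stamp: Optional[float] = None                 # seconds since the epoch
    coordinates: Optional[Tuple[float, float]] = None  # (latitude, longitude)
    identity: Optional[str] = None                     # identifier tied to the user profile
    context: Optional[str] = None                      # e.g. "phone call"
    priority: Optional[int] = None                     # keyword priority

    _EXTRAS = ("time_stamp", "coordinates", "identity", "context", "priority")

    def __post_init__(self) -> None:
        # Require at least one piece of additional information.
        if not self.provided_fields():
            raise ValueError("at least one additional field is required")

    def provided_fields(self) -> List[str]:
        """Names of the optional fields that were actually supplied."""
        return [name for name in self._EXTRAS if getattr(self, name) is not None]


# Arbitrary example values.
note = SpeechAnnotation("bike", time_stamp=1700000000.0, priority=1)
print(note.provided_fields())
```

`_EXTRAS` has no type annotation, so the dataclass machinery treats it as a plain class attribute rather than a field.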

Claim 11

Original Legal Text

11. A system, comprising: at least one processor; and at least one memory device including instructions that, when executed by the at least one processor, cause the system to: receive speech data; determine at least one trigger word in the speech data; generate text data by performing one or more speech recognition processes on a portion of the speech data that includes the at least one trigger word and that satisfies a predetermined factor; determine at least one keyword in the text data; determine information based at least in part upon two or more of past behavior data associated with a user profile, a received update from another source, or another keyword associated with the user profile; and determine content to be provided based at least in part upon the at least one trigger word, the at least one keyword, and the information.

Plain English Translation

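A system with at least one processor and a memory device carries out the same steps as the method of claim 1. The system receives speech data, finds trigger words, runs speech recognition on a portion of the speech that contains a trigger word and satisfies a predetermined factor, and extracts keywords from the resulting text. It determines supporting information from at least two of: past user behavior, an update received from another source, or another keyword associated with the user profile. Finally, it determines content (such as an advertisement or a recommendation) based on the trigger words, the keywords, and that information.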

Claim 12

Original Legal Text

12. The system of claim 11, wherein the content is an advertisement or a recommendation.

Plain English Translation

In the system of claim 11, the content determined from the trigger words, keywords, and profile information takes the form of either an advertisement or a recommendation.

Claim 13

Original Legal Text

13. The system of claim 11, wherein the content is provided as audio data with a user-configurable volume.

Plain English Translation

In the system of claim 11, the content is provided as audio whose volume the user can configure.

Claim 14

Original Legal Text

14. The system of claim 11, wherein the content is provided as visual data.

Plain English Translation

In the system of claim 11, the content is provided as visual data, such as images or videos.

Claim 15

Original Legal Text

15. The system of claim 11, wherein the speech data is captured using a first computing device during a communication session.

Plain English Translation

In the system of claim 11, the speech data is captured using a first computing device during a communication session (e.g., a phone call). The system then analyzes that speech for trigger words and keywords and uses them to determine targeted content.

Claim 16

Original Legal Text

16. The system of claim 15, wherein the content is provided using a second computing device.

Plain English Translation

The system captures speech on one device during a communication session, and displays content triggered by that speech on a different device. For example, speech captured by a phone during a call results in a related advertisement on the user's tablet.

Claim 17

Original Legal Text

17. The system of claim 15, wherein the first computing device is associated with the user profile.

Plain English Translation

The first computing device, which captures the speech, is associated with the user's profile. For example, if a user's phone is associated with a profile, speech captured by that phone can be attributed to that profile and used to personalize the content the user receives.

Claim 18

Original Legal Text

18. The system of claim 11, wherein the at least one trigger word corresponds to a plurality of trigger words stored by a first computing device or at least one remote data store.

Plain English Translation

The trigger words recognized by the system correspond to a set of trigger words stored on a first computing device or in at least one remote data store. The system identifies trigger words in the speech by reference to this stored set.

Claim 19

Original Legal Text

19. The system of claim 11, wherein the instructions, when executed by the at least one processor, further cause the system to: compare the at least one trigger word with a plurality of trigger words and variants of the plurality of trigger words.

Plain English Translation

The system compares each identified trigger word against a set of trigger words and variants of those words. The claim does not define what counts as a variant, which could include inflected forms or synonyms. Matching on variants allows more robust identification of user interest.

Claim 20

Original Legal Text

20. The system of claim 11, wherein the instructions when executed by the at least one processor, further cause the system to: determine additional information associated with the speech data, the additional information including at least one of a time stamp, geographic coordinates, identity information associated with the user profile, a context of the speech data, or a priority of the at least one keyword.

Plain English Translation

The system determines additional information associated with the speech data, including at least one of: a time stamp, geographic coordinates, identity information associated with the user profile, the context of the speech, or keyword priority. Such context can help refine content determination.

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G10L G06Q

Patent Metadata

Filing Date

August 17, 2015

Publication Date

June 13, 2017
