Cross-Lingual Audio Search

Published: August 12, 2014

Assignee: not available in the USPTO data we have

Technical Abstract

Patent Claims

11 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 1

Original Legal Text

1. A method comprising: accepting a search query in a first language variety, the search query being in a form of at least one of: text and audio; accessing a corpus of material in the first language variety; determining similarity of a second language variety with respect to a first language variety; choosing the second language variety based on determining that the first language variety baseforms can be obtained via data from the second language variety, and at least one selection criterion; the at least one selection criterion comprising a ranking from among ranked pairs of language varieties, the ranked pairs being ranked on a basis of determined similarity between language varieties; obtaining first language variety baseforms via data obtained from the second language variety; thereupon building a first language variety phonetic model, based on the first language variety baseforms obtained via data obtained from the second language variety; and executing an audio search based on the accepted search query, via employing the first language variety phonetic model and the second language variety.

Plain English Translation

The audio search method accepts a search query, which can be text or audio, in a specific language (the first language). It accesses a collection of material (a corpus) in that same language. It then determines how similar a second language is to the first. The second language is chosen on two grounds: pronunciation models (baseforms) for the first language can be obtained from second-language data, and at least one selection criterion is met. That criterion includes a ranking drawn from pairs of languages ranked by their determined similarity. The method then obtains first-language baseforms using data from the second language and builds a phonetic model (a model of how words sound) for the first language from those baseforms. Finally, it performs the audio search using both the first-language phonetic model and the second language.
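The selection step can be pictured with a small sketch. It is only an illustration under stated assumptions: the variety names, the similarity scores, and the functions `rank_pairs` and `choose_second_variety` are hypothetical, and the claim does not say how similarity is measured.

```python
from typing import Dict, Optional, Set, Tuple

Pair = Tuple[str, str]

# Hypothetical similarity scores between language-variety pairs
# (illustrative values only; the claim does not define the measure).
SIMILARITY: Dict[Pair, float] = {
    ("variety_a", "variety_b"): 0.82,
    ("variety_a", "variety_c"): 0.64,
    ("variety_b", "variety_c"): 0.41,
}

# Hypothetical set of varieties with enough data to derive baseforms.
HAS_BASEFORM_DATA: Set[str] = {"variety_b", "variety_c"}


def rank_pairs(similarity: Dict[Pair, float]) -> list:
    """Return the language pairs sorted from most to least similar."""
    return sorted(similarity, key=similarity.get, reverse=True)


def choose_second_variety(
    first: str, similarity: Dict[Pair, float], usable: Set[str]
) -> Optional[str]:
    """Pick the highest-ranked partner of `first` that can supply baseform data."""
    for pair in rank_pairs(similarity):
        if first in pair:
            other = pair[1] if pair[0] == first else pair[0]
            if other in usable:
                return other
    return None  # no suitable second variety found


print(choose_second_variety("variety_a", SIMILARITY, HAS_BASEFORM_DATA))
```

Walking the ranked pairs in order means the first usable partner found is also the most similar one, which is one plausible way to combine the two conditions named in the claim.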

Claim 2

Original Legal Text

2. The method according to claim 1, further comprising: pre-processing search data in the first language variety; said obtaining comprising obtaining first language variety baseforms via employing models trained on the second language variety; and said executing comprises searching the search data for the accepted search query via comparing the accepted search query and the search data.

Plain English Translation

This method builds on Claim 1, which accepts a text or audio search query in a first language, selects a similar second language, derives first-language pronunciation models (baseforms) from second-language data, builds a first-language phonetic model, and runs the audio search. Claim 2 adds three details. The search data in the first language is pre-processed. The first-language baseforms are obtained using models trained on the second language. The search itself compares the accepted query against the search data to find where the query occurs.

Claim 3

Original Legal Text

3. The method according to claim 2, wherein: said obtaining comprises obtaining baseforms for words in the first language variety from the corpus; and said pre-processing comprises pre-processing search data in the first language variety via employing phonetic models from both the first language variety and the second language variety.

Plain English Translation

This method builds on Claim 2, which adds pre-processing of the search data and a query-to-data comparison to the cross-lingual audio search of Claim 1. Claim 3 narrows two steps. Baseforms (pronunciation models) are obtained for words in the first language taken from the corpus. Pre-processing of the first-language search data uses phonetic models from both the first language and the second language.

Claim 4

Original Legal Text

4. The method according to claim 1, wherein said accessing comprises accessing corpus of first language text material from public sources and transliterating the text material into a script of the second language variety.

Plain English Translation

In the audio search method, which involves accepting a search query in a specific language (first language), accessing a corpus, determining similarity to another language (second language), selecting the second language to obtain pronunciation models (baseforms), building a phonetic model, and executing an audio search, accessing the corpus involves accessing first language text material from public sources. The text is then transliterated (converted) into the script, or writing system, of the second language. Presumably this lets models built for the second language's script process the first-language text.
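Transliteration can be sketched as a table-driven, longest-match substitution. The mapping below is a toy example chosen only to show the mechanics; the claim does not name any scripts or describe the procedure, and `TRANSLIT_TABLE` and `transliterate` are hypothetical names.

```python
from typing import Dict

# Toy grapheme mapping from a Latin-script source to a Cyrillic-script target.
# Hypothetical and incomplete; a real table would cover the whole script.
TRANSLIT_TABLE: Dict[str, str] = {
    "sh": "ш",
    "s": "с",
    "h": "х",
    "a": "а",
    "k": "к",
}


def transliterate(text: str, table: Dict[str, str]) -> str:
    """Replace source graphemes with target graphemes, longest match first."""
    max_len = max(len(key) for key in table)
    out = []
    i = 0
    while i < len(text):
        # Try the longest possible source grapheme at this position first.
        for n in range(max_len, 0, -1):
            chunk = text[i:i + n]
            if chunk in table:
                out.append(table[chunk])
                i += n
                break
        else:
            # Unmapped characters (spaces, punctuation) pass through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(transliterate("shak", TRANSLIT_TABLE))
```

Matching the longest grapheme first keeps a digraph such as "sh" from being split into two separate letters.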

Claim 5

Original Legal Text

5. The method according to claim 1, wherein said executing comprises executing the audio search via employing a statistical model trained using dictionary data of the second language variety.

Plain English Translation

In the audio search method, which involves accepting a search query in a first language, accessing a corpus, determining similarity to a second language, selecting that language to obtain pronunciation models (baseforms), building a phonetic model, and executing an audio search, the audio search is performed using a statistical model. This statistical model has been trained on dictionary data from the second language, so patterns learned from that dictionary are applied when searching first-language audio.

Claim 6

Original Legal Text

6. The method according to claim 1, wherein said executing comprises executing the audio search via employing at least one acoustic model of the second language variety and the phonetic language model of the first language variety.

Plain English Translation

In the audio search method, which involves accepting a search query in a first language, accessing a corpus, determining similarity to a second language, selecting that language to obtain pronunciation models (baseforms), building a phonetic model, and executing an audio search, the audio search is performed by using at least one acoustic model of the second language *and* the phonetic language model of the first language. The acoustic model supplies knowledge of speech sounds from the second language, while the phonetic language model supplies knowledge of how those sounds combine in the first language.

Claim 7

Original Legal Text

7. The method according to claim 1, wherein said executing comprises generating a phonetic index for search data.

Plain English Translation

In the audio search method, which involves accepting a search query in a first language, accessing a corpus, determining similarity to a second language, selecting that language to obtain pronunciation models (baseforms), building a phonetic model, and executing an audio search, executing the audio search includes generating a phonetic index for the search data. This index makes searching more efficient by providing a pre-computed representation of the sounds in the data.
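One common way to realize a phonetic index is an inverted index from phone n-grams to the places they occur. The sketch below assumes the search data has already been decoded into phone sequences; the utterance ids, the phone symbols, and the functions `build_phonetic_index` and `candidate_utterances` are hypothetical, and the claim does not specify the index structure.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Gram = Tuple[str, ...]


def build_phonetic_index(
    utterances: Dict[str, List[str]], n: int = 3
) -> Dict[Gram, List[Tuple[str, int]]]:
    """Map each phone n-gram to the (utterance id, start offset) pairs where it occurs."""
    index: Dict[Gram, List[Tuple[str, int]]] = defaultdict(list)
    for utt_id, phones in utterances.items():
        for start in range(len(phones) - n + 1):
            index[tuple(phones[start:start + n])].append((utt_id, start))
    return index


def candidate_utterances(
    index: Dict[Gram, List[Tuple[str, int]]], query_phones: List[str], n: int = 3
) -> Set[str]:
    """Return ids of utterances containing every phone n-gram of the query.

    This is a candidate filter, not an exact match: it does not check that
    the n-grams appear contiguously and in order.
    """
    grams = [tuple(query_phones[i:i + n]) for i in range(len(query_phones) - n + 1)]
    if not grams:
        return set()  # query shorter than n phones
    result = None
    for gram in grams:
        ids = {utt_id for utt_id, _ in index.get(gram, [])}
        result = ids if result is None else result & ids
    return result


# Hypothetical pre-decoded search data.
data = {"u1": ["k", "ae", "t", "s"], "u2": ["d", "ao", "g"]}
idx = build_phonetic_index(data)
print(candidate_utterances(idx, ["k", "ae", "t"]))
```

Because the index is built once ahead of time, each query only needs dictionary lookups rather than a scan of all the search data.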

Claim 8

Original Legal Text

8. The method according to claim 7, wherein said generating comprises generating a phonetic lattice of the search query in the first language variety.

Plain English Translation

The audio search method described in Claim 7, which involves accepting a search query in a first language, accessing a corpus, choosing a similar second language to obtain first language pronunciation models (baseforms), building a phonetic model for the first language, executing an audio search, and generating a phonetic index for the search data, generates a phonetic lattice of the search query in the first language. A phonetic lattice is a graph representing the possible pronunciations of a word or phrase.

Claim 9

Original Legal Text

9. The method according to claim 8, wherein the phonetic index of the search data comprises a phonetic lattice representation of the search data.

Plain English Translation

This method builds on Claim 8, in which generating the phonetic index includes generating a phonetic lattice of the search query. Here, the phonetic index of the search data itself comprises a phonetic lattice representation of that data. In other words, the search data is also represented as a graph of possible pronunciations, so the query and the data can be compared in the same form.

Claim 10

Original Legal Text

10. The method according to claim 8, wherein: said accepting comprises accepting a search query in the form of text; and said generating of a phonetic lattice comprises generating at least one baseform of the accepted search query and converting the at least one baseform to a phonetic lattice.

Plain English Translation

The audio search method using a phonetic lattice as described in Claim 8, which takes a search query in a first language, accesses a corpus, determines language similarity, selects a second language to help with the first language, builds a phonetic model, and uses a phonetic index, works like this when the search query is text: Generating the phonetic lattice includes generating at least one baseform (pronunciation) of the text query and converting that baseform into a phonetic lattice (pronunciation graph). This creates a representation of how the query might sound.
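The conversion from baseforms to a lattice can be sketched as follows. This is an assumption-laden illustration: the lattice is represented as a plain edge list with one shared start node and one shared end node, the phone symbols are made up for the example, and `baseforms_to_lattice` and `enumerate_paths` are hypothetical names rather than anything the claim defines.

```python
from typing import List, Tuple

Edge = Tuple[int, int, str]  # (source node, destination node, phone label)


def baseforms_to_lattice(baseforms: List[List[str]]) -> Tuple[int, int, List[Edge]]:
    """Build a lattice in which each baseform is one path from start to end."""
    start, end = 0, 1
    edges: List[Edge] = []
    next_node = 2
    for phones in baseforms:
        prev = start
        for i, phone in enumerate(phones):
            if i == len(phones) - 1:
                dst = end  # the last phone of every baseform reaches the shared end node
            else:
                dst = next_node
                next_node += 1
            edges.append((prev, dst, phone))
            prev = dst
    return start, end, edges


def enumerate_paths(start: int, end: int, edges: List[Edge]) -> List[List[str]]:
    """List every phone sequence that leads from start to end."""
    outgoing = {}
    for src, dst, phone in edges:
        outgoing.setdefault(src, []).append((dst, phone))
    results = []
    stack = [(start, [])]
    while stack:
        node, seq = stack.pop()
        if node == end:
            results.append(seq)
            continue
        for dst, phone in outgoing.get(node, []):
            stack.append((dst, seq + [phone]))
    return results


# Two hypothetical baseforms (alternative pronunciations) of one text query.
query_baseforms = [
    ["t", "ah", "m", "ey", "t", "ow"],
    ["t", "ah", "m", "aa", "t", "ow"],
]
start, end, edges = baseforms_to_lattice(query_baseforms)
print(sorted(enumerate_paths(start, end, edges)))
```

Each alternative pronunciation becomes its own path here; a production lattice would typically merge shared prefixes and attach scores to edges, which this sketch omits.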

Claim 11

Original Legal Text

11. The method according to claim 8, wherein: said accepting comprises accepting a search query in the form of audio; and said generating of a phonetic lattice comprises generating a phonetic lattice of the search query directly via employing an acoustic model of the second language variety and the phonetic language model of the first language variety.

Plain English Translation

The audio search method using a phonetic lattice as described in Claim 8, which takes a search query in a first language, accesses a corpus, determines language similarity, selects a second language to help with the first language, builds a phonetic model, and uses a phonetic index, works like this when the search query is audio: Generating the phonetic lattice involves generating a phonetic lattice of the search query *directly* by using an acoustic model of the second language *and* the phonetic model of the first language. The audio is directly converted into possible pronunciations based on knowledge from both languages.

Patent Metadata

Filing Date

Unknown

Publication Date

August 12, 2014

Inventors

Jitendra Ajmera

Verma Ashish
