Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. A method of audio processing, the method comprising: receiving a plurality of audio samples; concatenating the plurality of audio samples to form a composite audio signal; analysing the composite audio signal to identify audio artefacts associated with concatenation in the composite audio signal, wherein analysing comprises: monitoring an energy level of the composite audio signal; monitoring a rate of change of a tracking envelope of the composite audio signal; and identifying audio artefacts associated with concatenation based on both the monitored energy level of the composite audio signal and the monitored rate of change of the energy level of the composite audio signals; compensating for the identified audio artefacts to form a corrected composite audio signal; and providing the corrected composite audio signal to a voice biometrics module.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. The analysis monitors both the audio signal's energy level and the rate of change of a tracking envelope of the signal. Audio artifacts are identified based on a combination of these monitored energy levels and their rates of change. After identification, these artifacts are compensated for, resulting in a corrected composite audio signal. This cleaned signal is then sent to a voice biometrics module for use in tasks like speaker identification or verification.
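As a rough illustration of the concatenation step, the sketch below joins a few short segments and records where each join falls. The segment values and the `concatenate_segments` helper are hypothetical; the claim does not say how samples are stored, nor that join positions are tracked.

```python
from itertools import chain


def concatenate_segments(segments):
    """Join audio segments end to end and return (composite, join_indices).

    join_indices holds the index of the first sample of every segment
    after the first, i.e. the places where a discontinuity can appear.
    """
    composite = list(chain.from_iterable(segments))
    join_indices = []
    position = 0
    for segment in segments[:-1]:
        position += len(segment)
        join_indices.append(position)
    return composite, join_indices


# Hypothetical segments: each ends at a level unrelated to where the next begins.
segments = [[0.0, 0.1, 0.2], [0.9, 0.8], [-0.5, -0.4]]
composite, join_indices = concatenate_segments(segments)

for j in join_indices:
    jump = abs(composite[j] - composite[j - 1])
    print(f"join at sample {j}: jump of {jump:.2f}")
```

The abrupt jumps at the joins are the kind of discontinuity that can be heard as a pop or click.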
2. A method according to claim 1, wherein the step of analysing the composite audio signal to identify audio artefacts associated with concatenation in the composite audio signal comprises: identifying a pop or click in the composite audio signal.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. This analysis, which specifically includes identifying *pops or clicks* in the composite audio signal, monitors both the audio signal's energy level and the rate of change of a tracking envelope of the signal. Audio artifacts are identified based on a combination of these monitored energy levels and their rates of change. After identification, these artifacts are compensated for, resulting in a corrected composite audio signal. This cleaned signal is then sent to a voice biometrics module for use in tasks like speaker identification or verification.
3. A method according to claim 1, wherein the step of monitoring an energy level of the composite audio signal comprises: forming an energy tracking envelope of the composite audio signal.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. The analysis monitors the audio signal's energy level *by forming an energy tracking envelope of the composite audio signal*, and also monitors the rate of change of a tracking envelope of the signal. Audio artifacts are identified based on a combination of the monitored energy level (derived from the energy tracking envelope) and the monitored rate of change of the energy level. After identification, these artifacts are compensated for, resulting in a corrected composite audio signal. This cleaned signal is then sent to a voice biometrics module for use in tasks like speaker identification or verification.
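One common way to form an energy tracking envelope is a one-pole follower applied to the squared signal, with separate attack and release coefficients. The sketch below takes that approach as an assumption; the claim does not specify the envelope's form, and the function name and coefficient values are illustrative only.

```python
def energy_tracking_envelope(signal, attack=0.01, release=0.001):
    """One-pole follower applied to the squared signal.

    attack and release are per-sample smoothing coefficients in (0, 1];
    they are illustrative values, not taken from the patent.
    """
    envelope = []
    level = 0.0
    for sample in signal:
        energy = sample * sample
        # Rise with the attack coefficient, fall with the release coefficient.
        coeff = attack if energy > level else release
        level += coeff * (energy - level)
        envelope.append(level)
    return envelope


# 100 loud samples followed by 100 silent ones.
tone = [1.0] * 100 + [0.0] * 100
env = energy_tracking_envelope(tone)
print(f"after loud part: {env[99]:.3f}, after silent part: {env[199]:.3f}")
```

With small coefficients like these the envelope moves slowly, so it reflects the recent average energy rather than individual samples.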
4. A method according to claim 1, wherein the step of monitoring a rate of change of the energy level of the composite audio signal comprises: forming a signal tracking envelope of the composite audio signal; and determining a rate of change of the signal tracking envelope of the composite audio signal.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. The analysis monitors the audio signal's energy level and also monitors the rate of change of the energy level *by forming a signal tracking envelope of the composite audio signal and then determining the rate of change of this signal tracking envelope*. Audio artifacts are identified based on a combination of the monitored energy level and the monitored rate of change of the energy level (derived from the signal tracking envelope's rate of change). After identification, these artifacts are compensated for, resulting in a corrected composite audio signal. This cleaned signal is then sent to a voice biometrics module for use in tasks like speaker identification or verification.
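A minimal sketch of this step, assuming the signal tracking envelope follows the absolute value of the signal and that the rate of change is taken as a per-sample first difference. Both choices, along with the function names and coefficients, are assumptions made for illustration.

```python
def signal_tracking_envelope(signal, attack=0.5, release=0.05):
    """One-pole follower applied to the absolute value of the signal."""
    envelope = []
    level = 0.0
    for sample in signal:
        magnitude = abs(sample)
        coeff = attack if magnitude > level else release
        level += coeff * (magnitude - level)
        envelope.append(level)
    return envelope


def rate_of_change(envelope):
    """Per-sample first difference; the first entry is 0.0 by convention."""
    rates = []
    previous = envelope[0] if envelope else 0.0
    for value in envelope:
        rates.append(value - previous)
        previous = value
    return rates


# Silence with a single-sample click at index 50.
click = [0.0] * 50 + [1.0] + [0.0] * 49
rates = rate_of_change(signal_tracking_envelope(click))
peak_index = max(range(len(rates)), key=rates.__getitem__)
print(f"largest rise at sample {peak_index}")
```

The largest positive rate of change lines up with the click, which is what makes this quantity useful for locating abrupt onsets.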
5. A method according to claim 4, wherein the signal tracking envelope has a faster attack time constant than the energy tracking envelope.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. The analysis monitors the audio signal's energy level by forming an *energy tracking envelope* of the composite audio signal. It also monitors the rate of change of the energy level by forming a *signal tracking envelope* of the composite audio signal and determining its rate of change. Crucially, this *signal tracking envelope* is configured to have a *faster attack time constant* than the *energy tracking envelope*, allowing it to respond more quickly to sudden changes. Audio artifacts are identified based on both the monitored energy level (from the energy tracking envelope) and the rate of change (from the signal tracking envelope). After identification, these artifacts are compensated for, resulting in a corrected composite audio signal that is then sent to a voice biometrics module.
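To make the attack-time relationship concrete, the sketch below converts time constants into per-sample coefficients for a one-pole follower and compares how far each envelope rises during a short burst. The 1 ms and 50 ms attack times, the 100 ms release, the 16 kHz sample rate, and the helper names are all assumed values chosen for illustration, not figures from the patent.

```python
import math


def coeff_from_time_constant(time_constant_s, sample_rate_hz):
    """Per-sample smoothing coefficient of a one-pole follower."""
    return 1.0 - math.exp(-1.0 / (time_constant_s * sample_rate_hz))


def follow(values, attack, release):
    """Track non-negative values with separate attack and release coefficients."""
    envelope = []
    level = 0.0
    for value in values:
        coeff = attack if value > level else release
        level += coeff * (value - level)
        envelope.append(level)
    return envelope


SAMPLE_RATE = 16000  # assumed sample rate in Hz
fast_attack = coeff_from_time_constant(0.001, SAMPLE_RATE)  # 1 ms, signal envelope
slow_attack = coeff_from_time_constant(0.050, SAMPLE_RATE)  # 50 ms, energy envelope
release = coeff_from_time_constant(0.100, SAMPLE_RATE)

# Silence, an 8-sample burst, then silence again.
click = [0.0] * 100 + [1.0] * 8 + [0.0] * 100
signal_env = follow([abs(s) for s in click], fast_attack, release)
energy_env = follow([s * s for s in click], slow_attack, release)

print(f"signal envelope at end of burst: {signal_env[107]:.3f}")
print(f"energy envelope at end of burst: {energy_env[107]:.3f}")
```

A shorter time constant gives a larger coefficient, so the signal tracking envelope reacts to the burst while the energy tracking envelope has barely moved.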
6. A method according to claim 4, wherein the step of monitoring an energy level of the composite audio signal comprises forming an energy tracking envelope of the composite audio signal, and wherein the step of identifying audio artefacts associated with concatenation based on both the monitored energy level of the composite audio signal and monitored rate of change of the energy level of the composite audio signal comprises: determining whether a parameter of the energy tracking envelope exceeds a first threshold level; determining whether the rate of change of the signal tracking envelope exceeds a second threshold level; and responsive to the parameter of the energy tracking envelope not exceeding the first threshold level, and the rate of change of the signal tracking envelope exceeding the second threshold level, identifying an audio artefact.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. The analysis monitors the audio signal's energy level by forming an *energy tracking envelope*. It also monitors the rate of change of the energy level by forming a *signal tracking envelope* and determining its rate of change. Audio artifacts are specifically identified when two conditions are met: first, a parameter of the *energy tracking envelope does not exceed a first threshold level*; and second, the *rate of change of the signal tracking envelope exceeds a second threshold level*. If both conditions are true, an audio artifact is identified. After identification, these artifacts are compensated for, resulting in a corrected composite audio signal that is then sent to a voice biometrics module.
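The two-condition test can be sketched as a simple per-sample check. Here the "parameter of the energy tracking envelope" is assumed to be the envelope value itself, and the input lists and threshold values are made-up numbers for illustration.

```python
def identify_artefacts(energy_envelope, envelope_rate, first_threshold, second_threshold):
    """Return indices where energy is low but the signal envelope rises sharply."""
    flagged = []
    for index, (energy, rate) in enumerate(zip(energy_envelope, envelope_rate)):
        quiet = energy <= first_threshold   # parameter does NOT exceed the first threshold
        abrupt = rate > second_threshold    # rate of change exceeds the second threshold
        if quiet and abrupt:
            flagged.append(index)
    return flagged


# Hypothetical per-sample values.
energy_envelope = [0.01, 0.01, 0.02, 0.60, 0.65]
envelope_rate = [0.00, 0.00, 0.45, 0.50, 0.02]

print(identify_artefacts(energy_envelope, envelope_rate,
                         first_threshold=0.1, second_threshold=0.3))  # prints [2]
```

Sample 3 also rises sharply, but its energy is already high, so it is treated as ordinary loud audio rather than an artifact.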
7. A method according to claim 6, wherein the second threshold level is set based on a maximum expected slew rate of the composite audio signal.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. The analysis monitors the audio signal's energy level by forming an *energy tracking envelope*. It also monitors the rate of change of the energy level by forming a *signal tracking envelope* and determining its rate of change. Audio artifacts are specifically identified when two conditions are met: first, a parameter of the *energy tracking envelope does not exceed a first threshold level*; and second, the *rate of change of the signal tracking envelope exceeds a second threshold level*. This *second threshold level is specifically set based on the maximum expected slew rate of the composite audio signal*. If both conditions are true, an audio artifact is identified. After identification, these artifacts are compensated for, resulting in a corrected composite audio signal that is then sent to a voice biometrics module.
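The claim does not say how the maximum expected slew rate is obtained. One possible estimate, offered here purely as an assumption, uses the fact that a sinusoid of amplitude A and frequency f changes by at most 2πfA per second, which is 2πfA / fs per sample at sample rate fs. The 4 kHz bandwidth and 16 kHz sample rate below are illustrative, and how such a bound would be scaled for the envelope's rate of change is left open.

```python
import math


def max_slew_per_sample(peak_amplitude, max_frequency_hz, sample_rate_hz):
    """Upper bound on the per-sample change of a band-limited sinusoid."""
    return 2.0 * math.pi * max_frequency_hz * peak_amplitude / sample_rate_hz


SAMPLE_RATE = 16000  # assumed sample rate in Hz
second_threshold = max_slew_per_sample(1.0, 4000.0, SAMPLE_RATE)

# Sanity check: a full-scale 4 kHz tone never steps by more than the bound.
tone = [math.sin(2.0 * math.pi * 4000.0 * n / SAMPLE_RATE) for n in range(64)]
largest_step = max(abs(b - a) for a, b in zip(tone, tone[1:]))

print(f"bound: {second_threshold:.3f}, largest observed step: {largest_step:.3f}")
```

A change larger than this bound could not come from in-band audio at the assumed amplitude, which is the intuition behind tying the threshold to slew rate.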
8. A method according to claim 4, wherein the step of monitoring an energy level of the composite audio signal comprises forming an energy tracking envelope of the composite audio signal, and wherein the step of identifying audio artefacts associated with concatenation based on both the monitored energy level of the composite audio signal and monitored rate of change of the energy level of the composite audio signal comprises determining whether the ratio of the rate of change of the signal tracking envelope and a parameter of the energy tracking envelope exceeds a third threshold level; and responsive to the ratio of the rate of change of the signal tracking envelope and the parameter of the energy tracking envelope exceeding the third threshold level, identifying an audio artefact.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. The analysis monitors the audio signal's energy level by forming an *energy tracking envelope*. It also monitors the rate of change of the energy level by forming a *signal tracking envelope* and determining its rate of change. Audio artifacts are specifically identified when a calculated *ratio* exceeds a certain threshold: this ratio is derived from the *rate of change of the signal tracking envelope divided by a parameter of the energy tracking envelope*. If this ratio *exceeds a third threshold level*, an audio artifact is identified. After identification, these artifacts are compensated for, resulting in a corrected composite audio signal that is then sent to a voice biometrics module.
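The ratio test folds both quantities into a single comparison. In this sketch the "parameter" is again assumed to be the envelope value itself, and a small `floor` (an addition not found in the claim) guards against division by zero during silence. The input values are hypothetical.

```python
def identify_by_ratio(energy_envelope, envelope_rate, third_threshold, floor=1e-9):
    """Return indices where rate / energy exceeds the third threshold."""
    flagged = []
    for index, (energy, rate) in enumerate(zip(energy_envelope, envelope_rate)):
        ratio = rate / max(energy, floor)  # floor avoids dividing by zero
        if ratio > third_threshold:
            flagged.append(index)
    return flagged


# Hypothetical per-sample values.
energy_envelope = [0.01, 0.01, 0.02, 0.60, 0.65]
envelope_rate = [0.00, 0.00, 0.45, 0.50, 0.02]

print(identify_by_ratio(energy_envelope, envelope_rate, third_threshold=5.0))  # prints [2]
```

A sharp rise against a quiet background produces a large ratio, while the same rise against loud audio does not.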
9. A method according to claim 1, wherein the plurality of audio samples represent speech.
A method for audio processing involves receiving multiple audio samples, which *represent speech*, and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. The analysis monitors both the audio signal's energy level and the rate of change of a tracking envelope of the signal. Audio artifacts are identified based on a combination of these monitored energy levels and their rates of change. After identification, these artifacts are compensated for, resulting in a corrected composite audio signal. This cleaned signal is then sent to a voice biometrics module for use in tasks like speaker identification or verification.
10. A method according to claim 9, wherein the plurality of samples representing speech are received from a speaker diarisation process.
A method for audio processing involves receiving multiple audio samples, which represent speech and are *specifically received from a speaker diarisation process*, and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. The analysis monitors both the audio signal's energy level and the rate of change of a tracking envelope of the signal. Audio artifacts are identified based on a combination of these monitored energy levels and their rates of change. After identification, these artifacts are compensated for, resulting in a corrected composite audio signal. This cleaned signal is then sent to a voice biometrics module for use in tasks like speaker identification or verification.
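As an illustration of how diarised speech might feed the concatenation step, the sketch below assumes the diarisation process yields `(speaker_label, samples)` pairs. That output format, the labels, and the helper name are hypothetical; the claim only states where the samples come from.

```python
def collect_speaker_segments(diarised, speaker):
    """Keep only the segments attributed to one speaker, in their original order."""
    return [samples for label, samples in diarised if label == speaker]


# Hypothetical diarisation output: alternating speakers A and B.
diarised = [
    ("A", [0.1, 0.2]),
    ("B", [0.5]),
    ("A", [0.3]),
]

segments = collect_speaker_segments(diarised, "A")
composite = [sample for segment in segments for sample in segment]
print(composite)  # prints [0.1, 0.2, 0.3]
```

Because speaker B's segment is dropped, the two A segments become adjacent, which is where a concatenation artifact could arise.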
11. A method according to claim 9, wherein the plurality of samples representing speech comprise a plurality of utterances received from multiple different sessions where an individual has provided speech to the system.
A method for audio processing involves receiving multiple audio samples, which represent speech and *comprise a plurality of utterances received from multiple different sessions where an individual has provided speech to the system*. These samples are then combined (concatenated) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. The analysis monitors both the audio signal's energy level and the rate of change of a tracking envelope of the signal. Audio artifacts are identified based on a combination of these monitored energy levels and their rates of change. After identification, these artifacts are compensated for, resulting in a corrected composite audio signal. This cleaned signal is then sent to a voice biometrics module for use in tasks like speaker identification or verification.
12. A method according to claim 1, wherein the step of compensating for the identified audio artefacts to form a corrected composite audio signal comprises: applying a time-variable gain to the composite audio signal.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. The analysis monitors both the audio signal's energy level and the rate of change of a tracking envelope of the signal. Audio artifacts are identified based on a combination of these monitored energy levels and their rates of change. Once identified, these artifacts are compensated for *by applying a time-variable gain to the composite audio signal*, resulting in a corrected composite audio signal. This cleaned signal is then sent to a voice biometrics module for use in tasks like speaker identification or verification.
13. A method according to claim 12, wherein the time-variable gain comprises a Gaussian profile.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. The analysis monitors both the audio signal's energy level and the rate of change of a tracking envelope of the signal. Audio artifacts are identified based on a combination of these monitored energy levels and their rates of change. Once identified, these artifacts are compensated for *by applying a time-variable gain with a Gaussian profile to the composite audio signal*, resulting in a corrected composite audio signal. This cleaned signal is then sent to a voice biometrics module for use in tasks like speaker identification or verification.
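One plausible reading of a Gaussian-profile gain is a smooth dip centred on the detected artifact, of the form gain[n] = 1 - depth · exp(-(n - centre)² / (2 · width²)). The sketch below uses that form as an assumption; the claim does not give the formula, and `width`, `depth`, and the function name are illustrative.

```python
import math


def apply_gaussian_gain(signal, centre, width=3.0, depth=1.0):
    """Attenuate the signal around `centre` with a Gaussian-shaped dip.

    The gain is 1 far from the centre and (1 - depth) at the centre.
    """
    corrected = []
    for n, sample in enumerate(signal):
        dip = depth * math.exp(-((n - centre) ** 2) / (2.0 * width ** 2))
        corrected.append(sample * (1.0 - dip))
    return corrected


# Quiet signal with a click at sample 10.
composite = [0.1] * 20
composite[10] = 1.0
corrected = apply_gaussian_gain(composite, centre=10)

print(f"click sample after correction: {corrected[10]:.3f}")
print(f"distant sample after correction: {corrected[0]:.3f}")
```

Because the gain changes gradually rather than switching off abruptly, the correction itself is less likely to introduce a new discontinuity.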
14. A method according to claim 1, wherein the method further comprises: using the corrected composite audio signal in a speaker enrolment process.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. The analysis monitors both the audio signal's energy level and the rate of change of a tracking envelope of the signal. Audio artifacts are identified based on a combination of these monitored energy levels and their rates of change. After identification, these artifacts are compensated for, resulting in a corrected composite audio signal. This cleaned signal is then sent to a voice biometrics module, and *is further used in a speaker enrolment process*.
15. A method according to claim 1, wherein the method further comprises: using the corrected composite audio signal in a speaker verification process.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. The analysis monitors both the audio signal's energy level and the rate of change of a tracking envelope of the signal. Audio artifacts are identified based on a combination of these monitored energy levels and their rates of change. After identification, these artifacts are compensated for, resulting in a corrected composite audio signal. This cleaned signal is then sent to a voice biometrics module, and *is further used in a speaker verification process*.
16. A system for audio processing, the system comprising: an input for receiving a plurality of audio samples; a processor, wherein the processor is configured for: concatenating the plurality of audio samples to form a composite audio signal; analysing the composite audio signal to identify audio artefacts associated with concatenation in the composite audio signal, wherein analysing comprises: monitoring an energy level of the composite audio signal; monitoring a rate of change of a tracking envelope of the composite audio signal; and identifying audio artefacts associated with concatenation based on both the monitored energy level of the composite audio signal and the monitored rate of change of the energy level of the composite audio signal; compensating for the identified audio artefacts to form a corrected composite audio signal; and an output for providing the corrected composite audio signal to a voice biometrics module.
A system for audio processing includes an input for receiving multiple audio samples. A processor within the system is configured to combine (concatenate) these samples into a composite audio signal. The processor then analyzes this composite signal to identify audio artifacts caused by concatenation. This analysis involves monitoring the composite audio signal's energy level and the rate of change of a tracking envelope of the signal. Based on both the monitored energy level and its rate of change, the processor identifies any audio artifacts. It then compensates for these identified artifacts to create a corrected composite audio signal. Finally, an output provides this corrected audio signal to a voice biometrics module.
17. A system according to claim 16, further comprising a voice biometrics module connected to said output.
A system for audio processing includes an input for receiving multiple audio samples. A processor within the system is configured to combine (concatenate) these samples into a composite audio signal. The processor then analyzes this composite signal to identify audio artifacts caused by concatenation. This analysis involves monitoring the composite audio signal's energy level and the rate of change of a tracking envelope of the signal. Based on both the monitored energy level and its rate of change, the processor identifies any audio artifacts. It then compensates for these identified artifacts to create a corrected composite audio signal. Finally, an output provides this corrected audio signal to a voice biometrics module, and the system *also includes the voice biometrics module itself, connected to that output*.
18. A computer program product, comprising a non-transitory computer-readable medium, containing instructions for causing a suitably programmed processor to perform a method comprising: receiving a plurality of audio samples; concatenating the plurality of audio samples to form a composite audio signal; analysing the composite audio signal to identify audio artefacts associated with concatenation in the composite audio signal, wherein analysing comprises: monitoring an energy level of the composite audio signal; monitoring a rate of change of a tracking envelope of the composite audio signal; and identifying audio artefacts associated with concatenation based on both the monitored energy level of the composite audio signal and the monitored rate of change of the energy level of the composite audio signal; compensating for the identified audio artefacts to form a corrected composite audio signal; and providing the corrected composite audio signal to a voice biometrics module.
A computer program product, stored on a non-transitory computer-readable medium, contains instructions that, when executed by a processor, cause the processor to perform an audio processing method. This method includes receiving multiple audio samples and combining them (concatenating) into a composite audio signal. The instructions then guide the processor to analyze this composite signal to identify audio artifacts caused by concatenation. This analysis involves monitoring the composite audio signal's energy level and the rate of change of a tracking envelope of the signal. Audio artifacts are identified based on both the monitored energy level and its rate of change. The processor then compensates for these identified artifacts to create a corrected composite audio signal, and finally provides this corrected signal to a voice biometrics module.
19. A method of audio processing, the method comprising: receiving a plurality of audio samples; concatenating the plurality of audio samples to form a composite audio signal; analysing the composite audio signal to identify audio artefacts associated with concatenation in the composite audio signal; compensating for the identified audio artefacts to form a corrected composite audio signal; and providing the corrected composite audio signal to a voice biometrics module; reversing the composite audio signal; and analysing the reversed composite audio signal to identify audio artefacts associated with concatenation in the composite audio signal.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. The identified artifacts are compensated for, resulting in a corrected composite audio signal, and this cleaned signal is sent to a voice biometrics module for use in tasks like speaker identification or verification. The method also *reverses the composite audio signal* and *analyzes the reversed signal to identify audio artifacts associated with concatenation*.
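The claim does not explain why the signal is reversed. One plausible reading, assumed here, is that a detector tuned to sudden rises will miss a segment that ends abruptly, whereas the same abrupt ending looks like a sudden rise when the signal is played backwards. The sketch below uses a simplified rising-edge detector (a hypothetical stand-in for the envelope analysis) and maps hits from the reversed signal back to original sample indices.

```python
def find_rising_edges(signal, coeff=0.9, threshold=0.5):
    """Flag samples where a fast envelope of |signal| jumps by more than threshold."""
    hits = []
    level = 0.0
    for index, sample in enumerate(signal):
        new_level = level + coeff * (abs(sample) - level)
        if new_level - level > threshold:
            hits.append(index)
        level = new_level
    return hits


# Silence, a loud block that starts and stops abruptly, then silence.
composite = [0.0] * 5 + [1.0] * 5 + [0.0] * 5

forward_hits = find_rising_edges(composite)
reversed_hits = find_rising_edges(composite[::-1])

# Map indices found in the reversed signal back to the original timeline.
last = len(composite) - 1
backward_hits = sorted(last - i for i in reversed_hits)

all_hits = sorted(set(forward_hits) | set(backward_hits))
print(forward_hits, backward_hits, all_hits)  # prints [5] [9] [5, 9]
```

The forward pass finds only the abrupt start at sample 5; the reversed pass adds the abrupt end at sample 9.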
20. A method according to claim 19, wherein the step of analysing the composite audio signal to identify audio artefacts associated with concatenation in the composite audio signal comprises: monitoring an energy level of the composite audio signal; monitoring a rate of change of a tracking envelope of the composite audio signal; and identifying audio artefacts associated with concatenation based on both the monitored energy level of the composite audio signal and the monitored rate of change of the energy level of the composite audio signal.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. This analysis involves *monitoring the audio signal's energy level, monitoring the rate of change of a tracking envelope of the composite audio signal, and identifying audio artifacts based on both the monitored energy level and the monitored rate of change of the energy level*. The identified artifacts are compensated for, resulting in a corrected composite audio signal that is sent to a voice biometrics module. The method also reverses the composite audio signal and analyzes the reversed signal to identify audio artifacts associated with concatenation.
21. A method according to claim 20, wherein the step of monitoring an energy level of the composite audio signal comprises: forming an energy tracking envelope of the composite audio signal.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. This analysis monitors the audio signal's energy level *by forming an energy tracking envelope of the composite audio signal*, and also monitors the rate of change of a tracking envelope of the composite audio signal. Audio artifacts are identified based on both the monitored energy level (from the energy tracking envelope) and the monitored rate of change of the energy level. The identified artifacts are compensated for, resulting in a corrected composite audio signal that is sent to a voice biometrics module. The method also reverses the composite audio signal and analyzes the reversed signal to identify audio artifacts associated with concatenation.
22. A method according to claim 20, wherein the step of monitoring a rate of change of the energy level of the composite audio signal comprises: forming a signal tracking envelope of the composite audio signal; and determining a rate of change of the signal tracking envelope of the composite audio signal.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. This analysis monitors the audio signal's energy level and also monitors the rate of change of the energy level *by forming a signal tracking envelope of the composite audio signal and determining a rate of change of the signal tracking envelope*. Audio artifacts are identified based on both the monitored energy level and the monitored rate of change of the energy level (derived from the signal tracking envelope's rate of change). The identified artifacts are compensated for, resulting in a corrected composite audio signal that is sent to a voice biometrics module. The method also reverses the composite audio signal and analyzes the reversed signal to identify audio artifacts associated with concatenation.
23. A method according to claim 22, wherein the signal tracking envelope has a faster attack time constant than the energy tracking envelope.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. This analysis monitors the audio signal's energy level by forming an *energy tracking envelope* and monitors the rate of change of the energy level by forming a *signal tracking envelope* and determining its rate of change. Crucially, this *signal tracking envelope* has a *faster attack time constant* than the *energy tracking envelope*. Audio artifacts are identified based on both envelopes. The identified artifacts are compensated for, resulting in a corrected composite audio signal that is sent to a voice biometrics module. The method also reverses the composite audio signal and analyzes the reversed signal to identify audio artifacts associated with concatenation.
24. A method according to claim 22, wherein the step of monitoring an energy level of the composite audio signal comprises forming an energy tracking envelope of the composite audio signal, and wherein the step of identifying audio artefacts associated with concatenation based on both the monitored energy level of the composite audio signal and monitored rate of change of the energy level of the composite audio signal comprises: determining whether a parameter of the energy tracking envelope exceeds a first threshold level; determining whether the rate of change of the signal tracking envelope exceeds a second threshold level; and responsive to the parameter of the energy tracking envelope not exceeding the first threshold level, and the rate of change of the signal tracking envelope exceeding the second threshold level, identifying an audio artefact.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. This analysis monitors the audio signal's energy level by forming an *energy tracking envelope* and monitors the rate of change of the energy level by forming a *signal tracking envelope* and determining its rate of change. Audio artifacts are specifically identified when two conditions are met: first, a parameter of the *energy tracking envelope does not exceed a first threshold level*; and second, the *rate of change of the signal tracking envelope exceeds a second threshold level*. If both are true, an artifact is identified. The identified artifacts are compensated for, producing a corrected signal that is sent to a voice biometrics module. The method also reverses the composite signal and analyzes the reversed signal to identify audio artifacts associated with concatenation.
25. A method according to claim 24, wherein the second threshold level is set based on a maximum expected slew rate of the composite audio signal.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. This analysis monitors the audio signal's energy level by forming an *energy tracking envelope* and monitors the rate of change of the energy level by forming a *signal tracking envelope* and determining its rate of change. Audio artifacts are specifically identified when two conditions are met: first, a parameter of the *energy tracking envelope does not exceed a first threshold level*; and second, the *rate of change of the signal tracking envelope exceeds a second threshold level*. This *second threshold level is specifically set based on the maximum expected slew rate of the composite audio signal*. If both are true, an artifact is identified. The identified artifacts are compensated for, producing a corrected signal that is sent to a voice biometrics module. The method also reverses the composite signal and analyzes the reversed signal to identify audio artifacts associated with concatenation.
26. A method according to claim 22, wherein the step of monitoring an energy level of the composite audio signal comprises forming an energy tracking envelope of the composite audio signal, and wherein the step of identifying audio artefacts associated with concatenation based on both the monitored energy level of the composite audio signal and monitored rate of change of the energy level of the composite audio signal comprises: determining whether the ratio of the rate of change of the signal tracking envelope and a parameter of the energy tracking envelope exceeds a third threshold level; and responsive to the ratio of the rate of change of the signal tracking envelope and the parameter of the energy tracking envelope exceeding the third threshold level, identifying an audio artefact.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. This analysis monitors the audio signal's energy level by forming an *energy tracking envelope* and monitors the rate of change of the energy level by forming a *signal tracking envelope* and determining its rate of change. Audio artifacts are specifically identified when a calculated *ratio* exceeds a certain threshold: this ratio is derived from the *rate of change of the signal tracking envelope divided by a parameter of the energy tracking envelope*. If this ratio *exceeds a third threshold level*, an audio artifact is identified. After this initial analysis, the method reverses the composite signal and analyzes the reversed signal for additional artifacts. All identified artifacts are compensated for, producing a corrected signal sent to a voice biometrics module.
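The ratio test of claim 26 can be sketched in a single pass over the samples. As before, the envelope definitions are assumptions (a peak follower for the signal tracking envelope, a leaky integrator for the energy tracking envelope), and the small `floor` added to the denominator is a hypothetical safeguard against division by zero that the claim does not mention.

```python
def detect_by_ratio(signal, ratio_threshold, alpha=0.05, decay=0.9, floor=1e-3):
    """Flag samples where (envelope rate of change) / (energy) exceeds a threshold."""
    energy = 0.0  # energy tracking envelope (assumed leaky integrator)
    level = 0.0   # signal tracking envelope (assumed peak follower)
    hits = []
    for n, x in enumerate(signal):
        new_level = max(abs(x), level * decay)
        rate = new_level - level
        # 'energy' still holds the value from before sample n was absorbed,
        # so a jump is judged against the loudness that preceded it.
        if n > 0 and rate / (energy + floor) > ratio_threshold:
            hits.append(n)
        level = new_level
        energy += alpha * (x * x - energy)
    return hits


click = [0.0] * 50 + [0.8] + [0.0] * 49
print(detect_by_ratio(click, ratio_threshold=100.0))  # [50]
```

Dividing by the energy means a jump of a given size counts for more in a quiet passage than in a loud one, which is one way of folding the two separate thresholds of claim 25 into a single comparison.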
27. A method according to claim 19 , wherein the plurality of audio samples represent speech.
A method for audio processing involves receiving multiple audio samples, which *represent speech*, and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. After this initial analysis, the method further processes the composite audio signal by reversing it and then analyzing the reversed composite audio signal again to identify additional audio artifacts associated with concatenation. Once identified, all audio artifacts are compensated for, resulting in a corrected composite audio signal. This cleaned signal is then sent to a voice biometrics module for use in tasks like speaker identification or verification.
28. A method according to claim 27 , wherein the plurality of samples representing speech are received from a speaker diarisation process.
A method for audio processing involves receiving multiple audio samples, which represent speech and are *specifically received from a speaker diarisation process*, and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. After this initial analysis, the method further processes the composite audio signal by reversing it and then analyzing the reversed composite audio signal again to identify additional audio artifacts associated with concatenation. Once identified, all audio artifacts are compensated for, resulting in a corrected composite audio signal. This cleaned signal is then sent to a voice biometrics module for use in tasks like speaker identification or verification.
29. A method according to claim 27 , wherein the plurality of samples representing speech comprise a plurality of utterances received from multiple different sessions where an individual has provided speech to the system.
A method for audio processing involves receiving multiple audio samples, which represent speech and *comprise a plurality of utterances received from multiple different sessions where an individual has provided speech to the system*. These samples are then combined (concatenated) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. After this initial analysis, the method further processes the composite audio signal by reversing it and then analyzing the reversed composite audio signal again to identify additional audio artifacts associated with concatenation. Once identified, all audio artifacts are compensated for, resulting in a corrected composite audio signal. This cleaned signal is then sent to a voice biometrics module.
30. A method according to claim 19 , wherein the step of compensating for the identified audio artefacts to form a corrected composite audio signal comprises: applying a time-variable gain to the composite audio signal.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. After this initial analysis, the method further processes the composite audio signal by reversing it and then analyzing the reversed composite audio signal again to identify additional audio artifacts associated with concatenation. Once identified, all audio artifacts are compensated for *by applying a time-variable gain to the composite audio signal*, resulting in a corrected composite audio signal. This cleaned signal is then sent to a voice biometrics module for use in tasks like speaker identification or verification.
31. A method according to claim 30 , wherein the time-variable gain comprises a Gaussian profile.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. After this initial analysis, the method further processes the composite audio signal by reversing it and then analyzing the reversed composite audio signal again to identify additional audio artifacts associated with concatenation. Once identified, all audio artifacts are compensated for *by applying a time-variable gain to the composite audio signal, wherein this time-variable gain comprises a Gaussian profile*, resulting in a corrected composite audio signal. This cleaned signal is then sent to a voice biometrics module for use in tasks like speaker identification or verification.
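As an illustration of a time-variable gain with a Gaussian profile, the sketch below attenuates the signal around a detected artifact and leaves the rest essentially untouched. The claims state only that the gain comprises a Gaussian profile; the inverted-Gaussian shape, the function names, and the `sigma` and `depth` parameters are assumptions made for this example.

```python
import math


def gaussian_gain(length, center, sigma, depth=1.0):
    """Gain curve that dips to (1 - depth) at `center` and returns to 1 away from it."""
    return [
        1.0 - depth * math.exp(-0.5 * ((n - center) / sigma) ** 2)
        for n in range(length)
    ]


def suppress_artifact(signal, center, sigma=4.0, depth=1.0):
    """Multiply the signal by the gain curve, sample by sample."""
    gain = gaussian_gain(len(signal), center, sigma, depth)
    return [g * x for g, x in zip(gain, signal)]


# Attenuate a click detected at sample 50 of a 101-sample signal.
corrected = suppress_artifact([1.0] * 101, center=50)
print(corrected[50])  # 0.0
```

Because the gain changes smoothly, the correction fades in and out instead of introducing a new discontinuity of its own. A wider `sigma` removes more of the surrounding audio, and a `depth` below 1.0 attenuates the artifact without silencing it entirely.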
32. A method according to claim 19 , wherein the method further comprises: using the corrected composite audio signal in a speaker enrolment process.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. After this initial analysis, the method further processes the composite audio signal by reversing it and then analyzing the reversed composite audio signal again to identify additional audio artifacts associated with concatenation. Once identified, all audio artifacts are compensated for, resulting in a corrected composite audio signal. This cleaned signal is then sent to a voice biometrics module, and *is further used in a speaker enrolment process*.
33. A method according to claim 19 , wherein the method further comprises: using the corrected composite audio signal in a speaker verification process.
A method for audio processing involves receiving multiple audio samples and combining them (concatenating) into a single composite audio signal. This composite signal is then analyzed to detect specific audio artifacts caused by the concatenation. After this initial analysis, the method further processes the composite audio signal by reversing it and then analyzing the reversed composite audio signal again to identify additional audio artifacts associated with concatenation. Once identified, all audio artifacts are compensated for, resulting in a corrected composite audio signal. This cleaned signal is then sent to a voice biometrics module, and *is further used in a speaker verification process*.
34. A system for audio processing, the system comprising: an input for receiving a plurality of audio samples; a processor, wherein the processor is configured for: concatenating the plurality of audio samples to form a composite audio signal; analysing the composite audio signal to identify audio artefacts associated with concatenation in the composite audio signal; and compensating for the identified audio artefacts to form a corrected composite audio signal; reversing the composite audio signal; and analysing the reversed composite audio signal to identify audio artefacts associated with concatenation in the composite audio signal; and an output for providing the corrected composite audio signal to a voice biometrics module.
A system for audio processing includes an input for receiving multiple audio samples. A processor within the system is configured to combine (concatenate) these samples into a composite audio signal. The processor then analyzes this composite signal to identify audio artifacts caused by concatenation. It then compensates for these identified artifacts to create a corrected composite audio signal. Additionally, the processor is configured to *reverse the composite audio signal* and *analyze the reversed composite audio signal to identify further audio artifacts associated with concatenation*. Finally, an output provides this corrected audio signal to a voice biometrics module.
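The claims do not explain why the signal is reversed. One plausible reading is that a detector tuned to sudden onsets will miss sudden cut-offs, and that playing the signal backwards turns each cut-off into an onset. The sketch below works under that assumption, using a deliberately simplified onset detector (a peak follower with a jump threshold) in place of the full analysis. All names and parameter values are hypothetical.

```python
def detect_onsets(signal, jump_threshold, decay=0.9):
    """Flag samples where a peak-follower envelope jumps upward sharply."""
    level, hits = 0.0, []
    for n, x in enumerate(signal):
        new_level = max(abs(x), level * decay)
        if new_level - level > jump_threshold:
            hits.append(n)
        level = new_level
    return hits


def detect_both_directions(signal, jump_threshold):
    """Run the detector forward and on the reversed signal, then merge the results."""
    last = len(signal) - 1
    forward = detect_onsets(signal, jump_threshold)
    backward = detect_onsets(signal[::-1], jump_threshold)
    # Index i in the reversed signal corresponds to index (last - i) in the original.
    mapped = [last - i for i in backward]
    return sorted(set(forward) | set(mapped))


# A segment that starts abruptly at sample 10 and is cut off after sample 19.
burst = [0.0] * 10 + [0.5] * 10 + [0.0] * 10
print(detect_onsets(burst, 0.3))           # [10]
print(detect_both_directions(burst, 0.3))  # [10, 19]
```

The forward pass alone finds only the abrupt start. The reverse pass adds the last sample before the abrupt cut-off, which the slowly decaying envelope hides in the forward direction.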
35. A system according to claim 34 , further comprising a voice biometrics module connected to said output.
A system for audio processing includes an input for receiving multiple audio samples. A processor within the system is configured to combine (concatenate) these samples into a composite audio signal. The processor then analyzes this composite signal to identify audio artifacts caused by concatenation. It then compensates for these identified artifacts to create a corrected composite audio signal. Additionally, the processor is configured to reverse the composite audio signal and analyze the reversed composite audio signal to identify further audio artifacts associated with concatenation. Finally, an output provides this corrected audio signal to a voice biometrics module, and the system *further includes this voice biometrics module connected to said output*.
36. A computer program product, comprising a non-transitory tangible computer-readable medium, containing instructions for causing a suitably programmed processor to perform a method comprising: receiving a plurality of audio samples; concatenating the plurality of audio samples to form a composite audio signal; analysing the composite audio signal to identify audio artefacts associated with concatenation in the composite audio signal; compensating for the identified audio artefacts to form a corrected composite audio signal; and providing the corrected composite audio signal to a voice biometrics module; reversing the composite audio signal; and analysing the reversed composite audio signal to identify audio artefacts associated with concatenation in the composite audio signal.
A computer program product, stored on a non-transitory tangible computer-readable medium, contains instructions that, when executed by a processor, cause the processor to perform an audio processing method. This method includes receiving multiple audio samples and combining them (concatenating) into a composite audio signal. The instructions then guide the processor to analyze this composite signal to identify audio artifacts caused by concatenation and compensate for them to form a corrected composite audio signal. Furthermore, the instructions cause the processor to *reverse the composite audio signal* and *analyze the reversed composite audio signal to identify additional audio artifacts associated with concatenation*. Finally, the corrected signal is provided to a voice biometrics module.
July 21, 2020