
CPC Class G10L (Speech Analysis or Synthesis; Speech Recognition; Speech or Voice Processing; Speech or Audio Coding or Decoding)

9 patents in CPC class G10L

33 Patents
Updated 3/29/2026

Top Patents

Disclosed are a method, device, and storage medium for generating a dynamic image based on audio, relating to the field of natural human-computer interaction. The method includes: obtaining a reference image and reference audio input by a user; determining a target head-pose feature and a target expression-coefficient feature based on the reference image and a trained generation network model; adjusting the trained generation network model based on the target head-pose feature and the target expression-coefficient feature to obtain a target generation network model; and processing a to-be-processed image based on the reference audio, the reference image, and the target generation network model to obtain a target dynamic image. The image object in the to-be-processed image is the same as that in the reference image. In this way, a corresponding digital person can be obtained from a single picture of the target person.
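The pipeline described in this abstract can be sketched schematically as follows. This is a toy stand-in, not the patent's actual implementation: the class and function names are hypothetical, feature estimation and rendering are reduced to simple arithmetic, and the real system would use trained neural networks.

```python
class GenerationModel:
    """Toy stand-in for the trained generation network model: it holds a
    head-pose feature and an expression-coefficient feature and 'renders'
    frames by modulating image pixels with per-frame audio energy."""

    def __init__(self, head_pose=0.0, expression=0.0):
        self.head_pose = head_pose
        self.expression = expression

    def adapt(self, target_pose, target_expression):
        """Adjust the pretrained model toward features estimated from the
        reference image, yielding the target generation network model."""
        return GenerationModel(target_pose, target_expression)

    def render(self, image_pixels, audio_energies):
        """Produce one toy 'frame' per audio sample from the
        to-be-processed image, driven by the reference audio."""
        return [[p + self.expression * e + self.head_pose for p in image_pixels]
                for e in audio_energies]


def estimate_features(reference_image):
    """Stand-in for estimating the target head-pose feature and target
    expression-coefficient feature from the single reference image."""
    avg = sum(reference_image) / len(reference_image)
    return avg * 0.25, avg * 0.5  # (head pose, expression), toy values


# Single reference picture of the target person plus reference audio:
pose, expr = estimate_features([1.0, 3.0])
target_model = GenerationModel().adapt(pose, expr)
frames = target_model.render([0.0, 1.0], audio_energies=[0.5, 1.0])
print(frames)  # [[1.0, 2.0], [1.5, 2.5]]
```

The key structural point the sketch preserves is that the pretrained model is first adapted to per-subject features from one image, and only the adapted model is then driven by the audio.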

Implementations of the present disclosure relate to methods, devices, and computer program products for extracting a feature from multimedia data that comprises a plurality of medium types. In a method, a first feature is determined for a first medium type in the plurality of medium types by masking a portion of a first medium object of the first medium type. A second feature is determined for a second medium type other than the first medium type. A feature for the multimedia data is then generated based on the first and second features. With these implementations, multiple medium types are considered during feature extraction, so the resulting feature can accurately reflect various aspects of the multimedia data.
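The masked, multi-modality feature extraction described above can be illustrated with a minimal sketch. All names are hypothetical, and the learned encoders are replaced by a simple mean over samples; the real implementation would use trained models per medium type.

```python
from statistics import mean

def masked_feature(values, mask_start, mask_end):
    """Feature for the first medium type: mask out a contiguous portion
    of the medium object, then summarize the remaining samples
    (a toy stand-in for a learned masked encoder)."""
    kept = [v for i, v in enumerate(values) if not (mask_start <= i < mask_end)]
    return mean(kept)

def plain_feature(values):
    """Feature for a second medium type, extracted without masking."""
    return mean(values)

def multimedia_feature(audio, text_embedding, mask_start, mask_end):
    """Combine the per-modality features into one multimedia feature."""
    first = masked_feature(audio, mask_start, mask_end)
    second = plain_feature(text_embedding)
    return (first, second)

# Mask samples 1..2 of the 'audio' object, extract both features, fuse:
feat = multimedia_feature([1.0, 2.0, 3.0, 4.0], [0.5, 1.5], 1, 3)
print(feat)  # (2.5, 1.0)
```

Masking one modality while leaving the other intact is what lets the fused feature capture both the masked-prediction signal and the complementary modality.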

A method for automatically annotating an intended video with at least one personalized recap video based on previously viewed videos is provided. The method may include automatically tracking user viewership of the previously viewed videos and, in response to detecting an intention to view the intended video: automatically identifying and extracting a subset of video footage from one or more of the previously viewed videos based on the tracked user viewership and on a determined relevancy of that footage to content in the intended video; generating the at least one personalized recap video by compiling and sorting the extracted footage into a compilation video; and annotating the intended video by presenting the at least one personalized recap video on it.
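The selection-and-compilation step might look like the following sketch. The function name, data shapes, and the idea that relevancy scores are precomputed are all assumptions for illustration, not the patent's method.

```python
def build_recap(viewed_segments, relevancy, top_k=2):
    """Select previously viewed segments most relevant to the intended
    video, then sort them chronologically into a recap compilation.

    viewed_segments: list of (video_id, start_s, end_s) the user watched
                     (from automatic viewership tracking).
    relevancy: dict mapping a segment to a score in [0, 1] against the
               intended video's content (assumed precomputed upstream).
    """
    scored = [(relevancy.get(seg, 0.0), seg) for seg in viewed_segments]
    # Keep only the most relevant footage ...
    chosen = sorted(scored, key=lambda x: x[0], reverse=True)[:top_k]
    # ... and compile it in source order for a coherent recap.
    return sorted((seg for _, seg in chosen), key=lambda s: (s[0], s[1]))

watched = [("ep1", 0, 30), ("ep1", 60, 90), ("ep2", 10, 40)]
scores = {("ep1", 60, 90): 0.9, ("ep2", 10, 40): 0.8, ("ep1", 0, 30): 0.1}
recap = build_recap(watched, scores)
print(recap)  # [('ep1', 60, 90), ('ep2', 10, 40)]
```

Selecting by relevancy but ordering by source position mirrors the abstract's "compiling and sorting" of the extracted footage.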

A feature extraction system, method, and apparatus based on neural network optimization by gradient filtering are provided. The feature extraction method includes: acquiring, by an information acquisition device, input information; constructing, by a feature extraction device, different feature extraction networks, performing iterative training on the networks in combination with corresponding training task queues to obtain optimized feature extraction networks for different input information, and calling the corresponding optimized feature extraction network to perform feature extraction according to the class of the input information; performing, by an online updating device, online updating of the networks; and outputting, by a feature output device, a feature of the input information. The system, method, and apparatus avoid catastrophic forgetting in artificial neural networks across continuous tasks and achieve high accuracy and precision in continuous feature extraction.
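A toy sketch of the two ideas in this abstract, per-class extractor selection and gradient filtering, follows. The thresholding rule, class names, and the linear "network" are illustrative assumptions; the patent's actual filtering criterion and architectures are not specified here.

```python
def filtered_update(weights, gradients, lr=0.1, threshold=0.05):
    """One optimization step with gradient filtering: updates whose
    gradient magnitude falls below the threshold are suppressed, so
    weights serving earlier tasks drift less (a toy mitigation of
    catastrophic forgetting; the threshold rule is an assumption)."""
    return [w - lr * g if abs(g) >= threshold else w
            for w, g in zip(weights, gradients)]

class FeatureExtractorBank:
    """One extractor per input class, each trained against its own
    task queue and selected at inference time by the input's class."""

    def __init__(self, classes, dim):
        self.nets = {c: [0.0] * dim for c in classes}

    def train_step(self, cls, gradients):
        """Online update of the network for one input class."""
        self.nets[cls] = filtered_update(self.nets[cls], gradients)

    def extract(self, cls, x):
        """Dot product as a stand-in for a forward pass."""
        return sum(w * v for w, v in zip(self.nets[cls], x))

bank = FeatureExtractorBank(["audio"], dim=2)
bank.train_step("audio", [1.0, 0.01])   # second gradient is filtered out
print(bank.nets["audio"])               # [-0.1, 0.0]
```

Routing each input class to its own network is the other half of the forgetting mitigation: training on one class's task queue never touches another class's weights.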

Methods and systems for using generative content to improve the ability of an individual to communicate using electronic-assisted communication.
