Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. A processing device, comprising: a first counter to increment with each cycle of the processing device in which at least one thread of threads of the processing device is active; a second counter to increment with each cycle of the processing device in which both of: execution units of the processing device are stalled for one of the threads; and an access request, from the one of the threads for which the executions units are stalled, to memory external to the processing device is pending; and a power controller unit (PCU) communicably coupled to the first counter and the second counter, the PCU to: calculate a scalability factor from values of the first counter and the second counter, the scalability factor to indicate a first portion of a current workload of the processing device that is to be scalable to a frequency change at the processing device, and the scalability factor to determine a second portion of the current workload that is not to be scalable to the frequency change; determine expected performance scores of the processing device at different frequencies based on the scalability factor; and adjust a frequency of the processing device based on the expected performance scores of the processing device at the different frequencies.
A processor predicts how well its workload scales with frequency by measuring memory stalls. A first counter increments on every cycle in which at least one thread is active. A second counter increments on every cycle in which the execution units are stalled for a thread and that thread has a pending request to memory outside the processor. A power controller unit (PCU) calculates a "scalability factor" from the two counters; the factor indicates which portion of the current workload benefits from a frequency change and which portion does not. The PCU then uses the factor to estimate performance scores at different frequencies and adjusts the processor frequency based on those scores.
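The PCU's decision step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the claim does not say how the expected performance scores are computed or how a frequency is picked from them, so the Amdahl-style model in `expected_score`, the tolerance-based policy in `pick_frequency`, and all names and frequency values are hypothetical.

```python
from typing import Dict, List, Tuple


def expected_score(s: float, current_mhz: float, candidate_mhz: float) -> float:
    """Expected performance relative to the current frequency (1.0 = unchanged).

    Assumed model: the scalable share s of run time shrinks in proportion to
    the frequency ratio, while the remaining share (1 - s) stays constant.
    """
    return 1.0 / (s * current_mhz / candidate_mhz + (1.0 - s))


def pick_frequency(
    s: float, current_mhz: float, candidates_mhz: List[float], tolerance: float = 0.10
) -> Tuple[float, Dict[float, float]]:
    """Return the lowest candidate whose score is within `tolerance` of the best.

    The tolerance policy is hypothetical; the claim only requires that the
    adjustment be based on the expected scores.
    """
    scores = {f: expected_score(s, current_mhz, f) for f in candidates_mhz}
    best = max(scores.values())
    good_enough = [f for f, score in scores.items() if score >= best * (1.0 - tolerance)]
    return min(good_enough), scores


candidates = [1200.0, 2000.0, 2800.0]
memory_bound_choice, _ = pick_frequency(0.1, 2000.0, candidates)
compute_bound_choice, _ = pick_frequency(0.9, 2000.0, candidates)
print(memory_bound_choice, compute_bound_choice)
```

With these illustrative numbers, the mostly memory-bound workload (factor 0.1) is assigned the lowest frequency because raising the clock buys little, while the mostly scalable workload (factor 0.9) is assigned the highest.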
2. The processing device of claim 1, wherein the scalability factor is calculated by subtracting the quotient of the value of the first counter and the value of the second counter from one.
The processing device calculates the scalability factor by subtracting from one the quotient of the two counter values, where the first counter counts cycles in which any thread is active and the second counts cycles in which execution units are stalled on a pending external memory access. Read literally, the claim gives `scalability_factor = 1 - (first_counter_value / second_counter_value)`. Since memory-stall cycles presumably cannot outnumber active cycles, the result is a fraction between 0 and 1 only if the quotient is taken as stalled cycles over active cycles, that is `1 - (second_counter_value / first_counter_value)`. Under that reading the factor is the normalized share of active cycles not stalled on external memory: a value near 1 marks a workload that should speed up with frequency, and a value near 0 marks a memory-bound one. The factor is then used to predict the impact of frequency changes.
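As a minimal sketch of this calculation, the Python below computes the factor from two counter readings. The function and argument names are illustrative and do not come from the patent. It assumes the quotient is stalled cycles over active cycles, which keeps the result between 0 and 1; the claim wording lists the counters in the other order.

```python
def scalability_factor(active_cycles: int, memory_stall_cycles: int) -> float:
    """Share of active cycles not stalled on external memory (assumed reading)."""
    if active_cycles <= 0:
        # No active cycles were observed, so there is nothing to scale.
        return 0.0
    stalled_fraction = memory_stall_cycles / active_cycles
    # Clamp to [0, 1] in case the two counters were sampled at slightly different times.
    return min(1.0, max(0.0, 1.0 - stalled_fraction))


print(scalability_factor(1_000_000, 250_000))  # 0.75
```

The guard for a zero active-cycle count and the clamping are defensive choices for the sketch, not requirements stated in the claim.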
3. The processing device of claim 1, wherein the current workload comprises instructions being executed by the processing device.
The current workload, whose scalability the factor describes, consists of the instructions currently being executed by the processor. The scalability factor, calculated from counters tracking active cycles and memory-stall cycles, helps predict how frequency adjustments will affect the performance of those instructions.
4. The processing device of claim 1, wherein the scalability factor is used to predict a performance change due to the frequency change at the processing device.
The processing device uses the scalability factor, calculated from counters tracking active threads and memory stall events, to predict the performance change that will result from adjusting the processor's frequency. This prediction informs decisions made by the power controller to dynamically tune the processor's operating frequency for optimal performance and power efficiency.
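One way to make this prediction concrete is an Amdahl-style model. The model is an assumption for illustration and is not stated in the claims: the scalable share $S$ of execution time shrinks in proportion to the frequency ratio, while the remainder stays fixed. The symbols $S$, $f_0$, and $f_1$ (scalability factor, current frequency, candidate frequency) are likewise illustrative.

```latex
\[
\frac{\mathrm{perf}(f_1)}{\mathrm{perf}(f_0)}
  = \frac{1}{S \cdot \dfrac{f_0}{f_1} + (1 - S)}
\]
% Worked example: S = 0.75 and a 20% frequency increase (f_1 = 1.2 f_0)
\[
\frac{1}{0.75 / 1.2 + 0.25} = \frac{1}{0.875} \approx 1.14
\]
```

Under this assumed model, a 20% frequency increase would yield roughly a 14% performance gain for that workload, whereas a workload with $S$ near 0 would gain almost nothing.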
5. The processing device of claim 1, wherein the first counter and the second counter are part of a performance monitoring unit (PMU) of the processing device.
The first counter (which increments when any thread is active) and the second counter (which increments when execution units are stalled due to a thread waiting for external memory access) are integrated into a Performance Monitoring Unit (PMU) within the processor. The PMU provides hardware-level performance data used to calculate the scalability factor for performance prediction.
6. The processing device of claim 1, wherein the PCU utilizes the scalability factor as one of multiple inputs to performance optimizations performed by the PCU to adjust the frequency of the processing device.
The power controller unit (PCU) uses the scalability factor, calculated from counters tracking active threads and memory stall events, as one of several inputs when performing performance optimizations. Alongside the scalability factor, the PCU considers other metrics and algorithms to determine the optimal processor frequency, balancing performance and power consumption.
7. The processing device of claim 1, wherein the processing device comprises multiple cores, each core comprising an instance of the first counter and the second counter.
The processing device contains multiple cores, and each core independently contains an instance of the first counter (which increments when any thread is active) and the second counter (which increments when execution units are stalled due to a thread waiting for external memory access). This allows for fine-grained, per-core performance scalability prediction and frequency adjustment based on the workload characteristics of each core.
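The per-core arrangement might look like the sketch below, where each core carries its own pair of counters and yields its own factor. The `CoreCounters` class, the core IDs, and the counter values are hypothetical, and the factor is computed under the assumed reading of one minus stalled cycles over active cycles.

```python
from dataclasses import dataclass


@dataclass
class CoreCounters:
    """Hypothetical snapshot of one core's two counters over a sampling window."""
    active_cycles: int
    memory_stall_cycles: int

    def scalability_factor(self) -> float:
        # Assumed reading: share of active cycles not stalled on external memory.
        if self.active_cycles == 0:
            return 0.0
        return 1.0 - self.memory_stall_cycles / self.active_cycles


cores = {
    0: CoreCounters(active_cycles=800_000, memory_stall_cycles=80_000),
    1: CoreCounters(active_cycles=800_000, memory_stall_cycles=600_000),
}
factors = {core_id: c.scalability_factor() for core_id, c in cores.items()}
for core_id, factor in factors.items():
    print(f"core {core_id}: scalability factor {factor:.2f}")
```

In this made-up snapshot, core 0 is mostly compute-bound and core 1 is mostly memory-bound, so a per-core policy could treat them differently.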
8. A method, comprising: incrementing, by a processing device, a first counter with each cycle of the processing device in which at least one thread of threads of the processing device is active; and incrementing, by the processing device, a second counter with each cycle of the processing device in which both of: execution units of the processing device are stalled for one of the threads; and an access request, from the one of the threads for which the executions units are stalled, to memory external to the processing device is pending; calculating a scalability factor from values of the first counter and the second counter, the scalability factor to indicate a first portion of a current workload of the processing device that is scalable to a frequency change at the processing device; determining, based on the scalability factor, a second portion of the current workload that is not scalable to the frequency change; determining expected performance scores of the processing device at different frequencies based on the scalability factor; and adjusting a frequency of the processing device based on the expected performance scores of the processing device at the different frequencies.
A method for predicting performance scalability involves incrementing a first counter each cycle when at least one thread is active. A second counter is incremented each cycle when execution units are stalled due to a thread waiting for external memory. A scalability factor is calculated from these counter values, indicating the portion of the workload scalable to frequency changes. Based on this factor, the portion not scalable to frequency changes is determined. Expected performance scores at different frequencies are determined based on the scalability factor, and the processor's frequency is adjusted accordingly.
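The two incrementing steps can be illustrated with a small software simulation over a per-cycle trace. Real hardware would derive these signals internally; the `Cycle` record, its field names, and the trace are hypothetical and serve only to show which cycles each counter counts.

```python
from typing import Iterable, NamedTuple, Tuple


class Cycle(NamedTuple):
    """Hypothetical per-cycle state of the processing device."""
    thread_active: bool    # at least one thread is active
    exec_stalled: bool     # execution units are stalled for a thread
    ext_mem_pending: bool  # that thread has an external-memory request pending


def count_cycles(trace: Iterable[Cycle]) -> Tuple[int, int]:
    first_counter = 0   # cycles with at least one active thread
    second_counter = 0  # cycles stalled on a pending external-memory request
    for cycle in trace:
        if cycle.thread_active:
            first_counter += 1
        # Both conditions must hold in the same cycle.
        if cycle.exec_stalled and cycle.ext_mem_pending:
            second_counter += 1
    return first_counter, second_counter


trace = [
    Cycle(True, False, False),   # executing
    Cycle(True, True, True),     # stalled on external memory
    Cycle(True, True, False),    # stalled, but not on external memory
    Cycle(False, False, False),  # idle
]
print(count_cycles(trace))  # (3, 1)
```

Note that the third cycle counts as active but not as a memory stall, because the claim requires both the stall and the pending external request in the same cycle.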
9. The method of claim 8, wherein the scalability factor is calculated by subtracting the quotient of the value of the first counter and the value of the second counter from one.
In the method, the scalability factor is calculated by subtracting from one the quotient of the two counter values (the first counter counts active cycles; the second counts cycles stalled on a pending external memory access). Read literally, the claim gives `scalability_factor = 1 - (first_counter_value / second_counter_value)`; the result stays between 0 and 1 only if the quotient is read as stalled cycles over active cycles, since stalled cycles presumably cannot outnumber active ones. Read that way, the factor is the normalized share of active cycles that is not memory-bound, which is then used to predict the impact of frequency changes.
10. The method of claim 8, wherein the scalability factor is used to predict a performance change due to adjustment of the frequency at the processing device.
The method uses the scalability factor, calculated from counters tracking active threads and memory stall events, to predict the performance change due to adjustment of the processor frequency. This predicted performance change helps determine the optimal frequency setting.
11. The method of claim 10, wherein the first counter and the second counter are part of a performance monitoring unit (PMU) of the processing device, and wherein the PCU utilizes the scalability factor as one of multiple inputs to performance optimizations performed by the PCU to adjust a frequency of the processing device.
In the performance scalability prediction method, the first counter (which increments when any thread is active) and the second counter (which increments when execution units are stalled due to a thread waiting for external memory access) are part of a Performance Monitoring Unit (PMU). The power controller unit (PCU) utilizes the scalability factor as one of multiple inputs to performance optimizations performed by the PCU to adjust the processor frequency.
12. The method of claim 8, wherein the processing device comprises multiple cores, each core comprising an instance of the first counter and the second counter.
The method for predicting performance scalability is applied to a processor containing multiple cores, where each core comprises an instance of the first counter (which increments when any thread is active) and the second counter (which increments when execution units are stalled due to a thread waiting for external memory access). This facilitates per-core frequency scaling based on individual workload characteristics.
13. The method of claim 8, wherein the current workload comprises instructions being executed by the processing device.
In the performance scalability prediction method, the current workload comprises the instructions being executed by the processing device. The scalability factor, calculated from counters tracking active threads and memory stall events, helps predict how frequency adjustments will impact the performance of these currently executing instructions.
14. A system comprising: a memory; a processing device communicably coupled to the memory, the processing device comprising a plurality of cores, each of the plurality of cores comprising: a first counter to increment with each cycle of the processing device in which at least one thread of threads of the processing device is active; a second counter to increment with each cycle of the processing device in which both of: execution units of the processing device are stalled for one of the threads; and an access request, from the one of the threads for which the executions units are stalled, to memory external to the processing device is pending; and a power controller unit (PCU) communicably coupled to the first counter and the second counter, the PCU to: calculate a scalability factor from values of the first counter and the second counter, the scalability factor to indicate a first portion of a current workload of the processing device that is to be scalable to a frequency change at the processing device, and the scalability factor to determine a second portion of the current workload that is not to be scalable to the frequency change; determine expected performance scores of the processing device at different frequencies based on the scalability factor; and adjust a frequency of the processing device based on the expected performance scores of the processing device at the different frequencies.
A system predicts performance scalability by monitoring thread activity and memory stalls. It includes a memory unit and a processor with multiple cores. Each core has: a first counter to track active thread cycles, a second counter to track cycles stalled on external memory access, and a power controller unit (PCU) to calculate a scalability factor. The PCU uses this factor to determine how much of the workload benefits from frequency scaling. The PCU then predicts performance at different frequencies and adjusts the processor frequency accordingly.
15. The system of claim 14, wherein the scalability factor is used to predict a performance change due to the frequency change at the processing device.
In the system that predicts performance scalability, the scalability factor, calculated from counters tracking active threads and memory stall events, is used to predict a performance change due to the frequency change at the processing device. This prediction helps the power controller optimize the processor's frequency.
16. The system of claim 14, wherein the first counter and the second counter are part of a performance monitoring unit (PMU) of the processing device, and wherein the PCU utilizes the scalability factor as one of multiple inputs to performance optimizations performed by the PCU to adjust the frequency of the processing device.
The system predicting performance scalability incorporates a Performance Monitoring Unit (PMU) containing the first counter (which increments when any thread is active) and the second counter (which increments when execution units are stalled due to a thread waiting for external memory access). The power controller unit (PCU) uses the scalability factor as one of multiple inputs when optimizing performance and adjusting the processor's frequency.
17. The system of claim 14, wherein the scalability factor is calculated by subtracting the quotient of the value of the first counter and the value of the second counter from one.
In the system, the scalability factor is calculated by subtracting from one the quotient of the two counter values (the first counter counts active cycles; the second counts cycles stalled on a pending external memory access). Read literally, the claim gives `scalability_factor = 1 - (first_counter_value / second_counter_value)`; the result is a fraction between 0 and 1 only if the quotient is read as stalled cycles over active cycles. Read that way, a low factor indicates a memory-bound workload and a high factor indicates one that scales with frequency.
18. The system of claim 14, wherein the processing device comprises multiple cores, each core comprising an instance of the first counter and the second counter.
The system for predicting performance scalability includes a processing device with multiple cores. Each core has its own instance of the first counter (which increments when any thread is active) and the second counter (which increments when execution units are stalled due to a thread waiting for external memory access). This enables independent monitoring and scaling of each core based on its workload.
November 28, 2017