US-11294903

Partition-wise processing distribution

Published: April 5, 2022
Assignee: not available in the USPTO data we have
Inventors: not available in the USPTO data we have
Technical Abstract

A system includes determination, for a first partitioned physical query operator in a query operator tree, of a partition-wise placement cost based on a cost of each table partition associated with the first partitioned physical query operator and a partition-wise placement cost of any child physical query operator of the first partitioned physical query operator, determination of a placement cost for the first partitioned physical query operator for each of a plurality of operator execution locations based on the determined partition-wise placement cost, determination, for a logical query operator associated with the first partitioned physical query operator, of a merged placement cost for each of the plurality of operator execution locations, and determination of an execution location for the first partitioned physical query operator based on the determined partition-wise placement cost.

Patent Claims
18 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 1

Original Legal Text

1. A system comprising: a memory storing processor-executable program code; and a processor to execute the processor-executable program code in order to cause the system to: determine, starting at a lowest level of a query operator tree representation of a query of data stored in one or more table partitions within a plurality of operator execution locations for a first partitioned physical query operator in the query operator tree, a partition-wise placement cost based on a cost attribute of each of the one or more table partitions associated with the first partitioned physical query operator and a partition-wise placement cost of any child physical query operator of the first partitioned physical query operator, the first partitioned physical query operator operating on the one or more table partitions within the plurality of operator execution locations and the partition-wise placement cost for each of the one or more table partitions including a cost attribute for each of the plurality of operator execution locations and a transfer cost for the first partitioned physical query operator; determine a placement cost for the first partitioned physical query operator for each of the plurality of operator execution locations based on a consolidation of the determined partition-wise placement costs for the first partitioned physical query operator; determine, for a logical query operator associated with the first partitioned physical query operator, a merged placement cost for each of the plurality of operator execution locations, the merged placement cost including an indication of the physical query operator having a lowest cost value with and without the transfer cost of the partition-wise placement cost for the first logical query operator; and determine an execution location for the first partitioned physical query operator based on the determined partition-wise placement cost for the first partitioned physical query operator by traversing the query operator tree starting from a highest level down to the lowest level of the query operator tree.

Plain English Translation

The system optimizes query execution in distributed database environments by choosing where each partitioned physical query operator runs. The problem addressed is the high computational and transfer cost of executing queries on partitioned data spread across multiple execution locations. The system analyzes a query operator tree representing a query on partitioned table data. Starting at the lowest level of the tree, it calculates a partition-wise placement cost for each partitioned operator from the cost attributes of that operator's table partitions and the partition-wise placement costs of its child operators. Each per-partition cost covers every candidate execution location and includes a transfer cost. These costs are consolidated into a single placement cost for the operator at each execution location. For the logical query operator associated with the physical operator, the system computes a merged placement cost per location that records which physical operator has the lowest cost, both with and without the transfer cost. The system then traverses the tree from the highest level down to the lowest to fix the execution location of each operator. The aim is to improve query performance by avoiding unnecessary data transfers and making better use of resources across the execution locations.
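The first two steps of the claim (per-partition costs, then consolidation per location) can be sketched roughly as follows. This is only an illustration under stated assumptions: the location names, the row-count cost model, the per-row transfer charge, and the use of a sum as the "consolidation" are all hypothetical, since the claim does not fix any of them.

```python
from typing import Dict, List, Tuple

# Hypothetical execution locations; the claim only requires "a plurality".
LOCATIONS = ["local", "remote"]


def partition_wise_cost(home: str, rows: int,
                        transfer_per_row: float = 2.0) -> Dict[str, float]:
    """Cost of processing one table partition at each candidate location.

    Assumed model: processing cost equals the row count, and a transfer
    cost is charged whenever the partition is processed away from the
    location that stores it.
    """
    costs: Dict[str, float] = {}
    for loc in LOCATIONS:
        processing = float(rows)
        transfer = 0.0 if loc == home else rows * transfer_per_row
        costs[loc] = processing + transfer
    return costs


def consolidate(per_partition: List[Dict[str, float]]) -> Dict[str, float]:
    """Consolidate per-partition costs into one cost per location.

    Summation is an assumption; the claim only says "a consolidation".
    """
    return {loc: sum(p[loc] for p in per_partition) for loc in LOCATIONS}


# Two partitions: (storage location, row count).
partitions: List[Tuple[str, int]] = [("local", 100), ("remote", 10)]
per_partition = [partition_wise_cost(home, rows) for home, rows in partitions]
placement = consolidate(per_partition)
print(placement)  # {'local': 130.0, 'remote': 310.0}
```

In this toy input the large partition lives at `local`, so running the operator there is cheaper overall even though the small partition must be moved.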

Claim 2

Original Legal Text

2. A system according to claim 1 , the processor to execute the processor-executable program code in order to cause the system to: determine, for a first non-partitioned physical query operator in the query operator tree, a second placement cost for each of the plurality of operator execution locations based on placement costs of any child physical operators of the first non-partitioned physical query operator; determine, for a second logical query operator associated with the first non-partitioned physical query operator, a second merged placement cost for each of the plurality of operator execution locations, the second merged placement cost including an indication of the physical query operator having a lowest cost value with and without the transfer cost of the partition-wise placement cost for the second logical query operator; and determine an execution location for the first non-partitioned physical query operator based on the determined second placement cost for the second non-partitioned physical query operator by traversing the query operator tree starting from the highest level down to the lowest level of the query operator tree.

Plain English Translation

This invention relates to optimizing query execution in distributed database systems by determining optimal placement of non-partitioned physical query operators within a query operator tree. The system addresses the challenge of efficiently distributing query processing across multiple execution locations while minimizing costs associated with data transfer and computation. The system includes a processor that executes program code to analyze a query operator tree, which represents a structured breakdown of a database query into individual operations. For a first non-partitioned physical query operator in the tree, the processor calculates a second placement cost for each available execution location, considering the placement costs of its child operators. Additionally, for a second logical query operator linked to the first non-partitioned operator, the system computes a second merged placement cost for each location, incorporating the lowest cost value of the physical operator both with and without partition-wise transfer costs. The execution location for the first non-partitioned operator is then determined by traversing the query operator tree from the highest to the lowest level, ensuring optimal placement decisions are made hierarchically. This approach improves query performance by systematically evaluating and selecting the most cost-effective execution locations for non-partitioned operators, reducing overall query execution time and resource usage.
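For a non-partitioned operator, the placement cost at each location depends only on the operator itself and its children. A minimal sketch, assuming a flat transfer charge whenever a child runs at a different location than its parent (the constant `TRANSFER` and the cost figures are invented for illustration):

```python
from typing import Dict, List

LOCATIONS = ["local", "remote"]
TRANSFER = 5.0  # assumed flat cost of moving a child's result between locations


def placement_cost(own: Dict[str, float],
                   children: List[Dict[str, float]]) -> Dict[str, float]:
    """Placement cost of a non-partitioned operator at each location.

    For every candidate location, add the operator's own cost to the
    cheapest way of obtaining each child's result there: either the child
    runs at the same location, or it runs elsewhere and its result is
    transferred.
    """
    result: Dict[str, float] = {}
    for loc in LOCATIONS:
        total = own[loc]
        for child in children:
            total += min(child[c] + (0.0 if c == loc else TRANSFER)
                         for c in LOCATIONS)
        result[loc] = total
    return result


child_a = {"local": 10.0, "remote": 3.0}
child_b = {"local": 4.0, "remote": 20.0}
join_cost = placement_cost({"local": 1.0, "remote": 2.0}, [child_a, child_b])
print(join_cost)  # {'local': 13.0, 'remote': 14.0}
```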

Claim 3

Original Legal Text

3. A system according to claim 2 , the processor to execute the processor-executable program code in order to cause the system to: determine, for a second partitioned physical query operator in the query operator tree where the second partitioned physical query operator is a parent of the first partitioned physical query operator, a third partition-wise placement cost based on a cost attribute of each of the one or more table partitions associated with the second partitioned physical query operator and the partition-wise placement cost of the first partitioned physical query operator, the second partitioned physical query operator operating on the one or more table partitions within the plurality of operator execution locations and the partition-wise placement cost for each of the one or more table partitions including a cost for each of the plurality of operator execution locations and a transfer cost for the second partitioned physical query operator; determine a third placement cost for the second partitioned physical query operator physical query operator for each of the plurality of operator execution locations based on a consolidation of the determined third partition-wise placement cost for the second partitioned physical query operator; determine, for a third logical query operator associated with the second partitioned physical query operator, a second merged placement cost for each of the plurality of operator execution locations, the second merged placement cost including an indication of the physical query operator having a lowest cost value with and without the transfer cost of the partition-wise placement cost for the second partitioned physical query operator; and determine an execution location for the second partitioned physical query operator based on the determined third partition-wise placement cost for the second partitioned physical query operator by traversing the query operator tree starting from the highest level down to the lowest level of the query operator tree.

Plain English Translation

The system optimizes query execution in distributed database environments by efficiently placing partitioned physical query operators across multiple execution locations. The problem addressed is the high computational and transfer costs associated with executing partitioned queries in distributed systems, where data is divided into partitions and processed across different nodes. The system evaluates placement costs for query operators in a query operator tree, where each operator processes one or more table partitions. For a parent partitioned physical query operator in the tree, the system calculates a partition-wise placement cost based on the cost attributes of its associated table partitions and the partition-wise placement cost of its child operator. This cost includes both the execution cost at each operator execution location and the transfer cost for moving data between locations. The system then consolidates these partition-wise costs to determine an overall placement cost for the parent operator at each execution location. Additionally, the system computes a merged placement cost for a logical query operator associated with the parent operator, considering the lowest cost values with and without transfer costs. The execution location for the parent operator is determined by traversing the query operator tree from the highest to the lowest level, ensuring optimal placement decisions are made hierarchically. This approach minimizes overall query execution costs by balancing computational and data transfer expenses across distributed nodes.
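When a partitioned operator sits on top of another partitioned operator, the per-partition costs can be carried upward partition by partition. The sketch below assumes that parent and child partitions line up one-to-one and that moving one partition's intermediate result costs a flat `TRANSFER`; neither assumption comes from the claim text.

```python
from typing import Dict, List

LOCATIONS = ["local", "remote"]
TRANSFER = 5.0  # assumed flat cost of moving one partition's result

PartitionCosts = List[Dict[str, float]]  # one {location: cost} dict per partition


def parent_partition_wise(own: PartitionCosts,
                          child: PartitionCosts) -> PartitionCosts:
    """Partition-wise placement cost of a parent partitioned operator.

    For each partition and each candidate location, the parent's own cost
    is added to the cheapest child cost for the same partition, charging
    TRANSFER when the child runs at a different location.
    """
    result: PartitionCosts = []
    for own_p, child_p in zip(own, child):
        costs: Dict[str, float] = {}
        for loc in LOCATIONS:
            best_child = min(child_p[c] + (0.0 if c == loc else TRANSFER)
                             for c in LOCATIONS)
            costs[loc] = own_p[loc] + best_child
        result.append(costs)
    return result


child_costs = [{"local": 2.0, "remote": 9.0}, {"local": 12.0, "remote": 1.0}]
own_costs = [{"local": 1.0, "remote": 1.0}, {"local": 1.0, "remote": 1.0}]
parent_costs = parent_partition_wise(own_costs, child_costs)
print(parent_costs)
# [{'local': 3.0, 'remote': 8.0}, {'local': 7.0, 'remote': 2.0}]
```

Keeping the costs per partition preserves the information that the first partition is cheapest at `local` and the second at `remote`, which a single per-operator figure would hide.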

Claim 4

Original Legal Text

4. A system according to claim 3 , the processor to execute the processor-executable program code in order to cause the system to: determine, for a third partitioned physical query operator in the query operator tree where the third partitioned physical query operator is associated with the logical query operator, a fourth partition-wise placement cost based on a cost attribute of each of the one or more table partitions associated with the third partitioned physical query operator; determine a fourth placement cost for the third partitioned physical query operator for each of the plurality of operator execution locations based on the determined fourth partition-wise placement cost, the third partitioned physical query operator operating on the one or more table partitions within the plurality of operator execution locations and the fourth partition-wise placement cost for each partition including a cost attribute for each of the plurality of operator execution locations and a transfer cost for the third partitioned physical query operator; and determine one of the first partitioned physical query operator and the third partitioned physical query operator to execute the logical query operator based on the merged placement cost for the first partitioned physical query operator and the third partitioned physical query operator by traversing the query operator tree starting from the highest level down to the lowest level of the query operator tree, wherein the first partitioned physical query operator and the third partitioned physical query operator determined to have the lowest cost value is selected to execute the logical query operator.

Plain English Translation

This invention relates to optimizing query execution in a distributed database system by efficiently placing partitioned physical query operators within a query operator tree. The system addresses the challenge of minimizing execution costs when processing queries that involve partitioned tables, where different partitions may be distributed across multiple execution locations. The system evaluates multiple candidate physical query operators for executing a logical query operator, considering both partition-wise placement costs and transfer costs between execution locations. For a given partitioned physical query operator, the system calculates a partition-wise placement cost based on cost attributes of the associated table partitions. It then determines a placement cost for each possible execution location by combining the partition-wise cost with transfer costs for moving data between locations. The system compares these costs across candidate operators, traversing the query operator tree from the highest to the lowest level, and selects the operator with the lowest overall cost to execute the logical query operator. This approach ensures optimal query performance by accounting for both computational and data transfer costs in a distributed environment.
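The selection between candidate physical operators can be illustrated with a small merge step. In this sketch the operator names (`hash_join`, `merge_join`), the `Cost` record, and all numbers are hypothetical. The only idea taken from the claim is that the merged cost records, per location, which physical operator is cheapest with and without the transfer cost.

```python
from typing import Dict, NamedTuple


class Cost(NamedTuple):
    base: float      # cost at a location, excluding transfer
    transfer: float  # additional transfer cost at that location


def merge(alternatives: Dict[str, Dict[str, Cost]]) -> Dict[str, Dict[str, str]]:
    """Merged placement cost for one logical operator.

    `alternatives` maps a physical operator name to its per-location cost.
    For each location, the result names the cheapest physical operator
    both without and with the transfer cost.
    """
    locations = list(next(iter(alternatives.values())).keys())
    merged: Dict[str, Dict[str, str]] = {}
    for loc in locations:
        merged[loc] = {
            "without_transfer": min(
                alternatives, key=lambda op: alternatives[op][loc].base),
            "with_transfer": min(
                alternatives,
                key=lambda op: alternatives[op][loc].base
                + alternatives[op][loc].transfer),
        }
    return merged


alternatives = {
    "hash_join": {"local": Cost(10.0, 8.0), "remote": Cost(6.0, 0.0)},
    "merge_join": {"local": Cost(12.0, 1.0), "remote": Cost(9.0, 2.0)},
}
merged = merge(alternatives)
print(merged["local"])
# {'without_transfer': 'hash_join', 'with_transfer': 'merge_join'}
```

The `local` entry shows why both variants are tracked: the cheaper operator changes once the transfer cost is included.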

Claim 5

Original Legal Text

5. A system according to claim 1 , the processor to execute the processor-executable program code in order to cause the system to: determine, for a second partitioned physical query operator in the query operator tree where the second partitioned physical query operator is a parent of the first partitioned physical query operator, a second partition-wise placement cost based on a cost attribute of each of the one or more table partitions associated with the second partitioned physical query operator and the partition-wise placement cost of the first partitioned physical query operator, the second partitioned physical query operator operating on the one or more table partitions within the plurality of operator execution locations and the partition-wise placement cost for each of the one or more table partitions including a cost attribute for each of the plurality of operator execution locations and a transfer cost for the second partitioned physical query operator; determine a second placement cost for the second partitioned physical query operator for each of the plurality of operator execution locations based on a consolidation of the determined second partition-wise placement costs for the second partitioned physical query operator; determine, for a second logical query operator associated with the second partitioned physical query operator, a second merged placement cost for each of the plurality of operator execution locations, the second merged placement cost including an indication of the physical query operator having a lowest cost value with and without the transfer cost of the second partition-wise placement cost; and determine an execution location for the second partitioned physical query operator based on the determined second partition-wise placement cost for the second partitioned physical query operator by traversing the query operator tree starting from the highest level down to the lowest level of the query operator tree.

Plain English Translation

This invention relates to optimizing the execution of database queries in distributed systems by efficiently placing partitioned physical query operators across multiple execution locations. The problem addressed is the high computational and transfer costs associated with executing complex queries in distributed environments, where data is partitioned across multiple nodes or locations. The system determines optimal placement for partitioned physical query operators in a query operator tree by analyzing partition-wise placement costs. For a parent operator in the tree, the system calculates a second partition-wise placement cost based on the cost attributes of its associated table partitions and the partition-wise placement cost of its child operator. This cost includes both the cost of processing at each execution location and the transfer cost between locations. The system then consolidates these costs to determine the overall placement cost for the parent operator across all execution locations. Additionally, the system evaluates the merged placement cost for the logical query operator associated with the parent operator, considering the lowest cost options with and without transfer costs. The execution location for the parent operator is determined by traversing the query operator tree from the highest to the lowest level, ensuring optimal placement decisions are made hierarchically. This approach minimizes overall query execution costs by balancing processing and transfer costs across distributed nodes.

Claim 6

Original Legal Text

6. A system according to claim 1 , the processor to execute the processor-executable program code in order to cause the system to: determine, for a second partitioned physical query operator in the query operator tree where the second partitioned physical query operator is associated with the logical query operator, a second partition-wise placement cost based on a cost attribute of each of the one or more table partitions associated with the second partitioned physical query operator; determine a second placement cost for the second partitioned physical query operator for each of the plurality of operator execution locations based on a consolidation of the determined second partition-wise placement costs for the first partitioned physical query operator; and determine one of the first partitioned physical query operator and the second partitioned physical query operator to execute the logical query operator based on the merged placement cost by traversing the query operator tree starting from the highest level down to the lowest level of the query operator tree, wherein the first partitioned physical query operator and the second partitioned physical query operator determined to have the lowest cost value is selected to execute the logical query operator.

Plain English Translation

The system optimizes query execution in database management by efficiently placing partitioned physical query operators within a query operator tree. The system addresses the challenge of determining the most cost-effective execution strategy for logical query operators by evaluating multiple partitioned physical query operators associated with the same logical query operator. For each partitioned physical query operator, the system calculates a partition-wise placement cost based on cost attributes of the table partitions involved. These partition-wise costs are then consolidated to determine an overall placement cost for each operator execution location. The system compares the placement costs of the first and second partitioned physical query operators, selecting the one with the lowest cost to execute the logical query operator. This selection is made by traversing the query operator tree from the highest to the lowest level, ensuring optimal performance by minimizing execution costs at each level. The system dynamically evaluates and merges costs to determine the most efficient execution path for the query.

Claim 7

Original Legal Text

7. A computer-implemented method, comprising: determining, starting at a lowest level of a query operator tree representation of a query of data stored in one or more table partitions within a plurality of operator execution locations for a first partitioned physical query operator in the query operator tree, a partition-wise placement cost based on a cost attribute of each of the one or more table partitions associated with the first partitioned physical query operator and a partition-wise placement cost of any child physical query operator of the first partitioned physical query operator, the first partitioned physical query operator operating on the one or more table partitions within the plurality of operator execution locations and the partition-wise placement cost for each of the one or more table partitions including a cost for each of the plurality of operator execution locations and a transfer cost; determining a placement cost for the first partitioned physical query operator for each of the plurality of operator execution locations based on a consolidation of the determined partition-wise placement costs for the first partitioned physical query operator; determining, for a logical query operator associated with the first partitioned physical query operator, a merged placement cost for each of the plurality of operator execution locations, the merged placement cost including an indication of the physical query operator having a lowest cost value with and without the transfer cost of the partition-wise placement cost; and determining an execution location for the first partitioned physical query operator based on the determined partition-wise placement cost for the first partitioned physical query operator by traversing the query operator tree starting from a highest level down to the lowest level of the query operator tree.

Plain English Translation

This invention relates to optimizing query execution in distributed database systems by efficiently placing partitioned physical query operators across multiple execution locations. The problem addressed is the high computational and transfer costs associated with executing queries on partitioned data stored across different locations, which can lead to inefficient query performance. The method involves analyzing a query operator tree representing a query on data stored in partitioned tables across multiple execution locations. For a given partitioned physical query operator in the tree, the method calculates a partition-wise placement cost for each table partition associated with the operator. This cost includes both the computational cost at each execution location and the transfer cost of moving data between locations. The method then aggregates these partition-wise costs to determine the overall placement cost for the operator at each execution location. Additionally, the method evaluates the logical query operator associated with the partitioned physical query operator, determining a merged placement cost for each execution location. This merged cost identifies the physical query operator with the lowest cost, considering both scenarios with and without transfer costs. Finally, the method determines the optimal execution location for the partitioned physical query operator by traversing the query operator tree from the highest to the lowest level, using the calculated costs to guide placement decisions. This approach ensures efficient query execution by minimizing computational and transfer costs across distributed systems.
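The two traversal directions in the method can be put together in a compact sketch: costs are computed from the lowest level upward, and locations are then fixed from the highest level downward. The `Op` class, the flat `TRANSFER` charge, and the example plan are assumptions made for illustration. Partition-wise detail is omitted here to keep the two passes visible.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

LOCATIONS = ["local", "remote"]
TRANSFER = 5.0  # assumed flat cost of moving a result between locations


@dataclass
class Op:
    name: str
    own: Dict[str, float]                       # operator's own cost per location
    children: List["Op"] = field(default_factory=list)
    cost: Dict[str, float] = field(default_factory=dict)  # filled bottom-up


def edge(child_loc: str, parent_loc: str) -> float:
    """Transfer cost between a child's and its parent's location."""
    return 0.0 if child_loc == parent_loc else TRANSFER


def cost_bottom_up(op: Op) -> None:
    """First pass: compute placement costs from the leaves to the root."""
    for child in op.children:
        cost_bottom_up(child)
    op.cost = {
        loc: op.own[loc] + sum(
            min(ch.cost[c] + edge(c, loc) for c in LOCATIONS)
            for ch in op.children)
        for loc in LOCATIONS
    }


def place_top_down(op: Op, parent_loc: Optional[str] = None,
                   plan: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Second pass: fix execution locations from the root to the leaves."""
    plan = {} if plan is None else plan
    if parent_loc is None:
        loc = min(LOCATIONS, key=lambda l: op.cost[l])
    else:
        loc = min(LOCATIONS, key=lambda l: op.cost[l] + edge(l, parent_loc))
    plan[op.name] = loc
    for child in op.children:
        place_top_down(child, loc, plan)
    return plan


scan = Op("scan", {"local": 20.0, "remote": 2.0})
agg = Op("agg", {"local": 1.0, "remote": 10.0}, [scan])
cost_bottom_up(agg)
plan = place_top_down(agg)
print(agg.cost)  # {'local': 8.0, 'remote': 12.0}
print(plan)      # {'agg': 'local', 'scan': 'remote'}
```

Here the scan stays at `remote`, where it is cheap, and only its result is moved to `local` for the aggregation.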

Claim 8

Original Legal Text

8. A method according to claim 7 , further comprising: determining, for a first non-partitioned physical query operator in the query operator tree, a second placement cost for each of the plurality of operator execution locations based on placement costs of any child physical operators of the first non-partitioned physical query operator; determining, for a second logical query operator associated with the first non-partitioned physical query operator, a second merged placement cost for each of the plurality of operator execution locations, the second merged placement cost including an indication of the physical query operator having a lowest cost value with and without the transfer cost of the partition-wise placement cost for the second logical query operator; and determining an execution location for the first non-partitioned physical query operator based on the determined second placement cost for the second non-partitioned physical query operator by traversing the query operator tree starting from the highest level down to the lowest level of the query operator tree.

Plain English Translation

This invention relates to optimizing the execution of database queries by determining the most cost-effective placement of query operators in a distributed database system. The problem addressed is distributing query processing across multiple execution locations while minimizing data transfer costs and computational overhead. The method analyzes a query operator tree, which represents the logical and physical operations required to execute a database query. For a non-partitioned physical query operator in the tree, the method calculates a placement cost for each available execution location from the placement costs of its child physical operators. The method then evaluates the logical query operator associated with the non-partitioned physical operator and computes a merged placement cost for each execution location. This merged cost indicates the physical operator with the lowest cost value, both with and without the transfer cost. The execution location for the non-partitioned operator is determined by traversing the query operator tree from the highest to the lowest level, so that placement decisions follow the hierarchy of the tree. This approach balances computation against data transfer and is most relevant to large database systems where query execution is distributed across multiple nodes.

Claim 9

Original Legal Text

9. A method according to claim 8 , further comprising: determining, for a second partitioned physical query operator in the query operator tree where the second partitioned physical query operator is a parent of the first partitioned physical query operator, a third partition-wise placement cost based on a cost attribute of each of the one or more table partitions associated with the second partitioned physical query operator and the partition-wise placement cost of the first partitioned physical query operator, the second partitioned physical query operator operating on the one or more table partitions within the plurality of operator execution locations and the partition-wise placement cost for each of the one or more table partitions including a cost for each of the plurality of operator execution locations and a transfer cost for the second partitioned physical query operator; determining a third placement cost for the second partitioned physical query operator physical query operator for each of the plurality of operator execution locations based on a consolidation of the determined third partition-wise placement costs for the second partitioned physical query operator; determining, for a third logical query operator associated with the second partitioned physical query operator, a second merged placement cost for each of the plurality of operator execution locations, the second merged placement cost including an indication of the physical query operator having a lowest cost value with and without the transfer cost of the partition-wise placement cost for the second partitioned physical query operator; and determining an execution location for the second partitioned physical query operator based on the determined third partition-wise placement cost for the second partitioned physical query operator by traversing the query operator tree starting from the highest level down to the lowest level of the query operator tree.

Plain English Translation

The invention relates to optimizing query execution in distributed database systems by efficiently placing partitioned physical query operators across multiple execution locations. The problem addressed is the high computational cost and inefficiency of determining optimal operator placement in a distributed environment, particularly with partitioned tables and complex query operator trees. The method analyzes a query operator tree to determine the most cost-effective placement of partitioned physical query operators. For a second partitioned physical query operator in the tree, which is a parent of a first partitioned physical query operator, the method calculates a third partition-wise placement cost. This cost is based on the cost attributes of the table partitions associated with the second operator and the partition-wise placement cost of the first operator. The second operator operates on these partitions across multiple execution locations, and the partition-wise placement cost includes both the cost for each execution location and the transfer cost for the second operator. The method then determines a third placement cost for the second operator at each execution location by consolidating the partition-wise placement costs. For a third logical query operator associated with the second operator, a second merged placement cost is calculated for each execution location, indicating the physical operator with the lowest cost, both with and without the transfer cost. Finally, the execution location for the second operator is determined by traversing the query operator tree from the highest to the lowest level, using the determined partition-wise placement costs. This approach ensures efficient query execution by minimizing costs.

Claim 10

Original Legal Text

10. A method according to claim 9 , further comprising: determining, for a third partitioned physical query operator in the query operator tree where the third partitioned physical query operator is associated with the logical query operator, a fourth partition-wise placement cost based on a cost attribute of each of the one or more table partitions associated with the third partitioned physical query operator; determining a fourth placement cost for the third partitioned physical query operator for each of the plurality of operator execution locations based on the determined fourth partition-wise placement cost, the third partitioned physical query operator operating on the one or more table partitions within the plurality of operator execution locations and the fourth partition-wise placement cost for each partition including a cost attribute for each of the plurality of operator execution locations and a transfer cost for the third partitioned physical query operator; and determining one of the first partitioned physical query operator and the third partitioned physical query operator to execute the logical query operator based on the merged placement cost for the first partitioned physical query operator and the third partitioned physical query operator by traversing the query operator tree starting from the highest level down to the lowest level of the query operator tree, wherein the first partitioned physical query operator and the third partitioned physical query operator determined to have the lowest cost value is selected to execute the logical query operator.

Plain English Translation

This invention relates to optimizing query execution in database systems, particularly for partitioned tables. The problem addressed is determining the optimal placement of query operators in a distributed database environment so as to minimize execution costs, including data transfer and processing costs. The method analyzes a query operator tree representing a database query, where the tree includes partitioned physical query operators associated with logical query operators. For a specific partitioned physical query operator in the tree, the method calculates a partition-wise placement cost for each table partition associated with that operator. This cost includes a cost attribute for each possible operator execution location and a transfer cost for the operator. The method then determines the placement cost for the operator at each possible execution location based on these partition-wise costs. To optimize query execution, the method compares the merged placement costs of the partitioned physical query operators that can execute the same logical query operator and selects the operator with the lowest cost value. This selection is made while traversing the query operator tree from the highest to the lowest level. The goal is to minimize overall query execution time and resource usage by distributing query operations across the available execution locations.

Claim 11

Original Legal Text

11. A method according to claim 7 , further comprising: determining, for a second partitioned physical query operator in the query operator tree where the second partitioned physical query operator is a parent of the first partitioned physical query operator, a second partition-wise placement cost based on a cost attribute of each of the one or more table partitions associated with the second partitioned physical query operator and the partition-wise placement cost of the first partitioned physical query operator, the second partitioned physical query operator operating on the one or more table partitions within the plurality of operator execution locations and the partition-wise placement cost for each of the one or more table partitions including a cost attribute for each of the plurality of operator execution locations and a transfer cost for the second partitioned physical query operator; determining a second placement cost for the second partitioned physical query operator for each of the plurality of operator execution locations based on a consolidation of the determined second partition-wise placement costs for the second partitioned physical query operator; determining, for a second logical query operator associated with the second partitioned physical query operator, a second merged placement cost for each of the plurality of operator execution locations, the second merged placement cost including an indication of the physical query operator having a lowest cost value with and without the transfer cost of the second partition-wise placement cost; and determining an execution location for the second partitioned physical query operator based on the determined second partition-wise placement cost for the second partitioned physical query operator by traversing the query operator tree starting from the highest level down to the lowest level of the query operator tree.

Plain English Translation

This invention relates to optimizing query execution in distributed database systems by efficiently placing partitioned physical query operators across multiple execution locations. The problem addressed is the high computational and transfer costs associated with executing partitioned queries in distributed environments, where data is split into partitions and processed across different nodes. The method involves analyzing a query operator tree to determine optimal placement of partitioned physical query operators. For a second partitioned physical query operator, which is a parent of a first partitioned physical query operator, the method calculates a second partition-wise placement cost. This cost is based on the cost attributes of the table partitions associated with the second operator and the partition-wise placement cost of its child operator. The second operator operates on these partitions across multiple execution locations, and the partition-wise placement cost includes both the cost attribute for each location and the transfer cost for the operator. The method then determines a second placement cost for the second operator at each execution location by consolidating the partition-wise placement costs. For the logical query operator associated with the second partitioned operator, a second merged placement cost is calculated for each location, indicating the lowest-cost physical operator with and without transfer costs. Finally, the execution location for the second operator is determined by traversing the query operator tree from the highest to the lowest level, ensuring optimal placement decisions are made hierarchically. This approach minimizes overall query execution costs by considering both computational and data transfer expenses.
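
How a parent operator's partition-wise cost might build on its child's cost can be sketched as below. The claim does not spell out a formula, so this assumes one plausible model: at each location the parent pays its own cost attribute plus the cheapest way to obtain the child's output, with a flat transfer cost added when the child runs elsewhere. All names and numbers are hypothetical.

```python
from typing import Dict

# {partition: {execution location: cost}}
PartitionCosts = Dict[str, Dict[str, float]]


def partition_wise_cost(
    own_cost: PartitionCosts,
    child_cost: PartitionCosts,
    transfer_cost: float,
) -> PartitionCosts:
    """Combine a parent's per-partition cost attributes with its child's costs.

    Assumed model: the child's result is free to consume at the same location
    and costs `transfer_cost` extra when it must be moved from another location.
    """
    result: PartitionCosts = {}
    for partition, per_location in own_cost.items():
        child = child_cost.get(partition, {})
        result[partition] = {}
        for location, cost in per_location.items():
            cheapest_input = min(
                (
                    c + (0.0 if child_loc == location else transfer_cost)
                    for child_loc, c in child.items()
                ),
                default=0.0,  # no child cost recorded for this partition
            )
            result[partition][location] = cost + cheapest_input
    return result


parent = {"p1": {"node_a": 2.0, "node_b": 3.0}}
child = {"p1": {"node_a": 1.0, "node_b": 5.0}}
combined = partition_wise_cost(parent, child, transfer_cost=2.0)
print(combined)
```

In this example, running the parent on `node_b` is still cheapest when it pulls the child's result from `node_a` and pays the transfer, which is the kind of trade-off the partition-wise cost is meant to expose.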

Claim 12

Original Legal Text

12. A method according to claim 7 , further comprising: determining, for a second partitioned physical query operator in the query operator tree where the second partitioned physical query operator is associated with the logical query operator, a second partition-wise placement cost based on a cost attribute of each of the one or more table partitions associated with the second partitioned physical query operator; determining a second placement cost for the second partitioned physical query operator for each of the plurality of operator execution locations based on a consolidation of the determined second partition-wise placement costs for the first partitioned physical query operator; and determining one of the first partitioned physical query operator and the second partitioned physical query operator to execute the logical query operator based on the merged placement cost by traversing the query operator tree starting from the highest level down to the lowest level of the query operator tree, wherein the first partitioned physical query operator and the second partitioned physical query operator determined to have the lowest cost value is selected to execute the logical query operator.

Plain English Translation

This invention relates to optimizing query execution in database systems by efficiently placing partitioned physical query operators within a query operator tree. The problem addressed is the challenge of determining the most cost-effective placement of query operators across multiple execution locations, particularly when dealing with partitioned tables. The method involves analyzing partitioned physical query operators associated with a logical query operator to determine optimal execution locations. For a first partitioned physical query operator, a partition-wise placement cost is calculated based on cost attributes of its associated table partitions, and these costs are consolidated to determine an overall placement cost for each possible execution location. Similarly, for a second partitioned physical query operator linked to the same logical query operator, a second partition-wise placement cost is computed and consolidated. The query operator tree is then traversed from the highest to the lowest level to compare the merged placement costs of the first and second partitioned physical query operators. The operator with the lowest cost value is selected to execute the logical query operator, ensuring efficient resource utilization and performance. This approach optimizes query execution by dynamically evaluating and selecting the most cost-effective operator placements.
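
The consolidation step mentioned above can be illustrated with a short sketch. The claim does not say how partition-wise costs are consolidated, so summing them per execution location is an assumption made here purely for illustration; the partition names, location names, and values are also hypothetical.

```python
from typing import Dict

# {partition: {execution location: cost}}
PartitionCosts = Dict[str, Dict[str, float]]


def consolidate(partition_costs: PartitionCosts) -> Dict[str, float]:
    """Collapse per-partition costs into one placement cost per location.

    Assumption: consolidation is a plain sum over partitions.
    """
    totals: Dict[str, float] = {}
    for per_location in partition_costs.values():
        for location, cost in per_location.items():
            totals[location] = totals.get(location, 0.0) + cost
    return totals


costs = {
    "p1": {"node_a": 3.0, "node_b": 6.0},
    "p2": {"node_a": 4.0, "node_b": 1.0},
}
placement = consolidate(costs)
print(placement)
```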

Claim 13

Original Legal Text

13. A database node comprising: a compilation server to: determine, starting at a lowest level of a query operator tree representation of a query of data stored in one or more table partitions within a plurality of operator execution locations for a first partitioned physical query operator in the query operator tree, a partition-wise placement cost based on a cost attribute of each of the one or more table partitions associated with the first partitioned physical query operator and a partition-wise placement cost of any child physical query operator of the first partitioned physical query operator, the first partitioned physical query operator operating on the one or more table partitions within the plurality of operator execution locations and the partition-wise placement cost for each of the one or more table partitions including a cost attribute for each of the plurality of operator execution locations and a transfer cost; determine a placement cost for the first partitioned physical query operator for each of a plurality of operator execution locations based on a consolidation of the determined partition-wise placement costs for the first partitioned physical query operator; determine, for a logical query operator associated with the first partitioned physical query operator, a merged placement cost for each of the plurality of operator execution locations, the merged placement cost including an indication of the physical query operator having a lowest cost value with and without the transfer cost of the partition-wise placement cost; and determine an execution location for the first partitioned physical query operator based on the determined partition-wise placement cost for the first partitioned physical query operator by traversing the query operator tree starting from a highest level down to the lowest level of the query operator tree.

Plain English Translation

This invention relates to optimizing query execution in distributed database systems by efficiently placing partitioned physical query operators across multiple execution locations. The problem addressed is the high computational and transfer costs associated with executing queries on partitioned data in distributed environments, where suboptimal placement of query operators can lead to inefficient data processing and excessive data movement. The database node includes a compilation server that analyzes a query operator tree representing a database query. Starting at the lowest level of the tree, the server calculates a partition-wise placement cost for a first partitioned physical query operator. This cost is based on the cost attribute of each table partition associated with the operator and on the partition-wise placement cost of any child operators; for each partition it includes a cost attribute for each execution location and a transfer cost. The server then consolidates these partition-wise costs to determine a placement cost for the operator at each execution location. For the logical query operator associated with the physical operator, the server computes a merged placement cost for each location, which indicates the physical operator with the lowest cost value both with and without the transfer cost. Finally, the server determines the execution location for the physical operator by traversing the query operator tree from the highest level down to the lowest level. This approach is intended to reduce data transfer and computational overhead in distributed database systems.
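
The two passes described above (costing from the lowest level up, then assigning locations from the highest level down) can be sketched on a tiny tree. This is an illustrative model under stated assumptions, not the claimed implementation: the `Operator` class, the flat `TRANSFER` cost, and the rule that a child pays the transfer cost when it is placed away from its parent are all hypothetical, and partitions are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

TRANSFER = 2.0  # assumed flat cost of moving a result between locations


@dataclass
class Operator:
    name: str
    local_cost: Dict[str, float]  # location -> cost of running here
    children: List["Operator"] = field(default_factory=list)
    total_cost: Dict[str, float] = field(default_factory=dict)
    location: Optional[str] = None


def input_cost(child: Operator, location: str) -> float:
    """Cheapest way to get the child's result to `location`."""
    return min(
        cost + (0.0 if loc == location else TRANSFER)
        for loc, cost in child.total_cost.items()
    )


def cost_bottom_up(op: Operator) -> None:
    """Pass 1: compute per-location costs, children first."""
    for child in op.children:
        cost_bottom_up(child)
    op.total_cost = {
        loc: cost + sum(input_cost(child, loc) for child in op.children)
        for loc, cost in op.local_cost.items()
    }


def place_top_down(op: Operator, parent_location: Optional[str] = None) -> None:
    """Pass 2: fix locations, parents first."""

    def effective(loc: str) -> float:
        penalty = 0.0 if parent_location in (None, loc) else TRANSFER
        return op.total_cost[loc] + penalty

    op.location = min(op.total_cost, key=effective)
    for child in op.children:
        place_top_down(child, op.location)


scan = Operator("scan", {"node_a": 1.0, "node_b": 5.0})
join = Operator("join", {"node_a": 4.0, "node_b": 1.0}, [scan])
cost_bottom_up(join)
place_top_down(join)
print(join.total_cost, join.location, scan.location)
```

The bottom-up pass only records costs; no location is committed until the top-down pass, so a child's final location can depend on where its parent ended up.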

Claim 14

Original Legal Text

14. A database node according to claim 13 , the compilation server further to: determine, for a first non-partitioned physical query operator in the query operator tree, a second placement cost for each of the plurality of operator execution locations based on placement costs of any child physical operators of the first non-partitioned physical query operator; determine, for a second logical query operator associated with the first non-partitioned physical query operator, a second merged placement cost for each of the plurality of operator execution locations, the second merged placement cost including an indication of the physical query operator having a lowest cost value with and without the transfer cost of the partition-wise placement cost for the second logical query operator; and determine an execution location for the first non-partitioned physical query operator based on the determined second placement cost for the second non-partitioned physical query operator by traversing the query operator tree starting from the highest level down to the lowest level of the query operator tree.

Plain English Translation

This invention relates to optimizing query execution in distributed database systems by efficiently placing query operators across multiple execution locations. The problem addressed is the challenge of minimizing execution costs in distributed query processing, particularly when dealing with non-partitioned operators that do not inherently align with data partitioning schemes. The system involves a compilation server that analyzes a query operator tree to determine optimal placement for each operator. For a non-partitioned physical query operator, the server calculates placement costs for each possible execution location, considering the costs of its child operators. It then evaluates the operator's associated logical query operator, computing a merged placement cost that accounts for both the lowest-cost physical operator and the potential transfer costs of partition-wise placement. The execution location is selected by traversing the query operator tree from the highest to the lowest level, ensuring cost-effective placement decisions. This approach improves query performance by balancing computational and data transfer costs across distributed nodes.
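
One way to read a merged placement cost that indicates the lowest-cost physical operator "with and without the transfer cost" is to keep two winners per location. The sketch below follows that reading, which is an interpretation rather than something the claim text confirms; the operator names, the location name, and the numbers are hypothetical.

```python
from typing import Dict, Optional, Tuple

# physical operator -> {location: (cost without transfer, transfer cost)}
Candidates = Dict[str, Dict[str, Tuple[float, float]]]
Winner = Optional[Tuple[str, float]]


def merged_cost(candidates: Candidates) -> Dict[str, Dict[str, Winner]]:
    """Per location, track the cheapest operator both excluding and including transfer."""
    merged: Dict[str, Dict[str, Winner]] = {}
    for op_name, per_location in candidates.items():
        for location, (base, transfer) in per_location.items():
            entry = merged.setdefault(location, {"without": None, "with": None})
            if entry["without"] is None or base < entry["without"][1]:
                entry["without"] = (op_name, base)
            if entry["with"] is None or base + transfer < entry["with"][1]:
                entry["with"] = (op_name, base + transfer)
    return merged


candidates = {
    "op_x": {"node_a": (3.0, 4.0)},
    "op_y": {"node_a": (5.0, 1.0)},
}
result = merged_cost(candidates)
print(result)
```

The example is chosen so that the two winners differ: `op_x` is cheaper when its output stays put, while `op_y` is cheaper once the output has to be shipped to another location.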

Claim 15

Original Legal Text

15. A database node according to claim 14 , the compilation server further to: determine, for a second partitioned physical query operator in the query operator tree where the second partitioned physical query operator is a parent of the first partitioned physical query operator, a third partition-wise placement cost based on a cost attribute of each of the one or more table partitions associated with the second partitioned physical query operator and the partition-wise placement cost of the first partitioned physical query operator, the second partitioned physical query operator operating on the one or more table partitions within the plurality of operator execution locations and the partition-wise placement cost for each of the one or more table partitions including a cost for each of the plurality of operator execution locations and a transfer cost for the second partitioned physical query operator; determine a third placement cost for the second partitioned physical query operator physical query operator for each of the plurality of operator execution locations based on a consolidation of the determined third partition-wise placement costs for the second partitioned physical query operator; determine, for a third logical query operator associated with the second partitioned physical query operator, a second merged placement cost for each of the plurality of operator execution locations, the second merged placement costs including an indication of the physical query operator having a lowest cost value with and without the transfer cost of the partition-wise placement cost for the second partitioned physical query operator; and determine an execution location for the second partitioned physical query operator based on the determined third partition-wise placement cost for the second partitioned physical query operator by traversing the query operator tree starting from the highest level down to the lowest level of the query operator tree.

Plain English Translation

This invention relates to optimizing query execution in distributed database systems by efficiently placing partitioned physical query operators across multiple execution locations. The problem addressed is the high computational and transfer costs associated with executing partitioned queries in distributed environments, where data is divided into partitions and processed across different nodes. The system involves a compilation server that analyzes a query operator tree to determine optimal placement of query operators. For a second partitioned physical query operator in the tree, which is a parent of a first partitioned physical query operator, the server calculates a third partition-wise placement cost. This cost is based on the cost attributes of the table partitions associated with the second operator and the partition-wise placement cost of the first operator. The second operator processes these partitions across multiple execution locations, and the partition-wise placement cost includes both the cost of processing at each location and the transfer cost between locations. The server then determines a third placement cost for the second operator by consolidating the partition-wise placement costs across all execution locations. For a third logical query operator linked to the second operator, the server calculates a second merged placement cost for each location, indicating the lowest-cost physical query operator option, with or without transfer costs. Finally, the server determines the execution location for the second operator by traversing the query operator tree from the highest to the lowest level, ensuring optimal cost efficiency. This approach minimizes data transfer and processing overhead in distributed query execution.

Claim 16

Original Legal Text

16. A database node according to claim 15 , the compilation server further to: determine, for a third partitioned physical query operator in the query operator tree where the third partitioned physical query operator is associated with the logical query operator, a fourth partition-wise placement cost based on a cost attribute of each of the one or more table partitions associated with the third partitioned physical query operator; determine a fourth placement cost for the third partitioned physical query operator for each of the plurality of operator execution locations based on the determined fourth partition-wise placement cost, the third partitioned physical query operator operating on the one or more table partitions within the plurality of operator execution locations and the fourth partition-wise placement cost for each partition including a cost attribute for each of the plurality of operator execution locations and a transfer cost for the third partitioned physical query operator; and determine one of the first partitioned physical query operator and the third partitioned physical query operator to execute the logical query operator based on the merged placement cost for the first partitioned physical query operator and the third partitioned physical query operator by traversing the query operator tree starting from the highest level down to the lowest level of the query operator tree, wherein the first partitioned physical query operator and the third partitioned physical query operator determined to have the lowest cost value is selected to execute the logical query operator.

Plain English Translation

This invention relates to optimizing query execution in a distributed database system by evaluating and selecting the most cost-effective placement of partitioned physical query operators within a query operator tree. The system addresses the challenge of efficiently distributing query processing across multiple execution locations in a partitioned database environment, where different partitions of tables may be stored in different locations, leading to varying costs for data transfer and processing. The invention involves a compilation server that analyzes a query operator tree to determine the optimal placement of physical query operators for executing a logical query operator. For a given partitioned physical query operator, the server calculates a partition-wise placement cost based on cost attributes of the associated table partitions. This cost includes factors such as the processing cost at each operator execution location and the transfer cost for moving data between locations. The server then evaluates multiple candidate partitioned physical query operators, comparing their merged placement costs to select the one with the lowest overall cost. The selection process involves traversing the query operator tree from the highest to the lowest level, ensuring that the most efficient operator placement is chosen at each stage. This approach optimizes query performance by minimizing data transfer and processing overhead in a distributed database system.

Claim 17

Original Legal Text

17. A database node according to claim 13 , the compilation server further to: determine, for a second partitioned physical query operator in the query operator tree where the second partitioned physical query operator is a parent of the first partitioned physical query operator, a second partition-wise placement cost based on a cost attribute of each of the one or more table partitions associated with the second partitioned physical query operator and the partition-wise placement cost of the first partitioned physical query operator, the second partitioned physical query operator operating on the one or more table partitions within the plurality of operator execution locations and the partition-wise placement cost for each of the one or more table partitions including a cost attribute for each of the plurality of operator execution locations and a transfer cost for the second partitioned physical query operator; determine a second placement cost for the second partitioned physical query operator for each of the plurality of operator execution locations based on the determined second partition-wise placement costs for the second partitioned physical query operator; determine, for a second logical query operator associated with the second partitioned physical query operator, a second merged placement cost for each of the plurality of operator execution locations, the second merged placement cost including an indication of the physical query operator having a lowest cost value with and without the transfer cost of the second partition-wise placement cost; and determine an execution location for the second partitioned physical query operator based on the determined second partition-wise placement cost for the second partitioned physical query operator by traversing the query operator tree starting from the highest level down to the lowest level of the query operator tree.

Plain English Translation

This invention relates to optimizing query execution in distributed database systems by efficiently placing partitioned physical query operators across multiple execution locations. The problem addressed is the high computational and transfer costs associated with executing partitioned query operators in a distributed environment, where data is partitioned across multiple nodes. The database node's compilation server determines execution locations for query operators by calculating partition-wise placement costs, which include a cost attribute for each execution location and a transfer cost. For a second partitioned physical query operator that is a parent of the first partitioned physical query operator in the query operator tree, the server determines a second partition-wise placement cost based on the cost attributes of its associated table partitions and the partition-wise placement cost of the first (child) operator. The server then calculates a second placement cost for each execution location and, for the second logical query operator associated with the second operator, a second merged placement cost that indicates the physical operator with the lowest cost value with and without the transfer cost. The execution location for the second partitioned physical query operator is determined by traversing the query operator tree from the highest level down to the lowest level, so that placement decisions are made hierarchically. This approach is intended to reduce overall query execution cost by balancing computational and transfer costs across distributed nodes.

Claim 18

Original Legal Text

18. A database node according to claim 13 , the compilation server further to: determine, for a second partitioned physical query operator in the query operator tree where the second partitioned physical query operator associated with the logical query operator, a second partition-wise placement cost based on a cost attribute of each of the one or more table partitions associated with the second partitioned physical query operator; determine a second placement cost for the second partitioned physical query operator for each of the plurality of operator execution locations based on a consolidation of the determined second partition-wise placement costs for the first partitioned physical query operator; and determine one of the first partitioned physical query operator and the second partitioned physical query operator to execute the logical query operator based on the merged placement cost by traversing the query operator tree starting from the highest level down to the lowest level of the query operator tree, wherein the first partitioned physical query operator and the second partitioned physical query operator determined to have the lowest cost value is selected to execute the logical query operator.

Plain English Translation

This invention relates to optimizing query execution in distributed database systems by efficiently placing partitioned physical query operators. The problem addressed is the challenge of determining the most cost-effective placement of query operators across multiple execution locations in a distributed environment, particularly when dealing with partitioned tables and logical query operators that can be implemented by multiple physical operators. The database node includes a compilation server that analyzes a query operator tree representing a database query. For a second partitioned physical query operator associated with the same logical query operator as the first, the server calculates a second partition-wise placement cost based on a cost attribute of each table partition associated with that operator; the claim does not specify what the cost attribute measures. These partition-wise costs are consolidated to determine a second placement cost for the operator at each possible execution location. The server then uses the merged placement cost to choose between the first and second partitioned physical query operators, both of which can implement the logical query operator. The decision is made while traversing the query operator tree from the highest level down to the lowest level, and the operator with the lowest cost value is selected. This approach is intended to reduce execution cost while accounting for data distribution and partitioning.

Patent Metadata

Filing Date

April 3, 2019

Publication Date

April 5, 2022

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, FAQs, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “Partition-wise processing distribution” (US-11294903). https://patentable.app/patents/US-11294903

© 2026 Nomic Interactive Technology LLC. Machine-readable context available at /api/llm-context/US-11294903. See llms.txt for full attribution policy.
