Systems, Methods, And Devices For Managing Data Skew In A Join Operation

Published April 6, 2021

Assignee not available in the USPTO data we have

Inventors Florian Andreas Funke, Thierry Cruanes, Benoit Dageville, Marcin Zukowski

Technical Abstract

Patent Claims

21 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 1

Original Legal Text

1. A method for managing data skew, the method comprising: detecting, by a local server comprising at least one hardware processor, data skew on a probe side of a join operation at a runtime of the join operation using a lightweight sketch data structure; identifying, by the local server, a frequent probe-side join key on the probe side of the join operation during a probe phase of the join operation; identifying, by the local server, a frequent build-side row having a frequent build-side join key corresponding to the identified frequent probe-side join key; asynchronously distributing, by the local server in response to detecting the data skew on the probe side of the join operation, the identified frequent build-side row to a plurality of remote servers that frequently transmitted the frequent build-side join key; asynchronously receiving, by each of the plurality of remote servers, the identified frequent build-side row; and altering, by each of the plurality of remote servers, an input link to route a frequent probe-side row comprising the identified frequent probe-side join key to a local instance of the join operation.

Plain English Translation

This invention addresses data skew in distributed join operations, a common problem in large-scale data processing where uneven key distribution causes performance bottlenecks. The method detects and mitigates skew at runtime, using a lightweight sketch data structure to monitor key frequencies. During the probe phase of a join operation, a local server identifies a frequent probe-side join key and the frequent build-side row whose join key matches it. In response to detected skew, the local server asynchronously distributes that build-side row to the remote servers that frequently transmitted the frequent build-side join key. Each remote server receives the distributed row and alters its input link so that probe-side rows containing the frequent key are routed to its own local instance of the join operation. Processing the high-frequency key where its probe rows originate relieves the overloaded local server and improves load balancing across the distributed system. Because detection happens while the join is running, the method adapts to the skew actually observed during execution.
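The rerouting step in Claim 1 can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the patented implementation: the `InputLink` class, its method names, the modulo hash partitioning, and the sample workload are all hypothetical and chosen only to show the effect of altering an input link.

```python
from collections import Counter


class InputLink:
    """Hypothetical probe-side input link of one server (illustrative only)."""

    def __init__(self, server_id, num_servers):
        self.server_id = server_id
        self.num_servers = num_servers
        self.local_keys = set()  # keys whose build-side rows were replicated here

    def receive_frequent_build_row(self, key):
        # Invoked when a replicated build-side row arrives from the owning server.
        self.local_keys.add(key)

    def route(self, key):
        # Frequent keys stay on this server; all other keys are hash-partitioned.
        if key in self.local_keys:
            return self.server_id
        return hash(key) % self.num_servers


def probe_load(links, rows):
    """Count how many probe rows each server ends up joining."""
    return Counter(links[sender].route(key) for sender, key in rows)


links = [InputLink(i, 4) for i in range(4)]
# Every server sends 100 probe rows with the hot key 7 plus one row with a cold key.
rows = [(s, 7) for s in range(4) for _ in range(100)] + [(s, s) for s in range(4)]

before = probe_load(links, rows)  # server 3 (7 % 4) receives every hot row
for link in links:
    link.receive_frequent_build_row(7)
after = probe_load(links, rows)   # hot rows are now joined where they originate
```

In this toy workload, server 3 handles 401 of the 404 probe rows before the links are altered and 101 rows afterwards, the same as every other server.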

Claim 2

Original Legal Text

2. The method of claim 1 , wherein each of the plurality of remote servers is configured to generate a separate hash table for the identified frequent build-side row.

Plain English translation pending...
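One way to picture the separate hash table of Claim 2 is shown below. This is a sketch under assumptions: the `LocalJoin` class, the dictionary-based tables, and the `drop_replicas` helper are hypothetical and do not come from the patent text.

```python
class LocalJoin:
    """Hypothetical local join instance on a remote server (illustrative only)."""

    def __init__(self):
        self.partition_table = {}   # build rows hash-partitioned to this server
        self.replicated_table = {}  # separate table for replicated frequent rows

    def add_build_row(self, key, row, replicated=False):
        # Replicated frequent rows never touch the regular partition table.
        table = self.replicated_table if replicated else self.partition_table
        table.setdefault(key, []).append(row)

    def probe(self, key, probe_row):
        # A probe row may match rows from either table.
        matches = self.partition_table.get(key, []) + self.replicated_table.get(key, [])
        return [(build_row, probe_row) for build_row in matches]

    def drop_replicas(self):
        # Replicas can be discarded without rebuilding the partition table.
        self.replicated_table.clear()
```

Keeping the replicated rows in their own table means they can arrive asynchronously, and be dropped again, without modifying the table built during the regular build phase. That motivation is an interpretation, not something the claim states.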

Claim 3

Original Legal Text

3. The method of claim 1 , further comprising: computing, by the local server, a hash value for the join operation; selecting, by the local server, a rowset comprising a plurality of rows of the join operation; and probing, by the local server, each of the plurality of rows of the rowset into a space saving algorithm using the hash value for the join operation.

Plain English translation pending...
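The "space saving algorithm" in Claim 3 most plausibly refers to the Space-Saving stream summary for finding frequent items, though the claim does not spell this out. The sketch below is a simplified version under that assumption: the class and function names, the row format, and the linear scan for the minimum counter are illustrative choices (production implementations typically use a structure that finds the minimum in constant time).

```python
class SpaceSaving:
    """Simplified Space-Saving summary tracking at most `capacity` keys."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.counts = {}  # hash value -> estimated count (never an underestimate)
        self.errors = {}  # hash value -> maximum possible overestimation
        self.total = 0    # number of rows probed so far

    def probe(self, h):
        """Record one occurrence of hash value h and return its estimated count."""
        self.total += 1
        if h in self.counts:
            self.counts[h] += 1
        elif len(self.counts) < self.capacity:
            self.counts[h] = 1
            self.errors[h] = 0
        else:
            # Evict the entry with the smallest count and inherit its count.
            victim = min(self.counts, key=self.counts.get)
            floor = self.counts.pop(victim)
            del self.errors[victim]
            self.counts[h] = floor + 1
            self.errors[h] = floor
        return self.counts[h]


def probe_rowset(sketch, rowset, key_column):
    """Probe every row of a rowset into the sketch using the join-key hash."""
    for row in rowset:
        sketch.probe(hash(row[key_column]))


sketch = SpaceSaving(capacity=2)
rowset = [{"k": k} for k in [7, 7, 7, 1, 2, 7, 3, 7]]
probe_rowset(sketch, rowset, "k")
```

Because `probe` returns the updated estimate, the frequency of a key is available as a side effect of each update, which is the behaviour Claim 4 describes.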

Claim 4

Original Legal Text

4. The method of claim 3 , further comprising: updating, by the local server, the space saving algorithm based on incoming data; and identifying, by the local server for each update to the space saving algorithm, a frequency indicating how frequently the identified frequent probe-side join key is probed as a side-effect of updating the space saving algorithm.

Plain English translation pending...

Claim 5

Original Legal Text

5. The method of claim 4 , wherein asynchronously distributing the identified frequent build-side row to the plurality of remote servers is also in response to the identified frequency exceeding a predetermined threshold.

Plain English translation pending...
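Claims 4 and 5 together describe a trigger: the frequency obtained while updating the sketch is compared with a predetermined threshold, and crossing it starts the asynchronous distribution. A minimal sketch of that control flow follows. It assumes a plain `Counter` as a stand-in for the space saving algorithm and an `asyncio` task as a stand-in for the network send; the function names and the `inbox` list are hypothetical.

```python
import asyncio
from collections import Counter


async def send_build_row(server, key, inbox):
    # Stand-in for a network send of the frequent build-side row.
    await asyncio.sleep(0)
    inbox.append((server, key))


async def probe_stream(keys, threshold, remotes):
    counts = Counter()   # stand-in for the space saving algorithm
    distributed = set()  # keys whose build-side rows were already sent
    inbox, pending = [], []
    for key in keys:
        counts[key] += 1
        freq = counts[key]  # frequency obtained as a side effect of the update
        if freq > threshold and key not in distributed:
            distributed.add(key)
            # Schedule the sends without blocking the probe loop.
            for server in remotes:
                pending.append(asyncio.create_task(send_build_row(server, key, inbox)))
    await asyncio.gather(*pending)
    return inbox


delivered = asyncio.run(probe_stream([7, 7, 1, 7, 7, 2], threshold=3, remotes=["s1", "s2"]))
```

Here key 7 crosses the threshold on its fourth occurrence, so its build-side row is sent once to each remote server, while keys 1 and 2 never trigger a send.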

Claim 6

Original Legal Text

6. The method of claim 3 , further comprising: calculating, by the local server, a total number of rows of the join operation that have been probed into the space saving algorithm; calculating, by the local server, a threshold per worker thread based on the total number of rows of the join operation that have been probed into the space saving algorithm; and determining, by the local server based on the threshold per worker thread, whether the frequent build-side join key is frequent among all threads of at least one server among the local server and the plurality of remote servers.

Plain English Translation

This claim relates to skew detection for join operations in distributed database systems, specifically to deciding whether a build-side join key is frequent across the worker threads of a server. The problem addressed is data skew, where certain join keys appear far more often than others and overload the server responsible for them. The local server calculates the total number of rows of the join operation that have been probed into the space saving algorithm, which tracks approximate key frequencies. Based on this total, the local server computes a threshold per worker thread. This threshold is then used to determine whether a build-side join key is frequent among all threads of at least one server, either the local server or one of the remote servers. Because the threshold is derived from the number of rows probed so far, the determination adapts to the workload as the join runs.
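The claim does not give a formula for the per-thread threshold, so the following is only one possible reading. It assumes that a key counts as frequent on a server when every worker thread has seen it at least `fraction * total_rows / num_threads` times; the function names, the `fraction` parameter, and the per-thread `Counter` objects are hypothetical.

```python
from collections import Counter


def per_thread_threshold(total_rows, num_threads, fraction):
    """Assumed rule: split the server-wide frequency target evenly across threads."""
    return fraction * total_rows / num_threads


def frequent_on_server(thread_counts, key, threshold):
    """Assumed rule: the key must meet the threshold on every worker thread."""
    return all(counts.get(key, 0) >= threshold for counts in thread_counts)


# Four worker threads have probed 400 rows in total.
threshold = per_thread_threshold(total_rows=400, num_threads=4, fraction=0.25)
thread_counts = [
    Counter({7: 30, 9: 30}),
    Counter({7: 40, 9: 5}),
    Counter({7: 26, 9: 40}),
    Counter({7: 50, 9: 60}),
]
hot = frequent_on_server(thread_counts, 7, threshold)
lukewarm = frequent_on_server(thread_counts, 9, threshold)
```

With these numbers the threshold is 25 rows per thread. Key 7 meets it on all four threads, while key 9 falls short on the second thread and is not treated as frequent.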

Claim 7

Original Legal Text

7. The method of claim 1 , wherein asynchronously distributing the identified frequent build-side row to the plurality of remote servers occurs only after determining, by the local server to a threshold confidence level, that the identified frequent probe-side join key is frequent on the local server.

Plain English translation pending...
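Claim 7 does not define "threshold confidence level". One plausible reading, assuming a Space-Saving style sketch, uses the property that an entry's estimated count minus its recorded error is a guaranteed lower bound on the true count. The function below is a sketch under that assumption; its name and the parameters `min_share` and `min_rows` are hypothetical.

```python
def confidently_frequent(count, error, total, min_share, min_rows=1000):
    """Return True only if the key's guaranteed share of probed rows is high enough.

    count and error are the sketch's estimate and its maximum overestimation;
    total is the number of rows probed so far.
    """
    if total < min_rows:
        # Too few observations to draw a conclusion.
        return False
    guaranteed = count - error  # lower bound on the true frequency
    return guaranteed / total >= min_share
```

For example, an estimate of 5000 with an error of 200 out of 10000 probed rows guarantees a share of at least 48 percent, so the key passes a 40 percent requirement. The same estimate with an error of 2000 guarantees only 30 percent and does not pass.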

Claim 8

Original Legal Text

8. Non-transitory computer readable storage media storing instructions that, when executed by one or more hardware processors, cause the one or more hardware processors to perform operations comprising: detecting, by a local server, data skew on a probe side of a join operation at a runtime of the join operation using a lightweight sketch data structure; identifying, by the local server, a frequent probe-side join key on the probe side of the join operation during a probe phase of the join operation; identifying, by the local server, a frequent build-side row having a frequent build-side join key corresponding to the identified frequent probe-side join key; asynchronously distributing, by the local server in response to detecting the data skew on the probe side of the join operation, the identified frequent build-side row to a plurality of remote servers that frequently transmitted the frequent build-side join key; asynchronously receiving, by each of the plurality of remote servers, the identified frequent build-side row; and altering, by each of the plurality of remote servers, an input link to route a frequent probe-side row comprising the identified frequent probe-side join key to a local instance of the join operation.

Plain English Translation

This invention relates to optimizing database join operations by dynamically addressing data skew during runtime. Data skew occurs when certain join keys are disproportionately frequent, causing performance bottlenecks in distributed database systems. The invention detects skew on the probe side of a join operation using a lightweight sketch data structure, which efficiently tracks key frequencies without excessive overhead. During the probe phase, the system identifies frequent probe-side join keys and corresponding frequent build-side rows with matching join keys. To mitigate skew, the system asynchronously distributes these frequent build-side rows to remote servers that frequently transmit the same join key. Each remote server receives the row and alters its input routing to direct probe-side rows containing the frequent join key to a local join operation instance. This redistribution balances the workload, reducing skew-related delays and improving overall join performance. The approach leverages asynchronous communication to minimize runtime disruption while dynamically adapting to skew patterns during execution.

Claim 9

Original Legal Text

9. The non-transitory computer readable storage media of claim 8 , the operations further comprising: computing, by the local server, a hash value for the join operation, selecting, by the local server, a rowset comprising a plurality of rows of the join operation; and probing, by the local server, each of the plurality of rows of the rowset into a space saving algorithm using the hash value for the join operation.

Plain English translation pending...

Claim 10

Original Legal Text

10. The non-transitory computer readable storage media of claim 9 , the operations further comprising: updating, by the local server, the space saving algorithm based on incoming data; and identifying, by the local server for each update to the space saving algorithm, a frequency indicating how frequently the identified frequent probe-side join key is probed as a side-effect of updating the space saving algorithm.

Plain English translation pending...

Claim 11

Original Legal Text

11. The non-transitory computer readable storage media of claim 10 , wherein asynchronously distributing the identified frequent build-side row to the plurality of remote servers is also in response to the identified frequency exceeding a predetermined threshold.

Plain English Translation

This claim relates to skew handling in distributed join operations, specifically to when a frequent build-side row is sent to remote servers. As the local server updates the space saving algorithm with incoming data, it obtains, as a side effect of each update, a frequency indicating how often the identified frequent probe-side join key is probed. The identified frequent build-side row is asynchronously distributed to the plurality of remote servers in response to the detected probe-side skew and also in response to that frequency exceeding a predetermined threshold. The threshold ensures that only build-side rows for sufficiently frequent keys are propagated to the remote servers. After receiving the row, each remote server can join matching probe-side rows in its local instance of the join operation rather than sending them to the local server, which relieves the overloaded server and improves scalability in large-scale data processing environments.

Claim 12

Original Legal Text

12. The non-transitory computer readable storage media of claim 9 , the operations further comprising: calculating, by the local server, a total number of rows of the join operation that have been probed into the space saving algorithm; calculating, by the local server, a threshold per worker thread based on the total number of rows of the join operation that have been probed into the space saving algorithm; and determining, by the local server based on the threshold per worker thread, whether the frequent build-side join key is frequent among all threads of at least one server among the local server and the plurality of remote servers.

Plain English translation pending...

Claim 13

Original Legal Text

13. The non-transitory computer readable storage media of claim 8 , wherein each of the plurality of remote servers is configured to generate a separate hash table for the identified frequent build-side row.

Plain English translation pending...

Claim 14

Original Legal Text

14. The non-transitory computer readable storage media of claim 8 , wherein asynchronously distributing the identified frequent build-side row to the plurality of remote servers occurs only after determining, by the local server to a threshold confidence level, that the identified frequent probe-side join key is frequent on the local server.

Plain English translation pending...

Claim 15

Original Legal Text

15. A system for managing data skew, the system comprising: a local server comprising: at least one local-server hardware processor; and one or more local-server non-transitory computer readable storage media containing local-server instructions that, when executed by the at least one local-server hardware processor, cause the at least one local-server hardware processor to perform local-server operations comprising: detecting data skew on a probe side of a join operation at a runtime of the join operation using a lightweight sketch data structure; identifying a frequent probe-side join key on the probe side of the join operation during a probe phase of the join operation; identifying a frequent build-side row having a frequent build-side join key corresponding to the frequent probe-side join key; and asynchronously distributing, in response to detecting the data skew on the probe side of the join operation, the identified frequent build-side row to a plurality of remote servers that frequently transmitted the frequent build-side join key; and the plurality of remote servers each comprising: at least one remote-server hardware processor; and one or more remote-server non-transitory computer readable storage media containing remote-server instructions that, when executed by the at least one remote-server hardware processor, cause the at least one remote-server hardware processor to perform remote-server operations comprising: asynchronously receiving the identified frequent build-side row; and altering an input link to route a frequent probe-side row comprising the identified frequent probe-side join key to a local instance of the join operation.

Plain English Translation

The system addresses data skew in distributed join operations, a common issue in large-scale data processing where uneven key distribution causes performance bottlenecks. During a join operation, a local server detects skew on the probe side at runtime using a lightweight sketch data structure, which tracks key frequencies with little overhead. The local server identifies a frequent probe-side join key and the corresponding frequent build-side row during the probe phase. Upon detecting skew, the local server asynchronously distributes the frequent build-side row to the remote servers that frequently transmitted the frequent build-side join key. Each remote server receives the row and alters its input link to route probe-side rows containing the frequent key to a local join instance, balancing the workload. This dynamic redistribution mitigates skew by ensuring that frequent keys are processed where their probe rows originate, reducing network overhead and improving join efficiency. Asynchronous communication keeps the redistribution from stalling the running join.

Claim 16

Original Legal Text

16. The system of claim 15 , the local-server operations further comprising: computing a hash value for the join operation; selecting a rowset comprising a plurality of rows of the join operation; and probing each of the plurality of rows of the rowset into a space saving algorithm using the hash value for the join operation.

Plain English translation pending...

Claim 17

Original Legal Text

17. The system of claim 16 , the local-server operations further comprising: updating the space saving algorithm based on incoming data; and identifying, for each update to the space saving algorithm, a frequency indicating how frequently the identified frequent probe-side join key is probed as a side-effect of updating the space saving algorithm.

Plain English translation pending...

Claim 18

Original Legal Text

18. The system of claim 17 , wherein asynchronously distributing the identified frequent build-side row to the plurality of remote servers is also in response to the identified frequency exceeding a predetermined threshold.

Plain English translation pending...

Claim 19

Original Legal Text

19. The system of claim 16 , the local-server operations further comprising: calculating a total number of rows of the join operation that have been probed into the space saving algorithm; calculating a threshold per worker thread based on the total number of rows of the join operation that have been probed into the space saving algorithm; and determining, based on the threshold per worker thread, whether the frequent build-side join key is frequent among all threads of at least one server among the local server and the plurality of remote servers.

Plain English Translation

This invention relates to optimizing join operations in distributed database systems, particularly the handling of frequent join keys across multiple servers. The problem addressed is data skew: when certain join keys appear very frequently, the servers responsible for those keys become bottlenecks. The system includes a local server and multiple remote servers that collaborate to execute a join operation. The local server calculates the total number of rows of the join operation that have been probed into a space saving algorithm, which tracks approximate key frequencies. Based on this total, a threshold is computed per worker thread. The system then uses that threshold to determine whether a frequent build-side join key is frequent among all threads of at least one server, either the local server or any of the remote servers. Because the threshold is derived from the number of rows probed so far, the determination adapts to the workload as the join runs.

Claim 20

Original Legal Text

20. The system of claim 15 , wherein each of the plurality of remote servers is configured to generate a separate hash table for the identified frequent build-side row.

Plain English translation pending...

Claim 21

Original Legal Text

21. The system of claim 15 , wherein asynchronously distributing the identified frequent build-side row to the plurality of remote servers occurs only after determining, by the local server to a threshold confidence level, that the identified frequent probe-side join key is frequent on the local server.

Plain English Translation

This invention relates to distributed database systems, specifically optimizing join operations in a distributed environment. The problem addressed is data skew during join operations, where a few join keys occur so frequently that the server responsible for them becomes a bottleneck. The system includes a local server and multiple remote servers. The local server identifies a frequent probe-side join key and the frequent build-side row with the corresponding key. Only after the local server has determined, to a threshold confidence level, that the probe-side join key is frequent on the local server is the build-side row distributed asynchronously to the remote servers. This ensures that only rows for genuinely high-frequency join keys are propagated, reducing network overhead and improving join performance. By basing data distribution on frequency analysis, the system avoids unnecessary data transfers while keeping join processing efficient. This approach is particularly useful in large-scale distributed databases where join operations are frequent and performance is critical.

Patent Metadata

Filing Date

Unknown

Publication Date

April 6, 2021

Inventors

Florian Andreas Funke

Thierry Cruanes

Benoit Dageville

Marcin Zukowski
