An improved selection-based parallel range-join algorithm in hypercubes
Author(s)
Shen, Hong
Griffith University Author(s)
Year published
1994
Abstract
The range-join of two sets R and S is the set of all tuples (r, s) satisfying e1 ⩽ |r-s| ⩽ e2, where r ∈ R and s ∈ S. For computing the range-join of R and S in a hypercube of p processors, this paper presents an improved selection-based parallel algorithm that reduces the local memory from the O(n) required by the previous algorithm to O(m+n/p), where |R| = m, |S| = n and p ⩽ max{m, n}. The new algorithm also reduces the best-case time complexity from the previous result of O(m/p log2 p + n/p log m) to O(m+n/p log2p) when m⩾plog, while maintaining cost optimality in the worst case. Unlike the previous algorithm, our algorithm works by selecting the median of R ∪ S to partition the whole data set evenly for the divide-and-conquer join in the next phase. We present an upper bound on the time complexity of the algorithm in the general case and show that its best-case time complexity is better than that of permutation-based range-join when n⩾plogp+1.
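As an illustration of the range-join definition in the abstract only, the following is a minimal sequential sketch in Python. It is not the paper's parallel hypercube algorithm: the function name `range_join`, the use of a sorted copy of S with binary search, and the assumption that 0 ⩽ e1 ⩽ e2 are all choices made here for the example.

```python
from bisect import bisect_left, bisect_right


def range_join(R, S, e1, e2):
    """Return all pairs (r, s) with e1 <= |r - s| <= e2.

    Sequential illustration of the definition only; assumes 0 <= e1 <= e2.
    """
    if not 0 <= e1 <= e2:
        raise ValueError("expected 0 <= e1 <= e2")
    sorted_s = sorted(S)
    result = []
    for r in R:
        # Elements s below r that qualify lie in [r - e2, r - e1].
        lo1 = bisect_left(sorted_s, r - e2)
        hi1 = bisect_right(sorted_s, r - e1)
        # Elements s above r that qualify lie in [r + e1, r + e2].
        # When e1 == 0 the two intervals meet at s == r, so start the
        # second scan at hi1 to avoid reporting the same element twice.
        lo2 = max(bisect_left(sorted_s, r + e1), hi1)
        hi2 = bisect_right(sorted_s, r + e2)
        result.extend((r, s) for s in sorted_s[lo1:hi1])
        result.extend((r, s) for s in sorted_s[lo2:hi2])
    return result


def range_join_bruteforce(R, S, e1, e2):
    """Direct transcription of the definition, used as a cross-check."""
    return [(r, s) for r in R for s in S if e1 <= abs(r - s) <= e2]
```

Calling `range_join([1, 5, 10], [2, 3, 8, 12], 1, 3)` pairs each r with the elements of S whose distance from r lies between 1 and 3; the brute-force version gives the same set of pairs and is useful for checking small inputs.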
Publisher URI
Subject
Environmental Sciences