CS 467/567, Assignment 4
Due on 22 April at 11:59 pm
This assignment can be solved in teams of no more than two students. Submit your solutions by
email as a document typeset to PDF. Include all your references (if any) and also cite them in the
text as appropriate.
Late submissions will incur a penalty of 20% per day late.
Problem 1: Divide and Conquer Prefix Computations
We use again divide and conquer method to perform prefix computation (with ◦ as binary operator) on a sequence S of size n. This time the running time will be O(log log n) time using
O(n/ log log n) processors. This algorithm would be optimal, faster than the one mentioned in
the class notes, but also using more processors. To obtain such an algorithm we proceed as follows:
(1) Divide the sequence S into n
subsequences of size n
each. Perform prefix computation
on each sequence recursively using n
1/2 processors. Let the result of the computation on the
i-th sequence be hs(i, 1),s(i, 2), . . . ,s(i, n
(2) Perform a prefix computation on the sequence hs(1, n
), . . . ,s(n
1/2 − 1, n
)i using n processors. Let the result of this computation be hs
), . . . ,s
(3) For all 1 ≤ i < n
1/2 use ◦ to combine s
) with all the elements of hs(i + 1, 1),s(i +
1, 2), . . . ,s(i + 1, n
)i using n
1/2 processors per subsequence.
1. Show that the algorithm described above uses n processors to run in O(log log n) time.
2. Show how the number of processors can be reduced to O(n/ log log n) while maintaining
the O(log log n) running time, thus achieving the optimal cost O(n).
3. Discuss the implications of the result obtained in Question 2 on related computations such
as array packing.
Problem 2: Descending in a Hypercube
Let N = 2
g data be stored in the processors of a hypercube, one value per processor (so that the
is stored in Pi
, 0 ≤ i < N). A technique with wide applicability to hypercube algorithms
is called DESCEND and consists of g iterations. During the j-th iteration a basic binary operation
, Pl) is performed on data whose indices differ by 2j
. Here is the pseudocode for
this process, with ig−1ig−2 . . . i1i0 the binary representation of index i:
for j = g − 1 down to 0 do
for i = 0 to 2
g − 1 do in parallel
if ij = 0 then OPERATION(Pi
If OPERATION requires constant time then DESCEND runs in O(log N) time. A dual to DESCEND
is ASCEND, in which j in the outer loop varies from 0 to g − 1.
1. Show how DESCEND can be used to obtain an algorithm that computes the sum of all the
values xi and places the result in P0.
2. Use either ASCEND or DESCEND to broadcast the datum held by some processor Pk
to all the
3. Suggest other problems that can be solved using these paradigms.
本网站支持 Alipay WeChatPay PayPal等支付方式
E-mail: firstname.lastname@example.org 微信号:vipnxx