这是一个美国的C语言数据库系统编程代写
overview
In this project, you are asked to implement in C algorithms for various versions of binary search and to use them to implement an in-memory band-join algorithm. Grading will be based on both your code, and your analysis of the performance results you generate on Google cloud machines. You will work in groups of two or three. For three person groups, there is an additional programming requirement.
By default we’ll be using the same groups as for project 1. If your partner from project 1 dropped the class, or if you wish to work with different people this time, the TAs will facilitate the matching of students. The TAs will post details of the matching process on Ed. If for some reason (e.g., a partner drops the class well into the semester) a student ends up working alone, the student is expected to complete the entire project expectations for 2-person groups.
The project consists of several parts, each with its own submission deadline. You can download the code templates from courseworks under the “Files” tab. You will be running the code on Google cloud machines. The TAs will provide each of you with a $50 coupon for the use of the Google cloud. The TAs will separately provide documentation for how to use the Google cloud for this project. In particular, it will be important for you to select the right class of machines that support the full AVX512 SIMD instruction set.
Details of exactly how to submit the code and pdf files will be provided later.
Part 1: Data–Dependent Binary Search
In this part of the project, you will explore the conversion of control dependencies to data dependencies to improve performance. You are supplied with a fully functional C function lower bound that performs a binary search where you want the the index of the first key that is bigger than or equal to the search key. You are initially expected to write two C functions lower bound nb arithmetic and lower bound nb mask that replace the if test in lower bound with code that does not use either the if statement or the C idiom condition ? x : y. As implied by the names of the functions, lower bound nb arithmetic should use arithmetic operations to avoid if statements, while lower bound nb mask should use bit manipulation operations (e.g., &,|,~) and masking. Hint: you may use one subtraction in the lower bound nb mask function, and the 64-bit representation of -1 is 0xFFFFFFFFFFFFFFFF.
You should now write a function lower bound nb mask 8x that tries to do 8 binary searches in parallel, hoping that the underlying system will overlap the cache miss latencies for concurrent searches. Inside this routine, there should be an outer while loop to check a termination condition, and an inner for loop to iterate through each of the 8 searches. The inner loop should use the same mask-based computation as your lower bound nb mask algorithm. These is a subtlety in this parallelization in that some of the 8 searches may need more iterations than others. In such a case your loop should continue until all searches are complete and should make sure that “extra” loops of a completed search do not change the search result.
Finally, you should write a function lower bound nb mask 8x AVX512 that uses AVX512 SIMD instruc- tions to do 8 binary searches in parallel. In AVX512, 512-bit registers can hold eight 64-bit values and operate on the component values in parallel. You can find a set of functions called “intrinsics” that will be useful for the project here. For our purposes, you will primarily need functions that have a mm512 prefix and a suffix epi64 meaning that they operate on 64-bit signed integers. Within these instructions, you’ll probably want to use some combination of the following types of instructions:
setl These instructions allow you to load all elements of a vector with a scalar value.
cmp.. These are various comparison instructions such as cmpge and cmplt that compare two registers and
set a scalar bitmap as the result.
srli Shift right. Handy for dividing by 2.
mask blend Combine data from two different registers into a third register depending on the mask bitmap.
i64gather Load 8 values from 8 different offsets from a base address.
add,sub,… There are various arithmetic functions available.
You will profile the code using the provided bulk binary search and bulk binary search 8x functions. Simply uncomment the variant of the code that you want to measure within bulk binary search and bulk binary search 8x. You will run the code by invoking
./db4112 N X Y Z R
For this part of the project you can ignore X, Y and Z and supply dummy values. N is the number of elements in the array being searched, which is also the same as the number of probes. For this test, the provided code searches the array for each of its elements in a random order. R is the number of repeats of the experiment within the timing loop, which you should set to an appropriate number to deliver stable results. (If this number is too low, then various overheads of setting up the search may dominate the measured time; if this number is too high, your experiment will take too long.) For example, when I run
./db4112 10000 1 1 1 1000
to profile lower bound on a local machine, the output is
Time in bulk_binary_search loop is 1016150 microseconds or 0.101615 microseconds per search Time in bulk_binary_search_8x loop is …
For this part of the project you will submit your code for the lower bound nb arithmetic, lower bound nb mask, lower bound nb mask 8x and lower bound nb mask 8x AVX512 functions. You will also submit a pdf doc-ument containing a graph showing the measured performance of the five routines as you varied N from 10 to 107 . So N should be on the x-axis in log-scale, time-per-search should be on the y-axis in linear-scale, and your graph should have five labeled curves, one per function. Make sure to measure enough N values to generate reasonable resolution in your results. In the pdf document, you should:
- Statethe machine characteristics for the machine on the Google cloud that you used. See the companion Google cloud documentation for how to identify the machine characteristics.
- Statewhichof the methods performs best in your experiments.
- Inoneto two paragraphs, explain the shape of your measured performance curves. For example, you might notice an increase in the time taken at particular values of N. Explain the significance of these particular N values.
程序辅导定制C/C++/JAVA/安卓/PYTHON/留学生/PHP/APP开发/MATLAB

本网站支持 Alipay WeChatPay PayPal等支付方式
E-mail: vipdue@outlook.com 微信号:vipnxx
如果您使用手机请先保存二维码,微信识别。如果用电脑,直接掏出手机果断扫描。
