In assignment 2, we are looking for you to be able to demonstrate a few different things:
Using pthread locks and signals
Managing concurrency with pthread locks and signals, and FIFOs
Starting, running, and managing processes.
Comparisons of runtimes using simple statistics
Submissions that violate any of the following will not be evaluated (i.e., you will receive a score of 0 for the assignment):
All solutions must be written in C. No other languages are accepted.
All solutions must include a working Makefile.
Forget how to make a Makefile? Never knew how to make a Makefile?
- Use the Makefile that has been distributed to you in class (you have permission to use this!), or
- You can go to https://makefiletutorial.com and find some very easy to use basic examples.
Your Makefile should contain a clean rule. When make clean is run in your directory, all the temporary files, including the executable should be removed (e.g., rm -f my_prog).
All solutions must be compiled by issuing the command make in the directory containing the Makefile. No other build commands are acceptable, even if documented in the README.md.All solutions must include a Markdown-formatted README.md file that minimally describes how to run your submission.
Forget how to make a README.md? Never knew how to make README.md? Thankfully, you can go to https://readme.so/ and build a nice, Markdown-formatted README.md with a nice preview and some handy buttons to help you build it.
All solutions must run to successful completion. Premature termination for any reason (no obvious output, Segmentation Fault, Bus Error, etc) is considered to be a solution implementation that does not run.
For programs that are expected to terminate on their own (without your intervention),programs should complete in a reasonable amount of time (e.g., < 30 seconds), or have output indicating that it has not deadlocked.
All solutions must compile. Code that does not compile will not be evaluated.
Programs must produce no errors when compiled with the flags
-Wall -Wpedantic -Wextra -Werror
Note that -Werror prevents your code from being compiled when warnings are present.
If these flags are not in your Makefile, your submission will be treated as though it does not compile.
Your code must compile and run on a machine on aviary.cs.umanitoba.ca.
This assignment is subject to the late policy that is outlined in the ROASS. Submissions will not be accepted by e-mail.
Reminder: All submitted code will be evaluated using an automated similarity testing tool,alongside commonly available online solutions for problems and students submissions from past terms.
We want to explore concurrency in operating systems: Does it improve processing speed? What are different ways to use concurrency tools that operating systems provide us?
To do this, we are going to:
take many text files
in each text file, read the file word-by-word place all the letters that start with ‘a’ in a file named a.txt, b in b.txt, and so onwords may start with letters that can be in upper or lower case. Both upper and lower case words should go to the same file. Example: Aardvark and aardvark would both go to a.txt.
any words that do not start with a letter (“‘*\[\]() etc.) go to other.txt And that’s it? What could possibly go wrong? Except that writing to a file isn’t thread-safe.
Order does not matter in our output files, so the only issue with writing the output files with multiple threads is losing or interleaving data. The output files should have 1 word per line that is not mangled.
Question 1: Baseline – single threading
Write a C program that does the above. Use no threading, or multiprocessing of any kind.
./myQ1 all.txt the.txt files.txt
There are a number of input files that are available for your testing. Don’t unzip it in your home folder on the unix system. If you want to test in aviary, you can use the path ~comp3430/shared/a2_files
You may want to write your outputs to the local machine’s /tmp folder, otherwise you may fill up your home folder’s space. Check your current usage with quota -s.
Question 2: Multiple threads
Create a work pool by using a configurable number pthreads.
Your command-line arguments should be:
./myQ2 number_of_threads all.txt the.txt files.txt
./myQ2 number_of_threads inputs/*.txt
./myQ2 number_of_threads ~comp3430/shared/a2_files/*.txt
The threads should be requesting work from the main process by using semaphores/condition variables by passing file names in shared memory.The threads must lock the output files for writing, otherwise they will not successfully write the data to the files. Use pthread locks for locking the output text files.
Question 3: FIFOs with threads
Now, do it once more, but using a different way of writing the data.
Use the same work pool that passes filenames for processing the files, but instead of using pthread locks, pass the data to a FIFO (named pipe). Have 26 processes or threads that are reading from the pipes and writing the files.
./myQ3 number_of_worker_threads all.txt the.txt files.txt
./myQ3 number_of_worker_threads inputs/*.txt
./myQ3 number_of_worker_threads ~comp3430/shared/a2_files/*.txt
You may choose to use 1 program with many threads to do this, or create and launch 2 programs – one that handles the file writer threads/processes that read from FIFOs and write the files, one that handles the reading and writing to the FIFOs. The UNIX way would be to have “one program that does its job well”, therefore the latter would be a “better” solution.
First, we need to find the optimal number of worker threads to use for parts 2 and 3. Run your programs many times with 1 to 100 worker threads. Create a scatterplot or line graph with the xaxis the number of threads, and the y-axis the time taken to do the processing. Provide the graphs, with 1 paragraph (100 words or less) explaining why you think the data looks like it does,and what you believe the optimal number of threads to be.
Write a report describing your findings. Which hypothesis is true:
H0 : Single threading is the fastest
H1 : Using multiple worker threads with locks on files to handle concurrency is the fastest(using your optimal number of threads)
H2 : Using FIFOs and to handle concurrency is the fastest (using your optimal number of worker threads)
H3 : As long as we are multi-tasking, it’s about the same
Do do this, you will need to run your programs many times so that you can create a mean and standard deviation. Run your programs at least 100 each to create the data for your analysis.Provide a boxplot that describes the runtimes of your programs.
For our purposes, if the whiskers (3rd standard deviations) do not overlap, it is statistically significant.
Write 1 paragraph of what you conclude. 200 words maximum. Provide an explanation of the results. Was the result significantly different between the approaches, why or why not?
Notes about graphs, data collection, files, and libraries
Don’t manually run your programs 100 times each. That’s dumb. Write a loop in your program do run the code 100 times (exec if you’re feeling cool). Or, use a bash for loop.
for myCount in `seq 100`
echo Change the below to run the program $myCount
Have your program print its runtime to a file. There’s provided code for getting the runtime in milliseconds.
Then, import that data into Excel – the university pays for your Office 365 license! (With your money, so free but not)
You should use the appropriate way of adding images for your provided graphs.
Consider not polluting your code space! Your makefile can create folders outputs and pipes wherein you put… your outputs…. and pipes. This makes them much easier to clean up.
I used a number of libraries to make my program go. I’ve documented why I imported them – you may want to use similar libraries. The string library is particularly helpful in … dealing with strings.
本网站支持 Alipay WeChatPay PayPal等支付方式
E-mail: email@example.com 微信号:vipnxx