A Smurf

Go Smurfs Project

by: burt rosenberg
at: october 2022

Overview

This project continues goals the exploration of concurrency and concurrency harnessing primitives.

We will be using a barrier construct to coordinate the completion of several tasks by our friendly cohort of Smurfs. Each Smurf carries out a sequence of tasks, which each Smurf taking a various amount of time for each task. However, all Smurfs must finish their first task before any Smurf begins its second task. And so on: all Smurfs must finish their i-th task before beginning their i+1-th task.

The barrier construction described in Allen B. Downey's The Little Book of Semaphores gives us the mechanism. It uses the class PV semaphore. The previous project gave exercises in the mutex and the condition variable upon which signal/wait can be performed. Here we look at semaphores, a concurrency primitive introduced as the PV semaphore by computer science Edgar Dijkstra in 1963 (see references).

The goals for this project are,

  1. The PV semaphore, and the POSIX version.
  2. The barrier pattern for concurrency.
  3. Techniques of sharing data among threads in a single process space, including,
    1. Common data: the data location is shared by all threads and,
    2. Thread-local data: which have separate data locations for each thread.

Man page

NAME
    smurf --- smurfs do their tasks
      
SYNOPSIS

    smurf [-v] [-k number-of-tasks] number-of-smurfs

DESCRIPTION

    A number of smurfs complete a number of tasks. All the smurfs must finish 
    the i-th task before any begin the i+1-th task. The amount of time a smurf 
    finishes a given task depends on both the smurf and the task.
 
OPTIONS

    -v verbose
    -k the number of tasks. The default is 5 tasks.
     
ARGUMENTS

    The number of smurfs.

OUTPUT
  
    The output for the non-verbose mode is given precisely in the template.
    
HISTORY

    New for csc421-231.

    

POSIX semaphores

The implementation in this project will be based on POSIX semaphores. The three POSIX semaphore operations are,

  1. sem_init to initialize the semaphore,
  2. sem_wait to preform the P operation,
  3. sem_post to perform the V operation.
The POSIX semaphore is defined slightly different than the classical PV semaphore. If a thread will wait on the semaphore, the classical PV semaphore decrements then waits whereas the POSIX semaphore waits then decrements. As a result, a count which could be associated to a POSIX semaphore is never negative.

Data structures

The data structures that share information among threads are,

  1. The smurf_start_g global.
  2. The struct Barrier which includes all variables to implement the barrier,
  3. The struct T_arg which is the argument to the thread.


extern int smurf_start_g ;

struct Barrier {
	unsigned int count ;
	unsigned int n ;
 	sem_t mutex ;
 	sem_t turnstile ;
 	sem_t turnstile2 ;
} ;

struct T_arg {
 	unsigned int smurf ;
 	unsigned int task_num ;
 	struct Barrier * barrier ;
} ;

Some data is common to all threads. The threads all start when the value of smurf_start_g is made non-zero. All threads can access this data cell because,

All threads access a common barrier structure through a pointer to the structure that is copied into the barrier field of every struct T_arg.

The barrier structure contains,

These are given the same name as in the reusable barrier solution, section 3.7.5 of Downey's book.

Some data is per-thread. The struct T_arg has the field smurf which is set to the particular smurf number.

The template files.

Find in [repo]/class/proj3 the files,

  1. The Makefile,
  2. The header file smurf.h,
  3. The main C file smurf.c,
  4. The helper C file smurf-util.c,
  5. The output reference smurf.ref.
(see this in github)

The project requires that you complete the smurf() function in the in file smurf.c. The reference, header and helper files probably do not need changes. You might be editing the Makefile to ease your project development and test.

The required output

The line,
   printf("smurf %d has finished task %d\n",t_arg->smurf,task) ;
in the template, should remain, so that the required output is identical for all working programs.

Here is an example of the output. The left is a run from the template files as given in the class folder, your starting point. To the right is a completed solution. Depending on how the scheduler schedules your threads your milage may vary.


THE RESULT WITHOUT BARRIERS
---------------------------
Get ready, Smurfs!
3
2
1
GO SMURFS!!!
smurf 2 has finished task 1
smurf 5 has finished task 1
smurf 2 has finished task 2
smurf 3 has finished task 1
smurf 1 has finished task 1
smurf 2 has finished task 3
smurf 4 has finished task 1
smurf 5 has finished task 2
smurf 4 has finished task 2
smurf 3 has finished task 2
smurf 4 has finished task 3
smurf 2 has finished task 4
smurf 1 has finished task 2
smurf 5 has finished task 3
smurf 4 has finished task 4
smurf 3 has finished task 3
smurf 2 has finished task 5
smurf 4 has finished task 5
smurf 5 has finished task 4
smurf 1 has finished task 3
smurf 1 has finished task 4
smurf 1 has finished task 5
smurf 3 has finished task 4
smurf 3 has finished task 5
smurf 5 has finished task 5


THE RESULT WITH BARRIERS
------------------------
Get ready, Smurfs!
3
2
1
GO SMURFS!!!
smurf 2 has finished task 1
smurf 5 has finished task 1
smurf 3 has finished task 1
smurf 1 has finished task 1
smurf 4 has finished task 1
smurf 4 has finished task 2
smurf 2 has finished task 2
smurf 5 has finished task 2
smurf 3 has finished task 2
smurf 1 has finished task 2
smurf 4 has finished task 3
smurf 2 has finished task 3
smurf 5 has finished task 3
smurf 3 has finished task 3
smurf 1 has finished task 3
smurf 1 has finished task 4
smurf 4 has finished task 4
smurf 2 has finished task 4
smurf 5 has finished task 4
smurf 3 has finished task 4
smurf 3 has finished task 5
smurf 1 has finished task 5
smurf 4 has finished task 5
smurf 2 has finished task 5
smurf 5 has finished task 5

Hints! How to get your types right.

The data structures in the project are slightly sophisticated, with struct's that contain pointers to other structures. Thinking through the data structure is good, but one can also just let the types guide you. Some people misrepresent C as a weakly typed language. Maybe they mean is that C supports unlimited coercion. The type system is clear and explicit, with exact rules that will help you find the right syntax.

For instance, the call to sem_post requires an argument of type sem_t *. Following syntax rules, you can figure out what to code:

  1. Suppose:
    t_a is of type struct T_arg *,
  2. apply the indirection operator:
    *t_a is of type struct T_arg,
  3. apply the selection operator:
    (*t_a).barrier is of type struct Barrier *,
  4. rewrite using the component selection operator:
    (*t_a).barrier rewritten as t_a->barrier
  5. apply the indirection operator:
    *t_a->barrier is of type struct Barrier,
  6. apply the selection operator:
    (*t_a->barrier).turnstile is of type sem_t,
  7. rewrite using the component selection operator:
    (*t_a->barrier).turnstile rewritten as t_a->barrier->turnstile,
  8. apply the address operator:
    &t_a->barrier->turnstile is of type sem_t *.

I have made use of the equivalence of the component section operator which combines the indirection operator with the selection operator,

    p->f is always the same as (*p).f (Harbison & Steele 7.4.2 p 149 2nd Ed., p 213 5th Ed.)

Let's talk about linking–loading.

A runnable is a combination of compiled code in modules and libraries. The code modules customarily have extension .o and the libraries have extension either .a, for archive, or .so (dot es oh), for shared object. The program ld creates the runnable, and is called the loader. Some older people would call it the linking-loader, as there are two steps to creating the runnable, linking and loading.

For simple programs the loader is almost invisible. It is combined by default with the compiler, and all that some younger people might know of the loader is the ‐l option that specifies libraries in some mysterious manner.

The compiler works with the loader to compile any dot-c's into dot-o's, and to search down libraries appearing as options. This is very importantly done left to right, and if you not name things in a good order, your load will fail.

Because both modules smurf and smurf-util will use functions in the pthread library, the option -lpthread is named to the right of both. To lean more about how this option binds to a library refer to man ldconfig. On AWS Ubuntu, the pthread library resolved to,

    /lib/x86_64-linux-gnu/libpthread.a
and was statically linked, as ldd smurf does not show a run-time dependency (on a .so).

References.

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

author: burton rosenberg
created: 15 oct 2022
update: 16 oct 2022