CITS2002 Systems Programming, Lecture 20,

Introduction to multi-threaded programming

Contemporary computers have the ability to execute multiple operations at the same time - or at least they appear to do so.

We know from earlier lectures that operating systems address the concepts of allocating and sharing a single CPU (process scheduling) and RAM (memory management), but we still have competing goals:

From the perspective of the operating system, the goal is to share resources fairly, securely, and efficiently.
From the perspective of each process, the goal is to have its required task completed as quickly as possible, both in terms of speed of execution and responsiveness.

Our laptop computers, for example, will be 'executing' a few hundred processes (most of them sleeping!). As we type a document, we can listen to music, download files, and manipulate a graphical interface via peripherals such as mouse, tablet, and keyboard. For example, when a program does I/O (say to a disk), the CPU may be switched such that it is able to work on some other task while the I/O is taking place. But, as our laptops are considered single-user devices, we accept the conflicts between the competing goals of the operating system and the user.

In this and the next lecture, we'll casually examine the concept of concurrency - that independent sets of operations can be occurring at the same time. Operating systems allow the user to both manage concurrency and to exploit concurrency to improve performance.

The focus here will be more on the user and their programming, than the operating system and its resources.

CITS2002 Systems Programming, Lecture 20, p1, 9th October 2023.

Some definitions

Threads are a programming abstraction designed to allow programmers to control concurrency and asynchrony within a program.

In some programming languages, like Java, threads are "first-class citizens" - they are part of the language specification itself. For other languages, like C and C++, threads have not traditionally been part of the language specification, and were implemented as a library of functions linked with, and called by, a program.

This changed with C11, which defines portable threads as part of its standard (as an optional feature, declared in <threads.h>).

The differences between having threads "in the language" and threads "as a library" are subtle and important. For example, a C compiler need not take into account thread control (leaving that to the programmer), while a Java compiler must. Analogously, we see that C enables fine-grain control of memory management, while Java strictly defines it as part of its standard.

In the library case, it is possible to choose from different thread libraries to provide different semantics. In contrast, conforming Java implementations, and C11 implementations that provide threads, each support a single thread model and API.

What is a thread?

We'll informally define threads within a program to include three things:

a sequence of instructions to be executed,
a set of local variables that "belong" to the thread, and
a set of shared global variables that all threads can read and write.

This definition corresponds roughly to the C language model of sequential execution and variable scoping. Operating systems are still significantly written in C and, thus, thread libraries for C are easiest to understand and implement when they conform to a familiar model.

And why?

Threads provide two important opportunities:

they allow us to deal with asynchronous events synchronously and efficiently, and
they allow us to get parallel performance on a shared-memory multiprocessor.


The motivation for threads

Threads are very useful in modern programming whenever a process has multiple 'tasks' to perform independently of the others. This is particularly true when one of the tasks may block, while the other tasks can proceed without blocking.

Consider a programming IDE. One background thread may check syntax while a foreground thread receives user input (keystrokes), a third thread may even compile the (incomplete?) program, while a fourth thread periodically saves the code so that we may revert to earlier versions.
Consider a web client (browser). One thread renders a webpage on a graphical interface, another waits for keyboard input, another for mouse input, while other threads download embedded images from different remote websites, and even download the content from the 'top' links on the current webpage - anticipating that we will soon click on them.

And the benefits

There are four core benefits to multi-threading:

Responsiveness - one thread may provide rapid response while other threads are blocked or slowed down undertaking I/O or intensive calculations.
Resource sharing - by default threads share common code, data, and other resources, allowing multiple tasks to be performed simultaneously in a single, smaller, address space.
Efficiency - creating and managing threads (and context switching between them) is very much faster than performing the same tasks for processes. Because per-process resources are shared amongst threads, they don't need to be saved and restored when threads are suspended, scheduled, and resumed.
Scalability (utilization of multiprocessor architectures) - a single threaded process can only run on one CPU (one core), no matter how many may be available. In contrast, the execution of a multi-threaded application may be split amongst available cores. Note that single threaded processes still benefit from multi-processor architectures when other processes contend for other cores.


Differences between processes and threads

To date we've seen that a compiled C program becomes a process when passed to the execve() system call. Like our informal definition of threads, a C program also defines a sequence of instructions, local variables, and global variables, so what's the difference?

The primary difference is the degree of independence between processes, and between threads.

a standard C program, when executing, is a process with just one thread (often termed single-threaded).
however, a thread library enables us to write C programs defining multiple threads within a single process.

Furthermore, these threads are logically independent and thus may be executed concurrently.


Characteristics of processes and threads

Processes

An operating system process requires a lot of 'overhead':

process-ID, process group-ID, user-ID, and group-ID
environment variables
current working directory
program instructions
registers, stack space, heap space,
file descriptors
asynchronous signal actions
libraries shared with other processes
inter-process communication channels (message queues, pipes, semaphores, shared memory).

Threads

While threads all share their process's resources, they can be scheduled and execute independently because they duplicate only the essential resources:

registers
stack pointer (to own stack), heap is shared
scheduling properties (such as a timeslice or a priority)
set of pending and blocked signals
thread specific data.


POSIX threads and the pthreads library

As with many early concepts in computing, historically, different hardware vendors implemented their own proprietary versions of threads. These implementations differed substantially from each other making it difficult for programmers to develop portable threaded applications.

In order to take full advantage of the capabilities provided by threads, a standardized programming interface was required.

For UNIX-based systems, this interface has been specified by the IEEE POSIX 1003.1c standard (1995). Implementations conforming to this standard are referred to as POSIX threads, or pthreads. Most operating system vendors now offer pthreads in addition to their proprietary APIs.

pthreads are defined as a set of C language datatypes and function calls, declared in the <pthread.h> header file and defined in a thread library (requested with -lpthread) - although this library may be embedded in another library, such as libc, on some platforms (in the same way that Apple includes the maths library in its libc).



Creating and terminating POSIX threads

Initially, each process comprises a single, default thread. All other threads must be explicitly created at run-time.

The function pthread_create() creates a new thread and marks it as executable. pthread_create() can be called any number of times from anywhere within your code (including from within running threads). pthread_create() accepts the arguments:

thread: an opaque, unique identifier for the new thread,
attr: an opaque attribute object used to set thread attributes. You can specify a thread attributes object, or NULL for default values.
start_routine: the C function (name) that the thread will execute once created.
arg: a single argument that may be passed to start_routine(). It is passed as a pointer of type void *, or NULL if no argument is needed.

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

#define NUM_THREADS     5

void *hello_world(void *threadid)
{
  printf("Hello World, from thread #%li!\n", (long)threadid);
  pthread_exit(NULL);
}

int main(int argc, char *argv[])
{
  pthread_t threads[NUM_THREADS];

  for(long tid=0 ; tid < NUM_THREADS ; tid++) {
    printf("In main(): creating thread %li\n", tid);

    int err = pthread_create(&threads[tid], NULL, hello_world, (void *)tid);

    if(err != 0) {
      printf("ERROR; return code from pthread_create() is %d\n", err);
      exit(EXIT_FAILURE);
    }
  }

  pthread_exit(NULL);       // as main() is a thread, too
  return 0;
}

prompt> mycc -o try try.c -lpthread
prompt> ./try
In main(): creating thread 0
In main(): creating thread 1
Hello World, from thread #0!
In main(): creating thread 2
Hello World, from thread #1!
In main(): creating thread 3
Hello World, from thread #2!
In main(): creating thread 4
Hello World, from thread #3!
Hello World, from thread #4!


Passing initial arguments to POSIX threads

Only a single argument may be passed to a new thread but, as that argument is a pointer, we may pass reference to any amount of data pointed to by that pointer.

In our previous example, we simply passed a single integer value, cast to a pointer. In this example we pass the address of a structure, and each new thread receives its own instance of a structure:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

#define NUM_THREADS     5

struct thread_data {
   int  thread_id;
   int  sum;
   char *message;
};

struct thread_data thread_data_array[NUM_THREADS];

void *hello_world(void *threadarg)
{
   struct thread_data *my_data = (struct thread_data *) threadarg;
   ...

   taskid    = my_data->thread_id;
   sum       = my_data->sum;
   hello_msg = my_data->message;
   ...
}

int main(int argc, char *argv[])
{
   ...
   for(int t=0 ; t < NUM_THREADS ; t++) {
      thread_data_array[t].thread_id = t;
      thread_data_array[t].sum       = sum;
      thread_data_array[t].message   = messages[t];

      err = pthread_create(&threads[t], NULL, hello_world, (void *) &thread_data_array[t]);
      ...
   }
}


Joining and detaching POSIX threads

"Joining" is the simplest way to accomplish synchronization between threads. The pthread_join() function blocks the calling thread, perhaps just main(), until the thread with the specified thread ID terminates.

As with traditional Linux processes and the wait() system-call, we are waiting for the requested thread to terminate, and are able to receive termination information when it terminates.

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void *worker(void *threadarg)
{
    ....
    printf("thread #%li terminating.\n", (long)threadarg);
    pthread_exit((void*) threadarg);
}

int main(int argc, char *argv[])
{
    ....
    pthread_attr_t attr;

//  INITIALIZE THE ATTRIBUTE OBJECT AND REQUEST A JOINABLE THREAD
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);

    err = pthread_create(&thread[tid], &attr, worker, (void *)tid);
    ....

//  BLOCK main() UNTIL THE worker() THREAD TERMINATES
    err = pthread_join(thread[tid], &status);
    ....

    printf("main() resumes execution\n");
    ....
}

When a thread is created, one of its attributes defines whether it is joinable or detached; POSIX threads are joinable by default. Only joinable threads can be joined(!). If a thread is created as detached, it can never be joined. If we know in advance that a thread will never need to be joined by another thread, it is usually created in the detached state, which allows the system to free the thread's resources as soon as it terminates.


Stack management of POSIX threads

The POSIX thread standard does not define the size of each thread's stack. Different operating systems will determine the default size (execute ulimit -a).

A thread's stack is used to receive parameters, to provide space for local variables, and to store stack frames when other functions are called. Choosing the appropriate stack size for each thread can be very important - too small and the thread may overflow its stack, crashing or overwriting other storage; too large and we'll possibly be wasting storage.

If we know ahead of time that a particular thread requires, say, a large per-thread stack - perhaps because it has many or large local variables, or will need to make many 'deep' function calls - then we should not rely on the default stack size, but should specify the required size at time of thread creation.

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

#define  NUM_INT_RESULTS       100000

void *worker(void *threadarg)
{
    int  results[NUM_INT_RESULTS];

    ....
    pthread_exit((void*) threadarg);
}

int main(int argc, char *argv[])
{
    ....
    pthread_attr_t attr;
    size_t         stacksize;

//  Initialize the attribute object, then query and enlarge its stack size
    pthread_attr_init(&attr);

    pthread_attr_getstacksize(&attr, &stacksize);
    printf("default stack size = %li\n", (long)stacksize);

    stacksize += NUM_INT_RESULTS * sizeof(int);
    printf("stack size needed for worker thread = %li\n", (long)stacksize);
    pthread_attr_setstacksize(&attr, stacksize);

    err = pthread_create(&thread[tid], &attr, worker, (void *)tid);
    ....
}

It is notable that we have this fine-grained control when creating individual functions as threads, but not when fork()ing new processes.
