Operating-Systems-Notes

Threads and Concurrency

Thread: an independent execution context within a process. A thread has its own program counter, stack, and registers, but shares the address space (code, data, open files) with the other threads of the same process.

Process vs Thread

[figure: process vs thread]

Why are threads useful?

What do we need to support threads?

Concurrency control and Coordination

Threads and Thread Creation

t1 = fork(proc, args)      // create a new thread that runs proc(args)
child_result = join(t1)    // block until t1 finishes and collect its result

(Note: this fork is a thread-creation primitive, not the UNIX fork() system call.)

Example:

Thread t1;
Shared_List list;
t1 = fork(safe_insert, 4);   // child thread inserts 4
safe_insert(6);              // parent thread inserts 6
join(t1);                    // optional: wait for the child thread to finish

Both threads access the list through the shared variable, so the two insertions can interleave in either order; the accesses therefore have to be synchronized.
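As a rough illustration, here is one way this example might look with PThreads (a hedged sketch: the linked-list representation and the names node_t, list_m, and child are mine, not from the notes):

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct node { int value; struct node *next; } node_t;

static node_t *list = NULL;                           /* shared list */
static pthread_mutex_t list_m = PTHREAD_MUTEX_INITIALIZER;

/* Insert at the head of the shared list while holding the mutex. */
void safe_insert(int value) {
    node_t *n = malloc(sizeof(node_t));
    n->value = value;
    pthread_mutex_lock(&list_m);
    n->next = list;
    list = n;
    pthread_mutex_unlock(&list_m);
}

void *child(void *arg) {
    safe_insert((int)(long)arg);                      /* child inserts 4 */
    return NULL;
}

int main(void) {
    pthread_t t1;
    pthread_create(&t1, NULL, child, (void *)4L);     /* "fork(safe_insert, 4)" */
    safe_insert(6);                                   /* parent inserts 6 */
    pthread_join(t1, NULL);                           /* "join(t1)" */
    for (node_t *n = list; n; n = n->next)
        printf("%d ", n->value);
    printf("\n");
    return 0;
}

Depending on how the two threads interleave, the output is "4 6" or "6 4"; either way the list stays consistent because every insertion happens under the mutex.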

Mutual Exclusion

lock(mutex){
	//Critical Section
    //Only one thread can access at a time
}
unlock(mutex)

[figure: mutex]

Producer Consumer problem

What if the processing you wish to perform with mutual exclusion needs to occur only under certain conditions?

For example, the producer appends items to a list until the list is full, and the consumer has to print out all the items of the list once the list is full and then empty it. Thus the consumer thread should execute only under a certain condition (here: when the list becomes full, print the items and empty the list).

Solution: Use Condition Variables

[figure: producer-consumer]
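A hedged PThreads sketch of this pattern, assuming a fixed-size array buf of MAX items (all names here are illustrative). Note the while loop around pthread_cond_wait: the condition must always be re-checked after waking.

#include <pthread.h>
#include <stdio.h>

#define MAX 10

static int buf[MAX];
static int count = 0;                          /* number of items in the list */
static pthread_mutex_t m          = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  list_full  = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  list_empty = PTHREAD_COND_INITIALIZER;

void *producer(void *arg) {
    for (int i = 0; ; i++) {
        pthread_mutex_lock(&m);
        while (count == MAX)                   /* list full: wait for the consumer */
            pthread_cond_wait(&list_empty, &m);
        buf[count++] = i;
        if (count == MAX)
            pthread_cond_signal(&list_full);   /* the condition the consumer waits on */
        pthread_mutex_unlock(&m);
    }
    return NULL;
}

void *consumer(void *arg) {
    for (;;) {
        pthread_mutex_lock(&m);
        while (count < MAX)                    /* proceed only when the list is full */
            pthread_cond_wait(&list_full, &m);
        for (int i = 0; i < count; i++)
            printf("%d ", buf[i]);
        printf("\n");
        count = 0;                             /* empty the list */
        pthread_cond_signal(&list_empty);
        pthread_mutex_unlock(&m);
    }
    return NULL;
}

int main(void) {
    pthread_t p, c;
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);                     /* runs until interrupted */
    pthread_join(c, NULL);
    return 0;
}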

Readers / Writer problem

if (read_count == 0 && write_count == 0)
	R okay, W okay
if (read_count > 0)
	R okay, W not-okay
if (write_count == 1)
	R not-okay, W not-okay

State of the shared resource, expressed through a proxy variable resource_counter:

    • free: resource_counter == 0
    • being read: resource_counter > 0 (the number of current readers)
    • being written: resource_counter == -1

Thus, essentially, we can apply a mutex to the new proxy 'resource_counter' variable that represents the state of the shared resource, instead of locking the resource itself.
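A hedged PThreads sketch of this idea, assuming two condition variables read_phase and write_phase (the function names are illustrative; reader/writer threads would call the enter/exit pairs around their actual accesses):

#include <pthread.h>

static int resource_counter = 0;               /* 0: free, >0: readers, -1: writer */
static pthread_mutex_t m           = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  read_phase  = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  write_phase = PTHREAD_COND_INITIALIZER;

void reader_enter(void) {
    pthread_mutex_lock(&m);
    while (resource_counter == -1)             /* a writer owns the resource */
        pthread_cond_wait(&read_phase, &m);
    resource_counter++;
    pthread_mutex_unlock(&m);
    /* ... read the shared data outside the mutex ... */
}

void reader_exit(void) {
    pthread_mutex_lock(&m);
    if (--resource_counter == 0)
        pthread_cond_signal(&write_phase);     /* last reader lets a writer in */
    pthread_mutex_unlock(&m);
}

void writer_enter(void) {
    pthread_mutex_lock(&m);
    while (resource_counter != 0)              /* wait until completely free */
        pthread_cond_wait(&write_phase, &m);
    resource_counter = -1;
    pthread_mutex_unlock(&m);
    /* ... write the shared data outside the mutex ... */
}

void writer_exit(void) {
    pthread_mutex_lock(&m);
    resource_counter = 0;
    pthread_cond_broadcast(&read_phase);       /* wake all waiting readers */
    pthread_cond_signal(&write_phase);         /* and one waiting writer */
    pthread_mutex_unlock(&m);
}

Only the proxy variable is ever touched under the mutex; the actual reads and writes of the shared resource happen outside the critical section, so multiple readers can proceed concurrently.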

Avoiding common mistakes

Spurious (unnecessary) wake-ups

Waking up threads when we know they may not be able to proceed, e.g., signalling or broadcasting on a condition variable while still holding the mutex that the woken threads need.
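For instance, in the readers/writer sketch above, the wake-ups in writer_exit do not depend on state checked under the lock, so they can be issued after releasing the mutex; a hedged variant:

/* Variant of writer_exit from the sketch above: wake waiters only after
   releasing the mutex, so woken threads do not immediately block on a
   mutex that the signalling thread still holds. */
void writer_exit(void) {
    pthread_mutex_lock(&m);
    resource_counter = 0;
    pthread_mutex_unlock(&m);

    pthread_cond_broadcast(&read_phase);
    pthread_cond_signal(&write_phase);
}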

Deadlocks

Two or more competing threads are said to be in a deadlock if they are waiting on each other (each holding a resource the other needs), so none of them can ever proceed.

[figure: deadlock]

Here T1 and T2 are in deadlock.

How to avoid this?

  1. Unlock m_T1 before locking m_T2
    • Fine-grained locking, but T1 and T2 may both be required at the same time
  2. Use one mega lock: get all locks upfront, then release at the end
    • For some applications this may be OK, but generally it is too restrictive and limits parallelism
  3. Maintain a lock order (see the sketch after this list)
    • first m_T1
    • then m_T2 - this will prevent cycles in the wait graph

A cycle in the wait graph is necessary and sufficient for a deadlock to occur.
(thread-waiting-on-resource —edge—> thread-owning-resource)
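A minimal sketch of option 3 with PThreads (the function update_both is an illustrative name):

#include <pthread.h>

static pthread_mutex_t m_T1 = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t m_T2 = PTHREAD_MUTEX_INITIALIZER;

/* Every code path acquires m_T1 before m_T2, so a cycle in the
   wait graph (and therefore a deadlock) cannot form. */
void update_both(void) {
    pthread_mutex_lock(&m_T1);
    pthread_mutex_lock(&m_T2);
    /* ... touch the data protected by both mutexes ... */
    pthread_mutex_unlock(&m_T2);
    pthread_mutex_unlock(&m_T1);
}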

Kernel vs User level Threads

[figure: kernel vs user level threads]

Three types of models:

1. One to One model:

[figure: one-to-one model]

Advantages:

Disadvantages:

2. Many to One model:

[figure: many-to-one model]

Advantages:

Disadvantages:

3. Many to Many model:

[figure: many-to-many model]

Advantages:

Disadvantages:

Multithreading patterns

1. Boss-Workers pattern

The throughput of the system is limited by the boss thread; hence the boss thread must be kept efficient.

Throughput = 1 / (time the boss spends per order)

Boss assigns work by:

  1. Directly signalling specific workers
    • + workers don't need to synchronize
    • - boss must keep track of everyone
  2. Placing work in a queue
    • + boss doesn't need to know details about the workers
    • - queue synchronization is required

How many workers?

Advantages:

Disadvantages:

1B. Boss-Workers pattern variant

Advantages:

Disadvantages:

2. Pipeline pattern

3. Layered pattern

Advantages:

Disadvantages:

Example:

Q) For a 6-step toy order application we have 2 solutions:

  1. Boss-workers solution
  2. Pipeline solution

Both have 6 threads. In the boss-workers solution, a worker produces a toy order in 120 ms. In the pipeline solution, each of the 6 stages takes 20 ms.

How long will it take for these solutions to complete 10 toy orders and 11 toy orders?

A) With 6 threads, the boss-workers solution has 1 boss thread and 5 worker threads, so n orders take ceil(n/5) × 120 ms. In the pipeline solution all 6 threads are pipeline stages, so n orders take 120 + (n-1) × 20 ms: 120 ms until the first order comes out, then one finished order every 20 ms.

For 10 toy orders:

Boss-workers(10) = 120 + 120 = 240 ms
Pipeline(10) = 120 + (9*20) = 300 ms

Here Boss-workers is better than Pipeline.

For 11 toy orders:

Boss-workers(11) = 120 + 120 + 120 = 360 ms
Pipeline(11) = 120 + (10*20) = 320 ms

Here Pipeline is better than Boss-workers.

This shows that which pattern is better depends on the number of threads available and the amount of work to be done.
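A tiny sketch that evaluates both formulas (worker count, per-order time, and stage time are taken from the example above):

#include <stdio.h>

/* ceil(orders / workers) * time per order */
int boss_workers_ms(int orders, int workers, int per_order_ms) {
    return ((orders + workers - 1) / workers) * per_order_ms;
}

/* fill the pipeline once, then one order completes per stage time */
int pipeline_ms(int orders, int stages, int stage_ms) {
    return stages * stage_ms + (orders - 1) * stage_ms;
}

int main(void) {
    for (int n = 10; n <= 11; n++)
        printf("%d orders: boss-workers %d ms, pipeline %d ms\n",
               n, boss_workers_ms(n, 5, 120), pipeline_ms(n, 6, 20));
    return 0;
}

This prints 240 ms vs 300 ms for 10 orders and 360 ms vs 320 ms for 11 orders, matching the numbers above.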

PThreads

PThreads == POSIX Threads

POSIX = Portable Operating System Interface

Compiling PThreads

  1. #include <pthread.h> in the main file
  2. Compile the source with -lpthread or -pthread
    gcc -o main main.c -lpthread
    gcc -o main main.c -pthread

  3. Check the return values of common functions
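A minimal compilable example that also checks the return values mentioned above (the thread count and message are arbitrary):

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_THREADS 4

void *hello(void *arg) {
    printf("Hello from thread %ld\n", (long)arg);
    return NULL;
}

int main(void) {
    pthread_t tids[NUM_THREADS];

    for (long i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&tids[i], NULL, hello, (void *)i);
        if (rc != 0) {                       /* always check the return value */
            fprintf(stderr, "pthread_create failed: %d\n", rc);
            exit(1);
        }
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(tids[i], NULL);
    return 0;
}

Compile with: gcc -o main main.c -pthread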

PThread mutexes
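The core mutex API, sketched (error handling omitted; the protected counter is illustrative):

#include <pthread.h>

static int counter = 0;
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;   /* static initialization */

void *worker(void *arg) {
    pthread_mutex_lock(&m);          /* enter the critical section */
    counter++;                       /* shared state touched only under the lock */
    pthread_mutex_unlock(&m);        /* exit the critical section */
    return NULL;
}

/* Alternatively, initialize and destroy dynamically:
   pthread_mutex_t m2;
   pthread_mutex_init(&m2, NULL);
   ...
   pthread_mutex_destroy(&m2);
*/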

Safety tips

Thread Design Considerations

Kernel vs User Level Threads

[figure: user level vs kernel level threads]

[figure: thread data structures]

Hard vs Light Process states

PCB is divided into multiple data structures classified as follows:

Rationale for Multiple Data Structures:

Single PCB | Multiple data structures
Large continuous data structure | Smaller data structures
Private for each entity | Easier to share
Saved and restored on each context switch | Save and restore only what needs to change on a context switch
Updated for any change | User-level library needs to update only its portion of the state

Comparison of Interrupts and Signals

Interrupts | Signals
Events generated externally by components other than the CPU (I/O devices, timers, other CPUs) | Events triggered by the CPU and the software running on it
Determined based on the physical platform | Determined based on the OS
Appear asynchronously | Appear synchronously or asynchronously

An interrupt is like a snowstorm alarm
A signal is like a low battery warning

Interrupts

[figure: interrupts]

Signals

[figure: signals]

Handlers / Actions

[figure: disabling interrupts/signals]

Here the PC points to the first instruction of the handler, and the SP points to the (same) thread stack.

To prevent deadlock:

  1. Keep handler code simple
    • e.g. avoid acquiring mutexes in the handler
    • - too restrictive
  2. Control interruptions by handler code
    • use interrupt/signal masks
    • a mask is a sequence of bits, e.g. 0011100110... (0: disabled, 1: enabled)

clear_field_in_mask(mask)    // disable the signal/interrupt
lock(mutex)
{
    // critical section
    // while disabled, an occurrence of the signal/interrupt remains pending
}
unlock(mutex)
reset_field_in_mask(mask)    // re-enable; any pending handler executes now
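A hedged sketch of the same idea with real POSIX calls, blocking SIGINT around a critical section in a multithreaded program (the mutex and the choice of SIGINT are illustrative):

#include <pthread.h>
#include <signal.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

void update_shared_state(void) {
    sigset_t block, old;
    sigemptyset(&block);
    sigaddset(&block, SIGINT);

    /* "clear_field_in_mask": a SIGINT delivered now stays pending */
    pthread_sigmask(SIG_BLOCK, &block, &old);

    pthread_mutex_lock(&m);
    /* ... critical section the signal handler must not interrupt ... */
    pthread_mutex_unlock(&m);

    /* "reset_field_in_mask": restore the old mask; a pending SIGINT is delivered now */
    pthread_sigmask(SIG_SETMASK, &old, NULL);
}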

Types of Signals

  1. One-shot signals
    • "n signals pending == 1 signal pending": the handler executes at least once
    • must be explicitly re-enabled (re-installed) each time
  2. Real-time signals
    • "if n signals are raised, then the handler is called n times"

Handling interrupts as threads

[figure: interrupts as threads]

but dynamic thread creation is expensive! So the decision is made dynamically: if the handler doesn't block, execute it on the interrupted thread's stack; if it can block, turn it into a real thread (ideally taken from a pool of pre-created threads).

Threads and Signal Handling

[figure: threads and signal handling]

Case 1 :

[figure: case 1]

Case 2 :

[figure: case 2]

Case 3 :

[figure: case 3]

Case 4 :

[figure: case 4]

Optimize common case

Multi-processing vs Multi-threading

How to best provide concurrency?

Multi-Processing (MP)

Advantages

Disadvantages

Multi-Threading (MT)

Advantages

Disadvantages

Event Driven model

[figure: event driven model]

Features:

Dispatcher: acts as a state machine that accepts incoming events and, based on the event type, dispatches them to the corresponding handler.

Calling a handler simply means jumping to the address of the handler's code.

The handler: processes the event to completion; if it needs to block, it initiates the blocking operation and passes control back to the dispatch loop.
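A minimal sketch of such a dispatch loop in C, using select() to wait for events (the handler and the choice of stdin as the only event source are illustrative):

#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

/* Illustrative handler: a real server would parse requests, send responses,
   etc., always running to completion without blocking. */
static void handle_stdin(void) {
    char buf[128];
    ssize_t n = read(0, buf, sizeof buf);    /* consume the input */
    printf("handled %zd bytes from stdin\n", n);
}

int main(void) {
    for (;;) {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(0, &rfds);                    /* watch stdin; a server would add sockets too */

        /* the dispatcher blocks here until some event (fd readiness) arrives */
        if (select(1, &rfds, NULL, NULL, NULL) < 0)
            break;

        if (FD_ISSET(0, &rfds))
            handle_stdin();                  /* "calling the handler" == jumping to its code */
    }
    return 0;
}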

Concurrent execution in Event-driven models

Advantages

Disadvantages

Asynchronous I/O operations

Asynchronous I/O operations fit well with Event-driven models

Since asynchronous calls are not always available, helpers can be used to implement the async call functionality:

Asymmetric Multi-Process Event Driven model (AMPED & AMTED)

Advantages

Disadvantages

E.g. Apache Web Server

[figure: Apache web server]