Threading in C using pthreads 👋

Threads are lightweight, but use them with caution: when applied where they are not needed, the overhead of creating and synchronizing threads can make a program slower than its sequential equivalent.

Introduction

Everything related to POSIX threads is declared in pthread.h, and all of its functions and data types are prefixed with pthread_ (constants use the upper-case PTHREAD_ prefix). Now let's jump straight into a simple program and learn from it.

Consider the code below:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>  /* for POSIX threads */
#include <unistd.h>  /* for pause() and sleep() */

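// Thread callback function: prints its input once per second
// and exits after five iterations.
static void*
thread_fn_callback(void* arg) {
    char* input = (char *) arg;
    int a = 0;
    while(a < 10) {
        printf("Input string = %s\n", input);
        sleep(1);
        a = a + 1;
        if ( a == 5) {
            pthread_exit(0); // terminates only the calling thread
        }
    }
    return NULL; // not reached here, but a void* function should return on every path
}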


void 
thread1_create() {
    pthread_t pthread1; // opaque handle that pthread_create fills in with the new thread's ID
    // The argument must outlive this function: pass static or heap memory,
    // never the address of a local (stack) variable.
    static char *thread_input1 = "I am thread num 1";
    // Create the new (child) thread
    int rc = pthread_create(&pthread1, 
                            NULL,
                            thread_fn_callback,
                            (void *) thread_input1);

    if(rc != 0) {
        // pthread_create returns the error number directly; it does not set errno
        printf("Error occurred, thread could not be created, error = %d\n", rc);
        exit(EXIT_FAILURE);
    }
}

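int 
main(int argc, char **argv) {
    thread1_create();
    printf("main fn exiting, child thread keeps running\n");
    // pthread_exit() ends only the main thread; the process stays alive
    // until the child thread finishes. Nothing after this call runs.
    pthread_exit(0);
}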

pthread_t is an opaque data type that identifies a thread. On Linux with glibc it happens to be an unsigned long, but because it is opaque, other implementations are free to use a different type, so portable code should compare two IDs with pthread_equal() rather than ==. The ID stored in it is assigned by the pthread library. The kernel identifies the same thread by a separate thread ID, which on Linux can be fetched and printed with printf("%d\n", (pid_t) syscall(SYS_gettid));. Don't forget to include the necessary header files (unistd.h and sys/syscall.h) if you try this. That is outside the scope of this article, though. Back to our topic: we pass the address of a pthread_t variable to pthread_create, which creates the thread and stores its ID there, as shown in the program above.


pthread_create takes four parameters. The first is a pointer to the thread identifier. The second is a set of thread attributes; if it is NULL, the default settings are used. The third is the function that the thread should run, and the last is the argument passed to that function.
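To make the four parameters concrete, here is a minimal sketch. The names worker_args, worker and run_worker are made up for illustration and are not part of the program above. Note that it passes the address of a local struct, which is safe only because run_worker joins the thread before it returns.

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical argument bundle for this sketch. */
struct worker_args {
    int id;      /* input read by the thread */
    int result;  /* output written by the thread */
};

/* 3rd parameter of pthread_create: must have the type void *(*)(void *). */
static void *worker(void *arg) {
    struct worker_args *args = arg;
    args->result = args->id * 10;
    return NULL;
}

/* Returns id * 10 as computed by the thread, or -1 on error. */
int run_worker(int id) {
    pthread_t tid;  /* 1st parameter: receives the thread ID */
    struct worker_args args = { .id = id, .result = 0 };

    /* 2nd parameter NULL = default attributes, 4th = argument for worker(). */
    int rc = pthread_create(&tid, NULL, worker, &args);
    if (rc != 0) {
        fprintf(stderr, "pthread_create failed: %d\n", rc);
        return -1;
    }
    /* args lives on this stack frame, so wait for the thread before returning. */
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return args.result;
}
```

Calling run_worker(4) from main should give back 40 once the thread has finished.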

Next is how to exit a thread. There are several ways to do this. Calling pthread_exit(0) terminates only the calling thread. It can be used in the main thread when main does not expect a return value from the child threads, and in a child thread when it has no value to return to the thread that joins it. In the example above, pthread_exit() is what we used to exit both the main thread and the child thread we spun up.
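The two ways of ending a thread, calling pthread_exit() and returning from the thread function, can be compared in a small sketch. The function names count_until and exit_value_for are hypothetical, and packing a small integer into the void* through intptr_t is a shortcut chosen for this sketch rather than something the article's program does.

```c
#include <pthread.h>
#include <stdint.h>

/* Counts up to 10, but leaves early through pthread_exit() when it hits the limit. */
static void *count_until(void *arg) {
    intptr_t limit = (intptr_t)arg;
    for (intptr_t i = 0; i < 10; i++) {
        if (i == limit)
            pthread_exit((void *)i);  /* ends this thread only; i becomes its exit value */
    }
    return (void *)(intptr_t)10;      /* returning from the function also ends the thread */
}

/* Returns the thread's exit value, or -2 if it could not be created or joined. */
int exit_value_for(int limit) {
    pthread_t tid;
    void *ret = NULL;
    if (pthread_create(&tid, NULL, count_until, (void *)(intptr_t)limit) != 0)
        return -2;
    if (pthread_join(tid, &ret) != 0)
        return -2;
    return (int)(intptr_t)ret;
}
```

With a limit of 5 the thread leaves through pthread_exit() and the joiner sees 5; with a limit the loop never reaches, the thread returns normally and the joiner sees 10.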

Deep dive

Can we pass a pthread_t pointer as the first argument to pthread_create?

Not if the pointer is uninitialized. Here's why. Consider the example below:

    pthread_t* thr;
    pthread_create(thr, NULL, &function_to_execute_by_thread, NULL); // thr is uninitialized: usually a segmentation fault

In the above code, a pointer to pthread_t is declared and passed as the first argument without ever being initialized. pthread_create writes the new thread's ID through that pointer, but thr does not point to any allocated memory. This is undefined behavior and typically ends in a segmentation fault.

Now let's allocate memory and make our pthread_t* point to it.

    pthread_t thread;
    pthread_t* thread_pntr = &thread;
    pthread_create(thread_pntr, NULL, &function_to_execute_by_thread, NULL);

In the above code, we declare a pthread_t variable, store its address in the pointer, and pass that pointer to pthread_create as the first argument. This is acceptable because the pointer refers to valid storage.

Another interesting way is to allocate the memory ourselves with malloc(), as shown in the example below (thread_count is assumed to hold the number of threads to create):

    pthread_t* thread_handles;
    thread_handles = malloc(thread_count * sizeof(pthread_t));
    for (int thread = 0; thread < thread_count; thread++) {
      pthread_create(&thread_handles[thread], NULL, &function_to_execute_by_thread, NULL);
    }
    // free(thread_handles) once every thread has been joined
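A fuller version of the malloc() approach might look like the sketch below. The names square_slot and sum_of_squares are hypothetical, and the per-thread slots array is an assumption added so that each thread has its own piece of heap memory to work on.

```c
#include <pthread.h>
#include <stdlib.h>

/* Each thread squares the value stored in its own slot. */
static void *square_slot(void *arg) {
    long *slot = arg;
    *slot = *slot * *slot;
    return NULL;
}

/* Starts thread_count threads and returns 1^2 + 2^2 + ... + thread_count^2,
 * or -1 if allocation or thread creation fails. */
long sum_of_squares(int thread_count) {
    if (thread_count <= 0)
        return 0;

    pthread_t *handles = malloc((size_t)thread_count * sizeof *handles);
    long *slots = malloc((size_t)thread_count * sizeof *slots);
    if (handles == NULL || slots == NULL) {
        free(handles);
        free(slots);
        return -1;
    }

    int started = 0;
    for (int i = 0; i < thread_count; i++) {
        slots[i] = i + 1;
        if (pthread_create(&handles[i], NULL, square_slot, &slots[i]) != 0)
            break;
        started++;
    }

    /* Join only the threads that were actually created. */
    long total = 0;
    for (int i = 0; i < started; i++) {
        pthread_join(handles[i], NULL);
        total += slots[i];
    }

    free(handles);
    free(slots);
    return started == thread_count ? total : -1;
}
```

The handle array is freed only after every started thread has been joined, so no thread is left writing into released memory.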

pthread_exit() - Why should we use this and when?

    #include <pthread.h>
    noreturn void pthread_exit(void *retval); // Compile and link with -pthread.

The pthread_exit() function terminates the calling thread and, if the thread is joinable, makes retval available to another thread in the same process that calls pthread_join(). If we create detached threads rather than joinable ones, the main thread can terminate itself with pthread_exit(), since the other threads are independent of the main thread and won't report a return value back to it.

Some things happen when pthread_exit() is called and some do not, as explained below:

(1) Any clean-up handlers established by pthread_cleanup_push() that have not yet been popped are popped and executed, in the reverse of the order in which they were pushed.

(2) Process-shared resources (e.g., mutexes, condition variables, semaphores and file descriptors) are not released when a thread terminates.
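Point (1) can be observed with a short sketch. The names release_buffer, worker, cleanup_demo and the cleanup_ran flag are all hypothetical. The handler frees a buffer that the thread would otherwise leak when it leaves through pthread_exit().

```c
#include <pthread.h>
#include <stdlib.h>

static int cleanup_ran = 0;  /* set by the handler, read after pthread_join() */

/* Clean-up handler: receives the pointer registered with pthread_cleanup_push(). */
static void release_buffer(void *arg) {
    free(arg);
    cleanup_ran = 1;
}

static void *worker(void *arg) {
    (void)arg;
    char *buf = malloc(64);
    pthread_cleanup_push(release_buffer, buf);
    pthread_exit(NULL);      /* pops and runs release_buffer(buf) */
    pthread_cleanup_pop(0);  /* never reached, but push/pop must pair up lexically */
    return NULL;
}

/* Returns 1 if the handler ran, 0 if it did not, -1 on error. */
int cleanup_demo(void) {
    pthread_t tid;
    cleanup_ran = 0;
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return cleanup_ran;
}
```

pthread_cleanup_push() and pthread_cleanup_pop() are macros that must appear as a matched pair in the same block, which is why the pop is written out even though control never reaches it.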

When NOT to use pthread_exit()

One of my teammates answered this on Stack Overflow, so here it is:

Using pthread_exit in the main thread(in place of pthread_join), will leave the main thread in defunct(zombie) state. Since not using pthread_join, other joinable threads which are terminated will also remain in the zombie state and cause resource leakage.

Failure to join with a thread that is joinable (i.e., one that is not detached), produces a "zombie thread". Avoid doing this, since each zombie thread consumes some system resources, and when enough zombie threads have accumulated, it will no longer be possible to create new threads (or processes).

Another point is keeping the main thread in the defunct state, while other threads are running may cause implementation dependent issues in various conditions like if resources are allocated in main thread or variables which are local to the main thread are used in other threads.

Also, all the shared resources are released only when the process exits, it's not saving any resources. So, I think using pthread_exit in place of pthread_join should be avoided.

Joinable threads

Let's learn about joining threads in depth using the example below:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
#include <errno.h>


/*
 * compile using :
 * gcc -g -c joinable_example.c -o joinable_example.o
 * gcc -g joinable_example.o -o joinable_example.exe -lpthread
 * Run : ./joinable_example.exe
 */


pthread_t pthread2;
pthread_t pthread3;


static void* 
thread_fn_callback(void* arg) {
    int th_id = *(int*) arg;
    free(arg); // the argument was heap-allocated by thread_create(); this thread owns it
    int counter = 0;
    while (counter != th_id) {
       printf("Thread %d doing some work\n", th_id);
       sleep(1);
       counter++; 
    }

    // Return the result on the heap so it outlives this thread; the joiner frees it
    int *result = calloc(1, sizeof(int));
    *result = th_id * th_id;
    return (void*) result;
}

void
thread_create(pthread_t *pthread_handle, int th_id) {
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE); // store the detach state (2nd argument) in attr

    int *_th_id = calloc(1, sizeof(int));
    *_th_id = th_id;
    // Create the new thread
    int rc = pthread_create(pthread_handle,
                    &attr,
                    thread_fn_callback,
                    (void*)_th_id);
    pthread_attr_destroy(&attr); // the attribute object is not needed after pthread_create

    if(rc != 0) {
        // pthread_create returns the error number directly; it does not set errno
        printf("Error occurred, thread could not be created, error = %d\n", rc);
        exit(EXIT_FAILURE);
    }
}


int 
main(int argc, char **argv) {

    void* result_pthread2;
    void* result_pthread3;

    thread_create(&pthread2, 2);
    thread_create(&pthread3, 50);
    printf("main thread blocked to join with pthread 2\n");

    pthread_join(pthread2, &result_pthread2);
    // pthread_join stores the pointer returned by the thread in result_pthread2,
    // which is why we pass the address of our void* variable.
    if (result_pthread2) {  
        printf("Return result from the thread 2 = %d\n", *(int*)result_pthread2);
        free(result_pthread2);
        result_pthread2 = NULL;
    }

    pthread_join(pthread3, &result_pthread3);
    
    if(result_pthread3) {
        printf("Return result from the thread 3 = %d\n", *(int*)result_pthread3);
        free(result_pthread3);
        result_pthread3 = NULL;
    }

    return 0;
}

In the thread_create() function we declare pthread_attr_t which is used to set the child threads as joinable and this attribute list is passed as the second parameter when we create a thread.

    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE); // store the detach state (2nd argument) in attr

pthread_attr_t is an opaque data type that holds the attributes of a thread. pthread_attr_init(&attr); initializes the object with the default values. It must be called before any of the setter functions, which are how you, as a programmer, change the defaults used while creating the thread. pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE); marks the thread as joinable. In our example the child threads join back to the parent thread. PTHREAD_CREATE_JOINABLE is the default setting, so this call only makes the intent explicit. The other option the pthread library allows is PTHREAD_CREATE_DETACHED. Most importantly, we pass the pthread_attr_t object that we created to pthread_create, the function that creates the thread.
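For comparison, a detached thread could be created as in the sketch below. Since a detached thread cannot be joined, the sketch waits for it with a mutex and a condition variable, which go beyond what this article covers and are used here only so the caller knows when the thread has run. The names background and run_detached are hypothetical.

```c
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t done_cv = PTHREAD_COND_INITIALIZER;
static int done = 0;  /* protected by lock */

/* Detached thread body: signals completion instead of returning a value. */
static void *background(void *arg) {
    (void)arg;
    pthread_mutex_lock(&lock);
    done = 1;
    pthread_cond_signal(&done_cv);
    pthread_mutex_unlock(&lock);
    return NULL;  /* nobody can join a detached thread, so this value is discarded */
}

/* Returns 1 if the attribute was set to detached and the thread ran, -1 on error. */
int run_detached(void) {
    pthread_attr_t attr;
    pthread_t tid;
    int state = -1;

    if (pthread_attr_init(&attr) != 0)
        return -1;
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    pthread_attr_getdetachstate(&attr, &state);  /* read the setting back */

    pthread_mutex_lock(&lock);
    done = 0;
    pthread_mutex_unlock(&lock);

    int rc = pthread_create(&tid, &attr, background, NULL);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return -1;

    /* Wait for the signal; pthread_join(tid, ...) must not be used here. */
    pthread_mutex_lock(&lock);
    while (!done)
        pthread_cond_wait(&done_cv, &lock);
    pthread_mutex_unlock(&lock);

    return state == PTHREAD_CREATE_DETACHED ? 1 : -1;
}
```

The resources of a detached thread are reclaimed automatically when it ends, so there is no handle to clean up afterwards.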

When we declare a thread as joinable, the main thread should collect its return value once it finishes, using pthread_join() as shown in the code below.

The first parameter is the pthread_t of the thread to join, and the second is the address of a void* that receives the return value of that thread.

    void* result_pthread2;
    pthread_join(pthread2, &result_pthread2);
    // pthread_join stores the pointer returned by the thread in result_pthread2, so we pass its address
    if (result_pthread2) {  
        printf("Return result from the thread 2 = %d\n", *(int*)result_pthread2);
        free(result_pthread2);
        result_pthread2 = NULL;
    }
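The snippet above ignores the return value of pthread_join(). A slightly more defensive variant, sketched below with the hypothetical names square and square_in_thread, checks the error number returned by both pthread_create() and pthread_join() and prints it with strerror().

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Thread body: takes ownership of the heap-allocated argument
 * and returns a heap-allocated result for the joiner to free. */
static void *square(void *arg) {
    int n = *(int *)arg;
    free(arg);
    int *result = malloc(sizeof *result);
    if (result != NULL)
        *result = n * n;
    return result;  /* NULL tells the joiner that the allocation failed */
}

/* Returns n * n computed in another thread, or -1 on any failure. */
int square_in_thread(int n) {
    pthread_t tid;
    void *ret = NULL;

    int *arg = malloc(sizeof *arg);
    if (arg == NULL)
        return -1;
    *arg = n;

    int rc = pthread_create(&tid, NULL, square, arg);
    if (rc != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(rc));
        free(arg);  /* the thread never started, so we still own arg */
        return -1;
    }

    rc = pthread_join(tid, &ret);
    if (rc != 0) {
        fprintf(stderr, "pthread_join: %s\n", strerror(rc));
        return -1;
    }
    if (ret == NULL)
        return -1;

    int value = *(int *)ret;
    free(ret);
    return value;
}
```

Both functions return an error number directly instead of setting errno, which is why rc is passed to strerror().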

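See the POSIX specifications for the attribute functions below if you are interested in learning more about the attributes that can be set when creating a thread. The functions ending in _np are non-portable GNU extensions and are documented in the Linux man pages instead.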

pthread_attr_destroy()

pthread_attr_setdetachstate()

pthread_attr_setguardsize()

pthread_attr_setinheritsched()

pthread_attr_setschedparam()

pthread_attr_setschedpolicy()

pthread_attr_setscope()

pthread_attr_setstack()

pthread_attr_setstacksize()

pthread_attr_setaffinity_np()

pthread_attr_setsigmask_np()

pthread_attr_setstackaddr()

pthread_getattr_np()

pthread_setattr_default_np()
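As one example from this list, the sketch below sets a custom stack size with pthread_attr_setstacksize() and reads it back with pthread_attr_getstacksize(). The names noop and run_with_stack_size are hypothetical, and the 1 MiB figure used afterwards is an arbitrary choice for illustration.

```c
#include <pthread.h>
#include <stddef.h>

/* Trivial thread body; the sketch is only about the attribute. */
static void *noop(void *arg) {
    return arg;
}

/* Creates a thread with the requested stack size.
 * Returns the size stored in the attribute object, or 0 on error. */
size_t run_with_stack_size(size_t requested) {
    pthread_attr_t attr;
    pthread_t tid;
    size_t actual = 0;

    if (pthread_attr_init(&attr) != 0)
        return 0;

    /* setstacksize fails with EINVAL if requested is below PTHREAD_STACK_MIN. */
    if (pthread_attr_setstacksize(&attr, requested) != 0 ||
        pthread_attr_getstacksize(&attr, &actual) != 0 ||
        pthread_create(&tid, &attr, noop, NULL) != 0) {
        pthread_attr_destroy(&attr);
        return 0;
    }

    pthread_attr_destroy(&attr);  /* safe once the thread has been created */
    pthread_join(tid, NULL);
    return actual;
}
```

Calling run_with_stack_size(1024 * 1024) should report the same 1 MiB back, since the attribute object simply stores the value it was given.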

Conclusion

So now you know the basics of threading: how to create a thread, the difference between joinable and detached threads, how to set your own attributes when creating a thread, how to exit a thread, when to use pthread_exit(), and so on. In the next article, let's go a bit deeper and explore the threading world. Enjoy learning!