29 Apr 2023

Python Concurrency: Threads, Processes, and Async

Python is a popular high-level programming language known for its simplicity and readability. Its standard library supports several models of concurrency, which let a program make progress on multiple tasks at once instead of finishing each one before starting the next. In this post, we will explore three of them (threads, processes, and async) along with their respective advantages and disadvantages.

Threads

A thread is a lightweight unit of execution within a process. A single process can contain multiple threads, and each thread can run concurrently with other threads within the same process. Threads share the same memory space and resources, making it easy to share data between them. Python provides a built-in module called threading to work with threads.

To create a thread, you can define a function and then instantiate a Thread object, passing the function as a target argument:

import threading

def worker():
    print('Worker thread started')
    # do some work here
    print('Worker thread finished')

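# Create the thread; nothing runs until start() is called.
thread = threading.Thread(target=worker)
thread.start()
# Block the main thread until worker() has returned.
thread.join()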

Calling start() on the Thread object launches the thread. The worker() function then runs concurrently with the main thread, which continues past start() without waiting for worker() to finish.

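One advantage of using threads is that they are lightweight and can be created quickly. However, because threads share the same memory space, unsynchronized access to shared data can cause race conditions. To avoid these problems, you can use locks, semaphores, and other synchronization primitives provided by the threading module. Also note that in standard CPython builds the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time, so threads help most with I/O-bound work rather than CPU-bound work.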
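As a minimal sketch of the locking mentioned above, the following guards a shared counter with threading.Lock. The names SafeCounter, worker, and run_counter are made up for illustration, and the thread and iteration counts are arbitrary.

```python
import threading

class SafeCounter:
    """A counter whose increments are protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # value += 1 is a read-modify-write sequence; holding the lock
        # keeps two threads from interleaving inside it.
        with self._lock:
            self.value += 1

def worker(counter, times):
    """Increment the shared counter `times` times."""
    for _ in range(times):
        counter.increment()

def run_counter(num_threads=4, times=10_000):
    """Run several workers against one counter and return the final value."""
    counter = SafeCounter()
    threads = [
        threading.Thread(target=worker, args=(counter, times))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every worker before reading the result
    return counter.value

if __name__ == '__main__':
    print(run_counter())  # 4 threads * 10,000 increments = 40000
```

With the lock in place, the final value always equals num_threads * times. Without it, some increments could be lost when two threads read the same old value.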

Processes

A process is a standalone unit of execution with its own memory space, resources, and operating system process ID (PID). Unlike threads, processes do not share the same memory space and must communicate through inter-process communication (IPC) mechanisms such as pipes, queues, and sockets. Python provides a built-in module called multiprocessing to work with processes.

To create a process, you can define a function and then instantiate a Process object, passing the function as a target argument:

import multiprocessing

def worker():
    print('Worker process started')
    # do some work here
    print('Worker process finished')

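# The __main__ guard is required on platforms that start child processes
# with the "spawn" method (Windows, and macOS by default), because the
# child re-imports this module.
if __name__ == '__main__':
    process = multiprocessing.Process(target=worker)
    process.start()
    # Wait for the child process to exit.
    process.join()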

Calling start() on the Process object launches a new process, and worker() runs in that child process concurrently with the main process.

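One advantage of using processes is that they are isolated from each other and do not share memory space, so there is far less shared state to synchronize. Each process also runs its own Python interpreter, which lets CPU-bound work run in parallel on multiple cores. However, processes are heavier than threads and take longer to create, and any data they exchange has to be sent explicitly through IPC.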
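Because processes do not share memory, results have to be passed back explicitly. The sketch below uses two multiprocessing.Queue objects to send numbers to a child process and receive their squares. The names square_worker and run_pipeline, and the None sentinel convention, are assumptions chosen for this example.

```python
import multiprocessing

def square_worker(in_queue, out_queue):
    """Read numbers until a None sentinel arrives; emit (n, n * n) pairs."""
    while True:
        n = in_queue.get()
        if n is None:  # sentinel: no more work
            break
        out_queue.put((n, n * n))

def run_pipeline(numbers):
    """Square `numbers` in a child process and return the results as a dict."""
    in_queue = multiprocessing.Queue()
    out_queue = multiprocessing.Queue()
    process = multiprocessing.Process(
        target=square_worker, args=(in_queue, out_queue)
    )
    process.start()
    for n in numbers:
        in_queue.put(n)
    in_queue.put(None)  # tell the worker to stop
    # Collect the results before join(): a child that still has queued
    # data to flush may not exit until that data is consumed.
    results = dict(out_queue.get() for _ in numbers)
    process.join()
    return results

if __name__ == '__main__':
    print(run_pipeline([1, 2, 3]))  # {1: 1, 2: 4, 3: 9}
```

Everything placed on a queue is pickled and copied between processes, so this pattern suits small messages better than large shared data structures.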

Async

Asynchronous programming (async) lets you write concurrent code without creating additional threads or processes. Instead, an event loop runs tasks on a single thread and uses cooperative multitasking: a task runs until it awaits an operation that has to wait, such as I/O, and the loop then switches to another task. Python provides a built-in module called asyncio to work with async.

To create an async function, you can define a function with the async keyword and use the await keyword to wait for a coroutine to complete:

import asyncio

async def worker():
    print('Worker task started')
    # do some work here
    await asyncio.sleep(1)
    print('Worker task finished')

async def main():
    tasks = [worker() for _ in range(3)]
    await asyncio.gather(*tasks)

asyncio.run(main())

The asyncio.run() function starts an event loop, runs main() until it completes, and then closes the loop. The asyncio.gather() function runs the three worker() coroutines concurrently and waits for all of them to complete.
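asyncio.gather() also collects return values. As a small sketch, the hypothetical fetch_length() coroutine below stands in for an I/O call. The delays are arbitrary and only there to make the first coroutine finish last.

```python
import asyncio

async def fetch_length(word, delay):
    """Stand-in for an I/O call: wait `delay` seconds, then return len(word)."""
    await asyncio.sleep(delay)
    return len(word)

async def main():
    # The first coroutine sleeps longest, yet its result still comes first:
    # gather() orders results by argument position, not completion time.
    return await asyncio.gather(
        fetch_length('threads', 0.03),
        fetch_length('async', 0.01),
    )

if __name__ == '__main__':
    print(asyncio.run(main()))  # [7, 5]
```

This ordering makes it straightforward to match each result back to the call that produced it.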

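One advantage of using async is that, for I/O-bound workloads with many tasks that spend most of their time waiting, it usually has less overhead than threads or processes, because all tasks share a single thread and switch cooperatively. It does not speed up CPU-bound work, since only one task runs at a time and a long computation blocks every other task. Async also requires a different programming mindset and can be more complex to write than using threads or processes.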

Another advantage of using async is that it supports non-blocking I/O, which means that other tasks can continue to run while one task waits for an I/O operation to complete. This can improve the performance of your code, especially in network-bound or other I/O-bound applications.
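To make the overlap visible, the sketch below records when each task starts and finishes. The download() coroutine and its event log are hypothetical, and asyncio.sleep() stands in for a real network wait.

```python
import asyncio

async def download(name, events):
    """Pretend to download something, logging the start and the end."""
    events.append(f'{name} started')
    # While this task waits, the event loop is free to run the other tasks.
    await asyncio.sleep(0.05)
    events.append(f'{name} finished')

async def run_downloads():
    """Run three downloads concurrently and return the combined event log."""
    events = []
    await asyncio.gather(
        download('a', events),
        download('b', events),
        download('c', events),
    )
    return events

if __name__ == '__main__':
    for event in asyncio.run(run_downloads()):
        print(event)
```

All three "started" lines appear before any "finished" line, which shows that the waits overlap instead of running one after another.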

In conclusion, Python provides different ways to write concurrent code, each with its own advantages and disadvantages. Threads are lightweight and share memory, but that shared state needs synchronization. Processes are isolated from each other, but are heavier to create and must exchange data through IPC. Async handles many I/O-bound tasks on a single thread through cooperative multitasking and non-blocking I/O, but requires a different programming mindset. Understanding these trade-offs and choosing the right model for your application can help you write efficient, high-performance code.