As applications grow more demanding, they need to stay responsive to deliver a great user experience, and multithreading is one of the ways to achieve that. In this article, our Node.js developer, Saro, walks through the techniques for multithreading in Node.js to show you the bigger picture. Let's get down to business.
• What is a thread?
• Deep dive into CPU processes (core)
• Concurrency VS Parallelism
• Multithreading explained with the previous 3 points
• Node.js multithreading - libuv
• Worker threads
• Code examples in Node.js (for example, the crypto module)
What is a thread?
You can think of a thread as a virtual processor. The key point is that a thread is created by software: it is not something you can touch. A core, by contrast, is a physical part of the CPU, and a thread is an independent sequence of instructions that the operating system schedules onto a core. When we start a program, we get one process and its main thread, which holds the main instructions of the program. Over time, the main thread can create other threads within the same process. Threads of the same process also share some of that process's resources, such as its memory, which matters for the running application.
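To make the "one process, several threads" idea concrete, here is a minimal sketch using Node's worker_threads module (covered in the "Worker threads" section). The helper name describeThreads and the inline worker source are illustrative assumptions, not part of the original project. The worker reports the process ID and thread ID it sees, so we can compare them with the main thread's values.

```javascript
const { Worker, isMainThread, threadId } = require('worker_threads');

// Inline worker source (illustrative): report which process and thread it runs in.
const workerSource = `
  const { parentPort, threadId, isMainThread } = require('worker_threads');
  parentPort.postMessage({ pid: process.pid, threadId, isMainThread });
`;

// Hypothetical helper: resolves with the identity of the current thread and of a new worker.
function describeThreads() {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true });
    worker.once('message', (fromWorker) => {
      resolve({
        main: { pid: process.pid, threadId, isMainThread },
        worker: fromWorker,
      });
    });
    worker.once('error', reject);
  });
}

describeThreads().then(({ main, worker }) => {
  console.log('main thread  :', main);
  console.log('worker thread:', worker);
});
```

Both entries should show the same pid but different threadId values: the worker is a second thread inside the same process, not a new process.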
Deep dive into CPU processes (core)
Of course, the performance that we have today would not exist without multicore CPUs. As computers became more and more important, their performance had to increase. Instead of installing multiple CPUs in one computer, manufacturers began placing several computing units (cores) inside a single CPU. Such CPUs are called multicore CPUs, and they brought a major improvement in performance.
Concurrency VS Parallelism
To better understand the difference between single-core and multi-core CPUs, let's discuss concurrency and parallelism. Concurrency is easier to understand by thinking about a single-core CPU: the core switches between several programs quickly enough that they all make progress and appear to run at the same time, usually without a noticeable delay in any of them. Whether the delay is really unnoticeable is debatable and depends on factors such as the number of running programs and the clock speed. Parallelism, on the other hand, describes a situation where multiple operations are literally executed at the same time. This is where a multi-core CPU comes in handy: each core can execute its own stream of instructions, so several computations happen at once.
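Concurrency without parallelism can be sketched on Node's single main thread. In the example below, two tasks take turns: each one does a small step and then yields to the event loop with setImmediate. The names yieldToEventLoop, runTask, and runConcurrently are illustrative assumptions.

```javascript
// Give control back to the event loop so other queued work can run.
const yieldToEventLoop = () => new Promise((resolve) => setImmediate(resolve));

// Perform `steps` small pieces of work, pausing after each one.
async function runTask(name, steps, log) {
  for (let i = 1; i <= steps; i++) {
    log.push(`${name}${i}`);
    await yieldToEventLoop();
  }
}

// Start both tasks; they interleave on the same thread.
async function runConcurrently() {
  const log = [];
  await Promise.all([runTask('A', 3, log), runTask('B', 3, log)]);
  return log;
}

runConcurrently().then((log) => console.log(log.join(' ')));
// prints: A1 B1 A2 B2 A3 B3
```

Both tasks make progress "at the same time", yet only one step ever executes at any given moment. Parallelism would require a second thread running on another core.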
Multithreading explained with the previous 3 points
Putting these together, multithreading lets us complete a task without interfering with the ones already running: the application instructs the computer to run the task in a thread that is independent of the main thread. On a multi-core CPU those threads can run in parallel; on a single core they run concurrently. Either way, the application performs better and users are not interrupted during their activities.
Node.js multithreading - libuv
At this point, we understand the concept of multithreading in general. It is time to talk about its role in the lifecycle of Node.js applications. First, let's briefly talk about a very important dependency of the Node.js platform: libuv. It is a library written in C that provides Node.js with mechanisms for file system access, DNS, networking, and more. If we call the file system APIs provided by Node.js, libuv is the library that handles the request. libuv also provides the event loop, which runs our JavaScript on the main thread. To fully benefit from multithreading in Node.js, we need to use asynchronous APIs; otherwise, we end up blocking the main thread and, with it, the event loop. It is also worth checking whether the packages we depend on use those asynchronous APIs where necessary. If they do not, the code will most likely be less efficient than expected.
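The difference between the two API styles can be sketched with the fs module. The example below is illustrative only: the function name demo and the temporary file are assumptions. It starts an asynchronous read, keeps working on the main thread, and then performs a synchronous read of the same file.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

async function demo() {
  const events = [];
  // Create a throwaway file in a unique temporary directory.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'libuv-demo-'));
  const file = path.join(dir, 'sample.txt');
  fs.writeFileSync(file, 'hello libuv');

  // Asynchronous API: the read is handed off and the main thread moves on.
  const pending = fs.promises.readFile(file, 'utf8').then((text) => {
    events.push('async read finished');
    return text;
  });
  events.push('main thread keeps running');

  // Synchronous API: the main thread waits here until the read is done.
  const syncText = fs.readFileSync(file, 'utf8');
  events.push('sync read finished');

  // The async callback can only run once the main thread is free again.
  const asyncText = await pending;

  fs.unlinkSync(file);
  fs.rmdirSync(dir);
  return { events, syncText, asyncText };
}

demo().then(({ events }) => console.log(events.join(' -> ')));
```

The asynchronous result is delivered last even though its read was started first, because its callback has to wait until the synchronous code releases the main thread. In a server, that waiting time is time in which no other request can be handled.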
Worker threads
As we discussed earlier, some asynchronous APIs of Node.js do their work in a separate thread. For instance, reading files or encrypting data can be done off the main thread. But what if we need custom functionality to run in a separate thread without blocking the main thread? For these situations, Node.js has a core module called “worker_threads”.
For demonstration purposes, I created a function to measure the execution time of the functions.
const calculateMiddleWorkTime = (func, args = null, isArrayArgument = false) => {
  let sumOfSeconds = 0;
  for (let i = 0; i < 10; i++) {
    const before = Date.now();
    //spread the arguments only when they are passed as an array
    if (isArrayArgument) func(...args);
    else func(args);
    const after = Date.now();
    const eachFuncSeconds = (after - before) / 1000;
    sumOfSeconds += eachFuncSeconds;
  }
  //get the average time in seconds!
  return sumOfSeconds / 10;
};

module.exports = calculateMiddleWorkTime;
This is a simple function that takes a function as an argument, runs it ten times, and returns the average execution time in seconds. I use it below to explain the logic of worker threads in Node.js. So without further ado, let's dive into the examples. Suppose we have two functions: one calculates the factorial of a number and the other performs string manipulations.
const {arguments: {argOfSplit, argOfFactorial}} = require('./shared_constants');

console.log('Start of the measurements!');

//execution measuring function (in seconds)
const timeCalculator = require('./timeCalculator');

//functions
const strSplitFor = require('./str-split');
const factorial = require('./factorial');

//split function's measurements
const averageTimeSplit = timeCalculator(strSplitFor, argOfSplit);
const toFixedSplit = Number.parseFloat(averageTimeSplit).toFixed(2);

//factorial function's measurements
const averageTimeFactorial = timeCalculator(factorial, argOfFactorial);
const toFixedFactorial = Number.parseFloat(averageTimeFactorial).toFixed(2);

//function to measure functions' execution time together
const syncTogether = () => {
  strSplitFor(argOfSplit);
  factorial(argOfFactorial);
};
const averageTogether = timeCalculator(syncTogether);
const togetherToFixed = Number.parseFloat(averageTogether).toFixed(2);

console.log('end of the measurements');
console.log('sync togetherToFixed', togetherToFixed);
console.log('averageTimeSplit', toFixedSplit);
console.log('averageTimeFactorial', toFixedFactorial);
This is all the code you need to read for now. For the bigger picture, you are welcome to check out the code in my GitHub repo. Back to the example: you can experiment with the code and easily reproduce the results.
The results tell us that, with the given arguments, the factorial function takes 1.12 seconds on average and the string function takes about 2.43 seconds. Run one after the other, they complete in 4.36 seconds. That makes sense: two synchronous functions have to be executed one at a time, so the combined time is at least the sum of the individual execution times.
Now let’s look at another example. Here I took advantage of the APIs provided by “worker_threads” in Node.js.
const {arguments: {argOfSplit, argOfFactorial}} = require('./shared_constants');
const {Worker} = require("worker_threads");
const strSplitFor = require('./str-split');

const before = Date.now();

//async/new thread logic!
const worker = new Worker('./complex_caller.js', {
  workerData: {n: argOfFactorial}
});

//this works after calling the new Worker() function!
strSplitFor(argOfSplit);

worker.on('message', (m) => {
  console.log('result of the function called via worker -> ', m);
  const after = Date.now();
  console.log('seconds with workers: ', Number.parseFloat((after - before) / 1000).toFixed(2));
});

worker.on('error', (err) => {
  console.log('error occurred:', err);
});
Here I create a worker from a JavaScript file that calls the factorial function, so that script runs in a separate thread. Meanwhile, the main thread calls the string function synchronously. The results show that the two functions together complete in 2.60 seconds, which is close to the 2.43 seconds of the longer-running string function alone rather than the sum of both. This illustrates the workflow and the advantage of the worker_threads module.
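The contents of complex_caller.js are not shown in this article, so the following single-file sketch is only an assumption of what such a setup can look like. The inline workerSource stands in for the worker script, countWords stands in for the string function, and factorialInWorker and runBoth are hypothetical helper names. Wrapping the worker in a promise lets the caller await its result like any other asynchronous call.

```javascript
const { Worker } = require('worker_threads');

// Stand-in for the worker script (hypothetical contents): compute n! with BigInt
// and send it back as a string.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  let result = 1n;
  for (let i = 2n; i <= BigInt(workerData.n); i++) result *= i;
  parentPort.postMessage(result.toString());
`;

// Wrap the worker in a promise so callers can simply await the result.
function factorialInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { n } });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}

// Hypothetical main-thread work, standing in for the string function.
function countWords(text) {
  return text.split(' ').filter(Boolean).length;
}

async function runBoth(n, text) {
  const pendingFactorial = factorialInWorker(n); // starts in a separate thread
  const words = countWords(text); // runs on the main thread in the meantime
  return { factorial: await pendingFactorial, words };
}

runBoth(20, 'worker threads keep the main thread free').then(console.log);
```

The design choice here is to start the worker first and only await it after the main-thread work is done, which mirrors the order of calls in the example above.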
Node.js offers yet another way to use multithreading. As discussed, we can call the asynchronous APIs of some core modules to get the job done in another thread. For the next example, I used the crypto module of Node.js.
//asyncMeasure, asyncHashFunction, syncHashFunction and timeCalculator are defined elsewhere (not shown here)
//asyncMeasure is assumed to resolve to the average time in seconds
const measureAsyncHash = async () => {
  let seconds = await asyncMeasure(asyncHashFunction, ['secret', 'salt', 1000000, 64, 'sha512']);
  seconds = Number.parseFloat(seconds).toFixed(2);
  console.log('seconds of asyncHash IN AVERAGE', seconds);
};

let syncHash = timeCalculator(syncHashFunction, ['secret', 'salt', 1000000, 64, 'sha512'], true);
syncHash = Number.parseFloat(syncHash).toFixed(2);
console.log('seconds of syncHash IN AVERAGE', syncHash);

measureAsyncHash();

//start both asynchronous hashes before waiting for either of them
const asyncTogetherSeconds = async () => {
  const before = Date.now();
  const arr = [];
  arr.push(asyncHashFunction('secret', 'salt', 1000000, 64, 'sha512'));
  arr.push(asyncHashFunction('secret', 'salt', 1000000, 64, 'sha512'));
  const res = await Promise.all(arr);
  const after = Date.now();
  let seconds = (after - before) / 1000;
  seconds = Number.parseFloat(seconds).toFixed(2);
  console.log('seconds of ASYNC functions running together', seconds);
};

//run the two synchronous hashes one after the other
const syncTogetherSeconds = () => {
  const before = Date.now();
  const syncRes1 = syncHashFunction('secret', 'salt', 1000000, 64, 'sha512');
  const syncRes2 = syncHashFunction('secret', 'salt', 1000000, 64, 'sha512');
  const after = Date.now();
  let seconds = (after - before) / 1000;
  seconds = Number.parseFloat(seconds).toFixed(2);
  console.log('seconds of SYNC functions running together', seconds);
};

syncTogetherSeconds();
asyncTogetherSeconds();
This time I compare two versions of the same function: synchronous and asynchronous. The logic is the same, though. On average, the synchronous hashing function takes about 0.65 seconds when run on its own, and running two of them sequentially takes 1.33 seconds. The reason is simple: one call has to finish before the next one can start. The asynchronous versions behave differently: finishing two asynchronous calls took approximately as much time as finishing one synchronous call, because their work runs off the main thread at the same time.
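The hashing helpers themselves are not shown above. Their argument list (password, salt, iterations, key length, digest) matches Node's crypto.pbkdf2, so the sketch below assumes they are thin wrappers around crypto.pbkdf2Sync and crypto.pbkdf2; the author's real implementations may differ. The asyncMeasure helper and hashTwiceTogether are likewise hypothetical, and the iteration count is lowered so the demo finishes quickly.

```javascript
const crypto = require('crypto');
const { promisify } = require('util');

const pbkdf2 = promisify(crypto.pbkdf2);

// Assumption: the sync version derives the key on the main thread and blocks it.
const syncHashFunction = (password, salt, iterations, keylen, digest) =>
  crypto.pbkdf2Sync(password, salt, iterations, keylen, digest).toString('hex');

// Assumption: the async version uses crypto.pbkdf2, whose work runs in libuv's thread pool.
const asyncHashFunction = async (password, salt, iterations, keylen, digest) => {
  const key = await pbkdf2(password, salt, iterations, keylen, digest);
  return key.toString('hex');
};

// Hypothetical async counterpart of the timing helper: average seconds over `runs` calls.
const asyncMeasure = async (func, args, runs = 10) => {
  let sumOfSeconds = 0;
  for (let i = 0; i < runs; i++) {
    const before = Date.now();
    await func(...args);
    sumOfSeconds += (Date.now() - before) / 1000;
  }
  return sumOfSeconds / runs;
};

// Start both hashes before awaiting either one, so they can run in parallel.
const hashTwiceTogether = (args) =>
  Promise.all([asyncHashFunction(...args), asyncHashFunction(...args)]);

const demoArgs = ['secret', 'salt', 10000, 64, 'sha512'];
asyncMeasure(asyncHashFunction, demoArgs, 3).then((seconds) => {
  console.log('average seconds of asyncHash', seconds.toFixed(2));
});
```

Because both promises in hashTwiceTogether are created before either is awaited, the two key derivations can be processed by different pool threads at once. Awaiting the first call before starting the second would make them run one after the other, just like the synchronous version.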
Conclusion
We studied the topic of multithreading as a whole. There is, of course, a lot more to dive into if you are interested and passionate. In terms of Node.js, we found out that there are two main ways to achieve multithreading: calling the asynchronous APIs of core modules and using the APIs of the “worker_threads” module.