Can you avoid unleashing Zalgo in NodeJS?
You probably can, but first you need to know what it even means to unleash Zalgo.
By the way, the term Zalgo comes from a meme. Zalgo is an Internet legend about an ominous entity believed to cause insanity, death and destruction of the world. Zalgo is often associated with scrambled text on web pages and photos of people whose eyes and mouths have been covered in black.
You might be wondering how this is related to NodeJS.
The relation is symbolic. Developers often make mistakes while building applications. These mistakes often lead to bugs. While most of the bugs are easy to fix after proper testing, some of them can be extremely troublesome. Unleashing Zalgo is one of those types of mistakes.
The concept of Zalgo in NodeJS was first talked about by Isaac Z. Schlueter (one of the core developers on the NodeJS project) in one of his blog posts. Isaac’s post was in turn inspired by a post about callbacks on Havoc’s blog.
In this post, we will learn how to identify a Zalgo-like situation in our code and how we can avoid falling into this trap.
Synchronous or Asynchronous Function = Unpredictable
If the Reactor Pattern is the engine of NodeJS, the callback pattern is the fuel on which it runs.
The callback pattern makes it possible for JavaScript to handle concurrency despite being single-threaded (technically).
However, callbacks in NodeJS can be asynchronous as well as synchronous.
In the post about NodeJS Callback Pattern, we saw how the order of instructions can change radically depending on whether a function is synchronous or asynchronous. This has significant consequences for the correctness and efficiency of our programs.
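To make the ordering difference concrete, here is a minimal sketch. The helpers addSync(), addAsync() and traceOrder() are hypothetical names invented for illustration; they do not come from the post referenced above.

```javascript
// Hypothetical helpers for illustration only.
function addSync(a, b, callback) {
  // The callback runs before addSync() returns.
  callback(a + b);
}

function addAsync(a, b, callback) {
  // The callback runs only after the current call stack has unwound.
  setTimeout(() => callback(a + b), 0);
}

// Records what happens before, during and after the call.
function traceOrder(add) {
  const order = ['before'];
  add(1, 2, (sum) => order.push('result ' + sum));
  order.push('after');
  return order; // the asynchronous variant appends to this array later
}

const syncOrder = traceOrder(addSync);
console.log(syncOrder.join(' -> ')); // before -> result 3 -> after

const asyncOrder = traceOrder(addAsync);
setTimeout(() => console.log(asyncOrder.join(' -> ')), 10);
// before -> after -> result 3
```

The same calling code sees the result either before or after the statement that follows the call, depending only on how the callee chose to invoke the callback.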
Of course, it is not hard to handle a function that clearly states whether it is synchronous or asynchronous. The problem arises when a function behaves inconsistently.
What do you call a function that runs synchronously in certain conditions and asynchronously in some other conditions?
Such a function is an unpredictable function.
Unpredictable functions and the APIs they expose create problems that are hard to detect and reproduce. By writing such functions in your code, you are unleashing Zalgo.
Unleashing Zalgo in NodeJS
Let us understand the concept of unleashing Zalgo in NodeJS with the help of an example.
Check out the below code:
let cache = {};

function getStringLength(text, callback) {
  if (cache[text]) {
    callback(cache[text])
  } else {
    setTimeout(function() {
      cache[text] = text.length;
      callback(text.length)
    }, 1000)
  }
}
The getStringLength() function is inherently evil. Though it simply calculates the length of a string, it has two faces.
If the string and its length are available in the cache, the function behaves synchronously by returning the data from the cache.
Otherwise, it calculates the length of the string and stores the result in the cache before triggering the callback. However, all of this is done asynchronously using setTimeout().
We use setTimeout() here only to force asynchronous behaviour in our example. You can replace it with any other asynchronous activity such as reading a file or making an API call. The idea is to demonstrate that a function can have different behaviour in different situations.
“But how does it unleash Zalgo?” you may ask.
Let us write some more logic to actually use this unpredictable function. See the completed code below:
function sleep(milliseconds) {
  return new Promise(resolve => setTimeout(resolve, milliseconds));
}

let cache = {};

function getStringLength(text, callback) {
  if (cache[text]) {
    callback(cache[text])
  } else {
    setTimeout(function() {
      cache[text] = text.length;
      callback(text.length)
    }, 1000)
  }
}

function determineStringLength(text) {
  let listeners = []
  getStringLength(text, function(value) {
    listeners.forEach(function(listener) {
      listener(value)
    })
  })
  return {
    onDataReady: function(listener) {
      listeners.push(listener)
    }
  }
}

async function testLogic() {
  let text1 = determineStringLength("hello");
  text1.onDataReady(function(data) {
    console.log("Text1 Length of string: " + data)
  })

  await sleep(2000);

  let text2 = determineStringLength("hello");
  text2.onDataReady(function(data) {
    console.log("Text2 Length of string: " + data)
  })
}

testLogic();
Pay special attention to the function determineStringLength(). It is a wrapper around the getStringLength() function.
Basically, the determineStringLength() function creates a new object that acts as a notifier for the string length calculation. When the string length is calculated by the getStringLength() function, the listener functions registered through onDataReady() are invoked.
To test this concept, we have the testLogic() function at the end. The test function calls the determineStringLength() function twice with the same input string “hello”. Between the two calls, we pause the execution for 2 seconds using the sleep() function to introduce a time lag.
Running the program gives the following output:
Text1 Length of string: 5
The callback for the second operation was never invoked.
For text1, the getStringLength() function behaves asynchronously since the data is not available in the cache. Therefore, we were able to register our listener function before the callback fired, and the output was printed.
Next, we have text2, which is created at a point when the data is already in the cache. This time getStringLength() behaves synchronously. Hence, the callback that is passed to getStringLength() is invoked immediately, which in turn invokes all the registered listeners synchronously. However, no listener has been registered at that point: onDataReady() is called only after determineStringLength() returns, so the listener is added too late and is never invoked.
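The sequence of events for the second call can be traced with a condensed sketch of the example. It assumes the cache is already filled so that only the synchronous branch runs, and the events array is an addition made purely for tracing; it is not part of the original code.

```javascript
// Tracing aid added for this sketch; not part of the original example.
const events = [];

// Pre-filled cache, as it would be after the first call has completed.
const cache = { hello: 5 };

function getStringLength(text, callback) {
  if (cache[text]) {
    callback(cache[text]); // synchronous branch
  } else {
    setTimeout(() => {
      cache[text] = text.length;
      callback(text.length);
    }, 1000);
  }
}

function determineStringLength(text) {
  const listeners = [];
  getStringLength(text, (value) => {
    events.push('callback fired, listeners: ' + listeners.length);
    listeners.forEach((listener) => listener(value));
  });
  return {
    onDataReady(listener) {
      listeners.push(listener);
      events.push('listener registered');
    },
  };
}

determineStringLength('hello').onDataReady((length) => {
  events.push('listener invoked with ' + length);
});

events.forEach((event) => console.log(event));
// callback fired, listeners: 0
// listener registered
```

The callback fires while the listeners array is still empty, and the "listener invoked" entry never appears.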
The root of this problem is the unpredictable nature of the getStringLength() function. Instead of providing consistency, it increases the unpredictability of our program.
A bug like this is not easy to identify and can result in nasty defects. Just like unleashing Zalgo.
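One way to catch this kind of inconsistency in a test is to check whether a function calls back before it returns. The sketch below is only an illustration: callsBackSynchronously() is a hypothetical helper, and the copy of getStringLength() uses a 10 ms delay instead of 1000 ms so that it finishes quickly.

```javascript
// Hypothetical test helper, not from the original post. It resolves to
// true when fn invokes its callback before returning, false otherwise.
function callsBackSynchronously(fn, ...args) {
  return new Promise((resolve) => {
    let returned = false;
    fn(...args, () => resolve(!returned));
    returned = true;
  });
}

// Copy of the post's function with a shortened delay (assumption: 10 ms).
const cache = {};

function getStringLength(text, callback) {
  if (cache[text]) {
    callback(cache[text]);
  } else {
    setTimeout(() => {
      cache[text] = text.length;
      callback(text.length);
    }, 10);
  }
}

async function detect() {
  const first = await callsBackSynchronously(getStringLength, 'hello');
  const second = await callsBackSynchronously(getStringLength, 'hello');
  console.log(first, second); // false true
  return [first, second];
}

const detection = detect();
```

A function that reports different answers for the same input, as this one does, is a candidate for the problem described above.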
Avoid Zalgo in NodeJS using Deferred Execution
So, how do we avoid Zalgo in NodeJS?
It’s actually pretty simple. We make sure our functions behave consistently. They should be either synchronous or asynchronous. Not both at the same time.
In our contrived example, we can fix the issue by making the getStringLength() function purely asynchronous for all scenarios.
See below:
function getStringLength(text, callback) {
  if (cache[text]) {
    process.nextTick(function() {
      callback(cache[text]);
    });
    // callback(cache[text])
  } else {
    setTimeout(function() {
      cache[text] = text.length;
      callback(text.length)
    }, 1000)
  }
}
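Instead of directly triggering the callback, we wrap it inside process.nextTick(). This defers the callback until the currently running operation has completed. By then the caller has had a chance to register its listener, and the callback still runs before the event loop moves on to timers or I/O.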
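To see when a process.nextTick() callback runs relative to synchronous code and timers, a small sketch like the following can help. The order array is just a tracing aid introduced here.

```javascript
// Tracing aid for this sketch.
const order = [];

setTimeout(() => order.push('setTimeout'), 0);
process.nextTick(() => order.push('nextTick'));
order.push('synchronous');

// Print the recorded order once everything above has run.
setTimeout(() => console.log(order.join(' -> ')), 20);
// synchronous -> nextTick -> setTimeout
```

Even though the timer was scheduled first, the nextTick callback runs as soon as the synchronous code finishes, ahead of the timer.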
If you are confused by the execution of process.nextTick(), I recommend going through this comprehensive post on event loop phases in NodeJS. The post will make things absolutely clear about when a certain callback is executed by the event loop.
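Another way to get consistent behaviour, offered here only as an alternative sketch and not as the approach taken in this post, is to return a Promise. Callbacks attached with then() always run asynchronously, even when the promise is already resolved. The name getStringLengthAsync(), the Map-based cache and the 10 ms delay are assumptions made for this illustration.

```javascript
const cache = new Map();

// Hypothetical promise-returning variant of getStringLength().
function getStringLengthAsync(text) {
  if (cache.has(text)) {
    // Already resolved, but then() callbacks still run asynchronously.
    return Promise.resolve(cache.get(text));
  }
  return new Promise((resolve) => {
    setTimeout(() => {
      cache.set(text, text.length);
      resolve(text.length);
    }, 10); // shortened delay (assumption) to keep the sketch fast
  });
}

async function demo() {
  const order = [];
  await getStringLengthAsync('hello'); // first call fills the cache

  // Second call hits the cache, yet the handler does not run inline.
  const pending = getStringLengthAsync('hello').then((length) => {
    order.push('then ' + length);
  });
  order.push('after call');

  await pending;
  console.log(order.join(' -> ')); // after call -> then 5
  return order;
}

const demoResult = demo();
```

Because the handler never runs before the calling code continues, a caller can attach it at any point after the call without missing the result.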
Conclusion
Subtle causes can lead to nasty bugs. Unleashing Zalgo is one such case, which is how the situation earned its memorable name.
As I mentioned earlier, the term was first used in the context of NodeJS by Isaac Z. Schlueter, whose post was in turn inspired by a post on Havoc’s blog.
You can check out those posts to get more background.
I hope the example in this post was useful in understanding the issue on a more practical level.
Do share your views in the comments section below.