How to Install and Use Facebook's Zstandard

Introduction

Recently, Facebook released version 1.0 of Zstandard, popularly known as just Zstd. It is an implementation of a new data compression algorithm developed by Yann Collet, the same guy who developed LZ4 and xxHash. I believe that Zstandard might very well become the new de facto compression library of the future because it overcomes most of the limitations present in zlib, and offers markedly better compression ratios.

In this tutorial, I’ll be showing you how to install Zstd and use it both as a command line tool, and as a C library.

Installation

The source code of Zstandard is available on GitHub, so all you need to do is download it as a ZIP file, extract it, and compile it. Doing so is easy if you are using Ubuntu. Just go into the zstd-1.0.0 directory and run make install, which builds the code and then installs it. Installing into the default system location usually requires root privileges, so you may need to prefix the command with sudo.

cd zstd-1.0.0
make install

Note that if you want to install zstd in a directory of your choice, you can use the PREFIX option. For example, if you want to install zstd inside /tmp/zstd, you would type in the following:

cd zstd-1.0.0
make install PREFIX=/tmp/zstd

Of course, you would then have to add the zstd directory to your PATH variable manually.

If everything went well during the build and installation, you’ll now have a binary called zstd on your system. To check that it is working, type in the following command:

zstd --version

You should see output that looks similar to this:

*** zstd command line interface 32-bits v1.0.0, by Yann Collet ***

Using Zstandard as a Compression Tool

Using Zstandard is easy. You simply pass a file name to it, and it compresses it, generating a new file with a .zst extension.

For example, here’s how you would compress a file called myfile.txt:

zstd myfile.txt

To decompress a file, you can use the -d option. Here’s how:

zstd -d myfile.txt.zst

Alternatively, you could use the unzstd command.

unzstd myfile.txt.zst

Usually, you would want to compress more than just one file. However, you can’t pass a directory to zstd directly. To overcome this limitation, you can simply use the tar command along with zstd.

tar -cvf mydirectory.tar /mydirectory
zstd mydirectory.tar

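By default, the zstd command offers 19 levels of compression, numbered 1 to 19. Levels 20 to 22 also exist, but they need more memory and must be enabled with the --ultra option. The default level is 3. Here’s how you use a higher compression level: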

zstd -15 myfile.txt

Of course, a higher compression level also means the time taken to compress is longer.

Using Zstandard as a Library

Let me now show you how to use Zstandard in a C program. I’ll be using Ubuntu as my operating system. Therefore, we’ll be working with gcc.

To be able to use Zstandard in your program, you must include the zstd.h header file. To follow this tutorial, you will also need stdio.h, stdlib.h, and string.h.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <zstd.h>

int main() {

}

In this tutorial, I’ll be showing you how to compress a string. You can, however, use this knowledge to compress other forms of data too. All the snippets that follow go inside the main() function.

const char* my_data = "aaaaaaabbbbbbbbbccccccccddddddddeeeeeeeefffffffffggggggggg";
size_t size_of_data = strlen(my_data);

We must now create a buffer for the compressed data, and also decide how big it should be. zstd.h has a helper function called ZSTD_compressBound() that lets you know what the buffer size would have to be in the worst case.

size_t estimated_size_of_compressed_data = ZSTD_compressBound(size_of_data);
void* compressed_data = malloc(estimated_size_of_compressed_data);

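To compress your data, you must call the ZSTD_compress() function. On success, it returns the actual size of the compressed data. On failure, it returns an error code, which you can detect by passing the return value to ZSTD_isError().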

size_t actual_size_of_compressed_data = 
        ZSTD_compress(compressed_data, estimated_size_of_compressed_data, 
            my_data, size_of_data, 19);

Note that the last argument to the function is the compression level, which can range from 1 (lowest) to 22 (highest).

At this point, the compressed_data buffer stores the data in a compressed format. You can easily write it to a file using fwrite(). The output file must, of course, be opened in binary mode.

FILE *fp = fopen("/tmp/mydata.zst", "wb");
fwrite(compressed_data, actual_size_of_compressed_data, 1, fp);
fclose(fp);

The program is ready. To compile it, you can use gcc as follows. This produces an executable called a.out, which you must run to generate /tmp/mydata.zst.

gcc myprogram.c -lzstd

You can check if the compressed data can be decompressed by passing mydata.zst, which is the output of our program, to zstdcat.

zstdcat /tmp/mydata.zst

Conclusion

You now know how to use Zstandard. Although this tutorial covered only the C API, Zstandard has bindings for several other languages.

If you found this article useful, please share it with your friends and colleagues!