May Contain Blueberries

the sometimes journal of Jeremy Beker


It still surprises me sometimes when I run into a clear parallel between experiences I have had in the real world and those in computing. Years ago, in an old job, I ran a team that designed software for running large warehouse systems full of robotics and automated conveyor belts. In these systems we were responsible for routing boxes or pallets around the warehouse efficiently. To do so, we had to take into account the capacity of the various conveyor belts as well as how quickly the robotics could move things to their shelves. It was a balancing act of different throughputs.

In my current job I have a similar problem that I have been slowly optimizing that is entirely electronic in nature. The situation can be simplified like this:

  • A large file is generated on a server online once a day (on the order of 100GB)
  • It needs to be restored to a local development server for me to work with
  • This restoration is done when I need new data, so I initiate it when needed
  • Getting the absolute latest backup restored is not important. When I do a restore, if the data is 24-48 hours old, that is fine

A few other items of note:

  • My home internet connection is 100 Mbit/s
  • My internal home network is 1000 Mbit/s
  • The destination machine is writing to an SSD that has a write throughput way faster than everything else
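
Some rough back-of-the-envelope math (my numbers, not careful measurements): at 100 Mbit/s (about 12.5 MB/s), pulling a 100GB file over the internet takes more than two hours, while at 1000 Mbit/s (about 125 MB/s) the same file takes on the order of 13-14 minutes. That is why the timings below depend so much on which link the bulk of the data has to cross, and in what form.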

The process that has existed for doing this was created before I came to the company and has evolved over the years with a burst of activity recently (because I was bored/annoyed/curious to make it better).

1: Copy Then Restore

This is the first method that was used and is the simplest. It was a two-step process that was not optimized for speed. The process was:

  • [remote] the backup was taken and compressed on the server using xz at some point prior.
  • [local] download.sh: The script would download the file from the remote host and store it on the local machine
  • [local] restore.sh: This would remove the old data, xz -d the file into tar and extract it into the destination directory
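
The two local scripts boiled down to something like this. This is a rough sketch, not the actual scripts; the hostnames and paths are made up for illustration:

    # download.sh -- pull the compressed backup down first (paths are illustrative)
    scp backup-host:/backups/latest.tar.xz /tmp/latest.tar.xz

    # restore.sh -- clear the old data, then decompress and unpack
    rm -rf /data/restore/*
    xz -dc /tmp/latest.tar.xz | tar -xf - -C /data/restore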

This had the benefit that you could download the big file once and then restore from it as many times as you wanted. It also took advantage of xz’s very high compression ratio, so the file transferred over the slowest link was as small as possible. The downside was that it was still slow to get a new file to your machine before you could do the restore, and it was highly dependent on your internet speed at the time you fetched a new backup file.

2: Stream restore

This was the first major optimization that was made to the system. It started from the assumption that doing multiple restores from the same backup was unlikely and that by the time you wanted to restore a second time you also wanted a new, more up-to-date backup. It also dealt with the issue that storing the compressed backup file prior to restoring took up a non-trivial amount of disk space.

  • [remote] the backup was taken and compressed on the server using xz at some point prior.
  • [local] stream_restore.sh: This would initiate an ssh session with the remote server, and cat the file across it directly into xz -d and then tar and extract it into the destination directory
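
In pipeline form, stream_restore.sh looks roughly like this (again a sketch with invented names and paths):

    # stream_restore.sh -- no intermediate file on local disk
    rm -rf /data/restore/*
    ssh backup-host 'cat /backups/latest.tar.xz' | xz -d | tar -xf - -C /data/restore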

This removed the “store and restore” problem of the first solution, and since everything being done locally was faster than my internet connection, the transfer time became the bottleneck.

3: Local copy, stream restore xz compressed data (24 minutes)

Realizing that the bottleneck was now transferring the backup from our remote system to my house, I wondered if I could remove that step from the critical path entirely. I am lucky in that I have a server running 24/7 at home that I could schedule things on, so I realized that I could get a copy of the backup to my house overnight, before I needed it. This became a combination of #1 & #2.

  • [remote] the backup was taken and compressed on the server using xz at some point prior.
  • [local server] download.sh: The script would download the file from the remote host and store it on a server at my house. This was scheduled to run every night so I always have a recent copy available
  • [local] stream_restore.sh: This would initiate an ssh session with my local server, and cat the file across it directly into xz -d and then tar and extract it into the destination directory
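
Roughly, the moving pieces look like this (the cron schedule, hostnames, and paths are all made up for illustration, and the cron job assumes key-based ssh auth):

    # on the home server, run nightly via cron to keep a recent copy around
    0 2 * * * scp backup-host:/backups/latest.tar.xz /srv/backups/latest.tar.xz

    # stream_restore.sh on the dev machine, now pulling from the home server
    ssh home-server 'cat /srv/backups/latest.tar.xz' | xz -d | tar -xf - -C /data/restore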

This had a lot of benefits in that now I could restore as fast as I could move data across my local network. I used this solution for quite a while before trying to improve it. It is also where I started keeping track of how long it took as I made improvements.

4: Local copy, stream restore uncompressed data (16 minutes)

With this solution, I started monitoring the throughput of data through the various pipelines, and I was surprised to find that the bottleneck was not actually my local network. It was decompressing the xz compressed file on the destination server before it could be run through tar. It turns out that while xz achieves very high compression ratios, it can’t sustain high data rates even when decompressing.

So, I figured why not add the decompression to the overnight task?

  • [remote] the backup was taken and compressed on the server using xz at some point prior.
  • [local server] download.sh: The script would download the file from the remote host and store it on a server at my house. This was scheduled to run every night so I always have a recent copy available
  • [local server] decompress the xz file on the local server and just store the raw tar file
  • [local] stream_restore.sh: This would initiate an ssh session with my local server, and cat the file across it directly into tar and extract it into the destination directory
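
Sketched out, the change is just moving xz out of the restore path (names still illustrative):

    # overnight on the home server: keep the raw tar instead of the .xz
    xz -dk /srv/backups/latest.tar.xz     # -k keeps the .xz, produces latest.tar

    # stream_restore.sh: nothing left to decompress in the critical path
    ssh home-server 'cat /srv/backups/latest.tar' | tar -xf - -C /data/restore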

This had a significant benefit and got my restore time down to 16 minutes, which was a nice bump. However, I was now restricted by my home network as I was saturating that network link.

5: Local copy, stream restore zstd compressed data (7 minutes)

Knowing that I could saturate my network link, I saw that I was not utilizing the full write speed of the SSD. I knew that the only way to get more data onto the SSD was to send compressed data over the wire. However, as I had learned in #4, xz could not decompress fast enough. I had read a few articles about zstd as a compression algorithm that was both CPU efficient and optimized for high throughput. So I figured that if I compressed the data across the wire, it would expand on the destination system at an effective rate faster than wire speed.

  • [remote] the backup was taken and compressed on the server using xz at some point prior.
  • [local server] download.sh: The script would download the file from the remote host and store it on a server at my house. This was scheduled to run every night so I always have a recent copy available
  • [local server] decompress the xz file on the local server and just store the raw tar file
  • [local] stream_restore.sh: This would initiate an ssh session with my local server, compress the file using zstd (with the fast option) as it was sent over the ssh connection, then decompress it using zstd on the destination server before piping it into tar and extracting it into the destination directory
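
As a sketch, the restore pipeline becomes something like this (the names and paths are my inventions, not the real script):

    # stream_restore.sh: compress on the home server as the data leaves,
    # decompress on the dev machine as it arrives
    ssh home-server 'zstd --fast -c /srv/backups/latest.tar' \
      | zstd -d \
      | tar -xf - -C /data/restore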

This got me really close and down to 7 minutes. But still I wondered if I could do better. I couldn’t use a very high zstd compression level inline and still keep the network saturated.

6: Local copy, recompress using zstd, stream restore compressed data (5 minutes)

Given that I wanted to send zstd compressed data over the wire, there was no reason to do that compression at the time of restore. I could do it overnight. This had the benefits of being able to use a higher compression level and of removing the compression from the time-critical path.

  • [remote] the backup was taken and compressed on the server using xz at some point prior.
  • [local server] download.sh: The script would download the file from the remote host and store it on a server at my house. This was scheduled to run every night so I always have a recent copy available
  • [local server] decompress the xz file on the local server and then recompress it using zstd
  • [local] stream_restore.sh: This would initiate an ssh session with my local server, cat the compressed file across, run it through zstd to decompress and then into tar to extract it into the destination directory
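
The whole thing ends up looking something like this (the compression level, hostnames, and paths are illustrative guesses, since I am still tuning the level):

    # overnight on the home server: convert the xz archive to zstd once,
    # at a higher level than would be practical inline
    xz -dc /srv/backups/latest.tar.xz | zstd -T0 -9 > /srv/backups/latest.tar.zst

    # stream_restore.sh: ship the pre-compressed file, decompress at the destination
    ssh home-server 'cat /srv/backups/latest.tar.zst' | zstd -d | tar -xf - -C /data/restore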

This is, I think, the best that I can do. I am playing with the compression levels of the overnight zstd run, but higher levels don’t seem to be doing much better (and may be impacting decompression speeds). I am seeing about a 3.5-4x reduction in file size.

I think at this point I have moved the bottleneck all the way down to tar and the SSD itself, so I’m quite happy.

Day to day operations

Ironically, doing these restores very quickly isn’t as big a deal as it used to be. This used to be the only way that we could restore our development systems to a pristine state if we were making large changes. It was a pain and people dreaded having to do it (and would therefore work with messy data).

To alleviate that problem, I changed our local development systems from using ext4 filesystems for storing the uncompressed, un-tarred data to zfs. One of the many awesome things about zfs is filesystem snapshots. So now, once the various restore scripts finish restoring a pristine set of the data, they take a snapshot of the filesystem at that point in time. Then, whenever we need to reset our local machines, we can just tell the filesystem to roll back to that snapshot, and that takes less than a minute. So on a day-to-day basis, when one of our developers needs to clean their system but doesn’t need to update their data, they can do so very quickly. This has been a game changer for us, but I still wanted to make the full restore go faster (for me at any rate).
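
The zfs side of this is just a couple of commands (the dataset name here is invented for the example):

    # taken by the restore scripts once a pristine restore finishes
    zfs snapshot tank/devdata@pristine

    # later, to reset the machine back to that pristine state in under a minute
    # (rollback targets the most recent snapshot; -r discards any newer ones)
    zfs rollback tank/devdata@pristine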

Conclusions and asides

To be clear, these speed improvements aren’t “free.” I am basically throwing CPU resources (and therefore electricity) at the problem. I am using one algorithm, xz, to get the file as small as possible for the slowest link, then switching to zstd because of its fast decompression speed. I am also trying not to break compatibility with other people who use the xz compressed file and don’t want (or don’t have the infrastructure and setup) to run this as a multistep process.

I also found as part of this that the CPU cooler on the server that was storing and recompressing the archives was not properly seated, so I kept overheating the CPU until I fixed that. Once that was fixed, I was confident it could handle the high loads.

Thanks for coming along on this mostly pointless journey.