Sunday, March 31, 2019

How I Transcoded 11TB of Media in Under a Week

The answer to the question "how did I transcode 11TB of media in under a week" will not shock you. In fact, there's really only one answer: I threw a disgusting amount of compute power at the problem. Now, I know what you're thinking: throw enough compute at a problem and it will break. But the design used to solve a problem is just as important as the power you have behind it. If you don't use the power correctly, then you don't truly have power.
So let's get the discussion of the hardware out of the way. Remember that blade I made into a standalone server? Well, that wasn't the only one. The difference being that the other ones actually worked.


One of the nice things about going to college is that you can use their electricity. Those who recognize the content of these racks will most certainly begin weeping at the thought of the energy bills that are associated with them (running and cooling them). Luckily, tuition covers it, and the club at school that runs it allowed me to take two of these C7000s over during a transitional period to transcode a massive amount of media.

This media is nothing special: it's a very, very large collection of movies and TV shows. This collection, however, takes up a lot of space. So a while ago, I wanted to transcode all of our media into HEVC to save space with minimal quality loss. This massive transcode happened across two batches. The first batch happened on one C7000 a little under a year ago and didn't involve any TV shows. This transcode batch involved 11TB of movies and TV shows (most of which were in H.264).

This transcode has many benefits. Once the content is merged into the master library and we're working with homogeneous HEVC content, we'll be using about 60% of the disk space with minimal perceivable quality loss. Not only that, but 4K and 1080p media can then be direct streamed to devices through Plex much more easily because of the decreased bitrate (not to mention direct streaming is far and away better than having Plex transcode things, especially when you have multiple people playing content from your server).

Each of these blades is a discrete computer. So how was I able to coordinate all of these different jobs across multiple computers? I decided to load up filenames in a RabbitMQ queue. A transcode worker would receive a filename and begin the process of converting it. All blades had an NFS share from the main file server mounted, so the worker copied the file into memory (most of these blades had more RAM than local disk space, believe it or not), asked FFMPEG to transcode the file, and copied the result to the output folder. All of this was managed in Docker so I didn't have to bother configuring each of the servers.
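To make the queue idea concrete, here is a minimal sketch of the same pattern on a single machine. It is not the actual setup: an in-process `LinkedBlockingQueue` stands in for the RabbitMQ queue, threads stand in for blades, and the names `WorkQueueSketch`, `drain`, and `POISON` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class WorkQueueSketch {
    // Sentinel value telling a worker that no more work is coming (hypothetical).
    private static final String POISON = "\u0000STOP";

    /**
     * Starts workerCount workers that all pull filenames from one shared queue.
     * Each filename is handed to exactly one worker. Returns the processed names.
     */
    public static List<String> drain(List<String> filenames, int workerCount, Consumer<String> job) {
        // Stand-in for the RabbitMQ queue: filenames first, then one stop marker per worker.
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(filenames);
        for (int i = 0; i < workerCount; i++) {
            queue.add(POISON);
        }

        Queue<String> processed = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(workerCount);
        for (int i = 0; i < workerCount; i++) {
            pool.submit(() -> {
                try {
                    while (true) {
                        String name = queue.take(); // blocks until work is available
                        if (name.equals(POISON)) {
                            return;
                        }
                        job.accept(name);
                        processed.add(name);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(processed);
    }

    public static void main(String[] args) {
        List<String> files = List.of("movie-a.mkv", "movie-b.mkv", "show-s01e01.mkv");
        List<String> done = drain(files, 2, name ->
                System.out.println(Thread.currentThread().getName() + " took " + name));
        System.out.println("Processed " + done.size() + " files");
    }
}
```

The point of the pattern is that workers pull work when they are free rather than being assigned it up front, so a slow file on one worker never holds up the others.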

The actual worker code, which I call Transcode Joe, is a simple Java program that invokes FFMPEG on the command line and waits for it to finish. It went through several iterations, so the code isn't perfect, but it does include support for UTF-8 filenames and multiple threads inside a Docker container (which I don't advise). When a file completes, it adds the filename to a done queue; any failures go to a fail queue for manual intervention.
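For a rough idea of what "invoke FFMPEG and wait" can look like in Java, here is a sketch. This is not Transcode Joe's code: the class and method names are invented, the encoder settings (`libx265`, CRF 23, audio copied as-is) are placeholder assumptions rather than the command I actually used, and the "done"/"fail" strings just stand in for the two result queues.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

public class TranscodeJobSketch {

    /**
     * Builds an illustrative FFmpeg command line for an HEVC transcode.
     * The flags are assumptions for this sketch, not the original command.
     */
    public static List<String> buildCommand(Path input, Path output) {
        return List.of(
                "ffmpeg",
                "-nostdin",            // never wait for keyboard input
                "-y",                  // overwrite the output file if it exists
                "-i", input.toString(),
                "-c:v", "libx265",     // encode video as HEVC
                "-crf", "23",          // placeholder quality setting
                "-c:a", "copy",        // keep the audio stream untouched
                output.toString());
    }

    /** Maps the FFmpeg exit code to the name of the queue the filename goes to next. */
    public static String resultQueue(int exitCode) {
        return exitCode == 0 ? "done" : "fail";
    }

    /** Runs FFmpeg, blocks until it exits, and reports which queue the file belongs in. */
    public static String transcode(Path input, Path output) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(input, output))
                .inheritIO()           // show FFmpeg's own output in the worker's log
                .start();
        return resultQueue(process.waitFor());
    }

    public static void main(String[] args) {
        List<String> command = buildCommand(Paths.get("in/Amélie (2001).mkv"), Paths.get("out/Amélie (2001).mkv"));
        System.out.println(String.join(" ", command));
    }
}
```

Passing the arguments as a list instead of one shell string means filenames with spaces or non-ASCII characters never go through shell quoting, which is one straightforward way to keep odd filenames from breaking the command.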

Our first run took about 4 days using all of the compute nodes. I ran four transcoders per blade, giving me 88 simultaneous transcodes. Once that was done, I was left with about 400 legitimate errors (some of the failures were files that should not have been attempted at all, like subtitle files, that I forgot to filter out of the list). After much diagnosis, I was able to resolve most errors by tweaking the FFMPEG command. Sometimes a transcode got unlucky and ended up on a blade that didn't have enough RAM to hold the file in memory, so it would fail. However, reducing the number of jobs per blade fixed this issue.
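Both of those avoidable failure classes can be caught before anything hits the queue. The sketch below is hypothetical and was not part of the original run: the extension allow-list is an assumption about what counts as video, and the RAM estimate is deliberately crude, since it ignores the output file and FFMPEG's own memory use.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

public class JobFilterSketch {
    // Hypothetical allow-list; adjust to whatever containers the library actually holds.
    private static final Set<String> VIDEO_EXTENSIONS = Set.of("mkv", "mp4", "avi", "m4v", "mov");

    /** True if the filename ends in one of the allowed video extensions (case-insensitive). */
    public static boolean isVideo(String filename) {
        int dot = filename.lastIndexOf('.');
        if (dot < 0 || dot == filename.length() - 1) {
            return false; // no extension at all
        }
        String extension = filename.substring(dot + 1).toLowerCase(Locale.ROOT);
        return VIDEO_EXTENSIONS.contains(extension);
    }

    /** Drops subtitle files, artwork, and anything else that should never be queued. */
    public static List<String> videosOnly(List<String> filenames) {
        return filenames.stream()
                .filter(JobFilterSketch::isVideo)
                .collect(Collectors.toList());
    }

    /**
     * Rough upper bound on concurrent jobs per blade when every job holds its
     * input file in RAM, sized for the largest file in the batch.
     */
    public static int maxJobsForRam(long ramBytes, long largestFileBytes) {
        if (largestFileBytes <= 0) {
            throw new IllegalArgumentException("largestFileBytes must be positive");
        }
        return (int) (ramBytes / largestFileBytes);
    }

    public static void main(String[] args) {
        List<String> listing = List.of("Movie.mkv", "Movie.en.srt", "poster.jpg", "Episode.MP4");
        System.out.println(videosOnly(listing));

        long gib = 1024L * 1024L * 1024L;
        System.out.println(maxJobsForRam(64 * gib, 20 * gib));
    }
}
```

Sizing for the largest file is pessimistic, but it means no job can land on a blade that can't hold it.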

Since this was mainly an architecture discussion, I don't think I'll post the code for this. You can ask for it, but it's fairly simple to implement yourself: it's an Ubuntu Docker image with Java and FFMPEG installed in the Dockerfile, and a queue watcher written in Java. Using queues to distribute jobs across any number of machines is not a new idea, but it sure is an effective way to get a massive amount of work done in an organized fashion.
