Thursday, June 27, 2019

Cloud Gaming - Part 1: The Proof of Concept and The Plan

Cloud gaming. Everybody's doing it. NVIDIA, Google, Microsoft, Sony... you get the picture. Big tech companies are doing it. Am I willing to say it's the future of gaming? Not quite, but it is an interesting service that clearly has a market. I wanted to see what it takes to build a service like this in the cloud, and how much bandwidth it takes to play at a higher resolution, framerate, and quality than my own computer can manage.

So here's the overall goal of this project: we want to achieve 4K gaming at the highest framerate we can muster. This has to be coupled with streaming it over the internet and ending up on my television in the same resolution and framerate with minimal artifacting. The input lag has to be negligible too. Yikes, so how are we going to do this?

I'm going to go with the Google Cloud ecosystem, but you can probably do this with any cloud provider that offers GPUs. I chose Google Cloud because I like them and, in my experience, their VMs perform a little better.

As a proof of concept, I tried installing a few games on a virtual machine with a Tesla K80. I started with Rocket League since it's very popular, requires low input lag, has a bunch of motion, and I own it. I was disappointed when I was met with a graphics error.

(Image from Driver Easy, since I didn't take a screenshot of the error myself.)

I figured this was because the NVIDIA drivers, while installed, didn't support the virtual screen I was using over RDP. So I killed this VM and launched one with a P4 GRID card. I installed those drivers and... well... I got the same error.

Odd. I decided to try two other games. Goat Simulator gave the same error, but ClusterTruck started up just fine. Many websites suggested forcing the game to start in windowed mode by adding a launch flag in Steam. So I did. And, wouldn't you know it, it actually worked! Well, mostly: Goat Simulator started but froze before it got to any rendering. Rocket League, on the other hand, started and played with no problem at all.



However, there were some drawbacks. I was playing this over a Microsoft RDP connection (the kind that ships with Windows and supports H264). While the lag didn't make it unplayable, it was still quite noticeable. I'm not very good at Rocket League (or any game, for that matter), but I'm sure a seasoned player would find this amount of lag intolerable. And I'm only at 1080p (windowed, so slightly less). Bandwidth wasn't the bottleneck: I have gigabit download and Google has an unholy amount of bandwidth. I think this is just asking too much of RDP. Even so, I was downloading at about 40-50Mbps. That's a lot of data. We can't have that.
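To put that 40-50Mbps figure in perspective, here is a quick back-of-the-envelope sketch. It assumes a steady stream at the quoted rate and decimal units (1 Mbps = 1,000,000 bits per second, 1 GB = 1,000,000,000 bytes). The function name is just for illustration.

```python
def gigabytes_per_hour(mbps):
    """Convert a sustained bitrate in megabits per second to gigabytes per hour."""
    bits_per_hour = mbps * 1_000_000 * 3600  # bits transferred in one hour
    bytes_per_hour = bits_per_hour / 8       # 8 bits per byte
    return bytes_per_hour / 1_000_000_000    # decimal gigabytes


if __name__ == "__main__":
    for rate in (40, 50):
        print(f"{rate} Mbps is about {gigabytes_per_hour(rate):.1f} GB per hour")
```

At those rates an hour of play comes to roughly 18 to 22.5 GB, which is why a lower-bitrate codec looks attractive.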

So how do we overcome this? Well, now we start thinking about how we want to do this. I'm thinking a heavily compressed H265 stream should give us the low bitrate we need to make this happen. On the server side, we can use the hardware accelerated encoding offered by the GPU so that it doesn't impact the game (the hardware encoder is separate from the actual graphics parts of the GPU so, aside from PCIe bandwidth, it shouldn't hurt us too much). I'd like to get away with the Tesla K80 since it's more than powerful enough and it's the cheapest one Google offers, but it's worth checking NVIDIA's support matrix first: the Kepler-generation encoder on the K80 is documented as H264-only, so H265 encoding may need a newer card such as the P4. We'll talk about the other end of the connection in a moment.

I wanted to try streaming H265 using VLC (because I didn't have a better option at this point). I'll cut right to the chase: that didn't work. I knew it was doing something - the CPU was maxed out - but I couldn't open the stream locally. I tried streaming it to a file and it only saved a few seconds, so I suspect that something is broken. It doesn't matter what, because we can't afford to use CPU cycles on transcoding.

The next thing to try was FFMPEG. I cross-compiled FFMPEG for Windows using this script. It took forever, but the result was a fully featured FFMPEG with support baked in for basically everything. I can't redistribute it, unfortunately, but you can compile it yourself in a Docker container or on a regular Linux machine. It comes with support for both AMD and NVIDIA acceleration, so it's probably a good executable to have on hand anyway.
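For a sense of how a build like this might be invoked, here is a minimal sketch that assembles a desktop capture command using the NVIDIA HEVC encoder. The helper name, the 8M bitrate, the 10 second duration, and the `capture.ts` output file are all assumptions for illustration, not the settings used in this post. It only builds and prints the command, so it is safe to run on a machine without a GPU.

```python
def build_capture_command(ffmpeg="ffmpeg", fps=60, bitrate="8M",
                          seconds=10, output="capture.ts"):
    """Build an FFmpeg command line (hypothetical settings) for Windows desktop capture."""
    return [
        ffmpeg,
        "-f", "gdigrab",          # Windows GDI screen capture input device
        "-framerate", str(fps),   # requested capture framerate
        "-i", "desktop",          # capture the whole desktop
        "-c:v", "hevc_nvenc",     # NVIDIA hardware HEVC encoder
        "-b:v", bitrate,          # target video bitrate (assumed value)
        "-t", str(seconds),       # stop after a few seconds for testing
        "-f", "mpegts",           # MPEG-TS can carry HEVC
        output,
    ]


if __name__ == "__main__":
    print(" ".join(build_capture_command()))
```

Writing to a local file first is a cheap way to confirm the encoder works before adding a network hop.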

The problem with FFMPEG is that it's not a server. We can stream to something, but we can't listen for connections from clients. So we need some sort of proxy that negotiates both sides of the video transmission. At first, I used the nginx-rtmp module but discovered that RTMP uses FLV to mux the streams, and FLV doesn't support HEVC. I was able to get a few RTMP servers to stream my desktop to VLC, but they all had horrific lag.

So I then compiled a modified FFMPEG that accepts an HEVC stream inside FLV. I assigned HEVC a codec ID of 12 (I had read in a couple of places that this was common in China). Unfortunately, VLC was unable to figure out what this meant. I might also have to break whatever is decoding it at the other end to force it to read these non-standard FLV files.
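To see why a stock player chokes on this, it helps to look at how FLV labels video. In the FLV format, the first byte of each video tag's payload packs the frame type into the upper four bits and the codec ID into the lower four bits, and the published codec IDs stop at 7 (AVC). The sketch below decodes that byte. The entry for ID 12 reflects the non-standard convention described above and is not part of the FLV specification. The function and table names are made up for illustration.

```python
# Codec IDs defined by the FLV specification.
STANDARD_CODECS = {
    2: "Sorenson H.263",
    3: "Screen video",
    4: "On2 VP6",
    5: "On2 VP6 with alpha",
    6: "Screen video v2",
    7: "AVC (H.264)",
}

# Non-standard extension assumed in this post: ID 12 used for HEVC.
NONSTANDARD_CODECS = {12: "HEVC (non-standard)"}


def parse_video_tag_header(first_byte):
    """Split the first byte of an FLV video tag into (frame_type, codec_id, codec_name)."""
    frame_type = (first_byte >> 4) & 0x0F  # 1 = keyframe, 2 = inter frame
    codec_id = first_byte & 0x0F
    name = STANDARD_CODECS.get(codec_id) or NONSTANDARD_CODECS.get(codec_id, "unknown")
    return frame_type, codec_id, name


if __name__ == "__main__":
    print(parse_video_tag_header(0x17))  # keyframe, AVC
    print(parse_video_tag_header(0x1C))  # keyframe, codec ID 12
```

A player that only knows the standard table has no entry for 12, so both ends of the connection have to agree on the extension.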

The player, in an ideal world, is anything with a screen. But, just because I'm interested in it, I'm going to leverage the 4K HEVC hardware decoding capabilities of the H3 on the Orange Pi. Sunxi maintains the hardware decoding library for that platform. I have yet to look into it, but it appears you can compile this and use it in something like mplayer. I can probably break mplayer to accept these non-standard FLV files, but that remains to be seen.

But what worries me the most is the latency. I'm not sure if that's the fault of GDIGrab, DirectShow, or RTMP itself. I was getting decent performance over H264 and RDP, but this would be far too much lag, so I have to fix that as well.

This will all be fixed at a later date, if ever. I wanted to get a start on this project and, truth be told, I got further than I thought. If you told me a week ago that I'd have cross-compiled a fully hardware accelerated version of FFMPEG for Windows, I wouldn't have believed you. So that's something, I guess.
