058 - Right Back on my Feet

March 15, 2020

In #052 I started refactoring the game's client side to be well tested and to use the specs ECS, motivated by a backend refactor at the end of last year that significantly sped up my ability to add functionality there.

In #056, over a month later, I was feeling a bit drained from going too long without shipping anything.

This week we finally merged the refactor branch and deployed - it felt great.

Client side refactor PR

I've been working on this branch since the end of January, save for two weeks that I spent on vacation. I'm very glad to be able to merge and start deploying frequent improvements again.

The first glimmer of hope

My major goal for this past week was to deploy something, even if it was severely lacking. I knew I could iterate quickly with the new code and get something more compelling up in short order. I just needed to release something and get back the sanity and bliss that frequent deploys provide.

I felt a major sense of joy and relief on Tuesday when I finally got all of the new systems in the game client hooked up.

The client would load, download a manifest that described where it could find different assets, and then dynamically load the assets that it needed. The RenderSystem would push these assets onto the GPU and create a RenderJob every frame that the WebGlRenderer could use to render the game.
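Roughly, the shape of that hand-off looks something like this. RenderSystem, RenderJob and WebGlRenderer are the real names, but the fields and the Renderer trait below are illustrative stand-ins rather than the actual implementation:

// A hypothetical sketch of the per-frame hand-off described above.

/// Everything the renderer needs in order to draw one frame.
/// (Illustrative fields - the real RenderJob isn't shown in this post.)
pub struct RenderJob {
    pub meshes: Vec<MeshToRender>,
}

pub struct MeshToRender {
    pub mesh_id: u32,
    pub texture_atlas_id: u32,
}

/// Hypothetical trait boundary between the game and its renderers,
/// implemented by the WebGlRenderer on the web client.
pub trait Renderer {
    /// Upload any newly loaded assets to the GPU, then draw the frame.
    fn render(&mut self, job: &RenderJob);
}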

Seeing something

Wow. 17:06 on Tuesday, and you have no idea how much of a smile this image brought to my face. It confirms that the game's WebAssembly binary is running without any hitches. The fact that I can see those little white squares means that everything will be okay. I'm unreasonably pumped right now - let's keep pushing and get some real assets in there!

It felt really good.

Asset Compilation Rewrite

I have a command line interface that I use to automate different tasks and aspects of working on Akigi, powered by StructOpt.

$ ak --help
ak 0.0.1
Chinedu Francis Nwafili <frankie.nwafili@gmail.com>
The Akigi Development CLI

USAGE:
    ak <SUBCOMMAND>

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

SUBCOMMANDS:
    asset               Work with our PSD, Blender and other asset files.
    bash-completions    Generate bash completions in $HOME/.tw/
    build               Build non-optimized versions of applications / assets for local development
    db                  Work with our database
    deploy              Deploy our different production services
    dist                Build applications / assets for distribution
    help                Prints this message or the help of the given subcommand(s)
    test                Run unit/integration tests

The command ak asset compile takes all of our asset source files such as .blend, .png and .psd and exports the data into formats that the game expects.
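Wiring up a CLI like this with StructOpt is mostly a matter of deriving it on an enum of subcommands. The real ak definitions aren't shown in this post, so this is just a minimal sketch of the pattern:

// A minimal StructOpt sketch - not the actual `ak` source.

use structopt::StructOpt;

/// The Akigi Development CLI
#[derive(StructOpt)]
#[structopt(name = "ak")]
enum Ak {
    /// Work with our PSD, Blender and other asset files.
    Asset(AssetCmd),
}

#[derive(StructOpt)]
enum AssetCmd {
    /// Export all asset source files into the formats that the game expects.
    Compile,
}

fn main() {
    match Ak::from_args() {
        Ak::Asset(AssetCmd::Compile) => { /* compile assets here */ }
    }
}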

The previous code backing this command was untested and hard to work with. As with all of the older parts of the codebase, I've learned quite a lot about writing maintainable Rust since I first wrote it, and it was due for a good tune-up.

The new code is well tested and much more DRY.

Most of the test cases use the tempfile crate to generate temporary directories, place a .blend file or other asset files into those directories, and then verify that the output directory contains what I expected to be exported.

// An example asset export test

/// Verify that we export the armatures from a .blend file.
///
/// We provide a source Blend file then check the output dir and verify that the output
/// file deserialized back into a valid BlenderArmature.
#[test]
fn exports_armatures_from_blender() -> Result<(), anyhow::Error> {
    let (source_dir, cache_dir, out_dir) = three_temp_dirs()?;

    copy_file_to_source_dir(&mesh_and_armature_blend(), &source_dir)?;

    let content_hashes = process_blender_files(&source_dir, &out_dir, None, None)?;

    let (armature_name, armature_hash) = content_hashes.armatures.iter().next().unwrap();
    assert_eq!(armature_name.as_str(), "SomeArmature");

    let out_dir = out_dir.into_path();
    let armature = std::fs::read(out_dir.join(armature_hash))?;

    let armature: BlenderArmature = bincode::deserialize(armature.as_slice())?;

    assert_eq!(armature.actions.len(), 1);
    assert!(armature.actions.get(&"SomeAction".to_string()).is_some());

    Ok(())
}
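The three_temp_dirs helper isn't shown above. A minimal sketch of one, assuming tempfile's tempdir():

// Hypothetical sketch - the real helper isn't shown in this post.

use tempfile::TempDir;

/// Create temporary source, cache and output directories. They are
/// automatically deleted when the returned TempDirs are dropped.
fn three_temp_dirs() -> Result<(TempDir, TempDir, TempDir), std::io::Error> {
    Ok((tempfile::tempdir()?, tempfile::tempdir()?, tempfile::tempdir()?))
}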

The rectangle-pack crate that I wrote a week or two ago worked like a charm.

I'm packing texture atlases at 2048x2048, 4096x4096, 8192x8192 and 16384x16384, and the client downloads the appropriate atlases based on its device's max texture size.

The asset compilation process generates several different atlases per size to accommodate different detail levels. Higher detail levels use larger textures at the cost of more GPU memory, while lower detail atlases pack in more textures at smaller resolutions but sacrifice visual quality.

I want the game to be playable on older hardware so having the ability to select the quality level of textures that the game uses should come in handy.
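The selection logic boils down to taking the largest generated size that the device supports. A sketch of my own (the real code in the game-app crate isn't shown here):

// Illustrative sketch of atlas size selection - not the actual game-app code.

const ATLAS_SIZES: [u32; 4] = [2048, 4096, 8192, 16384];

/// Pick the largest atlas size that fits within the device's
/// MAX_TEXTURE_SIZE, falling back to the smallest atlas otherwise.
fn atlas_size_for_device(max_texture_size: u32) -> u32 {
    ATLAS_SIZES
        .iter()
        .copied()
        .filter(|&size| size <= max_texture_size)
        .max()
        .unwrap_or(ATLAS_SIZES[0])
}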

// Here's a snippet with the initialization function for the web client
// When we initialize the `web-client` we get the max texture size.
// The `game-app` crate then uses this when determining which assets
// to download.
//
// ---

impl WebClient {
    /// Create a new instance of the WebClient.
    ///
    /// This typically happens once.
    ///
    /// TODO: Pass in the canvas' id instead of hard coding it so that we can embed the
    /// web client on any site should we want to (i.e. I could embed it into the dev journals).
    #[wasm_bindgen(constructor)]
    pub fn new() -> WebClient {
        #[cfg(not(feature = "production"))]
        console_log::init_with_level(Level::Debug);

        console_error_panic_hook::set_once();

        let canvas = get_akigi_canvas();
        let gl = get_webgl_context(&canvas);

        let max_texture_size = gl
            .get_parameter(WebGlRenderingContext::MAX_TEXTURE_SIZE)
            .unwrap()
            .as_f64()
            .unwrap() as u32;

        let app = App::new(ClientSpecificResources {
            device_info: DeviceInfoResource::new_with_device_max_texture_size(max_texture_size),
            game_server_connection: ClientSpecificGameServerConn(Box::new(
                WebClientGameServerConn::new(),
            )),
            renderer: ClientSpecificRenderer(Box::new(WebGlRenderer::new(gl).unwrap())),
            asset_loader: ClientSpecificAssetLoader(Box::new(WebClientAssetLoader::new())),
            system_time: ClientSpecificSystemTimeFn(Box::new(|| {
                perf_to_system(performance().now())
            })),
        });

        WebClient {
            app: Rc::new(RefCell::new(app)),
        }
    }
}

Terrain chunks are loaded dynamically based on the camera's location.

Right now the terrain is powered by files that look like this:

$ ls game-assets/terrain/
wh_15_c_0_0.blendmap.png        wh_15_c_0_0.chunk.yml           wh_15_c_0_0.heightmap.png

The asset compiler takes these and generates TerrainChunk structs that pack together everything you need to know about a piece of terrain and the tiles within it.

I'm not worrying about levels of detail for terrain at the moment, but I built things in a way that makes them easy to add later if they turn out to be necessary.
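A speculative sketch of what a TerrainChunk might hold - the real struct's fields aren't shown in this post, so everything below is an assumption based on the files above:

// Speculative sketch - the actual TerrainChunk fields aren't shown here.

/// Everything you need to know about one piece of terrain.
struct TerrainChunk {
    /// Per-vertex heights, decoded from the *.heightmap.png.
    heights: Vec<f32>,
    /// Per-tile texture blend weights, decoded from the *.blendmap.png.
    blend_weights: Vec<[u8; 4]>,
}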

Deploying

I first introduced continuous deployment in #038 back in July 2019.

It's been great, except for one big caveat.

The website, authentication server and payment server used AWS ECS, but the game-server was deployed to EC2.

The ECS deployments were easy, the EC2 deployments weren't.

This was because the game-server maintains long-lived WebSocket connections, which aren't a good fit for ECS. The game-server also has special needs: I can't just update it while players are connected, so it needs to know ahead of time when it's going to be terminated so that it can notify connected players.

I had a couple of crates and Terraform scripts that handled provisioning the EC2 instance and deploying new versions of the game server, but it never felt as easy as pushing a container to ECS and being done.

I was also wary of investing further in that process when there were already tools out there that could handle my deployment needs.

When we deployed our code this week the game-server was crashing whenever a user logged out.

This was odd because I couldn't replicate it locally and that area of the code had both unit and integration tests that were passing. It was a classic "works on my machine" issue.

I spent a few hours investigating and then eventually decided that it was time to invest in true dev-prod parity.

I turned to Kubernetes, something I'd been considering for a couple of months, and after a lot of struggling and troubleshooting I now have a cluster running locally and am close to deploying one to production.

So from now on our game server will run in Docker and be deployed to a Kubernetes cluster on AWS EKS.
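One concrete benefit: Kubernetes sends a container SIGTERM and waits a configurable grace period before killing it, which maps nicely onto the game-server's need to warn players before going down. A minimal sketch of handling that signal, assuming the signal-hook crate (the real shutdown code isn't shown in this post):

// Hedged sketch of graceful shutdown on SIGTERM - not the actual game-server code.

use signal_hook::{consts::SIGTERM, iterator::Signals};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut signals = Signals::new(&[SIGTERM])?;

    std::thread::spawn(move || {
        for _signal in signals.forever() {
            // Hypothetical hook: tell connected players that the server
            // is about to go down, then begin draining connections.
            notify_players_of_shutdown();
        }
    });

    run_game_server()
}

// Hypothetical stand-ins for the real server functions.
fn notify_players_of_shutdown() {}
fn run_game_server() -> Result<(), Box<dyn std::error::Error>> {
    Ok(())
}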

I'm happy about this because it means that I won't be inventing my own deploy process - I can leverage existing, tried-and-true tooling that is much more robust than my hodgepodge of scripts.

Other Progress

  • I learned a boatload about networking while setting up the Kubernetes cluster.

Next Week

At the top of the week I'm focused on deploying the game to AWS EKS and tweaking our CI/CD process now that we no longer use AWS ECS.

I'll also start the process of moving away from CircleCI towards GitHub Actions - which I like much better.

Then mid-week I'll be back to working on the actual game, starting with the game's start area and adding back some things that I haven't yet brought into the refactored client.

After that I'll take a couple of hours to take stock and prioritize what to work on, with a goal of launching on April 9th and iterating publicly from there on out.


Cya next time!

- CFN