Author Archives: clemenswinter

The Simple Beauty of XOR Floating Point Compression

I recently implemented a small program to visualize the inner workings of a scheme that compresses floating point timeseries by XORing subsequent values. The resulting visualizations are quite neat and made it much easier for me to understand this beautiful algorithm than any of the explanations that I had previously encountered.

Hacker News (280 points, 68 comments)

The algorithm

The algorithm¹ is simple. We start by writing out the first floating point number in full. All subsequent numbers are XORed with the previous number and then encoded in one of three ways:

Continue reading →
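
As a rough illustration of the XOR step described in the excerpt above, here is a minimal Python sketch. It covers only the XORing of each value with its predecessor; the three encodings are not part of this excerpt and are not reproduced here. The function names (`float_to_bits`, `xor_stream`, `undo_xor_stream`) and the choice of 64-bit floats are assumptions made for illustration, not taken from the original post.

```python
import struct


def float_to_bits(x: float) -> int:
    """Reinterpret a 64-bit float as an unsigned 64-bit integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def bits_to_float(b: int) -> float:
    """Reinterpret an unsigned 64-bit integer as a 64-bit float."""
    return struct.unpack(">d", struct.pack(">Q", b))[0]


def xor_stream(values):
    """Keep the first value's bits in full; XOR every later value with its predecessor."""
    bits = [float_to_bits(v) for v in values]
    return bits[:1] + [cur ^ prev for prev, cur in zip(bits, bits[1:])]


def undo_xor_stream(xored):
    """Invert xor_stream by XORing each word with the previously recovered bits."""
    values = []
    prev = 0
    for i, word in enumerate(xored):
        prev = word if i == 0 else prev ^ word
        values.append(bits_to_float(prev))
    return values


if __name__ == "__main__":
    series = [1.0, 1.0, 1.5]
    for word in xor_stream(series):
        # Print each word as 64 binary digits to make the zero runs visible.
        print(f"{word:064b}")
```

Identical neighbours XOR to zero, and values that are close to each other tend to share their sign, exponent, and leading mantissa bits, so the XORed words contain long runs of zero bits that a subsequent encoding step can exploit.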

Entity-Based Reinforcement Learning

Deep reinforcement learning is a powerful technique for creating effective decision-making systems, but its complexity has hindered widespread adoption. Despite the perceived cost of RL, a wide range of interesting applications are already feasible with current techniques. The main barrier to broader use of RL is now the lack of accessible tooling and infrastructure. In this blog post, I introduce the Entity Neural Network (ENN) project, a set of libraries designed to simplify the process of applying RL to complex simulated environments, significantly reducing the required engineering effort, computational cost, and expertise. The enn-trainer RL framework can be seamlessly integrated with complex simulators without requiring users to write custom training code or network architectures. Furthermore, the framework can train RL policies at a rate of more than 100,000 samples/s on a single consumer GPU, achieves similar performance to the IMPALA network architecture while using 50x fewer parameters and 30x-600x fewer FLOPs, and produces RL agents that are efficient enough to be run in real-time on the CPU inside web browsers.

Machine Learning and the Future of Video Games

The rapid progress in deep reinforcement learning (RL) over the last few years holds the promise of fixing the shortcomings of computer opponents in video games and of unlocking entirely new regions in game design space. However, the exorbitant engineering effort and hardware investments required to train neural networks that master complex real-time strategy games might lead to the impression that the commercial viability of deep RL is still far in the future. On the contrary, I will argue in Part 1 of this essay that these techniques are eminently practical and may see widespread adoption within the next decade. Part 2 presents a case study in which I use deep RL to improve the design of a real-time strategy game. Finally, in Part 3, I speculate about the many ways in which machine learning will impact video games in the years to come.

Conjuring a CodeCraft Mind

Within these pages are recorded my attempts to wield the highest arcane art and conjure minds that play the game of CodeCraft.

My Reinforcement Learning Learnings

I spent a good chunk of my time over the last two years applying deep reinforcement learning techniques to create an AI that can play the CodeCraft real-time strategy game. My primary motivation was to learn how to tackle nontrivial problems with machine learning and become proficient with modern auto-differentiation frameworks. Thousands of experiment runs and hundreds of commits later, I still have much to learn but like to think that I have picked up a trick or two. This blogpost gives an overview of the workflows and intuitions I adopted over the course of working on CodeCraft in the hope that they will prove useful to anyone else looking to pursue similar work. For a very different take on the same material that provides motivating examples for many of the ideas summarized here, check out my dark fantasy machine learning poem “Conjuring a CodeCraft Mind”.

Mastering Real-Time Strategy Games with Deep Reinforcement Learning: Mere Mortal Edition

The capabilities of game-playing AIs have grown rapidly over the last few years. This trend has culminated in the defeat of top human players in the complex real-time strategy (RTS) games of Dota 2 [1] and StarCraft II [2] in 2019. Alas, the exorbitant engineering and compute resources employed by these projects have made their replication difficult. As a result, the application of deep reinforcement learning methods to RTS games has remained disappointingly rare. In an attempt to remedy this sad state of affairs, this article demonstrates how you can use deep reinforcement learning to train your very own sociopaths for a nontrivial RTS game within hours on a single GPU. We achieve this by employing an array of techniques that includes a novel form of automatic domain randomization, curricula, canonicalization of spatial features, an omniscient value function, and a network architecture designed to encode task-specific invariants.

Removing Algae from my Water Loop

This post documents the eradication of algae from my water-cooled 4-GPU build.

Building the Ultimate Deep Learning Workstation

This blogpost documents the addition of a full custom water loop and two RTX 2080 Ti GPUs to my previous build. This work was done back in November/December of 2019, but I didn’t get around to doing the write-up until now.

32 Core Threadripper + Dual 2080 Ti Build Log

This post documents my recent desktop PC build featuring a 32-core Threadripper processor and two 2080 Ti graphics cards. It is specced to allow for another upgrade to four 2080 Ti graphics cards with a full custom water loop. I plan to use it mostly for machine learning and simulating reinforcement learning environments. As an added bonus, it can also run Crysis at 20 FPS.

How to Read 100s of Millions of Records per Second from a Single Disk

This article gives an overview of the implementation and performance of two recent additions to LocustDB, an extremely fast open-source analytics database built in Rust. The first addition is support for persistent storage; the second is an LZ4 compression pass for data stored on disk or cached in memory. Benchmarking LocustDB on a dataset of 1.46 billion taxi rides demonstrates excellent performance for cold queries, reaching > 95% of sequential read speed on SSD and > 70% of sequential read speed on HDD. Compared to ClickHouse, queries with similar data volume are as fast or faster, and disk space usage is reduced by 40%.
