Category Archives: LocustDB

How to Read 100s of Millions of Records per Second from a Single Disk

This article gives an overview of the implementation and performance of two recent additions to LocustDB, an extremely fast open-source analytics database built in Rust. The first addition is support for persistent storage, the second is an lz4 compression pass for data stored on disk or cached in memory. Benchmarking LocustDB on a dataset of 1.46 billion taxi rides demonstrates excellent performance for cold queries, reaching > 95% of sequential read speed on SSD and > 70% of sequential read speed on HDD. Comparing to ClickHouse, queries with similar data volume are as fast or faster, and disk space usage is reduced by 40%.

Continue reading

How to Analyze Billions of Records per Second on a Single Desktop PC

Introduction

This article gives an overview of LocustDB [1], a new and extremely fast open-source analytics database built in Rust. Part 1 gives some background on analytical query systems and my goals for LocustDB. Part 2 presents benchmark results produced on a data set of 1.46 billion taxi rides and demonstrates median speedups of 3.6x over ClickHouse and 2.1x over tuned kdb+. Part 3 is an architecture deep dive that follows two SQL statements through the guts of the query engine. Part 4 concludes with random learnings, thoughts on Rust, and other incoherent ramblings. Continue reading