Category: performance

Two panels. Top: a single MinIO bucket overflowing with small files, a worried Tux, and a dead directory tree on a tombstone. Bottom: the same objects split across four prefix-partitioned buckets under separate folders, with a healthy green directory tree.

MinIO on XFS: Inode Exhaustion and Prefix Design for Lots of Small Files

MinIO has no NameNode. It puts the namespace on XFS as real directories and xl.meta files. With Lots of Small Files, that means you can exhaust inodes while df -h still looks fine, and a single flat leaf directory can stall PUT, LIST, the scanner, and ILM together. Here is the on-disk model, the inode math, and a prefix recipe that keeps XFS inside a regime you can operate.

By jlu, 6 days ago (2026-07-27)

Erasure coding is efficient on big files but degrades to many inefficient copies on small files

MinIO

MinIO and Small Files: When Erasure Coding Becomes 15x Replication

MinIO fixed the HDFS NameNode limit, but it has no index: it writes one xl.meta per object on every drive of the erasure set. On 5 servers of 12 NVMe, MinIO picks a 15-wide set by default, so each small object is stored 15 times over. A worked sizing that looks fine for two years and dies in days.

By jlu, 1 week ago (2026-07-23)

Abstract visualization of Sail engine bridging Rust and Spark technologies

apache spark

Sail: When Apache Spark Meets Rust (A Practitioner’s Deep Dive)

Sail is an open-source Apache Spark replacement written in Rust. It drops the JVM, speaks Spark Connect, and runs 4 to 6 times faster than Spark with native accelerators on ClickBench. A deep dive into its architecture, benchmarks, and production readiness.

By jlu, 2 months ago

Abstract illustration of GPU data flow and infrastructure optimization with geometric chip patterns

Why Your GPU Infrastructure Costs 40% More Than It Should

Most AI infrastructure teams spend 35-60% more on GPU compute than they need to. The cause isn’t cloud pricing. It is architecture, and it is fixable.

By jlu, 2 months ago

Abstract grid of glowing compute cells densely packed into reserved cluster capacity

apache spark

Spark on Kubernetes Reserves CPU It Never Uses. Here’s the Overcommit Fix.

Spark sets executor CPU requests equal to limits, so a Kubernetes cluster reserves twice the CPU it uses and refuses to schedule pending pods. Kubernetes has no native overcommit. Here is the mutating-webhook operator I use to fix it.

By jlu, 2 months ago

data

Stop using MinIO as a NoSQL database — why S3 object stores collapse on small-file workloads

Two MinIO platforms, same root cause: used as a NoSQL store. Field notes on LIST IOPS, XFS directory limits, scanner & heal SLAs, the erasure-coding storage-efficiency inversion on small objects, and why Apache Cassandra (or Ceph) is the right answer on-prem in 2026.

By jlu, 3 months ago (2026-05-17)

Tesla’s .SMOL Format Shows Why Most Enterprise Data Lakes Are Architecturally Wrong

When Tesla published patent WO2024073080 describing a new file format internally called “.smol”, the headline was simple: 4x reduction in IOPS for AI training. Most people read this as a hardware story. It isn’t. It’s a data architecture story. And it exposes a structural weakness in how most enterprise data …

By jlu, 6 months ago (2026-02-11)

GPUs Changed Everything. Storage Is the Bottleneck Again.

GPUs are no longer the bottleneck. Data movement is. I recently attended an online talk that stayed with me longer than most. Not because of a new GPU announcement, but because it clearly articulated something I have seen repeatedly over the years, across very different systems. That message, strongly emphasized by …

By jlu, 6 months ago (2026-01-23)