Pepitedata

performance

apache spark

Spark on Kubernetes Reserves CPU It Never Uses. Here’s the Overcommit Fix.

Spark sets executor CPU requests equal to limits, so a Kubernetes cluster reserves twice the CPU it uses and refuses to schedule pending pods. Kubernetes has no native overcommit. Here is the mutating-webhook operator I use to fix it.

By jlu, 1 week ago
Diagram: MinIO erasure-coding storage inflation on small objects — per-drive xl.meta metadata replicated across the erasure set
data

Stop using MinIO as a NoSQL database — why S3 object stores collapse on small-file workloads

Two MinIO platforms, same root cause: used as a NoSQL store. Field notes on LIST IOPS, XFS directory limits, scanner & heal SLAs, the erasure-coding storage-efficiency inversion on small objects, and why Apache Cassandra (or Ceph) is the right answer on-prem in 2026.

By jlu, 3 weeks ago (2026-05-17)
AI

Tesla’s .SMOL Format Shows Why Most Enterprise Data Lakes Are Architecturally Wrong

When Tesla published patent WO2024073080 describing a new file format internally called “.smol”, the headline was simple: 4x reduction in IOPS for AI training. Most people read this as a hardware story. It isn’t. It’s a data architecture story. And it exposes a structural weakness in how most enterprise data …

By jlu, 4 months ago (2026-02-11)
AI

GPUs Changed Everything. Storage Is the Bottleneck Again.

GPUs are no longer the bottleneck. Data movement is. I recently attended an online talk that stayed with me longer than most. Not because of a new GPU announcement, but because it clearly articulated something I have seen repeatedly over the years, across very different systems. That message, strongly emphasized by …

By jlu, 5 months ago (2026-01-23)