Pepitedata

Blog

apache spark

Spark on Kubernetes Reserves CPU It Never Uses. Here’s the Overcommit Fix.

Spark sets executor CPU requests equal to limits, so a Kubernetes cluster reserves twice the CPU it uses and refuses to schedule pending pods. Kubernetes has no native overcommit. Here is the mutating-webhook operator I use to fix it.

By jlu, 1 week ago
data

Stop using MinIO as a NoSQL database — why S3 object stores collapse on small-file workloads

Two MinIO platforms, same root cause: used as a NoSQL store. Field notes on LIST IOPS, XFS directory limits, scanner & heal SLAs, the erasure-coding storage-efficiency inversion on small objects, and why Apache Cassandra (or Ceph) is the right answer on-prem in 2026.

By jlu, 3 weeks ago (2026-05-17)
data

When the Lakehouse Works, but the Operating Model Breaks… Especially On-Premises

Note: This article was inspired by a LinkedIn post by Can Sinan A. on the hidden operational cost of an Iceberg migration, especially when governance moves from one access-control plane to several layers: catalog, storage, and engine configuration. Introduction: Many …

By jlu, 1 month ago (2026-05-04)
AI

Tesla’s .SMOL Format Shows Why Most Enterprise Data Lakes Are Architecturally Wrong

When Tesla published patent WO2024073080 describing a new file format internally called “.smol”, the headline was simple: 4x reduction in IOPS for AI training. Most people read this as a hardware story. It isn’t. It’s a data architecture story. And …

By jlu, 4 months ago (2026-02-11)
ceph

Migrating Unmanaged OSDs to Managed OSDs in Ceph Squid

If you are running Ceph Squid with a mix of managed and unmanaged OSDs, there is one thing worth stating clearly upfront: there is no in-place adoption mechanism in Squid. You cannot “import” an existing OSD into the orchestrator. If …

By jlu, 4 months ago (2026-02-05)
history

A bit of family history: Pierre Laurenceau, physicist

Pierre Laurenceau is a French physicist and inventor, active mainly between the 1950s and the 1970s. He worked at ESPCI Paris and contributed to the experimental physics of electromagnetic fields, to non-intrusive instrumentation, and to several developments …

By jlu, 5 months ago (2026-01-25)
AI

GPUs Changed Everything. Storage Is the Bottleneck Again.

GPUs are no longer the bottleneck. Data movement is. I recently attended an online talk that stayed with me longer than most. Not because of a new GPU announcement, but because it clearly articulated something I have seen repeatedly over the …

By jlu, 5 months ago (2026-01-23)
apache spark

Apache Spark smoke tests on Kubernetes

Introduction: Benchmarking remains a critical (and often underestimated) tool when designing or validating large-scale data platforms. While many teams rely on synthetic workloads or production replays, standardized benchmarks still play a key role when comparing architectures, tuning clusters, or validating …

By jlu, 5 months ago (2026-01-12)
data

The Hidden Costs of Using HDDs in On-Premises MinIO Deployments

Bring the IOPS, dude! As a solutions architect working with MinIO storage solutions, I’ve seen firsthand the challenges that come with on-premises deployments using hard disk drives (HDDs). While HDDs may seem like a cost-effective option initially, they can …

By jlu, 6 months ago (2025-12-15)
data

MinIO’s Community Edition: The End? 🚨

Beware the AGPL! You may end up with a product orphaned from its S3 storage… MinIO has long been a popular open-source S3-compatible storage solution, but their strategy has shifted dramatically, especially since the release of their new AIstor product, now positioned …

By jlu, 6 months ago (2025-12-15)
