Pepitedata

Blog

AI

Tesla’s .SMOL Format Shows Why Most Enterprise Data Lakes Are Architecturally Wrong

When Tesla published patent WO2024073080 describing a new file format internally called “.smol”, the headline was simple: 4x reduction in IOPS for AI training. Most people read this as a hardware story. It isn’t. It’s a data architecture story. And Read more

By jlu, 1 month ago (2026-02-11)
ceph

Migrating Unmanaged OSDs to Managed OSDs in Ceph Squid

If you are running Ceph Squid with a mix of managed and unmanaged OSDs, there is one thing worth stating clearly upfront: There is no in-place adoption mechanism in Squid. You cannot “import” an existing OSD into the orchestrator. If Read more

By jlu, 2 months ago (2026-02-05)
history

A Bit of Family History: Pierre Laurenceau, Physicist

Pierre Laurenceau was a French physicist and inventor, active mainly between the 1950s and the 1970s. He worked at ESPCI Paris and contributed to the experimental physics of electromagnetic fields, to non-intrusive instrumentation, and to several developments Read more

By jlu, 2 months ago (2026-01-25)
AI

GPUs Changed Everything. Storage Is the Bottleneck Again.

GPUs are no longer the bottleneck. Data movement is. I recently attended an online talk that stayed with me longer than most. Not because of a new GPU announcement, but because it clearly articulated something I have seen repeatedly over the Read more

By jlu, 2 months ago (2026-01-23)
apache spark

Apache Spark Smoke Tests on Kubernetes

Introduction: Benchmarking remains a critical (and often underestimated) tool when designing or validating large-scale data platforms. While many teams rely on synthetic workloads or production replays, standardized benchmarks still play a key role when comparing architectures, tuning clusters, or validating Read more

By jlu, 2 months ago (2026-01-12)
data

The Hidden Costs of Using HDDs in On-Premises MinIO Deployments

Bring the IOPS, dude! As a solutions architect working with MinIO storage solutions, I’ve seen firsthand the challenges that come with on-premises deployments using hard disk drives (HDDs). While HDDs may seem like a cost-effective option initially, they can Read more

By jlu, 3 months ago (2025-12-15)
data

MinIO’s Community Edition: The End? 🚨

Beware the AGPL! You may end up with a product orphaned from its S3 storage… MinIO has long been a popular open-source S3-compatible storage solution, but their strategy has shifted dramatically, especially since the release of their new AIstor product, now positioned Read more

By jlu, 3 months ago (2025-12-15)
data

Who’s Using Ceph RBD Mirroring for Kubernetes Storage in Production?

Alpha Feature in Production… Good Idea? The most promising approach, journal-based mirroring, offers near real-time replication and faster failover. However, it’s currently an alpha feature in the Ceph CSI driver and relies on rbd-nbd, which introduces significant risks: For documentation Read more

By jlu, 3 months ago (2025-12-15)
ceph

Choosing the Right Storage Backend for Kubernetes PVCs

Most discussions about PVC focus on on-prem deployments, but many of these technologies are equally relevant in the cloud, especially when using managed block storage like AWS EBS, which caps at around 64K IOPS. By contrast, OpenEBS LocalPV (ZFS or Read more

By jlu, 3 months ago (2025-12-15)
data

🚀 Just discovered TigerBeetle DB – and it’s redefining what’s possible in financial database performance

As someone who’s spent years optimizing data storage performance, I’m genuinely impressed by TigerBeetle’s bold architectural choices: https://lnkd.in/eS3bfHHQ 🎯 Single-core design for maximum throughput. Instead of fighting contention with complex locking, they funnel all writes through one optimized core. For high-contention Read more

By jlu, 3 months ago (2025-12-15)
