Issue 44 - The Reusability Lag

Benoit Pimpaud

Feb 9

Being pragmatic

4 Comments

Fred de Villamil

Mar 5

We've been using DuckDB in production for a while now, and tested several approaches.

For now, the most (cost) efficient way we have found is to stream Parquet files from an NFS-based host.

The object storage model is way too expensive as you're paying on a per-query basis. An NFS file server makes it much easier, especially on the streaming part.

Benoit Pimpaud

Mar 5

Oh interesting. Do you have many concurrent read/write IOs there?

Also wondering how you manage file versioning (but I can think of a snapshot/archive strategy, as storage there is very cheap...)?

Fred de Villamil

Mar 5

Small world: you live / work 10 minutes from my place and my boss is an investor in your company 🤣.

Not that much concurrency, but we read thousands of files in a single user request. And we don't need versioning, as these files are the final result of aggregated data.

Benoit Pimpaud

Mar 12

Ahah I have DM'd you on LinkedIn 😉

From An Engineer Sight
