|
| jwr wrote:
| I just implemented a database with changefeeds using FoundationDB
| (in Clojure), to eventually replace RethinkDB in my system. Very
| impressed so far.
| jbverschoor wrote:
| It's unfortunate that they went silent for years after the Apple
| acquisition. That period was key for database adoption. I have
| the feeling everybody kind of settled for pgsql.
| threeseed wrote:
| > I have the feeling everybody kind of settled for pgsql.
|
| That's probably because of spending time in this echo chamber.
|
| In reality everyone has likely been staying with the same
| databases they know and love but just moved to the cloud. It's
| why now AWS for example offers such a wide variety of databases
| e.g. MySQL, PostgreSQL, SQL Server, Oracle, MongoDB, Cassandra,
| Redis.
| eloff wrote:
| Those are two completely non overlapping use cases. If you can
| use pgsql for your problem, you have no business trying to use
| a distributed key value store instead. That would be at least
| as dumb as driving screws with a hammer.
| cwp wrote:
| Yeah, but there are quite a few efforts out there to extend
| PG into a distributed DB of one flavor or another. Some
| examples are YugabyteDB, CockroachDB, Aurora and Citus. It's
| a reasonable approach, but it's also reasonable to come at it
| from the other direction - build a SQL engine on top of a
| solid distributed key-value store. Counterfactuals are always
| dicey, but FDB vanishing behind the Apple wall of silence
| sure didn't help.
| eyelovewe wrote:
| CouchDB 4 is built upon Foundation FWIW
| jbverschoor wrote:
| Didn't know, very happy to hear
| rubyn00bie wrote:
| Here's one of my favorite articles on FoundationDB, where it
| (FDB) passes Jepsen first try:
| https://web.archive.org/web/20150312112556/http://blog.found...
|
| > I ran FoundationDB Key-Value Store through every nemesis in
| Jepsen - including those that found failures in other databases -
| and FoundationDB passed all of them with flying colors.
|
| FoundationDB is one of the coolest pieces of technology I've used
| in the past decade. The tuple keyspace is incredibly useful, so
| are the multi-key transactions. I've physically killed the power
| on an FDB node and FDB cluster; multiple times (heh, home
| servers)... and _every_ time the cluster or node just comes back.
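| A minimal sketch of why a tuple keyspace is handy, assuming a toy
| in-memory ordered store. OrderedKV and get_range are hypothetical
| names, and Python's own tuple ordering stands in for the
| order-preserving byte encoding a real tuple layer would use:

```python
import bisect


class OrderedKV:
    """Toy ordered key-value store keyed by tuples (hypothetical, in-memory)."""

    def __init__(self):
        self._keys = []    # tuple keys, kept sorted
        self._values = {}  # key -> value

    def set(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get_range(self, prefix):
        """Return every (key, value) pair whose key starts with `prefix`."""
        # A prefix sorts before all longer tuples that extend it, so the
        # matching keys form one contiguous run starting here.
        start = bisect.bisect_left(self._keys, prefix)
        out = []
        for key in self._keys[start:]:
            if key[:len(prefix)] != prefix:
                break
            out.append((key, self._values[key]))
        return out


kv = OrderedKV()
kv.set(("user", 42, "name"), "alice")
kv.set(("user", 42, "email"), "alice@example.com")
kv.set(("user", 7, "name"), "bob")
kv.set(("order", 1, "total"), 100)

# Everything stored about user 42 comes back from one contiguous range read.
user_42 = kv.get_range(("user", 42))
```

| Because related keys sort next to each other, "all fields of one
| entity" becomes a single range read rather than many point lookups.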
| gregwebs wrote:
| That's great that you are doing your own resiliency testing.
|
| Having someone other than those officially on the Jepsen
| project run the Jepsen test is a good start. However, many
| databases have claimed to run the Jepsen tests themselves and
| pass, but when there is an actual paid engagement for a
| distributed database there are always issues that are found.
| That's generally true even for unpaid official runs as well
| although Zookeeper did pass existing tests. Every database is
| different and the paid engagement will build specific tests
| designed to break the database in question.
| kendallgclark wrote:
| This was the FDB team's stock demo in the early days. It's a
| killer move.
| jFriedensreich wrote:
| I am pretty sure that the new cloudant transaction/storage engine
| is also based on foundationDB, which powers a lot of things
| behind the scenes at ibm. And couchdb 4 with foundationDB storage
| engine is hopefully not too far out either. Let's see how long
| this whole transition takes, but i am still hopeful that the
| mindshare and motivation of apple, snowflake, ibm and apache
| community will lead to something great.
| jorangreef wrote:
| Markus Pilman from Snowflake did an awesome talk on
| FoundationDB's testing at CMU's Quarantine Tech Talks (2020), How
| I Learned to Stop Worrying and Trust the Database:
|
| https://www.youtube.com/watch?v=OJb8A6h9jQQ
| sgk284 wrote:
| Here's another excellent talk at Strangeloop on FoundationDB's
| simulation testing by Will Wilson in 2014:
| https://www.youtube.com/watch?v=4fFDFbi3toc
| jtdev wrote:
| I'd love to see a good primer on data models and scenarios that
| are well suited to FDB.
| selljamhere wrote:
| Their docs might be a good place to start.
| https://apple.github.io/foundationdb/developer-guide.html#da...
| sigstoat wrote:
| this is limited by your creativity and willingness to make
| tradeoffs.
|
| the only really general statement i can think of is that the
| "larger"/"longer" your transactions are, the harder a time
| you'll have getting it to cooperate with FDB. "small"/"fast"
| transactions will be easier to fit into its model.
|
| (to likely replies: this isn't an absolute, see all the quotes.
| yes things like redwood will alleviate some of this, but not
| all.)
| vvern wrote:
| IIRC fdb is fully optimistic concurrency control. It doesn't
| do any locking. If you have workloads which are highly
| contended, you'll need to do something in the layer above to
| coordinate. Otherwise, performance will be unbearable.
|
| This may be out-dated, please let me know if the story has
| evolved here.
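| A minimal sketch of optimistic concurrency control as described
| above, assuming a toy single-process store. Store, Transaction,
| Conflict and run are hypothetical names, not FDB's API. Nothing is
| locked; a transaction records what it read and is rejected at commit
| if any of those keys changed in the meantime:

```python
class Conflict(Exception):
    """Raised when a key read by the transaction changed before commit."""


class Store:
    """Toy in-memory store: each key maps to (value, commit_version)."""

    def __init__(self):
        self.data = {}
        self.version = 0

    def begin(self):
        return Transaction(self)


class Transaction:
    def __init__(self, store):
        self.store = store
        self.read_version = store.version  # store version when we started
        self.reads = set()
        self.writes = {}

    def get(self, key, default=None):
        self.reads.add(key)
        if key in self.writes:
            return self.writes[key]
        value, _ = self.store.data.get(key, (default, 0))
        return value

    def set(self, key, value):
        self.writes[key] = value  # buffered until commit

    def commit(self):
        # Validation: fail if any key we read was committed after we started.
        for key in self.reads:
            _, version = self.store.data.get(key, (None, 0))
            if version > self.read_version:
                raise Conflict(key)
        self.store.version += 1
        for key, value in self.writes.items():
            self.store.data[key] = (value, self.store.version)


def run(store, body, max_retries=10):
    """Re-run `body` in a fresh transaction until it commits without conflict."""
    for attempt in range(max_retries):
        tx = store.begin()
        try:
            result = body(tx)
            tx.commit()
            return result, attempt
        except Conflict:
            continue
    raise RuntimeError("gave up: too much contention")


def increment(tx):
    value = tx.get("counter", 0) + 1
    tx.set("counter", value)
    return value


store = Store()
t1, t2 = store.begin(), store.begin()  # two overlapping transactions
increment(t1)
increment(t2)
t1.commit()                            # first committer wins
try:
    t2.commit()
    conflicted = False
except Conflict:
    conflicted = True                  # t2 read "counter" before t1 changed it

result, attempts = run(store, increment)
```

| Under heavy contention on one key most attempts end up in the retry
| path, which is why hot keys usually need coordinating in the layer
| above.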
| georgelyon wrote:
| FDB is an awesome and unique piece of software (I attribute quite
| a bit of Snowflake's success to FDB). I've also had the pleasure
| of meeting some folks from the original team and they are true
| engineers. Does anyone know if/when Redwood (the new storage
| engine) has landed / will land?
| victor106 wrote:
| > I attribute quite a bit of Snowflake's success to FDB
|
| How so?
| foobiekr wrote:
| Snowflake is the biggest deployment of fdb in the world after
| iCloud.
| kendallgclark wrote:
| Founders are building a distributed systems simulation product
| now called Antithesis. My data fabric startup, Stardog, is a
| happy Antithesis early adopter customer. It's helping us
| reproduce and fix non-deterministic bugs deterministically.
| Good stuff.
| twoodfin wrote:
| Did they ever implement a SQL layer? They seemed like one of the
| only NoSQL products with the architecture to make it plausible to
| do so.
| polskibus wrote:
| What is the backup / restore story in FoundationDB? How does it
| compare to postgresql?
| ex3ndr wrote:
| Much, much better. Single-line backup/restore, and a Disaster
| Recovery mode that syncs a second DC and is able to switch on the
| fly with barely any config (except one file).
| e12e wrote:
| This seems like a good place to ask - are there any new and
| exciting FOSS "applications" worth checking out? I recall from the
| initial publication of the source - there were references to a
| great SQL layer? I don't know if a FOSS work-alike ever
| materialized? Other things I'd hoped for were a network
| filesystem/blob layer, like maybe s3/nfs/webdavfs compatible?
| What are people building on top of foundationdb today?
|
| Ed: i suppose various document/db applications - like IMAP might
| be a good fit too?
| jFriedensreich wrote:
| large unstructured blobs and large files are among the things
| not well suited to foundationdb and couchdb 4 actually reduced
| supported blob size in the transition to foundationdb. it looks
| like object/blob storage systems are at the moment rather
| separating more from key/value and document storage than
| growing together. but this is a good thing because the
| tradeoffs are very different and it allows each system to focus
| on what it does best. blob stores will hopefully move even more
| to content addressing and merkle dag similar to git and ipfs.
| agency wrote:
| I'm curious about this as well. Is anyone working on building
| text search on top of FDB? It's kind of astounding to me that
| last time I checked Elasticsearch was still essentially the
| only game in town.
| jFriedensreich wrote:
| it's pretty hard to catch up with lucene, there is just so
| much work, features and brainpower in there at this point. as
| many features of foundationdb such as the transaction
| guarantees and reliability are not super important for
| fulltext search i cannot imagine any company even apple or
| ibm being able to justify that gigantic investment, instead
| i'm sure nearly any solution will continue to use lucene under
| the hood for the foreseeable future.
| sigstoat wrote:
| peruse the fdb forum. they produce document and record layers
| now. there are community layers of varying quality for a
| network block device, a filesystem, and a few other things.
| AtlasBarfed wrote:
| They got acquihired by Apple, didn't they? Was FDB ever OSS'd?
|
| Is it CP or AP? Comments seem to imply AP
| ssgao wrote:
| FoundationDB is Apache 2.0
| https://github.com/apple/foundationdb/blob/master/LICENSE
|
| It is CP per
| https://apple.github.io/foundationdb/cap-theorem.html
| kendallgclark wrote:
| It wasn't an acquihire. Apple paid a lot of $$ for FDB.
| [deleted]
| ryanworl wrote:
| Two quotes from the paper that I think will motivate people to
| read it:
|
| "Rigorous correctness testing via simulation makes FDB extremely
| reliable. In the past several years, CloudKit [59] has deployed
| FDB for more than 0.5M disk years without a single data
| corruption event. Additionally, we constantly perform data
| consistency checks by comparing replicas of data records and
| making sure they are the same. To this date, no inconsistent data
| replicas have ever been found in our production clusters."
|
| "For example, early versions of FDB depended on Apache Zookeeper
| for coordination, which was deleted after real-world fault
| injection found two independent bugs in Zookeeper (circa 2010)
| and was replaced by a de novo Paxos implementation written in
| Flow. No production bugs have ever been reported since."
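| A rough sketch of the kind of replica comparison the first quote
| mentions, assuming each replica can be read as a plain dict of
| bytes keys and values. digest and divergent_replicas are
| hypothetical helpers; the paper does not say how the production
| check is implemented:

```python
import hashlib
from collections import Counter


def digest(records):
    """Hash a replica's records in key order so equal contents give equal digests."""
    h = hashlib.sha256()
    for key in sorted(records):
        for part in (key, records[key]):
            h.update(len(part).to_bytes(4, "big"))  # length prefix avoids ambiguity
            h.update(part)
    return h.hexdigest()


def divergent_replicas(replicas):
    """Return names of replicas whose digest differs from the most common one."""
    digests = {name: digest(records) for name, records in replicas.items()}
    majority, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(name for name, d in digests.items() if d != majority)


replicas = {
    "a": {b"k1": b"v1", b"k2": b"v2"},
    "b": {b"k2": b"v2", b"k1": b"v1"},  # same data, different insertion order
    "c": {b"k1": b"v1", b"k2": b"XX"},  # one corrupted value
}
bad = divergent_replicas(replicas)
```

| Here only replica "c" is flagged; on a healthy set of replicas the
| returned list is empty.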
| jeffbee wrote:
| Ehhhh, doesn't align with my experience. I think FDB is
| actually really poorly tested. When I was evaluating it for
| replacement of the metadata key-value store at a major, public
| web services company we found that injecting faults into
| virtual NVMe devices on individual replicas would cause corrupt
| results returned to clients. We also found that it would just
| crash-loop on Linux systems with huge pages, because although
| someone from the project had written a huge-page-aware C++
| allocator "for performance", evidently nobody had ever actually
| tried to use it, including the author.
|
| It's also really, really weird that their non-scalable
| architecture hits a brick wall at 25 machines. Ignoring the
| correctness flaws, it only works if you can either design
| around that limit by sharding, and never offer cross-shard
| transactions, or if you can assure yourself that your use case
| will never outgrow half a rack of equipment.
| fnordpiglet wrote:
| Can you fix a point in time? Software evolves and I think a
| point I saw is that it wasn't well tested then they changed
| once production workloads told them it needs to change.
| bpicolo wrote:
| What were the strong contenders?
| rbranson wrote:
| Were there other distributed databases that did pass the
| fault injection testing?
| jeffbee wrote:
| There weren't any, which is why that particular shop
| elected to roll their own distributed system on top of
| rocks.
|
| In general I think people who think they want to do
| FoundationDB owe themselves a serious contemplation of the
| cost/benefit of using Cloud Spanner instead. Obviously you
| cannot do your own fault injection testing of Spanner, but
| it does have end-to-end checksums.
| sigstoat wrote:
| > There weren't any, which is why that particular shop
| elected to roll their own distributed system on top of
| rocks.
|
| that's nuts. rocks could've been added as a storage
| engine to fdb far more easily.
| ryanworl wrote:
| This is currently in progress right now.
|
| https://github.com/apple/foundationdb/blob/e7d7b39f12afa8ea2...
| jeffbee wrote:
| For the record, I said the same thing. But it's a
| management problem because on the one hand you have a
| known open project with demonstrable flaws, and on the
| other you have your own in-house developers and you will
| tend to discount the bugs they haven't written yet.
|
| But, also for the same record, thinking you can implement
| a reliable, globally-replicated key-value store on top of
| FoundationDB that is cheaper and better than Cloud
| Spanner may be evidence of the same cognitive bias.
| sigstoat wrote:
| > But, also for the same record, thinking you can
| implement a reliable, globally-replicated key-value store
| on top of FoundationDB that is cheaper and better than
| Cloud Spanner may be evidence of the same cognitive bias.
|
| man, good thing nobody made any claim like that.
| sandinmyjoints wrote:
| What is the Flow referred to here?
| oconnor663 wrote:
| It's an async/await framework for C++. I'm not sure what the
| best source on this is, but here's a discussion:
| https://forums.foundationdb.org/t/why-was-flow-developed/171...
|
| My understanding is that FDB relies heavily on deterministic
| simulations for testing, and that their async/await model is
| a big part of how they make sure they cover different
| possible interleavings in a deterministic way.
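| A toy illustration of the deterministic-simulation idea, assuming
| Python generators as stand-ins for Flow actors (this is not Flow,
| and worker/simulate are hypothetical names). Every scheduling
| decision comes from one seeded random number generator, so a given
| seed always replays the same interleaving:

```python
import random


def worker(name, log, steps):
    """A cooperative task that hands control back after every step."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield


def simulate(seed, n_workers=3, steps=3):
    """Run the workers under a scheduler whose only randomness is `seed`."""
    rng = random.Random(seed)
    log = []
    tasks = [worker(f"w{i}", log, steps) for i in range(n_workers)]
    while tasks:
        task = rng.choice(tasks)  # the single source of nondeterminism
        try:
            next(task)
        except StopIteration:
            tasks.remove(task)
    return log


first = simulate(seed=1)
replay = simulate(seed=1)  # same seed, same interleaving
interleavings = {tuple(simulate(seed=s)) for s in range(20)}
```

| A failing run can then be reproduced from its seed alone, while
| sweeping over many seeds explores different interleavings.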
| jorangreef wrote:
| Thanks for the quotes, I've been wanting to read this paper for
| some time. Great to see they went through the consensus
| literature and made a decision to go with Active Disk Paxos,
| instead of stopping short and not fully understanding the
| consensus they're building on. The consensus and replication
| protocol is such a huge part of building a distributed
| database.
| fizwhiz wrote:
| > de novo Paxos implementation written in Flow
|
| That's... brave. Flow is a DSL built on top of C++?
| alistairw wrote:
| Yeah it's their own language on top of c++ to help them with
| testing distributed systems with deterministic simulation.
|
| Their talk from a while ago about it was something that
| really blew me away at the time [0]
|
| [0] https://www.youtube.com/watch?v=4fFDFbi3toc
| monstrado wrote:
| Have nothing but praise for FoundationDB. It has been by far the
| most rock solid distributed database I have ever had the pleasure
| of using. I used to manage HBase clusters, and the fact that I
| have never once had to worry about manually splitting "regions"
| is such a boon for administration...let alone JVM GC tuning.
|
| We run several FDB clusters using 3-DC replication and have never
| once lost data. I remember when we wanted to replace all of the
| FDB hardware (one cluster) in AWS, and so we just doubled the
| cluster size, waited for data shuffling to calm down, and just
| started axing the original hardware. We did this all while
| performing over 100K production TPS.
|
| One thing that makes the above seamless for all existing
| connections is that clients automatically update their "cluster
| file" in the event that new coordinators join or are reassigned.
| That alone is amazing...as you don't have to track down every
| single client and change / re-roll with new connection
| parameters.
|
| Anyway, I talk this database up every chance I get. Keep up the
| awesome work.
|
| - A very happy user.
___________________________________________________________________
(page generated 2021-06-07 23:00 UTC) |