[HN Gopher] Patterns of Distributed Systems (2020)
___________________________________________________________________
 
Patterns of Distributed Systems (2020)
 
Author : sbmthakur
Score  : 240 points
Date   : 2021-02-10 14:39 UTC (8 hours ago)
 
web link (martinfowler.com)
w3m dump (martinfowler.com)
 
| edoceo wrote:
| Oh man, patterns of enterprise, it's like 20 years old now! I
| really like Martin's work. Worth every penny
 
  | mpfundstein wrote:
  | I'd rather argue that his book was one of the major reasons
  | why we ended up in over-engineering hell.
  | 
  | I was affected by that as a junior and even mid-level dev, and
  | I caused a lot of harm :-)
  | 
  | Luckily I learned to step away from it.
 
    | GordonS wrote:
    | I was exactly the same in my earlier dev days! I'd learn
    | about a pattern, think it was the greatest thing since sliced
    | bread, and see uses for it everywhere... where it really
    | wasn't suitable at all. I was like a builder with only a
    | hammer - everything looked like a nail!
    | 
    | I ended up making a lot of things a lot more complex than
    | they needed to be, and/or a lot less performant than they
    | could be.
    | 
    | At some point, lots of people seemed to come to the same
    | realisation - worshipping design patterns is a _bad idea_.
    | Around this time I first heard about "cargo culting".
    | 
    | These days my motto is "no dogma" (well, except about not
    | being dogmatic ;), and I think it serves me well.
 
    | zwieback wrote:
    | I think I know what you're saying. I remember that time,
    | around the birth of Java. But I also remember that we were
    | trying to solve real problems that aren't such big problems
    | anymore: modularization, dependency management, large
    | codebases with crude editing tools, SW delivery and
    | distribution in the age of diskettes!
    | 
    | It turns out that developing large systems in static
    | languages with heavily constrained libraries is difficult.
    | Rapid zero-cost distribution and decentralized storage AND
    | GUIs have really changed the field. Does anyone even call
    | themselves a SW Engineer anymore?
 
    | kitd wrote:
    | I'd argue that the over-engineering hell preceded this book,
    | with every tiny J2EE project using a full 3-tier stack, incl
    | (but not limited to) stateful EJBs, session EJBs, countless
    | JSPs, DAO layers, and god only knows what else.
    | 
    | It was this book that actually revealed, in patterns, the
    | bigger picture, where it was all going wrong, and the
    | alternatives. This book and Spring v1.0 go along together in
    | my mind.
    | 
    | BTW (and OT), another underrated Fowler book in a similar
    | vein is "Analysis Patterns", another 10k-ft view of common
    | business problems.
 
    | mmcdermott wrote:
    | I'm a little torn on this one. On the one hand I see it,
    | because it feels like patterns books made the limitations of
    | C++/Java/C# seem respectable, even desirable. Norvig observed
    | this with his talk on patterns in dynamic languages (p. 10 of
    | https://norvig.com/design-patterns/ summarizes this).
    | 
    | On the other hand, I have found his work to be useful in
    | selling approaches to others. There were times where I did a
    | design, then browsed the patterns books to find names for
    | what I was doing to "make it respectable."
 
      | blandflakes wrote:
      | I've done the opposite, disguising something that is a
      | named pattern as anything but a factory to prevent
      | structure-avoidant zealots from knee-jerk reacting to the
      | word in a PR.
 
    | edoceo wrote:
    | Oh, see, I saw that book after watching dozens of dot-com
    | companies re-implementing common patterns (poorly). Basically
    | under-engineering heck. So, once we had the book we could
    | shift the conversation and use common terminology.
    | 
    | The discipline is, of course, knowing which spells to cast
    | and when. And it's super useful to see the scroll of common
    | spells.
 
      | zerkten wrote:
      | A problem in a lot of cases is that for some people, the
      | discipline is about casting spells as opposed to solving
      | problems. It doesn't take a lot of these people to damage
      | projects, but they exist independent of pattern books.
      | 
      | If I were going to complain about engineering hell, it'd be
      | about how shallowly we investigate these areas. We stop at
      | the immediate technical events, or inappropriate technology
      | choices, instead of getting to the motivations behind it.
      | This gets into a qualitative area that HN readers tend to
      | want to avoid, but these flaws are with the humans and
      | their circumstances.
 
    | nivertech wrote:
    | Can you provide some concrete examples?
    | 
    | I.e. pattern X was recommended for a scenario Y, but ended up
    | in over-engineering hell for Z.
 
      | withinboredom wrote:
      | IoC is one that comes to mind. It greatly simplifies
      | testing, but the engineering (before we had libraries for
      | it) was ridiculous to implement.
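To make the point concrete, here's a hand-rolled sketch of inversion of control in Python (the names `SignupService`, `SmtpMailer`, and `FakeMailer` are hypothetical, not from the thread): the service is handed its dependency instead of constructing it, which is exactly what makes testing simple.

```python
# Hand-rolled inversion of control: the service receives its mailer
# rather than constructing one, so a test can substitute a fake.
class SmtpMailer:
    def send(self, to, body):
        print(f"sending mail to {to}")

class FakeMailer:
    def __init__(self):
        self.sent = []

    def send(self, to, body):
        self.sent.append((to, body))

class SignupService:
    def __init__(self, mailer):
        # The dependency is injected; SignupService never creates it.
        self.mailer = mailer

    def register(self, email):
        self.mailer.send(email, "welcome!")

# In production: SignupService(SmtpMailer())
# In a test, inject the fake and inspect what was "sent":
fake = FakeMailer()
SignupService(fake).register("a@example.com")
print(fake.sent)  # [('a@example.com', 'welcome!')]
```

Before IoC containers existed, wiring like this had to be done by hand everywhere, which is the "ridiculous to implement" part the comment refers to.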
 
        | mmcdermott wrote:
        | I think this is a good one. Most recent Java/C# systems
        | would have a full IoC container, but have no dynamically
        | selected components (which is how frameworks like Dagger
        | - https://dagger.dev/ - can exist). A lot of runtime
        | reflection/calculation gets done for something that can
        | be known at compile time.
 
  | AtlasBarfed wrote:
  | I'll plant a midpost flag and say, while it shouldn't be
  | revered as a bible of eternal truths, it did document and
  | progress the discussion on many things.
  | 
  | I think Fowler does a good job of identifying and classifying
  | things, but that hasn't necessarily made IT / Enterprise all
  | that simpler. What has made "progress" in IT has fundamentally
  | been programmers treating more and more things like code and
  | automating the hell out of everything.
 
| vishnugupta wrote:
| > What does it mean for a system to be distributed? There are
| two aspects:
| > 1. They run on multiple servers. The number of servers in a
| cluster can vary from as few as three servers to a few thousand
| servers.
| > 2. They manage data. So these are inherently 'stateful'
| systems.
| 
| It's a pity that they don't get to the crux of distributed
| systems, because it has been well defined and described for ~40
| years now. Instead they describe the key characteristics of a
| distributed system in a very hand-wavy manner.
| 
| The two _fundamental_ ways in which distributed computing
| differs from single-server/machine computing are:
| 
| 1. No shared memory.
| 
| 2. No shared clock.
| 
| Almost every problem faced in distributed systems could be traced
| to one of these aspects.
| 
| Because there's no shared memory it's impossible for any one
| server to _know_ the global state. And so you need consensus
| algorithms.
| 
| And due to the lack of a shared clock it's impossible to order
| events. To overcome this, a logical clock has to be overlaid in
| software on top of the distributed system.
| 
| Added to this are the failure modes that are peculiar to
| distributed systems, be it transient/permanent link failures or
| transient/permanent server failures.
| 
| This[1] is a decent description of what I've just described here.
| 
| I also recommend reading up on some key impossibility results
| in distributed systems. The most famous one is the impossibility
| of achieving common knowledge.
| 
| I'm surprised that someone as reputed as Thoughtworks doesn't
| describe the topic in more precise terms.
| 
| [1] https://www.geeksforgeeks.org/limitation-of-distributed-
| syst...
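The "no shared clock" problem is classically addressed with a Lamport logical clock. A minimal illustrative sketch in Python (not from the article): events are ordered by stamping messages and taking the max on receipt.

```python
# Lamport logical clock: orders events across processes without any
# shared physical clock, by stamping messages and taking max() on receipt.
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):
        # Any local event (including a send) advances the clock.
        self.time += 1
        return self.time

    def send(self):
        # Stamp an outgoing message with the incremented time.
        return self.tick()

    def receive(self, msg_time):
        # Jump past the sender's timestamp, then count the receive event.
        self.time = max(self.time, msg_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
stamp = a.send()   # a.time == 1
b.receive(stamp)   # b.time == 2: the receive is ordered after the send
```

The guarantee is only a partial order (if event x causally precedes y, then clock(x) < clock(y)); concurrent events may still get arbitrary relative timestamps, which is why vector clocks and consensus protocols exist on top of this idea.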
 
  | ctvo wrote:
  | Agreed. Giving this summary and then going into details would
  | benefit readers more. AWS's Builders' Library, while containing
  | excellent content, also gives an overview of distributed
  | systems that sort of points this out:
  | 
  | https://aws.amazon.com/builders-library/challenges-with-dist...
 
  | throwaway894345 wrote:
  | How does "ephemeral computing" fit into your notion of
  | distributed systems? Perhaps this is a concern not shared by
  | all distributed systems, but it's a practical reality that we
  | in the cloud space have to deal with pervasively and it drives
  | profound architectural differences.
 
  | g9yuayon wrote:
  | Maybe it's because Fowler's target readers are developers of
  | enterprise software who are not familiar with distributed
  | systems at all, and Fowler's background is not in distributed
  | system either. Therefore, he chose to use colloquial terms.
 
  | jstimpfle wrote:
  | > 1. No shared memory.
  | 
  | Isn't it rather "no synchronized data access"? Remote memory
  | isn't a problem if you can read it in a synchronized fashion
  | (taking locks and so on).
  | 
  | And actually "no synchronized information retrieval" is the
  | default even on multithreaded, shared memory systems, which is
  | why they're a lot like distributed systems. You can use mutexes
  | and other synchronization primitives though, to solve some of
  | the problems that just aren't solvable on a computer network,
  | due to much higher latency of synchronization.
 
    | waynesonfire wrote:
    | You can devise all sorts of distributed system architectures.
    | You could, for example, have a synchronous system composed
    | of nodes organized in a ring.
    | 
    | There is not "one definition" of what a distributed system
    | is. You have to define that. There are some common
    | distributed system architectures that perhaps most of us are
    | familiar with--asynchronous networked system, e.g. no shared
    | memory with point-to-point communication. There are other
    | dichotomies; though I'm not an expert in the field and am
    | unable to succinctly define them.
    | 
    | As you add more "stuff" into your distributed system -
    | people talking about adding a memcached or whatever in other
    | comments - you've introduced a completely different system.
    | Maybe some sort of hybrid. And if you're interested, you can
    | formally reason about its behavior.
    | 
    | Regardless, you have to define what you're talking about.
    | 
    | It's an interesting question to ask what is the most
    | fundamental component of a distributed system? Could it be
    | multiple processing nodes?
 
  | zvrba wrote:
  | > 1. No shared memory.
  | 
  | I'd rather say "no reliable message delivery". The only
  | difference between completely reliable messaging and shared
  | memory is performance.
  | 
  | > Because there's no shared memory it's impossible for any one
  | server to know the global state.
  | 
  | Even _with_ shared memory it's impossible to _know_ the global
  | state. Just after you've loaded some data from memory, it can
  | be immediately changed by another thread.
 
    | mav3rick wrote:
    | Your second point is moot. Even in a multi threaded single
    | machine program you can load state and have it changed by
    | another thread. That's bad design and not a distributed
    | system characteristic.
 
    | jayd16 wrote:
    | I don't agree with this. Reliability and transactional/atomic
    | reads and writes are different things.
    | 
    | "Reliable" is also a vague value judgement.
    | 
    | "Shared memory" implies data coherency.
 
      | DSingularity wrote:
      | I think the distinction he is trying to raise is that
      | messages can be lost in distributed systems. Building
      | distributed shared memory is possible but expensive
      | (readers must write, writers must broadcast). That is why
      | he is raising that distinction and I think it is a good one
      | to raise.
 
    | AtlasBarfed wrote:
    | The discussion is kind of walking around the CAP triangle at
    | this point.
 
  | jedberg wrote:
  | > No shared memory
  | 
  | I'm not sure that is entirely accurate. If you have a memcache
  | cluster and all data is stored in there, you have shared
  | memory. Albeit slow and subject to atomicity problems, it's
  | still shared state.
  | 
  | It's also a bad idea to rely on that to run your app, so there
  | is that. But it's possible to have shared memory if you're
  | willing to accept the hit to reliability.
 
    | svieira wrote:
    | Remote memory does not count as shared because of the
    | atomicity problems. The same reason _local_ memory doesn't
    | count as shared the minute you spin up two writer threads
    | with access to the same mutable memory space. (And why Rust
    | is popular for distributed problems that share the same
    | clock)
    | 
    | If you replaced Memcache with a single Redis instance where
    | all operations were managed by Lua scripts (e.g. you
    | introduced atomicity to your operations) you wouldn't have a
    | distributed system, just one with slow, sometimes rather
    | faulty memory.
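The atomicity problem both comments point at is the classic lost update. A deterministic Python sketch (the dict stands in for a remote memcache; all names are hypothetical):

```python
# Lost update on "shared" remote memory: two clients read-modify-write
# the same key with no atomic operation on the server side.
store = {"counter": 0}   # stands in for a remote memcache value

def read(key):
    return store[key]

def write(key, value):
    store[key] = value

# Interleaving: both clients read before either writes back.
a = read("counter")       # client A sees 0
b = read("counter")       # client B also sees 0
write("counter", a + 1)   # A writes 1
write("counter", b + 1)   # B overwrites with 1 -- A's increment is lost

print(store["counter"])   # 1, not 2
```

This is why "shared state behind a network" only behaves like shared memory once the server offers atomic primitives (an atomic increment, compare-and-swap, or server-side scripts as in the Redis/Lua example above).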
 
  | omginternets wrote:
  | >It's a pity that they don't get to the crux of distributed
  | systems because it's very well defined and described for ~40
  | years now.
  | 
  | Really?
  | 
  | I'm incidentally in the midst of a lit-review on the subject
  | and it seems quite apparent that no standard definitions have
  | emerged.
  | 
  | >The two fundamental ways in which distributed computing
  | differs from single-server/machine computing are.
  | 
  | >1. No shared memory. 2. No shared clock.
  | 
  | The typical multiprocessor is, in fact, a distributed system
  | under the hood. Most of the time the programmer is unaware of
  | this thanks to cache coherence algorithms, which in turn
  | benefit from a _reliable_ communication layer between
  | individual cores.
  | 
  | And yet, we can still observe consistency failures when we
  | operate the chip outside of the parameters for which it
  | guarantees a single-system image (namely: when using something
  | like OS threads).
  | 
  | I think the problem is that we're using the wrong _kind_ of
  | definition. Your definitions -- and indeed most definitions
  | encountered in the literature, with some exceptions -- appeal
  | to _design_. They are _teleological_ definitions, and as such
  | they can't define "distributed" in the sense of "distributed
  | programming" or "distributed computation". A more useful kind
  | of definition is _intentional_ [0]. It is constructed at a
  | higher level of analysis that assumes the design serves the
  | purpose of _representing the world_, among others. Thus, you
  | get a definition like this:
  | 
  |       Distributed computing is a computational paradigm in
  |       which local action taken by processes on the basis of
  |       locally-available information has the potential to alter
  |       some global state.
  | 
  | Returning to the initial multiprocessor example, the more
  | useful question is often not _whether_ computation is
  | distributed, but _when_ it makes sense to regard it as such.
  | There are three typical cases in which an engineer is engaged
  | in the practice of distributed computing:
  | 
  | 1. He is designing or developing a distributed computing
  | system.
  | 
  | 2. The system is operating outside of specified parameters,
  | such that design invariants no longer hold.
  | 
  | 3. The system is malfunctioning, which is to say it violates
  | its specification despite operating within specified
  | parameters.
  | 
  | The second case is the most relevant to our prototypical
  | multiprocessor. The use of OS threads, for example, can be
  | understood as operating outside of the range of parameters for
  | which the SSI can fulfill its guarantees. It is important to
  | note that the system can still be made to function correctly
  | (contrary to case #3), provided the programmer shoulders the
  | burden of distributed control.
  | 
  | It's definitely possible -- and I would argue, _correct_ -- to
  | reframe  "no shared memory" and "no shared clock" in
  | intentional terms, but as we've seen with the multiprocessor
  | example, those two conditions alone do not define "distributed
  | system" in general; they are not fundamental properties. I will
  | however grant that they are the most common manifestations of
  | distribution in practice.
  | 
  | To summarize: the literature has not -- to my knowledge --
  | arrived at a good definition ~40 years ago. If I've missed
  | something, please point it out, though. I'd hate to publish
  | something incorrect. :)
  | 
  | [0]
  | https://en.wikipedia.org/wiki/Intentional_stance#Dennett's_t...
 
  | gcblkjaidfj wrote:
  | Martin Fowler is in the certification for box checkers
  | business.
  | 
  | 99% of the people that read them work at places where they must
  | "move to X" to justify some department. They will likely
  | implement a simulacrum of X (usually by importing some java
  | library someone wrote as homework), adding all the pitfalls and
  | future problems of X with zero of the benefits of X.
 
    | [deleted]
 
    | thewarrior wrote:
    | This is too dismissive. Most of Fowler's work is written
    | after interviewing lots of real world practitioners.
 
      | zwieback wrote:
      | Agreed, Fowler, Martin, etc. are often criticized not
      | because of their work but because of their audience, or
      | more specifically their paying customers. Makes little
      | sense to me, I got a lot out of their writing, especially
      | in the early days of OO.
 
        | disgruntledphd2 wrote:
        | Fowler and Beck in particular have been massively useful
        | to me recently. Refactoring and TDD by example are
        | _wonderful_ books, and completely changed my approach to
        | software.
        | 
        | I also love Feathers and Working Effectively with Legacy
        | Code, but that might be more of a niche taste ;)
 
        | [deleted]
 
      | morty_s wrote:
      | Haven't really read Fowler's stuff, but I have read Martin
      | Kleppmann's Designing Data-Intensive Applications and that
      | was helpful. Haven't seen it mentioned here (though I
      | haven't looked thoroughly through the comments). Just
      | thought I'd mention it here.
 
| davydog187 wrote:
| Curious that there is not a single mention of Erlang or RabbitMQ,
| which follow patterns of distributed systems quite nicely.
 
  | whycombagator wrote:
  | Are you referring to the table which presents categories & then
  | examples of technology that falls under said category?
  | 
  | If so, the examples are indeed not in any way exhaustive. But I
  | don't believe they are intended to be, nor could/should be.
 
    | davydog187 wrote:
    | Sure, but arguably RabbitMQ has a much wider adoption and
    | success story as a message broker.
    | 
    | Also, it shows Akka as an actor system without mentioning
    | Erlang.
 
___________________________________________________________________
(page generated 2021-02-10 23:01 UTC)