[HN Gopher] Patterns of Distributed Systems (2020)
___________________________________________________________________
 
Patterns of Distributed Systems (2020)
 
Author : sbmthakur
Score  : 240 points
Date   : 2021-02-10 14:39 UTC (8 hours ago)
 
web link (martinfowler.com)
w3m dump (martinfowler.com)
 
| edoceo wrote:
| Oh man, patterns of enterprise, it's like 20 years old now! I
| really like Martin's work. Worth every penny
 
  | mpfundstein wrote:
  | I'd rather argue that his book was one of the major reasons
  | why we ended up in over-engineering hell.
  | 
  | I was affected by that as a junior and even mid-level dev, and
  | I caused a lot of harm :-)
  | 
  | Luckily I learned to step away from it.
 
    | GordonS wrote:
    | I was exactly the same in my earlier dev days! I'd learn
    | about a pattern, think it was the greatest thing since sliced
    | bread, and see uses for it everywhere... where it really
    | wasn't suitable at all. I was like a builder with only a
    | hammer - everything looked like a nail!
    | 
    | I ended up making a lot of things a lot more complex than
    | they needed to be, and/or a lot less performant than they
    | could be.
    | 
    | At some point, lots of people seemed to come to the same
    | realisation - worshipping design patterns is a _bad idea_.
    | Around this time I first heard about "cargo culting".
    | 
    | These days my motto is "no dogma" (well, except about not
    | being dogmatic ;), and I think it serves me well.
 
    | zwieback wrote:
    | I think I know what you're saying. I remember that time,
    | around the birth of Java. But I also remember that we were
    | trying to solve real problems that aren't such big problems
    | anymore: modularization, dependency management, large
    | codebases with crude editing tools, SW delivery and
    | distribution in the age of diskettes!
    | 
    | It turns out that developing large systems in static
    | languages with heavily constrained libraries is difficult.
    | Rapid zero-cost distribution and decentralized storage AND
    | GUIs have really changed the field. Does anyone even call
    | themselves a SW Engineer anymore?
 
    | kitd wrote:
    | I'd argue that the over-engineering hell preceded this book,
    | with every tiny J2EE project using a full 3-tier stack, incl
    | (but not limited to) stateful EJBs, session EJBs, countless
    | JSPs, DAO layers, and god only knows what else.
    | 
    | It was this book that actually revealed, in patterns, the
    | bigger picture, where it was all going wrong, and the
    | alternatives. This book and Spring v1.0 go along together in
    | my mind.
    | 
    | BTW (and OT), another underrated Fowler book in a similar
    | vein is "Analysis Patterns", another 10k-ft view of common
    | business problems.
 
    | mmcdermott wrote:
    | I'm a little torn on this one. On the one hand I see it,
    | because it feels like patterns books made the limitations of
    | C++/Java/C# seem respectable, even desirable. Norvig observed
    | this with his talk on patterns in dynamic languages (p. 10 of
    | https://norvig.com/design-patterns/ summarizes this).
    | 
    | On the other hand, I have found his work to be useful in
    | selling approaches to others. There were times where I did a
    | design, then browsed the patterns books to find names for
    | what I was doing to "make it respectable."
 
      | blandflakes wrote:
      | I've done the opposite, disguising something that is a
      | named pattern as anything but a factory to prevent
      | structure-avoidant zealots from knee-jerk reacting to the
      | word in a PR.
 
    | edoceo wrote:
    | Oh, see, I saw that book after watching dozens of dot-com
    | companies re-implementing common patterns (poorly). Basically
    | under-engineering heck. So, once we had the book we could
    | shift the conversation and use common terminology.
    | 
    | The discipline is, of course, knowing which spells to cast
    | and when. And it's super useful to see the scroll of common
    | spells.
 
      | zerkten wrote:
      | A problem in a lot of cases is that for some people, the
      | discipline is about casting spells as opposed to solving
      | problems. It doesn't take a lot of these people to damage
      | projects, but they exist independent of pattern books.
      | 
      | If I were going to complain about engineering hell, it'd be
      | about how shallowly we investigate these areas. We stop at
      | the immediate technical events, or inappropriate technology
      | choices, instead of getting to the motivations behind it.
      | This gets into a qualitative area that HN readers tend to
      | want to avoid, but these flaws are with the humans and
      | their circumstances.
 
    | nivertech wrote:
    | Can you provide some concrete examples?
    | 
    | I.e. pattern X was recommended for a scenario Y, but ended up
    | in over-engineering hell for Z.
 
      | withinboredom wrote:
      | IoC is one that comes to mind. It greatly simplifies
      | testing, but the engineering (before we had libraries for
      | it) was ridiculous to implement.
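To make the point concrete, here's a hand-rolled sketch of inversion of control in Python (the names `SignupService`, `SmtpMailer`, and `FakeMailer` are hypothetical, not from the thread): the service is handed its dependency instead of constructing it, which is exactly what makes testing simple.

```python
# Hand-rolled inversion of control: the service receives its mailer
# rather than constructing one, so a test can substitute a fake.
class SmtpMailer:
    def send(self, to, body):
        print(f"sending mail to {to}")

class FakeMailer:
    def __init__(self):
        self.sent = []

    def send(self, to, body):
        self.sent.append((to, body))

class SignupService:
    def __init__(self, mailer):
        # The dependency is injected; SignupService never creates it.
        self.mailer = mailer

    def register(self, email):
        self.mailer.send(email, "welcome!")

# In production: SignupService(SmtpMailer())
# In a test, inject the fake and inspect what was "sent":
fake = FakeMailer()
SignupService(fake).register("a@example.com")
print(fake.sent)  # [('a@example.com', 'welcome!')]
```

Before IoC containers existed, wiring like this had to be done by hand everywhere, which is the "ridiculous to implement" part the comment refers to.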
 
        | mmcdermott wrote:
        | I think this is a good one. Most recent Java/C# systems
        | would have a full IoC container, but have no dynamically
        | selected components (which is how frameworks like Dagger
        | - https://dagger.dev/ - can exist). A lot of runtime
        | reflection/calculation gets done for something that can
        | be known at compile time.
 
  | AtlasBarfed wrote:
  | I'll plant a midpost flag and say, while it shouldn't be
  | revered as a bible of eternal truths, it did document and
  | progress the discussion on many things.
  | 
  | I think Fowler does a good job of identifying and classifying
  | things, but that hasn't necessarily made IT / Enterprise all
  | that simpler. What has made "progress" in IT has fundamentally
  | been programmers treating more and more things like code and
  | automating the hell out of everything.
 
| vishnugupta wrote:
| > What does it mean for a system to be distributed? There are
| two aspects:
| > 1. They run on multiple servers. The number of servers in a
| cluster can vary from as few as three servers to a few thousand
| servers.
| > 2. They manage data. So these are inherently 'stateful'
| systems.
| 
| It's a pity that they don't get to the crux of distributed
| systems, because it has been well defined and described for ~40
| years now. Instead they describe the key characteristics of a
| distributed system in a very hand-wavy manner.
| 
| The two _fundamental_ ways in which distributed computing
| differs from single-server/machine computing are:
| 
| 1. No shared memory.
| 
| 2. No shared clock.
| 
| Almost every problem faced in distributed systems could be traced
| to one of these aspects.
| 
| Because there's no shared memory it's impossible for any one
| server to _know_ the global state. And so you need consensus
| algorithms.
| 
| And due to the lack of a shared clock it's impossible to order
| events. To overcome this, a logical clock has to be overlaid in
| software on top of the distributed system.
| 
| Added to this are the failure modes that are peculiar to
| distributed systems, be it transient/permanent link failures or
| transient/permanent server failures.
| 
| This[1] is a decent description of what I've just described here.
| 
| I also recommend reading up on some key impossibility results
| in distributed systems. The most famous one is the impossibility
| of achieving common knowledge.
| 
| I'm surprised that someone as reputed as Thoughtworks doesn't
| describe the topic in more precise terms.
| 
| [1] https://www.geeksforgeeks.org/limitation-of-distributed-
| syst...
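The "no shared clock" problem is classically addressed with a Lamport logical clock. A minimal illustrative sketch in Python (not from the article): events are ordered by stamping messages and taking the max on receipt.

```python
# Lamport logical clock: orders events across processes without any
# shared physical clock, by stamping messages and taking max() on receipt.
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):
        # Any local event (including a send) advances the clock.
        self.time += 1
        return self.time

    def send(self):
        # Stamp an outgoing message with the incremented time.
        return self.tick()

    def receive(self, msg_time):
        # Jump past the sender's timestamp, then count the receive event.
        self.time = max(self.time, msg_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
stamp = a.send()   # a.time == 1
b.receive(stamp)   # b.time == 2: the receive is ordered after the send
```

The guarantee is only a partial order (if event x causally precedes y, then clock(x) < clock(y)); concurrent events may still get arbitrary relative timestamps, which is why vector clocks and consensus protocols exist on top of this idea.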
 
  | ctvo wrote:
  | Agreed. Giving this summary and then going into details would
  | benefit readers more. AWS's Builders' Library, while containing
  | excellent content, also gives an overview of distributed
  | systems that sort of points this out:
  | 
  | https://aws.amazon.com/builders-library/challenges-with-dist...
 
  | throwaway894345 wrote:
  | How does "ephemeral computing" fit into your notion of
  | distributed systems? Perhaps this is a concern not shared by
  | all distributed systems, but it's a practical reality that we
  | in the cloud space have to deal with pervasively and it drives
  | profound architectural differences.
 
  | g9yuayon wrote:
  | Maybe it's because Fowler's target readers are developers of
  | enterprise software who are not familiar with distributed
  | systems at all, and Fowler's background is not in distributed
  | system either. Therefore, he chose to use colloquial terms.
 
  | jstimpfle wrote:
  | > 1. No shared memory.
  | 
  | Isn't it rather "no synchronized data access"? Remote memory
  | isn't a problem if you can read it in a synchronized fashion
  | (taking locks and so on).
  | 
  | And actually "no synchronized information retrieval" is the
  | default even on multithreaded, shared memory systems, which is
  | why they're a lot like distributed systems. You can use mutexes
  | and other synchronization primitives though, to solve some of
  | the problems that just aren't solvable on a computer network,
  | due to much higher latency of synchronization.
 
    | waynesonfire wrote:
    | You can devise all sorts of distributed system architectures.
    | You could, for example, have a synchronous system composed
    | of nodes organized in a ring.
    | 
    | There is not "one definition" of what a distributed system
    | is. You have to define that. There are some common
    | distributed system architectures that perhaps most of us are
    | familiar with--asynchronous networked system, e.g. no shared
    | memory with point-to-point communication. There are other
    | dichotomies; though I'm not an expert in the field and am
    | unable to succinctly define them.
    | 
    | As you add more "stuff" into your distributed system -
    | people talking about adding a memcached or whatever in other
    | comments - you've introduced a completely different system.
    | Maybe some sort of hybrid. And if you're interested, you can
    | formally reason about its behavior.
    | 
    | Regardless, you have to define what you're talking about.
    | 
    | It's an interesting question to ask what is the most
    | fundamental component of a distributed system? Could it be
    | multiple processing nodes?
 
  | zvrba wrote:
  | > 1. No shared memory.
  | 
  | I'd rather say "no reliable message delivery". The only
  | difference between completely reliable messaging and shared
  | memory is performance.
  | 
  | > Because there's no shared memory it's impossible for any one
  | server to know the global state.
  | 
  | Even _with_ shared memory it's impossible to _know_ the global
  | state. Just after you've loaded some data from memory, it can
  | be immediately changed by another thread.
 
    | mav3rick wrote:
    | Your second point is moot. Even in a multi threaded single
    | machine program you can load state and have it changed by
    | another thread. That's bad design and not a distributed
    | system characteristic.
 
    | jayd16 wrote:
    | I don't agree with this. Reliability and transactional/atomic
    | reads and writes are different things.
    | 
    | "Reliable" is also a vague value judgement.
    | 
    | "Shared memory" implies data coherency.
 
      | DSingularity wrote:
      | I think the distinction he is trying to raise is that
      | messages can be lost in distributed systems. Building
      | distributed shared memory is possible but expensive
      | (readers must write, writers must broadcast). That is why
      | he is raising that distinction and I think it is a good one
      | to raise.
 
    | AtlasBarfed wrote:
    | The discussion is kind of walking around the CAP triangle at
    | this point.
 
  | jedberg wrote:
  | > No shared memory
  | 
  | I'm not sure that is entirely accurate. If you have a memcache
  | cluster and all data is stored in there, you have shared
  | memory. Albeit slow and subject to atomicity problems, it's
  | still shared state.
  | 
  | It's also a bad idea to rely on that to run your app, so there
  | is that. But it's possible to have shared memory if you're
  | willing to accept the hit to reliability.
 
    | svieira wrote:
    | Remote memory does not count as shared because of the
    | atomicity problems. The same reason _local_ memory doesn't
    | count as shared the minute you spin up two writer threads
    | with access to the same mutable memory space. (And why Rust
    | is popular for distributed problems that share the same
    | clock)
    | 
    | If you replaced Memcache with a single Redis instance where
    | all operations were managed by Lua scripts (e.g. you
    | introduced atomicity to your operations) you wouldn't have a
    | distributed system, just one with slow, sometimes rather
    | faulty memory.
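The atomicity problem both comments point at is the classic lost update. A deterministic Python sketch (the dict stands in for a remote memcache; all names are hypothetical):

```python
# Lost update on "shared" remote memory: two clients read-modify-write
# the same key with no atomic operation on the server side.
store = {"counter": 0}   # stands in for a remote memcache value

def read(key):
    return store[key]

def write(key, value):
    store[key] = value

# Interleaving: both clients read before either writes back.
a = read("counter")       # client A sees 0
b = read("counter")       # client B also sees 0
write("counter", a + 1)   # A writes 1
write("counter", b + 1)   # B overwrites with 1 -- A's increment is lost

print(store["counter"])   # 1, not 2
```

This is why "shared state behind a network" only behaves like shared memory once the server offers atomic primitives (an atomic increment, compare-and-swap, or server-side scripts as in the Redis/Lua example above).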
 
  | omginternets wrote:
  | >It's a pity that they don't get to the crux of distributed
  | systems because it's very well defined and described for ~40
  | years now.
  | 
  | Really?
  | 
  | I'm incidentally in the midst of a lit-review on the subject
  | and it seems quite apparent that no standard definitions have
  | emerged.
  | 
  | >The two fundamental ways in which distributed computing
  | differs from single-server/machine computing are.
  | 
  | >1. No shared memory. 2. No shared clock.
  | 
  | The typical multiprocessor is, in fact, a distributed system
  | under the hood. Most of the time the programmer is unaware of
  | this thanks to cache coherence algorithms, which in turn
  | benefit from a _reliable_ communication layer between
  | individual cores.
  | 
  | And yet, we can still observe consistency failures when we
  | operate the chip outside of the parameters for which it
  | guarantees a single-system image (namely: when using something
  | like OS threads).
  | 
  | I think the problem is that we're using the wrong _kind_ of
  | definition. Your definitions -- and indeed most definitions
  | encountered in the literature, with some exceptions -- appeal
  | to _design_. They are _teleological_ definitions, and as such
  | they can't define "distributed" in the sense of "distributed
  | programming" or "distributed computation". A more useful kind
  | of definition is _intentional_ [0]. It is constructed at a
  | higher level of analysis that assumes the design serves the
  | purpose of _representing the world_, among others. Thus, you
  | get a definition like this:
  | 
  |       Distributed computing is a computational paradigm in
  |       which local action taken by processes on the basis of
  |       locally-available information has the potential to alter
  |       some global state.
  | 
  | Returning to the initial multiprocessor example, the more
  | useful question is often not _whether_ computation is
  | distributed, but _when_ it makes sense to regard it as such.
  | There are three typical cases in which an engineer is engaged
  | in the practice of distributed computing:
  | 
  | 1. He is designing or developing a distributed computing
  | system.
  | 
  | 2. The system is operating outside of specified parameters,
  | such that design invariants no longer hold.
  | 
  | 3. The system is malfunctioning, which is to say it violates
  | its specification despite operating within specified
  | parameters.
  | 
  | The second case is the most relevant to our prototypical
  | multiprocessor. The use of OS threads, for example, can be
  | understood as operating outside of the range of parameters for
  | which the SSI can fulfill its guarantees. It is important to
  | note that the system can still be made to function correctly
  | (contrary to case #3), provided the programmer shoulders the
  | burden of distributed control.
  | 
  | It's definitely possible -- and I would argue, _correct_ -- to
  | reframe  "no shared memory" and "no shared clock" in
  | intentional terms, but as we've seen with the multiprocessor
  | example, those two conditions alone do not define "distributed
  | system" in general; they are not fundamental properties. I will
  | however grant that they are the most common manifestations of
  | distribution in practice.
  | 
  | To summarize: the literature has not -- to my knowledge --
  | arrived at a good definition ~40 years ago. If I've missed
  | something, please point it out, though. I'd hate to publish
  | something incorrect. :)
  | 
  | [0]
  | https://en.wikipedia.org/wiki/Intentional_stance#Dennett's_t...
 
  | gcblkjaidfj wrote:
  | Martin Fowler is in the certification for box checkers
  | business.
  | 
  | 99% of the people that read them work at places where they must
  | "move to X" to justify some department. They will likely
  | implement a simulacrum of X (usually by importing some java
  | library someone wrote as homework), adding all the pitfalls and
  | future problems of X with zero of the benefits of X.
 
    | [deleted]
 
    | thewarrior wrote:
    | This is too dismissive. Most of Fowler's work is written
    | after interviewing lots of real world practitioners.
 
      | zwieback wrote:
      | Agreed, Fowler, Martin, etc. are often criticized not
      | because of their work but because of their audience, or
      | more specifically their paying customers. Makes little
      | sense to me, I got a lot out of their writing, especially
      | in the early days of OO.
 
        | disgruntledphd2 wrote:
        | Fowler and Beck in particular have been massively useful
        | to me recently. Refactoring and TDD by example are
        | _wonderful_ books, and completely changed my approach to
        | software.
        | 
        | I also love Feathers and Working Effectively with Legacy
        | Code, but that might be more of a niche taste ;)
 
        | [deleted]
 
      | morty_s wrote:
      | Haven't really read Fowler's stuff, but I have read Martin
      | Kleppmann's Designing Data-Intensive Applications and that
      | was helpful. Haven't seen it mentioned here (though I
      | haven't looked thoroughly through the comments). Just
      | thought I'd mention it here.
 
| davydog187 wrote:
| Curious that there is not a single mention of Erlang or RabbitMQ,
| which follow patterns of distributed systems quite nicely.
 
  | whycombagator wrote:
  | Are you referring to the table which presents categories & then
  | examples of technology that falls under said category?
  | 
  | If so, the examples are indeed not in any way exhaustive. But I
  | don't believe they are intended to be, nor could/should be.
 
    | davydog187 wrote:
    | Sure, but arguably RabbitMQ has a much wider adoption and
    | success story as a message broker.
    | 
    | Also, it shows Akka as an actor system without mentioning
    | Erlang.
 
___________________________________________________________________
(page generated 2021-02-10 23:01 UTC)