[HN Gopher] Deming's Red Bead Experiment (2002)
___________________________________________________________________
 
Deming's Red Bead Experiment (2002)
 
Author : gkop
Score  : 111 points
Date   : 2022-02-04 15:40 UTC (7 hours ago)
 
web link (maaw.info)
w3m dump (maaw.info)
 
| dang wrote:
| A related video from around 1994:
| http://www.youtube.com/watch?v=C5Io2WweTxQ
| 
| (via https://news.ycombinator.com/item?id=5193898, but no
| comments there)
 
  | cpach wrote:
  | If anyone wants to know more about Deming, I can warmly
  | recommend this blog post by Avery Pennarun:
  | https://apenwarr.ca/log/20161226
 
    | hitekker wrote:
    | I'd be careful about taking that particular author's
    | interpretation of management literature at face value.
    | 
    | His understanding of _High Output Management_ (a seminal book
    | by the CEO of Intel) was so flawed that the CEO of Dropbox
    | had to correct him.
    | 
    | https://news.ycombinator.com/item?id=21088425
 
      | ignoramous wrote:
      | Drew, fwiw, pointed out that Avery Pennarun (who is,
      | frankly, phenomenal at distilling ideas in a given context)
      | was right about a bunch of things except the TLDR.
      | 
      | Though Avery does hint that "output" itself is a function
      | of values/principles execs ought to imbibe in their org:
      | 
      | > _What executives need to do is come up with
      | organizational values that indirectly result in the
      | strategy they want._
      | 
      | > _That is, if your company makes widgets and one of your
      | values is customer satisfaction, you will probably end up
      | with better widgets of the right sort for your existing
      | customers. If one of your values is to be environmentally
      | friendly, your widget factories will probably pollute less
      | but cost more. If one of your values is to make the tools
      | that run faster and smoother, your employees will probably
      | make less bloatware and you'll probably hire different
      | employees than if your values are to scale fast and capture
      | the most customers in the shortest time._
      | 
      | It remains to be seen if Avery ends up building a larger
      | company than Drew. I'm willing to bet all of $100 in my
      | depleting bank account that they will.
 
        | hitekker wrote:
        | Drew Houston disagrees with you and Avery. Not just the
        | TL;DR but in the critical details.
        | 
        | > Contrary to what the post suggests, HOM does not say
        | that the job of an executive is to wave some kind of
        | magic culture or "values" wand and rubber-stamp whatever
        | emergent strategy and behavior results from that. CEOs
        | and executives absolutely do (and must) make important
        | decisions of all kinds, break ties, and set general
        | direction.
        | 
        | Notwithstanding that Drew is a billionaire who founded a
        | billion dollar company and Tailscale has yet to crack a
        | large valuation, Avery basically misinterpreted Andy
        | Grove. That misinterpretation, "CEO as a passive referee
        | whose job is to set the culture", is only sensible to
        | people who've never managed a large group of people.
 
| Tomte wrote:
| The other well-known experiment Deming used is the funnel:
| https://www.2uo.de/deming/#the-funnel-experiment
 
| jeffreyrogers wrote:
| Interesting experiment. I don't think this applies to knowledge
| work in the same way it does to manufacturing.
 
  | kqr wrote:
  | You're right. It doesn't. But it's a matter of degree, not
  | kind. Knowledge work has many times the variability of
  | manufacturing (by design: if you remove variability from
  | knowledge work, you're no longer producing anything new each
  | time.)
  | 
  | In other words, this applies even more severely to knowledge
  | work.
  | 
  | More concretely: in manufacturing you can have a process that
  | yields 9-14 % defective or whatever. The variation is
  | relatively small; say a CV of 10 %. In knowledge work, you'll
  | be looking at processes that generate somewhere between 0.1 and
  | 100 defective ideas for every really good idea. This variation
  | is enormous: 1000 % or so.
 
  | com2kid wrote:
  | Software engineer is put on a team that has more legacy code.
  | If management judges by # of incidents, they are
  | underperforming.
  | 
  | Heck I can tell you from experience that if you want to get
  | promoted fast, new product teams are the way to go. You get to
  | file lots of patents, architect huge new systems, and look like
  | a rock star.
  | 
  | Another example: Partner teams upstream keep pushing breaking
  | API changes, downstream teams look bad because their services
  | are the ones having the outage. You do your due diligence, your
  | code is defect free, well tested. Doesn't matter, you are
  | spending half your day putting out fires caused by someone
  | else. Meanwhile another co-worker starts on a team where their
  | upstream services are written to be robust against bad incoming
  | data and have APIs that maintain back compat. Your co-worker
  | puts out buggy poorly tested code, but the upstream services
  | are robust enough that everything keeps chugging along.
  | 
  | Management doesn't see any of this. They just see your team has
  | poor performance, and this other team has great performance.
  | Heck maybe that other team has a higher "velocity" because they
  | can turn out features faster.
 
  | riskable wrote:
  | In knowledge work the skill of the worker matters vastly more
  | because "the process" mostly takes place in their head. You can
  | optimize a workspace and tools for productivity and the
  | reduction of errors but ultimately that has a minimal impact on
  | "the process" that's taking place in the knowledge worker's
  | mind.
  | 
  | If the process was the problem (or perfect), adding (or
  | replacing) a worker would have minimal impact but we know this
  | is not true. You could have the best documentation, training,
  | and absolutely stellar code yet one person can turn everything
  | to shit quite quickly! The opposite is true as well: Bringing
  | on a fantastic new worker can make your existing team look like
  | a bunch of inefficient laggards.
  | 
  | Neither of these situations can be fixed by improving processes
  | (maybe hiring processes? Though I doubt it). It'd be like
  | having one magic blue bean in the box that--if found--can
  | either drastically improve or degrade the final productivity by
  | 90%. Would the optimum process improvement then be to try to
  | eliminate magic beans entirely? Sure seems like it (i.e. hire
  | the lowest common denominator and don't try to optimize for the
  | 1%). That way you reduce the likelihood of taking on the "bad
  | 1%"--even though it reduces your chances of obtaining the
  | perfect magic bean.
 
| kqr wrote:
| If you are in any sort of leadership position -- either as a
| formal manager or because you have the respect of your peers --
| I strongly urge you to read Deming.
| 
| There are few authors that have taught me so much about people,
| motivation, systems, quality, statistics, what high-leverage
| effort looks like, and so on.
| 
| I first picked up a book by Deming a few years ago, and not a
| single day has passed that I have not had use for what he taught
| me through his writing.
| 
| The things he says are only becoming more and more relevant with
| every year. I honestly think it ought to be compulsory reading in
| school. The world would be a much better place that way; kinder,
| more efficient, and less superstitious.
 
  | Litost wrote:
  | Thanks for the suggestion. How does what he says stack up
  | against those who came after? I ask because one of my
  | favourite management talks is by Russell Ackoff [1], and he
  | mentions Dr Deming, so I assume he was influenced by, worked
  | with, or studied with him. Given their relative ages, I
  | wondered if his work might be valuable to start with?
  | 
  | [1] https://www.youtube.com/watch?v=OqEeIG8aPPk
 
  | marbex7 wrote:
  | Which book?
 
    | kqr wrote:
    | I started with The New Economics, then read Out of the
    | Crisis, and finally A Theory of Sampling or whatever its name
    | is.
 
      | marbex7 wrote:
      | Thanks.
 
| hencq wrote:
| There's a brilliant little book Four Days with Dr. Deming[0] that
| goes over the red bead experiment among other things. It
| basically follows the format of a four day seminar that Dr.
| Deming used to do. It's full of wisdom like this and it does a
| painfully good job making you recognize all the ineffective
| things still going on in companies today.
| 
| [0]
| https://www.goodreads.com/en/book/show/34987.Four_Days_with_...
 
| buescher wrote:
| Years ago I found a discussion of this on the web that involved a
| deep dive into optimal strategies for getting white beads,
| variations in paddle construction, root cause analysis on bead
| size and weight and hole depth in the paddles, and so on. It was
| a six sigma nightmare come to life and missed the point so
| profoundly I wish I could find it again to use as an example of
| how easily Deming is misunderstood.
| 
| Related: "A bad system beats a good person any time" does not
| mean "having any system, no matter how bad, is better than having
| even the best people and no apparent system".
 
  | AnIdiotOnTheNet wrote:
  | Wait a bit and HN will probably provide you a similar
  | discussion.
 
  | krallja wrote:
  | "beats" in the sense of "the beatings will continue until
  | morale improves," right?
 
    | buescher wrote:
    | It certainly isn't positive.
 
  | hencq wrote:
  | Oh boy, I'd love to see that too. Unfortunately it's all too
  | common to see this stuff in reality as well.
  | 
  | > Related: "A bad system beats a good person any time" does not
  | mean "having any system, no matter how bad, is better than
  | having even the best people and no apparent system".
  | 
  | I'm a big fan of sociotechnical systems [0] where the motto is
  | to give people complex jobs in simple organizations.
  | Unfortunately in practice you usually see the tendency to do
  | exactly the opposite.
  | 
  | [0] https://en.wikipedia.org/wiki/Sociotechnical_system
 
| vintermann wrote:
| Deming made me realize that there is actually management
| literature out there that isn't just fads and slogans.
 
  | openknot wrote:
  | Would there be any other similar recommendations to Deming's
  | books? I would think that Eliyahu M. Goldratt's books sound
  | similar (specifically, the "Theory of Constraints,"
  | alternatively presented through a fictional story in "The
  | Goal").
 
    | mark_undoio wrote:
    | I'm a fan of Womack & Jones's "Lean Thinking". This is all
    | about Lean manufacturing, which I think is partly rooted in
    | Deming's work. The focus is more on how to optimise the
    | overall system than the management of individuals.
    | 
    | The content of that book isn't directly applicable to e.g.
    | software companies but if you think a bit you can see quite a
    | lot of analogous situations (e.g. warehoused inventory is
    | incomplete projects or not-yet-shipped code, "monuments"
    | could be inappropriate central test / build systems, etc).
 
      | kqr wrote:
      | If you want someone to translate it to software for you,
      | Reinertsen's Principles of Product Development Flow is
      | about adapting the philosophy to knowledge work.
      | 
      | Ward's Lean Product and Process Development is also a good
      | take on those ideas.
 
    | m104 wrote:
    | I recommend Russell Ackoff's writings as somewhat related and
    | more to do with how systems of people and processes work (or
    | don't). Here's a great place to start:
    | https://thesystemsthinker.com/a-lifetime-of-systems-
    | thinking...
 
| larrydag wrote:
| The key to doing this experiment well is having the right
| facilitator that brings the attitude. A good facilitator will
| roleplay a leader/manager/exec that will praise when measures are
| good and berate when measures are bad. The idea of this
| experiment is to show how management can harm the process even
| when there is inherent variability, good or bad.
| 
| Here is Dr. Deming himself performing the experiment
| https://www.youtube.com/watch?v=7pXu0qxtWPg
 
  | Rickasaurus wrote:
  | Thanks for sharing this, it's amazing to see it in action.
 
  | laserlight wrote:
  | What an amazing demonstration. It's unfortunate that these
  | lessons haven't been learned decades later.
 
| curiouscats wrote:
| The W. Edwards Deming Institute Blog https://deming.org/blog/
| 
| Deming on various management topics
| https://deming.org/category/deming-on-management/
| 
| More resources on Deming's ideas https://deming.org/online-
| resources-on-w-edwards-demings-man...
 
| pakitan wrote:
| I don't get it. This "experiment" could have been replicated by a
| simple computer simulation, given that worker output is entirely
| random. The supposed moral of the story is that system design
| defines outcome, not individual performance but how does that
| even count as "science" when you don't have control and
| experimental group. He designed a system with inherent flaws and,
| surprise, it has flaws. We can see there is variance in
| "productivity" but we have no idea how this same variance would
| have affected output if workers actually had agency.
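| 
| A minimal sketch of such a simulation, in Python. The numbers
| are assumptions rather than anything stated in this thread
| (4000 beads with 800 red, a 50-hole paddle, 6 workers and 4
| days is the setup usually described for the demonstration),
| and the function names are made up for illustration:

```python
import random

# Assumed setup, not taken from this thread: 4000 beads of which
# 800 are red, a paddle with 50 holes, 6 workers, 4 days of work.
TOTAL_BEADS = 4000
RED_BEADS = 800
PADDLE_HOLES = 50
WORKERS = 6
DAYS = 4


def draw_red_count(rng):
    """One paddle dip: sample 50 beads without replacement, count the red ones."""
    # Beads numbered below RED_BEADS are treated as red.
    sample = rng.sample(range(TOTAL_BEADS), PADDLE_HOLES)
    return sum(1 for bead in sample if bead < RED_BEADS)


def run_experiment(seed=0):
    """Return results[worker][day] = red beads drawn by that worker on that day."""
    rng = random.Random(seed)
    return [[draw_red_count(rng) for _ in range(DAYS)] for _ in range(WORKERS)]


if __name__ == "__main__":
    for worker, counts in enumerate(run_experiment(), start=1):
        print(f"worker {worker}: {counts} total={sum(counts)}")
```

| Every simulated worker follows exactly the same procedure, so
| any spread in the totals comes from the draw itself and not
| from the workers.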
 
  | lupire wrote:
  | It's a demo experiment, like when the physics teacher swings a
  | heavy pendulum at their own nose, or shoots a BB gun at a
  | falling toy.
 
  | function_seven wrote:
  | That's the point.
  | 
  | So first, not all science requires an RCT. Dividing
  | experimental subjects into study and control groups is one way
  | of doing science. It's not the only way.
  | 
  | In this case, this is a concrete demonstration of just how much
  | variance can emerge from a "statistically neutral" process. The
  | systemic flaws are part of the demonstration. What appear at
  | first glance to be identical tools, inputs, and processes are
  | in fact subtly different. The demo shows management types that
  | their charts and graphs cannot always be relied upon to
  | differentiate performance levels among staff. The system itself
  | must also be scrutinized. If Bob's ad campaigns are
  | outperforming Alice's by 20% in the first quarter, it doesn't
  | necessarily mean Bob is a marketing genius and Alice needs a
  | PIP.
  | 
  | A computer simulation would not have nearly as powerful an
  | effect on most people as a live demonstration using real beads.
  | And the imperfections in the paddles arise naturally when
  | they're physically made, but would have to be tuned
  | by the programmer building the simulation. Which would lead to
  | questions about "just how did they decide what variances would
  | come into play?"
 
    | jiggawatts wrote:
    | A real world example is that I do programming on a high-end
    | workstation laptop. My coworkers use old budget laptops.
    | 
    | This is not in their control -- they're victims of corporate
    | policy.
    | 
    | Does it influence quality in complex and hard to quantify
    | ways?
    | 
    | Most assuredly...
 
    | pakitan wrote:
    | I think I get it now. The point of the experiment is to ELI5
    | the concept of variance to management types who skipped
    | statistics classes :) Could be useful for some bosses I had
    | :)
 
      | function_seven wrote:
      | So I'm watching a video of this right now[0], and it's even
      | more enlightening than I figured it would be! Deming makes
      | comments throughout the demonstration that I swear I've
      | heard in the real world. For example, one worker--whose
      | previous results put him on probation (he had 12 red
      | beads)--managed to have only 6 the next day. "Looks like
      | probation worked".
      | 
      | Meanwhile another worker--previously scoring 5, and getting
      | a merit-based raise from it--did poorly with 12. The
      | remark: "That raise went to his head. He's getting lazy".
      | 
      | So yeah, the value of this is in the actual doing of it.
      | 
      | [0] https://www.youtube.com/watch?v=7pXu0qxtWPg
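      | 
      | A rough sketch of why "probation worked" and "the raise
      | went to his head" both show up by chance alone. It
      | assumes a 50-bead dip with 20% red beads and models each
      | dip as independent draws; the thresholds 12 and 5 are the
      | scores mentioned above, and the function names are
      | hypothetical:

```python
import random

RED_FRACTION = 0.2   # assumed share of red beads in the box
PADDLE_HOLES = 50    # assumed number of beads drawn per dip


def red_count(rng):
    """Red beads in one dip, modelled as 50 independent draws."""
    return sum(1 for _ in range(PADDLE_HOLES) if rng.random() < RED_FRACTION)


def next_day_average(threshold, worse, trials=20000, seed=0):
    """Average day-2 score of simulated workers selected on their day-1 score.

    worse=True selects workers with day-1 score >= threshold ("probation");
    worse=False selects workers with day-1 score <= threshold ("raise").
    """
    rng = random.Random(seed)
    day2_scores = []
    for _ in range(trials):
        day1, day2 = red_count(rng), red_count(rng)
        selected = day1 >= threshold if worse else day1 <= threshold
        if selected:
            day2_scores.append(day2)
    return sum(day2_scores) / len(day2_scores)


if __name__ == "__main__":
    print("day 2 after probation (day 1 >= 12):", next_day_average(12, worse=True))
    print("day 2 after a raise (day 1 <= 5):", next_day_average(5, worse=False))
```

      | Under these assumptions both groups land near 10 red
      | beads (50 x 0.2) on the second day, so the "bad" workers
      | appear to improve and the "good" ones appear to slip
      | without anyone changing anything.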
 
  | advisedwang wrote:
  | This experiment is something he did in classes etc so people
  | could _experience_ the obvious idiocy in trying to manage
  | individuals for system behaviour. His point is that ALL actual
  | work is also dominated by system behaviour, just more subtly,
  | and managers must focus on the systems and not worker
  | performance.
 
    | pakitan wrote:
    | > His point is that ALL actual work is also *dominated* by
    | system behaviour
    | 
    | If that's his point, it seems obviously wrong. Some work,
    | like in the experiment, is dominated by system behavior. For
    | other work, the system plays a much smaller role. For example,
    | 2 people cranking code in a startup. No matter what system you
    | apply, if they are not good programmers, nothing of value
    | will come out.
 
      | kqr wrote:
      | That's also missing the point somewhat. Put the best two
      | programmers in the world in a shitty system that rewards
      | them for the wrong things and they will produce garbage.
      | Put mediocre programmers in a fantastic system that brings
      | out the best in their collaboration and you might actually
      | get to market sooner and better than the other group.
 
      | salawat wrote:
      | Something will most certainly come out. You're just not
      | defining the system. If I define the system as "one
      | programmer is responsible for looking at the desired
      | product and writing specifications" and the other
      | programmer is to translate specification into programming
      | code, and never shall one do the other's job, I assure you,
      | the best programmers in the world will produce shit over
      | time.
      | 
      | Randomly swap in two new actors with different life
      | experiences into the same spots to do the same work, and
      | you'll still get shit. If in the unlikely event, you get
      | amazing work, it's not that the people doing it were
      | special; it's just another outlier in the data stream. Add
      | in the emotional toll of working as hard as possible to
      | succeed but never being able to meet prescribed quality
      | levels?
      | 
      | A system is perfectly tuned to produce the results it does.
      | Want different results? Change the system. That is Deming's
      | point. We have a tendency to blame variance in a system on
      | the human actors immediately proximal, instead of paying
      | attention to the actual significant constraints. This is an
      | important lesson to management types, as they are to
      | process/system what a programmer is to a computer.
      | 
      | The planners cast the dice for downstream long before
      | downstream can do anything about it, and in many corporate
      | setups, top down works just fine, but bottom up never gets
      | any attention.
 
| pierrebai wrote:
| I find the experiment skewed. Or more precisely, that it is not
| meant to investigate human behaviour or psychology. It is rather
| precisely designed to support a chosen result to support a given
| world view. The fact that it has been run for 50 years is a
| strong indication of this.
| 
| IOW, the experimenter wanted to be able to arrive at the
| conclusion that difference in performance was unrelated to
| workers and designed the experiment so it would give this result.
| In short, this demonstrates few things outside of a very
| artificially set up situation, where the workers have no say and
| the job is predestined to fail.
| 
| Anyone who worked anywhere knows very well that there are
| actually vast differences between workers.
 
  | cool_dude85 wrote:
  | >IOW, the experimenter wanted to be able to arrive at the
  | conclusion that difference in performance was unrelated to
  | workers and designed the experiment so it would give this
  | result.
  | 
  | That's the whole point. The experiment is not that we're
  | supposed to be surprised that the workers did not affect
  | performance - in fact, that's the subtext of the whole thing!
  | We know it from the start cause he explains exactly how the
  | process works and we can all see that individuals cannot affect
  | their output.
  | 
  | The point is, if we are unaware that we're in such a situation,
  | we can still find metrics to allow us to rank workers, fire low
  | performers, give out raises, etc. When we myopically focus on
  | such metrics, and disregard the system that makes them
  | worthless, we're making all our decisions on random chance,
  | even though we have a clear process, data collection, the whole
  | thing.
 
    | pierrebai wrote:
    | That's also my whole point: this is not an experiment but an
    | elaborate artificial argument designed to prove a point of
    | view decided in advance. That is why I find it unsavory.
 
      | Jtsummers wrote:
      | The point of the experiment is to be extreme, but after
      | reading (a very large portion of) Deming's work, I don't
      | think he'd disagree with your initial assertion that there
      | are differences between workers.
      | 
      | The broader points he makes, related to this experiment at
      | least: There are individual and systemic issues that
      | influence the outcome of a process. The actual ratio will
      | vary depending on what kinds of processes are involved.
      | 
      | If the job is to be a literal screw turner on an assembly
      | line, then there is relatively little difference between
      | the majority of people (assuming they are generally able
      | bodied, sighted, and have decent coordination), the
      | _system_ (tempo, length of shift, accessibility of the
      | thing being screwed together, tools being used) will have a
      | much larger impact than the individual's skill. The system
      | of the assembly line will influence the outcome more than
      | the individual's skill (at least above a basic threshold, a
      | supremely uncoordinated individual could flounder even with
      | the slowest pace of work). Switch to more skilled work and
      | you will find, increasingly, more differences in outcome
      | based on individual performance versus the system of the
      | work, but even there the system matters.
      | 
      | Look at software development offices that still favor
      | things like manual build processes, code versioning
      | control, testing, and deployment over automation. They
      | provide many opportunities for human error (even just
      | simple miskeying of data) that can reduce everyone's
      | effectiveness no matter how skilled. (Fortunately these
      | kinds of places are increasingly rare, at least outside of
      | US defense contractors.)
      | 
      | The experiment, then, is an artificial construct (like most
      | classroom experiments) meant to illustrate a point by
      | showing one extreme. This acts as a counterpoint to the
      | more conventional wisdom that the individual, and not the
      | system, is what actually matters for the outcome. The
      | conventional wisdom, of course, being wrong in many
      | circumstances since it tends to place too strong a weight
      | on the individual performance and too weak a weight on the
      | system.
      | 
      | It would be unsavory if he had said, "See, stop evaluating
      | individuals, their contribution doesn't matter." But he
      | never did say that (in anything I read, at least), and
      | anyone who looks at this experiment and draws that
      | conclusion would be an idiot.
 
| emeraldd wrote:
| I wonder what the limits of this are? From a naive point of view
| there has to be a point where training/skill/physical
| endurance/etc. come into play. The bead experiment seems to fit a
| fixed rate, assembly line style of work. While I would agree that
| numeric/performance ranking is mostly meaningless, everyone knows
| that one somebody you go to when no one else can fix a problem.
 
  | IggleSniggle wrote:
  | I see what you mean, but I also think that's encapsulated in
  | the idea of "ready willing workers."
  | 
  | Obviously there are differences between people, and better and
  | worse teams. But the lesson here is about how the environment
  | factors in, and how management can accidentally arbitrarily
  | suppress innovation or reward luck within normal bounds of
  | success. Or hamper themselves to failure by insisting on a
  | broken process.
  | 
  | Could it be the case that "everybody goes to Jim," and as a
  | result, Jim gets good at helping people? Could it be that if
  | everybody just went to Kim for 2 weeks, that her fixes might
  | turn out to be a better, yet completely orthogonal, method of
  | solving the problem?
  | 
  | The Red Bead experiment is an antidote to rigid process and the
  | praise/blame game as based on inspection of results. It's a
  | story intended for management to hear, not an absolution or
  | dismissiveness of personal responsibility.
  | 
  | If you've hired "ready willing workers," then looking at the
  | results doesn't necessarily show you who was killing it and who
  | wasn't.
  | 
  | That worker who is always "killing it" may be good at scooping
  | up projects that always look great. That worker who is always
  | underperforming might be maintaining essential infrastructure
  | without which the system would fall apart.
  | 
  | The worker who's killing it may be doing so by spending all
  | their time "buttering up" a customer. The worker who appears
  | underperforming may appear so because they spend all their time
  | "buttering up" a customer, but someone else always lands the
  | sale.
  | 
  | It's a meditation on imperfect knowledge.
 
  | kqr wrote:
  | As you have observed already, this experiment is set up
  | specifically to eliminate the effect of training/skill/physical
  | endurance etc, and YET when it's performed in real life with a
  | good facilitator, people who are unlucky start to feel like
  | they're underperforming and need to step it up, while people
  | who are lucky start to feel like they deserve the praise for
  | doing well.
  | 
  | I've read about people who go for days after the experiment and
  | feel bad about their subpar performance because they feel like
  | they've let down or brought shame to their company and wonder
  | if they couldn't have done something better.
  | 
  | And this is an experiment that's set up to remove any trace of
  | individual agency whatsoever! People still beat themselves up
  | over it.
  | 
  | When you experience this experiment for real, you start to
  | forget that it's actually designed to eliminate any sort of
  | skill.
  | 
  | In other words, the experiment shows how hard it is to
  | recognise when we're judging the system and not the people in
  | it. The experiment shows that even when you think you're seeing
  | individual performance, it's very plausible you're not.
 
  | ziggus wrote:
  | Focusing on the type of work being done is a bit of a bike
  | shed, since the experiment isn't about the work per se, but the
  | measurement of the work as a function of the employee alone -
  | ie, without the context of the systems in which the employee
  | functions.
  | 
  | A good example of the type of mismeasurement done in non-
  | manufacturing contexts is the ridiculously stupid burn-down
  | chart.
 
    | webmaven wrote:
    | _> A good example of the type of mismeasurement done in non-
    | manufacturing contexts is the ridiculously stupid burn-down
    | chart._
    | 
    | Bad management can find a misuse for any tool; I don't think
    | burn-down charts are a particularly attractive nuisance in
    | that regard.
 
| candyman wrote:
| I was lucky enough to do this with the man himself at NYU. He had
| trouble speaking then but the class was dead silent and hung on
| his every word. Profound thinker.
 
| mark-r wrote:
| I saw a link to this in a discussion of another topic, I'm glad
| somebody pushed it to the top level. Definitely worth the read.
 
___________________________________________________________________
(page generated 2022-02-04 23:00 UTC)