
[HN Gopher] Deming's Red Bead Experiment (2002)
___________________________________________________________________
Author : gkop
Score  : 111 points
Date   : 2022-02-04 15:40 UTC (7 hours ago)
	web link (maaw.info)
	w3m dump (maaw.info)
| dang wrote:
| A related video from around 1994:
| http://www.youtube.com/watch?v=C5Io2WweTxQ
|
| (via https://news.ycombinator.com/item?id=5193898, but no
| comments there)
|
| cpach wrote:
| If anyone wants to know more about Deming, I can warmly
| recommend this blog post by Avery Pennarun:
| https://apenwarr.ca/log/20161226
|
| hitekker wrote:
| I'd be careful about taking that particular author's
| interpretation of management literature at face value.
|
| His understanding of _High Output Management_ (a seminal book
| by the CEO of Intel) was so flawed that the CEO of Dropbox
| had to correct him.
|
| https://news.ycombinator.com/item?id=21088425
|
| ignoramous wrote:
| Drew, fwiw, pointed out that Avery Pennarun (who is,
| frankly, phenomenal at distilling ideas in a given context)
| was right about a bunch of things except the TLDR.
|
| Though Avery does hint that "output" itself is a function
| of values/principles execs ought to imbibe in their org:
|
| > _What executives need to do is come up with
| organizational values that indirectly result in the
| strategy they want._
|
| > _That is, if your company makes widgets and one of your
| values is customer satisfaction, you will probably end up
| with better widgets of the right sort for your existing
| customers. If one of your values is to be environmentally
| friendly, your widget factories will probably pollute less
| but cost more. If one of your values is to make the tools
| that run faster and smoother, your employees will probably
| make less bloatware and you'll probably hire different
| employees than if your values are to scale fast and capture
| the most customers in the shortest time._
|
| It remains to be seen if Avery ends up building a larger
| company than Drew. I'm willing to bet all of $100 in my
| depleting bank account that they will.
|
| hitekker wrote:
| Drew Houston disagrees with you and Avery.
| Not just the TL;DR but in the critical details.
|
| > Contrary to what the post suggests, HOM does not say
| that the job of an executive is to wave some kind of
| magic culture or "values" wand and rubber-stamp whatever
| emergent strategy and behavior results from that. CEOs
| and executives absolutely do (and must) make important
| decisions of all kinds, break ties, and set general
| direction.
|
| Notwithstanding that Drew is a billionaire who founded a
| billion dollar company and Tailscale has yet to crack a
| large valuation, Avery basically misinterpreted Andy
| Grove. That misinterpretation, "CEO as a passive referee
| whose job is to set the culture", is only sensible to
| people who've never managed a large group of people.
|
| Tomte wrote:
| The other well-known experiment Deming used is the funnel:
| https://www.2uo.de/deming/#the-funnel-experiment
|
| jeffreyrogers wrote:
| Interesting experiment. I don't think this applies to knowledge
| work in the same way it does to manufacturing.
|
| kqr wrote:
| You're right. It doesn't. But it's a matter of degree, not
| kind. Knowledge work has many times the variability of
| manufacturing (by design: if you remove variability from
| knowledge work, you're no longer producing anything new each
| time.)
|
| In other words, this applies even more severely to knowledge
| work.
|
| More concretely: in manufacturing you can have a process that
| yields 9-14 % defective or whatever. The variation is
| relatively small; say a CV of 10 %. In knowledge work, you'll
| be looking at processes that generate somewhere between 0.1 and
| 100 defective ideas for every really good idea. This variation
| is enormous: 1000 % or so.
|
| com2kid wrote:
| Software engineer is put on a team that has more legacy code.
| If management judges by # of incidents, they are under
| performing.
|
| Heck I can tell you from experience that if you want to get
| promoted fast, new product teams are the way to go. You get to
| file lots of patents, architect huge new systems, and look like
| a rock star.
|
| Another example: Partner teams upstream keep pushing breaking
| API changes, downstream teams look bad because their services
| are the ones having the outage. You do your due diligence, your
| code is defect free, well tested. Doesn't matter, you are
| spending half your day putting out fires caused by someone
| else. Meanwhile another co-worker starts on a team where their
| upstream services are written to be robust against bad incoming
| data and have APIs that maintain back compat. Your co-worker
| puts out buggy poorly tested code, but the upstream services
| are robust enough that everything keeps chugging along.
|
| Management doesn't see any of this. They just see your team has
| poor performance, and this other team has great performance.
| Heck maybe that other team has a higher "velocity" because they
| can turn out features faster.
|
| riskable wrote:
| In knowledge work the skill of the worker matters vastly more
| because "the process" mostly takes place in their head. You can
| optimize a workspace and tools for productivity and the
| reduction of errors but ultimately that has a minimal impact on
| "the process" that's taking place in the knowledge worker's
| mind.
|
| If the process was the problem (or perfect), adding a new (or
| replacing) a worker would have minimal impact but we know this
| is not true. You could have the best documentation, training,
| and absolutely stellar code yet one person can turn everything
| to shit quite quickly! The opposite is true as well: Bringing
| on a fantastic new worker can make your existing team look like
| a bunch of inefficient laggards.
|
| Neither of these situations can be fixed by improving processes
| (maybe hiring processes? Though I doubt it).
| It'd be like having one magic blue bean in the box that--if found--can
| either drastically improve or degrade the final productivity by
| 90%. Would the optimum process improvement then be to try to
| eliminate magic beans entirely? Sure seems like it (i.e. hire
| the lowest common denominator and don't try to optimize for the
| 1%). That way you reduce the likelihood of taking on the "bad
| 1%"--even though it reduces your chances of obtaining the
| perfect magic bean.
|
| kqr wrote:
| If you are in any sort of leadership position -- either a formal
| manager, or in that you have respect from your peers, I strongly
| urge you to read Deming.
|
| There are few authors that have taught me so much about people,
| motivation, systems, quality, statistics, what high-leverage
| effort looks like, and so on.
|
| I first picked up a book by Deming a few years ago, and not a
| single day has passed that I have not had use for what he taught
| me through his writing.
|
| The things he says are only becoming more and more relevant with
| every year. I honestly think it ought to be compulsory reading in
| school. The world would be a much better place that way; kinder,
| more efficient, and less superstitious.
|
| Litost wrote:
| Thanks for the suggestion, how does what he says stack up
| against those who came after, I ask this because this is one of
| my favourite management talks by Russell Ackoff [1] and he
| mentions Dr Deming so assume he was influenced
| by/worked/studied with him and given their relative ages
| wondered if his work might be valuable to start with?
|
| [1] https://www.youtube.com/watch?v=OqEeIG8aPPk
|
| marbex7 wrote:
| Which book?
|
| kqr wrote:
| I started with The New Economics, then read Out of the
| Crisis, and finally A Theory of Sampling or whatever its name
| is.
|
| marbex7 wrote:
| Thanks.
|
| [deleted]
|
| hencq wrote:
| There's a brilliant little book Four Days with Dr.
| Deming[0] that goes over the red bead experiment among other things. It
| basically follows the format of a four day seminar that Dr.
| Deming used to do. It's full of wisdom like this and it does a
| painfully good job making you recognize all the ineffective
| things still going on in companies today.
|
| [0] https://www.goodreads.com/en/book/show/34987.Four_Days_with_...
|
| [deleted]
|
| buescher wrote:
| Years ago I found a discussion of this on the web that involved a
| deep dive into optimal strategies for getting white beads,
| variations in paddle construction, root cause analysis on bead
| size and weight and hole depth in the paddles, and so on. It was
| a six sigma nightmare come to life and missed the point so
| profoundly I wish I could find it again to use as an example of
| how easily Deming is misunderstood.
|
| Related: "A bad system beats a good person any time" does not
| mean "having any system, no matter how bad, is better than having
| even the best people and no apparent system".
|
| AnIdiotOnTheNet wrote:
| Wait a bit and HN will probably provide you a similar
| discussion.
|
| krallja wrote:
| "beats" in the sense of "the beatings will continue until
| morale improves," right?
|
| buescher wrote:
| It certainly isn't positive.
|
| hencq wrote:
| Oh boy, I'd love to see that too. Unfortunately it's all too
| common to see this stuff in reality as well.
|
| > Related: "A bad system beats a good person any time" does not
| mean "having any system, no matter how bad, is better than
| having even the best people and no apparent system".
|
| I'm a big fan of sociotechnical systems [0] where the motto is
| to give people complex jobs in simple organizations.
| Unfortunately in practice you usually see the tendency to do
| exactly the opposite.
|
| [0] https://en.wikipedia.org/wiki/Sociotechnical_system
|
| vintermann wrote:
| Deming made me realize that there is actually management
| literature out there that isn't just fads and slogans.
|
| openknot wrote:
| Would there be any other similar recommendations to Deming's
| books? I would think that Eliyahu M. Goldratt's books sound
| similar (specifically, the "Theory of Constraints,"
| alternatively presented through a fictional story in "The
| Goal").
|
| mark_undoio wrote:
| I'm a fan of Womack & Jones's "Lean Thinking". This is all
| about Lean manufacturing, which I think is partly rooted in
| Deming's work. The focus is more on how to optimise the
| overall system than the management of individuals.
|
| The content of that book isn't directly applicable to e.g.
| software companies but if you think a bit you can see quite a
| lot of analogous situations (e.g. warehoused inventory is
| incomplete projects or not-yet-shipped code, "monuments"
| could be inappropriate central test / build systems, etc).
|
| kqr wrote:
| If you want someone to translate it to software for you,
| Reinertsen's Principles of Product Development Flow is
| about adapting the philosophy to knowledge work.
|
| Ward's Lean Product and Process Development is also a good
| take on those ideas.
|
| m104 wrote:
| I recommend Russell Ackoff's writings as somewhat related and
| more to do with how systems of people and processes work (or
| don't). Here's a great place to start:
| https://thesystemsthinker.com/a-lifetime-of-systems-thinking...
|
| [deleted]
|
| larrydag wrote:
| The key to doing this experiment well is having the right
| facilitator that brings the attitude. A good facilitator will
| roleplay a leader/manager/exec that will praise when measures are
| good and berate when measures are bad. The idea of this
| experiment is to show how management can harm the process even
| when there is inherent variability, good or bad.
|
| Here is Dr.
| Deming himself performing the experiment
| https://www.youtube.com/watch?v=7pXu0qxtWPg
|
| Rickasaurus wrote:
| Thanks for sharing this, it's amazing to see it in action.
|
| laserlight wrote:
| What an amazing demonstration. It's unfortunate that these
| lessons haven't been learned decades later.
|
| curiouscats wrote:
| The W. Edwards Deming Institute Blog https://deming.org/blog/
|
| Deming on various management topics
| https://deming.org/category/deming-on-management/
|
| More resources on Deming's ideas
| https://deming.org/online-resources-on-w-edwards-demings-man...
|
| pakitan wrote:
| I don't get it. This "experiment" could have been replicated by a
| simple computer simulation, given that worker output is entirely
| random. The supposed moral of the story is that system design
| defines outcome, not individual performance, but how does that
| even count as "science" when you don't have a control and an
| experimental group? He designed a system with inherent flaws and,
| surprise, it has flaws. We can see there is variance in
| "productivity" but we have no idea how this same variance would
| have affected output if workers actually had agency.
|
| lupire wrote:
| It's a demo experiment, like when the physics teacher swings a
| heavy pendulum at their own nose, or shoots a BB gun at a
| falling toy.
|
| function_seven wrote:
| That's the point.
|
| So first, not all science requires an RCT. Dividing
| experimental subjects into study and control groups is one way
| of doing science. It's not the only way.
|
| In this case, this is a concrete demonstration of just how much
| variance can emerge from a "statistically neutral" process. The
| systemic flaws are part of the demonstration. What appear at
| first glance to be identical tools, inputs, and processes are
| in fact subtly different. The demo shows management types that
| their charts and graphs cannot always be relied upon to
| differentiate performance levels among staff.
| The system itself must also be scrutinized. If Bob's ad campaigns are
| outperforming Alice's by 20% in the first quarter, it doesn't
| necessarily mean Bob is a marketing genius and Alice needs a
| PIP.
|
| A computer simulation would not have nearly as powerful an effect
| on most people as a live demonstration using real beads. And
| the imperfections in the paddles are something that naturally
| arise when they're physically made, but would have to be tuned
| by the programmer building the simulation. Which would lead to
| questions about "just how did they decide what variances would
| come into play?"
|
| jiggawatts wrote:
| A real world example is that I do programming on a high-end
| workstation laptop. My coworkers use old budget laptops.
|
| This is not in their control -- they're victims of corporate
| policy.
|
| Does it influence quality in complex and hard to quantify
| ways?
|
| Most assuredly...
|
| pakitan wrote:
| I think I get it now. The point of the experiment is to ELI5
| the concept of variance to management types who skipped
| statistics classes :) Could be useful for some bosses I had
| :)
|
| function_seven wrote:
| So I'm watching a video of this right now[0], and it's even
| more enlightening than I figured it would be! Deming makes
| comments throughout the demonstration that I swear I've
| heard in the real world. For example, one worker--whose
| previous results put him on probation (he had 12 red
| beads)--managed to have only 6 the next day. "Looks like
| probation worked".
|
| Meanwhile another worker--previously scoring 5, and getting
| a merit-based raise from it--did poorly with 12. The
| remark: "That raise went to his head. He's getting lazy".
|
| So yeah, the value of this is in the actual doing of it.
|
| [0] https://www.youtube.com/watch?v=7pXu0qxtWPg
|
| advisedwang wrote:
| This experiment is something he did in classes etc so people
| could _experience_ the obvious idiocy in trying to manage
| individuals for system behaviour. His point is that ALL actual
| work is also dominated by system behaviour, just more subtly,
| and managers must focus on the systems and not worker
| performance.
|
| pakitan wrote:
| > His point is that ALL actual work is also dominated by
| system behaviour
|
| If that's his point, it seems obviously wrong. Some work,
| like in the experiment, is dominated by system behavior. For
| others, the system would play a much smaller role. For example, 2
| people cranking code in a startup. No matter what system you
| apply, if they are not good programmers, nothing of value
| will come out.
|
| kqr wrote:
| That's also missing the point somewhat. Put the best two
| programmers in the world in a shitty system that rewards
| them for the wrong things and they will produce garbage.
| Put mediocre programmers in a fantastic system that brings
| out the best in their collaboration and you might actually
| get to market sooner and better than the other group.
|
| salawat wrote:
| Something will most certainly come out. You're just not
| defining the system. If I define the system as "one
| programmer is responsible for looking at the desired
| product and writing specifications" and the other
| programmer is to translate specification into programming
| code, and never shall one do the other's job, I assure you,
| the best programmers in the world will produce shit over
| time.
|
| Randomly swap in two new actors with different life
| experiences into the same spots to do the same work, and
| you'll still get shit. If in the unlikely event, you get
| amazing work, it's not that the people doing it were
| special; it's just another outlier in the data stream.
| Add in the emotional toll of working as hard as possible to
| succeed but never being able to meet prescribed quality
| levels?
|
| A system is perfectly tuned to produce the results it does.
| Want different results? Change the system. That is Deming's
| point. We have a tendency to blame variance in a system on
| the human actors immediately proximal, instead of paying
| attention to the actual significant constraints. This is an
| important lesson to management types, as they are to
| process/system what a programmer is to a computer.
|
| The planners cast the dice for downstream long before
| downstream can do anything about it, and in many corporate
| setups, top down works just fine, but bottom up never gets
| any attention.
|
| pierrebai wrote:
| I find the experiment skewed. Or more precisely, that it is not
| meant to investigate human behaviour or psychology. It is rather
| precisely designed to support a chosen result to support a given
| world view. The fact that it has been run for 50 years is a
| strong indication of this.
|
| IOW, the experimenter wanted to be able to arrive at the
| conclusion that difference in performance was unrelated to
| workers and designed the experiment so it would give this result.
| In short, this demonstrates few things outside of a very
| artificially setup situation, where the workers have no say and
| the job is predestined to fail.
|
| Anyone who worked anywhere knows very well that there are
| actually vast differences between two workers.
|
| cool_dude85 wrote:
| >IOW, the experimenter wanted to be able to arrive at the
| conclusion that difference in performance was unrelated to
| workers and designed the experiment so it would give this
| result.
|
| That's the whole point. The experiment is not that we're
| supposed to be surprised that the workers did not affect
| performance - in fact, that's the subtext of the whole thing!
| We know it from the start cause he explains exactly how the
| process works and we can all see that individuals cannot affect
| their output.
|
| The point is, if we are unaware that we're in such a situation,
| we can still find metrics to allow us to rank workers, fire low
| performers, give out raises, etc. When we myopically focus on
| such metrics, and disregard the system that makes them
| worthless, we're making all our decisions on random chance,
| even though we have a clear process, data collection, the whole
| thing.
|
| pierrebai wrote:
| That's also my whole point: this is not an experiment but an
| elaborate artificial argument designed to prove a point of
| view decided in advance. That is why I find it unsavory.
|
| Jtsummers wrote:
| The point of the experiment is to be extreme, but after
| reading (a very large portion of) Deming's work, I don't
| think he'd disagree with your initial assertion that there
| are differences between workers.
|
| The broader points he makes, related to this experiment at
| least: There are individual and systemic issues that
| influence the outcome of a process. The actual ratio will
| vary depending on what kinds of processes are involved.
|
| If the job is to be a literal screw turner on an assembly
| line, then there is relatively little difference between
| the majority of people (assuming they are generally able
| bodied, sighted, and have decent coordination), the
| _system_ (tempo, length of shift, accessibility of the
| thing being screwed together, tools being used) will have a
| much larger impact than the individual's skill. The system
| of the assembly line will influence the outcome more than
| the individual's skill (at least above a basic threshold, a
| supremely uncoordinated individual could flounder even with
| the slowest pace of work).
| Switch to more skilled work and
| you will find, increasingly, more differences in outcome
| based on individual performance versus the system of the
| work, but even there the system matters.
|
| Look at software development offices that still favor
| things like manual build processes, code versioning
| control, testing, and deployment over automation. They
| provide many opportunities for human error (even just
| simple miskeying of data) that can reduce everyone's
| effectiveness no matter how skilled. (Fortunately these
| kinds of places are increasingly rare, at least outside of
| US defense contractors.)
|
| The experiment, then, is an artificial construct (like most
| classroom experiments) meant to illustrate a point by
| showing one extreme. This acts as a counterpoint to the
| more conventional wisdom that the individual, and not the
| system, is what actually matters for the outcome. The
| conventional wisdom, of course, being wrong in many
| circumstances since it tends to place too strong a weight
| on the individual performance and too weak a weight on the
| system.
|
| It would be unsavory if he had said, "See, stop evaluating
| individuals, their contribution doesn't matter." But he
| never did say that (in anything I read, at least), and
| anyone who looks at this experiment and draws that
| conclusion would be an idiot.
|
| emeraldd wrote:
| I wonder what the limits of this are? From a naive point of view
| there has to be a point where training/skill/physical
| endurance/etc. come into play. The bead experiment seems to fit a
| fixed rate, assembly line style of work. While I would agree that
| numeric/performance ranking is mostly meaningless, everyone knows
| that one somebody you go to when no one else can fix a problem.
|
| IggleSniggle wrote:
| I see what you mean, but I also think that's encapsulated in
| the idea of "ready willing workers."
|
| Obviously there are differences between people, and better and
| worse teams. But the lesson here is about how the environment
| factors in, and how management can accidentally arbitrarily
| suppress innovation or reward luck within normal bounds of
| success. Or hamper themselves to failure by insisting on a
| broken process.
|
| Could it be the case that "everybody goes to Jim," and as a
| result, Jim gets good at helping people? Could it be that if
| everybody just went to Kim for 2 weeks, that her fixes might
| turn out to be a better yet completely orthogonal method of
| solving the problem?
|
| The Red Bead experiment is an antidote to rigid process and the
| praise/blame game as based on inspection of results. It's a
| story intended for management to hear, not an absolution or
| dismissiveness of personal responsibility.
|
| If you've hired "ready willing workers," then looking at the
| results doesn't necessarily show you who was killing it and who
| wasn't.
|
| That worker who is always "killing it" may be good at scooping
| up projects that always look great. That worker who is always
| underperforming might be maintaining essential infrastructure
| without which the system would fall apart.
|
| The worker who's killing it may be doing so by spending all
| their time "buttering up" a customer. The worker who appears
| underperforming may appear so because they spend all their time
| "buttering up" a customer, but someone else always lands the
| sale.
|
| It's a meditation on imperfect knowledge.
|
| kqr wrote:
| As you have observed already, this experiment is set up
| specifically to eliminate the effect of training/skill/physical
| endurance etc, and YET when it's performed in real life with a
| good facilitator, people who are unlucky start to feel like
| they're underperforming and need to step it up, while people
| who are lucky start to feel like they deserve the praise for
| doing well.
|
| I've read about people who go for days after the experiment and
| feel bad about their subpar performance because they feel like
| they've let down or brought shame to their company and wonder
| if they couldn't have done something better.
|
| And this is an experiment that's set up to remove any trace
| of individual agency whatsoever! People still beat themselves up
| over it.
|
| When you experience this experiment for real, you start to
| forget that it's actually designed to eliminate any sort of
| skill.
|
| In other words, the experiment shows how hard it is to
| recognise when we're judging the system and not the people in
| it. The experiment shows that even when you think you're seeing
| individual performance, it's very plausible you're not.
|
| ziggus wrote:
| Focusing on the type of work being done is a bit of a bike
| shed, since the experiment isn't about the work per se, but the
| measurement of the work as a function of the employee alone -
| ie, without the context of the systems in which the employee
| functions.
|
| A good example of the type of mismeasurement done in
| non-manufacturing contexts is the ridiculously stupid burn-down
| chart.
|
| webmaven wrote:
| _> A good example of the type of mismeasurement done in
| non-manufacturing contexts is the ridiculously stupid burn-down
| chart._
|
| Bad management can find a misuse for any tool, I don't think
| burn-down charts are a particularly attractive nuisance in
| that regard.
|
| candyman wrote:
| I was lucky enough to do this with the man himself at NYU. He had
| trouble speaking then but the class was dead silent and hung on
| his every word. Profound thinker.
|
| mark-r wrote:
| I saw a link to this in a discussion of another topic, I'm glad
| somebody pushed it to the top level. Definitely worth the read.
___________________________________________________________________
(page generated 2022-02-04 23:00 UTC)
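
Editor's aside on pakitan's remark that the experiment "could have been replicated by a simple computer simulation": a minimal Python sketch of such a simulation might look like the following. The bowl composition (800 red, 3200 white), the 50-hole paddle, six workers and four days are commonly cited figures for the demonstration but are assumptions here, not taken from this thread, and `simulate_red_beads` and `coefficient_of_variation` are hypothetical names. The sketch also computes the CV kqr mentions, taken as standard deviation divided by mean. It deliberately leaves out the paddle imperfections function_seven points to, so it models only pure chance.

```python
import random
import statistics


def simulate_red_beads(workers=6, days=4, red=800, white=3200, paddle=50, seed=0):
    """Return {worker name: [red beads drawn on each day]}.

    All defaults are assumed figures, not values from the thread.
    Every worker uses the identical random procedure, so any difference
    between workers is pure chance.
    """
    rng = random.Random(seed)
    bowl = [True] * red + [False] * white  # True marks a red ("defective") bead
    results = {}
    for w in range(1, workers + 1):
        # One paddle per day: draw `paddle` beads without replacement; the
        # bowl list is never modified, so beads are effectively returned.
        results[f"worker_{w}"] = [sum(rng.sample(bowl, paddle)) for _ in range(days)]
    return results


def coefficient_of_variation(counts):
    """Population standard deviation divided by the mean (0.0 if the mean is 0)."""
    mean = statistics.mean(counts)
    return statistics.pstdev(counts) / mean if mean else 0.0


if __name__ == "__main__":
    results = simulate_red_beads()
    # Rank workers by total red beads, the way a naive manager might.
    for name, counts in sorted(results.items(), key=lambda kv: sum(kv[1])):
        print(name, counts, "total:", sum(counts))
    all_counts = [c for counts in results.values() for c in counts]
    print("CV of daily red-bead counts:", round(coefficient_of_variation(all_counts), 2))
```

Re-running with a different `seed` reshuffles the "best" and "worst" workers, which is the thread's point: the ranking carries no information about the people.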