|
| jawns wrote:
| The most difficult part about code deletion is practicing the
| Chesterton's Fence principle:
|
| > In the matter of reforming things, as distinct from deforming
| them, there is one plain and simple principle; a principle which
| will probably be called a paradox. There exists in such a case a
| certain institution or law; let us say, for the sake of
| simplicity, a fence or gate erected across a road. The more
| modern type of reformer goes gaily up to it and says, "I don't
| see the use of this; let us clear it away." To which the more
| intelligent type of reformer will do well to answer: "If you
| don't see the use of it, I certainly won't let you clear it away.
| Go away and think. Then, when you can come back and tell me that
| you do see the use of it, I may allow you to destroy it."
|
| https://wiki.lesswrong.com/wiki/Chesterton%27s_Fence
|
| While this tool certainly does the job of _proposing_ code
| deletions, that's the easier part. The harder part is knowing
| why the code exists in the first place, which is necessary to
| know whether it's truly a good idea to remove it. Google,
| smartly, is leaving that part up to a human (for now).
| proper_elb wrote:
| You raise a good point, and I would answer it with agree and
| disagree:
|
| Agree: Yes, you are correct, merely observing that a code path
| was never executed in the last 6 months is not the same as
| understanding why the code path was created in the first place.
| There might be the quite real possibility of an infrequent
| event that appears just once in every two years or so (of
| course, this should also be documented somewhere!).
|
| Disagree: Pragmatically, we _have_ an answer if the code path
| was not executed after 6 months' use in production and test: We
| know that, with a very high probability, the code path was
| created either by mistake (human factor) or intentionally for
| some behavior that is no longer expected from our software. To
| continue the Fence metaphor, regarding Sensenmann: After 6
| months, we know about the Fence that 1) it has no role to play
| in keeping the stuff out that we want out (that was all done by
| other fences that had contact with an animal at least
| once) and 2) that it _might_ have been used to keep out flying
| elephants or whatever, but no such being was observed in the
| last 6 months (at least the fence made no contact with it,
| which it then should have!) and probably went away.
|
| That said, having a human in the loop is probably a good idea.
| anonymousiam wrote:
| It should also be clear that this article is about "deleting"
| code from an active project, not about "deleting" it entirely
| from the version control system. Thus, any code "deleted"
| through the described process could still easily be restored if
| necessary.
| breck wrote:
| As a counter to Chesterton's Fence: sometimes the fastest way
| to understand what something does is to remove it and see what
| complains. You might get only 1 complainer for every 10 fences
| you take down. Putting that one fence back up takes much longer
| than taking it down, but the time saved from removing the other
| 9 unnecessary ones makes it a net win. And this time you can
| add Documentation to the rebuilt fence.
| macNchz wrote:
| Also known as the scream test in IT: unplug that old server
| and see who screams!
| shagie wrote:
| Microsoft uses a scream test to silence its unused servers
| - https://www.microsoft.com/insidetrack/blog/microsoft-
| uses-a-...
| IshKebab wrote:
| A further counterpoint: if you follow the Fence proponents'
| logic to its conclusion you can _never remove any code_ which
| is clearly an absurd situation.
|
| I think the real logical flaw is that Fencers (as I will now
| call them) put the blame on the person who removes an
| apparently useless fence. But they're wrong. The real blame
| lies with the person who built the apparently useless fence
| and didn't put a sign on it explaining why it shouldn't be
| removed.
| proper_elb wrote:
| > A further counterpoint: if you follow the Fence
| proponents' logic to its conclusion you can never remove
| any code which is clearly an absurd situation.
|
| No, that would only be the case if one would never
| understand any code. Chesterton's Fence consists of two
| parts ("understanding some code" as a precondition to
| "removing some code"), and leaving one or the other part
| out makes it some other thing than what Chesterton's Fence
| means.
|
| > The real blame lies with the person who built the
| apparently useless fence and didn't put a sign on it
| explaining why it shouldn't be removed.
|
| Chesterton's Fence is not about blame, or the past in
| general - it is about how to deal with things that are in
| the present. (Although I agree that the original fence-
| builder should have left a note or two!)
| kube-system wrote:
| The principle says you can't remove it _until you
| understand why it was there_. It's more about doing due
| diligence.
|
| I follow the principle when I remove code, and it's a
| reason why good code comments are important. "Oh yeah, this
| was written for [x] which is no longer a thing, we can
| remove it now"
| hgsgm wrote:
| It's not about blame, it's about making good decisions.
| xboxnolifes wrote:
| And not putting up a sign explaining why it's a necessary
| fence is a bad decision. Avoiding removing all unlabeled
| fences because they _might, maybe, potentially_ be
| useful, is also likely a bad decision if taken to its
| conclusion.
| TeMPOraL wrote:
| Chesterton's Fence is a reminder to actually get up, walk
| over to the fence and skim the multiple labels and notes
| the builders left there - because the big problem is
| usually someone looking at a thing that came before them,
| and _assuming_ they understand what it was for, without
| bothering to actually check it.
| einpoklum wrote:
| The article is about the removal of _dead_ code. So, not a
| "fence across the road" - it's a fence that was moved to the
| side of the road, already cleared. The question is just whether
| to dismantle the fence or keep it there just in case.
| Xorlev wrote:
| +1. And, it's in version control forever. It's not as if it
| entirely disappears. Like one of the sibling comments
| mentioned, I only rarely reject Sensenmann CLs.
|
| That's worth explaining: it's automated code deletion, but
| the owner of the code (a committer to that directory
| hierarchy) must approve it, so it's rare there's ever a false
| deletion.
| opportune wrote:
| I don't think you understand Sensenmann fully based on this
| post. At Google basically everything in use has a Bazel-like
| build target. This means the code base is effectively a
| directed "forest"/tree-like data structure with recognizable
| sources and sinks. If you can trace through the tree and find
| included-but-not-used code by analyzing build targets, you can
| safely delete it. There are even systems (though not covering
| everything) that sample binaries' function usage you could
| double check against.
|
| > why the code exists in the first place
|
| If the code is unreachable it's at best a "possibly will be
| used in the future" and most likely simply something that was
| used but not deleted when its last use was removed (or a YAGNI
| liability).
|
| If you can find a piece of code included in build targets but
| unreachable in all of them, it's typically safe to delete. And
| it's not done without permission generally, automation will
| send the change to a team member to double check it's ok to
| delete/nobody is going to start using it soon.
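A minimal sketch of the reachability idea described in this comment, assuming the build graph is available as a plain mapping from each target to its direct dependencies. The target names and the `unreachable_targets` helper are hypothetical and only illustrate the graph walk; this is not how Sensenmann or Bazel expose the data.

```python
from collections import deque


def unreachable_targets(deps, roots):
    """Return targets that no root depends on, directly or transitively.

    deps  -- dict mapping each target to the list of targets it depends on
    roots -- targets assumed to be in use (for example, deployed binaries)
    """
    seen = set(roots)
    queue = deque(roots)
    # Breadth-first walk from the in-use roots along dependency edges.
    while queue:
        target = queue.popleft()
        for dep in deps.get(target, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    # Collect every target mentioned anywhere in the graph.
    all_targets = set(deps)
    for children in deps.values():
        all_targets.update(children)
    return sorted(all_targets - seen)


# Hypothetical target names, for illustration only.
deps = {
    "//app:server": ["//lib:auth", "//lib:log"],
    "//lib:auth": ["//lib:log"],
    "//tools:old_migrator": ["//lib:legacy_db"],
    "//lib:legacy_db": [],
    "//lib:log": [],
}
candidates = unreachable_targets(deps, roots=["//app:server"])
print(candidates)
```

Anything returned here is only a deletion candidate; as the comment notes, a human owner would still review the change before it lands.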
| UncleMeat wrote:
| "This code has been dead for six months" is a _very_ good
| heuristic that the code is not relevant. I do occasionally
| reject the sensenmann CLs, but only very very rarely. This isn't
| weird code that nobody knows why it exists but it is
| currently doing something. This is code that cannot execute.
| ninjanomnom wrote:
| Code that only triggers from a yearly holiday, disaster
| alerts, leap years, or the like, would have longer periods of
| going unused and likely be very problematic if removed.
| Unless by dead code you mean unreachable code in which case
| it shouldn't exist in the first place and I agree should be
| removed.
| joshuamorton wrote:
| Yes, the nice thing about blaze/bazel + sensenmann is that
| you can very accurately say "this code was not built into a
| binary that has run in the past 6 months".
|
| Sometimes you still want it (e.g. python scripts that are
| used every once in a while for ad-hoc things and might go
| months between uses), but _usually_ the right thing to do
| is productionize stuff like that slightly more (and also
| test it semi-regularly to make sure it hasn't broken).
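The "not run in the past 6 months" check can be sketched as a simple filter over last-run timestamps. This assumes some log already records when each binary last ran; the `stale_binaries` helper, the binary names, and the 183-day default are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta


def stale_binaries(last_run, now, window=timedelta(days=183)):
    """Return binaries whose most recent recorded run is older than `window`.

    last_run -- dict mapping binary name to the datetime of its last run,
                or None if no run was ever recorded
    """
    cutoff = now - window
    return sorted(
        name
        for name, ran_at in last_run.items()
        if ran_at is None or ran_at < cutoff
    )


# Hypothetical data, for illustration only.
now = datetime(2023, 4, 29)
last_run = {
    "frontend_server": datetime(2023, 4, 28),
    "adhoc_cleanup_script": datetime(2022, 9, 1),
    "year_end_report": datetime(2022, 7, 1),
    "abandoned_prototype": None,
}
print(stale_binaries(last_run, now))
print(stale_binaries(last_run, now, window=timedelta(days=400)))
```

With the six-month window, the ad-hoc script and the yearly report are flagged alongside the truly abandoned binary, which is the infrequent-use case raised in this subthread. Widening the window to 400 days leaves only the binary that never ran.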
| CyberDildonics wrote:
| You can probably get most of that by just looking at the
| atime attribute on the file system.
| joshuamorton wrote:
| Nah, there's stuff that scans the entire repo regularly
| for all kinds of interesting purposes, and that's
| ignoring the fact that `atime` isn't available or a
| source of truth in piper.
|
| Like conceptually I believe this could be wrong in both
| directions, since there's heavy caching of build
| artifacts, you can totally build a transitive dependency
| of some file without actually reading the file (and
| potentially do this for a relatively long period of time,
| though I don't think that will happen in practice), and
| stuff will regularly look through large swaths of files
| that aren't necessarily run.
| taspeotis wrote:
| Different industries I guess. The new financial year comes
| around every 12mo. Good luck explaining to the accountants
| that you deleted their end-of-year reconciliation reports
| because they didn't run them every 6mo.
| jeffbee wrote:
| Wouldn't it be mostly their fault for approving the
| removal?
| dekhn wrote:
| Google's response to Chesterton's Fence is: "if you liked it,
| then put a test on it".
|
| I used to update the internal version of numpy for Google and
| if people asked me to rollback after I made my update (having
| fixed all the test failures I could detect), and they didn't
| have a test, well, that's their problem. The one situation
| where that rule wouldn't apply is if I somehow managed to break
| production and we needed to do an emergency rollback.
|
| I shed a tear when some of my old, unused code was autodeleted
| at Google, but nowadays my attitude is: HEAD of your version
| control should only contain things which are absolutely
| necessary from a functional selection perspective.
| Forge36 wrote:
| I like that philosophy. In a similar vein: if it's important
| why aren't we testing it?
|
| How do you encourage testing?
| sitkack wrote:
| I hope everyone involved gets L+1!
| [deleted]
| joebiden2 wrote:
| Sincere question: what is interesting or novel about this? Is it
| just the scale or did I miss some subtle aspect?
|
| This is more (or less?) the same as industry best practices, just
| scaled up. There is a challenge in scaling up, as there is more
| potential for someone to mess it up. But it's the same technique.
|
| So what am I missing?
| summerlight wrote:
| From the perspective of software engineering economics, the scale
| is important. Everyone knows it's good to clean up unused code
| but they just don't care because they think it doesn't yield a
| short term ROI for themselves. Then why don't we bring the cost
| down and see what happens? Automation changes this equation.
| codemac wrote:
| > the same as industry best practices, just scaled up.
|
| That's like saying S3 is the same as ext4, they're the same, just
| scaled up! This is a poor argument, you'll note that S3 and
| ext4 are entirely different things, not "challenges",
| fundamentally different implementations.
|
| Google is the only company I've ever worked for that
| automatically deleted dead code, let alone across a company of
| 100k+ SWE.
| joebiden2 wrote:
| Fair enough and thanks for the reply. Still, for anything
| bothering engineers more than a bit repeatedly, anyone will
| write tools to remove the manual burden.
|
| Our internal practice is to delete code if you suspect it's
| unused, run tests, and if it doesn't affect any tests, go for
| it. This could be automated, but it is not pressing enough,
| so we didn't automate it yet.
|
| We could though, and it may even be a good idea, but I still
| don't get the novelty. But I appreciate your point of view.
| kccqzy wrote:
| I think the takeaway is that at Google's scale, even if you
| think some minor problem is not pressing enough to be
| automated, it will become pressing soon enough.
| jsnell wrote:
| Your proposed process is exactly the wrong way around.
| You'll end up keeping dead code just because it has tests,
| and delete code that's still used in prod just because it
| happened to be untested.
|
| This is one of the details that the blog post goes into.
| Sounds like it's not as trivial and obvious a problem as
| you think it is, and you would have benefited from just not
| dismissing the post because of that.
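The two failure modes named in this comment can be made concrete with a toy comparison. This is a sketch under the assumption that each code unit carries two independent flags; the unit names and both helper functions are hypothetical.

```python
def delete_if_untested(units):
    """Policy from the parent comment: delete whatever no test exercises."""
    return sorted(name for name, info in units.items() if not info["tested"])


def delete_if_unused(units):
    """Usage-based policy: delete whatever production never runs."""
    return sorted(name for name, info in units.items() if not info["used_in_prod"])


# Hypothetical code units, for illustration only.
units = {
    "billing_export": {"tested": False, "used_in_prod": True},
    "legacy_importer": {"tested": True, "used_in_prod": False},
    "login_handler": {"tested": True, "used_in_prod": True},
}
print(delete_if_untested(units))
print(delete_if_unused(units))
```

The test-based policy removes `billing_export`, which production still uses, and keeps `legacy_importer`, which is dead but tested. The usage-based policy makes the opposite choice on both.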
| brunooliv wrote:
| Not to criticize your POV and argument directly, but, in
| the end, a lot of things, especially like these, are always
| easily subjected to the "we could do it, but just didn't
| bother to yet" kind of argument, and, when it comes down to
| the real work, things are much harder than they
| superficially appear to be. So yeah this isn't new...
| But... You know eheh
| joebiden2 wrote:
| Well, I'd politely agree to disagree. Google scale is
| defined by novel, radical approaches, like for example
| inventing map/reduce, writing papers on LLMs others then
| implement successfully, or creating something like
| Kubernetes.
|
| The specific topic here is not one of those google
| problems to me, as I can compare it to other problems we
| already solved. But yes, we could miss that critical
| point where a totally different problem domain emerges
| just from one order of magnitude more, so fair game.
| btilly wrote:
| I liked most of the blog, but it bothers me to see stuff like,
| "... just as with the introduction of unit testing 20 years
| ago...".
|
| No, unit testing was NOT introduced 20 years ago. As an example,
| Perl 1 was released about 35 years ago with a unit test suite
| that got run on every install. Every version of Perl has done so
| since, and since CPAN came along, most Perl modules have followed
| suit. This was the secret sauce behind Perl's reputation for
| being so portable.
|
| Nor was Perl a pioneer. In fact unit testing was used in the
| 1960s on the Apollo program, and was even called unit testing. I
| believe that the concept can be dated back to a 1950s textbook
| but I can't find the reference.
|
| So unit testing is over 60 years old.
| allanrbo wrote:
| Maybe they just mean that Google started investing seriously in
| unit testing 20 years ago?
| UncleMeat wrote:
| This refers to Google, I'm pretty sure. Early Google didn't
| believe in unit testing and it took a few particularly stubborn
| engineers to demonstrate the value of the practice and convert
| the culture to promoting unit testing.
| charcircuit wrote:
| Does this only get rid of unused binaries? Or is the system smart
| enough to use the profiling infrastructure to identify dead code
| in general?
| kccqzy wrote:
| Profiling is inherently probabilistic and I don't think it
| should be used. Anything that inspects the runtime (dynamic)
| behavior of code isn't good enough for a code deletion tool.
| Only static analysis will do.
| Jolter wrote:
| Just read the article. It's not that long.
| charcircuit wrote:
| I was hoping to spark a conversation about this approach as
| no such thing was mentioned in the article even though it
| should be possible to do.
|
| The whole point of these comment sections is to have a
| discussion. If the point of this site was just to read
| articles there wouldn't be a comment section.
| croes wrote:
| FYI: Sensenmann is the German word for Grim Reaper
| Zetobal wrote:
| I still don't get the tech industry's fascination with random
| German words; at least here it's sort of fitting.
| aardvarkr wrote:
| Citation needed ^ Maybe there are just some great German devs
| and you're using a lot of their software?
| mflendrich wrote:
| Google's Zurich office has had a tradition of creating
| codenames in German (regardless of the backgrounds of any
| engineers involved).
|
| Source: I worked on Sensenmann.
| meibo wrote:
| Looks like this was made by a team in Zurich, which is mostly
| German, so I imagine it came to them fairly naturally, and
| who doesn't want to pick cool names for hackathon projects.
| evmar wrote:
| It was kind of an internal joke at Google for German-
| speaking teams to make German-named projects. (It's maybe
| only a joke that makes sense to the infamous German sense
| of humor.)
| mkoubaa wrote:
| Sounds like a useful system from which almost nothing is usable
| outside of Google.
| vamega wrote:
| Sure; but working at another very large company, I can say I
| wish we had this.
|
| Old unused code is a huge problem for us. The coordination
| costs of trying to update company wide problems are made much
| more severe by old code.
|
| I wish we had something like this. We're large enough we'd need
| our own system anyway. We don't have a monorepo, and we don't
| use tools so many others do.
| DamonHD wrote:
| I worry about archival and enough history for diagnosing long-
| standing subtle issues that take a long time to surface as bugs.
| This is not theoretical: apparently a TeX bug picked up after
| many years had been there from the start.
| kragen wrote:
| i think piper saves the full history of the whole monorepo; if
| that's correct it's not 'deletion' in that sense
| gravypod wrote:
| (opinions are my own)
|
| > Its goal is simple (at least, in principle): automatically
| identify dead code, and send code review requests
| ('changelists') to delete it.
|
| It sends CLs (pull requests) and shows up as a commit. You
| get a chance to approve or deny the deletion
| kragen wrote:
| yeah but even if you approve it the code is still there in
| the code history, right?
| bradfitz wrote:
| Yes.
| DamonHD wrote:
| Good.
|
| But relatively hard to find and work with.
| dekhn wrote:
| I used to do archaeology on Google's monorepo (i.e.,
| looking far into the past of mapreduce, search engine,
| ads, and other products) and it wasn't really that hard.
| Heck, there was even a synthetic filesystem where you
| could just cd to a historical commit # and see a view of
| the repo at that timepoint (google's version control is
| based on always-increasing, globally shared commit
| numbers).
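A toy model of why a single, always-increasing commit number makes "view the repo at that timepoint" straightforward: each path keeps a sorted list of the commits that touched it, and a lookup is a binary search. The `History` class is a hypothetical sketch, not a description of how Piper or the filesystem mentioned above is implemented.

```python
import bisect


class History:
    """Toy file history keyed by one global, always-increasing commit number."""

    def __init__(self):
        self._commits = {}   # path -> sorted list of commit numbers touching it
        self._contents = {}  # path -> contents parallel to _commits (None = deleted)
        self._head = 0

    def commit(self, path, content):
        """Record new content for `path` (None marks a deletion); return the commit number."""
        self._head += 1
        self._commits.setdefault(path, []).append(self._head)
        self._contents.setdefault(path, []).append(content)
        return self._head

    def read_at(self, path, commit):
        """Return the content of `path` as of `commit`, or None if it was absent then."""
        numbers = self._commits.get(path, [])
        # Index of the first recorded commit strictly after the requested one.
        i = bisect.bisect_right(numbers, commit)
        return self._contents[path][i - 1] if i else None


history = History()
c1 = history.commit("lib/old.py", "def f(): ...")
c2 = history.commit("lib/new.py", "def g(): ...")
c3 = history.commit("lib/old.py", None)  # the automated deletion lands here

print(history.read_at("lib/old.py", c2))
print(history.read_at("lib/old.py", c3))
```

Reading `lib/old.py` at `c2` still returns its source, while reading it at `c3` returns `None`: the deletion only affects views at or after the deleting commit.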
| kevinoconnor7 wrote:
| Not really. If you're in the very rare situation where
| you need to diagnose a bug in long-since dead code, you
| can just view the repository synced to where the version was
| cut.
| pradn wrote:
| No you just go to the folder and select "show deleted".
| dmoy wrote:
| Don't even need to do that. Codesearch with a `from:0`
| qualifier does full regex search on the history iirc
| tonfa wrote:
| Still fairly easy to find IMO. You can search deleted
| code easily, and blame layers allows finding how code
| evolved fairly quickly.
| kragen wrote:
| thank you
| jmyeet wrote:
| The most important part of this is that the build units are
| hermetic and all dependencies are explicit. This is why you need
| to use something like Bazel/Blaze vs older build systems like
| make where identifying what's used, particularly when you get
| into meta-rules, becomes all but impossible.
|
| As the article points out, you also have to look at what's
| actually run. This is the real advantage of Google
| infrastructure: the vertical integration so if a binary is run on
| Borg, or even on the command line, that can be tracked.
| lifeisstillgood wrote:
| There is a meta-meta situation surrounding really good software
| management.
|
| You can knock up some code that say solves a specific business
| problem right now. (meta:0)
|
| But you need an environment that can take a new piece of code and
| deploy it and test it (meta:1)
|
| how is that code running - this is shading from production
| monitoring into QA and performance (meta:2)
|
| Compare all the running code and its performance against the
| benefits of replacing code or going back to level 0 and just
| fixing a business problem (meta:3)
|
| Then this death eater - meta 4 I think.
|
| And to me this is why comments like "software needs to solve
| business problems" are naive - once you start using software you
| need more software to manage the software - it's going to grow
| till it consumes the business.
| falcor84 wrote:
| >For example, if an engineer is unsure how to use a library, they
| can find examples just by searching
|
| Isn't that the case with all libraries? How does the monorepo
| help here?
| er4hn wrote:
| Discoverability. It's a lot easier to search one repo than it
| is to search a set of repos. For the latter you need to have
| all the repos listed somewhere, and have them be accessible.
| speedgoose wrote:
| SourceGraph is great to search in many repos.
| knutzui wrote:
| It's just as easy to index multiple repos as it is to index
| one, which means that the same goes for searching. Why would
| it be any different for one as opposed to many?
| codetrotter wrote:
| We use GitLab in the company I work for. There may be
| repositories created by others in the company, that depend
| on repos I work on, where I don't have access to said other
| repos. So to me these are invisible. If everything was in
| one monorepo, it'd all be visible to me easily.
| Kwpolska wrote:
| That's more of a culture thing. Your company chose to
| enforce more granular permissions. Perhaps there are good
| reasons behind it, e.g. code for different clients being
| under different contracts and NDAs.
| speedgoose wrote:
| I'm not sure. I also thought that these big repos have to use
| sparse checkouts to not use too much space on the developers
| machines. So you would have to use an external code search
| index anyway.
| kpw94 wrote:
| "searching" how a method is used is as simple as clicking on
| the symbol.
|
| Think Visual Studio "find all references", but working around
| the entire company's codebase, not just your current project.
| oneplane wrote:
| On a non-google level just being aware of code sitting around
| costing resources is pretty important. Often, tests and
| maintenance are just ignored or not calculated in as cost (be it
| time, money, effort or otherwise). It is almost in the same realm
| as "I don't know why it works" which is as dangerous as "I don't
| know why it doesn't work".
___________________________________________________________________
(page generated 2023-04-29 23:00 UTC) |