|
| dataflow wrote:
| The notion of fire-and-forget is itself the problem. Even with
| threads, you should have them join the main thread before the
| program exits. Which implies you should hold strong references to
| them until then. Most people don't go out of their way to do this
| even when they're able to, but that's what you're supposed to do.
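A minimal sketch of the thread discipline dataflow describes: keep a strong reference to every thread you start and join each one before the program exits. The `worker` function and the `results` list are hypothetical names used only for illustration.

```python
import threading

results = []
lock = threading.Lock()

def worker(n: int) -> None:
    # Stand-in for real work; the lock protects the shared list.
    with lock:
        results.append(n * n)

# Hold a strong reference to each thread so it can be joined later.
threads = [threading.Thread(target=worker, args=(n,)) for n in range(4)]
for t in threads:
    t.start()

# Join every thread before the program is allowed to exit.
for t in threads:
    t.join()

print(sorted(results))
```

Holding the list of threads is what makes the final join loop possible; a thread started without a stored reference cannot be joined this way.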
| bornfreddy wrote:
| Wow. What a strange design decision, as evidenced by sheer number
| of developers who don't / didn't know about this (myself
| included). I hope this gets _fixed_ instead of just documented.
| jcheng wrote:
| Agreed, I'm really surprised at all the comments defending this
| behavior. I suspect there is a non-obvious reason why it's this
| way, but "you should've read the docs" and "but why _wouldn't_
| you hold your own strong reference" are weird takes IMHO.
| boomskats wrote:
| As someone who happens to be eternally grateful to the author for
| his contribution to the Python ecosystem [0], I kinda feel like
| this comment thread is overreacting to his overreaction. When I
| look at this post all I see is a useful, well explained, byte-
| size writeup that a search engine might recommend to someone
| looking for help in writing async Python.
|
| Maybe it's because a bunch of my friends are Scottish and I get
| their sense of humour.
|
| [0]: https://rich.readthedocs.io/ (yes I'm talking about the
| fancy new progress bar that pip got recently)
| rlpb wrote:
| This issue doesn't exist with Trio's structured concurrency
| model. In other words, the problem is already solved.
| nbadg wrote:
| I'll +1 the Trio shoutout [1], but it's worth emphasizing that
| the core concept of Trio (nurseries) now exists in the stdlib
| in the form of task groups [2]. The article mentions this very
| briefly, but it's easy to miss, and I wouldn't describe it as a
| solution to this bug, anyways. Rather, it's more of a different
| way of writing multitasking code, which happens to make this
| class of bug impossible.
|
| [1] https://github.com/python-trio/trio
|
| [2] https://docs.python.org/3/library/asyncio-task.html#task-
| gro...
| Tanjreeve wrote:
| Oh good so now we can all move to this years Async flavour in
| Python.
| edfletcher_t137 wrote:
| This is a great blog post. Concise, lacking fluff or extraneous
| prose, it gets right to the point, presents the primary-source
| reference and then gets right to the solution. A bit of
| editorializing in the middle but that's completely allowed when
| writing this tightly. Well damn done, OP.
|
| And also it's _great_ information that I - like I'm sure many of
| you - also never noticed. THANK YOU!
| mgsk wrote:
| What does this add that isn't already right there in the
| documentation?
| nkrisc wrote:
| If there was nothing to add then there wouldn't be loads of
| projects on GitHub making exactly this mistake.
| Jtsummers wrote:
| It draws attention to a problem that a lot of people have
| created for themselves by not reading the documentation (or
| not recalling it if they read it). I guess the author could
| have just linked the documentation but then they couldn't
| have added the additional context of the github search
| demonstrating how common it is.
| newaccount74 wrote:
| I must have looked through the docs for create_task a dozen
| times while trying to figure out how async/await works in
| Python but still managed to overlook this part.
| edflsafoiewq wrote:
| That is unsurprising. It was first added as a brief note
| only in 3.9, and expanded to its present length only in
| 3.10.
| klyrs wrote:
| The author doesn't go into much detail on that point: this
| warning should be present in documentation of many Python
| libraries that use create_task and return the result to the
| user unless that library stores those tasks in a collection
| as is recommended -- at which point the library author had
| better roll their own garbage collection!
| isoprophlex wrote:
| Well, I don't know, I kinda miss the human angle. I'd have
| loved to first read six paragraphs about how the author's
| grandmother raised them on home grown threads and greenlets :^)
| nickjj wrote:
| > I'd have loved to first read six paragraphs about how the
| author's grandmother raised them on home grown threads and
| greenlets.
|
| With recipes, often times your problem is you want to learn
| how to make something where having the steps listed out is
| the most important thing. The story behind the recipe isn't
| important to solve your problem but for tech the story around
| the choice is important. Often times the "why" is really
| important and I really like hearing about what led someone to
| use something first. Often times that's more important or
| equally as important as the implementation details.
|
| It wouldn't make sense for this post given its title but if
| someone were making a post about why they chose to use async
| in Python I'd expect and hope that half of the post goes into
| the gory details of how they tried alternatives and what
| their shortcomings were for their specific use cases. That
| would help me as the reader generalize their post to my
| specific use cases and see if it applies.
| bialpio wrote:
| Off-topic but the life story is there to make them eligible
| to be protected by copyright. IANAL.
|
| Source: https://copyrightalliance.org/are-recipes-
| cookbooks-protecte...
| flandish wrote:
| Interesting. I always thought it was search engine
| optimization.
| aidenn0 wrote:
| SEO is definitely a big part of it; Google penalized
| pages where people closed or navigated away quickly.
| fbdab103 wrote:
| I immediately bounce from those Stackoverflow clones that
| keep appearing up at the top of searches. So, I am
| wondering how much this is still weighted in the scores.
| gdprrrr wrote:
| https://github.com/quenhus/uBlock-Origin-dev-filter
| jonas21 wrote:
| You might. But many people don't. They just want an
| answer and don't care if it's a clone or not.
| chucksmash wrote:
| Had this driven home recently, watching a younger dev
| happily clicking links I've long ago blocked via browser
| extension (w3schools AND geeksforgeeks _in one session_)
| rmbyrro wrote:
| SEO makes total sense. I always add grandma keywords when
| I'm searching for Python stuff on Google.
|
| Like: "grandma, how the hell have I still not memorized
| the API and keep needing to resort to the same doc pages
| again and again?"
|
| Now I trained ChatGPT with grandma letters from when I
| was young, so it will answer just like if it was my
| grandma.
| water-your-self wrote:
| It's engagement optimization. Adsense pays more if you
| spend more time on the page
| yunohn wrote:
| When is the last time you heard of online recipe blogs
| enforcing copyright claims on other blogspam? Ridiculous.
|
| The real reason is simple, people who write recipes
| aren't robots - they're expressing their stories and
| emotions, while explaining how to make food that's dear
| to them..
| throwaway81523 wrote:
| There's a similar thing in tkinter but I guess users discover it
| faster, since the failure if you don't save the reference shows
| up fairly quickly.
| Lammy wrote:
| I experienced a heisenbug exactly like this in Ruby when trying
| to `while case Ractor::receive`:
| https://github.com/okeeblow/DistorteD/blob/dd2a99285072982d3...
| zzzeek wrote:
| I think asyncio is kind of neat for what it's good at, but
| beginner programmers who have never written code before are going
| directly to using Python asyncio (i know this because they are
| telling me so when they post sqlalchemy discussions). This is
| just wrong.
| samwillis wrote:
| This is one of many reasons I'm sceptical of the current trend in
| Python to "async all the things". The nuance to how it operates
| is often opaque to the developer, particularly those less
| experienced.
|
| GUI toolkits (like Textual) however are a really good use case
| for Asyncio. Human interaction with a program is inherently
| asynchronous, using async/await so that you can more cleanly
| specify your control flow is so much better than complicated
| callbacks. Using async/await in front end JS code for example is
| a delight.
|
| Where I'm particularly unconvinced of their use is in server side
| view and api end point processing. The majority of the time you
| have maybe a couple of IO ops that depend on each other. There
| is often little that can be parallelised (within a request) and
| so there are few performance gains to be made. Traditional
| synchronous imperative code run with a multithreaded server is
| proven, scalable and much easier to debug.
|
| There are always places where it's useful though, things such as
| long running requests (websockets, long polling), or those very
| rare occurrences where you do have many easily parallelizable IO
| ops within one short request.
| heavyset_go wrote:
| > _Where I'm particularly unconvinced of their use is in
| server side view and api end point processing. The majority of
| the time you have maybe a couple of IO ops that depend on each
| other. There is often little that can be parallelised (within a
| request) and so there are few performance gains to be made.
| Traditional synchronous imperative code run with a
| multithreaded server is proven, scalable and much easier to
| debug._
|
| Python doesn't have multithreading that scales or supports real
| parallelism. asyncio has very measurable performance benefits
| for exactly that use case you've mentioned versus threaded
| servers.
| zzzeek wrote:
| Sorry that's not accurate. Asyncio and threading offer the
| same variety of "parallelism" , which is that both can wait
| on multiple io streams at once (the gil is released waiting
| on io). Neither offer CPU parallelism, unless lots of your
| CPU work is in native extensions that release the gil. In
| that unusual case, threading would offer parallelism where
| asyncio wouldn't.
|
| Asyncio's single advantage is you can wait on _lots_ of io
| streams, like many thousands, very cheaply without having to
| roll non blocking IO queueing code directly.
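A rough sketch of the point zzzeek makes about waiting on many I/O streams cheaply. Here `fake_io` is a hypothetical stand-in that only sleeps, and the count of 5000 is an arbitrary choice for illustration; real sockets would behave differently in detail.

```python
import asyncio

async def fake_io(i: int) -> int:
    await asyncio.sleep(0.05)  # stand-in for waiting on a socket
    return i

async def main(n: int) -> list[int]:
    # gather() wraps each coroutine in a task and holds on to it,
    # so all n waits overlap on a single thread.
    return await asyncio.gather(*(fake_io(i) for i in range(n)))

results = asyncio.run(main(5000))
print(len(results))
```

All of the waits run concurrently on one thread, which is the cheap part; none of the work here runs in parallel on multiple cores.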
| heavyset_go wrote:
| I didn't say that asyncio offered parallelism, I'm pointing
| out that normal assumptions about multithreading you'd make
| with other languages don't always apply to Python. You'd
| typically assume that threads offer parallelism, a property
| you might choose to use them for over something like
| single-threaded asyncio.
|
| I've found that for even IO bound workloads, the amount of
| throughput plateaus when using a relatively small amount of
| threads despite the GIL being released on IO.
| Topgamer7 wrote:
| These days with graphql, or complex microservices
| architectures, you could have multiple hops to fulfill the
| original request.
|
| Flask sync will hold that thread hostage until the request is
| done. Where async with properly used async libs will allow
| other requests to process.
|
| We often have medium sized reports take seconds. That is a lot
| of time to wait. And would just end up bloating your service
| scaling to handle more connections.
|
| Any service with decently long lived network requests will
| benefit from event loop handled scheduling.
| traverseda wrote:
| >Where I'm particularly unconvinced of their use is in server
| side view and api end point processing.
|
| Sure, performance isn't going to get better, but for websockets
| and server sent events the occasional long-lived async task can
| be great. Especially when you need to poll something, or check
| in on a subprocess.
| nbadg wrote:
| The thing is, there's a lot more nuance to it than this.
| Async/await is part of the language syntax in python, but
| asyncio is only one particular implementation of an event loop
| framework to power it. But really what async/await provides is
| a general-purpose cooperative multitasking syntax. This allows
| other libraries to implement their own event loop frameworks,
| each with their own different semantics and considerations (the
| two best-known alternatives being Curio and Trio). At a
| language level, there's nothing even forcing you to use
| async/await for async IO -- you could, if you really wanted,
| probably write a library that used it to start threads and
| await their completion.
|
| So you have, from highest-level to lowest-level: application
| code, async/await language syntax, the event loop framework,
| and then the implementation of the event loop itself. The OP
| article concerns a peculiar implementation detail in the lowest
| level that makes it very easy to write bugs at the highest
| level.
|
| But that means that even if you do "async all the things",
| you'll only encounter this situation if you write your
| application code in a particular way. It just so happens that
| "in a particular way" is, in this case, the overwhelming
| majority of how people write it, which is, of course, why the
| OP article is relevant.
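To illustrate the claim that async/await is a general-purpose cooperative multitasking syntax, here is a toy sketch that runs a coroutine with no asyncio at all. `Pause`, `job`, and `drive` are hypothetical names invented for this example; `drive` plays the role of a very small event loop.

```python
class Pause:
    """Hypothetical awaitable that hands control back to whoever drives the coroutine."""

    def __init__(self, label: str) -> None:
        self.label = label

    def __await__(self):
        # The bare yield suspends the coroutine; the driver's send() resumes it.
        reply = yield self.label
        return reply

async def job() -> int:
    a = await Pause("first")
    b = await Pause("second")
    return a + b

def drive(coro):
    """Toy 'event loop': answer each pause with the length of its label."""
    seen = []
    try:
        label = coro.send(None)  # start the coroutine
        while True:
            seen.append(label)
            label = coro.send(len(label))  # resume with a reply
    except StopIteration as stop:
        return seen, stop.value

seen, total = drive(job())
print(seen, total)
```

The coroutine never touches an event loop framework; whatever calls `send()` decides what each `await` means, which is why Curio and Trio can give the same syntax different semantics.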
| heavyset_go wrote:
| > _The OP article concerns a peculiar implementation detail
| in the lowest level that makes it very easy to write bugs at
| the highest level._
|
| Are other async implementations using the asyncio.Task
| abstraction? I haven't looked into it, but I assumed that
| asyncio.Task was tied to the asyncio implementation and event
| loop.
| pdonis wrote:
| _> GUI toolkits (like Textual) however are a really good use
| case for Asyncio._
|
| Only if the GUI toolkit is explicitly written to be asyncio-
| aware and use asyncio's event loop. Textual appears to be
| written specifically to do that.
|
| However, other GUI toolkits that I'm aware of that have Python
| bindings aren't written that way. Qt, for example, uses its own
| event loop, and if you want anything other than a GUI event to
| be fed into Qt's event loop so your event-driven code can
| process it, you have to do that by hand and make sure it works.
| There is no point in even trying to use another event loop,
| such as Python's asyncio event loop, since that loop will never
| run while Qt's event loop is running.
| samsquire wrote:
| I am a huge fan of parallel and async code. I spend a lot of
| time researching it and trying to design software that is
| easily parallelisable.
|
| Many GUIs use the event/message pump pattern, such as Windows
| 32 API. Qt does something with its event loop (QEventLoop)
|
| Threads are a rather low level instrument to get background
| tasks going because the interface between the main thread and
| the threads is rather omitted.
|
| In Java you could use a ConcurrentLinkedQueue. And in Python
| you can use JoinableQueue.
|
| I am heavily interested in this space because I want to write
| understandable software that anybody can pick up and work with.
| I worked on a JMS log viewer that used threads but would crash
| with ConcurrentModificationException due to not being thread
| safe. I changed it to be thread safe but its performance
| dropped through the floor. In my learnings since then I should
| have sharded each JMS connection topic to its own thread or
| multiplexed multiple JMS topics per thread and loop over them.
| The main thread can interrogate the thread with a lock, that
| should be faster than every thread trying to acquire the lock.
| It would be driven by the main thread but the work is done in
| the background. The threads can keep the fetched messages in
| memory until the main thread is ready for them.
|
| I think with the right abstraction, thread safety can be
| achieved and concurrency shouldn't be something to be afraid
| of. It is very difficult and challenging working at the low
| levels of concurrency such as a concurrent browser engine.
| (I've not done that though.)
|
| This is why languages such as Pony lang, Inko, Cyber and
| Erlang, Elixir are so promising. We can build high performance
| systems that parallelise.
|
| Writing an async/await pipeline that looks synchronous is far
| easier to understand and maintain than nested callbacks. So I
| can see where async is useful. I just hope we can design async
| software to be simpler to maintain and extend.
| whoopdeepoo wrote:
| I don't write any colored function code in python, I'd much
| rather work with process/thread pools
| Animats wrote:
| Me too, but threading is botched in Python. Not just the
| Global Interpreter Lock. Some Python packages are not thread-
| safe, and it's not documented which ones are not. Years ago I
| discovered that CPickle was not thread safe, and that wasn't
| considered a problem.
| michael_j_x wrote:
| I am not sure I agree that the GUI is a good use case for
| async. A human interaction with the program must almost always
| pre-empt whatever the program was running, so I can not see how
| a cooperative multi-threading runtime like async Python can
| work in such a scenario.
| kodablah wrote:
| It is for this reason in Temporal Python[0], where we wrote a
| custom durable asyncio event loop, that we maintain strong
| references to tasks that are created in workflows. This wouldn't
| be hard for other event loop implementations to do too.
|
| 0 - https://github.com/temporalio/sdk-python
| make3 wrote:
| he never said it was hard, his point is that it's unintuitive &
| a lot of people don't know or don't remember
| kodablah wrote:
| I mean the default asyncio event loop can be
| replaced/extended where you won't have to know/remember on
| each create_task. But yes, it is an unintuitive default.
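One way the idea kodablah describes could be sketched on the stock asyncio loop is a task factory that keeps a strong reference to every task until it finishes. This is an assumption about how one might do it, not Temporal's implementation; `install_strong_ref_factory` and `background` are hypothetical names.

```python
import asyncio

def install_strong_ref_factory(loop: asyncio.AbstractEventLoop) -> set:
    """Hypothetical helper: make the loop keep every new task alive until it is done."""
    live_tasks: set = set()

    def factory(loop, coro, **kwargs):
        # Extra keyword arguments (if the loop passes any) go straight to Task.
        task = asyncio.Task(coro, loop=loop, **kwargs)
        live_tasks.add(task)                        # strong reference
        task.add_done_callback(live_tasks.discard)  # released on completion
        return task

    loop.set_task_factory(factory)
    return live_tasks

async def background(log: list) -> None:
    await asyncio.sleep(0.01)
    log.append("done")

async def main() -> list:
    log: list = []
    install_strong_ref_factory(asyncio.get_running_loop())
    asyncio.create_task(background(log))  # result deliberately not stored
    await asyncio.sleep(0.05)
    return log

log = asyncio.run(main())
print(log)
```

With the factory installed, call sites no longer have to remember to store the task, at the cost of the loop owning every task for its whole lifetime.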
| NelsonMinar wrote:
| Does anyone understand why the event loop only keeps weak
| references to tasks? It'd seem wise to do something to stop it
| from being garbage collected while running, maybe also while
| waiting to run.
| coopsmoss wrote:
| I agree, I think this is very unpythonic behavior
| masklinn wrote:
| Only guess I'd have is to protect the system against infinite-
| loop tasks, but I don't remember any other runtime caring and
| a task which never terminates seems easier to diagnose than
| one which disappears on you.
| kortex wrote:
| Because it's almost always the case that the consumer is going
| to keep a reference to the task in some way, so that is the
| logical choice for the "primary owner" of the task. Python
| doesn't have ownership per se like rust, but if you keep more
| than one hard reference to an object around, it'll prevent
| collection, so in cases such as this it makes sense to
| designate one primary owner and have all other references be
| weakref.
| skitter wrote:
| > if you keep more than one hard reference to an object
| around, it'll prevent collection
|
| Which is the behavior the parent comment asks for.
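A small sketch of the weak-reference behaviour under discussion, using `weakref.WeakSet` with plain objects rather than tasks. `Job` and `registry` are hypothetical names; the explicit `gc.collect()` is there so the outcome does not depend on reference-counting details.

```python
import gc
import weakref

class Job:
    """Hypothetical stand-in for any object tracked only through weak references."""

registry = weakref.WeakSet()

kept = Job()
registry.add(kept)   # a strong reference exists elsewhere: the name 'kept'
registry.add(Job())  # no strong reference survives this statement

gc.collect()
remaining = len(registry)
print(remaining)
```

The set alone never keeps an object alive; only the object that something else still refers to remains in it.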
| anthomtb wrote:
| Well, looks like I know what I am doing first thing on Monday. I
| converted a bunch of code to asyncio a while back. I have yet to
| run into any heisenbug in that code and want to keep it that way.
| cpburns2009 wrote:
| I've been working on a PySide6 application recently using
| asyncio. I read the docs but totally overlooked the requirement
| to hold references to tasks created with `create_task()`.
| dehrmann wrote:
| Eww. What's especially nasty is this is the opposite behavior of
| threads.
| aeturnum wrote:
| I really think this writer doth protest too much.
|
| Yes, the base async interface is confusing and overly complex.
| It's a downside! As they note lots of people have stepped in to
| provide better helpers (like TaskGroups) - but these are the docs
| for the base library!
|
| > _But who reads all the docs? And who has perfect recall if they
| do?_
|
| Everyone reads the docs? That is why you don't need perfect
| recall because you can read them whenever you want.
|
| Python has lots of confusing corner cases ("" is truthy, you need
| to remember to call copy [or maybe deepcopy!] sometimes, all the
| other situations where you confuse weak v.s. strong references).
| They cause really common bugs. It's just a hazard of the language
| in general and the choices it makes (much like tasks being
| objects is a hazard). I do understand why people think they can
| throw away task references (based on other languages) - but this
| is Python! The garbage collector exists and you gotta check if
| you own the object or something else does.
|
| Edit: this feels like an experienced Python developer, who has
| already internalized all the older, non-async Python weirdness,
| being taken aback by weirdness they didn't expect. Like, I feel
| you, it does suck - but it's not a bug that values you don't
| retain may get garbage collected.
| No1 wrote:
| He didn't even have to read "all the docs" - just the ones that
| pertain to the function that he is using. And then not ignore
| the section marked "Important" _and_ the highlighted "Note".
| richbell wrote:
| What if he read the docs for that function prior to the
| "important" note being added?
| Karunamon wrote:
| > _Everyone reads the docs?_
|
| The author goes on to say they found this pattern lurking in
| various projects on github. So, no. The problem is that this
| behavior is subtle, not intuitive, and unless you are reading
| the actual documentation top to bottom (and not just the
| function signature and first paragraph from the pop up in your
| IDE) you will likely get bitten by this.
|
| What is the point of your comment? The author _shouldn't_ have
| called out the upturned rake in the darkened shed?
| rollcat wrote:
| > The author goes on to say they found this pattern lurking
| in various projects on github.
|
| I'd call it an anti-pattern. If you spawn a process/thread,
| and never wait/join it, it means you don't actually care what
| it does, if it crashes, etc. I don't see a problem with
| Python's behavior here.
| aeturnum wrote:
| I wouldn't say _shouldn't_ - they are free to do what they
| want. But this is a blog post about something that can trip
| you up that the docs highlight - which the author calls a
| "heisenbug". The author doesn't even have a suggestion for
| the docs, which already calls out the problem they
| encountered, they just note that there are helpers for this
| problem (which is true).
|
| The point of my comment is that subtle, non intuitive things
| like this are all over Python and, while this one is
| _particularly bad_ , this blog post makes it seem like more
| of an aberration than it is.
| IshKebab wrote:
| > Everyone reads the docs?
|
| Wow I've heard people say that everyone _should_ read all of
| the docs (which isn't really true) but I've never heard anyone
| claim that everyone _does_ read all of the docs! Wild.
| raverbashing wrote:
| > "" is truthy
|
| Humm, no? Unless you mean ("",)
|
|     >>> not ""
|     True
| aeturnum wrote:
| Oh, sorry, you are right - "" is false-y, even though it's a
| valid empty value. So it's hard to tell the difference
| between a value not being filled and a value being filled
| with an empty value.
|
| ex:
|
|     answers = {}
|     answers["I exist"] = ""
|     if answers["I exist"]:
|         print("a")
|
| does not print.
| fbdab103 wrote:
| I guess I am too deeply in the Python ecosystem to see a
| problem here. Unless you want to check for the existence of
| "I exist"? In which case, the Python Way would be
|
|     answers = {}
|     answers["I exist"] = ""
|     if "I exist" in answers:
|         print("a")
| pacaro wrote:
| Maybe ...
|
|     if answers.get('I exist'):
|         print('a')
|
| Which is why you should always explicitly check for
| _None_ if that is your intent.
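A brief sketch that pulls the three checks from this sub-thread together: membership, an explicit `None` test, and an explicit empty-string test. The `describe` helper and the extra `"unset"` key are hypothetical additions for illustration.

```python
answers = {"I exist": "", "unset": None}

def describe(key: str) -> str:
    """Hypothetical helper that separates the cases a bare truthiness test merges."""
    if key not in answers:
        return "missing"
    value = answers[key]
    if value is None:
        return "present but None"
    if value == "":
        return "present but empty"
    return "present"

# A bare truthiness test cannot tell these cases apart:
same_truthiness = bool(answers.get("I exist")) == bool(answers.get("nope"))

print(describe("I exist"), describe("unset"), describe("nope"), same_truthiness)
```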
| aeturnum wrote:
| It's not a problem? The async interface isn't a problem
| either. It's just a thing you have to remember about
| python: "most input is truthy except for the input that
| isn't"
|
| "Most of the time you don't disrupt your program by not
| keeping the returned reference in scope except for when
| you do"
|
| It's just a thing that trips people up.
| dwattttt wrote:
| > It's just a thing you have to remember ...
|
| The more of these things there are, the more brainpower
| you devote to remembering the right way to do things; if
| you don't you introduce bugs, a subtle, painful one here.
| heavyset_go wrote:
| "Empty containers are falsy" is a Python fundamental,
| this isn't a subtle bug, but an obvious one.
| fbdab103 wrote:
| Truthy is a Pythonic core principle of the language. It
| is not an edge case phenomenon in the language which I
| would expect a regular practitioner to confuse.
|
| https://docs.python.org/3/library/stdtypes.html#truth-
| value-...
| aeturnum wrote:
| I mean, I've seen bugs around that in code I've worked on
| and I've created bugs where it's a factor.
|
| Weakrefs are also a core part of the language:
| https://docs.python.org/3/library/weakref.html . You
| can't use python without using them.
| fiddlerwoaroof wrote:
| What I learned when I wrote Python professionally was
| "never rely on truthiness" explicitly writing out a
| boolean expression that does what you want is more
| explicit ("explicit is better than implicit", PEP 8) and
| prevents a whole class of bugs down the line.
| nemetroid wrote:
| PEP 8, which you mention, explicitly recommends relying
| on truthiness:
|
| > For sequences, (strings, lists, tuples), use the fact
| that empty sequences are false:
|
|     # Correct:
|     if not seq:
|     if seq:
|
|     # Wrong:
|     if len(seq):
|     if not len(seq):
| AeroNotix wrote:
| PEP8 is touted a lot as if it is a perfectly correct tome
| of ... correctness. I've worked in Python long enough to
| know that it both doesn't cover everything and the advice
| is sometimes actively bad.
| heavyset_go wrote:
| > _if answers["I exist"]:_
|
|     if "I exist" in answers:
|         ...
| wizzwizz4 wrote:
| > _So it's hard to tell the difference between a value not
| being filled and a value being filled with an empty value._
|
|     >>> answers = {}
|     >>> if answers["I don't exist"]:
|     ...     print("a")
|     Traceback (most recent call last):
|       File "", line 1, in
|         if answers["I don't exist"]:
|     KeyError: "I don't exist"
|
| The method you're trying to use doesn't work _anyway_ : it
| doesn't matter that it's confusing. You'd have the same
| problem with the value False.
| Etheryte wrote:
| I think you may be too bold with the assumption here,
| personally I would wager that the majority of people who write
| Python don't even know Python has official docs outside of a
| site called Stack Overflow.
| leni536 wrote:
| Considering how many times I need to add site:python.org to my
| python search queries to actually get to the docs, I assume
| that a surprisingly low number of python developers actually
| read the docs.
| 0x008 wrote:
| If you use DuckDuckGo you can prefix search with "!py3"
| iforgotpassword wrote:
| > Everyone reads the docs?
|
| For Python? The language where everyone just cobbles together
| random code from the internet and other repos? I can totally
| see how this mistake happens left and right. The bar of entry
| for this language is way too low to assume only rigorous senior
| devs use it.
| bandyaboot wrote:
| He doesn't really get into what makes this a Heisenbug, only that
| it's indeterminate in nature. Would attaching a debugger/stepping
| through the code make it less likely that your task would get
| garbage collected out from under you?
| Izkata wrote:
| You're probably going to need a reference to the task in order
| to inspect it in the debugger. Creating that reference prevents
| the bug.
| foobarbecue wrote:
| Yeah, he seems to be re-defining the term to mean "a bug that
| occurs occasionally depending on system state" as opposed to "a
| bug that changes behavior when you observe it closely e.g. in a
| debugger."
| macintux wrote:
| The first is a common way of using the term Heisenbug. I
| first heard it used that way 10 years ago when discussing
| Erlang's error handling model.
| throwaway81523 wrote:
| CPython does most of its memory management by reference
| counting, which fails to reclaim circular structure. So to make
| sure it gets everything, it occasionally runs a conventional
| tracing GC. If the GC happens to run just after you create that
| async task, the task itself can get collected, it sounds like.
| It's good to know about this and is (my own editorializing) yet
| another reason Python3 should have used Erlang-style
| concurrency instead of this async stuff.
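A small sketch of the two mechanisms throwaway81523 describes, assuming CPython: reference counting alone does not free a reference cycle, and the tracing collector does. `Node` and `probe` are hypothetical names; automatic collection is disabled during the demo so that timing cannot interfere.

```python
import gc
import weakref

class Node:
    def __init__(self) -> None:
        self.other = None

gc.disable()  # keep the cycle collector from running on its own during the demo

a, b = Node(), Node()
a.other, b.other = b, a    # build a reference cycle
probe = weakref.ref(a)     # lets us observe whether 'a' still exists
del a, b                   # refcounts never reach zero because of the cycle

alive_before = probe() is not None
gc.collect()               # the tracing collector finds the unreachable cycle
alive_after = probe() is not None

gc.enable()
print(alive_before, alive_after)
```

In a real program the collector runs at moments the programmer does not choose, which fits the thread's description of the bug as indeterminate.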
| No1 wrote:
| His argument hinges on "I can't be bothered to read the docs on
| the stuff I'm using." So instead of reading the docs on
| coroutines and tasks before using them, he writes a rant about how
| it's all wrong because he didn't understand how it works.
|
| On a more fundamental level, why would anyone assume that a
| coroutine is guaranteed to complete if it is never awaited? There
| is no reason a scheduler could not be totally lazy and only
| execute the coroutine once awaited.
|
| At least he bothered to make note of TaskGroups, also clearly
| shown in his documentation screenshot, immediately above the
| section marked _Important_ that went ignored, and finishes with
| "As long as all the tasks you spin up are in TaskGroups, you
| should be fine." Yep, that's all there was to it.
| ptx wrote:
| > _There is no reason a scheduler could not be totally lazy and
| only execute the coroutine once awaited._
|
| Isn't the point of create_task (which is what the article is
| about) to launch concurrent tasks without immediately awaiting
| them? The example in the docs [1] wouldn't work (in the stated
| manner) if the task didn't start until it was awaited.
|
| > _At least he bothered to make note of TaskGroups [...] Yep,
| that's all there was to it._
|
| That only works on Python 3.11, which was released just a few
| months ago. Debian still uses 3.9, for example, so the
| TaskGroups solution can't be used everywhere yet.
|
| [1] https://docs.python.org/3/library/asyncio-
| task.html#coroutin...
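A short sketch of ptx's point that `create_task` starts work without the task being awaited first, assuming the default (non-eager) task factory. The `events` list and the coroutine names are hypothetical; the list records the order in which things happen.

```python
import asyncio

events = []

async def child() -> None:
    events.append("child started")
    await asyncio.sleep(0)
    events.append("child finished")

async def main() -> None:
    task = asyncio.create_task(child())  # scheduled, not yet running
    events.append("main continues")
    await asyncio.sleep(0.01)            # main yields; the child runs here
    events.append("main woke up")
    await task                           # already finished; just collects the result

asyncio.run(main())
print(events)
```

The child runs to completion while `main` is sleeping, before `main` ever awaits it; the reference held in `task` is what keeps it safe in the meantime.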
| m3047 wrote:
| Hrmmmm.
|
| > But who reads all the docs?
|
| asyncio.create_task() doesn't exist in 3.6, and I can't find the
| string "to avoid a task disappearing" in the doc, so I'll go out
| on a limb: there is no such doc. However I see the reference to
| weakref.WeakSet.
| Jtsummers wrote:
| The world didn't end in 2016. Welcome to seven years in the
| future where this documentation does, in fact, exist:
|
| https://docs.python.org/3/library/asyncio-task.html#asyncio....
| cutler wrote:
| Maybe grafting async onto a single threaded dynamic language just
| isn't such a good idea in the first place.
| murphy214 wrote:
| bingo
| qxmat wrote:
| Python has a few weird issues like this. The last one I
| encountered was with a class inheriting Thread, join and the SQL
| Server ODBC driver on Linux. Fairly sure I hit page faults thanks
| to a shallow copy on driver allocated string data but didn't have
| the time to investigate like the hero of this blog post.
| whoopdeepoo wrote:
| > But who reads all the docs
|
| Why is this so common? Do people seriously not read a
| language/library documentation? That's the absolute first thing I
| do when evaluating a technology.
| adamckay wrote:
| Because people have deadlines and need to get things working.
| You read enough to figure out how to do what you need to do and
| then mostly move on.
|
| This function was added in 3.7 with no note on the importance
| of saving a reference. In 3.9 a note was added "Save a
| reference to the result of this function, to avoid a task
| disappearing mid execution." which was then expanded with the
| explanation of a weak reference in 3.10.
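A minimal sketch of the pattern that documentation note points toward: store each task in a collection and remove it when it completes. `some_coro` and `finished` are hypothetical names, and the final sleep only gives the background tasks time to run in this demo.

```python
import asyncio

background_tasks = set()
finished = []

async def some_coro(n: int) -> None:
    await asyncio.sleep(0.01)  # stand-in for real work
    finished.append(n)

async def main() -> None:
    for n in range(3):
        task = asyncio.create_task(some_coro(n))
        background_tasks.add(task)                        # strong reference
        task.add_done_callback(background_tasks.discard)  # drop it when done
    await asyncio.sleep(0.1)  # demo only: let the background tasks finish

asyncio.run(main())
print(sorted(finished), len(background_tasks))
```

The done callback keeps the set from growing without bound, so the reference lives exactly as long as the task does.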
| skitter wrote:
| It absolutely is common. People see there is a len function
| that takes one argument, they call len(some_collection), see
| that it indeed returns the number of items in the collection
| like they expect and move on. They don't expect len to return a
| negative number instead on Thursdays, and of course it doesn't
| because that would be a pretty big footgun. People also see
| that there is a create_task function that takes a coroutine,
| they call create_task(some_coroutine), see that the coroutine
| indeed runs like they expect, and move on. Sure, you're
| _supposed_ to await the result, but maybe they don't need the
| awaited value anymore, only the side effects, and see that it
| still works.
| throwaway81523 wrote:
| I had a manager who actually told me not to read docs. I was a
| bad report and read them anyway.
| winter_blue wrote:
| This article just makes me feel like Python, while a language
| with nice-ish syntax, is a language that was poorly hacked and
| put together with little concern/thought about the real-world
| implications of poor design decisions like this async design
| decision (and also dynamic typing - _a terrible thing in any_
| language).
| crdrost wrote:
| Most languages have something like this, usually around async.
|
| For instance NodeJS has had a bit of this around promises, and
| eventually needed to institute the rule "if a promise rejects
| with an error, and nobody is around to hear it, we will crash
| your program on the assumption that you probably needed to
| clean up some resources but didn't and now they're going to
| leak. Listen to the error with a handler that does nothing, if
| we are wrong about that."
| macintux wrote:
| One of many reasons I like Erlang: _everything_ is async, so
| you have plenty of tooling/libraries/core language features
| to support you.
| photochemsyn wrote:
| 'async footguns' returns 20,000+ hits on Google. Top one
| happens to be:
|
| https://news.ycombinator.com/item?id=32086973
|
| > "Async seems to be the first big "footgun" of Rust. It's
| widespread enough that you can't really avoid interacting with
| it, yet it's bad enough that it makes..."
| deschutes wrote:
| Fun stuff. Why aren't unfinished tasks gc roots?
| [deleted]
| [deleted]
| dehrmann wrote:
| Another common async footgun I see is unthrottled gathering, and
| no throttling mechanism in the standard library. Once you gather
| an unspecified number of awaitables, bad things start to happen,
| either with CPU starvation, local IO starvation, or hammering an
| external service.
|
| What I like about threads is they make dangerous things like this
| harder, and you have to put more thought into how much concurrent
| work you want outstanding. They also handle CPU starvation better
| for things that are latency-sensitive. I've seen degenerate
| requests tie up the event loop with 500 ms of processing time.
| rednafi wrote:
| Huh! Unless you're using semaphores, you can also recreate a
| similar situation with threads. Spin up a whole bunch of
| threads and send all of them towards some shared object or make
| 100s of requests with them.
|
| There's not much difference between spinning up threads
| explicitly and creating async tasks with asyncio.create_task. In
| either case, you can throttle them with semaphores.
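| A rough sketch of the semaphore throttling described above, on the
| asyncio side. MAX_IN_FLIGHT, fetch and the counters are
| hypothetical names; the sleep stands in for a real request.

```python
import asyncio

MAX_IN_FLIGHT = 3  # hypothetical cap on concurrent work
in_flight = 0
peak = 0


async def fetch(i, limiter):
    global in_flight, peak
    # Only MAX_IN_FLIGHT coroutines may be inside this block at once.
    async with limiter:
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # stand-in for real I/O
        in_flight -= 1
    return i * 2


async def main():
    # Create the semaphore inside the running loop.
    limiter = asyncio.Semaphore(MAX_IN_FLIGHT)
    return await asyncio.gather(*(fetch(i, limiter) for i in range(10)))


results = asyncio.run(main())
```

| All ten coroutines are still created up front; the semaphore only
| bounds how many are doing work at the same time.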
| dehrmann wrote:
| I don't have a source or affected versions, but semaphores
| can scale poorly. I vaguely remember each blocked acquire
| getting checked on every event loop iteration, or something
| silly like that.
| acjohnson55 wrote:
| Something linters can help with, I would think?
| ryanianian wrote:
| C++ has nodiscard which is super useful for scenarios like this
| where ownership can be tricky.
| smetj wrote:
| Start a thread/greenthread/fiber/process/task without holding a
| reference to at least tie all loose ends at exit? Hmm dunno.
| tgv wrote:
| You can do that in go. You don't even get a reference to the
| thread/goroutine.
| nixpulvis wrote:
| Fire and forget.
| crabbone wrote:
| In the many years since asyncio was added, I have never used it
| willingly, outside of the cases where a third-party library
| required it. There has never been a practical benefit for any of
| that stuff when compared to select. It always worked poorly and
| never justified the effort one has to put into writing code that
| uses the library. The behavior OP describes is just one of the
| many bad design decisions that are so characteristic of this
| library.
| pyuser583 wrote:
| I don't find this behavior odd at all. Dereferencing unassigned
| values is normal Python garbage collector behavior. Threads are
| an exception (no pun intended), but they're an exception in lots
| of ways - just try pickling them.
| samsquire wrote:
| Thank you for this. This is really useful information.
|
| I recently adapted some garbage collection code to add register
| scanning.
|
| I can imagine all sorts of subtle bugs where things go away
| randomly. One problem I have with my multithreaded code is that
| sometimes a thread crashes and the logs are so long I don't
| notice. From my perspective the thread is just not doing
| anything.
|
| Sometimes the absence of behaviour can be really tricky to debug!
| sgt wrote:
| Is this something go developers also have to be careful with when
| using goroutines?
| gerad wrote:
| No. But sometimes goroutines have the opposite problem, where
| they don't terminate and get cleaned up.
|
| https://betterprogramming.pub/common-goroutine-leaks-that-yo...
| [deleted]
| candiddevmike wrote:
| Is there an (easy?) test for checking goroutine leaks?
| Snawoot wrote:
| Yes, it's visible in the goroutine profile provided by the
| built-in profiler, pprof. E.g.:
| https://github.com/mysteriumnetwork/node/issues/5311#issueco...
| Jtsummers wrote:
| No. Goroutines don't generate a reference to hold onto, either.
| They just run until they or the program terminate.
| [deleted]
| makomk wrote:
| Well, this explains that one really annoying intermittent bug
| that I was having in some asyncio-based code.
| aardvark179 wrote:
| The same problem or something similar exists in many languages.
| Threads are GC roots because the OS knows about them, but this
| may not be true for lightweight threads or async callbacks.
|
| It is hard to fix because you don't want to introduce references
| from an old object (such as a list of callbacks) to many new
| objects as that will introduce GC issues, and many other
| potential leaks.
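| To illustrate what "not a GC root" means in general, here is a
| small sketch using weakref.WeakSet. It is not asyncio's actual
| internals; Job and registry are hypothetical stand-ins for a
| task-like object and a container that only holds weak references.

```python
import gc
import weakref


class Job:
    """Hypothetical stand-in for a task-like object."""


# The registry remembers jobs without keeping them alive.
registry = weakref.WeakSet()

job = Job()
registry.add(job)
count_while_referenced = len(registry)

# Drop the only strong reference; the weak set no longer sees the job.
del job
gc.collect()
count_after_release = len(registry)
```

| Once the last strong reference is gone, the entry vanishes from
| the weak set without any error being raised.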
| jarboot wrote:
| If I want to create a task that runs even after the function
| returns, ie "async def f():
| asyncio.create_task(coro=10_second_coro.run()); return;" is there
| any way to mitigate this? Function-scoped set of tasks?
| nhumrich wrote:
| Yes, read the last part of the included documentation and hold
| onto background tasks.
| jmholla wrote:
| Your task is implicitly not function-scoped as you want it to
| survive exiting the function. What you're doing here would be
| better architecturally done with threads. async is not a direct
| replacement for threading.
|
| But, you could also return the task object to the caller and
| have them manage it. There's also nothing async about your
| function, so you don't need the async or to await it.
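| A sketch of the "return the task to the caller" suggestion,
| assuming the caller runs inside an event loop. ten_second_job and
| start_job are hypothetical names, and the sleep is shortened so
| the example finishes quickly.

```python
import asyncio


async def ten_second_job(log):
    # Shortened stand-in for the long-running coroutine.
    await asyncio.sleep(0.01)
    log.append("done")


def start_job(log):
    # A plain function: it only schedules the task and hands it back,
    # so the caller holds the strong reference.
    return asyncio.create_task(ten_second_job(log))


async def main():
    log = []
    task = start_job(log)
    # The caller decides when (and whether) to wait for the result.
    await task
    return log


log = asyncio.run(main())
```

| start_job must be called while a loop is running, because
| asyncio.create_task needs a running event loop.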
| cmstodd wrote:
| Thanks for posting.
| nixpulvis wrote:
| Hey, at least it's documented... good developers actually RTFM.
|
| I can't comment on the design of this API, because I don't feel
| like learning the library, but in some performance critical
| applications these sorts of contracts aren't all that uncommon.
| Granted, this is python, I guess it's a bit more suspicious, IDK.
| vbernat wrote:
| The documentation update is quite recent (Python 3.11). It was
| added after this ticket: https://bugs.python.org/issue44665
| (not the first ticket around this problem).
| [deleted]
| osigurdson wrote:
| A little pedantic but HUP concerns the fundamental limits of
| simultaneously knowing a particle's position and momentum, not
| about observation impacting outcomes.
| notatoad wrote:
| wow. yeah, this absolutely explains a heisenbug that i've been
| chasing for a while. and i can't count the number of times i've
| had that exact doc page open on my screen in the last few months,
| and never bothered to read that block of text that starts with
| "important"...
|
| thanks
| aldenpage wrote:
| That's extremely insidious. I suppose I never encountered this
| issue because I almost always call asyncio.gather(*), which makes
| having a collection of tasks natural.
| kortex wrote:
| This is good form. It makes top-level control flow easier to
| follow, and keeps the concurrency scoped.
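| A minimal sketch of the gather style mentioned above. The list
| holds the tasks for as long as gather is waiting on them; square
| is a made-up coroutine.

```python
import asyncio


async def square(n):
    await asyncio.sleep(0.01)  # stand-in for real work
    return n * n


async def main():
    # The list keeps a strong reference to every task.
    tasks = [asyncio.create_task(square(n)) for n in range(4)]
    # gather returns results in the order the tasks were passed in.
    return await asyncio.gather(*tasks)


squares = asyncio.run(main())
```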
| BiteCode_dev wrote:
| And this is why trio got it right, and why I think the task
| groups (nurseries from trio) can't arrive soon enough in the
| stdlib.
|
| Because not only must you maintain a reference to any task, but
| you should also explicitly await it somewhere, using something
| like asyncio.wait() or asyncio.gather().
|
| Most people don't know this, and it makes asyncio very difficult
| to use for them.
___________________________________________________________________
(page generated 2023-02-11 23:00 UTC) |