|
| monocasa wrote:
| There is 'printf'. It's just that printf (and the rest of the
| standard library) is technically as much a part of the C language
| as the language grammar itself, and C compilers are welcome to
| use innate knowledge of those functions for optimizations. The
| other place you typically see this is calls to functions like
| memcpy/memset being elided to inline vector ops or CISC copies,
| or on simpler systems, large manual zeroing and copying being
| elided the other way to a memset or memcpy call.
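A minimal sketch of both directions described above, assuming an optimizing GCC or Clang build. The function names are made up for illustration, and whether either transformation actually fires depends on the compiler, flags, and target.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hand-written zeroing loop. An optimizer may recognize this idiom and
 * emit a call to memset instead of the loop. */
void zero_bytes(unsigned char *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = 0;
}

/* Small fixed-size memcpy. An optimizer may replace the library call
 * with a single 8-byte load rather than calling into libc. */
uint64_t load_u64(const unsigned char *src)
{
    uint64_t v;
    memcpy(&v, src, sizeof v);
    return v;
}
```

Comparing the assembly from an optimized and an unoptimized build is the easiest way to see which form a given toolchain picks.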
|
| C compilers will typically have an escape hatch for envs like
| deeply embedded systems and kernels, such as gcc's -ffreestanding
| and -fno-builtin, that says "but for real though, don't assume std
| lib functions exist or you know what they are based on the
| function's name".
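To make the printf case concrete, here is a small sketch, assuming GCC or Clang with optimization enabled. The function names are hypothetical, and the exact code generated will vary by compiler and version.

```c
#include <stdio.h>

/* Return value ignored and the format is a plain string ending in '\n':
 * GCC and Clang may compile this call as puts("Hello World!"). */
void greet(void)
{
    printf("Hello World!\n");
}

/* Return value used: printf reports the number of characters written,
 * which puts does not promise, so compilers generally leave this call
 * as a real printf. */
int greet_counted(void)
{
    return printf("Hello World!\n");
}
```

Building with -fno-builtin should keep the first call as printf as well; comparing the output of `-O2 -S` with and without the flag shows the difference.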
|
| One of my favorite parts of rust as someone who
| uses it for deeply embedded systems is the separation of core and
| std (where core is the subset of std that only requires memcpy,
| memset, and one other I'm forgetting). The rest of the standard
| library is ultimately an optional part of the language with
| compiler optimizations focused on general benefits rather than
| knowing in the compiler how something like printf works. no_std
| is such a nicer env than the half done ports of newlib or pdclib
| that everyone uses in C embedded land.
| tptacek wrote:
| Huh, this is pretty great; I've always fussily used fputs() when
| I'm just printing static strings, and apparently I don't need to
| bother, since the compiler will just do it for me.
| guerrilla wrote:
| Moar please. I'm loving these counterintuitive C optimization
| gotchas lately[1]. They are like little brain teasers.
|
| 1. https://news.ycombinator.com/item?id=28930271
| 0xcde4c3db wrote:
| About a year ago there was something of a "joke isEven()
| implementation discourse" on Twitter, which eventually evolved
| into a sort of informal optimizer abuse contest. For example:
|
| https://twitter.com/zeuxcg/status/1291872698453258241
|
| https://twitter.com/jckarter/status/1428071485827022849
| aw1621107 wrote:
| OK, those are horrifying and fascinating, and they basically
| break my brain.
|
| Is there an explanation somewhere of why the first one
| "works"? The second one I think is the compiler assuming the
| default case will never be hit since it'll result in infinite
| recursion, which is UB under C++, so it's basically assuming
| 0<=x<=3 and optimizing from there. Is that correct?
|
| The first one I'm less certain about. The only thing I can
| think of is that the compiler deduces an upper limit of
| INT_MAX - 1 to avoid signed overflow, and then somehow
| figures out the true/false pattern from there? Still a bit
| of a gap in my understanding there.
| barsonme wrote:
| My guess: since overflowing int is UB, and the only value
| of n that stops the recursion is zero, the compiler assumes
| that n must be zero and checks accordingly.
|
| That doesn't explain why it uses test dil, 1 instead of
| test dil, dil or cmp 0 or whatever.
| davemp wrote:
| Optimizers have to keep the same input/output pairs unless
| there is undefined behavior. In the second function the
| truth table looks like:
|
|   in    | out
|   -------------------
|   0b000 | 1
|   0b001 | 0
|   0b010 | 1
|   0b011 | 0
|   0b100 | don't care
|   ...
|   MAX   | don't care
|
| The compiler just chooses the most efficient way it knows
| to get the filled-out entries correct, which happens to be:
|
|   in    | ~in[0]
|   -------------------
|   0b000 | 1
|   0b001 | 0
|   0b010 | 1
|   0b011 | 0
|   0b100 | 1
|   ...
|   MAX   | 0
|
| It would have been just as valid to do:
|
|   in    | in[2] or ~in[0]
|   -------------------------
|   0b000 | 1
|   0b001 | 0
|   0b010 | 1
|   0b011 | 0
|   0b100 | 1
|   0b101 | 1
|   ...
|   MAX   | 1
|
| The first function's table looks like:
|
|   in    | out
|   -------------------
|   0b000 | 1
|   0b001 | don't care
|   0b010 | don't care
|   ...
|   MAX   | don't care
|
| And the compiler still likes the even check in this case,
| which makes sense.
| notriddle wrote:
| The first function (the `n == 0 || !isEven(n+1)`
| recursive function) has defined behavior for negative
| numbers. That's probably why it compiled to an even
| number check.
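For reference, a sketch of the first function as it is quoted in this thread, written out in C. The tweet's exact code is not reproduced here, so the name, types, and signature are assumptions.

```c
#include <stdbool.h>

/* Recursive form quoted in the thread: count upward until reaching 0.
 * For n <= 0 this terminates and is well defined. For n > 0, n + 1
 * eventually overflows a signed int, which is undefined behavior, so
 * the optimizer is free to pick any result for those inputs. */
bool is_even(int n)
{
    return n == 0 || !is_even(n + 1);
}
```

For zero and negative inputs the function has to give the right answer, which is consistent with a compiler reducing the whole thing to a low-bit test. Calling it with a positive n in an unoptimized build is not advisable, since it recurses until something breaks.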
| archi42 wrote:
| It's all fun and games until you write (or review) C/C++ test
| cases for a compiler or disassembler ;-) It never ceased to
| amaze me how good the compiler was at figuring out that I
| had actually written a very complicated "return 0".
| eikenberry wrote:
| https://web.archive.org/web/20211019052752/https://www.netme...
| GoblinSlayer wrote:
| Imagine somebody thought omitting the return statement and doing
| whatever the compiler likes is a good feature to have.
| dboreham wrote:
| Like Scala?
| dnautics wrote:
| pretty sure scala (and most FP) has a well-defined "what to
| do when you leave off the return statement", not one that "is
| up to the compiler"
| [deleted]
| qwerty456127 wrote:
| > puts(3) only returns "a nonnegative integer on success and EOF
| on error"
|
| How does it decide which nonnegative integer to return?
| robotresearcher wrote:
| It's arbitrary. The article shows an implementation that
| returns 10 (ASCII '\n'). But the spec says it doesn't matter,
| so you should only be using it to test >= 0 for success.
| Bayart wrote:
| The correct implementation is _obviously_ to return 1 on
| success !
| woodruffw wrote:
| That's answered below:
|
| > On success, puts(3) appears to return '\n', the newline or
| line feed (LF) character, which has ASCII value... 10.
|
| But note that that isn't standard behavior. The language in
| POSIX[1] is identical to that in the blog post. `puts` is free
| to return whatever nonnegative number it wants on success.
|
| [1]:
| https://pubs.opengroup.org/onlinepubs/9699919799/functions/p...
| cyberge99 wrote:
| Apparently there is no available capacity for that site either.
| Bang2Bay wrote:
| https://search.yahoo.com/ for
|
| There is no 'printf'
|
| and look through the cache
| ltr_ wrote:
| [off topic] I always wondered how '%n' is used in production
| code.
| mormegil wrote:
| So, why does puts do "return r ? EOF : '\n';"? Some backwards
| compatibility? Or is there a logical reason for that?
| _kst_ wrote:
| That particular implementation probably returns the result of
| the last fputc() or equivalent that it called.
|
| puts() returns EOF (typically -1) on error, or some unspecified
| non-negative value on success.
|
| fputc() returns EOF on error or the written character, treated
| as an unsigned char and converted to int, on success.
|
| Don't expect all puts() implementations to do the same thing.
| For example, the glibc implementation appears to return the
| number of characters written on success. Implementations are
| free to rely on implementation-defined behavior. User code
| that's intended to be portable cannot.
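A minimal sketch of what portable user code can rely on, given the description above. `print_line` is a hypothetical helper, not part of any library.

```c
#include <stdio.h>

/* Hypothetical helper: report success or failure of puts() without
 * depending on which non-negative value a given libc returns.
 * Returns 0 on success and -1 on failure. */
int print_line(const char *s)
{
    return puts(s) == EOF ? -1 : 0;
}
```

Comparing against EOF (or testing for a negative result) works the same whether the implementation returns '\n', a character count, or some other non-negative value.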
| LukeShu wrote:
| That particular implementation (NetBSD's) (which is
| transcribed in to the article) does something more optimized
| than making repeated calls to `putchar()`.
|
| But as pdw's link shows, what you suggest is exactly what the
| historical implementation was. So NetBSD is simply matching
| historical Unix.
| masklinn wrote:
| Per the man:
|
| > puts() and fputs() return a nonnegative number on success, or
| EOF on error.
|
| r is the result of the write, if it's nonzero the write failed
| and thus so did puts.
| m45t3r wrote:
| Yeah, but I think the question was why EOF and "\n". It could
| as easily just return 1 or -1 for example, and it would make
| more sense I think.
| kevin_thibedeau wrote:
| puts() always adds a line termination so success means that
| '\n' is the last char for that implementation.
| pdw wrote:
| It's what historic Unix did:
| https://github.com/v7unix/v7unix/blob/master/v7/usr/src/libc...
|
| Why did it do that? I'm not sure, but at the time C did not have
| 'void' functions: every function returned a value. They
| probably wanted to make the behavior of the stdlib functions
| deterministic, even if the return value was useless and
| undocumented.
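A sketch of a character-at-a-time puts in the spirit of what is described above, to show where a '\n' return value would come from. This is not a transcription of the linked source; `simple_puts` is a made-up name, and the early return on error is an addition for the sketch.

```c
#include <stdio.h>

/* Write each byte, then the newline, and hand back whatever the final
 * putchar() returned. On success that is '\n' (10 in ASCII); on
 * failure it is EOF. */
int simple_puts(const char *s)
{
    while (*s != '\0') {
        if (putchar((unsigned char)*s++) == EOF)
            return EOF;
    }
    return putchar('\n');
}
```

With this shape, the "mystery" success value is just the last character written.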
| anonymousiam wrote:
| Compiler optimization can sometimes cause unpredictable or even
| incorrect behavior. Below is a blob of C code for the TI MSP430
| compiler that exemplifies at least one of TI's optimization bugs:
|
| // Define Common Communications Frame
| typedef volatile union commFrameType
| {
|     struct {
|         unsigned SyncHeader:16;
|         unsigned MessageID:8;
|         unsigned short MessageData[msgDataSize]; // ID-unique data
|         unsigned CRC:8; // LSB of CCITT-16 for above data
|     } __attribute__ ((packed)) Frame;
|     unsigned char b[16];  // Accessible as raw bytes as well
|     unsigned short w[8];  // Accessible as raw words as well
|     unsigned long l[4];   // Accessible as raw long words as well
| } __attribute__ ((packed)) CommFrame;
|
| static CommFrame IpcMessage = { FRAME_SYNC_R, IpcBlankMessage };
|
| // If frame was accepted into TX queue, prepare next frame for
| // transmission
| // IpcMessage.Frame.MessageID++; // Bump up to next message type
| // IpcMessage.Frame.MessageID += 1;
| // The above two lines that are commented out cause a bizarre
| // linker error if either is used instead of the line below.
| IpcMessage.Frame.MessageID = IpcMessage.Frame.MessageID + 1; // Bump up to next message type
|
| The MSP-430 is a 16-bit microcontroller and the packed CommFrame
| structure has Frame.MessageID on an odd-byte boundary. Some
| processors might raise a SIGBUS, but TI says that it's okay to
| access a byte on an odd address boundary.
|
| It's pretty silly that i++; and i+=1; don't work, but i=i+1; is
| just fine.
| secondcoming wrote:
| 'unsigned MessageID:8;' isn't the same as 'unsigned char
| MessageId'
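A small sketch of the distinction, using hypothetical struct and function names. Layout details of bit-fields are implementation-defined, so the size comparison is something to inspect on a given compiler rather than rely on.

```c
#include <stddef.h>

struct with_bitfield { unsigned id : 8; };   /* 8 bits inside an unsigned-typed unit */
struct with_char     { unsigned char id; };  /* a real, addressable byte */

/* Both members hold 0..255 and wrap from 255 to 0 on increment... */
unsigned bump_bitfield(unsigned start)
{
    struct with_bitfield f = { start };
    f.id = f.id + 1;
    return f.id;
}

unsigned bump_char(unsigned char start)
{
    struct with_char c = { start };
    c.id = c.id + 1;
    return c.id;
}

/* ...but a bit-field has no address (&f.id does not compile), and the
 * containing structs may differ in size and alignment. */
size_t bitfield_struct_size(void) { return sizeof(struct with_bitfield); }
size_t char_struct_size(void)     { return sizeof(struct with_char); }
```

Because the compiler has to generate read-modify-write sequences for the bit-field, the three increment spellings can plausibly take different code paths inside a compiler even though they mean the same thing.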
| RcouF1uZ4gsC wrote:
| This is a bit like saying there is no '+';
|
| Because if you put in return 1+2+3;
|
| And look at the assembly code, you will see that the compiler
| generated something like return 6;
|
| The compiler is allowed to take advantage of the standard to
| substitute in more efficient code that does the same thing.
|
| IIRC, for C++, it would actually be ok if std::vector was
| implemented completely as a compiler intrinsic with no actual
| header file. (No compiler I am aware of actually does it that
| way).
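The constant-folding case is easy to try. A minimal sketch, with a made-up function name:

```c
/* The abstract machine performs two additions; an optimizing compiler
 * will typically fold them at compile time and just return 6. */
int sum_of_constants(void)
{
    return 1 + 2 + 3;
}
```

The observable result is the same either way, which is what allows the substitution.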
| dnautics wrote:
| yeah but everyone knows that "there is no +"; it's an operator,
| and in C, anyway, operators are special and expected to not
| necessarily do C-function-ey things, e.g., "take arguments of
| different types and add them successfully". Not everyone is
| aware that C has "anointed functions" (including, I believe,
| malloc) that the compiler is allowed to fiddle with.
| malkia wrote:
| Is there more info on this? I remember from Common Lisp (but the
| details evade me) that the compiler can take advantage of
| certain specific functions and rely on them being... "open
| coded" - e.g. it can produce more efficient code by replacing
| these with something more suitable...
| http://www.sbcl.org/manual/#Open-Coding-and-Inline-Expansion
|
| https://www.thecodingforums.com/threads/what-is-the-meaning-...
| talaketu wrote:
| > more efficient code that does the same thing
|
| In this case, it produces a different result.
| masklinn wrote:
| It produces a different ub, which is ub.
|
| Furthermore observability would be defined in terms of the C
| abstract machine, "observing" by decompiling the program is
| out of scope.
| talaketu wrote:
| oh right
|
| > But what if you're not using C99 or newer?
|
| UB - that takes all the fun out of it.
| Someone wrote:
| Code that does #include <vector>
|
| must compile, so that _header_ must exist (whether it is stored
| in a _file_ is the implementer's choice. AFAIK, the standard
| carefully avoids the use of the term 'header file')
|
| Also, I think code that doesn't do that include must fail to
| compile when it tries to use _std::vector_. So, logically, that
| header must exist.
| gpderetta wrote:
| Well not really. The preprocessor is part of the compiler, so
| it only needs to set a flag to tell the compiler proper to
| enable std::vector.
| rrauenza wrote:
| Quick Summary:
|
| The C compiler optimizer replaces printf("Hello World!\n") with
| puts("Hello World!") and the implicit return from main()
| changes from 13 (the return value of printf) to 10 (the return
| value of puts)
| moffkalast wrote:
| Calls on puts you say?
| helmholtz wrote:
| Brilliant.
| enlyth wrote:
| In other words long volatility
___________________________________________________________________
(page generated 2021-10-21 23:00 UTC) |