|
| paulddraper wrote:
| Floating Point Operations Per
| nightmonkey wrote:
| My last two startups.
| HPsquared wrote:
| You could use it as a unit of accounting for investing losses.
| A gigaflop would be a loss of 1 billion, etc.
| [deleted]
| yujian wrote:
| this
| jjgreen wrote:
| I'm usually a fan of Higham, but the last few posts have been
| weak.
| liotier wrote:
| Flops ?
| jjgreen wrote:
| Very witty Wilde ...
| MattyDub wrote:
| Can somebody explain why a square root is also considered a flop?
| Surely that involves more work than the other four operations the
| article listed. Is there some hardware algorithm for the square
| root that is as fast as (e.g.) division?
| AlbertCory wrote:
| Disappointed this is not about basketball.
| dragontamer wrote:
| My personal notes on this subject:
|
| * MIPS was perhaps the integer equivalent to FLOP, still used in
| modern microcontrollers because the 8051 at 12MHZ would only
| execute 1MIPS (12 clocks per instruction). Modern 8051 chips
| obviously have sped up to 1 clock per instruction, but MIPS (and
| Dhrystone MIPS in particular) are still a common benchmark today.
|
| * FLOPs is very difficult to calculate in theory because modern
| CPUs have vector units, and multiple pipelines per core. You
| could have 3x AVX512 instructions in parallel on today's CPUs on
| a single core.
|
| * FLOPs were traditionally a 64-bit operation for the
| supercomputer community. Today, most FLOPs are 32-bit for video
| games. Finally, the deep learning / neural net guys have
| popularized 16-bit flops, and even 8-bit iops.
|
| * 'The' flop is a misnomer because it's almost always the
| multiply-and-accumulate instruction: X = A + B * C. Which... Is
| two operations per instruction (per shader/SIMD lane). Eeehhh
| whatever. Who cares about these details?
|
| * As 'Dhrystone' is the benchmark for MIPS, the benchmark for
| 64-bit flops is Linpack.
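|
| To put rough numbers on the vector-unit and multiply-accumulate
| bullets above, here is a back-of-the-envelope sketch in Python.
| The function name, the 3 GHz clock and the single core are
| assumptions for illustration only; the 3 AVX512 pipelines and the
| two-operations-per-FMA convention come from the notes above.

```python
def peak_flops(cores, clock_hz, vector_bits, element_bits, pipelines,
               ops_per_instr=2):
    """Theoretical peak floating-point operations per second.

    ops_per_instr=2 counts each multiply-accumulate as two operations.
    """
    lanes = vector_bits // element_bits  # SIMD lanes per vector instruction
    return cores * clock_hz * lanes * pipelines * ops_per_instr


# Hypothetical core: 3 GHz, 3 AVX512 pipelines (assumed numbers, not measured).
fp64 = peak_flops(cores=1, clock_hz=3_000_000_000, vector_bits=512,
                  element_bits=64, pipelines=3)
fp32 = peak_flops(cores=1, clock_hz=3_000_000_000, vector_bits=512,
                  element_bits=32, pipelines=3)
print(f"{fp64 / 1e9:.0f} GFLOPS (64-bit), {fp32 / 1e9:.0f} GFLOPS (32-bit)")
```

| Halving the element width doubles the lane count, which is why
| the 64-bit, 32-bit and 16-bit figures for the same chip differ.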
| gumby wrote:
| > MIPS was perhaps the integer equivalent to FLOP, still used
| in modern microcontrollers because the 8051 at 12MHZ would only
| execute 1MIPS (12 clocks per instruction)
|
| Actually the origin of this term was VAX MIPS (the VAX-11/780
| specifically), because that was a ubiquitous minicomputer that
| was pretty fast for its time. There were faster machines, and
| slower mainframes still being built, but that was what the late
| 70s were like.
|
| When the 8051 was released in 1980 it surely didn't run at 12
| MHz! Back then the Z80 sold because it could run 8080 code at a
| blistering 2 MHz.
|
| BTW, the benchmark for FLOPS in those days was Whetstone, hence
| the otherwise weird name "Dhrystone".
| dragontamer wrote:
| The 1981 manual for the 8051 contains numerous references to
| 12MHz.
|
| http://bitsavers.informatik.uni-
| stuttgart.de/components/inte...
|
| It was 12T clocked: even though the clock was 12 MHz, it would
| only operate at 1 MHz / 1 MIPS, because it took 12 clock ticks
| to perform even one addition.
|
| IIRC, there was a standard crystal (11.0592 MHz crystal?? I
| forget exactly) for the communications at the time. So going
| just above 11 MHz (or really, just above 11.0592 MHz) was
| needed for reliable serial comms.
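|
| The 12T arithmetic, as a minimal Python sketch. The helper name
| is made up for illustration, and it assumes every instruction
| takes the same fixed number of clocks, which is a simplification.

```python
def mips(clock_hz, clocks_per_instruction):
    """Millions of instructions per second for a fixed-cycle core."""
    return clock_hz / clocks_per_instruction / 1e6


original_8051 = mips(12_000_000, 12)  # 12T core at 12 MHz
modern_1t = mips(12_000_000, 1)       # 1T derivative at the same clock
print(original_8051, modern_1t)
```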
| gumby wrote:
| Wow, right on page 1-2! I'm surprised -- I don't remember
| anything running that fast back then. Thanks.
|
| (Love those old Intel books too)
|
| Nevertheless, FWIW, MIPS started out as VAX MIPS, and at
| first people often used to write "VAX MIPS".
| segfaultbuserr wrote:
| > 'The' flop is a misnomer because it's almost always the
| multiply-and-accumulate instruction: X = A + B * C. Which... Is
| two operations per instruction (per shader/SIMD lane). Eeehhh
| whatever. Who cares about these details?
|
| If FMA is supported, it can be counted as either one or two
| operations, depending on the rules of the benchmark involved or
| the marketing of the processor. The marketing specification of
| a processor's theoretical peak performance sometimes counts a
| single-instruction FMA as two operations. On the other hand,
| for the purpose of code profiling, counting FMA as one
| operation is more realistic... As you said, who cares about
| these details?
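|
| A small Python sketch of the two counting conventions, using a
| dot product as the workload. The function name is hypothetical,
| and the "one FMA per element pair" mapping is an assumption about
| how a compiler would schedule the loop, not a measurement.

```python
def dot_with_counts(a, b):
    """Dot product that tallies multiplies and adds as it goes."""
    acc, mults, adds = 0.0, 0, 0
    for x, y in zip(a, b):
        acc = acc + x * y  # a single FMA on hardware that supports it
        mults += 1
        adds += 1
    return acc, mults, adds


acc, mults, adds = dot_with_counts([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
ops_one_per_fma = mults         # profiling view: each FMA is one operation
ops_two_per_fma = mults + adds  # peak-spec view: each FMA is two operations
print(acc, ops_one_per_fma, ops_two_per_fma)
```

| Same code, same result, and the reported operation count differs
| by a factor of two depending on the convention.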
| fluoridation wrote:
| Given that the point of the FLOPS unit is to compare
| processors, it does make more sense to count complex
| instructions as more than a single floating-point operation.
| If one CPU could multiply a 4x4 matrix by a vector in a
| single instruction that can run a million times per second,
| and another CPU needed ~32 instructions and so can only
| multiply 500k matrices per second but retires 16 million
| instructions in that same second, it would be silly to
| compare instructions instead of multiplications and
| additions.
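|
| Working the example through in Python. The only number not
| stated above is the cost of one 4x4 matrix-vector product, taken
| here as 16 multiplications plus 12 additions; the function name
| is made up for the sketch.

```python
N = 4
# A 4x4 matrix times a 4-vector: 16 multiplications + 12 additions.
FLOPS_PER_MATVEC = N * N + N * (N - 1)


def rates(products_per_s, instructions_per_s):
    """Return (MIPS, MFLOPS) for a CPU doing only matrix-vector products."""
    return instructions_per_s / 1e6, products_per_s * FLOPS_PER_MATVEC / 1e6


# Throughput figures taken from the example above.
cpu_a = rates(products_per_s=1_000_000, instructions_per_s=1_000_000)
cpu_b = rates(products_per_s=500_000, instructions_per_s=16_000_000)
print("A: %.0f MIPS, %.0f MFLOPS" % cpu_a)
print("B: %.0f MIPS, %.0f MFLOPS" % cpu_b)
```

| By instruction count B looks 16 times faster; by arithmetic
| actually completed, A is twice as fast.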
| dragontamer wrote:
| As a computer engineer, I'd point out that the circuit needed
| for a fast multiplier (e.g. a Wallace tree and similar) is an
| order of magnitude larger than the circuit needed for a fast
| adder (e.g. a Kogge-Stone carry-lookahead adder).
|
| This idea that additions and multiplications can be
| combined like this as "equivalent operations" is kinda
| bullshit. But hey, if it's "how it's done" (and it's done this
| way because multiply-then-add is how you do matrix
| multiplications...) then so be it.
|
| Just remember that this is an arbitrary subdivision of a
| matrix multiplication operation, that may not have much
| relevance as a benchmark outside of matrix multiplications.
| fluoridation wrote:
| It was just an example, not necessarily a realistic one.
| The point is that we want to compare how quickly a
| processor will compute our problem, not how many
| instructions it's going to execute. If it were a car, you'd
| want to compare things like its top speed and acceleration,
| not something inane like engine revolutions
| per kilometer. You measure and compare things that are
| relevant to the user, not implementation details.
| namirez wrote:
| Floating point operations per ...?
| GenericDev wrote:
| Floating Point Operations Per Second [1]
|
| [1] https://academickids.com/encyclopedia/index.php/FLOPS
| Zambyte wrote:
| > One should speak in the singular of a FLOPS and not of a
| FLOP, although the latter is frequently encountered. The
| final S stands for second and does not indicate a plural.
|
| The author of this post seems to have fallen for this error.
| dataflow wrote:
| I think their point was that the p stood for "per", not for
| the second letter of "operation".
___________________________________________________________________
(page generated 2023-09-05 23:00 UTC) |