|
| Animats wrote:
| In the transistors per cubic meter dimension, maybe. In the cost
| and heat dimensions, no.
| bee_rider wrote:
| Chiplets should help keep the yields up (I assume, at least,
| that they test the chiplets before packing them together into
| the package, although that's a guess). Improving yields should
| help with prices; throwing away products or binning them down
| isn't very economically favorable.
| hristov wrote:
| Yes, they do. In fact, chiplets are bringing a lot of
| excitement to the semiconductor testing industry. Chiplets
| have popularized the concept of the "known good die", i.e. a
| chiplet that has been tested so thoroughly that it is
| practically proven to be error-free.
|
| This is necessary because packaging chiplets is usually
| irreversible. So if you put ten chiplets in a package and
| one of them is bad, you have to toss the entire package,
| along with all ten chiplets and all your packaging costs.
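|
| A back-of-the-envelope sketch of the economics (a minimal
| illustration; all numbers are made up): with a 90%
| per-chiplet yield, blindly packaging ten untested chiplets
| gives a working package only about a third of the time, while
| screening for known good dies first keeps package yield near
| 90%.
|
|     #include <cmath>
|     #include <cstdio>
|
|     int main() {
|         // Hypothetical numbers for illustration only.
|         double die_yield  = 0.90;  // fraction of chiplets that work
|         int    n_chiplets = 10;    // chiplets per package
|
|         // Without testing, a package works only if every
|         // untested chiplet happens to be good.
|         double blind = std::pow(die_yield, n_chiplets);
|
|         // With known-good-die testing, bad chiplets are screened
|         // out first; package yield is limited mainly by assembly.
|         double assembly_yield = 0.99;  // per-chiplet attach yield
|         double kgd = std::pow(assembly_yield, n_chiplets);
|
|         std::printf("blind packaging: %4.1f%%\n", 100 * blind);  // ~34.9%
|         std::printf("with KGD:        %4.1f%%\n", 100 * kgd);    // ~90.4%
|     }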
| bee_rider wrote:
| Where are R&D nodes at nowadays, anyway?
|
| I guess chiplets will help with yield issues, which could make
| some nodes economical (which is of course what Moore's Law is
| all about). But of course chiplets won't enable transistor
| sizes that fabs just couldn't hit at all.
| tux1968 wrote:
| Jim Keller is bullish on chiplets, and it's not just about
| yield: it's also about power efficiency and the economics of
| using the most expensive nodes (e.g. 3nm) only for critical
| parts, combining them Lego-style with IP fabricated on
| cheaper nodes.
|
| https://youtu.be/c_Wj_rNln-s?t=127
| BeefWellington wrote:
| In a way, these points are roughly how modern computers work.
|
| For instance, let's say you bought a new PC in 2017 with a
| high end graphics card. Your CPU was probably manufactured on
| 14nm processes, your GPU on 16nm, your RAM from 16 to 20nm,
| and your motherboard chipset on 22nm. It's been a similar
| story throughout computing.
|
| Taking that idea and scaling it down to processor designs
| themselves is cool and no doubt faces hard technical
| challenges.
| bee_rider wrote:
| That's interesting; I'm curious about the argument for power
| efficiency (it seems like breaking a design up into chiplets
| would only hurt there, but I'm no Jim Keller!). I'll
| definitely check the video out when I get a chance.
|
| Lego-ing out IP blocks could be really cool. It would be neat
| if this gave OEMs room to compete in a more technical field
| (selling Xeons in Dell cases vs Xeons in Lenovo cases is not
| very cool; if Dell and Lenovo could pick the parts in their
| packages, that could be interesting).
| pclmulqdq wrote:
| I think this may be where some manufacturers are heading,
| but not necessarily the big "consumer" names like
| Intel/AMD. It's a way to significantly lower the risk of
| building an SoC, and I think it's more plausible that this
| sort of third-party chiplet market will be driven by the
| Marvell/Broadcom types rather than Intel/AMD.
| touisteur wrote:
| Something impressive about Tenstorrent's original
| architecture (I haven't been following the talk about
| pivoting to RISC-V) was the low cost of validation. It
| seemed like the idea was 'cheapest to tape-out with max
| FLOPS', and it somehow worked out well. Can't say more; I
| never managed to get my hands on one of those Grayskulls, so...
| pclmulqdq wrote:
| Quietly, digital ASIC validation has gotten very cheap
| and quick over the last 20 years thanks to the
| proliferation of software models and compilers that
| translate Verilog to C++ (which are literally orders of
| magnitude faster than previous simulators). I'm not
| surprised that Tenstorrent has been doing well with that,
| given the expertise they have.
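|
| For a sense of why those compiled models are so fast
| (Verilator is the best-known open-source example of the
| approach): the design becomes plain C++ that you call once
| per clock edge, so the "simulator" is just ordinary compiled
| code. A toy hand-written equivalent of what such a tool might
| emit for an 8-bit counter:
|
|     #include <cstdint>
|     #include <cstdio>
|
|     // Roughly the compiled form of:
|     //   always @(posedge clk)
|     //     if (rst) count <= 0; else count <= count + 1;
|     struct Counter {
|         uint8_t count = 0;
|         void eval_posedge(bool rst) {  // one call = one clock edge
|             count = rst ? 0 : uint8_t(count + 1);
|         }
|     };
|
|     int main() {
|         Counter dut;
|         dut.eval_posedge(true);             // apply reset
|         for (long i = 0; i < 100000000; ++i)
|             dut.eval_posedge(false);        // 100M cycles in about a second
|         std::printf("count = %u\n", unsigned(dut.count));  // 100000000 % 256 == 0
|     }
|
| An event-driven simulator pays scheduling overhead per signal
| change; the compiled model pays one inlined call per cycle,
| which is where the orders-of-magnitude gap comes from.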
| uluyol wrote:
| It's also about reuse and fungibility.
|
| AMD has used the same IO die across multiple generations of
| Zen hardware. This cuts the dev and validation costs for
| using new processes.
|
| Fungibility helps reduce the cost of making SKUs. Want more
| cores? Add more compute dies. Your customers want more big
| chips than expected? It's easier/faster to adapt production,
| since only the final steps differ. The component pieces are
| the same.
| anyoneamous wrote:
| Betteridge alert!
| AnimalMuppet wrote:
| Where do chiplets fit in Moore's Law? Moore said that the number
| of transistors _in a single package_ would increase. Well, are
| chiplets their own packages? If so, then Moore's Law is clearly
| broken, because each chiplet is a _decrease_ in the number of
| transistors per package (compared to a monolithic approach).
|
| But if you don't consider a chiplet to be an independent
| package, then yes, maybe chiplets can save Moore's Law.
|
| Personally, once chiplets enter the picture, I think the
| question becomes one of semantics, and therefore less
| interesting.
| ilyt wrote:
| > Where do chiplets fit in Moore's Law? Moore said that the
| number of transistors in a single package would increase.
|
| Well, there were no multi-chiplet ICs back in his day.
| sylware wrote:
| Once we have a GPU filling the whole silicon real estate of
| an M2 chip, and that on a 2nm process...
|
| I guess the next PlayStation? Even though I quite dislike the
| digital jail that video game consoles are, it's going to be
| hard to resist FFXX.
| bena wrote:
| Who cares? Moore's Law is not something that needs "saving". It
| was an observation of the time. It was bound to run up against
| physical reality.
|
| In 1965, he predicted a doubling of components every year for the
| next ten years. In 1975, he changed it to every 2 years. And he
| was probably only looking out to the next 10 to 20 years.
|
| Because, you know, that's reasonable. Even Moore himself knew
| that it was impossible to continue indefinitely.
|
| Yes, let's get better. Yes, let's try to find more ways to
| efficiently compute. But let's not be slaves to words spoken
| before we were born because they were easy to achieve back then.
| stefncb wrote:
| Moore himself later admitted that he had made an insane guess
| for marketing.
| hindsightbias wrote:
| People will care when HW stops being able to keep up with SW
| bloat.
| NovaDudely wrote:
| For his prediction to have lasted as long as it has is an
| achievement on its own. Anyone who can predict 20 years out
| and land within 50% of the target, in any field, is a winner
| to me. Moore should be chuffed it stood so long.
| wmf wrote:
| Nvidia is using "the end of Moore's Law" as an excuse to
| provide zero price/performance improvement. It would be pretty
| anti-consumer if that became the industry standard.
| dolni wrote:
| I'm sure AMD would LOVE to eat Nvidia's lunch.
| wmf wrote:
| Yet they're not. They're selling very few GPUs at high
| prices.
| Buttons840 wrote:
| Is that because they have bad hardware or because of the
| CUDA monopoly?
| wmf wrote:
| Neither?
| gymbeaux wrote:
| More the latter than the former these days, but at a macro
| level, say over the last 20 years, it's more the former than
| the latter.
|
| Basically, AMD has caught up (arguable, but let's say they
| have) to Nvidia with respect to driver quality and hardware
| performance/power consumption, just in time for the world to
| become more concerned with how a GPU handles AI stuff than
| how it handles video games.
| causality0 wrote:
| I think the claim that AMD has caught up with Nvidia on
| drivers is untrue. Sure, they don't cause constant crashing
| in mainstream games anymore like they used to, but that's
| where the improvement ends. AMD drivers are still broken
| enough that you run into issues whenever you step outside the
| box. Emulation is universally worse on AMD, and that's when
| it's not completely broken. VR is the same way, and AMD is
| now six months late on their "we promise AI won't be busted
| on RDNA3" promise.
| gymbeaux wrote:
| I've read about, and had, enough issues with Nvidia drivers
| in the last couple of years that I'm willing to say AMD has
| effectively caught up in that regard, especially when you
| consider "FineWine" (the common occurrence of AMD GPUs
| gaining performance over time through driver updates, which
| seldom happens on the Nvidia side).
| dotnet00 wrote:
| Wasn't the 4090 roughly twice the performance of a 3090 at a
| similar cost?
| BeefWellington wrote:
| The 3090 and the 4090 are on different process nodes
| (8/10nm vs 4nm), so you'd expect the 4090 to be more
| efficient. Pumping the same (or in this case more) power
| through a more efficient node _should_ result in
| improvements in performance.
|
| Hardware Unboxed covers this pretty well here:
| https://www.youtube.com/watch?v=aQklDR8nv8U&t=1365s
|
| You can see that, compared to the 3090Ti or the 6950XT*, it's
| sipping power when it's performance-limited. I think what
| many gamers see, though, is that the performance difference
| is about what you'd expect going from 8nm to 4nm, based on
| the performance gains we've seen in the past from other
| chips.
|
| Anyway, the rough idea is: if I built a 3090 on a 4nm node,
| it would perform the same. Plus, Nvidia has a track record
| of how they behave when the competition isn't up to snuff.
|
| Now, this is silly, because using a more leading-edge
| manufacturing process takes a ton of work and _is_ an
| improvement overall.
|
| [*]: The 6950XT is manufactured on a 7nm node from TSMC.
| Note that there is weirdness in how each company measures
| its process node, which I won't go into here; just be aware
| that a TSMC N7 process might not mean 7nm-sized features
| everywhere, and a lot of it has to do with how well they can
| layer things.
| dotnet00 wrote:
| The person I replied to said price/performance, not
| efficiency.
|
| With the high-end chips, I think they're simply of the
| opinion that performance is more important than efficiency
| (and I don't really pay attention to the lower range much
| these days), and thus, like most other high-end desktop
| parts, they're all clocked as high as the chip and cooling
| can handle, even if dialing it back a little would
| drastically improve efficiency.
| dghughes wrote:
| The ancient Greeks believed that life involves, and needs, two
| types of suffering: one is war, which is destructive; the
| other is competition. Competition benefits society: it
| enlightens it and moves it forward. Competition isn't meant to
| destroy your competitor; that would be war. Moore's Law may be
| a type of the good suffering.
| ilaksh wrote:
| What about radical new compute-in-memory paradigms based on
| memristors or something? We really need more performance for
| running transformers, not for general-purpose CPUs.
| _hypx wrote:
| The problem with chiplets is that you can only fit so many of
| them in a realistic package. Once you hit that limit, you will
| need a way to pack more transistors onto a single die. As a
| result, chiplets can only partially solve some of the issues
| of Moore's Law. They can't be a long-term extension of it.
| OliverGuy wrote:
| I suppose the packaging problem can be solved - up to a
| point - by stacking vertically as well as placing chiplets
| next to the IO die. I wonder how well that will scale; could
| we stack multiple chiplets high? What effect would that have
| on thermals?
| _hypx wrote:
| Vertical chip stacking is another idea being proposed. But as
| you said, it will have thermal problems.
|
| It could be possible to stack less power-hungry chips, like
| cache dies, but probably not the main CPU/GPU cores. It will
| just be another way of extending Moore's Law.
| bravetraveler wrote:
| Would the X3D cache setup on the newer AMD CPUs count as
| vertical stacking?
|
| I'm a bit naive about the actual... physical bits. I'm not
| clear on what makes a chip, and whether the cache would be
| that, or something a bit simpler.
| jagger27 wrote:
| Yes, it totally counts and is indeed being vertically
| stacked on Ryzen products today. Cache takes up a huge
| portion of total die area and doesn't hurt temperatures
| too much.
|
| For now, AMD is using the technique to add more cache to
| their CPUs rather than moving all of it to another layer.
| adgjlsfhk1 wrote:
| As I understand it, this is because 3D stacking is fairly
| expensive (it adds a lot of silicon and extra manufacturing
| cost), so 3D cache only makes sense once your chip is too
| big to fit all the cache in 2D.
| convolvatron wrote:
| I suspect there is a yield issue also. That is, one large
| chip with one defect in the wrong place is just glass; with
| chiplets, you would assemble the module after per-chiplet
| unit testing.
|
| But maybe redundancy and binning have made this moot.
| polishdude20 wrote:
| I wonder, if/when hardware can't continue to get better, will
| software then become the differentiating factor in
| performance?
| BenoitP wrote:
| It sure can. Large parts of the whole stack have been built
| by trading performance away for developer velocity.
|
| But hardware still has plenty of runway. The third dimension
| is barely beginning. Customizing the hardware to the workload
| also has a lot of untapped potential. RISC-V has a committee
| discussing accelerating dynamic languages, for example:
| memory tagging for automatic array-bounds checking, pointer
| masking to help GCs, custom instruction caches, etc. And you
| can go deeper; why not have a Node webserver chip?
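|
| To make the pointer-masking idea concrete, here's a minimal
| software-only sketch (portable C++; no hardware support
| assumed) of the trick GCs play: stash metadata in the unused
| high bits of a 64-bit pointer and strip it before every
| dereference. Hardware pointer masking would make that strip
| free on every load instead of an extra instruction.
|
|     #include <cstdint>
|     #include <cstdio>
|
|     // User-space pointers don't use the top bits on today's
|     // 64-bit systems, so a GC can hide a tag up there
|     // (e.g. an object's color or generation).
|     constexpr uint64_t TAG_SHIFT = 56;
|     constexpr uint64_t ADDR_MASK = (1ULL << TAG_SHIFT) - 1;
|
|     template <typename T>
|     T* tag_ptr(T* p, uint8_t tag) {
|         uint64_t bits = reinterpret_cast<uint64_t>(p) & ADDR_MASK;
|         return reinterpret_cast<T*>(bits | (uint64_t(tag) << TAG_SHIFT));
|     }
|
|     template <typename T>
|     T* untag_ptr(T* p) {  // the step hardware masking makes free
|         return reinterpret_cast<T*>(reinterpret_cast<uint64_t>(p) & ADDR_MASK);
|     }
|
|     int main() {
|         int x = 42;
|         int* tagged = tag_ptr(&x, 3);
|         uint64_t tag = reinterpret_cast<uint64_t>(tagged) >> TAG_SHIFT;
|         std::printf("tag = %llu, value = %d\n",
|                     (unsigned long long)tag,
|                     *untag_ptr(tagged));  // must mask before dereferencing
|     }
|
| (ARM's Top-Byte Ignore and the proposed RISC-V pointer-masking
| extension do essentially this strip in hardware.)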
| bayindirh wrote:
| > Large parts of the whole stack have been built by trading
| performance away for developer velocity.
|
| The time is always paid by someone. If developers decide to
| write the best code, they pay the price once, in development
| time. If the code is written with a developer-velocity-first
| perspective, every user pays the price every time they run
| the code.
|
| Also, developer velocity vs. program performance is not the
| right lens to begin with. I have used at least half a dozen
| libraries that have great developer ergonomics and
| world-leading performance (e.g. Eigen, Catch2, easylogging++,
| off the top of my head).
| transcriptase wrote:
| As far as I can tell, hardware gains are enabling an entire
| generation of programmers to ignore optimization and
| performance entirely, outside of niche areas.
|
| The sluggishness, absurd loading times, and RAM usage of even
| the simplest applications on blisteringly fast hardware are
| astonishing.
| bee_rider wrote:
| This has been going on for two or three generations at least.
| Rapzid wrote:
| Seems fine to me.
| bayindirh wrote:
| When you start to pay attention to the code you write, it
| speeds up tremendously, even without fancy optimizations.
| Clean, sensible code is also already optimized a great deal
| by compilers.
|
| A numerical Gaussian integration code I wrote around 2017
| runs at 1.7 _million_ iterations _per second, per core_, on
| a stock, 3rd-generation i7-3770K.
|
| It's implemented in C++ and simply written carefully. No
| fancy optimizations are done.
|
| If programmers paid half the attention their code deserves, I
| bet most of the software we use would be 2x faster, on
| average.
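|
| For flavor, here is a minimal sketch of the kind of plain,
| careful code I mean; this is textbook 5-point Gauss-Legendre
| quadrature, not my actual (novel) method:
|
|     #include <cmath>
|     #include <cstdio>
|
|     // Standard 5-point Gauss-Legendre nodes and weights on [-1, 1].
|     constexpr double X[5] = {0.0,
|                              0.5384693101056831, -0.5384693101056831,
|                              0.9061798459386640, -0.9061798459386640};
|     constexpr double W[5] = {0.5688888888888889,
|                              0.4786286704993665,  0.4786286704993665,
|                              0.2369268850561891,  0.2369268850561891};
|
|     // Integrate f over [a, b]: exact for polynomials up to
|     // degree 9, and just a five-iteration loop per call, so
|     // the compiler can inline and vectorize it easily.
|     template <typename F>
|     double gauss5(F f, double a, double b) {
|         const double mid = 0.5 * (a + b), half = 0.5 * (b - a);
|         double sum = 0.0;
|         for (int i = 0; i < 5; ++i)
|             sum += W[i] * f(mid + half * X[i]);
|         return half * sum;
|     }
|
|     int main() {
|         const double pi = std::acos(-1.0);
|         // The integral of sin(x) over [0, pi] is exactly 2.
|         double r = gauss5([](double x) { return std::sin(x); }, 0.0, pi);
|         std::printf("%.9f\n", r);  // ~2, error on the order of 1e-7
|     }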
| bee_rider wrote:
| In this case, it would probably be best to use a tuned
| library, right?
|
| The tricky bit is finding the spot between "find a
| library" and "be lazy about it," if you want to find
| tasks worth doing well.
| bayindirh wrote:
| Yes, though the integration itself was not handled by a
| library. I coded it myself (because it was a novel method). I
| used Eigen for the matrix-related stuff because it was fast
| and _fun_ to use, not to mention how feature-packed it is.
| NovaDudely wrote:
| I came across this recently; it's a fun way of demonstrating
| how performance can change with optimized code.
|
| It's about running a rebuild of Minecraft on late-90s and
| early-2000s Macs, showing the huge performance differences.
|
| Yes, Minecraft is not a great example, but it is visually
| striking.
|
| https://www.youtube.com/watch?v=awGVUxl0T0Y
| Rapzid wrote:
| Great, wow, that's interesting.
|
| I mean as a user everything seems fine to me. Things are
| getting better, not worse.
| bayindirh wrote:
| That's because the OS can hide the resource waste pretty
| cleverly. macOS compresses RAM on the fly; Linux moves very
| stale pages to swap early to keep memory utilization in
| check.
|
| On that front, I was able to store two 3000x3000
| double-precision dense matrices (at 8 bytes per double,
| that's 72MB per matrix) and some big double-precision
| vectors, _plus the 3D model representing the object I'm
| working on_, in ~250MB IIRC. The code is a scientific
| materials simulator, BTW.
|
| Eclipse (the IDE) uses ~1GB (after warming up) with my fat
| C++ config, and that thing is written in Java and is a
| full-blown IDE with an always-on indexer and static analyzer.
|
| VSCode uses 1GB for nothing. Atom used 1GB for nothing.
| Evernote uses ~900MB on startup on macOS. That thing takes
| notes!
|
| We have the resources and wizardry to keep bad programming in
| check, but we could do much better if we wanted to.
| NovaDudely wrote:
| This is why I say that if folks want the "year of the Linux
| desktop", they should focus on optimising performance now,
| before it's forced. Make older machines perform better and
| avoid a lot of the hardware lockdown that's coming.
|
| Note: I don't believe there will ever be a year of the
| desktop. But for libre/open source to thrive, it cannot just
| be free/open. It has to be good.
| ilaksh wrote:
| Then they invent a new hardware paradigm.
| thechao wrote:
| There's an enormous stack of SW in the HW flow that's
| enabling "cheap" ASIC development at the cost of chip
| performance. There's probably as much room to optimize the
| HW on any given node as there are optimization opportunities
| for the SW on that part.
|
| Both SW _and_ HW have been leaning on Moore's Law for a long
| time.
| anticensor wrote:
| Wirth's Law will eventually win.