
[HN Gopher] SiFive Tapes Out First 5nm TSMC 32-bit RISC-V Chip w...
___________________________________________________________________
SiFive Tapes Out First 5nm TSMC 32-bit RISC-V Chip with 7.2 Gbps HBM3

Author : pabs3
Score  : 144 points
Date   : 2021-04-15 06:29 UTC (1 days ago)

web link (www.tomshardware.com)
w3m dump (www.tomshardware.com)
| zozbot234 wrote:
| IIRC, 32-bit RISC-V is only intended for deep embedded workloads,
| with 64-bit for general purpose compute. So a SoC w/ a single
| 32-bit core would seem to be a less-than-ideal fit for the
| cutting-edge 5nm process.
| tyingq wrote:
| The core is supposed to compete with the Cortex M7. The
| smallest process M7 I can find is the STM32H7, which is 40nm.
| makapuf wrote:
| I rave for stm32 with high end processes (10nm or less),
| whether that makes sense or not. I just love stm32..
| dragontamer wrote:
| Routers / Switches have extremely weird performance
| characteristics, and I think that's what SiFive is targeting
| with this chip.
|
| * HBM3 for the highest memory bandwidth (10Gbps switches need
| tons and tons of bandwidth. That's 10Gbps per direction per
| connection, 8x ports is 160Gbps, and then that's multiplied
| multiple times over by every memcpy / operation your chip
| actually does. You need to DELIVER 160Gbps, which means your
| physical RAM-bandwidth needs to be an order of magnitude
| greater than that)
|
| * Embedded 32-bit design for low-power usage.
|
| * All switches have small, fixed size buffers. Memory capacity
| is not a problem, it's feasible to imagine useful switches and
| routers (even 10Gbps, 40Gbps, or 100Gbps) that only have
| hundreds-of-MBs of RAM. As such, 32-bit is sufficient and
| 64-bit is a waste (You'd rather halve your pointer memory
| requirements with 32-bit pointers rather than go beyond 4GB
| capacity)
| GoblinSlayer wrote:
| It's E76 with F set, and F set is huge compared to RV64I. And
| the article proposes HPC as possible application.
| rjsw wrote:
| Routers need quite a bit of memory to handle IPv6.
|
| Switches as an application of this makes sense.
| jandrese wrote:
| IPv6 address comparison on a 32 bit design is fairly awkward.
| Switches won't care, but routers need to make routing
| decisions.
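
To make the point about IPv6 on a 32-bit core concrete, here is a minimal sketch of a prefix match done one 32-bit word at a time. It is an illustration only, not anything from the article or from SiFive: the helper names `to_words` and `prefix_matches` are hypothetical, and Python integers are masked to 32 bits to stand in for 32-bit registers.

```python
import ipaddress

WORD_BITS = 32
WORD_MASK = 0xFFFFFFFF


def to_words(addr: str) -> tuple:
    """Split a 128-bit IPv6 address into four 32-bit words, most significant first."""
    value = int(ipaddress.IPv6Address(addr))
    return tuple((value >> shift) & WORD_MASK for shift in (96, 64, 32, 0))


def prefix_matches(addr_words: tuple, prefix_words: tuple, prefix_len: int) -> bool:
    """Return True if the address lies inside prefix/prefix_len.

    Only 32-bit-wide masks and compares are used, so a /128 match
    needs up to four word comparisons instead of one.
    """
    remaining = prefix_len
    for addr_word, prefix_word in zip(addr_words, prefix_words):
        if remaining <= 0:
            break
        bits = min(remaining, WORD_BITS)
        # Keep only the top `bits` bits of this word.
        mask = (WORD_MASK << (WORD_BITS - bits)) & WORD_MASK
        if (addr_word & mask) != (prefix_word & mask):
            return False
        remaining -= bits
    return True


if __name__ == "__main__":
    addr = to_words("2001:db8:abcd::1")
    route = to_words("2001:db8::")
    print(prefix_matches(addr, route, 32))
    print(prefix_matches(addr, route, 48))
```

A 64-bit core would need half as many compares for the same lookup, which is one way to read the "fairly awkward" remark; real routers typically use tries or TCAMs rather than a linear check like this.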
| foobiekr wrote:
| While these are all good points, this really does not appear
| to be a competitive NPU design on any axis that matters. I
| don't know what this chip is for, but a router NPU it is not,
| nor a switch. Maybe some soho switch or smart NIC, but those
| have moved on far along the performance spectrum away from
| the place where this would fit.
| zibzab wrote:
| Yeah, this seems like an odd move to me.
|
| For this kind of application the static leakage of the newer
| & smaller node will probably hurt rather than help.
| [deleted]
| justincormack wrote:
| I think it means 32 bit floating point, not 32 bit CPU, as it
| mentions "other relatively simplistic applications that do not
| require full precision" but it's a bit unclear.
| phendrenad2 wrote:
| The quote that stands out to me is that the core is "ideal
| for applications which require high performance -- but have
| power constraints (e.g., Augmented Reality and Virtual
| Reality, IoT Edge Compute, Biometric Signal Processing, and
| Industrial Automation)."
| Fordec wrote:
| With my industry / product management / business strategy hat
| on, totally agree from SiFive's perspective.
|
| With my early days electronics hat on, the 5nm process adds
| additional energy performance gains that in conjunction with
| RISCV in an embedded environment, especially in a battery
| powered remote operation use case, has me salivating at what
| could be achieved from a would-be customer perspective.
| volta83 wrote:
| HBM2 is like 2Tb/s, how is HBM3 7GB/s?
| hajile wrote:
| HBM3 wasn't just supposed to be about speed. It also offers a
| 512-bit option that doesn't require a silicon interposer. I'd
| guess this was added to make cheaper consumer GPU designs
| possible.
|
| I suspect they're using the HBM2 spec for the narrow bus and
| cheaper interposer while keeping speeds lower and only using a
| couple stacks instead of the 16 or so HBM2 stacks required for
| those 2Tb/s speeds you mention. It makes sense given that their
| chip likely couldn't use a huge amount of bandwidth anyway.
| virtuallynathan wrote:
| I think that's per-pin bandwidth?
| vmception wrote:
| HBM3 was expected to be like 4GB/s per pin which was seen as
| double HBM2 per pin, so this is almost double that again,
| which is good news
|
| The HBM2 total memory bandwidth is like 2TB/s, just different
| scale
|
| Anyway I could totally be using wrong nomenclature and
| terminology, feel free to discuss, these aren't assertions or
| aren't strongly held assertions
| throwaway4good wrote:
| What is the use case of this chip? I have the feeling it is some
| way away from a general purpose CPU / SOC like the Apple M1?
| 01100011 wrote:
| RTFA?
|
| > The SoC can be used for AI and HPC applications and can be
| further customized by SiFive customers to meet their needs.
| Meanwhile, elements from this SoC can be licensed and used for
| other N5 designs without any significant effort.
|
| > The SoC contains the SiFive E76 32-bit CPU core(s) for AI,
| microcontrollers, edge-computing, and other relatively
| simplistic applications that do not require full precision.
| throwaway4good wrote:
| So it is a proof of concept / demo of subcomponents someone
| else may license? Is that a correct interpretation?
| sanxiyn wrote:
| Yes.
| klelatti wrote:
| How did SiFive get anywhere near 5nm TSMC?
| baq wrote:
| perhaps paid some money when the process wasn't booked till the
| end of time
| lizknope wrote:
| They pay money just like any other customer of TSMC. SiFive has
| a lot of buzz in the industry. I wouldn't be surprised if
| TSMC wanted to work with them.
|
| But there are other intermediary companies that help startups
| group multiple chips from multiple companies together into a
| single mask. This is called a "shuttle" and allows the
| companies to split the costs of the masks (I've heard up to $30
| million for 5nm)
|
| SiFive is probably building about 2,000 of these chips for
| development boards. They aren't trying to order a hundred
| million like Nvidia.
| klelatti wrote:
| Thanks that's very interesting. No intention in any way to
| belittle SiFive - just puzzled as to how they managed to get
| onto this process when it's obviously so much in demand. Good
| for them!
| RicoElectrico wrote:
| For test chips there is something called shuttle.
|
| Other than that, foundries are known to sponsor IP development
| on their processes.
| snypher wrote:
| "The tape out means that the documentation for the chip has
| been submitted for manufacturing to TSMC, which essentially
| means that the SoC has been successfully simulated. The silicon
| is expected to be obtained in Q2 2021."
|
| Would this mean the actual chip delivery may still be delayed?
| StringyBob wrote:
| Chip manufacturing has many steps. For a new leading edge
| process it may take 3-6 months to get silicon back after
| submitting the design to a silicon foundry for manufacturing.
|
| For a small volume 'shuttle' run hopefully there won't be
| delays, but this is not the same as having working chips!
|
| The foundry will do initial checks it is manufacturable at
| 'tapeout' when you submit your design, but you don't know for
| sure if your chip works with intended functionality until you
| get it back! You are relying on lots and lots of simulations
| up front before your 'tape-out'.
|
| Sometimes issues are found and a chip requires a re-spin -
| basically another go with the bugs fixed. You want to do this
| as few times as possible (ideally right first time) due to
| cost and time of these iterations.
| gumby wrote:
| It's also in TSMC's marketing interest to produce a small
| number of RISC-V parts with their latest process.
|
| Plus it's probably fun for some of the people there.
| ohazi wrote:
| I know they're separate lines and capacity is sold well in
| advance and all that, but this chip shortage still baffles me.
|
| A startup can tape out a 5 nm chip, but STMicroelectronics can't
| make any of their 40-130 nm microcontrollers for the next year?
|
| Also car companies are supposedly the culprit, even though their
| volume is only in the low tens of millions per year, and the
| dustup is apparently over only six months of capacity? What? I
| get that the auto industry is a nice reliable long-term source of
| revenue for chip companies, but fabs should barely be sneezing at
| that sort of volume.
| lizknope wrote:
| I'm in the semiconductor industry.
|
| I don't really understand your question.
|
| Anyone can start a company and tape out a chip even in 5nm. My
| previous startup did something similar. We used an intermediate
| company between us and TSMC that specifically works with
| smaller companies. They (or TSMC) will bundle together 4 to 20
| chips into a common mask as a "shuttle" run. Shuttle runs are
| really only used to get samples for the first version of your
| chip. You can't really go to production with them because the
| mask has chips from multiple different companies but this
| allows all of the companies to share the mask costs (I've heard
| up to $30 million for 5nm)
|
| What is ST Micro talking about? I assume they can produce chips
| but can't get the volume that they want. SiFive are probably
| producing about 2,000 of these chips for development and test
| boards. ST Micro would be buying in the hundreds of millions or
| tens of billions range.
| bogomipz wrote:
| > "Shuttle runs are really only used to get samples for the
| first version of your chip."
|
| Is a "tape out" the same thing as a shuttle run/sample chip
| run?
| Kliment wrote:
| A "tape out" is the process of transforming a design into a
| physical die - i.e. a manufacturing run. It's when you hand
| over a design to a foundry to do their thing with it.
| zibzab wrote:
| Sounds like OSH Park for silicon...
|
| Anyway, I'm still not sure why SiFive is doing this. Seems
| like a waste of money even as a prototype
| lizknope wrote:
| The article mentions that it is from the OpenFive division
| of SiFive. OpenFive used to be Open Silicon and their
| business model was working with other companies to take
| their Verilog RTL and do all of the physical design
| (synthesis to logic gates, place and route of the standard
| cells, timing analysis, test vector generation) and then
| work with the foundries to deliver all of the data for
| manufacturing.
|
| Since Open Silicon is now OpenFive and part of SiFive they
| literally have all this experience in house and don't need
| to depend on another company between them and TSMC.
|
| https://en.wikipedia.org/wiki/Open-Silicon
| variaga wrote:
| SiFive is in the business of selling IP cores and back-end
| implementation services. The gold standard for IP core
| validation is "silicon proven" i.e. that it's not just a
| nice theoretical design on paper, but someone has actually
| turned it into a physical chip and tested the real life
| performance.
|
| _Lots_ of people will try to sell you their designs and
| services. Picking the wrong ones can waste millions of
| dollars and months/years of time.
|
| The money spent on this prototype buys SiFive credibility
| for both aspects of their business (assuming the chip
| works) - "we were able to do this for ourselves, so you
| know we'll be able to do it for you".
|
| So it's not a waste, it's a marketing expense, and a
| necessary one.
| varispeed wrote:
| Out of curiosity - what software is being used to design
| chips? Is there anything within reach of a small company, or
| something open source?
| thechao wrote:
| Front-end is HDLs -- (System)Verilog, VHDL, etc.
| Implementation and formal will be Jasper & its ilk. Backend
| (physical, etc.) uses fab-specific bespoke software from the
| majors (Cadence, NXP, MG, Synopsys, ...).
|
| The front-end stuff could be done by _one person_;
| Verilator is a great example (although it's now "in house"
| to NXP). Implementation, LEC, etc. are mathematically
| intimidating -- they're proof engines -- but doable by a
| small team.
|
| Physical _requires_ inside knowledge of the fabs. The fabs
| aren't going to let you participate unless you're a major,
| because it costs them a lot of money, and each additional
| participant is another potential leak of their critical IP.
|
| The tooling is all "vertical" and starts on the backend. If
| you can't do backend, you're not a player.
| jecel wrote:
| The commercial tools are indeed very expensive but the
| required data files can be as much of a problem. Normally
| you have to sign a bunch of NDAs (non disclosure
| agreements) to get your hands on the design rules and
| standard cell libraries supplied by the foundries and
| required to make the tools work.
|
| One effort to organize several previously available open
| source tools into a practical system is OpenLane, which is
| based on the DARPA OpenROAD project:
|
| https://woset-workshop.github.io/PDFs/2020/a21.pdf
|
| Recently, Google has financed a project where a foundry has
| made its data files available without any NDAs:
|
| https://github.com/google/skywater-pdk
|
| The combination has made it possible to have completely
| open source chip designs.
| PragmaticPulp wrote:
| > Also car companies are supposedly the culprit, even though
| their volume is only in the low tens of millions per year, and
| the dustup is apparently over only six months of capacity?
| What? I get that the auto industry is a nice reliable long-term
| source of revenue for chip companies, but fabs should barely be
| sneezing at that sort of volume.
|
| I agree. I think the blame on automakers has been blown out of
| proportion. It doesn't make any sense that automakers cancelled
| orders, then reinstated those orders again with some extra
| demand, and now the entire chip market is stalled.
|
| It's most likely due to the fact that consumer demand is up
| everywhere. The pandemic didn't hit the economy nearly as hard
| as expected, and we piled a lot of stimulus on top of that.
| Savings rate went up a bit, but much discretionary spending was
| diverted away from things like dining out and toward buying
| consumer goods.
|
| > STMicroelectronics can't make any of their 40-130 nm
| microcontrollers for the next year
|
| They're almost certainly making huge volumes of
| microcontrollers, but they're all spoken for with orders from
| the highest bidders.
|
| We won't have inventory sitting on shelves again until fab
| capacity isn't being 100% occupied by existing orders. Need
| some surplus before we can get parts at DigiKey.
| bravo22 wrote:
| A lot of chips are made on mature fab lines because they don't
| need the performance of 5nm lines or can't justify the mask
| costs.
|
| No one is investing in mature fab lines because they're not
| leading edge and they're being run to amortize the initial
| investment made into them years ago. Therefore not much
| additional capacity for mature lines.
|
| So yes you can see 5nm chips being taped out but the 40-130nm
| chips are squeezed for capacity. Also this chip is likely not
| running in the same crazy volumes as ST microcontrollers.
| It is easier for TSMC to squeeze in a few dozen to a hundred
| wafers for SiFive on their line.
| dragontamer wrote:
| > A lot of chips are made on mature fab lines because they
| don't need the performance of 5nm lines or can't justify the
| mask costs.
|
| Alternatively: they're car-scale products dealing primarily
| with high electric currents (10s or 100s of milliamps) and/or
| higher voltages (5V instead of 1.3V).
|
| Smaller chips use (and therefore output) less current than
| larger scale chips. But if your goal is to output 10mA to
| better drive an IGBT or other transistor anyway, then you
| really prefer 40nm to 130nm ANYWAY, because those larger
| sizes are just a lot better at moving those large currents
| around.
|
| Bigger wires mean bigger currents.
| bravo22 wrote:
| High voltage MOSFETs and IGBTs are built on a completely
| different process. Size is definitely not an issue with
| them. It is about exotic doping to create the desired
| characteristics.
|
| They're built using much larger feature sizes but on
| completely separate lines.
| dragontamer wrote:
| I'm not really in the industry. But I know that high-voltage
| MOSFETs / IGBTs need substantial amounts of
| current to turn on / off adequately. Under typical use,
| there's a dedicated chip called a "Gate Driver" that
| provides that current, between a microcontroller and the
| IGBT.
|
| It's not that the IGBT / MOSFETs are built on these
| microcontrollers. It's that the Gate-Driver can be
| integrated into a microcontroller (simplifying the
| circuit design and reducing the number of parts you need
| to buy).
|
| Under normal circumstances, a microcontroller can
| probably source/sink 1mA (too little to adequately turn
| on an IGBT). You amplify the 1mA with a gate-driver chip
| into 100mA, and then the amplified 100mA is used to turn
| on/off the IGBT.
|
| By integrating a gate-driver into the microcontroller,
| you save a part.
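
A rough way to see why the 1mA versus 100mA figures above matter: with a constant drive current, the time to move a gate charge Q is about t = Q / I. The sketch below uses the two currents from the comment; the 100 nC gate charge is a hypothetical round number chosen for illustration, not a figure from the thread or from any datasheet, and the function name is made up.

```python
def gate_switch_time_s(gate_charge_c: float, drive_current_a: float) -> float:
    """Approximate switching time in seconds, assuming constant drive current (t = Q / I)."""
    return gate_charge_c / drive_current_a


# Assumption for illustration only: total gate charge of 100 nC.
ASSUMED_GATE_CHARGE_C = 100e-9

# Drive currents quoted in the comment above: bare MCU pin vs. gate driver.
DRIVE_CURRENTS_A = {"MCU pin": 1e-3, "gate driver": 100e-3}

if __name__ == "__main__":
    for label, current in DRIVE_CURRENTS_A.items():
        t = gate_switch_time_s(ASSUMED_GATE_CHARGE_C, current)
        print(f"{label}: {current * 1e3:.0f} mA -> {t * 1e6:.1f} microseconds")
```

Under that assumed gate charge, the hundredfold increase in drive current shortens the switching time by the same factor, from roughly 100 microseconds to roughly 1 microsecond. The transistor spends correspondingly less time partially on, which is the usual reason a gate driver sits between the microcontroller and the IGBT.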
| variaga wrote:
| Your point is valid, but this is almost certainly a shuttle
| run, so it won't be even one full wafer.
| bravo22 wrote:
| You're right. Definitely a "hot" wafer for the engineering
| samples.
| monocasa wrote:
| ST fabs their own chips. If their fabs don't have the capacity,
| it's a huge slog to tape them out to a radically different
| process at another company.
| Kliment wrote:
| This is an extremely low volume prototype run. You can get
| those scheduled on short notice. Fabs love them because they
| can do process optimization using them, without impacting
| production customers. They're ridiculously expensive per-die
| and you commit to accept a much higher failure rate than
| normal.
|
| ST can and is making microcontrollers. It's just that they've
| sold their production for a year ahead, before it's even been
| manufactured. Car companies fucked everyone over by flipping a
| large volume of orders back and forth causing a bullwhip effect
| on the whole industry, and lots of knock-on effects in other
| industries who suddenly got told (occasionally too late) that
| they need to plan their inventory a year ahead because they
| can't get anything at short notice anymore. Car companies'
| vehicle production volume is tens of millions, but each vehicle
| has thousands to tens of thousands of ICs. The six months you
| are mentioning are not the capacity period, they are the _lead
| times_ involved.
|
| I don't want to repeat the whole story but I wrote a comment
| about this on another thread. See
| https://news.ycombinator.com/item?id=26659709
| jankeymeulen wrote:
| Thousands to tens of thousands per car? I think you're off by
| an order of magnitude.
| rowanG077 wrote:
| What? You think it's ten thousands to hundred thousand.
| Hundred thousand seems excessive to me.
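
The disagreement above is about one multiplication: vehicles per year times ICs per vehicle. The sketch below shows how far the total moves with each order of magnitude in the per-vehicle count. The 50 million vehicles figure is an assumed stand-in for "tens of millions", and the per-vehicle counts simply span the range argued over in the comments; none of these are measured numbers.

```python
def annual_ic_demand(vehicles_per_year: int, ics_per_vehicle: int) -> int:
    """Total ICs consumed per year by vehicle production."""
    return vehicles_per_year * ics_per_vehicle


# Assumption: 50 million vehicles/year stands in for "tens of millions".
ASSUMED_VEHICLES_PER_YEAR = 50_000_000

# Per-vehicle counts spanning the estimates disputed in the thread.
PER_VEHICLE_ESTIMATES = (100, 1_000, 10_000)

if __name__ == "__main__":
    for ics_per_vehicle in PER_VEHICLE_ESTIMATES:
        total = annual_ic_demand(ASSUMED_VEHICLES_PER_YEAR, ics_per_vehicle)
        print(f"{ics_per_vehicle:>6} ICs/vehicle -> {total:.1e} ICs/year")
```

Even at the low end, the total is in the billions of parts per year, so vehicle counts alone understate what the auto industry asks of fabs.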
| buildbot wrote:
| I know a typical Mercedes has roughly a hundred individual
| computers, not too far-fetched to think the average chip
| count could be 10 or higher per device on the CAN bus.
| mschuster91 wrote:
| Almost everything in a car has a _number_ of chips. Power
| regulations, communication buses... and in electric cars
| with thousands of batteries, _at least_ one chip per
| battery for protection.
| osamagirl69 wrote:
| This is blatantly false, unless you are confusing battery
| for an assembled battery pack. In EVs each battery
| management IC can run somewhere in the range of 4-14
| cells in series per chip, and they almost universally run
| banks of up to 100 cells in parallel. For example, in the
| Tesla Model S the pack is divided into submodules of 76
| cells in parallel and 6 of those groups in series per
| management chip--so only one management chip per 456
| cells.
| dragontamer wrote:
| Electric cars have ONE battery with thousands of cells.
| I do realize that the colloquial term for "cell" is
| "battery" (ex: an AA cell is called a battery), but it
| becomes important to be precise with our words when
| talking about manufacturing.
|
| Small scale Li-ion does a protection-IC per cell (ex:
| cell phones), mostly because cell phones are so small
| they only use one cell.
|
| Larger scale Li-ion, such as Laptop batteries, may use
| one-IC per cell, OR one-protection IC for all 3x or 4x
| cells combined. As long as all the cells are soldered
| together, one protection IC is cheaper and still usable.
|
| At electric-car scales, you have thousands-and-thousands
| of cells. You can't just manage all of them with one IC,
| so you build an IC per bundle. Maybe 48 cells or
| 100-cells per IC or so.
| mschuster91 wrote:
| Indeed yes I meant cells, I'm not a native English
| speaker.
|
| > At electric-car scales, you have thousands-and-thousands
| of cells.
| You can't just manage all of them
| with one IC, so you build an IC per bundle. Maybe 48
| cells or 100-cells per IC or so.
|
| Ah okay, I had expected something more on the order of 1
| IC per 4 cells to allow individual cell health
| monitoring.
| dragontamer wrote:
| > Indeed yes I meant cells, I'm not a native English
| speaker.
|
| You're doing fine. Native English speakers don't know the
| difference between cell or battery either. This is more
| of a precise / technical engineering distinction.
|
| * 9V Battery (https://imgur.com/FHJdhIK), a collection of
| 6x cells.
|
| * AAAA Cell (one singular chemical reaction of 1.5V)
|
| Notice that the imgur is wrong: they call it a AAAA
| battery (when the proper term is a AAAA cell).
|
| --------
|
| "Battery" is a bunch of objects doing one task.
| Originally, a "battery" described cannons. Or two rooks
| (in chess) that work together. Or... 6x 1.5V cells
| working together to produce a 9V battery.
| ohazi wrote:
| > Fabs love them because they can do process optimization
| using them, without impacting production customers.
|
| I didn't realize that, but it makes a lot of sense. I assumed
| that they acted more like the downstream manufacturers that
| I'm used to dealing with, that don't even want to talk to you
| unless they think you're going to place a huge order.
| winter_blue wrote:
| HBM might be an interesting idea. I would love to see multiple
| bandwidth levels of memory becoming a norm, with computers
| having a small amount of very fast memory and a larger pool of
| DDR4 or DDR5. We already have multiple levels of cache, why not
| have multiple levels of RAM? Operating systems and software
| would need to accommodate a new reality where NUMA is the norm
| though. But it's good that we even have the concept of NUMA, so
| this is not entirely uncharted/unfamiliar territory.
| wmf wrote:
| You would love to see computers become harder to program?
| makapuf wrote:
| It can be nice to have the opportunity to program something
| harder but faster. Counter example: Itanium, which was too
| hard to program (compilers) for.
| sanxiyn wrote:
| It is kind of ironic that compiler theory has advanced and
| now we can target Itanium no problem. It was a bit (well, a
| lot) ahead of its time.
| winter_blue wrote:
| I would try to build a new compiler (or an LLVM intermediary
| processing layer) that does NUMA optimizations.
| [deleted]
___________________________________________________________________
(page generated 2021-04-16 22:01 UTC)