[HN Gopher] Automated CPU Design with AI
___________________________________________________________________
 
Automated CPU Design with AI
 
Author : skilled
Score  : 61 points
Date   : 2023-07-02 20:59 UTC (2 hours ago)
 
web link (arxiv.org)
w3m dump (arxiv.org)
 
| dooglius wrote:
| There doesn't seem to be any discussion of what the inputs and
| outputs actually are here, at least for the "coarse-grained"
| approach. I suspect there is some "scaffolding" around e.g. the
| register map and memory access, and the rest is essentially
| learning a map from (instruction, register input vals) ->
| (register output vals, control registers for memory access).
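| 
| If that's right, the black box being learned would be shaped
| roughly like this (a sketch; every name below is my conjecture,
| not anything from the paper):
| 
|   from typing import NamedTuple, Optional
| 
|   class StepIn(NamedTuple):
|       instruction: int         # 32-bit RISC-V instruction word
|       regs: tuple              # register file values going in
| 
|   class StepOut(NamedTuple):
|       regs: tuple              # register file values coming out
|       mem_addr: Optional[int]  # load/store address, if any
|       mem_wdata: Optional[int] # write data, None for a load
| 
|   def learned_step(s: StepIn) -> StepOut:
|       # The claim, as I read it: learn this mapping purely from
|       # observed (input, output) pairs, with no HDL in the loop.
|       raise NotImplementedError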
 
| westurner wrote:
| From the abstract of "Pushing the Limits of Machine Design:
| Automated CPU Design with AI" (2023),
| https://arxiv.org/abs/2306.12456 :
| 
| > _[...] This approach generates the circuit logic, which is
| represented by a graph structure called Binary Speculation
| Diagram (BSD), of the CPU design from only external input-output
| observations instead of formal program code. During the
| generation of BSD, Monte Carlo-based expansion and the distance
| of Boolean functions are used to guarantee accuracy and
| efficiency, respectively. By efficiently exploring a search space
| of unprecedented size 10^{10^{540}}, which is the largest one of
| all machine-designed objects to our best knowledge, and thus
| pushing the limits of machine design, our approach generates an
| industrial-scale RISC-V CPU within only 5 hours. The taped-out
| CPU successfully runs the Linux operating system and performs
| comparably against the human-designed Intel 80486SX CPU. In
| addition to learning the world's first CPU only from input-
| output observations, which may reform the semiconductor industry
| by significantly reducing the design cycle, our approach even
| autonomously discovers human knowledge of the von Neumann
| architecture._
| 
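| For what it's worth, my loose mental model of that Monte Carlo
| expansion, for a single output bit: speculate a constant at each
| leaf, sample observed I/O to estimate how wrong the speculation
| is, and only expand on another input bit where the error is too
| high. (Pure conjecture from the abstract; the paper publishes no
| code, and I've left out its Boolean-function distance.)
| 
|   import random
| 
|   def sample_inputs(n_bits, fixed):
|       # Random input vector respecting bits fixed on this branch.
|       x = [random.randint(0, 1) for _ in range(n_bits)]
|       for bit, val in fixed.items():
|           x[bit] = val
|       return x
| 
|   def speculate(oracle, n_bits, fixed, samples=500):
|       # Majority vote over sampled observations = leaf's guess.
|       ones = sum(oracle(sample_inputs(n_bits, fixed))
|                  for _ in range(samples))
|       guess = int(2 * ones >= samples)
|       error = min(ones, samples - ones) / samples
|       return guess, error
| 
|   def build_bsd(oracle, n_bits, fixed=None, depth=0, tol=0.01):
|       fixed = fixed or {}
|       guess, error = speculate(oracle, n_bits, fixed)
|       if error <= tol or depth == n_bits:
|           return ("leaf", guess)   # speculation is good enough
|       bit = depth                  # expand on the next input bit
|       lo = build_bsd(oracle, n_bits, {**fixed, bit: 0},
|                      depth + 1, tol)
|       hi = build_bsd(oracle, n_bits, {**fixed, bit: 1},
|                      depth + 1, tol)
|       return ("node", bit, lo, hi)
| 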
| The von Neumann (and Harvard Mark) architectures have an
| instruction pipeline bottleneck, maybe by design for serial
| debuggability; as compared with, I don't know, in-RAM computing
| on existing RAM geometries? (See also: "Rowhammer for qubits")
| 
| (Edit: High Bandwidth Memory; HBM2E vs GDDR6X (2023)
| https://en.wikipedia.org/wiki/High_Bandwidth_Memory )
| 
| Hopefully part of the fitness function is determined by the
| presence and severity of hardware side channels and electron
| tunneling; does it filter out candidate designs with side-channel
| vulnerabilities (that are presumed undetectable with TLA+)?
 
  | westurner wrote:
  | And then maybe someday design a reconfigurable - probably
  | modular - semiconductor fabrication facility to produce the
  | design(s)?
 
| xeonmc wrote:
| Pentium FDIV bug, round two incoming.
 
| brucethemoose2 wrote:
| > The implemented program is executed on a Linux cluster
| including 68 servers, each of which is equipped with 2 Intel Xeon
| Gold 6230 CPUs.
| 
| > We verify our output netlist on the FPGAs and tape out the chip
| with 65nm technology. The automatically designed CPU was sent to
| the manufacturer in December 2021.
 
| granthamb wrote:
| It wasn't clear to me that they had implemented a page table (I
| think that's the S extension?), which I would think would make
| the I/O space much more complex and difficult to represent. Lack
| of VA translation would make this CPU much less comparable to a
| 486SX.
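| 
| For a sense of scale, an Sv32 walk (what S-mode paging implies)
| is roughly the following extra state machine; my sketch, with
| permission checks, A/D bits, and fault details omitted:
| 
|   def sv32_translate(va, satp_ppn, mem_read):
|       # mem_read stands in for a 32-bit physical memory read.
|       vpn = [(va >> 12) & 0x3FF, (va >> 22) & 0x3FF]
|       pte_addr = satp_ppn * 4096 + vpn[1] * 4  # root entry
|       for level in (1, 0):
|           pte = mem_read(pte_addr)
|           if not (pte & 1):             # V=0: page fault
|               raise MemoryError("page fault")
|           if pte & 0b1110:              # R/W/X set: leaf PTE
|               # level-1 leaves map 4 MiB superpages
|               mask = (1 << (12 + 10 * level)) - 1
|               return (((pte >> 10) << 12) & ~mask) | (va & mask)
|           pte_addr = ((pte >> 10) << 12) + vpn[0] * 4
|       raise MemoryError("page fault")   # no leaf found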
 
| behnamoh wrote:
| Wasn't Google Tensor already designed by one of Google's AIs? I
| remember it being a big deal because people thought Google could
| improve their chips much faster than the competition.
 
  | rowanG077 wrote:
  | That was just placement, not abstract circuit design.
 
| optimalsolver wrote:
| Could some really alien CPU architectures be discovered with this
| method?
| 
| Just wondering how far from human design-space you could end up
| with this.
 
  | ninkendo wrote:
  | Silicon validation is a huge part of the overall cost of
  | bringing up a chip, because it's so important that the physical
  | hardware do what it's supposed to do. So it's gonna be limited
  | to behaving exactly as the validation specifies, which likely
  | will limit how "alien" it will actually be.
 
| amelius wrote:
| I didn't read the paper, but judging from the abstract it's
| probably a technique for design space exploration.
| 
| I.e., they manually designed the CPU but left a (large) number of
| parameters open, then used AI to find an optimum for those
| parameters.
| 
| So anything the AI did was completely correctness-preserving.
| 
| This may sound like a small achievement, but keep in mind that
| for modern CPUs the search of the design space is hugely
| important, and probably part of the reason for the success of
| e.g. Apple's M1.
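| 
| Concretely, the kind of loop I mean, where every candidate is
| correct by construction and only the knobs move (all parameters
| and the cost function here are invented for illustration):
| 
|   import random
| 
|   SPACE = {
|       "l1_kib":      [8, 16, 32, 64],
|       "issue_width": [1, 2, 4],
|       "rob_entries": [32, 64, 128],
|   }
| 
|   def cost(cfg):
|       # Stand-in for a slow simulation/synthesis run scoring a
|       # candidate on performance, area, and power.
|       return random.random()
| 
|   def random_search(iters=100):
|       best_cfg, best_cost = None, float("inf")
|       for _ in range(iters):
|           cfg = {k: random.choice(v) for k, v in SPACE.items()}
|           c = cost(cfg)
|           if c < best_cost:
|               best_cfg, best_cost = cfg, c
|       return best_cfg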
 
| bsder wrote:
| I find this paper _extremely_ suspicious.
| 
| If this _actually_ worked, it should be able to cough up a 6502,
| 6809, 8051, etc. as well, since they are so much simpler--
| especially since they even mention a Commodore 64.
| 
| The fact that they don't do this stinks very strongly. There are
| other concerning signs in the paper as well.
 
  | staunton wrote:
  | Why should it produce those designs? Are they in any known
  | sense optimal?
 
    | bsder wrote:
    | > Why should it produce those designs? Are they in any known
    | sense optimal?
    | 
    | Yes. The 6502 was quite cheap for its day, so it is much more
    | cost-optimal than most designs. The 6809 was designed to fix
    | the mistakes of the 6800, and its implementation is much more
    | orthogonal. The 6800 and 8051 are probably the best
    | documented. All of them have extremely long-lived toolchains
    | and support. Pick your optimality.
    | 
    | In addition, ask the converse: "Why should it produce a
    | RISC-V design?" RISC-V is definitely sub-optimal on quite a
    | few fronts.
    | 
    | If a system is doing actual _CPU design_, as claimed by the
    | paper, those designs (6502, 6809, 8051) are a simple sanity
    | check. The designs are extensively documented, to the point
    | that we have web pages that simulate them down to the
    | transistor. You should be able to provide a "relatively"
    | small input and get back a compatible design as an output. A
    | 6502 has only 3500 or so transistors. That's on the order of
    | the complexity they claim in the paper.
    | 
    | This would prevent someone like me from saying: "You
    | basically stuffed a RISC-V design into the training set,
    | managed to launder it through ML/AI to get the computer to
    | cough it back up, then deployed a legion of humans to patch
    | the result sufficiently that it could be called 'Linux
    | compatible', and finally barfed out a publication with 6
    | pages of link references in a 12-page paper."
    | 
    | Here's the touchstone for whether AI is doing chip design:
    | "When AI can distinguish between control plane and datapath
    | and _synthesize and place them differently_ , AI is doing
    | actual design."
 
    | sitkack wrote:
    | I don't think any reviewer of the paper would ask why they
    | didn't use one of the processors mentioned.
    | 
    | I can think of lots of reasons to do it with RISC-V:
    | 
    | * lots of excellent simulators and emulators
    | * great tool chains
    | * both software (Verilog, VHDL) and hardware implementations
    | * a regular, compact instruction set (no condition codes)
    | 
    | Using anything _besides_ RISC-V would have been an order of
    | magnitude harder.
 
___________________________________________________________________
(page generated 2023-07-02 23:00 UTC)