|
| dooglius wrote:
| Doesn't seem to be any discussion of what the inputs and outputs
| actually are here, at least for the "coarse-grained" approach.
| Suspect there is some "scaffolding" around e.g. register map and
| memory access, and the rest is essentially learning a map from
| (instruction, register input vals)->(register output val, control
| registers for memory access)
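|   A minimal sketch of what such an input/output map could look like,
|   assuming a made-up 4-instruction, 8-bit toy ALU (the opcodes, widths
|   and the names reference_step and sample_observations are all
|   hypothetical, not taken from the paper). It only shows how
|   (instruction, register inputs) -> (register output) observations
|   could be sampled from a black-box reference model as flat bit vectors:

```python
import random

# Hypothetical toy ISA: 2-bit opcode, two 8-bit register inputs.
# This is an illustration only, not the paper's RISC-V interface.
OPS = {
    0: lambda a, b: (a + b) & 0xFF,  # ADD (wraps at 8 bits)
    1: lambda a, b: (a - b) & 0xFF,  # SUB (wraps at 8 bits)
    2: lambda a, b: a & b,           # AND
    3: lambda a, b: a ^ b,           # XOR
}

def reference_step(opcode, a, b):
    """Black-box 'golden' model: returns the 8-bit register output."""
    return OPS[opcode](a, b)

def to_bits(value, width):
    """Little-endian list of bits for value."""
    return [(value >> i) & 1 for i in range(width)]

def sample_observations(n, seed=0):
    """Sample n (input bits, output bits) pairs from the reference model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        op, a, b = rng.randrange(4), rng.randrange(256), rng.randrange(256)
        x = to_bits(op, 2) + to_bits(a, 8) + to_bits(b, 8)  # 18 input bits
        y = to_bits(reference_step(op, a, b), 8)            # 8 output bits
        rows.append((x, y))
    return rows

observations = sample_observations(1000)
```

|   A learner would then see only these bit-vector pairs; any scaffolding
|   for the register file or memory access is left out of this sketch.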
| westurner wrote:
| From the abstract; "Pushing the Limits of Machine Design:
| Automated CPU Design with AI" (2023)
| https://arxiv.org/abs/2306.12456 :
|
| > _[...] This approach generates the circuit logic, which is
| represented by a graph structure called Binary Speculation
| Diagram (BSD), of the CPU design from only external input-output
| observations instead of formal program code. During the
| generation of BSD, Monte Carlo-based expansion and the distance
| of Boolean functions are used to guarantee accuracy and
| efficiency, respectively. By efficiently exploring a search space
| of unprecedented size 10^{10^{540}}, which is the largest one of
| all machine-designed objects to our best knowledge, and thus
| pushing the limits of machine design, our approach generates an
| industrial-scale RISC-V CPU within only 5 hours. The taped-out
| CPU successfully runs the Linux operating system and performs
| comparably against the human-designed Intel 80486SX CPU. In
|   addition to learning the world's first CPU only from input-output
|   observations, which may reform the semiconductor industry
| by significantly reducing the design cycle, our approach even
| autonomously discovers human knowledge of the von Neumann
| architecture._
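|
|   For a sense of scale, here is a rough back-of-the-envelope sketch.
|   It assumes (this is not stated in the abstract) that the search
|   space is the set of all Boolean functions with n input bits, of
|   which there are 2^(outputs * 2^n), and it uses a plain
|   fraction-of-disagreeing-inputs measure as a stand-in for "distance
|   of Boolean functions"; the paper's actual definitions may differ.
|   The function names are hypothetical:

```python
import math
from itertools import product

def inputs_for_search_space(log10_log10_size, outputs=1):
    """Real-valued n such that 2**(outputs * 2**n) == 10**(10**log10_log10_size).

    Taking log10 twice gives:
        log10(outputs) + n*log10(2) + log10(log10(2)) = log10_log10_size
    """
    return (
        log10_log10_size - math.log10(outputs) - math.log10(math.log10(2))
    ) / math.log10(2)

def boolean_distance(f, g, n):
    """Fraction of the 2**n input assignments on which f and g disagree."""
    points = list(product((0, 1), repeat=n))
    return sum(f(*p) != g(*p) for p in points) / len(points)

# Under the assumption above, 10^(10^540) corresponds to roughly 1796 input bits.
n_bits = inputs_for_search_space(540)

# AND and OR disagree on 2 of the 4 two-bit inputs.
d = boolean_distance(lambda a, b: a & b, lambda a, b: a | b, 2)
```

|   Exhaustive enumeration as in boolean_distance is only feasible for
|   tiny n; at the scale quoted above, any such distance would have to
|   be estimated from samples.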
|
| The von Neumann (and Mark) architectures have an instruction
| pipeline bottleneck maybe by design for serial debuggability; as
| compared with IDK in-RAM computing with existing RAM geometries?
| (See also: "Rowhammer for qubits")
|
| (Edit: High-Bandwidth Memory; hbm2e vs gddr6x (2023)
| https://en.wikipedia.org/wiki/High_Bandwidth_Memory )
|
| Hopefully part of the fitness function is determined by the
| presence and severity of hardware side channels and electron
| tunneling; does it filter out candidate designs with side-channel
| vulnerabilities (that are presumed undetectable with TLA+)?
| westurner wrote:
|     And then maybe someday design a reconfigurable, probably modular,
|     semiconductor fabrication facility to produce the design(s)?
| xeonmc wrote:
| Pentium FDIV bug, round two incoming.
| brucethemoose2 wrote:
| > The implemented program is executed on a Linux cluster
| including 68 servers, each of which is equipped with 2 Intel Xeon
| Gold 6230 CPUs.
|
| > We verify our output netlist on the FPGAs and tape out the chip
| with 65nm technology. The automatically designed CPU was sent to
| the manufacturer in December 2021.
| granthamb wrote:
| It wasn't clear to me that they had implemented a page table (I
| think that's the S extensions?) which I would think would make
| the I/O space much more complex and difficult to represent. Lack
| of VA translation would make this CPU much less comparable to a
| 486SX.
| behnamoh wrote:
| Wasn't Google Tensor already designed by one of Google's AIs? I
|   remember it was a big deal because people thought Google could
| improve their chips much faster than the competition.
| rowanG077 wrote:
|     That was just placement, not abstract circuit design.
| optimalsolver wrote:
| Could some really alien CPU architectures be discovered with this
| method?
|
| Just wondering how far from human design-space you could end up
| with this.
| ninkendo wrote:
| Silicon validation is a huge part of the overall cost of
| bringing up a chip, because it's so important that the physical
| hardware do what it's supposed to do. So it's gonna be limited
| to behaving exactly as the validation specifies, which likely
| will limit how "alien" it will actually be.
| amelius wrote:
| I didn't read the paper but judging from the abstract it's
| probably a technique for design space exploration.
|
| I.e., they manually designed the CPU but left a (large) number of
| parameters open, then used AI to find an optimum for those
| parameters.
|
| So anything the AI did was completely correctness-preserving.
|
| Note that this may sound like it's a small achievement, but keep
| in mind that for modern CPUs the search of the design space is
| hugely important, and probably the reason for the success of e.g.
| Apple's M1.
| bsder wrote:
| I find this paper _extremely_ suspicious.
|
| If this _actually_ worked, it should be able to cough up a 6502,
| 6809, 8051, etc. as well since they are so much simpler--
| especially since they even mention a Commodore 64.
|
| The fact that they don't do this stinks very strongly. There are
| other concerning signs in the paper as well.
| staunton wrote:
| Why should it produce those designs? Are they in any known
| sense optimal?
| bsder wrote:
| > Why should it produce those designs? Are they in any known
| sense optimal?
|
|       Yes. The 6502 was quite cheap for its day, so it is much closer to
|       optimal for cost than most designs. The 6809 was designed to fix
|       the mistakes of the 6800, and its implementation is much more
|       orthogonal. The 6800 and 8051 are probably the best
|       documented. All of them have extremely long-lived tool chains
|       and support. Pick your optimality.
|
|       By the same logic, one could ask, "Why should it produce a RISC-V
|       design?" RISC-V is definitely sub-optimal on quite a few fronts.
|
|       If a system is doing actual _CPU design_, as claimed by the
| paper, those designs (6502, 6809, 8051) are a simple sanity
| check. The designs are extensively documented to the point
| that we have web pages that simulate them down to the
|       transistor. You should be able to provide a "relatively"
| small input and get back a compatible design as an output. A
| 6502 has only 3500 or so transistors. That's on the order of
| the complexity they claim in the paper.
|
| This would prevent someone like me from saying: "You
| basically stuffed a RISC-V design into the training set,
| managed to launder it through ML/AI to get the computer to
| cough it back up, then deployed a legion of humans to patch
|       the result sufficiently that it could be called "Linux
| compatible", and finally barfed out a publication with 6
| pages of link references in a 12 page paper."
|
| Here's the touchstone for whether AI is doing chip design:
| "When AI can distinguish between control plane and datapath
|       and _synthesize and place them differently_, AI is doing
| actual design."
| sitkack wrote:
| I don't think any reviewer of the paper would ask why they
|         didn't use one of the processors mentioned.
|
|         I can think of lots of reasons to do it with RISC-V:
|         * lots of excellent simulators and emulators
|         * great tool chains
|         * both software (Verilog, VHDL) implementations as well as
|           hardware
|         * regular, compact instruction set (no condition codes)
|
| Using anything _besides_ RISC-V would have been an order of
| magnitude harder.
___________________________________________________________________
(page generated 2023-07-02 23:00 UTC) |