
[HN Gopher] Show HN: Marsha - An LLM-Based Programming Language
Author: ISV_Damocles
Score: 78 points
Date: 2023-07-25 15:43 UTC (7 hours ago)
cabalamat wrote:
Why not just define the function headers in Python? It's less verbose.

ISV_Damocles wrote:
Simply, Marsha is not Python. ;)

More seriously, in Python, types are optional and the exact behavior is explicitly defined at runtime, but the author of the code already had a behavior in mind when they wrote the code. Getting the intended type information from the user to provide to the LLM improves the quality of the generated code, which is actually somewhat lossy. The "intended type information" isn't "explicit type information", though. We support any character besides these four as part of a type `(),:`. That includes spaces and periods, so you can describe the type as a short sentence if you aren't exactly sure what the structure ought to be.[1]

And since in Marsha you don't actually write an imperative function body, argument names don't matter, only the argument order (as Marsha does not support named parameters, at least right now) so there's no argument naming in the declaration, which can make it more succinct than Python def lines in certain situations, though to be fair none of our examples do this.

[1]: https://github.com/alantech/marsha/blob/main/examples/web/du...

ilaksh wrote:
This is great. Have you thought about adding tree of thoughts?

pombo wrote:
one of the contributors here. we have a statistical test script we can run on our branches to test the compile time, cost and reliability. we want to try tree of thought and also something like https://www.reddit.com/r/ChatGPT/comments/14d7pfz/become_god....
that said, we found that when we asked GPT to first explain why a test case is failing and then to correct that failure instead of just asking it to correct the failure, unexpectedly costs went up and reliability went down

ISV_Damocles wrote:
Btw, here's the test job: https://github.com/alantech/marsha/blob/main/.github/workflo... And the core script for the job: https://github.com/alantech/marsha/blob/main/marsha/.time.py

sudosteph wrote:
Is there a new trend to give AI/tech products common women's names, (Alexa, Macie, Ada, Clara, Julia, etc)? The only male named tech product I can think of really is Watson, and that's old hat by now. It honestly kind of weirds me out and I feel for people who have to share their names with a product. At least Siri and Cortana seem pretty unique in that respect. Fwiw, I'm sure plenty of products with traditionally male names exist, I just can't recall encountering many at work or casually.

FireInsight wrote:
I think it's some weird manifestation of women being culturally more often in jobs as assistants to men and men in the leading roles. That and wanting the AI to seem approachable and human.

swyx wrote:
there's also Jarvis and Alfred, on the male side, but yeah, it's a known cognitive bias to implicitly associate subservient things with female names. not much to do except be aware and try to correct for it. i think there's like 3 AI assistants named Samantha. obviously inspired by the movie, but does show a lack of imagination.

ISV_Damocles wrote:
Well, a prior project we worked on was named Alan[1]. The choice was somewhat arbitrary: https://marsha.ai was available and we thought it was a fine name so here we are.

[1]: https://alan-lang.org

gs17 wrote:
ELIZA probably started it back in the 60s.

chrisjj wrote:
> Marsha uses this to provide more information to the LLM to generate the logic you want, but also uses it to generate a test suite to validate that what it has generated actually does what you want it to.

What I can't find here is the component that reads "what you want it to" from the mind of the user.

jstarfish wrote:
Heh...and here I thought I was being clever using Inform 7 as a metalanguage. This is way more concise.

ISV_Damocles wrote:
Well, to be fair, we do bulk up what you type quite a bit[1] to improve GPT's response.

[1]: https://github.com/alantech/marsha/blob/main/marsha/parse.py...

Winse wrote:
I thought this would kind of just be a pile of garbage, but I have to admit I was drawn in. There are some interesting novel pieces in Marsha and I am somewhat impressed with this project.

weego wrote:
The most interesting thing for me is them providing examples of the function made me realise this is the logical conclusion of TDD.

Write tests to build the input -> output contracts, have AI build the logic that conforms to those contracts.

ISV_Damocles wrote:
yeah, the AI hype cycle can be a bit tiring, but I'm glad you took a second look. What parts did you find interesting, in particular?

cabalamat wrote:
How easy would it be to make it output programs in a functional language, such as Haskell? it would be nice if it could be made to produce functions that are guaranteed to be free from side effects.

In the long term I can imagine that the output languages of code generators like this might be ones specially designed to work well with AI code generators.

ISV_Damocles wrote:
So, targeting other languages is on our roadmap, and we have found higher reliability when we stick to a functional style with Marsha, so I _think_ Haskell will be one of the "good ones" amongst target languages, eventually.

But we are focusing on Python first because it's one of the most popular languages on the planet and many LLMs out there are able to generate high quality code for it, while less-used languages tend to produce worse results. Haskell may or may not need a manually-tuned LLM to generate solid results.

vector_spaces wrote:
I don't think it is correct to call this a programming language.

This program enforces some structure on your specifications for LLMs and provides some guardrails, which is absolutely a move in the right direction, but these are related more to formatting rather than specification, i.e. it provides syntax without transparent or unambiguous semantics. Ultimately this is more of a markup format than a programming language. Since:

> The next section is the description of the function. Here you explain what the function should do. Being more explicit here will reduce variability in the generated output and improve reliability in behavior, but it's up to you just how explicit you will be and how much you leave to the LLM to figure out.

There are reasons that natural language is problematic when specifying systems! I wish I could tape a sign that says this to the top of HN given the number of projects on the front page the past few months calling all sorts of things compilers that are just thin wrappers over the OpenAI API.

These projects frame programming with natural language as though it is something desirable, like it is the holy grail of computer science that will open up engineering to the masses. But we use programming languages not to make programming difficult, but because unrestricted natural language is problematic. Systems are best specified with highly constrained languages that are (more) semantically unambiguous.
Without sufficient constraints, there's a good chance that we don't even know what we ourselves are talking about when we specify systems -- how can we hope then that computers will?

Even software engineers misinterpret and misunderstand requirements, requirements are poorly specified, or requirements are (apparently) well-understood but the space of possible inputs and use cases is not. This is why mathematicians use a (loosely) restricted jargon and notation, and even they run into difficulties.

All of that said: LLMs are surely a useful tool for software engineering and systems-building -- I personally use them most days to that end. But make no mistake that this is a markup language with some added guardrails to help users make better specifications for LLMs and verify the output. It is not a programming language, and programming with natural language is not something that is generally possible or even desirable.

FireInsight wrote:
> calling all sorts of things compilers that are thin wrappers over the OpenAI API

Well compilation and transpilation are kind of related, and translation is kind of an ambiguous version of transpilation, and LLMs are kind of a more nondeterministic superset of natural language translation (processing), which really makes calling it a compiler a huge stretch, I agree.

This all makes me wonder, could it be possible for an LLM to spit out the bits, or even assembly, for a hello world program written in another programming language. That'd kind of make it a really bad compiler.

Der_Einzige wrote:
If it's turing complete, it's a programming language. That's it. There's no other requirements.

williamstein wrote:
Turing complete is a property that a programming language may or may not have.
BlooP is a non-Turing-complete programming language: https://en.m.wikipedia.org/wiki/BlooP_and_FlooP

mepian wrote:
Let me introduce you to https://en.wikipedia.org/wiki/Total_functional_programming

jachee wrote:
So, _Magic: the Gathering_ (being Turing complete[0]) is a programming language?

[0]: https://arxiv.org/abs/1904.09828

ISV_Damocles wrote:
I don't believe that I can change your mind on this, so I didn't intend to respond, but as this is the top comment, I do want to provide a rebuttal on why we do think this is actually a programming language, that the code we have written is actually a compiler, and why Marsha is a useful exploration of the programming language design space.

First, a programming language is just a syntax to describe functionality that _could_ be turned into an actual program. Lisp[1] was defined in 1958 but didn't have a full compiler until 1962. Was it not a programming language in the intervening 4 years? Marsha does not fall into this, since it can already generate working code, but the bar for what is a programming language, I believe, is lower than most would immediately think.

Second, a programming language does not need to be imperative to be a programming language, or languages like Lean[2] that have you write proofs that the compiler then figures out how to generate the code to fulfill would not be programming languages. Lean, Coq, and other such languages are much more technically impressive than Marsha, true, but they share the property that you describe the properties a function should have and then the compiler generates the program that fulfills those properties.

Marsha differs from these Proof-based languages in that poor specificity still produces some sort of program instead of a compilation error, which makes it sort of like Javascript that will do _something_ with the code you write as long as it is syntactically valid. This is not a desirable property of Marsha, but it is a trade-off that in practice makes it more immediately usable to a larger number of people than Lean or Coq, because the skill level required is lower.

This is also, as you allude to, the current state of the world in most software development -- project managers come up with high-level requirements for new features, technical leads on engineering teams convert this into tasks and requirements for individual contributors who then write the code and tests which are then peer reviewed by the team as a sanity check and then committed. This process may or may not cover all situations and the specifications at all levels are likely not as rigorous as what Lean would require of you.

Marsha mimics this process, starting from the tech lead level and bleeding into the individual contributor level. The type and function descriptions are analogous to the tech lead requirements and the examples are analogous to the test suite the individual contributor would write. Just like in real world development, if these are not well specified, the resulting code will likely have logic bugs that would need to be addressed with a stricter definition and improved test cases.

The compiler consumes this definition into an AST[3], walks the tree to generate intermediate forms, and generates an output in a format that can be executed by a computer.
Some use "transpiler" for a compiler that targets another language, but that is a subset of compilers, not a separate kind of tool, in my opinion, or the Java compiler would be a "transpiler" for the JVM bytecode format that is also not directly executable by a computer.

We are still in the _very_ early stages with Marsha and agree that more syntax could be helpful -- we already have 4 different syntactic components to Marsha versus the fully open-ended text entry behavior of GitHub Copilot or ChatGPT. But what makes Marsha interesting (to me) is that it makes it possible to explore a totally new dimension in programming language design: the formalization of the syntax to define a program itself. In many papers on new algorithms, the logic is often described in a human-readable list of steps without the hard specificity of programming languages, _improving_ the ability of the reader to understand the _core_ of the algorithm, rather than getting bogged down in the implementation details of this or that programming language. There is still a formalism, but it differs from that of traditional programming languages, and Marsha lets you work with your computer in a similar way.

Are there cases where this is a bad idea? Absolutely. Just like there are cases where writing your code in Python is a bad idea versus writing it in Rust. There is no perfect programming language useful for all scenarios, and _probably_ never will exist. But there will be a subset of situations where the trade-offs Marsha provides make sense. By being more forgiving than even the most forgiving interpreted languages out there, Marsha is in a good position to fill that niche if the primary barrier is difficulty.

[1]: https://en.wikipedia.org/wiki/Lisp_(programming_language)#Hi...
[2]: https://en.wikipedia.org/wiki/Lean_(proof_assistant)
[3]: https://github.com/alantech/marsha/blob/main/marsha/parse.py...

franciscomello wrote:
Sounds very interesting.

jsight wrote:
This feels inevitable to me. All software engineering problems can be solved by the addition of another layer of abstraction.

Why not abstract away the "how" layer and focus completely on the "what" layer?

For a lot of data processing and integration problems, this would both eliminate a lot of work and increase reliability.

ModernMech wrote:
Because a lot of the time, exactly how is _very_ important. The devil is in the details.

jsight wrote:
That's what all the C engineers told me about manual memory management.

Of course, there are cases where they were right.

pombo wrote:
I would say that it depends. Python and JS, two of the most used programming languages, abstract away memory management and threading details that most of the time you don't need, but you can "drop down" to write code that does pay attention to that within the language (building an object pool and reusing it, or memoization, etc) and if that's not enough, go to the "deeper" language like C/Rust to handle those cases with first-class primitives.

pombo wrote:
I'm not sure I agree with the fact that an abstraction is always the answer, but your last sentence outlines precisely one of our motivations behind Marsha. You can specify an imperative set of steps in the description of a function/program, or you can (in the future) write a Python function that you use from Marsha

jsight wrote:
The first sentence was a little tongue in cheek, so I'm glad that you didn't 100% agree with it. :) But it was an all-too-common paradigm in my early years as a Java developer.

andreygrehov wrote:
Sharing a comment of mine (that got downvoted) from another, unrelated, thread.
IMHO, it somewhat applies here as well:

> Looking back, we can see how Machine Code, with its intricate and challenging nature, paved the way for more accessible options. Assembly language then emerged, providing a higher level of abstraction and reducing the complexities of directly working with machine instructions. And of course, C followed suit, offering even greater simplicity and ease of use compared to Assembly.

> Imagine a future where programming languages, as we know them today, become akin to CPU instructions - a foundational and low-level primitive. LLMs will revolutionize the way we interact with code, providing a unified interface where the complexities of various languages are distilled into a common representation. The proliferation of individual programming languages will wane. Knowing Java or C++ will become a rare skill, akin to individuals specializing in low-level optimizations using Assembly language these days.

> As time progresses, even the convenience of LLMs may pose challenges, given our inherent tendency towards laziness, so an additional layer of abstraction will be introduced, bridging the gap between LLMs and spoken languages. BCIs will revolutionize the act of coding itself so that individuals can seamlessly "code" by simply "thinking" about their desired actions.

(page generated 2023-07-25 23:01 UTC)