[HN Gopher] Show HN: Marsha - An LLM-Based Programming Language
___________________________________________________________________
 
Show HN: Marsha - An LLM-Based Programming Language
 
Author : ISV_Damocles
Score  : 78 points
Date   : 2023-07-25 15:43 UTC (7 hours ago)
 
web link (github.com)
w3m dump (github.com)
 
| cabalamat wrote:
| Why not just define the function headers in Python? It's less
| verbose.
 
  | ISV_Damocles wrote:
  | Simply, Marsha is not Python. ;)
  | 
  | More seriously, in Python, types are optional and the exact
  | behavior is explicitly defined at runtime, but the author of
  | the code already had a behavior in mind when they wrote the
  | code. Getting the intended type information from the user to
  | provide to the LLM improves the quality of the generated code,
  | which is actually somewhat lossy. The "intended type
  | information" isn't "explicit type information", though. We
  | support any character besides these four as part of a type
  | `(),:`. That includes spaces and periods, so you can describe
  | the type as a short sentence if you aren't exactly sure what
  | the structure ought to be.[1]
  | 
  | And since in Marsha you don't actually write an imperative
  | function body, argument names don't matter, only the argument
  | order (as Marsha does not support named parameters, at least
  | right now) so there's no argument naming in the declaration,
  | which can make it more succinct than Python def lines in
  | certain situations, though to be fair none of our examples do
  | this.
  | 
  | [1]:
  | https://github.com/alantech/marsha/blob/main/examples/web/du...
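
The rule stated in the comment above (a type may contain any character except the four in `(),:`) can be sketched in a few lines of Python. This is only an illustration of that rule, not Marsha's actual parser: the name `is_valid_type_description` is hypothetical, and rejecting empty descriptions is an added assumption.

```python
# Hypothetical helper, not taken from the Marsha codebase. It encodes only
# the rule from the comment: the characters ( ) , : are not allowed in a type.
RESERVED_CHARS = set("(),:")


def is_valid_type_description(text: str) -> bool:
    """Return True if `text` is non-empty and avoids the reserved characters.

    Rejecting empty or whitespace-only text is an assumption of this sketch.
    """
    stripped = text.strip()
    return bool(stripped) and not (set(stripped) & RESERVED_CHARS)


# Spaces and periods are fine, so a short sentence works as a type.
print(is_valid_type_description("list of rows. each row has a name and an age"))
# Parentheses and commas are reserved, so this one is rejected.
print(is_valid_type_description("dict(str, int)"))
```

The first call prints `True` and the second prints `False`.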
 
| ilaksh wrote:
| This is great. Have you thought about adding tree of thoughts?
 
  | pombo wrote:
  | one of the contributors here. we have a statistical test script
  | we can run on our branches to test the compile time, cost and
  | reliability. we want to try tree of thought and also something
  | like
  | https://www.reddit.com/r/ChatGPT/comments/14d7pfz/become_god...
  | that said, we found that when we asked GPT to first explain why
  | a test case is failing and then to correct that failure, instead
  | of just asking it to correct the failure, costs unexpectedly
  | went up and reliability went down.
 
    | ISV_Damocles wrote:
    | Btw, here's the test job:
    | https://github.com/alantech/marsha/blob/main/.github/workflo...
    | And the core script for the job:
    | https://github.com/alantech/marsha/blob/main/marsha/.time.py
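
The kind of statistical check described in this sub-thread, running the same step many times and comparing time and reliability between variants, might look roughly like the sketch below. It is not the script linked above: `measure` and `fake_compile` are hypothetical names, the compile step is a stand-in, and cost tracking is left out because pricing details are not given in the thread.

```python
import statistics
import time
from typing import Callable


def measure(attempt: Callable[[], bool], trials: int) -> dict:
    """Run `attempt` `trials` times and summarize success rate and timing."""
    if trials < 1:
        raise ValueError("trials must be at least 1")
    durations = []
    successes = 0
    for _ in range(trials):
        start = time.perf_counter()
        ok = attempt()
        durations.append(time.perf_counter() - start)
        if ok:
            successes += 1
    return {
        "trials": trials,
        "success_rate": successes / trials,
        "mean_seconds": statistics.mean(durations),
    }


# Stand-in for a real compile step (an assumption of this sketch):
# it succeeds on every call except each third one.
calls = {"n": 0}


def fake_compile() -> bool:
    calls["n"] += 1
    return calls["n"] % 3 != 0


report = measure(fake_compile, trials=9)
print(report["success_rate"], report["mean_seconds"])
```

Running two prompt strategies through the same harness gives comparable numbers, which is how a result like "costs went up and reliability went down" could be observed rather than guessed.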
 
| sudosteph wrote:
| Is there a new trend to give AI/tech products common women's
| names (Alexa, Macie, Ada, Clara, Julia, etc.)? The only
| male-named tech product I can think of really is Watson, and that's
| old hat by now. It honestly kind of weirds me out and I feel for
| people who have to share their names with a product. At least
| Siri and Cortana seem pretty unique in that respect. Fwiw, I'm
| sure plenty of products with traditionally male names exist, I
| just can't recall encountering many at work or casually.
 
  | FireInsight wrote:
  | I think it's some weird manifestation of women being culturally
  | more often in jobs as assistants to men and men in the leading
  | roles. That and wanting the AI to seem approachable and human.
 
  | swyx wrote:
  | there's also Jarvis and Alfred, on the male side, but yeah,
  | it's a known cognitive bias to implicitly associate subservient
  | things with female names. not much to do except be aware and
  | try to correct for it. i think there's like 3 AI assistants
  | named Samantha. obviously inspired by the movie, but does show
  | a lack of imagination.
 
  | ISV_Damocles wrote:
  | Well, a prior project we worked on was named Alan[1]. The
  | choice was somewhat arbitrary: https://marsha.ai was available
  | and we thought it was a fine name so here we are.
  | 
  | [1]: https://alan-lang.org
 
  | gs17 wrote:
  | ELIZA probably started it back in the 60s.
 
| chrisjj wrote:
| > Marsha uses this to provide more information to the LLM to
| generate the logic you want, but also uses it to generate a test
| suite to validate that what it has generated actually does what
| you want it to.
| 
| What I can't find here is the component that reads "what you want
| it to" from the mind of the user.
 
| jstarfish wrote:
| Heh...and here I thought I was being clever using Inform 7 as a
| metalanguage. This is way more concise.
 
  | ISV_Damocles wrote:
  | Well, to be fair, we do bulk up what you type quite a bit[1] to
  | improve GPT's response.
  | 
  | [1]:
  | https://github.com/alantech/marsha/blob/main/marsha/parse.py...
 
| Winse wrote:
| I thought this would kind of just be a pile of garbage, but I
| have to admit I was drawn in. There are some interesting novel
| pieces in Marsha and I am somewhat impressed with this project.
 
  | weego wrote:
  | The most interesting thing for me is that seeing them provide
  | examples of the function made me realise this is the logical
  | conclusion of TDD.
  | 
  | Write tests to build the input -> output contracts, have AI
  | build the logic that conforms to those contracts.
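
A minimal sketch of that idea, with the examples acting as the contract and a checker deciding whether a candidate implementation conforms. The names `failing_examples`, `fib_examples`, and `fib` are hypothetical, and the hand-written `fib` stands in for code that an LLM would generate in the workflow described above.

```python
from typing import Any, Callable, Sequence, Tuple

# One example is (positional arguments, expected result).
Example = Tuple[Tuple[Any, ...], Any]


def failing_examples(candidate: Callable[..., Any],
                     examples: Sequence[Example]) -> list:
    """Return the examples that `candidate` does not satisfy."""
    failures = []
    for args, expected in examples:
        try:
            actual = candidate(*args)
        except Exception as exc:  # a crash counts as a failed example
            failures.append((args, expected, repr(exc)))
            continue
        if actual != expected:
            failures.append((args, expected, actual))
    return failures


# The contract is written first, as in TDD.
fib_examples = [((0,), 0), ((1,), 1), ((2,), 1), ((10,), 55)]


# Stand-in for generated code (an assumption of this sketch).
def fib(n: int) -> int:
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


print(failing_examples(fib, fib_examples))  # empty list when every example passes
```

A non-empty result is the signal to regenerate or repair the candidate, which is the loop the comment describes.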
 
  | ISV_Damocles wrote:
  | yeah, the AI hype cycle can be a bit tiring, but I'm glad you
  | took a second look. What parts did you find interesting, in
  | particular?
 
| cabalamat wrote:
| How easy would it be to make it output programs in a functional
| language, such as Haskell? It would be nice if it could be made
| to produce functions that are guaranteed to be free from side
| effects.
| 
| In the long term I can imagine that the output languages of code
| generators like this might be ones specially designed to work
| well with AI code generators.
 
  | ISV_Damocles wrote:
  | So, targeting other languages is on our roadmap, and we have
  | found higher reliability when we stick to a functional style
  | with Marsha, so I _think_ Haskell will be one of the "good
  | ones" amongst target languages, eventually.
  | 
  | But we are focusing on Python first because it's one of the
  | most popular languages on the planet and many LLMs out there
  | are able to generate high quality code for it, while less-used
  | languages tend to produce worse results. Haskell may or may not
  | need a manually-tuned LLM to generate solid results.
 
| vector_spaces wrote:
| I don't think it is correct to call this a programming language.
| 
| This program enforces some structure on your specifications for
| LLMs and provides some guardrails, which is absolutely a move in
| the right direction, but these are related more to formatting
| rather than specification, i.e. it provides syntax without
| transparent or unambiguous semantics. Ultimately this is more of
| a markup format than a programming language. Since:
| 
| > The next section is the description of the function. Here you
| explain what the function should do. Being more explicit here
| will reduce variability in the generated output and improve
| reliability in behavior, but it's up to you just how explicit you
| will be and how much you leave to the LLM to figure out.
| 
| There are reasons that natural language is problematic when
| specifying systems! I wish I could tape a sign that says this to
| the top of HN given the number of projects on the front page the
| past few months calling all sorts of things compilers that are
| just thin wrappers over the OpenAI API.
| 
| These projects frame programming with natural language as though
| it is something desirable, like it is the holy grail of computer
| science that will open up engineering to the masses. But we use
| programming languages not to make programming difficult, but
| because unrestricted natural language is problematic. Systems are
| best specified with highly constrained languages that are (more)
| semantically unambiguous. Without sufficient constraints, there's
| a good chance that we don't even know what we ourselves are
| talking about when we specify systems -- how can we hope then
| that computers will?
| 
| Even software engineers misinterpret and misunderstand
| requirements, requirements are poorly specified, or requirements
| are (apparently) well-understood but the space of possible inputs
| and use cases is not. This is why mathematicians use a (loosely)
| restricted jargon and notation, and even they run into
| difficulties.
| 
| All of that said: LLMs are surely a useful tool for software
| engineering and systems-building -- I personally use them most
| days to that end. But make no mistake that this is a markup
| language with some added guardrails to help users make better
| specifications for LLMs and verify the output. It is not a
| programming language, and programming with natural language is
| not something that is generally possible or even desirable.
 
  | FireInsight wrote:
  | > calling all sorts of things compilers that are thin wrappers
  | over the OpenAI API
  | 
  | Well compilation and transpilation are kind of related, and
  | translation is kind of an ambiguous version of transpilation,
  | and LLMs are kind of a more nondeterministic superset of
  | natural language translation (processing), which really makes
  | calling it a compiler a huge stretch, I agree.
  | 
  | This all makes me wonder, could it be possible for an LLM to
  | spit out the bits, or even assembly, for a hello world program
  | written in another programming language. That'd kind of make it
  | a really bad compiler.
 
  | Der_Einzige wrote:
  | If it's Turing complete, it's a programming language. That's
  | it. There are no other requirements.
 
    | williamstein wrote:
    | Turing complete is a property that a programming language may
    | or may not have. BlooP is a non-Turing-complete programming
    | language: https://en.m.wikipedia.org/wiki/BlooP_and_FlooP
 
    | mepian wrote:
    | Let me introduce you to
    | https://en.wikipedia.org/wiki/Total_functional_programming
 
    | jachee wrote:
    | So, _Magic: the Gathering_ (being Turing complete[0]) is a
    | programming language?
    | 
    | [0]: https://arxiv.org/abs/1904.09828
 
  | ISV_Damocles wrote:
  | I don't believe that I can change your mind on this, so I
  | didn't intend to respond, but as this is the top comment, I do
  | want to provide a rebuttal on why we do think this is actually
  | a programming language, that the code we have written is
  | actually a compiler, and why Marsha is a useful exploration of
  | the programming language design space.
  | 
  | First, a programming language is just a syntax to describe
  | functionality that _could_ be turned into an actual program.
  | Lisp[1] was defined in 1958 but didn't have a full compiler
  | until 1962. Was it not a programming language in the
  | intervening 4 years? Marsha does not fall into this, since it
  | can already generate working code, but the bar for what is a
  | programming language, I believe, is lower than most would
  | immediately think.
  | 
  | Second, a programming language does not need to be imperative
  | to be a programming language, or languages like Lean[2] that
  | have you write proofs that the compiler then figures out how to
  | generate the code to fulfill would not be programming
  | languages. Lean, Coq, and other such languages are much more
  | technically impressive than Marsha, true, but they share the
  | property that you describe the properties a function should have and
  | then the compiler generates the program that fulfills those
  | properties.
  | 
  | Marsha differs from these proof-based languages in that poor
  | specificity still produces some sort of program instead of a
  | compilation error, which makes it sort of like JavaScript that
  | will do _something_ with the code you write as long as it is
  | syntactically valid. This is not a desirable property of
  | Marsha, but it is a trade-off that in practice makes it more
  | immediately usable to a larger number of people than Lean or
  | Coq, because the skill level required is lower.
  | 
  | This is also, as you allude to, the current state of the world
  | in most software development -- project managers come up with
  | high-level requirements for new features, technical leads on
  | engineering teams convert this into tasks and requirements for
  | individual contributors who then write the code and tests which
  | are then peer reviewed by the team as a sanity check and then
  | committed. This process may or may not cover all situations and
  | the specifications at all levels are likely not as rigorous as
  | what Lean would require of you.
  | 
  | Marsha mimics this process, starting from the tech lead level
  | and bleeding into the individual contributor level. The type
  | and function descriptions are analogous to the tech lead
  | requirements and the examples are analogous to the test suite
  | the individual contributor would write. Just like in real world
  | development, if these are not well specified, the resulting
  | code will likely have logic bugs that would need to be
  | addressed with a stricter definition and improved test cases.
  | 
  | The compiler consumes this definition into an AST[3], walks the
  | tree to generate intermediate forms, and generates an output in
  | a format that can be executed by a computer. Some use
  | "transpiler" for a compiler that targets another language, but
  | that is a subset of compilers, not a separate kind of tool, in
  | my opinion, or the Java compiler would be a "transpiler" for
  | the JVM bytecode format that is also not directly executable by
  | a computer.
  | 
  | We are still in the _very_ early stages with Marsha and agree
  | that more syntax could be helpful -- we already have 4
  | different syntactic components to Marsha versus the fully
  | open-ended text entry behavior of GitHub Copilot or ChatGPT. But
  | what makes Marsha interesting (to me) is that it makes it
  | possible to explore a totally new dimension in programming
  | language design: the formalization of the syntax to define a
  | program itself. In many papers on new algorithms, the logic is
  | often described in a human-readable list of steps without the
  | hard specificity of programming languages, _improving_ the
  | ability of the reader to understand the _core_ of the
  | algorithm, rather than getting bogged down in the
  | implementation details of this or that programming language.
  | There is still a formalism, but it differs from that of
  | traditional programming languages, and Marsha lets you work
  | with your computer in a similar way.
  | 
  | Are there cases where this is a bad idea? Absolutely. Just like
  | there are cases where writing your code in Python is a bad idea
  | versus writing it in Rust. There is no perfect programming
  | language useful for all scenarios, and _probably_ never will
  | exist. But there will be a subset of situations where the
  | trade-offs Marsha provides make sense. By being more forgiving
  | than even the most forgiving interpreted languages out there,
  | Marsha is in a good position to fill that niche if the primary
  | barrier is difficulty.
  | 
  | [1]:
  | https://en.wikipedia.org/wiki/Lisp_(programming_language)#Hi...
  | [2]: https://en.wikipedia.org/wiki/Lean_(proof_assistant)
  | [3]:
  | https://github.com/alantech/marsha/blob/main/marsha/parse.py...
 
| franciscomello wrote:
| Sounds very interesting.
 
| jsight wrote:
| This feels inevitable to me. All software engineering problems
| can be solved by the addition of another layer of abstraction.
| 
| Why not abstract away the "how" layer and focus completely on the
| "what" layer?
| 
| For a lot of data processing and integration problems, this would
| both eliminate a lot of work and increase reliability.
 
  | ModernMech wrote:
  | Because a lot of the time, exactly how is _very_ important. The
  | devil is in the details.
 
    | jsight wrote:
    | That's what all the C engineers told me about manual memory
    | management.
    | 
    | Of course, there are cases where they were right.
 
    | pombo wrote:
    | I would say that it depends. Python and JS, two of the most
    | used programming languages, abstract away memory management
    | and threading details that most of the time you don't need,
    | but you can "drop down" to write code that does pay attention
    | to that within the language (building an object pool and
    | reusing it, or memoization, etc) and if that's not enough, go
    | to the "deeper" language like C/Rust to handle those cases
    | with first-class primitives.
 
  | pombo wrote:
  | I'm not sure I agree with the claim that an abstraction is
  | always the answer, but your last sentence outlines precisely
  | one of our motivations behind Marsha. You can specify an
  | imperative set of steps in the description of a
  | function/program, or you can (in the future) write a Python
  | function that you use from Marsha.
 
    | jsight wrote:
    | The first sentence was a little tongue in cheek, so I'm glad
    | that you didn't 100% agree with it. :) But it was an
    | all-too-common paradigm in my early years as a Java developer.
 
| andreygrehov wrote:
| Sharing a comment of mine (that got downvoted) from another,
| unrelated, thread. IMHO, it somewhat applies here as well:
| 
| > Looking back, we can see how Machine Code, with its intricate
| and challenging nature, paved the way for more accessible
| options. Assembly language then emerged, providing a higher level
| of abstraction and reducing the complexities of directly working
| with machine instructions. And of course, C followed suit,
| offering even greater simplicity and ease of use compared to
| Assembly.
| 
| > Imagine a future where programming languages, as we know them
| today, become akin to CPU instructions - a foundational and low-
| level primitive. LLMs will revolutionize the way we interact with
| code, providing a unified interface where the complexities of
| various languages are distilled into a common representation. The
| proliferation of individual programming languages will wane.
| Knowing Java or C++ will become a rare skill, akin to individuals
| specializing in low-level optimizations using Assembly language
| these days.
| 
| > As time progresses, even the convenience of LLMs may pose
| challenges, given our inherent tendency towards laziness, so an
| additional layer of abstraction will be introduced, bridging the
| gap between LLMs and spoken languages. BCIs will revolutionize
| the act of coding itself so that individuals can seamlessly
| "code" by simply "thinking" about their desired actions.
 
___________________________________________________________________
(page generated 2023-07-25 23:01 UTC)