[HN Gopher] SantaCoder: A new 1.1B code model for generation and...
___________________________________________________________________
 
SantaCoder: A new 1.1B code model for generation and infilling
 
Author : moyix
Score  : 68 points
Date   : 2022-12-22 19:45 UTC (3 hours ago)
 
web link (huggingface.co)
w3m dump (huggingface.co)
 
| snowder wrote:
| I am having trouble getting the demo to run. It just errors out.
 
  | enum wrote:
  | Give this notebook a shot:
  | 
  | https://github.com/arjunguha/BigCode-demos/blob/main/bigcode...
  | 
  | A GPU will help, but I found it passable on a CPU as well.
 
  | osanseviero wrote:
  | The demo is up again!
 
  | seinecle wrote:
  | Same here
 
    | moyix wrote:
    | Might be overloaded - if you have a GPU you can try running
    | it locally by getting the model weights here:
    | https://huggingface.co/bigcode/santacoder
 
      | bogwog wrote:
      | Any idea how much GPU memory you'd need to run this
      | locally?
      | 
      | EDIT: just tried it and it didn't seem to go past ~6 GB
 
        | lossolo wrote:
        | It's a 1.1B-parameter model at fp16 precision, so 4-6 GB
        | max.
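A quick back-of-envelope check of that figure (my own arithmetic, not from the model card): at fp16 each parameter takes 2 bytes, so 1.1B parameters works out to roughly 2 GiB for the raw weights.

```python
# Back-of-envelope VRAM estimate for a 1.1B-parameter model (illustrative only).
def fp16_weight_gib(n_params):
    """Memory for the raw weights: 2 bytes per parameter at fp16 precision."""
    return n_params * 2 / 1024**3

print(f"weights alone: {fp16_weight_gib(1.1e9):.2f} GiB")  # ~2.05 GiB
```

The inference-time footprint is larger than the weights alone because of activations, the KV cache, and CUDA/runtime overhead, which is consistent with the ~6 GB observation upthread.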
 
| 1024core wrote:
| Increase the token count to a large number and you end up with
| masterpieces like this:
| 
| def all_elements_in_range_excluding_and_including_and_excluding_and_including_and_excluding(sequence, start, end):
 
| ilaksh wrote:
| Is anyone else here building AI programming services based on
| models like this? I see a lot of comments saying the models can't
| do much programming. But I just suspect there must be a silent
| contingent that is also working on services like that. And maybe
| less likely to promote the abilities of these models because it
| encourages competition.
 
  | aunch wrote:
  | We are at Codeium (codeium.com)! Not the SantaCoder model
  | specifically, but the same types of LLM architectures. We've
  | started with AI-based code autocomplete, but we think there is
  | a lot more we can do.
  | 
  | We wrote up some of our learnings so far in @swyx's blog
  | recently: https://lspace.swyx.io/p/what-building-copilot-for-x-
  | really
 
    | furyofantares wrote:
    | What I would really like is something I saw someone talking
    | about here; I'd like the editor to brighten text it finds
    | "unexpected" which could immediately alert to bugs, or to the
    | fact that the code I'm writing looks weird in some way and
    | might either be restructured or accompanied by a comment.
 
      | aunch wrote:
      | Yep, these kinds of applications are on our minds! We
      | consider autocomplete the "baseline" task, since there are
      | plenty of benchmarks and research to compare our model's
      | performance against, but there are lots of things, like
      | highlighting code or upgrading to new libraries/conventions,
      | that we can do with a good base model.
 
  | videlov wrote:
  | We built a semantic code search CLI tool (fully local and open
  | source) using a similar model that I tuned
  | https://github.com/sturdy-dev/semantic-code-search
 
  | notwokeno wrote:
  | I've been messing around with it some. Flan-T5 occasionally
  | generates surprisingly close code for simple prompts like
  | "#square x" or "#sum the elements in the list".
 
  | morgante wrote:
  | We're building tools like this at Grit: https://www.grit.io/
  | 
  | These kinds of models are particularly good at repetitive,
  | boring work like refactoring legacy code and completing
  | framework migrations. Unlike Copilot, we've specialized
  | specifically in these areas and completing them end-to-end
  | (instead of just sitting in the IDE, we open already-verified
  | PRs).
 
    | ilaksh wrote:
    | May I ask what model you are using?
 
      | morgante wrote:
      | We use a few depending on the task (Codex, fine-tuned T5,
      | BERT models, etc.), and we're constantly experimenting with
      | different variations. Since we focus on solving narrower
      | problems in more depth, there's more room for optimizing
      | accuracy.
 
| recursive wrote:
| I think my job is safe.                   def
| all_odd_prime_elements(sequence):             """Returns every
| odd prime element of the sequence."""             return [x for x
| in sequence if x % 2 == 1]                           def
| all_even_prime_elements(sequence):             """Returns every
| even prime element of the sequence."""             return [x for
| x in
 
| moyix wrote:
| Despite being only 1.1B params, SantaCoder outperforms Facebook's
| InCoder (6.7B params) and Salesforce's CodeGen-Multi-2.7B.
| 
| Paper:
| https://hf.co/datasets/bigcode/admin/resolve/main/BigCode_Sa...
| 
| Dataset search: https://huggingface.co/spaces/bigcode/santacoder-
| search
| 
| Model weights: https://huggingface.co/bigcode/santacoder
 
  | isoprophlex wrote:
  | Any idea how this, and those other two models, would compare to
  | GitHub Copilot?
 
    | moyix wrote:
    | Based on the reverse engineering done by Parth Thakkar [1],
    | the model used by Copilot is probably about 10x as large (12B
    | parameters), so I would expect Copilot to still win pretty
    | handily (especially since the Codex models are generally a
    | lot better trained than Salesforce CodeGen or InCoder). It's
    | also a little bit hard to compare directly because as Parth
    | documents, there are a lot of extra smarts that go into
    | Copilot on the client side.
    | 
    | The SantaCoder paper does have some benchmarks on MultiPL-E
    | though, so you could compare them to the Codex results on
    | that benchmark reported here (but keep in mind that code-
    | davinci-002 is probably _even larger_ than the model used by
    | Copilot): https://arxiv.org/abs/2208.08227
    | 
    | [1] https://thakkarparth007.github.io/copilot-
    | explorer/posts/cop...
 
| zomglings wrote:
| Looks like Santa's going to be too busy to deliver presents this
| Christmas:                   def all_prime_elements(sequence):
| """Returns every prime element of the sequence."""
| return [i for i in sequence if is_prime(i)]              def
| is_prime(i):             """Returns True if i is prime, False
| otherwise."""             if i == 2:                 return True
| if i == 3:                 return True             if i == 5:
| return True             if i == 7:                 return True
| if i == 11:                 return True             if i == 13:
| return True             if i == 17:                 return True
| if i == 19:                 return True             if i == 23:
| return True             if i == 29:                 return True
| if i == 31:
 
| rytill wrote:
| Are there model weights?
 
  | moyix wrote:
  | Yep! https://huggingface.co/bigcode/santacoder
 
    | Traubenfuchs wrote:
    | A few more "getting started" examples would be nice.
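A minimal getting-started sketch (my own, not from the model card: the transformers calls are the standard Auto-class API, and the fill-in-the-middle token names below are assumptions based on the BigCode release, so double-check them against the tokenizer's vocabulary):

```python
def build_fim_prompt(prefix, suffix):
    # Fill-in-the-middle prompt layout; the special-token names here
    # are assumptions - verify them against the SantaCoder tokenizer.
    return f"<fim-prefix>{prefix}<fim-suffix>{suffix}<fim-middle>"

def complete(prompt, checkpoint="bigcode/santacoder", max_new_tokens=64):
    # Imported lazily so build_fim_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    # trust_remote_code is needed because the repo ships custom model code.
    model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True)
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(out[0])

prompt = build_fim_prompt("def is_even(n):\n    return ", "\n")
```

Calling complete(prompt) on a machine with enough VRAM should then fill in the code between the prefix and the suffix.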
 
| TuringNYC wrote:
| Any VS Code extension?
 
___________________________________________________________________
(page generated 2022-12-22 23:00 UTC)