[HN Gopher] SantaCoder: A new 1.1B code model for generation and...
___________________________________________________________________
 
SantaCoder: A new 1.1B code model for generation and infilling
 
Author : moyix
Score  : 68 points
Date   : 2022-12-22 19:45 UTC (3 hours ago)
 
web link (huggingface.co)
w3m dump (huggingface.co)
 
| snowder wrote:
| I am having trouble getting the demo to run. It just errors out.
 
  | enum wrote:
  | Give this notebook a shot:
  | 
  | https://github.com/arjunguha/BigCode-demos/blob/main/bigcode...
  | 
  | A GPU will help, but I found it passable on a CPU as well.
 
  | osanseviero wrote:
  | The demo is up again!
 
  | seinecle wrote:
  | Same here
 
    | moyix wrote:
    | Might be overloaded - if you have a GPU you can try running
    | it locally by getting the model weights here:
    | https://huggingface.co/bigcode/santacoder
 
      | bogwog wrote:
      | Any idea how much GPU memory you'd need to run this
      | locally?
      | 
      | EDIT: just tried it and it didn't seem to go past ~6 GB
 
        | lossolo wrote:
        | It's a 1.1B-parameter model at fp16 precision, so 4-6 GB
        | max.
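A quick back-of-envelope check of that figure (my own arithmetic, not from the model card): at fp16 each parameter takes 2 bytes, so 1.1B parameters works out to roughly 2 GiB for the raw weights.

```python
# Back-of-envelope VRAM estimate for a 1.1B-parameter model (illustrative only).
def fp16_weight_gib(n_params):
    """Memory for the raw weights: 2 bytes per parameter at fp16 precision."""
    return n_params * 2 / 1024**3

print(f"weights alone: {fp16_weight_gib(1.1e9):.2f} GiB")  # ~2.05 GiB
```

The inference-time footprint is larger than the weights alone because of activations, the KV cache, and CUDA/runtime overhead, which is consistent with the ~6 GB observation upthread.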
 
| 1024core wrote:
| Increase the token count to a large number and you end up with
| masterpieces like this:
| 
| def all_elements_in_range_excluding_and_including_and_excluding_and_including_and_excluding(sequence, start, end):
 
| ilaksh wrote:
| Is anyone else here building AI programming services based on
| models like this? I see a lot of comments saying the models can't
| do much programming. But I just suspect there must be a silent
| contingent that is also working on services like that. And maybe
| less likely to promote the abilities of these models because it
| encourages competition.
 
  | aunch wrote:
  | We are at Codeium (codeium.com)! Not the SantaCoder model
  | specifically, but the same types of LLM architectures. We've
  | started with AI-based code autocomplete, but we think there is
  | a lot more we can do.
  | 
  | We wrote up some of our learnings so far in @swyx's blog
  | recently: https://lspace.swyx.io/p/what-building-copilot-for-x-
  | really
 
    | furyofantares wrote:
    | What I would really like is something I saw someone talking
    | about here; I'd like the editor to brighten text it finds
    | "unexpected" which could immediately alert to bugs, or to the
    | fact that the code I'm writing looks weird in some way and
    | might either be restructured or accompanied by a comment.
 
      | aunch wrote:
      | Yep, these kinds of applications are on our minds! We
      | consider autocomplete the "baseline" task, since there are
      | plenty of benchmarks and research to compare our model's
      | performance against, but there are lots of things, like
      | highlighting code or upgrading to new libraries/conventions,
      | that we can do with a good base model.
 
  | videlov wrote:
  | We built a semantic code search CLI tool (fully local and open
  | source) using a similar model that I tuned
  | https://github.com/sturdy-dev/semantic-code-search
 
  | notwokeno wrote:
  | I've been messing around with it some. Flan-T5 occasionally
  | generates surprisingly close code for simple prompts like
  | "#square x" or "#sum the elements in the list".
 
  | morgante wrote:
  | We're building tools like this at Grit: https://www.grit.io/
  | 
  | These kinds of models are particularly good at repetitive,
  | boring work like refactoring legacy code and completing
  | framework migrations. Unlike Copilot, we've specialized
  | specifically in these areas and completing them end-to-end
  | (instead of just sitting in the IDE, we open already-verified
  | PRs).
 
    | ilaksh wrote:
    | May I ask what model you are using?
 
      | morgante wrote:
      | We use a few depending on the task (Codex, fine-tuned T5,
      | BERT models, etc.), and we're constantly experimenting with
      | different variations. Since we focus on solving narrower
      | problems in more depth, there's more room for optimizing
      | accuracy.
 
| recursive wrote:
| I think my job is safe.                   def
| all_odd_prime_elements(sequence):             """Returns every
| odd prime element of the sequence."""             return [x for x
| in sequence if x % 2 == 1]                           def
| all_even_prime_elements(sequence):             """Returns every
| even prime element of the sequence."""             return [x for
| x in
 
| moyix wrote:
| Despite being only 1.1B params, SantaCoder outperforms Facebook's
| InCoder (6.7B params) and Salesforce's CodeGen-Multi-2.7B.
| 
| Paper:
| https://hf.co/datasets/bigcode/admin/resolve/main/BigCode_Sa...
| 
| Dataset search: https://huggingface.co/spaces/bigcode/santacoder-
| search
| 
| Model weights: https://huggingface.co/bigcode/santacoder
 
  | isoprophlex wrote:
  | Any idea how this, and those other two models, would compare to
  | GitHub Copilot?
 
    | moyix wrote:
    | Based on the reverse engineering done by Parth Thakkar [1],
    | the model used by Copilot is probably about 10x as large (12B
    | parameters), so I would expect Copilot to still win pretty
    | handily (especially since the Codex models are generally a
    | lot better trained than Salesforce CodeGen or InCoder). It's
    | also a little bit hard to compare directly because as Parth
    | documents, there are a lot of extra smarts that go into
    | Copilot on the client side.
    | 
    | The SantaCoder paper does have some benchmarks on MultiPL-E
    | though, so you could compare them to the Codex results on
    | that benchmark reported here (but keep in mind that code-
    | davinci-002 is probably _even larger_ than the model used by
    | Copilot): https://arxiv.org/abs/2208.08227
    | 
    | [1] https://thakkarparth007.github.io/copilot-
    | explorer/posts/cop...
 
| zomglings wrote:
| Looks like Santa's going to be too busy to deliver presents this
| Christmas:                   def all_prime_elements(sequence):
| """Returns every prime element of the sequence."""
| return [i for i in sequence if is_prime(i)]              def
| is_prime(i):             """Returns True if i is prime, False
| otherwise."""             if i == 2:                 return True
| if i == 3:                 return True             if i == 5:
| return True             if i == 7:                 return True
| if i == 11:                 return True             if i == 13:
| return True             if i == 17:                 return True
| if i == 19:                 return True             if i == 23:
| return True             if i == 29:                 return True
| if i == 31:
 
| rytill wrote:
| Are there model weights?
 
  | moyix wrote:
  | Yep! https://huggingface.co/bigcode/santacoder
 
    | Traubenfuchs wrote:
    | A few more "getting started" examples would be nice.
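A minimal getting-started sketch (my own, not from the model card: the transformers calls are the standard Auto-class API, and the fill-in-the-middle token names below are assumptions based on the BigCode release, so double-check them against the tokenizer's vocabulary):

```python
def build_fim_prompt(prefix, suffix):
    # Fill-in-the-middle prompt layout; the special-token names here
    # are assumptions - verify them against the SantaCoder tokenizer.
    return f"<fim-prefix>{prefix}<fim-suffix>{suffix}<fim-middle>"

def complete(prompt, checkpoint="bigcode/santacoder", max_new_tokens=64):
    # Imported lazily so build_fim_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    # trust_remote_code is needed because the repo ships custom model code.
    model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True)
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(out[0])

prompt = build_fim_prompt("def is_even(n):\n    return ", "\n")
```

Calling complete(prompt) on a machine with enough VRAM should then fill in the code between the prefix and the suffix.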
 
| TuringNYC wrote:
| Any VS Code extension?
 
___________________________________________________________________
(page generated 2022-12-22 23:00 UTC)