|
| rattlesnakedave wrote:
| Reading about FOSS copyright is so exhausting. I find no
| meaningful distinction between reading code and learning from it,
| vs feeding it into a model. I've heard the "it spits out foss
| code verbatim" argument, and I really don't buy that. I've never
| seen it. AI-assisted software tooling is so powerful that we
| really should consider the social benefits ahead of what fits
| into our existing legal framework.
| A4ET8a8uTh0 wrote:
| There is a risk, but the legal risk to individual users is yet to
| be decided.
|
| What I think is more concerning is that Copilot is effectively an
| extension of automatically copying stuff from Stack Overflow, with
| the prompt writer having even less understanding of what the code
| does.
|
| Do not get me wrong. I absolutely see the benefits, but the risk
| listed in the article seems less material than a further general
| decline in code quality. "Built by a human" may need to end up
| being a thing the same way "organic" became a part of daily
| vocabulary.
| nottorp wrote:
| The problem is, all those people supporting Copilot in this
| thread can actually write said code without Copilot's help.
| Namely, they know what they're doing and the tool just saves
| them some typing.
|
| What happens when this extends to the "specialists" that
| blindly copy code off Stack Overflow? What happens when this
| becomes part of learning to program? Will it be as useful for
| producing working, efficient code when used by people who don't
| know what they're doing?
| TacticalCoder wrote:
| I saw there was some (unofficial) package for Emacs, reusing some
| vim Copilot integration. Anyone here tried Emacs+Copilot yet? Is
| it working fine? Out of curiosity I'd like to try it and, who
| knows...
|
| Also: does Copilot work for Clojure, and is it any good for
| Clojure?
| dpedu wrote:
| Like many folks here, I can write and read a variety of different
| programming languages. Some I've been using for a long time and
| know very well and some I seldom use but retain the basics.
|
| I don't use Copilot when writing languages I am very comfortable
| with because I'd rather write code that I completely understand,
| or at least understand to the best of my ability. I find it
| easier to consider edge cases and side effects when writing
| original code than when reading someone else's code, ripped from
| a project whose goals you don't even know. For this reason, too,
| I don't buy that Copilot improves productivity.
|
| I also avoid using Copilot when writing in languages I am
| unfamiliar with because I feel like it's robbing me of a learning
| experience. Or robbing me of repetition that improves my memory
| of how to do various things in the language.
|
| I don't know. Copilot is certainly impressive but there are too
| many questions - what I've mentioned and the legal ones in the
| OP. But perhaps that is a good thing? It is a new angle on
| copyright that we're going to have to answer one way or another.
| In programming and other fields.
| [deleted]
| iio7 wrote:
| There is a major difference between the help you can get from an
| IDE or editor with a language server running in the background,
| and GitHub Copilot stealing away other people's code.
|
| I sincerely hope Microsoft loses this lawsuit.
| tester756 wrote:
| I'd rather have some "middle-ground" solution instead of losing
| such a tool
|
| I don't see anything wrong with "stealing" code that was meant
| to be public.
|
| Banning it brings no value compared to what those tools provide.
|
| Also, how is that different from Google scraping the whole
| internet?
| everyone wrote:
| I have not used it, but I don't understand how copilot could be
| useful. As a game programmer I don't spend much time actually
| writing final code. Most of my time is spent working stuff out on
| paper or writing little tests which I will discard.
|
| In general I want to write as little code as possible as more
| code = more problems. The code I _do_ write I want to put great
| care and craft into in order to keep it maintainable. Giving up
| any of my agency in this critical area seems like a terrible idea
| to me.
|
| Something that will help me write more code, or write code faster
| is of no benefit to me.
| goosesanta wrote:
| timojeajea wrote:
| I think you need to try it if you want to understand how it can
| be useful. I also tend to write as little code as possible.
| Since I started using Copilot, I don't write more code nor less
| code. I write the exact same code I would have written without
| Copilot, I'm just 25% more productive with it.
| everyone wrote:
| Are you a webdev? Cus I have been purely a game dev my whole
| career. I never wrote a single web app until very recently,
| when I learned some web frameworks to make simple backends
| for hobby projects of mine in my spare time. I was kinda
| shocked how much boilerplate there is and how prescriptive
| the web frameworks are (I have done some node.js and asp.net).
| Also, for non-typed, non-compiled languages like JavaScript the
| IDE support and autocomplete seems almost non-existent
| compared to what I am used to. I would imagine something like
| Copilot would be more useful in that context.
| haolez wrote:
| I haven't used it yet. I believe when people say that it's the
| future of development and that every dev will have to use it or
| be left behind, but I can't fathom how people are comfortable
| sending every iteration of their code to a big tech corporation.
| I can't wait to see the day when we can run such solutions on
| our personal computers (or personal cloud servers), but I feel
| that, in 2022, this type of tool is not yet worth the risk. I
| hope this is just a temporary obstacle on the way to our AI-
| assisted programming future.
| PUSH_AX wrote:
| > but I can't fathom how people are comfortable sending every
| iteration of their code to a big tech corporation.
|
| I assume you only ever use self hosted source control then? And
| then where is it hosted?
| TillE wrote:
| For private business code, yes? Of course.
|
| It's very easy to host your own GitLab server if you need a
| fancy web interface, and even easier to just put Git anywhere
| if you don't.
| hipsterstal1n wrote:
| Programmer: Uploads code to Github for the public to see / use
|
| Github: Uses code uploaded by programmers to learn and make other
| code better
|
| Programmer: NO FAIR! My code can only be used the way I want it
| to be and my code is absolutely unique and no one else has coded
| something like it
| Acen wrote:
| I think it kind of flows into two trains of thought in the
| against category. First off, some people are worried about
| copyrighted, private stuff being included in the training data.
| I've not read up on Copilot recently, so I'm not sure if this is
| a reasonable thing to be worried about or not.
|
| The other is that people might be using GitHub to share stuff
| they've come up with with other developers, but having an AI
| parse that information means that there's a disconnect between
| giver and receiver. It removes a chunk of the feedback loop,
| so rather than being a community of developers, it becomes
| something more akin to content creators and lurkers. That's not
| necessarily a bad thing, since it opens up the sheer number of
| possible uses that end up building on something. But it would
| minimize community feedback.
| tick_tock_tick wrote:
| People are way too attached to single-function examples. I'm
| struggling to find any example that actually rises to the
| required "originality, creativity, and fixation" for copyright
| to apply.
|
| Just because something looks similar or is even identical doesn't
| mean copyright applies.
| shakna wrote:
| You might want to take a look at some of the pieces of code
| examined in Google v. Oracle before you decide that small and
| obvious code cannot bear copyright in the way you think.
|
| That horrifying back and forth showed that lawyers can consider
| very small and obvious fragments of code to be absolutely
| copyrightable. And the fact that it went on for nearly a decade
| should tell you that none of this is simple.
| reidjs wrote:
| Can you give an example? Are they trademarking `for` loops or
| something?
| jpollock wrote:
| https://guides.lib.umich.edu/c.php?g=791114&p=5747565
|
| "Google also copied the nine-line rangeCheck function in
| its implementing code"
|
| Comparison between the two, discussed back in 2012:
|
| https://news.ycombinator.com/item?id=3940683
| Jaygles wrote:
| Whether a court would ultimately decide an instance of
| Copilot code is copyright infringement isn't the main
| issue in my opinion. Creating opportunities for other
| people to sue you will be much more damaging. Even lawsuits that
| you win will be very expensive, and it's not guaranteed you'll
| get lawyers' fees paid by the losing party.
| renewiltord wrote:
| I'm going to keep using it. You won't stop me. You won't catch
| me. And I just need to read the next five tokens to know whether
| it's right.
| khiqxj wrote:
| you still on about copyright? what about the fact that it will
| just add vulns and bugs to your code? or is the industry so bad
| at this point that a gimmicky AI tool can do better?
| powera wrote:
| This feels like another drama in the style of SCO v. Linux. Lots
| of FUD, little to nothing for end-users to actually worry about.
| tevon wrote:
| Yup. There is no chance in hell of them coming after USERS of
| copilot.
| falcolas wrote:
| If a company's code is audited (internally or externally),
| and GPL code is found, you can bet your ass the dev who
| committed that GPLed code will get a stern talking-to, and
| the company will have to re-write that code.
|
| And that's just for GPL code. Code not under an OSS license
| could get way worse.
| ugh123 wrote:
| I'm really getting tired of lawyers, and collectively our "inner
| lawyer", pooh-poohing this merely over licensing and GPL issues,
| neither of which has any practical implication for anything a
| software engineer does.
|
| All this "controversy" around Copilot just reeks of a kind of
| technological "social justice" that most people didn't sign up
| for but seem happy to sit, watch, and commiserate on.
| goosesanta wrote:
| ianlevesque wrote:
| The structural completions are way more useful than the entire
| function completions, even in IntelliJ, where autocomplete is
| already extremely high quality.
|
| The part that I find unsettling when using Copilot is the risk
| that credentials or secrets embedded in the code, or being edited
| in (.gitignore'd) config files, are being sent off to Microsoft
| for AI-munging and possible human review for improvements to the
| model.
| PartiallyTyped wrote:
| > The structural completions are way more useful than the
| entire function completions, even in IntelliJ, where
| autocomplete is already extremely high quality.
|
| I needed to run a comparison over a window of a numpy array,
| and given the sheer size of my data, I needed it to be fast and
| efficient, which means vectorized operations with minimal
| python interaction. Copilot figured out a solution that is orders
| of magnitude faster than what I could conjure up in 10 minutes,
| most of which I'd spent searching for similar solutions on SO.
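|
| Roughly, the kind of vectorized windowing I mean looks like this
| (an illustrative sketch, not my actual code; the data, window
| size, and threshold are made up):
|
|     # Minimal sketch of a vectorized windowed comparison.
|     # Hypothetical data and threshold; the technique is the point.
|     import numpy as np
|     from numpy.lib.stride_tricks import sliding_window_view
|
|     data = np.random.rand(1_000_000)  # large 1-D array
|     window = 5
|
|     # View every length-5 window without copying, then reduce each
|     # window and compare against a threshold in one vectorized pass.
|     windows = sliding_window_view(data, window)  # shape (N - 4, 5)
|     hits = windows.max(axis=1) > 0.99            # one bool per window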
| patmorgan23 wrote:
| You shouldn't have any credentials in your git repos anyway.
| GitHub will already scan your repos and alert you if it thinks
| there are any credentials in their.
| ianlevesque wrote:
| You've never temporarily put a key into a file while testing?
| Or accidentally pasted one for a second then deleted it? Can
| you say the same for your entire team or company?
|
| Since Copilot is constantly making new suggestions, a
| momentary entry is all it takes.
| ehutch79 wrote:
| Credentials should never be committed. By the time you're
| ready to commit code, you should be reading them from the
| environment or a config outside of the codebase, or at
| least one that's .gitignore'd.
|
| Once that key is in your git history, it's in the history.
| You might be able to rewrite it out, but it's going to be a
| nightmare to do.
| PartiallyTyped wrote:
| Copilot doesn't retrain on data you generate in the
| moment, so I don't see why this is an issue unless you push
| the files - with the keys - to GitHub.
| bugfix-66 wrote:
| It's interesting to consider how you might prevent training using
| a license without being too restrictive.
|
| Here is an example of a license that attempts to directly
| prohibit training. The problem is that such software arguably
| can't be used in any part of a system that might be used
| for training or inference (in the OS, for example). Somehow you
| need to additionally specify that the software is used
| directly... But how, and what does that mean? This is left as an
| exercise for the reader, and I hope someone can write something
| better.
|
| The No-AI 3-Clause License
|
| _This is the BSD 2-Clause License, unmodified except for the
| addition of a third clause. The intention of the third clause is
| to prohibit, e.g., use in the training of language models. The
| intention of the third clause is also to prohibit, e.g., use
| during language model inference. Such language models are used
| commercially to aggregate and interpolate intellectual property.
| This is performed with no acknowledgement of authorship or
| lineage, no attribution or citation. In effect, the intellectual
| property used to train such models becomes anonymous common
| property. The social rewards (e.g., credit, respect) that often
| motivate open source work are undermined._
| License Text:
|
| https://bugfix-66.com/7a82559a13b39c7fa404320c14f47ce0c304fa...
| echelon wrote:
| This is such Luddite behavior.
|
| How much hubris we have as a species to think that our
| professions will endure until the end of the stars. To think
| that the software we write will be eternal.
|
| The thing that we do now is no different than spinning cotton.
|
| I'd be shocked if the total duration of human-authored
| programming lasted more than a hundred years.
|
| I'll also wager that in thirty years, "we'll" write more
| software in any given year than all of history up until that
| point.
| AlexandrB wrote:
| I'm all on board if the Microsofts of the world are. But
| they choose to train their AI on OSS code and not their own
| codebase. So clearly they think similarly to the parent; they
| just want you to forget about that part when it suits them.
| echelon wrote:
| If we pass laws restricting the training on copyrighted
| information, the only organizations that will be able to
| train will be institutional.
|
| Microsoft would benefit from restriction. Not us.
| blibble wrote:
| would you pay for a product trained on say, the MS Teams,
| Sharepoint or Skype codebases?
|
| no, and no-one else would either
| EMIRELADERO wrote:
| What about fair use? (both in the copying made for training
| itself and the resulting output from the service)
| bugfix-66 wrote:
| We are witnessing a monstrous perversion of "fair use" and
| the greatest theft of intellectual property in human history.
| EMIRELADERO wrote:
| Do you measure IP's value using the amount of work/effort
| that was put into creating it, or only the end result?
|
| Currently US copyright law only cares about the end result.
| Effort has no meaning or bearing in any legal analysis of
| copyright matters.
| lloeki wrote:
| This is the BSD 2-Clause License:
|
|     1. Redistributions of source code must retain the above
|        copyright notice, this list of conditions and the
|        following disclaimer.
|
|     2. Redistributions in binary form must reproduce the above
|        copyright notice, this list of conditions and the
|        following disclaimer in the documentation and/or other
|        materials provided with the distribution.
|
|     THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
|     CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
|     INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
|     MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
|     DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
|     CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
|     SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
|     NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
|     LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|     HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
|     CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
|     OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|     SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
| Presumably, as long as GitHub Copilot:
|
| a) fails to respect these itself, or
|
| b) fails to present these to the user who is going to use its
| output verbatim or produce derivative code from it, so that the
| user can respect them,
|
| then GitHub Copilot is either in violation of the license or a
| tool assisting in such a violation by stripping the license
| away+.
|
| From TFA:
|
| > David Heinemeier Hansson, creator of Ruby on Rails, argues
| that the backlash against Copilot runs contrary to the whole
| spirit of open source. Copilot is "exactly the kind of
| collaborative, innovative breakthrough that I'm thrilled to see
| any open source code that I put into the world used to enable,"
| he writes. "Isn't this partly why we share our code to begin
| with? To enable others to remix, reuse, and regenerate with?"
|
| I don't mean to disrespect DHH, but the "spirit of open source"
| isn't to wildly share code around as if it were public domain,
| because it is not; an author gets to choose within which
| framework their code gets to be used and modified++. Otherwise
| one would have used public domain as a non-license, plus WTFPL
| for those jurisdictions where one can't relinquish one's own
| creation into the public domain.
|
| + depending on whether the "AI"/Microsoft can be held liable for
| the automated derivative, or whether the end user is.
|
| ++ cue GPL vs MIT/BSD
| voz_ wrote:
| The spirit of this is good, but the implementation is garbage -
| you need a lawyer or team of lawyers to do this right. You
| grandstand and soapbox in this weakly written paragraph, and it
| hurts the whole thing. You discuss social rewards, intentions,
| etc. This just reads like a Stallman-esque tirade.
| [deleted]
| bugfix-66 wrote:
| Listen Mike, the fact that you look down on Richard Stallman
| says a lot.
|
| You attempted to dox me yesterday, and revealed my personal
| information in a public forum. I've reported you twice for
| this behavior, which I believe to be in violation of the
| rules of Hacker News.
|
| You and I work in the same city and in the same industry, at
| two companies that do a lot of business together. I write the
| CUDA kernels you launch from your Python program. I know
| exactly who you are, and you're probably going to meet me one
| day in the course of business (but you won't know it, because
| you don't know exactly who I am). You're making real-life
| enemies through your atrocious behavior, your personal
| attacks and violations of forum etiquette. It's going to
| catch up with you.
|
| I've had enough: stay clear of me.
| tevon wrote:
| Seems this article completely misses the benefits of Copilot. It's
| a massive step forward in productivity. For me, it's about
| suggesting proper syntax across the various libraries we use. It
| really does cut time by tens of percent.
|
| I don't buy the argument that the risk of a yet-to-be-litigated
| case against a different company, which will certainly fight this
| hard, is greater than the productivity gain of using Copilot.
|
| Additionally, the security argument feels ridiculous to me. We
| lift code examples from gists and Stack Overflow ALL THE TIME! But
| any good dev doesn't just paste it in and go; instead we review
| the code snippet to ensure it's secure. Same thing with Copilot:
| of course it's going to write buggy/insecure code, but instead of
| going to Stack Overflow for a snippet it's suggested in my IDE and
| with my current context.
| caseydm wrote:
| I was a naysayer but find copilot makes me more productive.
| Especially at writing tests. It's very good at recognizing
| patterns in your own work, and completing an entire test based
| on the function name.
| nsxwolf wrote:
| I tried to do this and I couldn't figure it out. I never got
| the sense that it knew anything about the code I had written,
| just that it was dreaming stuff up from its training set.
| ed_balls wrote:
| > It really does cut time by 10s of percent.
|
| I used it for about a month. It gave me a few false positives
| that really burned me - it's not worth the risk. Maybe future
| versions will be better.
| tevon wrote:
| What're the examples of false positives?
|
| Agreed it gets things wrong very frequently. But I've found
| it much easier to use its suggestion as another "input" to
| writing code.
| lolinder wrote:
| I've gotten plenty of false positives, but the mistakes turn
| up in testing and are pretty easy to spot when reviewing the
| code. Anything more subtle is likely to have been missed when
| written by hand anyway.
|
| What happened to burn you so badly?
| patrickthebold wrote:
| I don't understand how it improves productivity _that_ much.
| Most of my time isn't actually spent on syntax but rather
| reading Hacker news and making irrelevant comments.
| terracatta wrote:
| Using it in practice, the sheer quantity of suggestions (often
| one for every line) is fatiguing, especially when 99% of the
| time they seem fine.
|
| I posit it becomes increasingly likely, over long periods of
| time and across many engineers, that a severe bug or security
| issue will be introduced via an AI-provided suggestion.
|
| This risk to me is inherently different than the accepted risk
| that engineers will use bad code from Stack Overflow. Even
| Stack Overflow has social signals (upvotes, comments) that
| allow even an inexperienced engineer to quickly estimate
| quality. And the amount of code engineers take from Stack
| Overflow or blogs etc. is much smaller.
|
| GitHub Copilot is constantly recommending things and does not
| give you any social signals that less experienced engineers can
| use to discern quality or correctness. Even worse, these are
| suggestions written by an AI that does not have any
| self-preserving motivations.
| visarga wrote:
| > I posit it becomes increasingly likely, over long periods of
| time and across many engineers, that a severe bug or security
| issue will be introduced via an AI-provided suggestion.
|
| AI can also do code review and documentation helping us
| reduce the number of bugs. Overall it might actually help.
| throwaway675309 wrote:
| I would argue that this kind of problem is going to become
| less of an issue over time, since they're also going to have to
| solve the issue of suggesting code samples from deprecated API
| versions - it's likely that eventually they'll figure out a
| similar way to promote more secure code in the suggestions,
| based on Stack Overflow or other types of ranking systems.
| visarga wrote:
| With millions of users accepting suggestions, then fixing
| them, they get tons of free labeling. They also train us to
| write better prompts and comments, helping them get quality
| data that is also in-distribution.
|
| Another path for evolution is to execute code and see the
| outcome. Language model -> code -> execution results ->
| feedback for learning.
| tevon wrote:
| This is a very solid argument. How do we fix that?
|
| THIS is the article I want to read!
| redleggedfrog wrote:
| "I posit it becomes increasingly likely over large periods of
| time over many engineers that severe bug or security issue
| will be introduced via an AI provided suggestion."
|
| I'll go one further with "Copilot is stupid."
|
| It's supposed to be artificial _intelligence_. Why in the eff
| is it suggesting code with a bug or security issue? Isn't
| the whole point that it can use that fancy AI to analyze the
| code and check for those kinds of things on top of suggesting
| code?
|
| Half-baked.
| lolinder wrote:
| Copilot's default behavior is stupid. You can turn off auto-
| suggest so that it only recommends something when you prompt
| it to, and that should really be the default behavior. This
| would encourage more thoughtful use of the tool, and solve
| the fatigue problem completely.
|
| In IntelliJ, disabling autocomplete just requires clicking
| on the Copilot icon at the bottom and disabling it. Alt+\
| will then trigger a prompt. I know there's a way to do this
| in VSCode as well, but I don't know how.
| joenot443 wrote:
| > I know there's a way to do this in VSCode as well, but I
| don't know how.
|
| I dug into this a bit, since I want the same functionality.
| I found I needed an extension called settings-cycler
| (https://marketplace.visualstudio.com/items?itemName=hoovercj...)
| which lets one flip the
| 'github.copilot.inlineSuggest.enable' setting on and off
| with a keybind.
|
| Not sure who's in charge of the Copilot extension for VS
| Code, but if you're out there reading this, the people
| definitely want this :) Otherwise of course, your tool
| rocks!
| nprateem wrote:
| I switched it off and never remember to bother using it.
| It's obvious why it's enabled by default.
| khalilravanna wrote:
| This. If copilot suggests anything more than basic syntax or
| boilerplate I don't use it. If it writes code I don't
| understand or wouldn't be able to write myself I won't use it.
| Why? Because at the end of the day it's _my_ code. In what
| world is a good engineer submitting a PR for coworkers to look
| over that isn't their code?
|
| If this is a real issue the solution is not banning yet another
| tool. It's education. Teaching engineers how to properly
| understand code attribution and licenses.
| echelon wrote:
| Do you think we'll be writing software 200 years from now?
|
| 50? 25?
|
| I'll bet the people spinning cotton thought that would endure
| forever.
|
| (Sorry if my tone comes across as fervent. I'm excited to be
| displaced by this, because what follows is the stuff of
| dreams.)
| rmbyrro wrote:
| Yeah, in the future there will be only AIs developing apps
| and AIs using apps.
|
| There won't be apps, actually, they'll do everything
| programmatically.
|
| And all humans will have been killed by then in an AI
| doom.
| ThrowawayR2 wrote:
| These assisted coding systems are tremendously exciting but
| they are only the analogue of moving from a shovel to a
| powered excavator; it still needs a trained individual who
| knows what the final result needs to look like to a fairly
| high technical level to be effective. So, yes, 25-50 years
| from now humans will still be the principal element in
| writing software.
| ben_w wrote:
| Between 2016 and 2021, I've been of the opinion that I
| cannot make any reasonable forecast of even vague large-
| scale social/technological/economic development past 2030,
| because the trends in technology go all funky around then.
|
| Thanks to recent developments in AI (textual and visual), I
| no longer feel confident predicting any of those things
| past about the beginning of 2028.
|
| It's not a singularity, it's an event horizon:
| https://kitsunesoftware.wordpress.com/2022/09/20/not-a-singu...
| Waterluvian wrote:
| Whenever I watch Geordi and Data doing something in
| engineering, they're often talking to the computer about
| constructing models and sims and such.
|
| To me this is the ultimate form of declarative
| programming. Not that we will all be talking it out, but
| that we will explain in natural language what we're after.
|
| It maximizes how much time we spend in the "problem
| understanding/solving" phase and minimizes the tedium of
| actually setting up the apparatus.
| yonaguska wrote:
| The invention of the cotton gin simply moved people from
| spinning cotton to picking cotton. And increased demand for
| slaves.
|
| I'm not excited to be displaced personally, but I'm also
| not really worried about being displaced. If displacement
| is inevitable, I don't see how the average programmer is
| going to leverage this for the "stuff of dreams". Usually,
| tech advancements result in a greater consolidation of
| wealth into the hands of those that already own capital.
| Recent tech is no exception. Yes, there has been a lot of
| wealth created for regular people, but we're still working
| 40+ hour weeks, and earnings have not matched the increase
| in productivity.
|
| What I am concerned about is that our field is becoming
| increasingly arcane magic for the younger generations,
| especially the masses that are being completely and utterly
| failed by the education system.
| bravetraveler wrote:
| I apologize ahead of time for rambling, but I'm with you
| on this!
|
| In my coworkers and many of the applicants we see,
| there's a trend of over optimization. The common meme is
| the 'leet code' interview process.
|
| I suppose the best way I can convey this is... I think
| there's hyper focus on the mechanics of doing things.
| Making people not afraid of the code, unaware of the
| world around it
|
| Abandoning a lot of thought for process. Or even the
| physical systems it runs on. I recently learned about the
| term 'mechanical sympathy'
|
| Sometimes it's important to ask if you need the code or
| system at _all_!
|
| I know it's not fair to people but I groan any time I see
| a CS degree
| tines wrote:
| I mean, yes? People will be doing math as long as there are
| people around to do it. It'll look different, sure. But
| there will always be problems, and math/programming is
| problem solving par excellence.
| [deleted]
| tick_tock_tick wrote:
| I don't see a world where programming isn't the last thing
| to go. We pretty much have a general intelligence when a
| "programmer" is no longer needed. That doesn't mean
| programming will look anything like it does today in 200
| years but will the profession, doing kinda the sameish
| thing, still exist? Absolutely!
| bcrosby95 wrote:
| It's interesting to think about. If programming can be
| automated away, then you can use that automation to
| automate away any job in the world that can be automated.
| tevon wrote:
| Yes! Exactly.
|
| The article suggests that he wants to know "who wrote the
| code" if a senior dev he trusts submits a PR. He doesn't want
| to be surprised that "the AI" wrote some of this code.
|
| But it's ALL written by the senior dev. If he trusts that dev,
| that means that dev has thoroughly read and tested the code!
| That's the important bit. Remembering proper
| syntax/imports/nesting levels is the tiniest piece of writing
| good code. And Copilot can take that off our hands.
| sieabahlpark wrote:
| falcolas wrote:
| That's like saying that code copy/pasted from OSS projects
| on GitHub was "written by the developer". Which is not
| true.
|
| The speed of your developer and the correctness and test
| coverage of your code _doesn't matter_ when it comes to
| license compliance.
|
| And license compliance could cost your company 100x (if not
| more) the value of your best software developer -
| especially for the non-OSS licenses.
| khalilravanna wrote:
| > That's like saying that code copy/pasted from OSS
| projects on github was "written by the developer".
|
| I don't think that's what OP is saying. What I think OP
| is saying (and I agree) is that submitted code is trusted
| if you trust the source. If you take the person putting
| code in front of you and ask "Would this person copy
| someone else's code and submit it as their own?" and the
| answer is "No, they would not copy code," then every step
| that trusted person took to get to that code is
| immaterial, whether they used Stack Overflow or Copilot or
| whatever AI-assisted code-generating tools do or don't
| get developed in the future. At the end of the day a
| good, trustworthy engineer isn't going to use licensed
| software by "accident"[1].
|
| 1. I put "accident" in quotes because it seems so crazy
| to me that someone would start writing a method "doThing"
| and then CoPilot spits out a licensed implementation of
| "doThing" and the engineer would look at it and go "This
| seems fine."
| falcolas wrote:
| > every step that trusted-person took to get to that code
| is immaterial.
|
| Which is, unfortunately, completely useless when it comes
| to copyright infringement. Trust in the individual will
| not change the output of an audit for copyrighted code,
| or the results from said audit.
|
| The only thing that a "trusted" individual can contribute
| in a copyright infringement investigation is attesting
| that they did not know that the code they put in the
| codebase was copyrighted. And all that does is save the
| company from getting the higher "willful infringement"
| fines, if it should get that far.
|
| Wilful Infringement Damages:
| https://www.ce9.uscourts.gov/jury-instructions/node/708
| iceburgcrm wrote:
| It was written by the developer. If I write down lyrics I
| remember, I still wrote it. Whether I have the copyright
| to make money off of it, or whether it is trademarked, are
| different things.
|
| You could state they are not the first to write this, which
| would be more correct.
| falcolas wrote:
| GitHub Copilot has been concretely demonstrated to emit
| significant chunks of OSS-licensed code.
|
| Significant enough that if the license is GPL (which some of
| it has been) it will "taint" the entire codebase and license
| it under GPL. Significant enough to be found by automated
| OSS audit tools, which would trigger a re-write and
| education for the developer who committed it.
|
| EDIT:
|
| > If I write down lyrics I remember I still wrote it.
|
| Not from a copyright point of view. The rights to those
| lyrics belong to the songwriter. It's kinda like
| photographs. You don't automatically have the right to
| distribute a photograph of yourself that was taken by
| someone else.
| warkdarrior wrote:
| > Significant enough that if the license is GPL (which
| some has been) it will "taint" the entire codebase and
| license it under GPL. Significant enough to be found by
| automated OSS audit tools, which would trigger a re-write
| and education for the developer who committed it.
|
| That "significant enough [...] to taint the entire
| codebase" remains to be decided in court.
| ekidd wrote:
| Several of the byte-for-byte copies pointed out by open
| source authors were longer than 20 lines, and contained
| verbatim comments.
|
| I am not a lawyer, but that's been enough to get people
| in legal trouble in the US.
| [deleted]
| LeifCarrotson wrote:
| In Intellij or Visual Studio, syntax suggestion/tab completion
| are already great. Those technologies - which involve none of
| the legal risks of Copilot- are a massive step forward in
| productivity. Copilot does help extend these benefits to other
| languages that I occasionally dabble in, like Lua and embedded
| C, though it's clearly better in languages which are better
| represented in its dataset.
|
| I don't find the natural-language-comment-to-buggy-algorithm
| part of Copilot to be particularly useful. I know some people
| asked to be able to write a "DoWhatIMean()" method, but
| programmers really only wanted that to auto-expand to
| "protected virtual void DoWhatIMean() {}" without having to
| wait 30 seconds to check for a compile error and see if it was
| protected void virtual or protected virtual void...
| lolinder wrote:
| > In Intellij or Visual Studio, syntax suggestion/tab
| completion are already great. Those technologies - which
| involve none of the legal risks of Copilot- are a massive
| step forward in productivity. Copilot does help extend these
| benefits to other languages that I occasionally dabble in,
| like Lua and embedded C, though it's clearly better in
| languages which are better represented in its dataset.
|
| Copilot is _so_ much beyond regular autocomplete that it's
| playing a completely different game.
|
| I've been using it today while writing a recursive descent
| parser for a new toy language. I built out the AST in a
| separate module, and implemented a few productions and tests.
|
| For all subsequent tests, I'm able to name the test and ask
| Copilot to write it. It will write out a snippet in my custom
| language, the code to parse that snippet, and construct the
| AST that my parser should be producing, then assert that my
| output actually does match. It does this with about 80%
| accuracy. The result is that writing the tests to verify my
| parser takes easily 25% of the time it would take when done
| by hand.
|
| In general, this is where I have found Copilot really shines:
| tests are important but boring and repetitive and so often
| don't get written. Copilot has a good enough understanding of
| your code to accurately produce tests based on the method
| name. So rather than slogging through copy paste for all the
| edge cases, you can just give it one example and let it
| extrapolate from there.
|
| It can even fill in gaps in your test coverage: give it a
| @Test/#[test] as input and it will frequently invent a test
| case that covers something that no test above it does.
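|
| To make the shape of this concrete, the generated tests look
| roughly like this (an illustrative sketch; parse, Let, Ident and
| Num are stand-ins, not my actual API):
|
|     # Hypothetical pytest-style test of the kind Copilot drafts
|     # from just the test name plus earlier examples in the file.
|     from myparser import parse, Let, Ident, Num  # hypothetical parser module
|
|     def test_parse_let_binding():
|         source = "let x = 42"                # snippet in the toy language
|         expected = Let(Ident("x"), Num(42))  # AST the parser should build
|         assert parse(source) == expected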
| lambdadmitry wrote:
| Thing is, for something like an AST parser you want a
| property test, not a bunch of autogenerated boilerplate.
|
| Generally, if something is boring and repetitive it
| probably shouldn't be written; better code generation is
| rarely a good answer.
| insanitybit wrote:
| Indeed. I had to write a graph traversal iterator in Rust and
| Copilot wrote the entire thing for me. I could have written it
| myself, it would have looked similar, but it just... did it. It
| was trivial to test and verify correctness.
|
| That's minutes of work, maybe even 10 minutes, turned into
| seconds. That is huge.
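|
| (For a sense of scale, the whole thing was on the order of this
| Python sketch, just in Rust and over my own graph type; the names
| and data here are made up for illustration:
|
|     # Depth-first traversal as a generator over an adjacency dict.
|     def dfs(graph, start):
|         seen, stack = set(), [start]
|         while stack:
|             node = stack.pop()
|             if node in seen:
|                 continue
|             seen.add(node)
|             yield node
|             stack.extend(graph.get(node, ()))
|
|     graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
|     assert list(dfs(graph, "a")) == ["a", "c", "d", "b"]
|
| Small, mechanical, and easy to eyeball for correctness.)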
|
| The risk here is extremely low. Who is going to sue consumers
| of Copilot? It makes no sense. They'll sue Microsoft and, in a
| decade, we'll see if they win or lose (IMO Microsoft will win,
| but it's not important).
| boxed wrote:
| Did it "write it for you"? Or did it "illegally copy it for
| you"? That's a very big difference.
|
| I'm not claiming that you can't get big productivity boosts
| by ripping off code like a crazy person. I bet you can! But
| should you?
| zackees wrote:
| insanitybit wrote:
| I don't really care, it's a trivial algorithm that I would
| have written virtually identically.
| cauefcr wrote:
| Yes, software copyright and patents are a mistake.
| thesuperbigfrog wrote:
| >> Yes, software copyright and patents are a mistake.
|
| Richard Stallman would agree, but there are many of us
| who make a living writing software.
|
| Is software valuable enough that people will pay money
| for it?
|
| If you write original software that solves a problem,
| shouldn't you be able to license it how you want and
| profit from it?
|
| You are welcome to license the software you create how
| you want. Let me license the software I create how I
| want.
|
| If I dual-license my software as GPL and commercial, and
| GitHub Copilot reproduces my GPLed code without
| attribution and without the license, how is that not a
| copyright violation?
| rattlesnakedave wrote:
| Do you find meaningful distinction between an individual
| reading your code and copying patterns vs an AI model
| doing the same?
| thesuperbigfrog wrote:
| No, provided that both give proper attribution and follow
| the license the code is released under.
| rattlesnakedave wrote:
| That's a hilarious expectation. How often do you give
| attribution to inventors of patterns you use in your
| software?
| throwaway675309 wrote:
| Nothing has been decided in a court of law, so saying that
| it's "illegal" is disingenuous.
|
| Even if it's remarkably similar to a function from a
| completely different codebase, with only some of the symbols,
| variable names, or the function name changed, I would
| argue that it still falls under fair use and is
| sufficiently transformative.
| woah wrote:
| I always get the impression that CoPilot critics have never
| actually used it to get any work done and are basing their
| criticism solely on a tweet they saw about the Quake inverse
| square root copypasta function
| lelandfe wrote:
| The article itself lists three other recent examples, two of
| which are clearly copyright infringement
| https://twitter.com/DocSparse/status/1581461734665367554
|
| It is not a theoretical concern
| falcolas wrote:
| Oof. LGPL. That "time saver" will infect your entire
| codebase and open your company to sizable liability.
|
| Even if they're never sued, companies will do internal OSS
| scans to limit their risks which would catch this. The
| result would be (at minimum) a talking to for the dev who
| committed it, and developer time spent doing a clean room
| re-write.
| charcircuit wrote:
| >will infect your entire codebase
|
| No, it won't. It will only infect the resulting binary.
| cromka wrote:
| > Same thing with copilot, of course its going to write
| buggy/insecure code, but instead of going to stackoverflow for
| a snippet its suggested in my IDE and with my current context.
|
| Copilot actually can have the benefit here of being able to
| retroactively mark some snippet as insecure, if it gets flagged
| as such by the moderators. Any user who used it could get an
| automatic notification.
| lelandfe wrote:
| > I don't buy the argument that the risk... is greater than the
| productivity gain of using copilot.
|
| How does your company's general counsel feel?
|
| This article is written for CTOs, not engineers.
| tevon wrote:
| We don't have GC (too small), so caveat my take with the fact
| that I'm writing from a smaller company's perspective.
|
| It may be different for a larger, value-preserving company that
| would face more scrutiny.
|
| That being said, I still find it extremely unlikely that
| there would be legal ramifications from using a product being
| pushed by one of the largest software companies in the world.
| Why go for a user and not Microsoft themselves?
| fuckstick wrote:
| jen20 wrote:
| > Why go for a user and not Microsoft themselves?
|
| 1) the user likely doesn't have the legal resources of
| Microsoft.
|
| 2) the user is the one committing the infringement.
|
| If Microsoft stood behind this they could offer to
| indemnify users against lawsuits relating to CoPilot usage,
| but they don't.
| tsimionescu wrote:
| > That being said, I still find it extremely unlikely that
| there would be legal ramifications from using a product
| being pushed by one of the largest software companies in
| the world.
|
| Microsoft is _explicitly_ saying it's your responsibility
| to check that the Copilot output you add to your
| codebase is not infringing on anyone's license.
|
| Also, it's actually a complex legal question whether Copilot
| itself is infringing anyone's copyright. But there is no
| doubt _whatsoever_ that you don't have the right to
| distribute someone else's copyrighted code (without a
| license) just because it was produced by Copilot and not
| manually copied by you. And it is also very clear that
| Copilot can occasionally generate larger pieces of someone
| else's code.
|
| Edit: fixed typos
| ninkendo wrote:
| > Microsoft is explicitly saying it's your responsibility
| to check if the Copilot's output that you ads to your
| codebase is infringing on anyone's license.
|
| (Never used copilot)
|
| Wow, this is kinda shocking IMO. It kind of negates the
| entire value proposition of the tool.
|
| How am I supposed to find out whether a snippet is
| infringing? Should I paste it into google or something?
| Shouldn't Copilot be the one to _tell_ me if a snippet
| too closely matches some existing code it learned from?
|
| If MS is indeed saying this, I feel like it's something
| they put in the agreement to cover their own asses.
| There's no way they'd really expect everyone to do this
| sort of thing. Moreover I don't feel that's a very strong
| defense MS could use in court if somebody decides to go
| after MS for making the tool that makes infringement so
| easy. It sounds like one of those "wink wink" types of
| clauses that they know full well nobody will follow.
| tsimionescu wrote:
| From the official FAQ [0]:
|
| > Other than the filter, what other measures can I take
| to assess code suggested by GitHub Copilot?
|
| > You should take the same precautions as you would with
| any code you write that uses material you did not
| independently originate. These include rigorous testing,
| _IP scanning_ [emphasis mine], and checking for security
| vulnerabilities. You should make sure your IDE or editor
| does not automatically compile or run generated code
| before you review it.
|
| I think lots of companies do run tools such as BlackDuck
| and others to scan their entire code base and ensure (or
| at least have some ass-covering) that there is no
| accidental copyright infringement.
|
| [0] https://github.com/features/copilot#other-than-the-filter-wh...
| coredog64 wrote:
| How much of what you save by using Copilot will then be
| spent on BlackDuck licenses?
| warkdarrior wrote:
| Capex vs opex, huge difference
| jzelinskie wrote:
| I suspect prohibiting Copilot will just become another
| checkbox on compliance security questionnaires. The fact that
| Kolide can detect it and that Kolide can feed compliance
| suites like Vanta or SecureFrame means the infrastructure is
| already there. It's not only your lawyers that want these
| guarantees, it's often your customers.
| yamtaddle wrote:
| Does Microsoft let their developers use it? Say, when working
| on Windows? If not, I'd say the very _vendor of the software_
| considers it radioactive, so I'll keep treating it that way,
| too.
| aliqot wrote:
| Is 'what would a Microsoft dev do' really the bar we want to
| live by, though?
| patmcc wrote:
| Say what you want about Microsoft - they've got some of the
| best lawyers in the world on this kind of stuff. If they're
| not doing it, they either don't trust the tech or don't
| trust the law.
| patmorgan23 wrote:
| No, but it's not a bad litmus test in this situation.
| yamtaddle wrote:
| In this case, yes, of course--I don't really get your
| objection. If their own legal counsel is advising them not
| to let their developers use _their own product_ over legal
| concerns (and what else could be the reason?) that would be
| a pretty good argument against anyone else using it.
|
| Nb. I don't know whether they do or do not, in fact, let
| their developers use it.
| falcolas wrote:
| > instead we review the code snippet to ensure its secure
|
| Doesn't matter. A developer's speed and test completeness and
| code quality matter not one whit when it comes to licensing.
| That 10x developer could mire the company in fines and code re-
| writes if they include copyrighted code, especially if it's not
| OSS.
| whateveracct wrote:
| Physically coding is not at all where I spend the majority of
| my time at work or on personal projects. I exclusively use
| Haskell though, so maybe that has more to do with it.
|
| But why optimize a non-critical path?
| endisneigh wrote:
| If copilot is fine, then software licenses are meaningless imo
| charcircuit wrote:
| This article makes a big mistake. It assumes copyright
| infringement is extremely bad and would never be worth doing. In
| practice when have people been sued over misusing open source
| software? You most likely won't be caught. And even if you are
| you can rewrite the code / give attribution then. Even if you do
| end up having to pay damages, the productivity increase for your
| company using copilot may be worth the damages.
| plgonzalezrx8 wrote:
| How to make it to the front page in any tech forum:
|
| Step 1: "GitHub Copilot Bad.... amirite!>"
|
| Snark aside, most of these articles miss the mark to the point
| where they seem like the author is tech illiterate and is just
| parroting soundbites from others' opinions.
| no_butterscotch wrote:
| It isn't worth the price. I was in the beta and thought it was
| good, but I'm hoping a better, cheaper alternative comes about.
| eloff wrote:
| Do you make less than minimum wage? Because even at minimum
| wage it saves me enough time a month to pay for itself. In my
| opinion it has a positive ROI after a single day.
| Kiro wrote:
| I would say the risk is minimal. You need to bait Copilot really
| hard for it to produce anything coherent from existing code.
| That's simply not how you use it.
|
| Regardless, the risk needs to be really big for me to stop using
| it. It's such an essential tool for me now that I'm shocked at
| how crippled I feel when the internet stops working and I
| realize how much I depend on it.
| samiam_iam wrote:
| File under bullshit
| mring33621 wrote:
| 1) Starting off, I support AI/ML-based code
| generation/completion. I would be very happy for the day when I
| can figuratively wave my hand and get 80-90% of what I need.
|
| 2) It might be fair to allow authors to submit repos, along with
| some sort of 'proof of ownership' to Copilot, in order to exclude
| them from the training set. There might have to be a documented
| (agreed-upon?) schedule for 'retraining', in order for the
| exclusion list to take effect in a timely manner.
|
| 3) Or just allow authors to add a robots.txt to their repos,
| which specifies rules for training.
|
| Just a few thoughts...
| VBprogrammer wrote:
| Pushing the responsibility onto copyright owners rather than
| GitHub / Microsoft / Copilot seems unreasonable. I'm all for AI
| being used like this, but it also needs to come with some checks
| and balances to ensure it's not just regurgitating copyrighted
| code.
| mring33621 wrote:
| OK, then just use existing copyright licensing:
|
| If a permissive, biz-friendly license (Apache 2.0, maybe
| others) is found in a given repo, then it can be used in the
| training set.
|
| Otherwise, the repo cannot be used in the training set.
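|
| Mechanically it's a trivial gate; something like this rough
| sketch (the repo records and the license field are hypothetical
| stand-ins for whatever metadata the trainer has):
|
|     # Keep only repos whose declared license is on an allowlist.
|     PERMISSIVE = {"apache-2.0"}  # expand if other licenses qualify
|
|     def training_eligible(repos):
|         for repo in repos:
|             if (repo.get("license") or "").lower() in PERMISSIVE:
|                 yield repo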
| mbreese wrote:
| And then every snippet ever created with that trained data
| would have to include an acknowledgement for every
| repository included in the training set.
|
| The LICENSE file would be longer than the rest of the code.
|
| (FWIW, I agree with you theoretically, but practically it's
| hard to get your head around what the ramifications of that
| would mean)
| coredog64 wrote:
| If Joe Bag'O'Donuts copies and pastes LGPL code into his
| own personal repository that has MIT license attached, is
| it safe for Copilot to train on it?
|
| I'm really of the opinion that MS needs to document the
| training set and include a high bar for inclusion of
| additional repos.
| leni536 wrote:
| Many permissive licenses (including Apache 2.0) require
| attribution.
| leni536 wrote:
| Re 2: So a DMCA notice?
| thwayunion wrote:
| Context: Kolide just launched a "GitHub Copilot Check" which you
| can get (along with other features) for $7/device/month. The
| article is marketing -- an attempt to induce demand among CTOs
| for an already developed product.
|
| That said: I generally agree with the assessment. GitHub should
| at the very least be telling users when it is generating code
| that it was trained on. Until it does that, it's kind of
| dangerous to use. The security stuff is imo more of a red
| herring.
|
| But the more important point is that you can just wait a year and
| hire a consultant to build a better product (for you) at pretty
| low cost. Within a year, any organization with a non-trivial
| number of developers will have the option of hosting their own
| model trained on The Stack (all permissively licensed) and fine-
| tuning it on their internal code or their chosen stack. That's
| probably the best path forward for most organizations. If you can
| afford $7/dev/month for Slack-integrated nannybots, you can
| definitely afford to pay a consultant/contractor to set up a
| custom model and get the best of both worlds -- not giving MSFT
| your company's IP while also improving your devs' productivity
| and happiness beyond what a generic product could deliver.
| cdolan wrote:
| I usually complain about "thought pieces" that push a product
| at the end.
|
| But now I realize I like that _a lot more_ than being aware
| that the article I'm reading is going to push me to take an
| action (start a discussion with my team) where a probable
| outcome is "enforce no Copilot on company machines".
|
| Sneaky! Good catch. The article should have a disclaimer at the
| bottom.
| freefaler wrote:
| There is some legal risk, but what percent of the code you write
| is potentially affected by audits before you sell it? So as a
| single developer you're trading a real productivity gain, and as
| a company lower costs, against a potential "liability" when
| you're selling the company. Looks like a good bet. A lot of code
| will be thrown out or never be sold to anyone.
| donatj wrote:
| A couple days ago I wrote a new class. Went to write a unit test,
| it wrote several hundred lines of functioning unit test for me.
| It's worth it.
| ralph84 wrote:
| "You might get sued if you use this software you paid for" is
| already covered via an indemnification clause in any reasonable
| enterprise software license agreement. I'm sure Microsoft/GitHub
| will be no different in indemnifying their customers who purchase
| Copilot.
| abelaer wrote:
| I have been writing my PhD thesis in VSCode with Copilot enabled,
| and it is absurdly good at suggestions in LaTeX, from generating
| tables to writing whole paragraphs of text in the discussion.
| eloff wrote:
| Please don't use Copilot; decide it's not worth the risk for your
| company. In the great competition that is the labor market,
| Copilot is giving me a leg up on everyone who isn't using it.
| It's the biggest single tool-based improvement to my productivity
| since JetBrains.
| smcleod wrote:
| I was sort of thinking the same thing - it's had such a
| positive impact on my productivity and time. If other people
| don't want to use it - don't, but you're not going to stop me,
| and it's only going to get better as more competition arises
| and we finally have decent on-device options.