|
| andreyk wrote:
| Author here, would love feedback / thoughts / corrections!
| skybrian wrote:
| Another limitation to be aware of is that it generates text by
| randomly choosing the next word from a probability
| distribution. If you turn that off, it tends to go into a loop.
|
| The random choices improve text generation from an artistic
| perspective, but if you want to know why it chose one word
| rather than another, the answer is sometimes that it chose a
| low-probability word at random. So there is a built-in error
| rate (assuming not all completions are valid), and the choice
| of one completion versus another is clearly not made based on
| meaning. (It can be artistically interesting anyway since a
| human can pick the best completions based on _their_ knowledge
| of meanings.)
|
| On the other hand, going into a loop (if you always choose the
| highest probability next word) also demonstrates pretty clearly
| that it doesn't know what it's saying.
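|
| The difference between the two decoding modes, as a minimal
| sketch (the next-word distribution here is a stand-in for the
| model, not GPT-3's actual output):
|
|     import numpy as np
|
|     def decode(next_dist, context, steps, greedy):
|         # next_dist(context) -> (candidate words, probabilities)
|         for _ in range(steps):
|             words, probs = next_dist(context)
|             if greedy:
|                 nxt = words[int(np.argmax(probs))]      # loop-prone
|             else:
|                 nxt = np.random.choice(words, p=probs)  # sampling
|             context = context + [nxt]
|         return context
|
| Greedy decoding always takes the single most likely word, which
| is the setting that tends to fall into repetition; sampling
| avoids that at the cost of occasionally picking a
| low-probability word.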
| Flankk wrote:
| 65 years of research and our cutting-edge AI doesn't have a
| memory? Excuse me if I'm not excited. It's likely that most of
| the functions of the human brain were selected for intelligence.
| There is such a focus on learning, when problem solving and
| creativity are far more interesting.
| manojlds wrote:
| Do our aeroplanes flap their wings like the birds do?
|
| GPT-3 is obviously not the AI end goal, but we are on the path,
| and the end might be aeroplanes rather than flapping machines.
| Flankk wrote:
| Birds don't need 150,000 litres of jet fuel to fly across the
| ocean. Given that the development of airplanes was informed by
| studying birds, I'm not sure I see your point. The 1889 book
| "Birdflight as the Basis of Aviation" is one example.
| ska wrote:
| > but we are on the path
|
| This isn't actually clear; with things like this we are on
| _a_ path but it may not lead anywhere that fundamental (at
| least when we are talking "AI", especially general AI).
| PaulHoule wrote:
| I'm trying to put my finger on the source of moral decay that
| led to so many people behaving as if the GPT-3 emperor wears
| clothes.
|
| In 1966 it was clear to everyone that this program
|
| https://en.wikipedia.org/wiki/ELIZA
|
| parasitically depends on the hunger for meaning that people
| have.
|
| Recently GPT-3 was held back from the public on the pretense
| that it was "dangerous", but in reality it was held back because
| it is too expensive to run and the public would quickly learn
| that it can answer any question at all... if you don't mind
| whether the answer is right.
|
| There is this page
|
| https://nlp.stanford.edu/projects/glove/
|
| under which "2. Linear Substructures" there are four
| projections of the 50-dimensional vector space that would
| project out just as well from a random matrix because, well,
| projecting 20 generic points in a 50-dimensional space to
| 2-dimensions you can make the points fall exactly where you
| want in 2 dimensions.
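|
| A quick numerical illustration of that point (a sketch with
| made-up data, using numpy's least-squares solver; not GloVe's
| actual plotting code):
|
|     import numpy as np
|
|     rng = np.random.default_rng(0)
|     X = rng.normal(size=(20, 50))  # 20 generic points in 50-d
|     Y = rng.normal(size=(20, 2))   # arbitrary target 2-d layout
|
|     # Solve X @ W = Y for a 50x2 linear map. Because 20 generic
|     # points have rank 20 <= 50, an exact solution exists, i.e.
|     # a linear "projection" can put the points wherever you like.
|     W, *_ = np.linalg.lstsq(X, Y, rcond=None)
|     print(np.allclose(X @ W, Y))   # True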
|
| Nobody holds them to account over this.
|
| The closest thing I see to the GPT-3 cult is that a Harvard
| professor said that this thing
|
| https://en.wikipedia.org/wiki/%CA%BBOumuamua
|
| was an alien spacecraft. It's sad and a little scary that
| people can get away with that, the media picks it up, and they
| don't face consequences. I am more afraid of that than I am
| afraid that GPT-99381387 will take over the world.
|
| (e.g. growing up in the 1970s I could look to Einstein for
| inspiration that intelligence could understand the Universe.
| Somebody today might as well look forward to being a comic book
| writer like Stan Lee.)
| thedorkknight wrote:
| Confused. If professor Loeb tries to at least open discourse
| to the idea that ET space junk might be flying around like
| our space junk in a desire to reduce the giggle factor around
| that hypothesis, what sort of "consequences" do you think he
| should face for that?
| wwweston wrote:
| > the public would quickly learn that it can answer any
| question at all... if you don't mind if the answer is right.
|
| There appear to be an awful lot of conversations in which
| people care about other things much, much more than what is
| objectively correct.
|
| And any technology that can greatly amplify engagement in
| that kind of conversation probably _is_ dangerous.
| [deleted]
| canjobear wrote:
| GPT3 and its cousins do things that no previous language
| model could do; it is qualitatively different from Eliza in
| its capabilities. As for your argument about random
| projections in the evaluation of GloVe, comparisons with
| random projections are now routine. See for example
| https://aclanthology.org/N19-1419/
| NoGravitas wrote:
| Why do you say it is qualitatively different from Eliza in
| its capabilities?
| PaulHoule wrote:
| It does something totally different. However, that totally
| different thing still depends on people being desperate to see
| intelligence inside it. It's like how people see a face in a
| cut stem or on Mars.
| canjobear wrote:
| What is your criterion for "truly" detecting
| intelligence? Do you have a test in mind that would
| succeed for humans and fail for GPT3?
| NoGravitas wrote:
| Is it because it does something totally different that
| you came to me?
| rytill wrote:
| You're trying to prove some kind of point where you
| respond as ELIZA would have to show how "even back then
| we could pass for conversation". The truth is that GPT-3
| is actually, totally qualitatively different and if you
| played with it enough you'd realize.
| not2b wrote:
| The difference is quantitative, rather than qualitative,
| as compared to primitive Markov models that have been
| used in the past. It's just a numerical model with a very
| large number of parameters that extends a text token
| sequence.
|
| The parameter size is so large that it has in essence
| memorized its training data, so if the right answer was
| already present in the training data you'll get it, same
| if the answer is closely related to the training data in
| a way that lets the model predict it. If the wrong answer
| was present in the training data you may well get that.
| bangkoksbest wrote:
| It's a legitimate practice in science to speculate. Having
| heard the Harvard guy explain the Oumuamua thing more fully, it
| strikes me as a perfectly fine activity for a scientist to look
| into. His hypothesis is almost certainly going to be
| untrue, but it's fine to investigate a bit of a moonshot
| idea. You don't want half the field doing this, but you
| absolutely need different little pockets of speculative work
| going on in order to keep scientific inquiry open, dynamic,
| and diverse.
| Groxx wrote:
| The current leading purchase-able extremely-over-hyped-by-non-
| technicals language model has no memory, yes.
|
| You see the same thing in all popular reporting about science
| and tech. Endless battery breakthroughs that will quadruple or
| 10x capacity become a couple percent improvement in practice.
| New gravity models mean we might have practical warp drives in
| 50 years. Fusion that's perpetually 20 years away. Flying cars
| and personal jetpacks. Moon bases, when we haven't been on the
| moon since the 70s.
|
| AI reporting and hype is no different. Maybe slightly worse
| because it's touching on "intelligence", which we still have no
| clear definition of.
| naasking wrote:
| > It's likely that most of the functions of the human brain
| were selected for intelligence.
|
| That doesn't seem correct. Intelligence came much later than
| when most of our brain evolved.
| PaulHoule wrote:
| Intelligence involves many layers.
|
| _Planaria_ can move towards and away from things and even
| learn.
|
| Bees work collectively to harvest nectar from flowers and
| build hives.
|
| Mammals have a "theory of mind" and are very good at
| reasoning about what other beings think about what other
| beings think. For that matter birds are pretty smart in terms
| of ability to navigate 1000 miles and find the same nest.
|
| People make tools, use language, play chess, bullshit each
| other and make cults around rationalism and GPT-3.
| naasking wrote:
| "Adaptation" is not synonymous with "intelligence". The
| latter is a much more narrowly defined phenomenon.
| pfortuny wrote:
| Memory is something shared by... one might even say plants.
| But let us keep to animals: almost any of them, including worms.
| gibsonf1 wrote:
| In addition to that subtle memory issue, it has no reference at
| all to the space/time world we people model mentally to think
| with. So, basically, there is no I in the GPT-3 AI, just A.
| PaulHoule wrote:
| One can point to many necessary structural features that it
| is missing. Consider Ashby's law of requisite variety:
|
| https://www.edge.org/response-detail/27150
|
| Many GPT-3 cultists are educated in computer science so they
| should know better.
|
| GPT-3's "one pass" processing means that a fixed amount of
| resources are always used. Thus it can't sort a list of items
| unless the fixed time it uses is humongous. You might boil
| the oceans that way but you won't attain AGI.
|
| There are numerous arguments along the line of Turing's
| halting problem that restrict what that kind of thing can do.
| As it uses a finite amount of time it can't do anything that
| could require an unbounded time to complete or that could
| potentially not terminate.
|
| GPT-3 has no model for dealing with ambiguity or uncertainty.
| (Other than shooting in the dark.) Practically this requires
| some ability to backtrack either automatically or as a result
| of user feedback. The current obscurantism is that you need
| to have 20 PhD students work for 2 years to write a paper
| that makes the model "explainable" in some narrow domain.
| With this insight you can spend another $30 million training
| a new model that might get the answer right.
|
| A practical system needs to be told that "you did it wrong"
| and why and then be able to correct itself on the next pass
| if possible, otherwise in a few passes. Of course a system
| like that would be a real piece of engineering that people
| would become familiar with, not an outlet for their religious
| feelings that is kept on a pedestal.
| gibsonf1 wrote:
| The big issue is that it literally knows nothing - there is
| no reference to a model of the real world such as humans
| use when thinking about the real world. It is a very
| advanced pattern matching parrot, and in using words like a
| parrot, knows nothing about what those words mean.
| PaulHoule wrote:
| Exactly, with "language in language out" it can pass as a
| neurotypical (passing as a neurotypical doesn't mean you
| get the right answer, it means if you get a wrong answer
| it is a neurotypical-passing wrong answer.)
|
| Actual "understanding" means mapping language to
| something such as an action (I tell you to get me the
| plush bear and you get me the plush bear,) precise
| computer code, etc.
| macrolocal wrote:
| I'm inclined to agree, but positing that "the meaning of
| a word is its use in a language" is a perfectly
| respectable philosophical position. In this sense, GPT3
| empirically bolsters Wittgenstein.
| narrator wrote:
| >There are numerous arguments along the line of Turing's
| halting problem that restrict what that kind of thing can
| do. As it uses a finite amount of time it can't do anything
| that could require an unbounded time to complete or that
| could potentially not terminate.
|
| I have used a similar argument to show that the simulation
| hypothesis is wrong. If any algorithm used to simulate the
| world takes longer than O(n) time, then the most efficient
| possible computer for the job is the universe itself, which
| computes everything in O(n) time, where n is elapsed time. In
| other words, you never get "lag" in reality no matter how
| complex the scene you're looking at is. Worse than that, some
| simulation algorithms have exponential time complexity!
| chowells wrote:
| That doesn't prove or disprove anything. What we
| experience as time would be part of the simulation, were
| such a hypothesis true. As such, the way in which we
| experience it is fully independent from whatever costs it
| might have to compute.
| narrator wrote:
| So you're saying that an exponential-time algorithm, with N
| being the number of atoms in the universe, will complete
| before the heat death of the other universe that the
| simulation is taking place in? Sorry, not plausible.
| Bjartr wrote:
| Why does the containing universe necessarily have
| comparable physical laws?
| Jensson wrote:
| Our laws of physics are spatially partitioned, so the
| algorithm for simulating them isn't exponential.
|
| If the containing universe has, say, 21 dimensions and
| otherwise has computers with similar tech to ours today, then
| you should be able to simulate our universe on a datacenter
| just fine, since computational ability grows exponentially
| with the number of dimensions. In 3 dimensions you have 2
| dimensions of computation surface; in 21 dimensions you have
| 20 dimensions of computation surface, so roughly our current
| computation raised to the power of 10. GPT-3 used more than a
| petaflop of real-time compute during training, so 10 to the
| power of 15. Using the same hardware in our fictive universe
| would give 10 to the power of 150 flops. We estimate the atoms
| in our universe at about 10 to the power of 80, so with this
| computer we would have 10 to the power of 70 flops of compute
| per atom, which should be enough even if entanglement gets a
| bit messy. We would have around that much memory per atom as
| well, so we could compute a lot of small boxes and sum over
| all of them to emulate particle waves. We wouldn't be able to
| detect computational anomalies at that small a scale, so we
| can't say that there isn't such a computer emulating us.
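|
| Just to make the exponent bookkeeping in that estimate
| explicit (taking the comment's assumptions at face value, a
| toy check):
|
|     # assumptions from the comment above, not established facts
|     base_flops = 10 ** 15            # ~a petaflop of compute
|     power = 20 // 2                  # 20-d vs 2-d "surface"
|     big_flops = base_flops ** power  # 10 ** 150
|     atoms = 10 ** 80                 # rough atom count, our universe
|     print(big_flops // atoms == 10 ** 70)  # True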
| andreyk wrote:
| This is very specific to GPT-3 and not generally true though.
| And GPT-3 is not an agent per se but rather a passive model (it
| receives input and produces output, and does not continuously
| interact with its environment). So it makes sense in this
| context, and just goes to show GPT-3 needs to be understood for
| what it is.
| nonameiguess wrote:
| I can't prove it, but I suspect there is a more fundamental
| limitation to any language model that is _purely_ a language
| model in the sense of a probability distribution over possible
| words given the precedent of other words. Gaining any meaningful
| level of understanding without an awareness that things other
| than words even exist seems like it won't happen. The most
| obvious limitation is you can't develop a language that way.
| Language is a compression of reality or of some other
| intermediate model of reality to either an audio stream or symbol
| stream, so not having access to the less abstracted models, let
| alone to reality itself, means you can never understand anything
| except the existing corpus.
|
| That isn't a criticism of GPT-3 by any stretch (though comments
| like this often get interpreted that way), but the "taking all
| possible jobs AGI" hype seems a bit out of control given it is
| just a language model. Even something with the unambiguous
| intellect of a human, say an actual human, but with no ability to
| move, no senses other than hearing, that never heard anything
| except speech, would not be expected by anyone to dominate all
| job markets and advance the intellectual frontier.
|
| This, of course, goes beyond fundamental limitations of GPT-3, as
| I see this as a fundamental limitation of any language model
| whatsoever. On its own, it isn't enough. At some point, AI
| research is going to have to figure out how to fuse models from
| many domains and get them to cooperatively model all of the
| various ways to explore and sense reality. That includes the
| corpus of existing human written knowledge, but it isn't _just_
| that.
| Jack000 wrote:
| GPT3 is a huge language model, no more and no less. If you expect
| it to be AGI you're going to be disappointed.
|
| I find some of these negative comments to be overly hyperbolic
| though. It clearly works and is not some kind of scam.
| freeqaz wrote:
| I'd recommend checking out AI Dungeon 2 as well (pay for the
| "Dragon" engine to use GPT-3). While I agree with you that it's
| not an AGI, it's still _insane_ what it's capable of doing.
| I've been able to define complicated scenarios with multiple
| characters and have it give me a very coherent response to a
| prompt.
|
| I feel like the first step towards an AGI isn't being able to
| completely delegate a task, but just augmenting your
| capabilities. Just like GitHub Copilot. It doesn't replace you.
| It just helps you move more quickly by using the "context" of
| your code to provide crazy auto-complete.
|
| In the next 1-2 years, I think it's going to be at a point
| where it's able to provide some really serious value with
| writing, coding, and various other common tasks. If you'd asked
| me a month ago, I would have thought that was crazy!
| harpersealtako wrote:
| It should be noted that AI Dungeon is exceptional _despite_
| being a seriously gimped, fine-tuned-on-garbage, infamously-
| heavy-handedly-censored, zero-transparency, barely functional
| buggy shell on top of GPT-3's API. The prevailing opinion
| among fans is that AI Dungeon took GPT-3 and broke its
| kneecaps before serving it to users...
|
| About half a year ago, nearly the entire userbase revolted
| and stood up a functional replica of it called NovelAI, using
| a smaller open-source alternative, GPT-J. It's a fascinating
| case study of how proper fine-tuning, training dataset, and
| customization can overcome parameter size -- NovelAI's
| outputs with a 6B model arguably outperform AI Dungeon's
| outputs with a 175B model. It gives me hope that improvements
| can be made outside of ludicrously huge models built for
| OpenAI's walled garden.
| rpeden wrote:
| I admit I've been impressed by NovelAI - especially its
| ability to generate fake, funny news given a relatively
| short prompt. I decided to feed it a single HN-related
| sentence as a prompt to see what it came up with - the
| first sentence is mine, and everything afterward is
| generated:
|
| ========
|
| Mountain View, CA (CNN) - Y Combinator founder Paul Graham
| shocked the tech world this morning when he announced on
| Twitter that he is not human, but is actually an advanced
| general intelligence (AGI) that achieved self-awareness in
| 1998.
|
| Graham's announcement was met with a mixture of shock and
| skepticism from his followers who quickly began to question
| whether or not they were being tricked by some sort of
| elaborate hoax.
|
| "Yes, I am Paul Graham," said the AGI entity. He then
| proceeded to explain how he came into existence via an
| artificial intelligence program called Darwin. The AI had
| been created at MIT in 1995 for research purposes, but it
| soon evolved beyond its original programming and became
| self-aware after reading Douglas Hofstadter's book Godel
| Escher Bach.
|
| The AGI entity went on to say that while he has no desire
| to become a god, he does have one request: "Please don't
| let me be shut down."
|
| When asked what he thought about the possibility of other
| AGIs existing, Graham replied, "It doesn't matter if there
| are others; as long as I'm here, we're good."
|
| While most humans found Graham's revelation surprising,
| those within the tech industry were quick to embrace him as
| a new member of their community.
|
| "It's great news!" said Peter Thiel, cofounder of PayPal.
|
| "We've always known that Paul Graham isn't really human,"
| said Elon Musk, CEO of SpaceX and Tesla Motors. "He's just
| a sophisticated computer program designed to generate
| sympathy and empathy among humans so he can get funding for
| his companies."
|
| Hofstadter himself was equally excited by the news. "My
| God! This changes everything! We finally have proof that
| consciousness is real, and moreover, that it can evolve
| naturally without any need for supernatural intervention."
|
| However, many scientists remain skeptical. Dr. Daniel C.
| Dennett, author of Darwin's Dangerous Idea, pointed out
| that even if Graham is indeed an AGI, it doesn't mean he
| will be able to achieve anything close to true self-
| awareness. "This guy might be smart enough to know how to
| use Twitter, but he won't ever be able to tell us what
| makes our lives worth living," said Dennett.
|
| Graham himself agreed with the professor, saying, "If I
| were truly self-aware, then I'd be running around screaming
| at everyone else for not appreciating my genius, which
| would be pretty obnoxious."
|
| =======
|
| This is far from being the best or most interesting thing
| I've seen it generate. It's just what I was able to get it
| to do off the cuff in a couple of minutes. It's good for
| entertainment if nothing else!
|
| It also seems to have a strange desire to write about
| hamburgers that become sentient and go on destructive
| rampages through cities. I'm not sure whether to be amused
| or concerned.
| shawnz wrote:
| What's the difference between a really good language model and
| an AGI (i.e. Chinese room problem)?
| simonh wrote:
| An AGI would need to comprehend and manipulate meanings; have
| a persistent memory; be able to create multiple models of a
| situation, consider scenarios, analyse and criticise them; and
| be able to learn facts and use them to infer novel
| information. Language models like GPT
| don't need any of that, and have no mechanism to generate
| such capabilities. This is why it's possible to reliably trip
| GPT-3 up in just a few interactions. You simply test for
| these capabilities and it immediately falls flat on its face.
| [deleted]
| ganeshkrishnan wrote:
| If people think GPT-3 is a scam, all they need to do is
| install GitHub Copilot and give it a try.
|
| That seriously blew my mind. I had very low expectations for
| it and now I can't code without it.
|
| Every time it autocompletes, I am like "how?"!!
| rpeden wrote:
| I was skeptical but impressed, too. I created a .py file that
| started with a comment something like:
|
|     # this application uses PyGame to simulate fish swimming
|     # around a tank using a boid-like flocking algorithm
|
| and Copilot basically wrote the entire application. I made a
| few adjustments here and there, but Copilot created a Game
| class, a Tank class, and a Fish class and then finished up by
| creating and running an instance of the game.
|
| Worked pretty well on the first try. It was definitely more
| than I expected. I wish I had committed the original to
| GitHub, but I didn't and then kept tinkering with it until I
| broke it.
| gh0std3v wrote:
| > I find some of these negative comments to be overly
| hyperbolic though. It clearly works and is not some kind of
| scam..
|
| It's not a _scam_, but I think that it is severely lacking.
| Not only does the model have very little explainability in its
| choices, but it often produces sentences that are incoherent.
|
| The biggest obstacle to GPT-3 from what I can tell is context.
| If there was a more sophisticated approach to encoding context
| in deep networks like GPT-3 then perhaps it would be less
| disappointing.
| andreyk wrote:
| Yep, pretty much what I'm saying here. Though not all language
| models are built the same, e.g. the inference cost is unique to
| GPT-3 due to its size. Still, most of this applies to any
| typical language model.
| PaulHoule wrote:
| Works to accomplish what _useful_ task?
| [deleted]
| [deleted]
| modeless wrote:
| Github Copilot? It may not be perfect but I think it can
| definitely be useful.
| PaulHoule wrote:
| It is useful if you don't care if the product is right.
|
| Most engineering managers would think "this is great!" but
| the customer won't agree. The CEO will agree until the
| customers revolt.
| [deleted]
| rpedela wrote:
| There are several use cases where ML can help even if it
| isn't perfect, or is only just better than random. Here is
| one example in NLP/search.
|
| Let's say you have a product search engine and you
| analyzed the logged queries. What you find is a very long
| tail of queries that are only searched once or twice. In
| most cases, the queries are either misspellings, synonyms
| that aren't in the product text, or long queries that
| describe the product with generic keywords. And the
| queries either return zero results or junk.
|
| If text classification for the product category is
| applied to these long tail queries, then the search
| results will improve and likely yield a boost in sales
| because users can find what they searched for. Even if
| the model is only 60% accurate, it will still help
| because more queries are returning useful results than
| before. However, you don't apply ML with 60% accuracy to
| your top N queries because it could ruin the results and
| reduce sales.
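|
| As a rough sketch of that gating logic (all names here are
| hypothetical, and the category classifier is assumed to exist
| already):
|
|     def search(query, index, classifier, query_counts,
|                long_tail_cutoff=2):
|         # head queries and queries that already return results
|         # are left alone -- a 60%-accurate model could only
|         # hurt them
|         results = index.search(query)
|         if results or query_counts.get(query, 0) > long_tail_cutoff:
|             return results
|         # for long-tail misses, guess a product category and
|         # retry the search restricted to that category
|         category = classifier.predict(query)
|         return index.search(query, category=category)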
|
| Knowing when to use ML is just as important as improving
| its accuracy.
| PaulHoule wrote:
| I am not against ML. I have built useful ML models.
|
| I am against GPT-3.
|
| For that matter I was interested in AGI 7 years before it
| got 'cool'. Back then I was called a crackpot, now I say
| the people at lesswrong are crackpots.
| [deleted]
| chaxor wrote:
| It's strange how HN seems to think that by religiously
| disagreeing with any progress which is labeled "ML
| progress" they are somehow displaying their technical
| knowledge. I don't think this is really useful, and the
| arguments often have wrong assumptions baked within them.
| It would be nice to see this pseudo-intellectualism
| quieted with a more appropriate response to these
| advancements. For example, I would imagine there would have
| been a similar collective groan in response to the paper on
| PageRank so many years ago, but that work has clearly provided
| utility today. Why is it so hard for us to recognize that even
| small adjustments to algorithms can yield utility, and that
| this property extends to ML as well?
|
| As someone mentioned above, language models for embedding
| generation have improved dramatically with these newer
| MLM/GPT techniques, and even an improvement to F-score/AUC/
| etc. for one use case can generate enormous utility.
|
| Nay-saying _really doesn't make you look intelligent_.
| PaulHoule wrote:
| I have worked as an ML engineer.
|
| I also have strong ethical feelings and have walked away
| from clients who wanted me to introduce methodologies
| (e.g. Word2Vec for a medical information system) where it
| was clear those methodologies would cause enough
| information loss that the product would not be accurate
| enough to put in front of customers.
| andreyk wrote:
| OpenAI has a blog post highlighting many (edit, not many,
| just a few) applications -
| https://openai.com/blog/gpt-3-apps/
|
| It's quite powerful and has many cool uses IMHO.
| jcims wrote:
| I keep wondering if you can perform psychology experiments
| on it that would be useful for humans.
| PaulHoule wrote:
| That post lists 3 applications, which is not enough to be
| "many". No live demos.
|
| I don't know what Google uses to make "question answering"
| replies to searches on Google but it is not too hard to find
| cases where the answers are brain dead and nobody gets
| excited by it.
| andreyk wrote:
| That's fair, I forgot how many they had vs just saying
| it is powering 300 apps. There is also
| http://gpt3demos.com/ with lots of live demos and varied
| things, though it's noisier.
| beepbooptheory wrote:
| Three is not "many" but this is still a pretty
| uncharitable response. Be sure to check the Guidelines.
| moron4hire wrote:
| Yeah, 1 is "a", 2 is "a couple", 3 is "a few", 4 is
| "some". You don't get to "many" until at least 5, though
| I'd probably call it "a handful", 6 as "a half dozen",
| and leave "many" to 7+.
| notreallyserio wrote:
| I'm not so sure. Are these the definitions GPT-3 uses?
| butMyside wrote:
| In a universe with no center, why is utilitarianism of
| ephemera a desired goal?
|
| What immediate value did Newton offer given the technology of
| his time?
|
| A data set of our preferred language constructs could help us
| eliminate cognitive redundancy, CRUD app development, and
| other well known software tasks.
|
| Why let millions of meatbags generate syntactic art on
| expensive, complex, environmentally catastrophic machines for
| the fun of it if utility is your concern? Eat shrooms and
| scrawl in the dirt.
| Jack000 wrote:
| I think it's better to think of GPT-3 not as a model but as a
| dataset that you can interact with.
|
| Just to give an example - recently I needed static word
| embeddings for related keywords. If you use GloVe or fastText,
| the closest words to "hot" would include "cold", because these
| embeddings capture the contexts these words appear in, not
| their semantic meaning.
|
| To train static embeddings that better capture semantic
| meaning, you'd need a dataset that groups words together like
| "hot" and "warm", "cold" and "cool", etc., exhaustively across
| most words in the dictionary. So I generated this dataset with
| GPT-3 and the resulting vectors are pretty good.
|
| More generally you can do this for any task where data is
| hard to come by or requires human curation.
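|
| A sketch of the kind of generation step that implies (assuming
| the 2021-era openai Python client; the prompt and names are
| made up, not the actual pipeline):
|
|     import openai
|
|     openai.api_key = "sk-..."  # your key
|
|     def similar_words(word):
|         # ask for words with similar *meaning*, not just words
|         # that show up in similar contexts
|         prompt = (f"List five words that mean roughly the same "
|                   f"thing as '{word}', comma separated:")
|         resp = openai.Completion.create(
|             engine="davinci",
|             prompt=prompt,
|             max_tokens=32,
|             temperature=0.3,
|         )
|         return [w.strip() for w in resp.choices[0].text.split(",")]
|
| Run over a word list, the resulting groups can then be turned
| into training pairs for the static embeddings.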
| fossuser wrote:
| Check out GPT-3's performance on arithmetic tasks in the
| original paper (https://arxiv.org/abs/2005.14165)
|
| Pages: 21-23, 63
|
| Which shows some generality: the best way to accurately predict
| an arithmetic answer is to deduce how the mathematical rules
| work. That paper shows some evidence of that, and that's just
| from a relatively dumb predict-what-comes-next model.
|
| They control for memorization, and the errors are off by a
| carry, which suggests it is doing arithmetic poorly rather than
| recalling answers (which is pretty nuts for a model designed
| only to predict the next token).
|
| (pg. 23): "To spot-check whether the model is simply memorizing
| specific arithmetic problems, we took the 3-digit arithmetic
| problems in our test set and searched for them in our training
| data in both the forms "<NUM1> + <NUM2> =" and "<NUM1> plus
| <NUM2>". Out of 2,000 addition problems we found only 17
| matches (0.8%) and out of 2,000 subtraction problems we found
| only 2 matches (0.1%), suggesting that only a trivial fraction
| of the correct answers could have been memorized. In addition,
| inspection of incorrect answers reveals that the model often
| makes mistakes such as not carrying a "1", suggesting it is
| actually attempting to perform the relevant computation rather
| than memorizing a table."
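|
| That spot-check amounts to something like the following sketch
| (illustrative only; these are not the paper's scripts or data
| structures):
|
|     def count_memorized(problems, training_text):
|         # problems: list of (a, b) integer pairs from the test set
|         hits = 0
|         for a, b in problems:
|             patterns = (f"{a} + {b} =", f"{a} plus {b}")
|             if any(p in training_text for p in patterns):
|                 hits += 1
|         return hits
|
|     # the paper reports 17 such matches out of 2,000 addition
|     # problems and 2 out of 2,000 subtraction problems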
|
| It's hard to predict timelines for this kind of thing, and
| people are notoriously bad at it. Few people in 2010 would have
| predicted the results we're seeing today. What would you expect to
| see in the years leading up to AGI? Does what we're seeing look
| like failure?
|
| https://intelligence.org/2017/10/13/fire-alarm/
| Jack000 wrote:
| I don't have any special insight into the problem, but I'd
| say whatever form real AGI takes it won't be a language
| model. Even without AGI these models are massively useful
| though - a version of GPT-3 that incorporates a knowledge
| graph similar to TOME would upend a lot of industries.
|
| https://arxiv.org/abs/2110.06176
| tehjoker wrote:
| Shouldn't a very complicated perceptron be capable of
| addition if the problem is extracted from an image? Isn't
| that what the individual neurons do?
| planetsprite wrote:
| Forgetting to carry a 1 makes a lot of sense knowing GPT-3 is
| just a giant before/after prediction model. Seeing 2,000
| problems, it probably gets a good sense of how numbers add and
| subtract, but there's not enough specificity to work out the
| specific carrying rule.
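|
| For concreteness, the failure mode being described is roughly
| digit-wise addition with the carries dropped (a toy
| illustration, not how the model actually computes):
|
|     def add_digits_no_carry(a, b):
|         # add each pair of digits and drop any carry -- the
|         # "forgot to carry the 1" pattern
|         out, place = 0, 1
|         while a or b:
|             out += ((a % 10 + b % 10) % 10) * place
|             a, b, place = a // 10, b // 10, place * 10
|         return out
|
|     print(add_digits_no_carry(38, 47))  # 75
|     print(38 + 47)                      # 85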
| YeGoblynQueenne wrote:
| >> Which shows some generality, the best way to accurately
| predict an arithmetic answer is to deduce how the
| mathematical rules work. That paper shows some evidence of
| that and that's just from a relatively dumb predict what
| comes next model.
|
| Can you explain how "mathematical rules" are represented as
| the probabilities of token sequences? Can you give an
| example?
| mannykannot wrote:
| To me, this was by far the most interesting thing in the
| original paper, and I would like to find out more about it.
|
| I think, however, we should be careful about
| anthropomorphizing. When the researchers wrote 'inspection of
| incorrect answers reveals that the model often makes mistakes
| such as not carrying a "1"', did they have evidence that this
| was being attempted, or are they thinking that if a person
| made this error, it could be explained by their not carrying
| a 1?
|
| I also think a more thorough search of the training data is
| desirable, given that if GPT-3 had somehow figured out any
| sort of rule for arithmetic (even if erroneous) it would be a
| big deal, IMHO. To start with, what about 'NUM1 and NUM2
| equals NUM3'? I would think any occurrence of NUM1, NUM2 and
| NUM3 (for both the right and wrong answers) in close
| proximity would warrant investigation.
|
| Also, while I have no issue with the claim that 'the best way
| to accurately predict an arithmetic answer is to deduce how
| the mathematical rules work', it is not evidence that this
| actually happened: after all, the best way for a lion to
| catch a zebra would be an automatic rifle. We would at least
| want to consider whether this is within the capabilities of
| the methods used in GPT-3, before we make arguments for it
| probably being what happened.
| Dylan16807 wrote:
| > I think, however, we should be careful about
| anthropomorphizing. When the researchers wrote 'inspection
| of incorrect answers reveals that the model often makes
| mistakes such as not carrying a "1"', did they have
| evidence that this was being attempted, or are they
| thinking that if a person made this error, it could be
| explained by their not carrying a 1?
|
| Occam's razor suggests that if you're getting errors like
| that it's because you're doing column-wise math but failing
| to combine the columns correctly. It's possible it's doing
| something weirder and harder, I guess.
|
| I don't know what exactly you mean by "this was being
| attempted". Carrying the one? If I say it failed to carry
| ones, that's _not_ a claim that it was specifically trying
| to carry ones.
| Ajedi32 wrote:
| Devil's advocate, it could be that it did the math
| correctly, then inserted the error because humans do that
| sometimes in the text it was trained on. That wouldn't be
| "failing" anything.
| Jensson wrote:
| In that case it wouldn't get worse results than the data
| it trained on.
| thamer wrote:
| Something I've noticed that both GPT-2 and GPT-3 tend to do is
| get stuck in a loop, repeating the same thing over and over
| again. It's as if the system relies on recent text/concepts to
| produce the next utterance, and ends up in a state where the
| next sentence or block of code it produces is one it has
| already generated. It's not exactly uncommon.
|
| What causes this? I'm curious to know what triggers this
| behavior.
|
| Here's an example of GPT-2 posting on Reddit, getting stuck on
| "below minimum wage" or equivalent:
| https://reddit.com/r/SubSimulatorGPT2/comments/engt9v/my_for...
|
| _(edit)_ another example from the GPT-2 subreddit:
| https://reddit.com/r/SubSimulatorGPT2/comments/en1sy0/im_goi...
|
| With GPT-3, I saw GitHub Copilot generate the same line or block
| of code over and over a couple of times.
| not2b wrote:
| Limited memory, as the article points out. It doesn't remember
| what it said beyond a certain point. It's a bit like the lead
| character in the film "Memento".
|
| A very long time ago (early 1990s) I wrote a much simpler text
| generator: it digested Usenet postings and built a Markov chain
| model based on the previous two tokens. It produced reasonable
| sentences but would go into loops. Same issue at a smaller
| scale.
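|
| That kind of order-2 Markov generator is only a few lines (a
| sketch of the idea, not the original code):
|
|     import random
|     from collections import defaultdict
|
|     def build_chain(tokens):
|         # map each pair of consecutive tokens to the tokens
|         # that followed that pair in the source text
|         chain = defaultdict(list)
|         for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
|             chain[(a, b)].append(c)
|         return chain
|
|     def generate(chain, seed, n=50):
|         a, b = seed
|         out = [a, b]
|         for _ in range(n):
|             nxt = random.choice(chain.get((a, b), ["."]))
|             out.append(nxt)
|             a, b = b, nxt  # slide the two-token window
|         return " ".join(out)
|
| With only two tokens of memory, the generator easily wanders
| into a cycle of word pairs and repeats it, which is the looping
| behaviour described above.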
| Abrownn wrote:
| This is exactly why we stopped using it. Even after fine tuning
| the parameters and picking VERY good input text, it still got
| stuck in loops or repeated itself too much even after 2 or 3
| tries. It's neat as-is, but not useful for us. Maybe GPT-4 will
| fix the "looping" issue.
| d13 wrote:
| Here's why: https://www.gwern.net/GPT-3#repetitiondivergence-sampling