|
| dr_dshiv wrote:
| \\ \\ \\ ,__, \\ (oo)____
| (__) )\\ ||--|| *
|
| "Draw an ASCII unicorn" (GPT4)
| Mystery-Machine wrote:
| Would be great if these days would have dates as well. Otherwise,
| there's little use of "Day 69". If I could see "Day 69 (June 21
| 2023)"
| thomasfromcdnjs wrote:
| Might as well make a Twitter account! or get AutoGPT to do it.
| bestcoder69 wrote:
| It can draw a penis fyi
| MH15 wrote:
| Would be useful if the prompts used to generate the drawing code
| were included in the site.
| abrichr wrote:
| They appear to be here:
|
| https://github.com/adamkdean/gpt-unicorn/blob/master/src/lib...
| { role: 'system', content: `You are a helpful assistant that
| generates SVG drawings. You respond only with SVG. You do not
| respond with text.` }, { role: 'user', content: `Draw
| a unicorn in SVG format. Dimensions: 500x500. Respond ONLY with
| a single SVG string. Do not respond with conversation or
| codeblocks.` }
| dmix wrote:
| Sadly it outputs raw svg code so you have to save it locally
| as .svg to see it. Or just insert it into an HTML page via
| devtools if you're lazy like me.
| bee_rider wrote:
| "You are a helpful assistant" seems like it is always
| included in these sort of prompts. I wonder if it really
| helps...
| LeoPanthera wrote:
| It's quite funny to tell it that it is an unhelpful
| assistant. During the first few responses it is amusingly
| obstinate.
|
| It always seems to revert back to "helpful assistant" after
| a few messages, whatever the prompt says.
| Guillaume86 wrote:
| It's too generic I think, my prompt immediately gave me a
| better result that the ones in his post:
| You are a SVG expert, when asked by the user to draw
| something, you reply to the best of your ability with SVG
| code that satisfies the request.
| ShamelessC wrote:
| As is noted in the paper from which this is inspired from:
| GPT-4's image generation capabilities were severely diminished by
| the instruction/safety-tuning process. Unfortunately this means
| the currently available model from the API won't be very capable
| - certainly not as capable as the early version of GPT-4 that
| Microsoft had access to.
|
| edit: I'm specifically referring to the "image generation by
| trickery (e.g. SVG)" technique being diminished. Other tasks were
| diminished as well though - is my understanding.
| og_kalu wrote:
| It's not just image generation the rlhf worsens too.
| Calibration (confidence on solving a question in relation to
| ability to solve that problem) went from excellent to non
| existent. and you can see from the report that the base model
| performed better on a number of tests. Basically a dumber
| model.
| tbalsam wrote:
| Not dumber. More biased.
|
| Important distinction, especially if we're looking to push
| back out towards the Pareto Frontier of the problem.
|
| RLHF is still very much in its infancy and does not maximize
| the bias-variance tradeoff by a long shot, in my personal
| experience.
| og_kalu wrote:
| No dumber. Sure more biased too if you want but also
| dumber. Open ai have indicated as much.
| psychphysic wrote:
| Also generally less creative and insightful.
|
| "No I won't do it" becomes a good option no matter what
| if you turn safety too high.
| ShamelessC wrote:
| My understanding is that OpenAI did indeed find diminished
| capability across a range of tasks after doing RLHF. You're
| correct to question this though - as I believe the opposite
| was true of GPT-3 where it improved certain tasks.
|
| The benefits from a business perspective were still clear
| however, and of course the instruction-tuned GPT-4 model
| still outperformed GPT-3, in general.
|
| There are probably some weird edge cases and nuances that
| I'm missing - and I'd be happy to be corrected.
| arthurcolle wrote:
| Are you saying this specifically for the GPT-4 API endpoint
| compared to idealized described GPT-4 from the paper?
| og_kalu wrote:
| yes the public api (or on paid chatGPT) vs the base model
| from the paper
| Varqu wrote:
| Is anyone else also getting tired of seeing "GPT" prefix / suffix
| in the name of 90% new AI-related products?
| mustacheemperor wrote:
| Given this is a process specifically to evaluate the changing
| performance of GPT-4 over time, it seems appropriate.
| squeaky-clean wrote:
| This isn't a new AI product. It's a (seemingly auto updating)
| blog entry about GPT-4
| ansk wrote:
| This is a great rorschach test. Show these four images to someone
| hyping AI, and if they see evidence of a growing/emerging
| intelligence, you can diagnose them as being wholly unqualified
| to comment on anything related to AI.
| syntaxing wrote:
| I don't get it, wouldn't something like HuggingGPT be able to
| command stable diffusion to do this? Just because GPT can't do
| this natively doesn't mean it's not possible with the right
| framework?
| ansk wrote:
| These images were all generated by an identical model. The
| fact that this individual has convinced themself that the
| model is improving indicates that they don't understand how
| these models are trained and deployed. Furthermore, any
| conclusions reached on such limited data reveal more about
| one's predisposed opinions than anything about the nature of
| the data. Show this person an ink blot and they very well may
| see an image of a superintelligent AGI.
| einpoklum wrote:
| Perhaps you should ask it to draw you a sheep.
| dangond wrote:
| > The idea behind GPT Unicorn is quite simple: every day, GPT-4
| will be asked to draw a unicorn in SVG format. This daily
| interaction with the model will allow us to observe changes in
| the model over time, as reflected in the output. Is it useful to
| do this every day? Correct me if I'm wrong, but my understanding
| is that OpenAI does not update the models available in production
| incrementally on a day-to-day basis.
| sacred_numbers wrote:
| They do update the model in the background, although I'm not
| sure how often or how much they update it. To avoid issues with
| this practice they offer gpt-4-0314 which says this in the
| documentation:
|
| "Snapshot of gpt-4 from March 14th 2023. Unlike gpt-4, this
| model will not receive updates, and will only be supported for
| a three month period ending on June 14th 2023."
|
| Unfortunately this experiment is using the frozen snapshot
| model gpt-4-0314 instead of the unfrozen gpt-4 or gpt-4-32k
| models, so any differences are literally 100% noise. This would
| be a somewhat interesting experiment if someone were to use an
| unfrozen model, though. I do appreciate the author for
| captioning the images with the exact model they used for
| generation so that this bug could be caught quickly.
|
| [0]https://platform.openai.com/docs/models/gpt-4
| charcircuit wrote:
| Similarly the quality of the model can't be judged with a
| single sample. These end up canceling out.
| sp0rk wrote:
| Did you generate a bunch all at once before starting to get some
| idea of what the natural variance looks like? I would think it's
| important to verify some level of progression over time, because
| with the current four it seems entirely possible that the
| examples could have all been generated at the same time with no
| changes to the model.
| gwern wrote:
| Also unclear if he's sampling at temp=0. Looks like he doesn't
| set a temp? https://github.com/adamkdean/gpt-
| unicorn/blob/8ad76ec7161682... So not sure what he's really
| doing.
| ratg13 wrote:
| Aren't they using the March 14 model like the general public?
|
| It's frozen in time, there are no updates to it..
|
| All of these will be drawn using the same model until they push
| a new update, or you switch to a different GPT
|
| But I already think they proved the point that the generation
| is random enough that it would be extremely difficult to track
| progress this way.
| williamstein wrote:
| GPT's output is by default somewhat random. If you ask the
| same exact question several times, you'll potentially get
| several different answers. Each successive word in the output
| is chosen from a distribution of possibilities -- that
| distribution is fixed, but that actual sample chosen from the
| distribution is not fixed. See, e.g.,
| https://platform.openai.com/docs/api-
| reference/completions/c...
| startupsfail wrote:
| Sampling a single noisy sample from a model that doesn't update
| that often is hardly correlated with the claim of "Daily
| exploration".
| dang wrote:
| The unicorn example is discussed at length in Bubeck's recent
| talk:
|
| https://www.youtube.com/watch?v=qbIk7-JPB2c#t=22m6s
| dmix wrote:
| Why would the model change over time when asking the same
| question? Just it's generation dataset for generating similar
| images? Or is this just tracking GPT's explicit model
| improvements over time?
| pps wrote:
| "GPT 5 Will be Released 'Incrementally' - 5 Points from
| Brockman Statement" -
| https://www.youtube.com/watch?v=1NAmLp5i4Ps
| atleastoptimal wrote:
| gpt-4-0314 is a snapshot model and won't be updated, they
| shouldn't use that for this experiment.
| tbalsam wrote:
| The models seem to have been changing in the background, though
| as another commenter pointed out.... having a variance-
| calibrayion baseline for humans would be great too. :'))))
| m3kw9 wrote:
| Are they banking on OpenAI updating their model every day, or
| just prompting the same thing everyday wishing for a different
| outcome?
| qumpis wrote:
| In the "sparks of AGI" paper, authors noted that the unicorn
| shape degrees as more "alignment" is injected to to. If openai
| adjust the model (say by training more), the picture should
| reflect it. If they make the model be more "aligned", it should
| reflect as well.
|
| So I'd guess the answer is the former.
| atleastoptimal wrote:
| if GPT-4 will update based on recent web training data, the fact
| that people are bringing much more attention to the "draw a
| unicorn" task magnifies the chance someone will have posted a
| perfect version of an svg unicorn, leading the model to leverage
| that rather than the aim of this experiment which I imagine is
| GPT-4's capacity to extrapolate.
|
| EDIT: Also it makes no sense to constantly retry it every day on
| the gpt-4-0314 model, since OpenAI specified that that is a
| snapshot model that will not be updated.
___________________________________________________________________
(page generated 2023-04-13 23:00 UTC) |