[HN Gopher] Generate images fast with SD 1.5 while typing on Gradio
___________________________________________________________________
 
Generate images fast with SD 1.5 while typing on Gradio
 
Author : smusamashah
Score  : 76 points
Date   : 2023-11-12 15:54 UTC (7 hours ago)
 
web link (twitter.com)
w3m dump (twitter.com)
 
| Der_Einzige wrote:
| The fact that LCM loras turn regular SD models into pseudo-LCM
| models is insane.
| 
| Most people in the AI world don't understand that ML is like
| actual alchemy. You can merge models like they are chemicals. A
| friend of mine called it "a new chemistry of ideas" upon seeing
| many features in Automatic1111 (including model and token merges)
| used simultaneously to generate unique images.
| 
| Also, loras exist on a spectrum based on their dimensionality.
| Tiny loras should only be capable of relatively tiny changes. My
| guess is that this is a big lora, nearly the same size as the
| base checkpoint.
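
A minimal sketch of what "merging models like chemicals" usually means
in practice: a weighted interpolation of two checkpoints' weights. The
file names and the blend factor below are illustrative, and real tools
(e.g. A1111's checkpoint merger) handle mismatched keys, VAE weights
and more.

    # Sketch of a weighted merge of two SD 1.5 finetunes that share
    # the same architecture. Paths and alpha are placeholders.
    import torch
    from safetensors.torch import load_file, save_file

    alpha = 0.35  # how much of model B to blend in
    a = load_file("model_a.safetensors")
    b = load_file("model_b.safetensors")

    merged = {}
    for key, tensor_a in a.items():
        if (key in b and b[key].shape == tensor_a.shape
                and torch.is_floating_point(tensor_a)):
            # linear interpolation of matching float tensors
            merged[key] = (1.0 - alpha) * tensor_a + alpha * b[key]
        else:
            # keep model A's tensor when B has no matching weight
            merged[key] = tensor_a

    save_file(merged, "merged.safetensors")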
 
  | keonix wrote:
  | Wait until you hear about frankenmodels. You rip parts of one
  | model (often attention heads) and transplant them into another,
  | and somehow that produces coherent results! Witchcraft
  | 
  | https://huggingface.co/chargoddard
 
    | GaggiX wrote:
    | >somehow that produces coherent results
    | 
    | with or without finetuning? Also is there a practical
    | motivation for creating them?
 
      | keonix wrote:
      | > with or without finetuning?
      | 
      | With, but it's still bonkers that it works so well
      | 
      | >Also is there a practical motivation for creating them?
      | 
      | You could get in-between model sizes (like 20b instead of
      | 13b or 34b). Before better quantization, it was useful for
      | inference (if you were unlucky with VRAM size), but now I
      | see this being useful only for training, because you can't
      | train on quants
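
A rough sketch of the layer-splicing ("frankenmerge") idea being
described: stack transformer blocks from two finetunes of the same base
into a deeper hybrid by renumbering layer keys. The paths, layer ranges
and the Llama-style `model.layers.N.` key convention are assumptions;
real tools such as mergekit also rewrite the config and handle
tokenizers.

    # Passthrough-style frankenmerge sketch: layers 0-23 from model A,
    # layers 16-39 from model B renumbered to 24-47 on top of them.
    import re
    import torch

    a = torch.load("model_a/pytorch_model.bin", map_location="cpu")
    b = torch.load("model_b/pytorch_model.bin", map_location="cpu")

    def layer_of(key):
        m = re.match(r"model\.layers\.(\d+)\.", key)
        return int(m.group(1)) if m else None

    merged = {}
    for k, v in a.items():
        n = layer_of(k)
        if n is None or n < 24:
            # embeddings, norms, head, and A's first 24 blocks keep
            # their original names
            merged[k] = v

    for k, v in b.items():
        n = layer_of(k)
        if n is not None and 16 <= n < 40:
            # B's blocks 16-39 become blocks 24-47 of the hybrid
            new_key = k.replace(f"model.layers.{n}.",
                                f"model.layers.{n + 8}.")
            merged[new_key] = v

    # the config's num_hidden_layers must also be bumped to 48
    torch.save(merged, "frankenmerge.bin")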
 
  | GaggiX wrote:
  | lcm-lora-sdv1-5 is 67.5M, lcm-lora-sdxl is 197M, so they are
  | much smaller than the entire model. Would be cool to check the
  | rank used with these LoRAs, though.
 
    | liuliu wrote:
    | 64.
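
For anyone who wants to verify this, the rank can be read off the
shapes of a LoRA's down-projection matrices. A quick sketch; the file
name is whatever you saved locally, and the key naming differs between
the kohya/A1111 format (lora_down) and the PEFT/diffusers format
(lora_A):

    # Infer the rank(s) of a LoRA file from its tensor shapes. The
    # down-projection weight has shape (rank, in_features), so the
    # first dimension is the rank.
    from safetensors.torch import load_file

    sd = load_file("lcm-lora-sdv1-5.safetensors")

    ranks = set()
    for key, tensor in sd.items():
        if "lora_down" in key or "lora_A" in key:
            ranks.add(tensor.shape[0])

    total_params = sum(t.numel() for t in sd.values())
    print(f"ranks: {sorted(ranks)}, parameters: {total_params:,}")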
 
  | temp72840 wrote:
  | This is nuts. I did a double take at this comment - I thought
  | you _must_ have been talking about LoRAing an LCM distilled from
  | Stable Diffusion.
  | 
  | LCMs are spooky black magic, I have no intuitions about them.
 
    | ttul wrote:
    | When I was taking Jeremy Howard's course last fall, the
    | breakthrough in SD was going from 1000 steps to 50 steps via
    | classifier-free guidance, which is this neat hack where you
    | run inference with your conditioning and without and then mix
    | the result. To this day I still don't get it. But it works.
    | 
    | Now we find this way to skip to the end by building a model
    | that learns the high dimensional curvature of the path that a
    | diffusion process takes through space on its way to an
    | acceptable image, and we just basically move the model along
    | that path. That's my naive understanding of LCM. Seems too
    | good to be true, but it does work and it has a good
    | theoretical basis too. Makes you wonder what is next? Will
    | there be a single step network that can train on LCM to
    | predict the final destination? LoL that would be pushing
    | things too far..
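
The "mix the result" step is a one-line extrapolation between the
unconditional and the conditional noise prediction, applied at every
sampling step. A sketch of that loop with diffusers components, where
the model id, prompt, step count and guidance scale are illustrative:

    # Classifier-free guidance by hand: run the UNet with and without
    # the prompt, then push the prediction away from the unconditional
    # result toward the conditional one.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    def embed(prompt):
        tokens = pipe.tokenizer(
            prompt, padding="max_length", return_tensors="pt",
            max_length=pipe.tokenizer.model_max_length,
        ).input_ids.to("cuda")
        with torch.no_grad():
            return pipe.text_encoder(tokens)[0]

    cond_emb = embed("a watercolor painting of a lighthouse")
    uncond_emb = embed("")          # the "without conditioning" pass
    guidance_scale = 7.5

    pipe.scheduler.set_timesteps(50)
    latents = torch.randn((1, 4, 64, 64), device="cuda",
                          dtype=torch.float16)
    latents = latents * pipe.scheduler.init_noise_sigma

    for t in pipe.scheduler.timesteps:
        latent_in = pipe.scheduler.scale_model_input(latents, t)
        with torch.no_grad():
            noise_cond = pipe.unet(
                latent_in, t, encoder_hidden_states=cond_emb).sample
            noise_uncond = pipe.unet(
                latent_in, t, encoder_hidden_states=uncond_emb).sample
        # "mix the result": extrapolate past the unconditional guess
        noise_pred = noise_uncond + guidance_scale * (
            noise_cond - noise_uncond)
        latents = pipe.scheduler.step(noise_pred, t, latents).prev_sample

    # latents can then be decoded to an image with pipe.vae as usual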
 
      | hadlock wrote:
      | Sounds like we've invented the kind of psychic time travel
      | they use in Minority Report. Let me show you right over to
      | the Future Crimes division. We're arresting this guy making
      | cat memes today because the curve of his online history
      | traces that of a radicalized blah blah blah
 
  | ttul wrote:
  | To me, the crazy thing about LoRAs is that they work perfectly
  | well adapting model checkpoints that were themselves derived from
  | the base model on which the LoRA was trained. So you can take
  | the LCM LoRA for SD1.5 and it works perfectly well on, say,
  | RealisticVision 5.1, a fine-tuned derivative of SD1.5.
  | 
  | You'd think that the fine tuning would make the LCM LoRA not
  | work, but it does. Apparently the changes in weights introduced
  | through even pretty heavy fine tuning do not wreck the
  | transformations the LoRA needs to make in order to make LCM or
  | other LoRA adaptations work.
  | 
  | To me this is alchemy.
 
    | yorwba wrote:
    | Finetuning and LoRAs both involve additive modifications to
    | the model weights. Addition is commutative, so the order in
    | which you apply them doesn't matter for the resulting
    | weights. Moreover, neural networks are designed to be
    | differentiable, i.e. behave approximately linearly with
    | respect to small additive modifications of the weights, so as
    | long as your finetuning and LoRA change the weights only a
    | little bit, you can finetune with or without the LoRA (or,
    | equivalently, train the LoRA on the finetuned model or on its
    | base) and get mostly the same result.
    | 
    | So this is something that can be somewhat explained using not
    | terribly handwavy mathematics. Picking hyperparameters on the
    | other hand...
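
A toy version of this argument with a single linear layer (the sizes
and scales below are arbitrary): both the finetune and the LoRA are
additive deltas on the weight, so the order of application cannot
matter, and the LoRA's effect on the output is the same whether it is
added to the base or to the finetuned weights.

    # Finetune delta and LoRA delta are both additive, so they commute,
    # and (for a single linear layer) the LoRA's effect is identical on
    # the base and the finetuned weights. In a deep network this only
    # holds approximately, for small deltas.
    import torch

    torch.manual_seed(0)
    torch.set_default_dtype(torch.float64)

    d, r = 64, 4                        # layer width, LoRA rank
    W_base = torch.randn(d, d) * 0.10   # stand-in pretrained weight
    dW_ft = torch.randn(d, d) * 0.01    # stand-in finetune change
    B = torch.randn(d, r) * 0.05        # LoRA up-projection
    A = torch.randn(r, d) * 0.05        # LoRA down-projection
    dW_lora = B @ A

    # order of additive updates is irrelevant
    W1 = (W_base + dW_ft) + dW_lora     # LoRA on top of the finetune
    W2 = (W_base + dW_lora) + dW_ft     # finetune on top of the LoRA
    print(torch.allclose(W1, W2))       # True

    # the LoRA shifts the output by x @ dW_lora.T regardless of which
    # weights it is added to
    x = torch.randn(8, d)
    effect_on_base = x @ (W_base + dW_lora).T - x @ W_base.T
    effect_on_ft = (x @ (W_base + dW_ft + dW_lora).T
                    - x @ (W_base + dW_ft).T)
    print(torch.allclose(effect_on_base, effect_on_ft))  # True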
 
  | smusamashah wrote:
  | Ok. I have seen the term LCM LoRA a number of times. I have
  | used both Stable Diffusion and LoRAs for fun for quite a while,
  | but I always thought this LCM LoRA was a new thing; it's simply
  | not possible using current samplers to return an image in under
  | 4 steps. What you are saying is that just by adding a LoRA we
  | can get existing models and samplers to generate a good enough
  | image in 4 steps?
 
    | jyap wrote:
    | Yes check out this blog post:
    | https://huggingface.co/blog/lcm_lora
    | 
    | I've used it with my home GPU. Really fast, which makes it
    | more interactive and real-time.
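
For reference, the recipe in that post boils down to swapping in the
LCM scheduler and loading the LoRA; something along these lines (the
prompt and settings are illustrative, and guidance is dropped to ~1 as
the post recommends):

    # 4-step generation with LCM-LoRA on SD 1.5, following the
    # huggingface.co/blog/lcm_lora recipe.
    import torch
    from diffusers import DiffusionPipeline, LCMScheduler

    pipe = DiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # swap in the LCM scheduler and attach the distillation LoRA
    pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
    pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")

    image = pipe(
        "a photo of a red fox in the snow",
        num_inference_steps=4,
        guidance_scale=1.0,   # LCM-LoRA needs little or no CFG
    ).images[0]
    image.save("fox.png")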
 
    | catwell wrote:
    | It's a different sampler too.
 
| jimmySixDOF wrote:
| And here is a demo mashed up using LeapMotion free space hand
| tracking and a projector to manipulate a "bigGAN's high-
| dimensional space of pseudo-real images" to make it more like a
| modern dance meets sculpting meets spatial computing with a hat
| tip to the 2008 work of Johnny Chung Lee while at Carnegie Mellon.
| 
| https://x.com/graycrawford/status/1100935327374626818
 
| smlacy wrote:
| https://nitter.net/abidlabs/status/1723074108739706959
 
| r-k-jo wrote:
| Here is a collection of demos with fast LCM on HuggingFace
| 
| https://huggingface.co/collections/latent-consistency/latent...
 
___________________________________________________________________
(page generated 2023-11-12 23:00 UTC)