[HN Gopher] Stable Zero123: Quality 3D Object Generation from Si...
___________________________________________________________________
 
Stable Zero123: Quality 3D Object Generation from Single Images
 
Author : homarp
Score  : 41 points
Date   : 2023-12-14 21:34 UTC (1 hour ago)
 
web link (stability.ai)
w3m dump (stability.ai)
 
| AndrewKemendo wrote:
| One of the key limiting factors to the adoption of augmented
| reality was the lack of available 3-D objects that you could then
| put into the AR space.
| 
| If you think about a company that has physical objects, the
| hardest thing you do as a model builder is creating a 3-D model
| that's accurate to whatever the product is.
| 
| This is such a big problem that we devised a fairly novel large
| scanning system, which I proposed to Amazon, as part of our
| digitization suite when I was running my computer vision and AR
| company. That was one of a dozen projects that we were trying to
| do to get after this problem of rapid digitization of objects.
| 
| One of the key things we were trying to do, starting in 2017, was
| come up with structure-from-motion algorithms, or otherwise have
| a large database and a similarity match such that we could
| inherit a series of objects, the most likely object types that we
| saw in the environment.
| 
| One of the major challenges with this is that it isn't good
| enough for anybody like a corporation to pay for. The majority of
| time and money spent by companies that were trying to get into
| the AR space was in sending catalog of catalogs to India,
| Pakistan, etc. for thousands of 3-D modelers to create the 3-D
| models.
| 
| You can certainly understand how this becomes complicated
| quickly, including what is considered a canonical model, whether
| there is licensing for certain types of models, who has the
| official authorization for a 3-D model, etc.
| 
| All this to say, what is presented here seems pretty darn close,
| or at least close enough that we can see that we're going to be
| able to fully automate this process, and hopefully that will
| actually allow for the adoption of some of these things that were
| previously rate-limited by the linear growth rate of objects in
| the available space.
| 
| Edit: I'll be curious to see what their meshes look like and if
| they optimize for polygons and in what way. Similarly, whether
| they are a single volume or have discrete objects composing a new
| object class (extremely doubtful).
| 
| The last time I saw any major updates on this kind of thing was a
| Stanford paper that was trying to derive voxel spaces from
| images, if I recall correctly, and that was back in 2017 or 2018.
 
| ansible wrote:
| So it generates a fully rigged 3D model that can be animated by
| conventional means?
| 
| If it can do all that, and you add in motion capture from just a
| video, then that will drastically cut the costs for all kinds of
| animation projects.
| 
| Given that it is possible to render photo-realistic people now
| from 3D models (subsurface scattering for the skin, etc.), we are
| well on the way to a full video production pipeline. Just give it
| some scans of the people and objects you want, type in a
| description of the scenes, generate the voices via text to
| speech, and press "render".
| 
| The next few years are going to be crazy.
 
  | joewhatkins wrote:
  | I don't think it rigs the models - I think that video is
  | comprised of models generated by Stable Zero123 that were then
  | rigged/animated/postprocessed in Blender.
 
___________________________________________________________________
(page generated 2023-12-14 23:00 UTC)