|
| eurekin wrote:
| Didn't see batching taken into the equation; it might skew the
| numbers a bit
| sidnb13 wrote:
| Yep, batching is a feature I really wish the OpenAI API had.
| That and the ability to intelligently cache frequently used
| prompts. Much easier to achieve this with a hosted OS model, so
| I guess it's a speed + customizability/cost tradeoff for the
| time being.
| advaith08 wrote:
| imo they don't have batching because they pack sequences before
| passing them through the model, so a single sequence in a batch
| on OpenAI might contain requests from multiple customers
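|
| A rough sketch of how that packing could work (the separator
| token id and context length here are illustrative assumptions,
| not OpenAI's actual setup):
|
|   SEP = 50256       # hypothetical end-of-text/separator token
|   MAX_LEN = 2048    # hypothetical context window
|
|   def pack_requests(requests):
|       # Greedily concatenate short requests into full-length
|       # sequences so the GPU wastes no cycles on padding.
|       sequences, current = [], []
|       for tokens in requests:
|           if len(current) + len(tokens) + 1 > MAX_LEN:
|               sequences.append(current)
|               current = []
|           current += tokens + [SEP]
|       if current:
|           sequences.append(current)
|       return sequences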
| jonplackett wrote:
| Is this a reflection of OpenAI's massive scale making it so cheap
| for them?
|
| Or is it the deal with Microsoft for cloud services making it
| cheap?
|
| Or are they just operating at a massive loss to kill off other
| competition?
|
| Or something else?
| 4death4 wrote:
| Probably all three:
|
| 1) They hire top talent to make their models as efficient as
| possible.
|
| 2) They have a sweetheart deal with MS.
|
| 3) They're better funded than everyone else and bringing in
| substantial revenue.
| smachiz wrote:
| deleted
| ryduh wrote:
| Is this a guess or is it informed by facts?
| sebzim4500 wrote:
| Are you just suggesting this as an option, or do you have
| evidence that it's true?
| ugjka wrote:
| They are also trying to lobby the government for AI
| "regulation" in order to limit any competitor's ability to
| reach OpenAI's level
| wkat4242 wrote:
| They basically are MS by now. Everyone at Microsoft I work
| with literally calls it an 'acquisition', even though they
| only own a share. It's pretty clear what their plans are.
| SkyMarshal wrote:
| Probably the first two, plus first-mover brand recognition.
| Millions of $20 monthly subs for GPT4 add up.
|
| They might also be operating at a loss afaik, but I suspect
| they're one of the few that can break even just based on scale,
| brand recognition, and economics.
| michaelbuckbee wrote:
| $20/mo subs are also the lead-in to unlocking paid API
| access.
| sarchertech wrote:
| I haven't heard any evidence that they have millions of Plus
| subscribers.
|
| I've seen 100 to 200 million active users, but nothing about
| paid users from them. The surveys I saw when doing a quick
| google search reported much less than 1% of users paying.
| SkyMarshal wrote:
| Yeah I don't know what the actual subscription numbers are,
| would be surprised if OpenAI is publishing that info.
| ShadowBanThis01 wrote:
| They're mining the gullible for phone numbers, among other
| things.
| vsreekanti wrote:
| Probably some combination of all the above! I think 1 and 2 are
| interlinked though -- the cheaper they can be, the more they
| build that moat. They might be eating the cost on these APIs
| too, but unlike the Uber/Lyft war, it'll be way stickier.
| te_chris wrote:
| There's also just the benefits of being in market, at scale and
| being exposed to the full problem space of serving and
| maintaining services that use these models. It's one thing to
| train and release an OSS model, it's another to put it into
| production and run all the ops around it.
| iliane5 wrote:
| I think it's mostly the scale. Once you have a consistent user
| base and tons of GPUs, batching inference/training across your
| cluster allows you to process requests much faster and for a
| lower marginal cost.
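|
| Back-of-the-envelope illustration (all numbers hypothetical):
|
|   GPU_COST_PER_HOUR = 2.0   # assumed A100 rental price
|   BASE_THROUGHPUT = 1000    # requests/hour at batch size 1
|
|   for batch_size in (1, 8, 32):
|       throughput = BASE_THROUGHPUT * batch_size  # idealized
|       print(batch_size, GPU_COST_PER_HOUR / throughput)
|   # batch=1: $0.002/request; batch=32: ~$0.00006/request --
|   # the same hardware serves each request ~32x cheaper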
| ilaksh wrote:
| I think the weird thing about this is that it's completely true
| right now but in X months it may be totally outdated advice.
|
| For example, efforts like OpenMOE
| https://github.com/XueFuzhao/OpenMoE or similar will probably
| eventually lead to very competitive performance and cost-
| effectiveness for open source models. At least in terms of
| competing with GPT-3.5 for many applications.
|
| Also see https://laion.ai/
|
| I also believe that within say 1-3 years there will be a
| different type of training approach that does not require such
| large datasets or manual human feedback.
| sidnb13 wrote:
| > I also believe that within say 1-3 years there will be a
| different type of training approach that does not require such
| large datasets or manual human feedback.
|
| I guess if we ignore pretraining, doesn't sample-efficient
| fine-tuning on carefully curated instruction datasets sort of
| achieve this? LIMA and OpenOrca show some really promising
| results to date.
| sharemywin wrote:
| DistilBERT was distilled from BERT. There might be an angle in
| using another model to train your model, especially if you're
| trying to get something to run locally.
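|
| The core of that idea, as a minimal PyTorch-style sketch (the
| temperature and loss form follow Hinton et al.'s distillation
| recipe; the names are illustrative):
|
|   import torch.nn.functional as F
|
|   def distillation_loss(student_logits, teacher_logits, T=2.0):
|       # Soften both distributions, then push the student's
|       # predictions toward the teacher's.
|       soft_targets = F.softmax(teacher_logits / T, dim=-1)
|       log_probs = F.log_softmax(student_logits / T, dim=-1)
|       return F.kl_div(log_probs, soft_targets,
|                       reduction="batchmean") * T * T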
| nico wrote:
| > I also believe that within say 1-3 years there will be a
| different type of training approach that does not require such
| large datasets or manual human feedback
|
| This makes a lot of sense. A small model that "knows" enough
| English and a couple of programming languages should be enough
| for it to replace something like copilot, or use plug-ins or do
| RAG on a substantially larger dataset
|
| The issue right now is that to get a model that can do those
| things, the current algorithms still need massive amounts of
| data, way more than what the final user needs
| Dwedit wrote:
| Abbreviate Mix of Experts as "MoE" and the Anime fans
| immediately start rushing in...
| daft_pink wrote:
| I'm confused, don't A100s cost $10,000 to buy? Why would you
| pay $166k per year to rent?
| sidnb13 wrote:
| I would assume the datacenter and infra needed would also
| contribute a sizeable chunk to the costs when you consider
| upkeep to run it 24/7
| latchkey wrote:
| For the same reason people use AWS.
|
| Spending the capex/opex to run a cluster of compute isn't easy
| or cheap. It isn't just the cost of the GPU, but the cost of
| everything else around it that isn't just monetary.
| etothepii wrote:
| This could be an interesting comparison. My experience with
| AWS is that it was super easy and cheap to start on. By the
| time we _could_ use whole servers we were using so much AWS
| orchestration that it's going to be put off until we are at
| least $1M ARR, and probably til we are at $5M.
|
| Making adoption easy and giving a free base tier but charging
| more later could be a very effective model for getting
| startups stuck on you. It probably even makes adoption by
| small teams in big companies possible, which can then grow ...
| dekhn wrote:
| How much does an A100 consume in power a year (in dollar
| costs)? How much does it cost to hire and retain datacenter
| techs? How long does it take to expand your fleet after a user
| says "we're gonna need more A100s?" How many discounts can you
| get as a premier customer?
|
| Answer these questions, and the equation shifts a bunch!
| shrubble wrote:
| Not really.
|
| A full rack with 16 amps usable power and some bandwidth is
| $400/month in Kansas City, MO. That is enough to power 5x
| A100s 24x7, so $10k plus $80 per month each, amortized; of
| course many more A100s would drop the price.
|
| Once installed in the rack ($250 one-time cost) you shouldn't
| need to touch it. So $10k plus ~$1250 per A100, per year,
| including power. You can put 2 or 3 A100s per cheap Celeron-
| based motherboard.
|
| Of course if doing very bursty work then it may well make
| sense to rent...
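|
| Putting those figures into code (purchase and rack costs from
| above; the rent figure assumes the ~$1.10/hr-per-A100
| marketplace pricing mentioned elsewhere in this thread):
|
|   A100_PRICE = 10_000                  # purchase, per GPU
|   RACK_PER_GPU_YEAR = 400 / 5 * 12     # $400/mo rack, 5 GPUs
|   RENT_PER_GPU_YEAR = 1.10 * 24 * 365  # ~$9,600/yr rented
|
|   for years in (1, 2, 3):
|       own = A100_PRICE + RACK_PER_GPU_YEAR * years
|       rent = RENT_PER_GPU_YEAR * years
|       print(years, round(own), round(rent))
|   # at full utilization, owning overtakes renting in year 2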
| akomtu wrote:
| And how many A100s do you need to do something meaningful
| with LLMs?
| shrubble wrote:
| The funding has to come from somewhere, right? You either
| pay up front and save money over time, or pay as you go
| and pay more...
| dekhn wrote:
| Did you also include the network required to make the A100s
| talk to each other? Both the datacenter network (so the
| CPUs can load data) and the fabric (so the A100s can talk?)
|
| You also left out the datacenter tech costs, probably at
| least $50K/individual-year in KC (although I guess I'd just
| work for free ribs).
|
| If you're putting A100s into celeron motherboards... I
| don't know what to say. You're not saving money by putting
| a ferrari engine in a prius.
| latchkey wrote:
| $50m GPU capex (which is A LOT) is about 2-3MW of power, it
| isn't that much.
|
| The problem though is that getting 2-3MW of power in the US
| is increasingly difficult and you're going to pay a lot more
| for it since the cheap stuff is already taken.
|
| Even more distressing is that if you're going to build new
| data center space, you can't get the rest of the stuff in the
| supply chain... backup gennies, transformers, cooling towers,
| etc...
| amluto wrote:
| Those are 8x A100 systems.
| joefourier wrote:
| AWS is extremely overpriced for nearly every service. I don't
| know why anyone else outside of startups with VC money to burn
| or bigcos that need the "no one ever got fired for buying IBM"
| guarantee would use them. You're better off with Lambdalabs or
| others which charge only $1.1/h per A100.
|
| Also, that is an 8xA100 system as others have noted, but it is
| the 40GB one, which can be found on eBay for as low as $3k if
| you go with the SXM4 version (although the price of supporting
| components may vary) or $5k for the PCIe version.
| wg0 wrote:
| There are only two services that are dirt cheap and reliably
| useful: S3 and SQS. The rest can get very expensive very
| quickly.
|
| You can build a lot of stuff on top of these two.
| ommpto wrote:
| Even for S3 while the storage is dirt cheap they still have
| exorbitant bandwidth pricing.
| charcircuit wrote:
| S3 is not dirt cheap. Bandwidth is ludicrously expensive.
| charlesischuck wrote:
| You pay for the system, not the GPU, with AWS.
|
| It's absolutely worth the money when you look at the whole
| picture. Also lambda labs never has availability. I actually
| can schedule a distributed cluster on AWS.
| AndroTux wrote:
| > It's absolutely worth the money when you look at the
| whole picture.
|
| That highly depends on many things. If you run a business
| with a relatively steady load that doesn't need to scale
| quickly multiple times per day, AWS is definitely not for
| you. Take Let's Encrypt[1] as an example. Just because
| cloud is the hype doesn't mean it's always worth it.
|
| Edit: Or a personal experience: I had a customer that
| insisted on building their website on AWS. They weren't
| expecting high traffic loads and didn't need high
| availability, so I suggested to just use a VPS for $50 a
| month. They wanted to go the AWS route. Now their website
| is super scalable with all the cool buzzwords and it costs
| them $400 a month to run. Great! And in addition, the whole
| setup is way more complex to maintain since it's built on
| AWS instead of just a simple website with a database and
| some cache.
|
| [1] https://news.ycombinator.com/item?id=37536103
| nharada wrote:
| Sometimes I need 512 GPUs for 3 days.
| charlesischuck wrote:
| A top-end GPU that makes you competitive now costs $20-50k each.
|
| To train a top model you need hundreds of them in a very
| advanced datacenter.
|
| You can't just plug gpus into standard systems and train,
| everything is custom.
|
| The technical talent required for these systems is rare to say
| the least. The technical talent to make a model is also rare.
|
| I trained a few foundation models with images, and I would
| NEVER buy any of them. These guys are on a wildly different
| scale than basically everyone.
| SkyMarshal wrote:
| I think OpenAI may eventually have to go upmarket, as basic "good
| enough" AI becomes increasingly viable and cheap/free on consumer
| level devices, supplied by FOSS models and apps.
|
| Apple may be leading the way here, with Apple Silicon
| prioritizing AI processing and built into all their devices.
| These capabilities are free (or at least don't require an extra
| sub), and just used to sell more hardware.
|
| OpenAI is clearly going to compete in that market with its
| upcoming smart phone or device [1]. But what revenue model can
| OpenAI use to compete with Apple's and not get undercut by it? I
| suppose hardware + free GPT3.5, and optional subscription to GPT4
| (or whatever their highest end version is). Maybe that will be
| competitive.
|
| I also wonder what mobile OS OpenAI will choose. Probably not
| Android, otherwise they would have partnered with Google. A
| revamped and updated Microsoft mobile OS maybe, given their MS
| partnership? Or something new and bespoke? I could imagine Jony
| Ive demanding something new, purpose-built, and designed from
| scratch for a new AI-oriented UI/UX paradigm.
|
| A market for increasingly sophisticated AI that can only be done
| in huge GPU datacenters will exist, and that's probably where the
| margins will be for a long time. I think that's what OpenAI,
| Microsoft, Google, and the others will be increasingly competing
| for.
|
| [1]:https://www.reuters.com/technology/openai-jony-ive-talks-
| rai...
| vsreekanti wrote:
| Yep, we agree that the obvious direction of innovation for OSS
| models is smaller and cheaper, likely at roughly the same
| quality: https://generatingconversation.substack.com/p/open-
| source-ll...
| smcleod wrote:
| Also more privacy respecting, and more customisable /
| flexible.
| mensetmanusman wrote:
| Please Apple let me replace worthless Siri with ChatGPT on my
| iPhone.
|
| Would completely change how I use the device.
| bitcurious wrote:
| If you have the new iPhone with the action button, you can
| set a shortcut to ask questions of ChatGPT. It's not as fluid
| as Siri, and can't control anything, but still much more
| useful.
| CamperBob2 wrote:
| Just yesterday, while driving: "Read last message."
|
| Siri: "Sorry. Dictation service is unavailable at the
| moment."
|
| It's past time for excuses. High-level people at Apple need
| to be fired over this. Hello? Tim? Do your job. Hello?
| Anybody home...?
| freedomben wrote:
| Nobody is switching away from Apple over this, so
| ultimately Tim _is_ doing his job. Under his watch Apple
| has become the de facto choice for entire generations.
| Between vendor lock-in/walled gardens and societal/cultural
| pressures (don't want to be a green bubble!), they have one
| of the stickiest user bases there are.
| mensetmanusman wrote:
| True, but that doesn't mean we shouldn't complain.
|
| My hope is that the upcoming EU rulings allow competition
| here, i.e. force Apple to get out of the way of making
| their hardware better with better software.
| CamperBob2 wrote:
| Stop excusing shitty work from trillion-dollar companies.
| It makes the world a worse place.
| smoldesu wrote:
| I think it's shitty and has no excuse, but the parent is
| right. Apple has no incentive to respond to their users
| since all roads lead to first-party Rome. It's why stuff
| like the Digital Market Act is more needed than some
| people claim.
|
| You know what would get Apple to fix this? Forced
| competition. You know what Apple spends their trillions
| preventing?
| layer8 wrote:
| Apple is ramping up spending in that area:
| https://www.macrumors.com/2023/09/06/apple-conversational-
| ai...
|
| It'll probably take a while though.
| grahamplace wrote:
| > OpenAI is clearly going to compete in that market with its
| upcoming phone
|
| What phone are you referring to? A quick google didn't seem to
| pull up anything related to OpenAI launching a hardware
| product?
| BudaDude wrote:
| They are most likely referring to this in collaboration with
| Jony Ive:
|
| https://www.yahoo.com/entertainment/openai-jony-ive-talks-
| ra...
| SkyMarshal wrote:
| Yes that one.
| jimkoen wrote:
| > OpenAI is clearly going to compete in that market with its
| upcoming phone.
|
| Excuse me, I'm not a native English speaker; do you mean like
| a smartphone? Or some sort of other new business direction?
| Where did you get the info that they're planning to launch a
| phone?
| MillionOClock wrote:
| I believe there have been rumors that OpenAI was working
| with Jony Ive to create a wearable device, but it was
| unclear whether it would be a phone or something else.
| SkyMarshal wrote:
| Yes a smartphone, /corrected. It's a recent announcement:
|
| https://www.nytimes.com/2023/09/28/technology/openai-
| apple-s...
| sharemywin wrote:
| It's not really a phone; they mention ambient computing.
| SkyMarshal wrote:
| Oh, smart device then.
| layer8 wrote:
| https://www.reuters.com/technology/openai-jony-ive-talks-
| rai...
| layer8 wrote:
| Where do you get the confidence that Apple will be able to
| catch up to OpenAI's GPT? "Apple's built-in AI capabilities"
| are very weak so far.
| filterfiber wrote:
| Not OP,
|
| In my experience apple's ML on iphones is seamless. Tap and
| hold on your dog in a picture and it'll cut out the
| background, your photos are all sorted automatically
| including by person (and I think by pet).
|
| OCR is seamless - you just select text in images as if it was
| real text.
|
| I totally understand these aren't comparable to LLMs - rumor
| has it apple is working on an llm - if their execution is
| anything like their current ML execution it'll be glorious.
|
| (Siri objectively sucks although I'm not sure it's fair to
| compare siri to an LLM as AFAIK siri does not do text
| prediction but is instead a traditional "manually crafted
| workflow" type of thing that just uses S2T to navigate)
| blackoil wrote:
| > OCR is seamless
|
| Wasn't that solved about a decade ago? Does anyone still
| suck at that?
| filterfiber wrote:
| > Does anyone suck at that?
|
| Does android even have native OCR? Last I checked
| everything required an OCR app of varying quality
| (including windows/linux).
|
| On iOS/macOS you can literally just click on a picture
| and select the text in it as if it weren't a picture. On
| iOS you don't even need to open an app to do it; you can
| select text in any picture.
|
| Last I checked the Opensource OCR tools were decent but
| behind the closed source stuff as well.
|
| Random google result of OCR on android (could be
| outdated) - https://www.reddit.com/r/androidapps/comments
| /10te5et/why_oc...
| smoldesu wrote:
| > Does android even have native OCR?
|
| Tesseract? https://github.com/tesseract-ocr/tesseract
| SkyMarshal wrote:
| I'm not saying they will on the high-end, but maybe on the
| low end. Apple's strategy is to embed local AI in all their
| devices. Local AI will never be as capable as AI running in
| massive GPU datacenters, but if it can get to a point that
| it's "good enough" for most average users, that may be enough
| for Apple to undercut the low end of the market.
| freedomben wrote:
| > _Local AI will never be as capable as AI running in
| massive GPU datacenters_
|
| I'm not sure this is true, even in the short term. For some
| things yes, that's definitely true. But for other things
| that are real-time or near real-time where network latency
| would be unacceptable, we're already there. For example,
| Google's Pixel 8 launch includes real-time audio
| processing/enhancing which is made possible by their new
| Tensor chip.
|
| I'm no fan of Apple, but I think they're on the right path
| with local AI. It may even be possible that the tendency of
| other device makers to put AI in the cloud might give Apple
| a much better user experience, unless Google can start
| thinking local-first which kind of goes against their
| grain.
| SkyMarshal wrote:
| _> But for other things that are real-time or near real-
| time where network latency would be unacceptable, we're
| already there._
|
| Agreed. Something else I wonder is if local AI in mobile
| devices might be better able to learn from its real-time
| interactions with the physical world than datacenter-
| based AI.
|
| It's walking around in the world with a human with all
| its various sensors recording in real-time (unless
| disabled) - mic, camera, GPS/location, LiDAR, barometer,
| gyro, accelerometer, proximity, ambient light, etc. Then
| the human uses it to interact with the world too in
| various ways.
|
| All that data can of course be quickly sent to a
| datacenter too, and integrated into the core system
| there, so maybe not. But I'm curious about this
| difference and wonder what advantages local AI might
| eventually confer.
| sharemywin wrote:
| I wonder if, by sending the embeddings or some higher-level
| compressed latent vector to the cloud, you couldn't get the
| best of both worlds.
|
| GPS, phone orientation, last 5 apps you were in, etc. -->
| embedding
|
| You might even have something like "what time is it?"
| compressed as its own embedding.
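|
| A toy sketch of that idea (every name here is hypothetical; a
| real system would use a learned encoder, not this stub):
|
|   import numpy as np
|
|   def encode_context(gps, orientation, recent_apps):
|       # Compress on-device context into one fixed-size vector
|       # to send to the cloud instead of the raw sensor data.
|       app_feats = [hash(a) % 1000 / 1000 for a in recent_apps[:5]]
|       features = list(gps) + list(orientation) + app_feats
|       vec = np.zeros(16)
|       vec[:len(features)] = features  # placeholder "embedding"
|       return vec
|
|   vec = encode_context((37.77, -122.42), (0.0, 1.57, 0.0),
|                        ["maps", "messages"])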
| huevosabio wrote:
| OpenAI will make its money on enterprise deals for fine-tuning
| their latest and greatest on corporate data. They are already
| landing these big enterprise deals, and I think that's where
| the money is.
|
| They will keep pricing the off-the-shelf AI at-cost to keep
| competitors at bay.
|
| As for competitors, Anthropic is the most similar to OpenAI
| both in capabilities and business model. I am not sure what
| Google is up to, since historically their focus has been on
| using AI to enhance their products rather than making it a
| product. The "dark horses" here are Stability and Mistral,
| which are both OSS and European and will try to make that
| their edge: they give the models away for _free_ but sell to
| institutional clients that are more sensitive to which models
| are being used and where the data is being handled.
|
| Amazon and Apple are probably catching up. Apple likely thinks
| that all of this just makes their own hardware more attractive.
| It's not clear to me what Meta's end goal is.
| tmpz22 wrote:
| > I think OpenAI may eventually have to go upmarket
|
| Let me introduce you to the VC business model. Get comical
| amounts of money. Charge peanuts for an initial product. Build
| a moat once you trap enough businesses inside it. Jack up
| prices.
| sharemywin wrote:
| don't forget the sneaky TOS changes you have to agree to
| robertlagrant wrote:
| OpenAI'd better hope no one else does it too, if that's all
| it takes.
| latchkey wrote:
| I just paid the $20 for a month to try it out. In my super
| limited experience, GPT-4 is actually impressive and worth the
| money.
| smileysteve wrote:
| I've spent the last few weeks comparing Google Duet with
| ChatGPT 3.5, and ChatGPT seems years ahead.
| a_wild_dandan wrote:
| The value I get for that $20/month is astonishing. It's by far
| the best discretionary subscription I've ever had.
|
| That scares me. I hate moats and actively want out. Running the
| uncensored 70B parameter Llama 2 model on my MacBook is great,
| but it's just not a competitive enough general intelligence to
| entirely substitute for GPT-4 yet. I think our community will
| get there, but the surrounding water is deepening, and I'm
| nervous...
| sharemywin wrote:
| > tentatively called "Claude-Next" -- that is 10 times more
| > capable than today's most powerful AI, according to a 2023
| > investor deck TechCrunch obtained earlier this year.
|
| This is the thing that scares me.
|
| When do these models stop getting smarter? Or at least slow
| down?
| minimaxir wrote:
| When the ChatGPT API was released 7 months ago, I posted a
| controversial blog post that the API was so cheap, it made other
| text-generating AI obsolete:
| https://news.ycombinator.com/item?id=35110998
|
| Surprisingly, 7 months later nothing's changed. Open-source
| models are still tricky to make cost-effective, despite the
| many inference optimizations since then. Anthropic Claude is
| closer in price and quality now, but there's no reason to
| switch.
| cainxinth wrote:
| These are still early days. All the major players are willing
| to lose billions to be top of mind with consumers in an
| emerging market.
|
| Either there will be some major technological breakthrough that
| lowers their costs, or they will all eventually start raising
| prices.
| Eumenes wrote:
| "too cheap to beat" sounds anti-competitive and monopolistic.
| Large LLM providers are not dissimilar to industrial operations
| at scale: it requires a lot of infrastructure, and the more you
| buy/rent, the cheaper it gets. Early bird gets the worm I guess.
| stevenae wrote:
| Not sure I understand your comment, but generally you have to
| prove anti-competitiveness /beyond/ too cheap to beat (unless
| it is a proven loss-leader which, viz all big tech companies,
| seems very hard to prove)
| Havoc wrote:
| Yep. Building a project that needs some LLMs. I'm very much of
| the self-hosting mindset so will try DIY, but it's very obviously
| the wrong choice by any reasonable metric.
|
| OpenAI will murder my solution by quality, by availability, by
| reliability and by scalability...all for the price of a coffee.
|
| It's a personal project though & partly intended for learning
| purposes so there is scope for accepting trainwreck level
| tradeoffs.
|
| No idea how commercial projects are justifying this though.
| nine_k wrote:
| One small caveat: OpenAI gets to see all your prompts, and all
| the responses.
|
| Sometimes this can be unacceptable. Law, medicine, finance:
| all of them would prefer a self-hosted, private GPT.
| kevlened wrote:
| Their data retention policy on their APIs is 30 days, and
| it's not used for training [0]. In addition, qualifying use
| cases (likely the ones you mentioned) qualify for zero data
| retention for most endpoints.
|
| [0] - https://platform.openai.com/docs/models/how-we-use-
| your-data
| nine_k wrote:
| In sensitive cases you do not think about the normal
| policy, you think about the worst case. You just can't
| afford a leak. Your local installation may be much better
| protected than a public service, by technology and by
| policy.
| BoorishBears wrote:
| For years people have essentially made a living off FUD
| like "ignore the literal legal agreement and imagine all
| the worst case scenarios!!!" to justify absolutely
| farcical on-premise deployments of a lot of software, but
| AI is starting to ruin the grift.
|
| There _are_ some cases where you really can't afford to
| send Microsoft data for their OpenAI offering... but
| there are a lot more where some figurehead solidified
| their power by insisting the company build less secure
| versions of public offerings instead of letting their
| "gold" go to a 3rd party provider.
|
| As AI starts to appear as a competitive advantage, with
| the self-hosted SOTA lagging so ridiculously far behind,
| you're seeing that work less and less. Take Harvey.ai for
| example: it's a frankly non-functional product, yet it
| still manages to spook top law firms, whose tech policies
| have been entrenched for decades, into paying money for
| something OpenAI-based, on the simple chance they might
| get outcompeted otherwise.
| littlestymaar wrote:
| > and it's not used for training [0].
|
| It's "not be used to train or improve OpenAI models",
| doesn't mean it's not used to get knowledge about your
| prompts, your business use case. In fact, the wording of
| the policy is lose enough they could train a policy model
| on it (just not the LLM itself).
| Der_Einzige wrote:
| A lot of tools for constrained generation, creativity, and the
| like rely on manipulating the entire log-probability
| distribution. OpenAI won't expose this information and is
| therefore shockingly uncompetitive on things like poetry
| generation.
| fulafel wrote:
| This focuses on compute capacity, but wouldn't algorithmic
| improvements give much more bang for the buck at this stage?
| There's so much low-hanging fruit, as evidenced by the
| constant stream of news about getting better results with
| less hardware.
| debacle wrote:
| Open source always wins, in the end. This is a fluff piece.
| downWidOutaFite wrote:
| Where's the open source web search that is beating Google?
| serjester wrote:
| I think this is underappreciated. I run a "talk-to-your-files"
| website with ~$5K MRR and a pretty generous free tier. My
| OpenAI costs have not exceeded $200/mo. People talk about using
| smaller, cheaper models, but unless you have strong data
| security requirements you're burdening yourself with serious
| maintenance work and using objectively worse models to save
| pennies. This doesn't even consider OpenAI continuously
| lowering their prices.
|
| I've talked to a good amount of businesses and 90% of custom use
| cases would also have negligible AI costs. In my opinion, unless
| you're in a super regulated industry or doing genuinely cutting
| edge stuff, you should probably just be using the best that's
| available (OpenAI).
| vsreekanti wrote:
| I completely agree -- open-source models and custom deployments
| just can't compete with the cost and efficiency here. The only
| exception here is _if_ open-source models can get way smaller
| and faster than they are now while maintaining existing
| quality. That will make private deployments and custom fine-
| tuning way more likely.
| SkyMarshal wrote:
| Or FOSS models remain the same size and speed, but hardware
| for running them, especially locally, steadily improves till
| the AI is "good enough" for a large enough segment of the
| market.
| hobs wrote:
| How do you deal with the fact that Azure et al. don't appear
| to be selling anyone additional capacity?
| jejeyyy77 wrote:
| how do your customers feel about you uploading potentially
| confidential documents to a 3rd party?
| CDSlice wrote:
| If they are confidential they probably shouldn't be uploaded
| to any website no matter if it calls out to OpenAI or does
| all the processing on their own servers.
| yunohn wrote:
| It's simple really, lots of businesses share data with 3rd
| parties to enable various services. OpenAI provides a service
| contract claiming they do not mine/reshare/etc the data
| shared via their API. As the SaaS provider, you just need to
| call it out in your user service agreement.
| euazOn wrote:
| Just curious, could you briefly mention some of the custom use
| cases with negligible AI costs? Thanks
| cyode wrote:
| Are any OpenAI powered flows available to public, logged-out
| user traffic? I've worried (maybe irrationally) about doing
| this in a personal project and then dealing with malicious
| actors and getting stuck with a big bill.
| Bukhmanizer wrote:
| The bleeding obvious is that OpenAI is doing what most tech
| companies for the last 20 years have done. Offer the product
| for dirt cheap to kill off competition, then extract as much
| value from your users as possible by either mining data or
| hiking the price.
|
| I don't understand how people are surprised by this anymore.
|
| So yeah, it's the best option right now, when the company is
| burning through cash, but they're planning on getting that
| money back from you _eventually_.
| jaredklewis wrote:
| > Offer the product for dirt cheap to kill off competition,
| then extract as much value from your users as possible by
| either mining data or hiking the price.
|
| Genuine question, what are some examples of companies in that
| "hiking the price" camp?
|
| I can think of tons of tech companies that sold or sell stuff
| at a loss for growth, but struggling to find examples where
| the companies then are able to turn dominant market share
| into higher prices.
|
| To be clear, I'm definitely not implying they are not out
| there, just looking for examples.
| loganfrederick wrote:
| Uber, Netflix and the online content streaming services.
| These are probably the most prominent examples from this
| recent 2010s era.
| spacebanana7 wrote:
| The Google Maps API price hike of 2018 [1] is a relevant
| example.
|
| [1] https://kobedigital.com/google-maps-api-changes
| beezlebroxxxxxx wrote:
| Uber is probably the biggest pure example. When I was in
| uni when they first spread, Uber's entire business model
| was flood the market with hilariously low prices and steep
| discounts. People overnight started using them like crazy.
| They were practically giving away their product. Now they're
| as expensive as, if not sometimes more expensive than, any
| other taxi or ridesharing service in my area.
|
| One thing I'll add is that it's not always that this ends
| with higher prices in an absolute sense, but that the tech
| company is able to essentially cut the knees out of their
| competitors until they're a shell of their former selves.
| Then when the prices go "up", they're in a way a return to
| the "norm", only they have a larger and dominant market
| share because of their crazy pricing in the early stages.
| wkat4242 wrote:
| Yeah I kinda wonder why people even use them anymore.
| I've long gone back to real taxis because they're cheaper
| and I don't have to book them; I can just grab one on the
| street. Much more efficient than slowly watching my driver
| edge his way to me from 3 kilometers away.
| jdminhbg wrote:
| The number of places where you can reliably walk out onto
| the street and hail a taxi is pretty small. Everywhere
| else, the relevant decision is whether calling a
| dispatcher or using a taxi company's app is
| faster/cheaper/more reliable than Uber/Lyft.
| mikpanko wrote:
| - Uber/Lyft increased prices significantly (and partially
| transitioned them into longer wait times) since they got into
| profitability mode
|
| - Google is showing more and more ads over time to power
| high revenue growth YoY
|
| - Unity has just tried to increase its prices
| jaredklewis wrote:
| I think Google fits more in the "extract as much value
| from your users" bucket more than the price hiking one.
|
| Uber/Lyft did raise prices, but interestingly (at least to
| me), if the strategy was to smother the competition with low
| prices, it didn't seem to work.
|
| Unity is interesting too, though I'm not sure it would
| make a good poster child for this playbook. It raised
| prices but seems to be suffering for it.
| HillRat wrote:
| Everyone's in "show your profits" mode, as befitting a
| mature market with smaller growth potential relative to
| the last few decades. Some of what we're talking about
| here is just what happens when a company tries to use
| investment capital to build a moat but fails (the
| Uber/Lyft issue you mentioned -- there's no obvious moat
| to ride-hailing, as with many software and app domains).
| My theory is that, going forward, we're going to see a
| much lower ceiling on revenue coupled with lots of
| competition in the market as VC investments cool off and
| companies can't spend their way into ephemeral market
| dominance.
|
| As for Unity, they're certainly dealing with a bunch of
| underperforming PE and IPO-enabled M&A on the one hand
| (really should have considered that AppLovin offer,
| folks), but also just a failure to extract reasonable
| income from their flagship product on the other; I don't
| think their problems come from raising prices _per se_
| (game devs pay for a lot already, an engine fee is
| nothing new to them) as much as how they chose to do it
| and the original pricing model they tried to force on
| their clients. What they chose to do and the way they
| handled it wasn't just bad, it was "HBS case study bad."
| dboreham wrote:
| VMWare, Docker.
| zarzavat wrote:
| OpenAI doesn't own transformers, they didn't even invent
| them. They just have the best one at this particular time.
| They have no moat.
|
| At some point, someone else will make a competitive model, if
| it's Facebook then it might even be open source, and the
| industry will see price competition _downwards_.
| strangemonad wrote:
| This argument has always felt to me like saying "google has
| no moat in search, they just happen to currently have the
| best page rank. Nothing is stopping yahoo from creating a
| better one"
| jdminhbg wrote:
| Google has a flywheel where its dominant position in
| search results in more users, whose data refines the
| search algorithm over time. The question is whether
| OpenAI has a similar thing going, or whether they just
| have done the best job of training a model against a
| static dataset so far. If they're able to incorporate
| customer usage to improve their models, that's a moat
| against competitors. If not, it's just a battle between
| groups of researchers and server farms to see who is best
| this week or next.
| zarzavat wrote:
| It's a different situation computationally. Transformers
| are asymmetric: hard to train but easy to run.
|
| There is no such thing as an open source Google because
| Google's value is in its vast data centers. Search is
| hard to train and hard to run.
|
| GPT4 is not that big. It's about 220B parameters, if you
| believe geohot, or perhaps more if you don't.
|
| _One_ hard drive.
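|
| Quick sanity check on that claim (assuming 16-bit weights):
|
|   params = 220e9
|   print(params * 2 / 1e12)  # ~0.44 TB of weights -- one drive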
| shihab wrote:
| My understanding is that Google search is a lot more than
| just Pagerank (Map reduce for example). They had lots of
| heuristics, data, machine learning before anyone else
| etc.
|
| Whereas the underlying algorithms behind all these GPTs
| so far are broadly the same. Yes, OpenAI probably does have
| better data, model fine-tuning, and other engineering
| techniques right now, but I don't feel it's anything special
| enough to differentiate them from competitors in the long
| run.
|
| (If the data collected from current LLM users proves very
| valuable in improving the model, that's different. I
| personally think that's not the case now, but who knows.)
| YetAnotherNick wrote:
| The difference between OpenAI and the next best model seems
| to be increasing, not decreasing. Maybe Google's Gemini
| could be competitive, but I don't believe open source will
| ever match OpenAI's capability.
|
| Also, OpenAI gets a significant discount on compute due to
| favourable deals with Nvidia and Microsoft. And they can
| design their servers better for their homogeneous needs;
| they are already working on an AI chip.
| goosinmouse wrote:
| Are you using 3.5 turbo? It's always funny when I test a new
| fun chatbot or something and see my API usage 10x just from a
| single GPT-4 API call. Although I usually only have a $2 bill
| every month from OpenAI.
| littlestymaar wrote:
| > you should probably just be using the best that's available
| (OpenAI).
|
| Sure, if you want to let a monopoly have all the added value
| while you get to keep the rest you can do that.
|
| Just make sure you're never successful enough to inspire them
| though, otherwise you're dead the next minute. Oops.
| zzbn00 wrote:
| p4d.24xlarge spot price is $8.2 / hour in US East 1 at the
| moment...
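|
| Annualized, against the ~$166k/year on-demand figure mentioned
| upthread (a back-of-the-envelope check):
|
|   print(8.2 * 24 * 365)  # ~$71,832/year for 8 A100s at spot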
| charlesischuck wrote:
| Good luck getting that lol
| tester756 wrote:
| >iPhone of artificial intelligence
|
| It feels like the biggest investor bait of this year
|
| Will it beat ARM IPO?
| lossolo wrote:
| It's also worth noting that if you build your business on using
| OpenAI's LLM or Anthropic etc, then, in the majority of cases
| I've seen so far (no fine tuning etc), your competitor is just
| one prompt away from replicating your business.
| beauHD wrote:
| I signed up for OpenAI's ChatGPT tool and entered a query like
| 'What does the notation 1e100 mean?' (just to try it out).
| When displaying the output, it would produce the reply slowly,
| like it was being drip-fed to me, and I was like: 'what?
| surely this could be faster?'
|
| Maybe I'm missing something crucial here, but why does it
| drip-feed answers like this? Does it have to think really hard
| about the meaning of 1e100? Why can't it just spit it out
| instantly without such a delay/drip, like the near-instant
| Wolfram Alpha?
| baby wrote:
| You can, but then you'd wait longer before seeing anything. So
| one way to get faster-feeling answers is to stream the response
| as it is generated. And in GPT-based apps the response is
| generated token by token (~4 chars), hence what you're seeing.
| maccam912 wrote:
| It's a result of how these transformer models work. It's
| pretty quick for the amount of work it does, but it's not
| looking anything up; it's generating the reply one token at a
| time.
| notRobot wrote:
| Under the hood, GPT works by predicting the next token when
| provided with an input sequence of words. At each step a single
| word is generated taking into consideration all the previous
| words.
|
| https://ai.stackexchange.com/questions/38923/why-does-chatgp...
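|
| A minimal sketch of that loop (the model here is a stand-in
| for one forward pass of the network):
|
|   def generate(model, prompt_tokens, max_new_tokens=50):
|       tokens = list(prompt_tokens)
|       for _ in range(max_new_tokens):
|           logits = model(tokens)  # one full pass per new token
|           next_token = max(range(len(logits)),
|                            key=logits.__getitem__)  # greedy
|           tokens.append(next_token)
|           yield next_token  # stream each token as it's ready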
| swatcoder wrote:
| The non-technical way to think about it is that ChatGPT "thinks
| out loud" and can _only_ "think out loud".
|
| Future products would be able to hide some of that, but for
| now, that's what the ChatGPT / Bing Assistant product does.
| codedokode wrote:
| Because it needs to do billions of arithmetic operations to
| generate a reply. Replying to questions is not an easy task.
| iambateman wrote:
| This is _the_ playbook for big, fast scaling companies...Uber
| subsidized every ride for _a decade_ before finally charging
| market price, just to make sure that Uber was the only option
| which made sense.
|
| While it's nice to consume the cheap stuff, it is not good for
| healthy markets.
| matteoraso wrote:
| It's not even just the cost of finetuning. The API pricing is so
| low, you literally can't save money by buying a GPU and running
| your own LLM, no matter how many tokens you generate. It's an
| incredible moat for OpenAI, but something they can't provide is
| an LLM that doesn't talk like an annoying HR manager, which is
| the real use case for self-hosting.
| rosywoozlechan wrote:
| The service quality sucks; you're getting what you pay for. We
| switched to the Azure OpenAI APIs because of all the service
| quality issues.
| layer8 wrote:
| Isn't OpenAI too cheap to be sustainable, and currently living
| off Microsoft's $10B investment?
| xnx wrote:
| Nothing in that article convinces me the situation couldn't
| change entirely in any given month. Google Gemini could be more
| capable. Any number of new players (AWS, Microsoft, Apple) could
| enter the market in a serious way. The head-start OpenAI has in
| usage data is small and probably eclipsed by the clickstream and
| data stores that Google and Microsoft have access to. I see no
| durable advantage for OpenAI.
| freedomben wrote:
| Gemini very well might be the biggest threat to OpenAI. ChatGPT
| has first-mover advantage so has a decent moat, but the number
| of people willing to pay $20 per month for something worse[1]
| than what they get for free with google.com is going to
| dwindle. I'd be very worried if I were them.
|
| [1]: That knowledge cutoff and terrible UX of browse the web is
| brutal compared to the experience of Bard
| appplication wrote:
| The premise of this is flawed. OpenAI is cheap because it has
| to be right now. They need to establish market dominance
| quickly, before competitors slide in. The winner of this horse
| race is not going to be the company with the best-performing
| AI; it's going to be the one who does the best job of creating
| an outstanding UX, a ubiquitous presence, entrenched users,
| and competitive moats that are not feature-differentiated,
| because at best even cutting-edge features are only 6-12
| months ahead of the competition cloning or beating them.
|
| This is Uber/AirBnB/Wework/literally every VC subsidized hungry-
| hungry-hippos market grab all over again. If you're falling in
| love because the prices are so low, that is ephemeral at best and
| is not a moat. Someone try calling an Uber in SF today and tell
| me how much that costs you and how much worse the experience is
| vs 2017.
|
| OpenAI is the undisputed future of AI... for timescales 6 months
| and less. They are still extremely vulnerable to complete
| disruption and as likely to be the next MySpace as they are
| Facebook.
| shaburn wrote:
| Your Uber/AirBnB/WeWork examples all have physical base units
| with costs ascending due to inflation and only theoretical
| economies of scale.
|
| AI models have some GPU constraints but could easily reach a
| state where the cost to operate falls and becomes relatively
| trivial, with almost no lower bound, for most use cases.
|
| You are correct there is a race for marketshare. The crux in
| this case will be keeping it. Easy come, easy go. Models often
| make the worst business model.
| monocasa wrote:
| Probably why Altman has been talking so much about how
| dangerous it is and how regulations are needed. No natural
| moat, so building a regulatory one.
| blackoil wrote:
| This point is discussed in the article. The title is not aimed
| at Google/Meta; they'll invest all the billions they have to.
|
| It is aimed at the consumers of these models: is there even a
| point in training your own or experimenting with OSS?
| hendersoon wrote:
| Sure, open models often require much less hardware than
| ChatGPT 3.5 and offer ballpark (and constantly improving)
| performance and accuracy. ChatGPT 3.5 scores 85 on ARC, and
| the HuggingFace leaderboard is up to 77.
|
| If you need GPT-4-quality responses they aren't close yet,
| but it'll happen.
| toddmorey wrote:
| Just heard Steve from Builder.io today, who did an impressive
| launch of Figma -> code powered by AI.
|
| They trained a custom model for this. Better accuracy, sure,
| but I was a little surprised to see how much faster it is
| than GPT-4.
|
| Based on their testing, they've become believers in domain
| specific smaller models, especially for performance.
| ldjkfkdsjnv wrote:
| Completely wrong, the best AI will win. There is insane demand
| for better models.
| datadrivenangel wrote:
| There is insane demand for good enough models at extremely
| good prices.
|
| Better beyond a certain point is unlikely to be competitive
| with the cheaper models.
| oceanplexian wrote:
| Yep, quality over quantity. The difference between 99.9%
| accurate and 99.999% accurate can be ridiculously valuable in
| so many real world applications where people would apply
| LLMs.
| gbmatt wrote:
| Only Big Tech (Microsoft, Google, Facebook) can crawl the web
| at scale, because they own the major content companies and
| severely throttle competitors' crawlers, sometimes outright
| blocking them. I'm not saying it's impossible to get around,
| but it is certainly very difficult, and you could be thrown
| in prison for violating the CFAA.
| PaulHoule wrote:
| I'm not sure if training on a vast amount of content is
| really necessary, in the sense that linguistic competence
| and knowledge can probably be separated to some extent.
| That is, the "ChatGPT" paradigm leads to systems that just
| confabulate and "make shit up", and making something
| radically more accurate means going to something retrieval-
| based or knowledge-graph-based.
|
| In that case you might be able to get linguistic competence
| with a much smaller model that you end up training with a
| smaller, cleaner, and probably partially synthetic data
| set.
| wkat4242 wrote:
| The improvements seem to be leveling off already. GPT-4 isn't
| really worth the extra price to me. It's not that much
| better.
|
| What I would really want though is an uncensored LLM. OpenAI
| is basically unusable now, most of its replies are like "I'm
| only a dumb AI and my lawyers don't want me to answer your
| question". Yes I work in cyber. But it's pretty insane now.
| bugglebeetle wrote:
| GPT-4, correctly prompted, is head and shoulders above
| everything for coding. All the text generation stuff and
| NLP tasks, it's a toss-up.
| jrockway wrote:
| I haven't played with the self-hosted LLMs at all yet, but
| back when Stable Diffusion was brand new I had a ton of fun
| creating images that lawyers wouldn't want you to create.
| ("Abraham Lincoln and Donald Trump riding a battle
| elephant." It's just so much funnier with living people!) I
| imagine that Llama-2 and friends offer a similar
| experience.
| PaulHoule wrote:
| Depends how you define quality. This paper reflects my own
| experience
|
| https://arxiv.org/abs/2305.08377
|
| and shows how LLM technology has a lot more to offer than
| "ChatGPT". The real takeaway is that by training LLMs with
| real training data (even with a "less powerful" model) you
| can get an error rate more than 10x less than you get with
| the "zero shot" model of asking ChatGPT to answer a question
| for you the same way that Mickey Mouse asked the broom to
| clean up for him in _Fantasia._ The "few-shot" approach of
| supplying a few examples in the attention window was a little
| better but not much.
|
| The problem isn't something that will go away with a more
| powerful model because the problem has a lot to do with the
| intrinsic fuzziness of language.
|
| People who are waiting for an exponentially more expensive
| ChatGPT-5 to save them will be pushing a bubble around under
| a rug endlessly while the grinds who formulate well-defined
| problems and make training sets will actually cross the
| finish line.
|
| Remember that Moore's Law is over in the sense that
| transistors are not getting cheaper generation after
| generation, that is why the NVIDIA 40xx series is such a
| disappointment to most people. LLMs have some possibility of
| getting cheaper from a software perspective as we understand
| how they work and hardware can be better optimized to make
| the most of those transistors, but the driving force of the
| semiconductor revolution is spent unless people find some
| entirely different way to build chips.
|
| But... people really want to be like Mickey in _Fantasia_ and
| hope the grinds are going to make magic for them.
| sbierwagen wrote:
| > Remember that Moore's Law is over in the sense that
| transistors are not getting cheaper generation after
| generation, that is why the NVIDIA 40xx series is such a
| disappointment to most people.
|
| Huh? The NVIDIA H100 has twice the FLOPS of the A100 on a
| smaller die. How is that not Moore's law?
| mg wrote:
| I don't think Uber and AirBnB are good comparisons.
|
| Both are B2C and have network effects.
| paul7986 wrote:
| The Pi iPhone app has a solid UX, and it would be even better
| if Apple bought it and integrated it into Siri.
| kcorbitt wrote:
| Eh, OpenAI is too cheap to beat at their own game.
|
| But there are a ton of use-cases where a 1 to 7B parameter fine-
| tuned model will be faster, cheaper and easier to deploy than a
| prompted or fine-tuned GPT-3.5-sized model.
|
| In fact, it might be a strong statement but I'd argue that _most_
| current use-cases for (non-fine-tuned) GPT-3.5 fit in that
| bucket.
|
| (Disclaimer: currently building https://openpipe.ai; making it
| trivial for product engineers to replace OpenAI prompts with
| their own fine-tuned models.)
| kristjansson wrote:
| This article might have a point about the data flywheel, but it's
| lost in the confused economics in the second half. Why would we
| expect to hire one engineer per p4.24x instance? Why do we think
| OpenAI needs a whole p4.24x to run fine tuning? Why do we ignore
| the higher costs on the inference side for fine-tuned models? Why
| do we think OpenAI spends _any_ money on racking-and-stacking
| GPUs rather than just take them at (hyperscaler) cost from Azure?
| oceanplexian wrote:
| Has anyone actually used GPT4? It's not "cheap".
|
| It was roughly $150 for me to build a small dataset with a few
| thousand quarter-page chunks of text for a data project using
| GPT-4. GPT-3 is substantially cheaper, but it would hallucinate
| 30% of the time; honestly a nice fine-tune of LLaMA is on par
| with GPT-3, and after the sunk cost all it takes is a few cents
| of electricity to generate the same-sized dataset.
| slowhadoken wrote:
| It's insanely expensive to run and operate "AI". Meredith
| Whittaker's talk on AI is very insightful
| https://www.youtube.com/watch?v=amNriUZNP8w
| slowhadoken wrote:
| Thanks to traumatized $2 an hour Kenyan labor, yeah
| https://time.com/6247678/openai-chatgpt-kenya-workers/
| pimpampum wrote:
| Classic anti-competitive strategy: sell below cost and burn
| money until the competition is out, then sell higher than you
| could ever have sold with competition.
| BrunoJo wrote:
| We just started a service offering different open-source models
| through an OpenAI-compatible API [1]. The pricing isn't final
| and we haven't officially launched yet, but you should be able
| to save at least 75% compared to GPT-3.5.
|
| [1] https://lemonfox.ai/
| Meegul wrote:
| Are you doing this profitably? If so, does that entail owning
| your own hardware or renting from cheaper services such as
| Lambda?
| slowhadoken wrote:
| None of it is cheap; "AI" is insanely expensive. Meredith
| Whittaker talks about it in this interview:
| https://www.youtube.com/watch?v=amNriUZNP8w She's the president
| of the Signal Foundation.
| AJRF wrote:
| I read this and think "That won't last long".
|
| The pricing is too good to be true when you think about it
| rationally. If they raise prices, they'll seem much, much less
| attractive than using AWS or Azure.
|
| Amazon seem to have a much better business built around their
| Bedrock offering. And all their other tools are available there
| like SageMaker, ec2, integration with MLFlow, etc, etc.
|
| I guess the same goes for Azure, if you are already using it it's
| much easier to just stick with whatever they are offering for LLM
| Ops.
|
| OpenAI offering just models doesn't seem like it can last
| forever, and to compete with AWS or Azure at enterprise level
| they need to build all the things Amazon/MS have built.
|
| The other side of that coin seems much more realistic.
| DominikPeters wrote:
| > While per-token inference costs for fine-tuned GPT-3.5 is 10x
| more expensive than GPT-3.5 it is still 10x cheaper than GPT-4!
|
| Not quite accurate; fine-tuned 3.5 is only ~4x cheaper than
| GPT-4. Cost per million output tokens from
| https://openai.com/pricing:
|
| $2 - GPT-3.5
| $16 - fine-tuned GPT-3.5
| $60 - GPT-4