|
| eurekin wrote:
| Didn't see batching taken into the equation; it might skew the
| numbers a bit
| sidnb13 wrote:
| Yep, batching is a feature I really wish the OpenAI API had.
| That and the ability to intelligently cache frequently used
| prompts. Much easier to achieve this with a hosted OS model, so
| I guess it's a speed + customizability/cost tradeoff for the
| time being.
| advaith08 wrote:
| imo they don't have batching because they pack sequences before
| passing them through the model, so a single sequence in a batch
| on OpenAI might contain requests from multiple customers
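|
| A rough sketch of how that packing could work (the separator
| token id and context length here are illustrative assumptions,
| not OpenAI's actual setup):
|
|   SEP = 50256       # hypothetical end-of-text/separator token
|   MAX_LEN = 2048    # hypothetical context window
|
|   def pack_requests(requests):
|       # Greedily concatenate short requests into full-length
|       # sequences so the GPU wastes no cycles on padding.
|       sequences, current = [], []
|       for tokens in requests:
|           if len(current) + len(tokens) + 1 > MAX_LEN:
|               sequences.append(current)
|               current = []
|           current += tokens + [SEP]
|       if current:
|           sequences.append(current)
|       return sequences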
| jonplackett wrote:
| Is this a reflection of OpenAI's massive scale making it so cheap
| for them?
|
| Or is it the deal with Microsoft for cloud services making it
| cheap?
|
| Or are they just operating at a massive loss to kill off other
| competition?
|
| Or something else?
| 4death4 wrote:
| Probably all three:
|
| 1) They hire top talent to make their models as efficient as
| possible.
|
| 2) They have a sweetheart deal with MS.
|
| 3) They're better funded than everyone else and bringing in
| substantial revenue.
| smachiz wrote:
| deleted
| ryduh wrote:
| Is this a guess or is it informed by facts?
| sebzim4500 wrote:
| Are you just suggesting this as an option, or do you have
| evidence that it's true?
| ugjka wrote:
| They are also trying to lobby the government for AI
| "regulation" in order to limit any competitor's ability to
| reach OpenAI's level
| wkat4242 wrote:
| They basically are MS by now. Everyone at Microsoft I work
| with literally calls it an 'acquisition', even though they
| only own a share. It's pretty clear what their plans are.
| SkyMarshal wrote:
| Probably the first two, plus first-mover brand recognition.
| Millions of $20 monthly subs for GPT4 add up.
|
| They might also be operating at a loss afaik, but I suspect
| they're one of the few that can break even just based on scale,
| brand recognition, and economics.
| michaelbuckbee wrote:
| $20/mo subs are also the lead-in to unlocking paid API
| access.
| sarchertech wrote:
| I haven't heard any evidence that they have millions of Plus
| subscribers.
|
| I've seen 100 to 200 million active users, but nothing about
| paid users from them. The surveys I saw when doing a quick
| google search reported much less than 1% of users paying.
| SkyMarshal wrote:
| Yeah I don't know what the actual subscription numbers are,
| would be surprised if OpenAI is publishing that info.
| ShadowBanThis01 wrote:
| They're mining the gullible for phone numbers, among other
| things.
| vsreekanti wrote:
| Probably some combination of all the above! I think 1 and 2 are
| interlinked though -- the cheaper they can be, the more they
| build that moat. They might be eating the cost on these APIs
| too, but unlike the Uber/Lyft war, it'll be way stickier.
| te_chris wrote:
| There's also just the benefits of being in market, at scale and
| being exposed to the full problem space of serving and
| maintaining services that use these models. It's one thing to
| train and release an OSS model, it's another to put it into
| production and run all the ops around it.
| iliane5 wrote:
| I think it's mostly the scale. Once you have a consistent user
| base and tons of GPUs, batching inference/training across your
| cluster allows you to process requests much faster and for a
| lower marginal cost.
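|
| Back-of-the-envelope illustration (all numbers hypothetical):
|
|   GPU_COST_PER_HOUR = 2.0   # assumed A100 rental price
|   BASE_THROUGHPUT = 1000    # requests/hour at batch size 1
|
|   for batch_size in (1, 8, 32):
|       throughput = BASE_THROUGHPUT * batch_size  # idealized
|       print(batch_size, GPU_COST_PER_HOUR / throughput)
|   # batch=1: $0.002/request; batch=32: ~$0.00006/request --
|   # the same hardware serves each request ~32x cheaper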
| ilaksh wrote:
| I think the weird thing about this is that it's completely true
| right now but in X months it may be totally outdated advice.
|
| For example, efforts like OpenMOE
| https://github.com/XueFuzhao/OpenMoE or similar will probably
| eventually lead to very competitive performance and cost-
| effectiveness for open source models. At least in terms of
| competing with GPT-3.5 for many applications.
|
| Also see https://laion.ai/
|
| I also believe that within say 1-3 years there will be a
| different type of training approach that does not require such
| large datasets or manual human feedback.
| sidnb13 wrote:
| > I also believe that within say 1-3 years there will be a
| different type of training approach that does not require such
| large datasets or manual human feedback.
|
| I guess if we ignore pretraining, doesn't sample-efficient
| fine-tuning on carefully curated instruction datasets sort of
| achieve this? LIMA and OpenOrca show some really promising
| results to date.
| sharemywin wrote:
| DistilBERT was distilled from BERT. There might be an angle in
| using another model to train your model, especially if you're
| trying to get something to run locally.
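|
| The core of that idea, as a minimal PyTorch-style sketch (the
| temperature and loss form follow Hinton et al.'s distillation
| recipe; the names are illustrative):
|
|   import torch.nn.functional as F
|
|   def distillation_loss(student_logits, teacher_logits, T=2.0):
|       # Soften both distributions, then push the student's
|       # predictions toward the teacher's.
|       soft_targets = F.softmax(teacher_logits / T, dim=-1)
|       log_probs = F.log_softmax(student_logits / T, dim=-1)
|       return F.kl_div(log_probs, soft_targets,
|                       reduction="batchmean") * T * T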
| nico wrote:
| > I also believe that within say 1-3 years there will be a
| different type of training approach that does not require such
| large datasets or manual human feedback
|
| This makes a lot of sense. A small model that "knows" enough
| English and a couple of programming languages should be enough
| for it to replace something like copilot, or use plug-ins or do
| RAG on a substantially larger dataset
|
| The issue right now is that to get a model that can do those
| things, the current algorithms still need massive amounts of
| data, way more than what the final user needs
| Dwedit wrote:
| Abbreviate Mix of Experts as "MoE" and the Anime fans
| immediately start rushing in...
| daft_pink wrote:
| I'm confused, don't A100s cost $10,000 to buy? Why would you
| pay $166k per year to rent?
| sidnb13 wrote:
| I would assume the datacenter and infra needed would also
| contribute a sizeable chunk to the costs when you consider
| upkeep to run it 24/7
| latchkey wrote:
| For the same reason people use AWS.
|
| Spending the capex/opex to run a cluster of compute isn't easy
| or cheap. It isn't just the cost of the GPU, but the cost of
| everything else around it that isn't just monetary.
| etothepii wrote:
| This could be an interesting comparison. My experience with
| AWS is that it was super easy and cheap to start on. By the
| time we _could_ use whole servers we were using so much AWS
| orchestration that it's going to be put off until we are at
| least $1M ARR, and probably til we are at $5M.
|
| Making adoption easy and giving a free base tier but charging
| more later could be a very effective model for getting
| startups stuck on you. It probably even makes adoption by
| small teams in big companies possible, which can then grow ...
| dekhn wrote:
| How much does an A100 consume in power a year (in dollar
| costs)? How much does it cost to hire and retain datacenter
| techs? How long does it take to expand your fleet after a user
| says "we're gonna need more A100s?" How many discounts can you
| get as a premier customer?
|
| Answer these questions, and the equation shifts a bunch!
| shrubble wrote:
| Not really.
|
| A full rack with 16 amps usable power and some bandwidth is
| $400/month in Kansas City, MO. That is enough to power 5x
| A100s 24x7, so $10k plus $80 per month each, amortized; of
| course many more A100s would drop the price.
|
| Once installed in the rack ($250 one-time cost) you shouldn't
| need to touch it. So $10k plus ~$1250 per A100, per year,
| including power. You can put 2 or 3 A100s per cheap Celeron-
| based motherboard.
|
| Of course if doing very bursty work then it may well make
| sense to rent...
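|
| Putting those figures into code (purchase and rack costs from
| above; the rent figure assumes the ~$1.10/hr-per-A100
| marketplace pricing mentioned elsewhere in this thread):
|
|   A100_PRICE = 10_000                  # purchase, per GPU
|   RACK_PER_GPU_YEAR = 400 / 5 * 12     # $400/mo rack, 5 GPUs
|   RENT_PER_GPU_YEAR = 1.10 * 24 * 365  # ~$9,600/yr rented
|
|   for years in (1, 2, 3):
|       own = A100_PRICE + RACK_PER_GPU_YEAR * years
|       rent = RENT_PER_GPU_YEAR * years
|       print(years, round(own), round(rent))
|   # at full utilization, owning overtakes renting in year 2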
| akomtu wrote:
| And how many A100s do you need to do something meaningful
| with LLMs?
| shrubble wrote:
| The funding has to come from somewhere, right? You either
| pay up front and save money over time, or pay as you go
| and pay more...
| dekhn wrote:
| Did you also include the network required to make the A100s
| talk to each other? Both the datacenter network (so the
| CPUs can load data) and the fabric (so the A100s can talk?)
|
| You also left out the datacenter tech costs, probably at
| least $50K/individual-year in KC (although I guess I'd just
| work for free ribs).
|
| If you're putting A100s into celeron motherboards... I
| don't know what to say. You're not saving money by putting
| a ferrari engine in a prius.
| latchkey wrote:
| $50m GPU capex (which is A LOT) is about 2-3MW of power, it
| isn't that much.
|
| The problem though is that getting 2-3MW of power in the US
| is increasingly difficult and you're going to pay a lot more
| for it since the cheap stuff is already taken.
|
| Even more distressing is that if you're going to build new
| data center space, you can't get the rest of the stuff in the
| supply chain... backup gennies, transformers, cooling towers,
| etc...
| amluto wrote:
| Those are 8x A100 systems.
| joefourier wrote:
| AWS is extremely overpriced for nearly every service. I don't
| know why anyone else outside of startups with VC money to burn
| or bigcos that need the "no one ever got fired for buying IBM"
| guarantee would use them. You're better off with Lambdalabs or
| others which charge only $1.1/h per A100.
|
| Also, that is an 8xA100 system as others have noted, but it is
| the 40GB one, which can be found on eBay for as low as $3k if
| you go with the SXM4 version (although the price of supporting
| components may vary) or $5k for the PCIe version.
| wg0 wrote:
| There are only two services that are dirt cheap and reliably
| useful: S3 and SQS. The rest can get very expensive very
| quickly.
|
| You can build a lot of stuff on top of these two.
| ommpto wrote:
| Even for S3 while the storage is dirt cheap they still have
| exorbitant bandwidth pricing.
| charcircuit wrote:
| S3 is not dirt cheap. Bandwidth is ludicrously expensive.
| charlesischuck wrote:
| You pay for the system, not the GPU, with AWS.
|
| It's absolutely worth the money when you look at the whole
| picture. Also lambda labs never has availability. I actually
| can schedule a distributed cluster on AWS.
| AndroTux wrote:
| > It's absolutely worth the money when you look at the
| whole picture.
|
| That highly depends on many things. If you run a business
| with a relatively steady load that doesn't need to scale
| quickly multiple times per day, AWS is definitely not for
| you. Take Let's Encrypt[1] as an example. Just because
| cloud is the hype doesn't mean it's always worth it.
|
| Edit: Or a personal experience: I had a customer that
| insisted on building their website on AWS. They weren't
| expecting high traffic loads and didn't need high
| availability, so I suggested to just use a VPS for $50 a
| month. They wanted to go the AWS route. Now their website
| is super scalable with all the cool buzzwords and it costs
| them $400 a month to run. Great! And in addition, the whole
| setup is way more complex to maintain since it's built on
| AWS instead of just a simple website with a database and
| some cache.
|
| [1] https://news.ycombinator.com/item?id=37536103
| nharada wrote:
| Sometimes I need 512 GPUs for 3 days.
| charlesischuck wrote:
| A top-end GPU that makes you competitive now costs $20-50k each.
|
| To train a top model you need hundreds of them in a very
| advanced datacenter.
|
| You can't just plug gpus into standard systems and train,
| everything is custom.
|
| The technical talent required for these systems is rare to say
| the least. The technical talent to make a model is also rare.
|
| I trained a few foundation models with images, and I would
| NEVER buy any of them. These guys are on a wildly different
| scale than basically everyone.
| SkyMarshal wrote:
| I think OpenAI may eventually have to go upmarket, as basic "good
| enough" AI becomes increasingly viable and cheap/free on consumer
| level devices, supplied by FOSS models and apps.
|
| Apple may be leading the way here, with Apple Silicon
| prioritizing AI processing and built into all their devices.
| These capabilities are free (or at least don't require an extra
| sub), and just used to sell more hardware.
|
| OpenAI is clearly going to compete in that market with its
| upcoming smart phone or device [1]. But what revenue model can
| OpenAI use to compete with Apple's and not get undercut by it? I
| suppose hardware + free GPT3.5, and optional subscription to GPT4
| (or whatever their highest end version is). Maybe that will be
| competitive.
|
| I also wonder what mobile OS OpenAI will choose. Probably not
| Android, otherwise they would have partnered with Google. A
| revamped and updated Microsoft mobile OS maybe, given their MS
| partnership? Or something new and bespoke? I could imagine Jony
| Ive demanding something new, purpose-built, and designed from
| scratch for a new AI-oriented UI/UX paradigm.
|
| A market for increasingly sophisticated AI that can only be done
| in huge GPU datacenters will exist, and that's probably where the
| margins will be for a long time. I think that's what OpenAI,
| Microsoft, Google, and the others will be increasingly competing
| for.
|
| [1]:https://www.reuters.com/technology/openai-jony-ive-talks-
| rai...
| vsreekanti wrote:
| Yep, we agree that the obvious direction of innovation for OSS
| models is smaller and cheaper, likely at roughly the same
| quality: https://generatingconversation.substack.com/p/open-
| source-ll...
| smcleod wrote:
| Also more privacy respecting, and more customisable /
| flexible.
| mensetmanusman wrote:
| Please Apple let me replace worthless Siri with ChatGPT on my
| iPhone.
|
| Would completely change how I use the device.
| bitcurious wrote:
| If you have the new iPhone with the action button, you can
| set a shortcut to ask questions of ChatGPT. It's not as fluid
| as Siri, and can't control anything, but still much more
| useful.
| CamperBob2 wrote:
| Just yesterday, while driving: "Read last message."
|
| Siri: "Sorry. Dictation service is unavailable at the
| moment."
|
| It's past time for excuses. High-level people at Apple need
| to be fired over this. Hello? Tim? Do your job. Hello?
| Anybody home...?
| freedomben wrote:
| Nobody is switching away from Apple over this, so
| ultimately Tim _is_ doing his job. Under his watch Apple
| has become the de facto choice for entire generations.
| Between vendor lock-in/walled gardens and societal/cultural
| pressures (don't want to be a green bubble!), they have one
| of the stickiest user bases there are.
| mensetmanusman wrote:
| True, but that doesn't mean we shouldn't complain.
|
| My hope is that the upcoming EU rulings allow competition
| here, i.e. force Apple to get out of the way of making
| their hardware better with better software.
| CamperBob2 wrote:
| Stop excusing shitty work from trillion-dollar companies.
| It makes the world a worse place.
| smoldesu wrote:
| I think it's shitty and has no excuse, but the parent is
| right. Apple has no incentive to respond to their users
| since all roads lead to first-party Rome. It's why stuff
| like the Digital Market Act is more needed than some
| people claim.
|
| You know what would get Apple to fix this? Forced
| competition. You know what Apple spends their trillions
| preventing?
| layer8 wrote:
| Apple is ramping up spending in that area:
| https://www.macrumors.com/2023/09/06/apple-conversational-
| ai...
|
| It'll probably take a while though.
| grahamplace wrote:
| > OpenAI is clearly going to compete in that market with its
| upcoming phone
|
| What phone are you referring to? A quick google didn't seem to
| pull up anything related to OpenAI launching a hardware
| product?
| BudaDude wrote:
| They are most likely referring to this in collaboration with
| Jony Ive:
|
| https://www.yahoo.com/entertainment/openai-jony-ive-talks-
| ra...
| SkyMarshal wrote:
| Yes that one.
| jimkoen wrote:
| > OpenAI is clearly going to compete in that market with its
| upcoming phone.
|
| Excuse me, I'm not a native English speaker; do you mean like
| a smartphone? Or some sort of other new business direction?
| Where did you get the info that they're planning to launch a
| phone?
| MillionOClock wrote:
| I believe there have been rumors that OpenAI was working
| with Jony Ive to create a wearable device, but it was
| unclear whether it would be a phone or something else.
| SkyMarshal wrote:
| Yes a smartphone, /corrected. It's a recent announcement:
|
| https://www.nytimes.com/2023/09/28/technology/openai-
| apple-s...
| sharemywin wrote:
| It's not really a phone; they mention ambient computing.
| SkyMarshal wrote:
| Oh, smart device then.
| layer8 wrote:
| https://www.reuters.com/technology/openai-jony-ive-talks-
| rai...
| layer8 wrote:
| Where do you get the confidence that Apple will be able to
| catch up to OpenAI's GPT? "Apple's built-in AI capabilities"
| are very weak so far.
| filterfiber wrote:
| Not OP,
|
| In my experience apple's ML on iphones is seamless. Tap and
| hold on your dog in a picture and it'll cut out the
| background, your photos are all sorted automatically
| including by person (and I think by pet).
|
| OCR is seamless - you just select text in images as if it was
| real text.
|
| I totally understand these aren't comparable to LLMs - rumor
| has it apple is working on an llm - if their execution is
| anything like their current ML execution it'll be glorious.
|
| (Siri objectively sucks although I'm not sure it's fair to
| compare siri to an LLM as AFAIK siri does not do text
| prediction but is instead a traditional "manually crafted
| workflow" type of thing that just uses S2T to navigate)
| blackoil wrote:
| > OCR is seamless
|
| Wasn't that solved about a decade ago? Does anyone still
| suck at that?
| filterfiber wrote:
| > Does anyone suck at that?
|
| Does android even have native OCR? Last I checked
| everything required an OCR app of varying quality
| (including windows/linux).
|
| On iOS/macOS you can literally just click on a picture
| and select the text in it as if it weren't a picture. On
| iOS you don't even need to open an app to do it; you can
| select text in any picture.
|
| Last I checked the Opensource OCR tools were decent but
| behind the closed source stuff as well.
|
| Random google result of OCR on android (could be
| outdated) - https://www.reddit.com/r/androidapps/comments
| /10te5et/why_oc...
| smoldesu wrote:
| > Does android even have native OCR?
|
| Tesseract? https://github.com/tesseract-ocr/tesseract
| SkyMarshal wrote:
| I'm not saying they will on the high-end, but maybe on the
| low end. Apple's strategy is to embed local AI in all their
| devices. Local AI will never be as capable as AI running in
| massive GPU datacenters, but if it can get to a point that
| it's "good enough" for most average users, that may be enough
| for Apple to undercut the low end of the market.
| freedomben wrote:
| > _Local AI will never be as capable as AI running in
| massive GPU datacenters_
|
| I'm not sure this is true, even in the short term. For some
| things yes, that's definitely true. But for other things
| that are real-time or near real-time where network latency
| would be unacceptable, we're already there. For example,
| Google's Pixel 8 launch includes real-time audio
| processing/enhancing which is made possible by their new
| Tensor chip.
|
| I'm no fan of Apple, but I think they're on the right path
| with local AI. It may even be possible that the tendency of
| other device makers to put AI in the cloud might give Apple
| a much better user experience, unless Google can start
| thinking local-first which kind of goes against their
| grain.
| SkyMarshal wrote:
| _> But for other things that are real-time or near real-
| time where network latency would be unacceptable, we're
| already there._
|
| Agreed. Something else I wonder is if local AI in mobile
| devices might be better able to learn from its real-time
| interactions with the physical world than datacenter-
| based AI.
|
| It's walking around in the world with a human with all
| its various sensors recording in real-time (unless
| disabled) - mic, camera, GPS/location, LiDAR, barometer,
| gyro, accelerometer, proximity, ambient light, etc. Then
| the human uses it to interact with the world too in
| various ways.
|
| All that data can of course be quickly sent to a
| datacenter too, and integrated into the core system
| there, so maybe not. But I'm curious about this
| difference and wonder what advantages local AI might
| eventually confer.
| sharemywin wrote:
| I wonder if, by sending the embeddings or some higher-level
| compressed latent vector to the cloud, you couldn't get the
| best of both worlds.
|
| GPS, phone orientation, last 5 apps you were in, etc. -->
| embedding
|
| You might even have something like "what time is it?"
| compressed as its own embedding.
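|
| A toy sketch of that idea (every name here is hypothetical; a
| real system would use a learned encoder, not this stub):
|
|   import numpy as np
|
|   def encode_context(gps, orientation, recent_apps):
|       # Compress on-device context into one fixed-size vector
|       # to send to the cloud instead of the raw sensor data.
|       app_feats = [hash(a) % 1000 / 1000 for a in recent_apps[:5]]
|       features = list(gps) + list(orientation) + app_feats
|       vec = np.zeros(16)
|       vec[:len(features)] = features  # placeholder "embedding"
|       return vec
|
|   vec = encode_context((37.77, -122.42), (0.0, 1.57, 0.0),
|                        ["maps", "messages"])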
| huevosabio wrote:
| OpenAI will make its money on enterprise deals for fine-tuning
| their latest and greatest on corporate data. They are already
| landing these big enterprise deals, and I think that's where
| the money is.
|
| They will keep pricing the off-the-shelf AI at-cost to keep
| competitors at bay.
|
| As for competitors, Anthropic is the most similar to OpenAI
| both in capabilities and business model. I am not sure what
| Google is up to, since historically their focus has been on
| using AI to enhance their products rather than making it a
| product. The "dark horses" here are Stability and Mistral,
| which are both OSS and European and will try to make that
| their edge: they give the models away for _free_ but sell to
| institutional clients that are more sensitive to which models
| are being used and where the data is being handled.
|
| Amazon and Apple are probably catching up. Apple likely thinks
| that all of this just makes their own hardware more attractive.
| It's not clear to me what Meta's end goal is.
| tmpz22 wrote:
| > I think OpenAI may eventually have to go upmarket
|
| Let me introduce you to the VC business model. Get comical
| amounts of money. Charge peanuts for an initial product. Build
| a moat once you trap enough businesses inside it. Jack up
| prices.
| sharemywin wrote:
| don't forget the sneaky TOS changes you have to agree to
| robertlagrant wrote:
| OpenAI'd better hope no one else does it too, if that's all
| it takes.
| latchkey wrote:
| I just paid the $20 for a month to try it out. In my super
| limited experience, GPT-4 is actually impressive and worth the
| money.
| smileysteve wrote:
| I've spent the last few weeks comparing Google Duet with
| ChatGPT 3.5, and ChatGPT seems years ahead.
| a_wild_dandan wrote:
| The value I get for that $20/month is astonishing. It's by far
| the best discretionary subscription I've ever had.
|
| That scares me. I hate moats and actively want out. Running the
| uncensored 70B parameter Llama 2 model on my MacBook is great,
| but it's just not a competitive enough general intelligence to
| entirely substitute for GPT-4 yet. I think our community will
| get there, but the surrounding water is deepening, and I'm
| nervous...
| sharemywin wrote:
| > tentatively called "Claude-Next" -- that is 10 times more
| > capable than today's most powerful AI, according to a 2023
| > investor deck TechCrunch obtained earlier this year.
|
| This is the thing that scares me.
|
| When do these models stop getting smarter? Or at least slow
| down?
| minimaxir wrote:
| When the ChatGPT API was released 7 months ago, I posted a
| controversial blog post that the API was so cheap, it made other
| text-generating AI obsolete:
| https://news.ycombinator.com/item?id=35110998
|
| Surprisingly, 7 months later nothing's changed. Open-source
| models are still tricky to make cost-effective, despite the
| many inference optimizations since then. Anthropic Claude is
| closer in price and quality now, but there's no reason to
| switch.
| cainxinth wrote:
| These are still early days. All the major players are willing
| to lose billions to be top of mind with consumers in an
| emerging market.
|
| Either there will be some major technological breakthrough that
| lowers their costs, or they will all eventually start raising
| prices.
| Eumenes wrote:
| "too cheap to beat" sounds anti-competitive and monopolistic.
| Large LLM providers are not dissimilar to industrial operations
| at scale: it requires a lot of infrastructure, and the more you
| buy/rent, the cheaper it gets. Early bird gets the worm I guess.
| stevenae wrote:
| Not sure I understand your comment, but generally you have to
| prove anti-competitiveness /beyond/ too cheap to beat (unless
| it is a proven loss-leader which, viz all big tech companies,
| seems very hard to prove)
| Havoc wrote:
| Yep. Building a project that needs some LLMs. I'm very much of
| the self-hosting mindset so will try DIY, but it's very obviously
| the wrong choice by any reasonable metric.
|
| OpenAI will murder my solution by quality, by availability, by
| reliability and by scalability...all for the price of a coffee.
|
| It's a personal project though & partly intended for learning
| purposes so there is scope for accepting trainwreck level
| tradeoffs.
|
| No idea how commercial projects are justifying this though.
| nine_k wrote:
| One small caveat: OpenAI gets to see all your prompts, and all
| the responses.
|
| Sometimes this can be unacceptable. Law, medicine, finance:
| all of them would prefer a self-hosted, private GPT.
| kevlened wrote:
| Their data retention policy on their APIs is 30 days, and
| it's not used for training [0]. In addition, qualifying use
| cases (likely the ones you mentioned) qualify for zero data
| retention for most endpoints.
|
| [0] - https://platform.openai.com/docs/models/how-we-use-
| your-data
| nine_k wrote:
| In sensitive cases you do not think about the normal
| policy, you think about the worst case. You just can't
| afford a leak. Your local installation may be much better
| protected than a public service, by technology and by
| policy.
| BoorishBears wrote:
| For years people have essentially made a living off FUD
| like "ignore the literal legal agreement and imagine all
| the worst case scenarios!!!" to justify absolutely
| farcical on-premise deployments of a lot of software, but
| AI is starting to ruin the grift.
|
| There _are_ some cases where you really can't afford to
| send Microsoft data for their OpenAI offering... but
| there are a lot more where some figurehead solidified
| their power by insisting the company build less secure
| versions of public offerings instead of letting their
| "gold" go to a 3rd party provider.
|
| As AI starts to appear as a competitive advantage, with
| the self-hosted SOTA lagging so ridiculously far behind,
| you're seeing that work less and less. Take Harvey.ai for
| example: it's a frankly non-functional product, yet it
| still manages to spook top law firms, whose tech policies
| have been entrenched for decades, into paying money for
| something OpenAI-based, on the simple chance they might
| get outcompeted otherwise.
| littlestymaar wrote:
| > and it's not used for training [0].
|
| It's "not be used to train or improve OpenAI models",
| doesn't mean it's not used to get knowledge about your
| prompts, your business use case. In fact, the wording of
| the policy is lose enough they could train a policy model
| on it (just not the LLM itself).
| Der_Einzige wrote:
| A lot of tools for constrained generation, creativity, and the
| like rely on manipulating the entire log-probability
| distribution. OpenAI won't expose this information and is
| therefore shockingly uncompetitive on things like poetry
| generation.
| fulafel wrote:
| This focuses on compute capacity, but wouldn't algorithmic
| improvements give much more bang for the buck at this stage?
| There's so much low-hanging fruit, as evidenced by the
| constant stream of news about getting better results with
| less hardware.
| debacle wrote:
| Open source always wins, in the end. This is a fluff piece.
| downWidOutaFite wrote:
| Where's the open source web search that is beating Google?
| serjester wrote:
| I think this is underappreciated. I run a "talk-to-your-files"
| website with ~$5K MRR and a pretty generous free tier. My
| OpenAI costs have not exceeded $200/mo. People talk about using
| smaller, cheaper models, but unless you have strong data
| security requirements you're burdening yourself with serious
| maintenance work and using objectively worse models to save
| pennies. This doesn't even consider OpenAI continuously
| lowering their prices.
|
| I've talked to a good amount of businesses and 90% of custom use
| cases would also have negligible AI costs. In my opinion, unless
| you're in a super regulated industry or doing genuinely cutting
| edge stuff, you should probably just be using the best that's
| available (OpenAI).
| vsreekanti wrote:
| I completely agree -- open-source models and custom deployments
| just can't compete with the cost and efficiency here. The only
| exception here is _if_ open-source models can get way smaller
| and faster than they are now while maintaining existing
| quality. That will make private deployments and custom fine-
| tuning way more likely.
| SkyMarshal wrote:
| Or FOSS models remain the same size and speed, but hardware
| for running them, especially locally, steadily improves till
| the AI is "good enough" for a large enough segment of the
| market.
| hobs wrote:
| How do you deal with the fact that Azure et al. don't appear
| to be selling anyone additional capacity?
| jejeyyy77 wrote:
| how do your customers feel about you uploading potentially
| confidential documents to a 3rd party?
| CDSlice wrote:
| If they are confidential they probably shouldn't be uploaded
| to any website no matter if it calls out to OpenAI or does
| all the processing on their own servers.
| yunohn wrote:
| It's simple really, lots of businesses share data with 3rd
| parties to enable various services. OpenAI provides a service
| contract claiming they do not mine/reshare/etc the data
| shared via their API. As the SaaS provider, you just need to
| call it out in your user service agreement.
| euazOn wrote:
| Just curious, could you briefly mention some of the custom use
| cases with negligible AI costs? Thanks
| cyode wrote:
| Are any OpenAI powered flows available to public, logged-out
| user traffic? I've worried (maybe irrationally) about doing
| this in a personal project and then dealing with malicious
| actors and getting stuck with a big bill.
| Bukhmanizer wrote:
| The bleeding obvious is that OpenAI is doing what most tech
| companies for the last 20 years have done. Offer the product
| for dirt cheap to kill off competition, then extract as much
| value from your users as possible by either mining data or
| hiking the price.
|
| I don't understand how people are surprised by this anymore.
|
| So yeah, it's the best option right now, when the company is
| burning through cash, but they're planning on getting that
| money back from you _eventually_.
| jaredklewis wrote:
| > Offer the product for dirt cheap to kill off competition,
| then extract as much value from your users as possible by
| either mining data or hiking the price.
|
| Genuine question, what are some examples of companies in that
| "hiking the price" camp?
|
| I can think of tons of tech companies that sold or sell stuff
| at a loss for growth, but struggling to find examples where
| the companies then are able to turn dominant market share
| into higher prices.
|
| To be clear, I'm definitely not implying they are not out
| there, just looking for examples.
| loganfrederick wrote:
| Uber, Netflix and the online content streaming services.
| These are probably the most prominent examples from this
| recent 2010s era.
| spacebanana7 wrote:
| The Google Maps API price hike of 2018 [1] is a relevant
| example.
|
| [1] https://kobedigital.com/google-maps-api-changes
| beezlebroxxxxxx wrote:
| Uber is probably the biggest pure example. When I was in
| uni when they first spread, Uber's entire business model
| was flood the market with hilariously low prices and steep
| discounts. People overnight started using them like crazy.
| They were practically giving away their product. Now they're
| as expensive as, if not sometimes more expensive than, any
| other taxi or ridesharing service in my area.
|
| One thing I'll add is that it's not always that this ends
| with higher prices in an absolute sense, but that the tech
| company is able to essentially cut the knees out of their
| competitors until they're a shell of their former selves.
| Then when the prices go "up", they're in a way a return to
| the "norm", only they have a larger and dominant market
| share because of their crazy pricing in the early stages.
| wkat4242 wrote:
| Yeah I kinda wonder why people even use them anymore.
| I've long gone back to real taxis because they're cheaper
| and I don't have to book them; I can just grab one on the
| street. Much more efficient than slowly watching my driver
| edge his way to me from 3 kilometers away.
| jdminhbg wrote:
| The number of places where you can reliably walk out onto
| the street and hail a taxi is pretty small. Everywhere
| else, the relevant decision is whether calling a
| dispatcher or using a taxi company's app is
| faster/cheaper/more reliable than Uber/Lyft.
| mikpanko wrote:
| - Uber/Lyft increased prices significantly (and partially
| transitioned them into longer wait times) since they got into
| profitability mode
|
| - Google is showing more and more ads over time to power
| high revenue growth YoY
|
| - Unity has just tried to increase its prices
| jaredklewis wrote:
| I think Google fits more in the "extract as much value
| from your users" bucket more than the price hiking one.
|
| Uber/Lyft did raise prices, but interestingly (at least to
| me), if the strategy was to smother the competition with low
| prices, it didn't seem to work.
|
| Unity is interesting too, though I'm not sure it would
| make a good poster child for this playbook. It raised
| prices but seems to be suffering for it.
| HillRat wrote:
| Everyone's in "show your profits" mode, as befitting a
| mature market with smaller growth potential relative to
| the last few decades. Some of what we're talking about
| here is just what happens when a company tries to use
| investment capital to build a moat but fails (the
| Uber/Lyft issue you mentioned -- there's no obvious moat
| to ride-hailing, as with many software and app domains).
| My theory is that, going forward, we're going to see a
| much lower ceiling on revenue coupled with lots of
| competition in the market as VC investments cool off and
| companies can't spend their way into ephemeral market
| dominance.
|
| As for Unity, they're certainly dealing with a bunch of
| underperforming PE and IPO-enabled M&A on the one hand
| (really should have considered that AppLovin offer,
| folks), but also just a failure to extract reasonable
| income from their flagship product on the other; I don't
| think their problems come from raising prices _per se_
| (game devs pay for a lot already, an engine fee is
| nothing new to them) as much as how they chose to do it
| and the original pricing model they tried to force on
| their clients. What they chose to do and the way they
| handled it wasn't just bad, it was "HBS case study bad."
| dboreham wrote:
| VMWare, Docker.
| zarzavat wrote:
| OpenAI doesn't own transformers, they didn't even invent
| them. They just have the best one at this particular time.
| They have no moat.
|
| At some point, someone else will make a competitive model, if
| it's Facebook then it might even be open source, and the
| industry will see price competition _downwards_.
| strangemonad wrote:
| This argument has always felt to me like saying "google has
| no moat in search, they just happen to currently have the
| best page rank. Nothing is stopping yahoo from creating a
| better one"
| jdminhbg wrote:
| Google has a flywheel where its dominant position in
| search results in more users, whose data refines the
| search algorithm over time. The question is whether
| OpenAI has a similar thing going, or whether they just
| have done the best job of training a model against a
| static dataset so far. If they're able to incorporate
| customer usage to improve their models, that's a moat
| against competitors. If not, it's just a battle between
| groups of researchers and server farms to see who is best
| this week or next.
| zarzavat wrote:
| It's a different situation computationally. Transformers
| are asymmetric: hard to train but easy to run.
|
| There is no such thing as an open source Google because
| Google's value is in its vast data centers. Search is
| hard to train and hard to run.
|
| GPT4 is not that big. It's about 220B parameters, if you
| believe geohot, or perhaps more if you don't.
|
| _One_ hard drive.
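|
| Quick sanity check on that claim (assuming 16-bit weights):
|
|   params = 220e9
|   print(params * 2 / 1e12)  # ~0.44 TB of weights -- one drive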
| shihab wrote:
| My understanding is that Google search is a lot more than
| just Pagerank (Map reduce for example). They had lots of
| heuristics, data, machine learning before anyone else
| etc.
|
| Whereas the underlying algorithms behind all these GPTs
| so far are broadly the same. Yes, OpenAI probably does have
| better data, model fine-tuning, and other engineering
| techniques right now, but I don't feel it's anything special
| enough to differentiate them from competitors in the long
| run.
|
| (If the data collected from current LLM users proves very
| valuable in improving the model, that's different. I
| personally think that's not the case now, but who knows.)
| YetAnotherNick wrote:
| The difference between OpenAI and the next best model seems
| to be increasing, not decreasing. Maybe Google's Gemini
| could be competitive, but I don't believe open source will
| ever match OpenAI's capability.
|
| Also, OpenAI gets a significant discount on compute due to
| favourable deals with Nvidia and Microsoft. And they can
| design their servers better for their homogeneous needs;
| they are already working on an AI chip.
| goosinmouse wrote:
| Are you using 3.5 turbo? It's always funny when I test a new
| fun chatbot or something and see my API usage 10x just from a
| single GPT-4 API call. Although I usually only have a $2 bill
| every month from OpenAI.
| littlestymaar wrote:
| > you should probably just be using the best that's available
| (OpenAI).
|
| Sure, if you want to let a monopoly have all the added value
| while you get to keep the rest you can do that.
|
| Just make sure you're never successful enough to inspire them
| though, otherwise you're dead the next minute. Oops.
| zzbn00 wrote:
| p4d.24xlarge spot price is $8.2 / hour in US East 1 at the
| moment...
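|
| Annualized, against the ~$166k/year on-demand figure mentioned
| upthread (a back-of-the-envelope check):
|
|   print(8.2 * 24 * 365)  # ~$71,832/year for 8 A100s at spot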
| charlesischuck wrote:
| Good luck getting that lol
| tester756 wrote:
| >iPhone of artificial intelligence
|
| It feels like the biggest investor bait of this year
|
| Will it beat ARM IPO?
| lossolo wrote:
| It's also worth noting that if you build your business on using
| OpenAI's LLM or Anthropic etc, then, in the majority of cases
| I've seen so far (no fine tuning etc), your competitor is just
| one prompt away from replicating your business.
| beauHD wrote:
| I signed up for OpenAI's ChatGPT tool and entered a query like
| 'What does the notation 1e100 mean?' (just to try it out).
| When displaying the output, it would produce the reply slowly,
| like it was being drip-fed to me, and I was like: 'what?
| surely this could be faster?'
|
| Maybe I'm missing something crucial here, but why does it
| drip-feed answers like this? Does it have to think really hard
| about the meaning of 1e100? Why can't it just spit it out
| instantly without such a delay/drip, like the near-instant
| Wolfram Alpha?
| baby wrote:
| You can, but then you'd wait longer before seeing anything. So
| one way to get faster-feeling answers is to stream the response
| as it is generated. And in GPT-based apps the response is
| generated token by token (~4 chars), hence what you're seeing.
| maccam912 wrote:
| It's a result of how these transformer models work. It's
| pretty quick for the amount of work it does, but it's not
| looking anything up; it's generating the reply one token at a
| time.
| notRobot wrote:
| Under the hood, GPT works by predicting the next token when
| provided with an input sequence of words. At each step a single
| word is generated taking into consideration all the previous
| words.
|
| https://ai.stackexchange.com/questions/38923/why-does-chatgp...
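|
| A minimal sketch of that loop (the model here is a stand-in
| for one forward pass of the network):
|
|   def generate(model, prompt_tokens, max_new_tokens=50):
|       tokens = list(prompt_tokens)
|       for _ in range(max_new_tokens):
|           logits = model(tokens)  # one full pass per new token
|           next_token = max(range(len(logits)),
|                            key=logits.__getitem__)  # greedy
|           tokens.append(next_token)
|           yield next_token  # stream each token as it's ready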
| swatcoder wrote:
| The non-technical way to think about it is that ChatGPT "thinks
| out loud" and can _only_ "think out loud".
|
| Future products would be able to hide some of that, but for
| now, that's what the ChatGPT / Bing Assistant product does.
| codedokode wrote:
| Because it needs to do billions of arithmetic operations to
| generate a reply. Replying to questions is not an easy task.
| iambateman wrote:
| This is _the_ playbook for big, fast scaling companies...Uber
| subsidized every ride for _a decade_ before finally charging
| market price, just to make sure that Uber was the only option
| which made sense.
|
| While it's nice to consume the cheap stuff, it is not good for
| healthy markets.
| matteoraso wrote:
| It's not even just the cost of finetuning. The API pricing is so
| low, you literally can't save money by buying a GPU and running
| your own LLM, no matter how many tokens you generate. It's an
| incredible moat for OpenAI, but something they can't provide is
| an LLM that doesn't talk like an annoying HR manager, which is
| the real use case for self-hosting.
| rosywoozlechan wrote:
| The service quality sucks; you're getting what you pay for. We
| switched to the Azure OpenAI APIs because of all the service
| quality issues.
| layer8 wrote:
| Isn't OpenAI too cheap to be sustainable, and currently living
| off Microsoft's $10B investment?
| xnx wrote:
| Nothing in that article convinces me the situation couldn't
| change entirely in any given month. Google Gemini could be more
| capable. Any number of new players (AWS, Microsoft, Apple) could
| enter the market in a serious way. The head-start OpenAI has in
| usage data is small and probably eclipsed by the clickstream and
| data stores that Google and Microsoft have access to. I see no
| durable advantage for OpenAI.
| freedomben wrote:
| Gemini very well might be the biggest threat to OpenAI. ChatGPT
| has first-mover advantage so has a decent moat, but the number
| of people willing to pay $20 per month for something worse[1]
| than what they get for free with google.com is going to
| dwindle. I'd be very worried if I were them.
|
| [1]: That knowledge cutoff and terrible UX of browse the web is
| brutal compared to the experience of Bard
| appplication wrote:
| The premise of this is flawed. OpenAI is cheap because it has
| to be right now. They need to establish market dominance
| quickly, before competitors slide in. The winner of this horse
| race is not going to be the company with the best-performing
| AI; it's going to be the one who does the best job of creating
| an outstanding UX, a ubiquitous presence, entrenched users,
| and competitive moats that are not feature-differentiated,
| because at best even cutting-edge features are only 6-12
| months ahead of the competition cloning or beating them.
|
| This is Uber/AirBnB/Wework/literally every VC subsidized hungry-
| hungry-hippos market grab all over again. If you're falling in
| love because the prices are so low, that is ephemeral at best and
| is not a moat. Someone try calling an Uber in SF today and tell
| me how much that costs you and how much worse the experience is
| vs 2017.
|
| OpenAI is the undisputed future of AI... for timescales 6 months
| and less. They are still extremely vulnerable to complete
| disruption and as likely to be the next MySpace as they are
| Facebook.
| shaburn wrote:
| Your Uber/AirBnB/WeWork examples all have physical base units
| with costs ascending due to inflation and only theoretical
| economies of scale.
|
| AI models have some GPU constraints but could easily reach a
| state where the cost to operate falls and becomes relatively
| trivial, with almost no lower bound, for most use cases.
|
| You are correct there is a race for marketshare. The crux in
| this case will be keeping it. Easy come, easy go. Models often
| make the worst business model.
| monocasa wrote:
| Probably why Altman has been talking so much about how
| dangerous it is and how regulations are needed. No natural
| moat, so building a regulatory one.
| blackoil wrote:
| This point is discussed in the article. The title is not aimed
| at Google/Meta; they'll invest all the billions they have to.
|
| It is aimed at the consumers of these models: is there even a
| point in training your own or experimenting with OSS?
| hendersoon wrote:
| Sure, open models often require much less hardware than
| ChatGPT 3.5 and offer ballpark (and constantly improving)
| performance and accuracy. ChatGPT 3.5 scores 85 on ARC, and
| the HuggingFace leaderboard is up to 77.
|
| If you need GPT-4-quality responses they aren't close yet,
| but it'll happen.
| toddmorey wrote:
| Just heard Steve from Builder.io today, who did an impressive
| launch of Figma -> code powered by AI.
|
| They trained a custom model for this. Better accuracy, sure,
| but I was a little surprised to see how much faster it is
| than GPT-4.
|
| Based on their testing, they've become believers in domain
| specific smaller models, especially for performance.
| ldjkfkdsjnv wrote:
| Completely wrong, the best AI will win. There is insane demand
| for better models.
| datadrivenangel wrote:
| There is insane demand for good enough models at extremely
| good prices.
|
| Better beyond a certain point is unlikely to be competitive
| with the cheaper models.
| oceanplexian wrote:
| Yep, quality over quantity. The difference between 99.9%
| accurate and 99.999% accurate can be ridiculously valuable in
| so many real world applications where people would apply
| LLMs.
| gbmatt wrote:
| Only Big Tech (Microsoft, Google, Facebook) can crawl the web
| at scale, because they own the major content companies and
| severely throttle competitors' crawlers, sometimes outright
| blocking them. I'm not saying it's impossible to get around,
| but it is certainly very difficult, and you could be thrown
| in prison for violating the CFAA.
| PaulHoule wrote:
| I'm not sure if training on a vast amount of content is
| really necessary, in the sense that linguistic competence
| and knowledge can probably be separated to some extent.
| That is, the "ChatGPT" paradigm leads to systems that just
| confabulate and "make shit up", and making something
| radically more accurate means going to something retrieval-
| based or knowledge-graph-based.
|
| In that case you might be able to get linguistic competence
| with a much smaller model that you end up training with a
| smaller, cleaner, and probably partially synthetic data
| set.
| wkat4242 wrote:
| The improvements seem to be leveling off already. GPT-4 isn't
| really worth the extra price to me. It's not that much
| better.
|
| What I would really want though is an uncensored LLM. OpenAI
| is basically unusable now, most of its replies are like "I'm
| only a dumb AI and my lawyers don't want me to answer your
| question". Yes I work in cyber. But it's pretty insane now.
| bugglebeetle wrote:
| GPT-4, correctly prompted, is head and shoulders above
| everything for coding. All the text generation stuff and
| NLP tasks, it's a toss-up.
| jrockway wrote:
| I haven't played with the self-hosted LLMs at all yet, but
| back when Stable Diffusion was brand new I had a ton of fun
| creating images that lawyers wouldn't want you to create.
| ("Abraham Lincoln and Donald Trump riding a battle
| elephant." It's just so much funnier with living people!) I
| imagine that Llama-2 and friends offer a similar
| experience.
| PaulHoule wrote:
| Depends how you define quality. This paper reflects my own
| experience
|
| https://arxiv.org/abs/2305.08377
|
| and shows how LLM technology has a lot more to offer than
| "ChatGPT". The real takeaway is that by training LLMs with
| real training data (even with a "less powerful" model) you
| can get an error rate more than 10x less than you get with
| the "zero shot" model of asking ChatGPT to answer a question
| for you the same way that Mickey Mouse asked the broom to
| clean up for him in _Fantasia._ The "few-shot" approach of
| supplying a few examples in the attention window was a little
| better but not much.
|
| The problem isn't something that will go away with a more
| powerful model because the problem has a lot to do with the
| intrinsic fuzziness of language.
|
| People who are waiting for an exponentially more expensive
| ChatGPT-5 to save them will be pushing a bubble around under
| a rug endlessly while the grinds who formulate well-defined
| problems and make training sets will actually cross the
| finish line.
|
| Remember that Moore's Law is over in the sense that
| transistors are not getting cheaper generation after
| generation, that is why the NVIDIA 40xx series is such a
| disappointment to most people. LLMs have some possibility of
| getting cheaper from a software perspective as we understand
| how they work and hardware can be better optimized to make
| the most of those transistors, but the driving force of the
| semiconductor revolution is spent unless people find some
| entirely different way to build chips.
|
| But... people really want to be like Mickey in _Fantasia_ and
| hope the grinds are going to make magic for them.
| sbierwagen wrote:
| > Remember that Moore's Law is over in the sense that
| transistors are not getting cheaper generation after
| generation, that is why the NVIDIA 40xx series is such a
| disappointment to most people.
|
| Huh? The NVIDIA H100 has twice the FLOPS of the A100 on a
| smaller die. How is that not Moore's law?
| mg wrote:
| I don't think Uber and AirBnB are good comparisons.
|
| Both are B2C and have network effects.
| paul7986 wrote:
| The Pi iPhone app has a solid UX, and it would be even better
| if Apple bought it and integrated it into Siri.
| kcorbitt wrote:
| Eh, OpenAI is too cheap to beat at their own game.
|
| But there are a ton of use-cases where a 1 to 7B parameter fine-
| tuned model will be faster, cheaper and easier to deploy than a
| prompted or fine-tuned GPT-3.5-sized model.
|
| In fact, it might be a strong statement but I'd argue that _most_
| current use-cases for (non-fine-tuned) GPT-3.5 fit in that
| bucket.
|
| (Disclaimer: currently building https://openpipe.ai; making it
| trivial for product engineers to replace OpenAI prompts with
| their own fine-tuned models.)
| kristjansson wrote:
| This article might have a point about the data flywheel, but it's
| lost in the confused economics in the second half. Why would we
| expect to hire one engineer per p4.24x instance? Why do we think
| OpenAI needs a whole p4.24x to run fine tuning? Why do we ignore
| the higher costs on the inference side for fine-tuned models? Why
| do we think OpenAI spends _any_ money on racking-and-stacking
| GPUs rather than just take them at (hyperscaler) cost from Azure?
| oceanplexian wrote:
| Has anyone actually used GPT4? It's not "cheap".
|
| It was roughly $150 for me to build a small dataset with a few
| thousand quarter-page chunks of text for a data project using
| GPT-4. GPT-3 is substantially cheaper, but it would hallucinate
| 30% of the time; honestly a nice fine-tune of LLaMA is on par
| with GPT-3, and after the sunk cost all it takes is a few cents
| of electricity to generate the same-sized dataset.
| slowhadoken wrote:
| It's insanely expensive to run and operate "AI". Meredith
| Whittaker's talk on AI is very insightful
| https://www.youtube.com/watch?v=amNriUZNP8w
| slowhadoken wrote:
| Thanks to traumatized $2 an hour Kenyan labor, yeah
| https://time.com/6247678/openai-chatgpt-kenya-workers/
| pimpampum wrote:
| Classic anti-competitive strategy: sell below cost and burn
| money until the competition is out, then sell higher than you
| could ever have sold with competition.
| BrunoJo wrote:
| We just started a service offering different open-source models
| through an OpenAI-compatible API [1]. The pricing isn't final
| and we haven't officially launched yet, but you should be able
| to save at least 75% compared to GPT-3.5.
|
| [1] https://lemonfox.ai/
| Meegul wrote:
| Are you doing this profitably? If so, does that entail owning
| your own hardware or renting from cheaper services such as
| Lambda?
| slowhadoken wrote:
| None of it is cheap; "AI" is insanely expensive. Meredith
| Whittaker talks about it in this interview:
| https://www.youtube.com/watch?v=amNriUZNP8w She's the president
| of the Signal Foundation.
| AJRF wrote:
| I read this and think "That won't last long".
|
| The pricing is too good to be true when you think about it
| rationally. If they raise prices, they'll seem much, much less
| attractive than using AWS or Azure.
|
| Amazon seem to have a much better business built around their
| Bedrock offering. And all their other tools are available there
| like SageMaker, ec2, integration with MLFlow, etc, etc.
|
| I guess the same goes for Azure, if you are already using it it's
| much easier to just stick with whatever they are offering for LLM
| Ops.
|
| OpenAI offering just models doesn't seem like it can last
| forever, and to compete with AWS or Azure at enterprise level
| they need to build all the things Amazon/MS have built.
|
| The other side of that coin seems much more realistic.
| DominikPeters wrote:
| > While per-token inference costs for fine-tuned GPT-3.5 is 10x
| more expensive than GPT-3.5 it is still 10x cheaper than GPT-4!
|
| Not quite accurate; fine-tuned 3.5 is only ~4x cheaper than
| GPT-4. Cost per million output tokens from
| https://openai.com/pricing:
|
| $2 - GPT-3.5
| $16 - fine-tuned GPT-3.5
| $60 - GPT-4