from :birdsite:

if this ends up being correct this is /fantastic/ news

one of my biggest fears with the recent wave of AI stuff is that the upfront costs of computing the models and collecting data reverse the sort of diseconomies of scale that i see as a fatal flaw in capitalism (and economic domination more broadly) and make analysis like that of @KevinCarson1 outdated

twitter.com/fchollet/status/16

@rechelon @KevinCarson1 stable diffusion coming out was a massive relief to me last year

@rechelon @KevinCarson1 @mutual_ayyde having tried both DALL-E 2 and Stable Diffusion extensively, the difference in quality that a bigger budget can make is enormously clear. That said, we’re not talking about anything remotely approaching AGI, and for almost all real-world use-cases you’re going to want to fine-tune with pretty tractable computational resources on a much smaller-scale dataset

@elfprince13 absolutely agreed with the fine-tuning point, but even if that weren't the case, still having subpar options that are free and open is a fantastic hedge against the worst-case scenario of all the models being locked behind the next generation of tech giants

@elfprince13 that said, the black-box nature of neural networks and the weird behavior that results from it make me hopeful open models prove competitive, if for no other reason than it's easier to identify and fix bugs or errors in them

@mutual_ayyde So a lot of the sorts of things you would want to fix with an already-released neural network aren’t straightforwardly patchable because of that black-box behavior. For example, if you train with certain normalizations or regularizations that are later found to destroy information that the network needs for better performance … that information has already been lost in the trained weights, and fixing it will only help further training.

@mutual_ayyde big changes in network architecture are even less feasible because the weights likely can’t be transferred over directly (although you can salvage certain layers and freeze them while retraining other components of the network around them)
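A minimal sketch of the salvage-and-freeze idea described above, assuming PyTorch; the two-block network and shapes here are hypothetical stand-ins, not any model from the thread. The pretrained layers are frozen so their weights stay fixed while a new component is retrained around them:

```python
import torch
import torch.nn as nn

# Hypothetical pretrained layers we want to salvage as-is.
backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
# New component retrained around the frozen layers.
head = nn.Linear(16, 2)

# Freeze the salvaged layers: no gradients are computed or applied to them.
for p in backbone.parameters():
    p.requires_grad = False

# Only the head's parameters are handed to the optimizer.
opt = torch.optim.SGD(head.parameters(), lr=0.1)

x = torch.randn(4, 8)            # toy batch of inputs
y = torch.randint(0, 2, (4,))    # toy class labels

before = [p.clone() for p in backbone.parameters()]
loss = nn.functional.cross_entropy(head(backbone(x)), y)
loss.backward()
opt.step()

# The frozen backbone weights are untouched; only the head was updated.
assert all(torch.equal(a, b) for a, b in zip(before, backbone.parameters()))
```

The same pattern extends to salvaging individual layers from a larger architecture: copy the compatible weights over, set `requires_grad = False` on them, and train only the components that changed.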


@mutual_ayyde *but* once the open source models are out there, anyone can fine-tune them to specific tasks on commodity hardware, and if they were halfway decent to start with, they will frankly most likely beat whatever general-purpose thing the tech giants have on the specific task you wanted, which is where the “a good application and use case is the real moat” comes in (see e.g. the AI profile pic generator that went viral a few weeks ago)

@mutual_ayyde also - there are a lot of domains still open where there are no off-the-shelf models ready to even process the data types you need to process, and that’s where I’ll take a small team of smart folks with a vision over a directionless tech giant any day. I fully intend on changing the world with what we’re building at Geopipe.
