Recent comments in /f/MachineLearning

EnzoTrent t1_j704ban wrote

I'm not trying to temper your dreams or anything, but I think Data Science as an industry is going to completely change over the next 5-10 years. I'm not saying humans won't work in Data Science in 5-10 years. I'm saying the Data Science industry is going to evolve a lot as it incorporates AI. I don't know what that will look like - do your professors?

If you go to an Ivy and are taught by the smartest person in this old world, learn all the best old world stuff, it will do you little good in the new world.

If data is interesting to you - then, at least as a hobby, make sure you learn everything you need to completely set up and deploy a native GPT-style AI (or several), and can train it for years on specific tasks/functions/intents, all tailored to whatever services you want to offer a business owner - the value of this should become obvious as it gets closer.

Eventually, most of this AI generative tech will be locked behind corporate walled gardens and anything accessible to consumers will be lightyears behind - prices will skyrocket for basic stuff that people just no longer really do. This will be super cheap until it isn't.

That is when you come in with your own generative AI - still lightyears behind Microsoft, but won't be nerfed like the consumer AI that will be available.

Haha, of course Microsoft doesn't really lose anything by undercutting your company at that point... I'm really just trying to get you thinking.

Don't expect anything about the world today to stay the way it is today. Assume everything will be updated and changed. Try to see the world that will follow the transition - where do you fit?

tl;dr: I think everyone with the ability to run/train a limited native AI - should totally do that.

−1

znihilist t1_j704b3j wrote

>> That's beside the point,

> It does for the comment thread which was about copyright

It doesn't, as this issue has not been decided by courts or legislation yet, and opinion seems to be evenly divided. So this is circular logic.

>> my point is that the MP3 compression comparison doesn't work,

> It does for the part that is actually the point (copyright law).

You mentioned MP3s (compressed versions) as comparable in functionality, and my argument is that they are not similar in functionality, so the conclusion doesn't follow: they are not comparable for the purposes of that analysis. Compression not absolving copyright infringement doesn't lead to the same conclusion for diffusion models. Since you asserted that it does, you need to show that compression and diffusion work the same way for the comparison to hold. It's as if I said: it isn't illegal for me to look at a painting and then go home with vivid images of that painting in my head, therefore diffusion models are not committing any infringement. That would be fallacious and wrong - the functionality doesn't carry over - and the same goes for the MP3 example.

1

puppet_pals t1_j701uqt wrote

>I think normalization will be here to stay (maybe not the ImageNet one though), as it usually speeds up training.

The reality is that you are tied to the normalization scheme of whatever model you are transfer learning from (assuming you are transfer learning). Framework authors and people publishing weights should make normalization as easy as possible, typically via a 1/255.0 rescaling operation (or x/127.5 - 1; I'm indifferent, though I opt for 1/255 personally).
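The two rescaling schemes mentioned above can be sketched as follows (a minimal NumPy illustration; the example pixel values are made up):

```python
import numpy as np

# Hypothetical batch of 8-bit image pixels, values in [0, 255].
images = np.array([[0, 127, 255]], dtype=np.float32)

# Scheme 1: rescale to [0, 1] via 1/255.
unit_scaled = images / 255.0

# Scheme 2: rescale to [-1, 1] via x/127.5 - 1.
symmetric_scaled = images / 127.5 - 1.0

print(unit_scaled)       # endpoints map to 0.0 and 1.0
print(symmetric_scaled)  # endpoints map to -1.0 and 1.0
```

Whichever scheme the published weights were trained with is the one you must reproduce at fine-tuning and inference time, or the pretrained features will see inputs on the wrong scale.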

1

TrevorIRL t1_j6zvy53 wrote

Sure, but until recently, OpenAI has been a not for profit researching platform.

That means the R&D would have been written off as a cost of production for this product.

As far as publicly known info goes, $3 million a year is our best guess at what it costs to run.

Considering the excitement at future utility, I don’t imagine capital will be the constraint for future development.

2

Professional_Poet489 t1_j6zsleg wrote

There are smarter people than me out there, so maybe I’m missing something, but the market doesn’t change trajectories because of any move you make. All finance wants to do is guess what the movement will be (up, down, how much). This is a classification or regression problem, not RL.

0

cachemonet0x0cf6619 t1_j6zkpdk wrote

correct. nothing specific. I’ve given it a bit of code and asked it to add doc strings to it. it was meh.

I’ve asked it to help me set up a new environment. it gave me old setup instructions but I was able to make my way through by changing the old versions. a lot like google.

it will write a lot of boilerplate tests. I’ve asked it to write a script and then write a unit test for the script. that was also meh but it was a good scaffold

2

mtocrat t1_j6zk1ka wrote

Let's say your initial model is quite racist and outputs only extremely or moderately racist choices. If you rank those against each other and do supervised training on that dataset, you train the model to mimic the moderately racist style. You might, however, plausibly train a model from those rankings that can judge what racism is and extrapolate, judging answers free of it to be even better. Then you optimize with respect to that model to get that style.
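The extrapolation step above can be illustrated with a toy Bradley-Terry-style preference model (everything here is hypothetical: answers are reduced to a single made-up "badness" feature, and the labeler only ever compares moderately bad vs. extremely bad answers):

```python
import numpy as np

# Preference pairs (preferred_feature, rejected_feature):
# the moderate answer (~1.0) always beats the extreme one (~3.0).
pairs = [(1.0, 3.0), (1.2, 2.8), (0.9, 3.1)]

w = 0.0   # reward model parameter: reward(x) = w * x
lr = 0.1

for _ in range(200):
    for good, bad in pairs:
        # Bradley-Terry: P(good preferred over bad) = sigmoid(r(good) - r(bad))
        p = 1.0 / (1.0 + np.exp(-(w * good - w * bad)))
        # Gradient ascent on the log-likelihood of the observed preference.
        w += lr * (1.0 - p) * (good - bad)

# The learned reward penalizes the feature (w < 0), so an answer with
# feature 0.0 - a kind of answer never seen during labeling - scores
# higher than anything in the training comparisons.
print(w < 0)
print(w * 0.0 > w * 1.0 > w * 3.0)
```

The point of the sketch: even though all training comparisons are between bad and worse, the reward model learns a direction, so optimizing a policy against it can push beyond the best answer the labelers ever saw.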

2