Recent comments in /f/MachineLearning

Khal_Doggo t1_j8cpwdk wrote

I have a matrix of data I want to run NMF on. The range of values is from -13.1 to 13.4. What's the best way to prep this data for NMF? I've seen people just take all the negative values and make them 0 but that seems to me like it massively cripples the variance in the data. Would it make sense to just add the absolute minimum to each value in the matrix so that it ranges from 0 to 26 instead? Or rescale the data from 0 to 1?
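Not an authoritative answer, but both options are easy to try side by side. Here's a minimal sketch (assuming scikit-learn and a random stand-in matrix with the same value range): both the min-shift and the [0, 1] rescale are affine transforms, so neither destroys the variance structure, they just move/scale it.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
X = rng.uniform(-13.1, 13.4, size=(50, 20))  # stand-in for the real data matrix

# Option 1: shift by the global minimum so everything is >= 0
X_shift = X - X.min()

# Option 2: min-max rescale to [0, 1] (same shift, plus a global scale)
X_scaled = (X - X.min()) / (X.max() - X.min())

# NMF requires a nonnegative input; either transformed matrix works
model = NMF(n_components=5, init="nndsvda", max_iter=500)
W = model.fit_transform(X_shift)  # nonnegative factor matrices
H = model.components_
```

Clipping negatives to 0, by contrast, is not invertible and throws information away, which is why it feels like it cripples the variance.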

1

leepenkman t1_j8co3gr wrote

Also check out https://text-generator.io — it's a multi-modal model, so it visits any input links and downloads web pages, and images are analyzed with NNs to generate better text.

It also does speech-to-text and text-to-speech, so it can talk.

As many have said, a lot of these things will likely (hopefully) come together into something big. A few pieces are still needed, like the "when to train new tools" / model-zoo mechanism. Internally, Text Generator is also based on multiple models and has some internal decision-making for which model is best on every request (so you don't need to pick a code model vs. a text model — it does it automatically), which is similar, but it's not training new nets.

−3

askingforhelp1111 t1_j8cmbr6 wrote

Many thanks for the reply — I'd love to read your resources on compression and inference.

I'm keen on cutting down costs. I previously ran on a GPU via an AWS EC2 instance, but we've got to tighten the company's belt this year and my manager suggested running on CPU. I'd love to hear your suggestions too (if any).
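Not knowing the commenter's actual model, one common first step for cheap CPU inference is post-training dynamic quantization, which converts linear-layer weights to int8. A minimal sketch with a toy PyTorch model (the architecture here is purely illustrative):

```python
import torch
import torch.nn as nn

# Stand-in model; in practice this would be the trained network
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))
model.eval()

# Dynamic quantization: weights stored as int8, activations quantized
# on the fly. CPU-only, no retraining or calibration data needed.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

with torch.no_grad():
    out = quantized(torch.randn(1, 512))
```

This typically shrinks the model roughly 4x and speeds up matmul-heavy layers on CPU, at a small accuracy cost worth measuring on a validation set before committing.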

1

cajmorgans t1_j8chwh9 wrote

Do you know the "swipe to write" feature that exists on iPhone and Android, where you can keep your finger down and "draw" the words?

There's a small company suing the big guys atm over this "feature" (which, imo, only a fraction of users actually use). When I heard about it, I lost it: how can you patent such a thing? I mean, yeah, it might not be the simplest software to write, but it just feels so weird to be able to patent such a (useless) technique.

2

Varpie t1_j8cftrx wrote

I'm surprised this hasn't been done before. This paper mostly cites works from the last 2-3 years, but surely, something similar was done previously (maybe not using the same kind of model)? In fact, isn't it pretty close to what search engines do to provide instant results when given an equation or an address for instance? Does anyone know of such work?

4

tysam_and_co t1_j8cf8al wrote

I...I...this is the first time I've heard this. Machine learning is often used as the hype-shelter word for "AI", because it triggers very few people (in the hype sense -- or at least it used to).

I'm not quite sure what to say, this is very confusing to me.

11

tysam_and_co t1_j8cf1o9 wrote

That is a really good point.

Though, one minor contention: most of the comments in the post seem pretty well-informed. The main point of difference I see is whether batchnorm goes before or after the activation — and oddly enough, years later, before the activation seems to have won out, due to the efficiency gains from fusing it with the preceding layer.

I'm surprised they were so on the mark even 6 years ago about being skeptical of this internal covariate shift business. I guess keeping the statistics centered and such is helpful, but as we've seen since then, batchnorm seems to do so much more than just that (and is, in my experience, a frustratingly utilitarian if limiting tool, unfortunately).
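For concreteness, here's a minimal sketch of the conv → batchnorm → activation ordering being discussed, plus the inference-time fusion that makes it attractive (layer sizes are arbitrary, just for illustration):

```python
import torch
import torch.nn as nn

# BN placed *before* the activation, so Conv+BN can be fused at inference
block = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1, bias=False),  # bias is folded into BN
    nn.BatchNorm2d(16),
    nn.ReLU(inplace=True),
)

x = torch.randn(2, 3, 8, 8)
y = block(x)

# At inference time, fold the BN affine transform into the conv weights,
# collapsing two ops into one
block.eval()
fused = torch.ao.quantization.fuse_modules(block, [["0", "1"]])
```

After fusion the BN disappears entirely (its scale and shift are absorbed into the conv kernel and bias), which is the efficiency win; with BN *after* the activation, that folding isn't possible.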

6

Disastrous_Elk_6375 t1_j8cd4x4 wrote

I think it will depend on how small the LLMs that it uses are. If they can be run on consumer GPUs, then it will probably take off. If you need to rent 8xGPU servers just for inference, probably not.

Stable Diffusion took off because within the first two weeks you could run it on GPUs with 4 GB of VRAM. Then, when "finetuning" (a.k.a. DreamBooth) came along, its requirements went from 24 to 16 to 8 GB in a matter of weeks. Same effect there.

15

sloganking t1_j8cculc wrote

It's not just calling APIs. This model is independently teaching itself how to use new APIs and when to use them. The process is pretty much the same for any API, and adding a new one doesn't require much extra effort from the programmer.

The paper also states it's one of the first to have models learn to use APIs in an unsupervised way, meaning they teach themselves instead of relying on a ton of human-annotated data.

22

rafgro t1_j8cc6ne wrote

Agreed. The quality of discussions under posts is also pretty bad.

IMO it's the result of outdated rules and lax moderation. On the rules: there's definitely a need to address low-effort ChatGPT posts and comments — some of them are straight-up scam posts! On the moderation: it's not about quality but quantity. Realistically, this sub has just a few moderators (and some/most of those 9 lads are very busy engineers), with no new moderators added in the last two years, while membership has seen enormous growth.

11