Recent comments in /f/MachineLearning
cthorrez t1_j9x8x1b wrote
The methods and models are identical, yep. It's basically just to denote whether the labels were assigned by a human or determined automatically.
mil24havoc t1_j9x8tol wrote
I generally agree with you. But it is useful to have a term for training methods that use clever tricks to bypass manual data labeling, usually with some secondary objective in mind (that the model should do something that is not strictly the same as the SSL objective). In that sense, I think of it as a subset of supervised learning. In ML, literally every innovation gets its own catchy name. This is in contrast to, say, statistics, where major innovations often aren't named until years later. I suspect this has to do with the hotness and competitiveness of ML - you need a catchy name to stand out in a crowd of thousands of papers doing very similar things.
[deleted] t1_j9x7ewp wrote
[removed]
cthorrez t1_j9x772h wrote
Reply to [D] Best Way to Measure LLM Uncertainty? by _atswi_
Along with each prompt, just put: "And at the end of your response, state on a scale from one to ten how confident you are in your answer."
This works amazingly and is very accurate. source
It has the added bonus that you can get confidence intervals on your confidence intervals just by asking how confident it is in its estimate of its confidence.
terath t1_j9x6v7k wrote
Reply to comment by gt33m in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
My point is that you don’t need ai to hire a hundred people to manually spread propaganda. That’s been going on now for a few years. AI makes it cheaper yes but banning AI or restricting it in no way fixes it.
People are very enamoured with AI but seem to ignore the many existing technological tools already being used to disrupt things today.
[deleted] t1_j9x4hup wrote
Reply to comment by ktpr in [D] A funny story from my interview by nobody0014
[deleted]
clueless1245 t1_j9x3dlc wrote
Reply to comment by mosquitoLad in [D] What is the correct term for a non-GAN system where two or more networks compete as part of training? by mosquitoLad
It's also an issue for generator training, though, if the discriminator gets 100% accuracy all the time, if I remember correctly. There are various techniques you can look up to make training more stable, which I don't have on hand rn.
mosquitoLad OP t1_j9x2guv wrote
Reply to comment by clueless1245 in [D] What is the correct term for a non-GAN system where two or more networks compete as part of training? by mosquitoLad
The less formal way conveys the concept better, and it makes sense: the worse the discriminator performs (whether it is overly sensitive or not sensitive enough when attempting to suss out the validity of assets), the worse the generator performs, at least with regard to the quality of the output for human purposes. If I'm understanding the use of gradients correctly, the generator becomes trapped in a local minimum because it discovers how to consistently exploit the weaknesses of the discriminator.
I don't know for sure if it always applies; you could apply an evolutionary algorithm where two or more competing populations tackle the same problem from opposing sides, with relatively infrequent breeding between members of the populations, motivating avoidance of bottlenecking while enabling the development of unique solutions; over several generations, any short-term loss should serve as a long-term gain. But I guess they'd still depend on how the scoring system works (equivalent to a loss function?).
Imnimo t1_j9x01v0 wrote
Reply to comment by Hyper1on in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
Overfitting is just one among many possible optimization failures. While these models might over-memorize portions of training data, they're also badly underfit in many other respects (as evidenced by their frequent inability to answer questions humans would find easy).
If Bing is so well-optimized that it has learned these strange outputs as some sort of advanced behavior to succeed at the LM or RLHF tasks, why is it so weak in so many other respects? Is simulating personalities either so much more valuable or so much easier than simple multi-step reasoning, which these models struggle terribly with?
clueless1245 t1_j9wzcbn wrote
Reply to comment by mosquitoLad in [D] What is the correct term for a non-GAN system where two or more networks compete as part of training? by mosquitoLad
Idk what he means specifically by the gradient "being passed between" two agents, but in a GAN, (part of) the loss function of the generator is the inverse of (part of) the loss function of the discriminator, so the gradients calculated at the generator output and the discriminator output are linked.
A less formal way of saying it: The generator's gradient depends on the discriminator's loss.
This should be true for any adversarial game, I would think?
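A minimal sketch of that linkage (the toy linear layers and the non-saturating generator loss are my own illustrative choices, not from the thread): the generator's loss is computed from the discriminator's output, so backprop only reaches the generator by passing through the discriminator.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

G = nn.Linear(4, 2)   # toy "generator": noise -> fake sample
D = nn.Linear(2, 1)   # toy "discriminator": sample -> realness logit

z = torch.randn(8, 4)
fake = G(z)

# Non-saturating generator loss: G wants D to label its samples real (1).
loss_G = nn.functional.binary_cross_entropy_with_logits(
    D(fake), torch.ones(8, 1)
)
loss_G.backward()

# G's weights have a gradient only because it was passed *through* D.
assert G.weight.grad is not None
```

In a real training loop, the discriminator's own update step would use `fake.detach()` so that its loss does not also backprop into the generator.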
[deleted] t1_j9wwi82 wrote
Reply to comment by MrAcurite in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
Looks like reddit filtered out your response. Anyway, you seem plenty unhinged and I think I'll stop interacting with you.
In parting, I googled you as well. Bit odd for a "researcher" (one that compares himself to physics professors, no less) to have a citation count of zero. Have a good one...
BrohammerOK t1_j9ww6yx wrote
Reply to comment by BrohammerOK in [D] Is validation set necessary for non-neural network models, too? by osedao
If you wanna use something like early stopping, though, you'll have no choice but to use 3 splits.
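A sketch of that 3-split point, using scikit-learn's gradient boosting purely as an example (the thread names no particular model): `validation_fraction` carves a validation split out of the training data for early stopping, so the effective arrangement is train / validation / test.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, random_state=0)
# Keep 20% untouched for the final test split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# validation_fraction holds out part of X_tr internally for early stopping,
# so boosting stops once validation loss stops improving for 10 rounds.
clf = GradientBoostingClassifier(
    n_estimators=500,
    validation_fraction=0.2,
    n_iter_no_change=10,
    random_state=0,
)
clf.fit(X_tr, y_tr)
print(clf.n_estimators_, clf.score(X_te, y_te))
```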
BrohammerOK t1_j9wvrl7 wrote
Reply to comment by osedao in [D] Is validation set necessary for non-neural network models, too? by osedao
You can work with 2 splits, which is common practice. For a small dataset you can use 5- or 10-fold cross-validation with shuffling on 75-80% of the dataset (train) for hyperparameter tuning / model selection, fit the best model on the entirety of that set, and then evaluate/test on the remaining 25-20% that you held out. You can repeat the process multiple times with different seeds to get a better estimate of the expected performance, assuming that the input data at inference time comes from the same distribution as your dataset.
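The workflow above can be sketched with scikit-learn (the dataset, model, and parameter grid are my own illustration, not from the comment):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, KFold, train_test_split

X, y = make_classification(n_samples=500, random_state=0)

# Hold out 25% for the final test; tune on the remaining 75%.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# 5-fold CV with shuffling on the training portion for model selection.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"max_depth": [2, 4, None]},
    cv=cv,
)
search.fit(X_tr, y_tr)  # refit=True: best model is refit on all of X_tr

# Performance estimate on data the tuning never saw.
test_score = search.score(X_te, y_te)
print(search.best_params_, round(test_score, 3))
```

Repeating this with different `random_state` seeds and averaging the test scores gives the more stable estimate the comment describes.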
[deleted] t1_j9wtkhv wrote
[deleted] t1_j9wq9xx wrote
Reply to comment by MrAcurite in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
>Eliezer Yudkowsky didn't attend High School or College. I'm not confident he understands basic Calculus or Linear Algebra
This is incredibly dishonest and I think you know it. Even a little bit of googling would show that he has done research in decision theory and has an h-index of ~15. What's yours, btw?
And if you really want to go down the route of credentialism, there are quite a few established researchers who broadly agree with what Yudkowsky is saying and are working on the same things.
gsvclass OP t1_j9wq69s wrote
Reply to comment by ZestyData in [P] Minds - A JS library to build LLM powered backends and workflows (OpenAI & Cohere) by gsvclass
It's a prompt-engineering library with implementations of various papers in the space, including ReAct, PAL, etc. We are working on adding more. Here's a list of some of the papers we are implementing: https://42papers.com/c/llm-prompting-6343
PHEEEEELLLLLEEEEP t1_j9wnx8o wrote
Reply to comment by donshell in [D] Are there any good FID and KID metrics implementations existing that are compatible with pytorch? by ats678
>I tried the torchmetrics implementation, however they’re giving me completely wrong results
hellrail t1_j9wnk66 wrote
Wtf, of course man, you also need one if you fit y = ax + b, dude
Kroutoner t1_j9wfyz1 wrote
Reply to comment by osedao in [D] Is validation set necessary for non-neural network models, too? by osedao
This does not seem like suitable justification.
[deleted] t1_j9wdqrq wrote
[deleted] t1_j9wd0cc wrote
Hyper1on t1_j9wbysn wrote
Reply to comment by Imnimo in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
Well, the obvious optimisation shortcoming is overfitting. We cannot distinguish this rigorously without access to the model weights, but we do have a good idea of what overfitting looks like in both pretraining and RL finetuning (in both cases it tends to produce common repeated text strings and a strong lack of diversity in output, a sort of pseudo mode collapse). We can test this by giving Bing GPT the same question multiple times and observing whether it has a strong bias towards particular completions. Having played with it a bit, I don't think this is really true for the original version, before Microsoft limited it in response to criticism a few days ago.
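One rough way to quantify the "lack of diversity" test described above: sample N completions for the same prompt and compute distinct-n, the fraction of unique n-grams across all samples (values near 0 suggest mode collapse). The sample completions below are made up for illustration.

```python
from collections import Counter

def distinct_n(completions, n=2):
    """Fraction of n-grams across all completions that are unique."""
    ngrams = Counter()
    total = 0
    for text in completions:
        toks = text.split()
        for i in range(len(toks) - n + 1):
            ngrams[tuple(toks[i:i + n])] += 1
            total += 1
    return len(ngrams) / total if total else 0.0

collapsed = ["I am Bing ."] * 10
diverse = [f"answer number {i} differs here" for i in range(10)]
print(distinct_n(collapsed), distinct_n(diverse))  # prints: 0.1 0.55
```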
Meanwhile, the alternative hypothesis I raised seems very plausible and fits logically with prior work on emergent capabilities of LLMs (https://arxiv.org/abs/2206.07682), since it seems only natural to expect that when you optimise a powerful system for an objective sufficiently, it will learn instrumental behaviours which help it minimise that objective, potentially up to and including appearing to simulate various "personalities" and other strange outputs.
Personally, as a researcher who works on RL finetuned large language models and has spent time playing with many of these models, my intuition is that Bing GPT is not RL finetuned at all but is just pretrained and finetuned on dialogue data, and the behaviour we see is just fairly likely to arise by default, given Bing GPT's particular model architecture and datasets (and prompting interaction with the Bing Search API).
osedao OP t1_j9waf9t wrote
Reply to comment by Kroutoner in [D] Is validation set necessary for non-neural network models, too? by osedao
Could this approach be enough to justify not using a validation set: I have 8 features, and if the distributions of each of these features are the same in both the training and test sets, would that be enough?
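The per-feature distribution check described here could be done with a two-sample KS test (scipy), though as the reply in this thread notes, matching distributions alone do not justify skipping a validation set. The synthetic arrays below are my own placeholder data.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
X_train = rng.normal(size=(400, 8))  # stand-in for the 8-feature train set
X_test = rng.normal(size=(100, 8))   # stand-in for the test set

# One two-sample KS test per feature; large p-values mean no detected shift.
pvals = [ks_2samp(X_train[:, j], X_test[:, j]).pvalue for j in range(8)]
print([round(p, 3) for p in pvals])
```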
osedao OP t1_j9wa0a1 wrote
Reply to comment by Additional-Escape498 in [D] Is validation set necessary for non-neural network models, too? by osedao
Thanks for the recommendations! I’ll try this
cthorrez t1_j9xahu6 wrote
Reply to comment by Maximum-Ruin-9590 in [D] Is validation set necessary for non-neural network models, too? by osedao
I just have one dataset too. I train, pick hyperparameters, and test on the same data. Nobody can get better metrics than me. :D