Recent comments in /f/MachineLearning

suflaj t1_j57te64 wrote

It's not necessarily better, but it will help you if your data isn't abundant...

For example, if you look at it as regression, the model uses your features and tries to figure out how they correlate with the grade. The grade is continuous and monotonic, meaning that if the features contribute to it in "sane" ways, the mapping is easy to learn.

If you consider it a classification problem, then each class basically gets its own degree of freedom. This can make your model overconfident, whereas with the regression solution your model will at least try to fit a continuous, monotonic function.

With the regression task, the ordering is baked into the target itself: you are telling your model that grade 2 is better than 1 and worse than 3. With a classification model, because each class is independent, that ordering can only be learned from the data. So if your data is insufficient, a classifier won't pick it up at all, whereas a regressor might still interpolate correctly.
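A minimal sketch of the contrast, on a hypothetical 4-grade problem with made-up toy data (everything here is illustrative, not your actual setup):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Hypothetical toy data: 200 samples, 5 features, grades 1-4.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.clip(np.round(2.5 + X[:, 0] + 0.5 * X[:, 1]), 1, 4)

# Regression framing: the numeric target itself encodes the ordering,
# so predicting 3 for a true 2 costs less than predicting 4.
reg = LinearRegression().fit(X, y)
pred_reg = np.clip(np.round(reg.predict(X)), 1, 4)

# Classification framing: four independent classes; cross-entropy
# penalizes every wrong class equally, so confusing 1 with 2 costs
# the same as confusing 1 with 4 -- the ordering must come from data.
clf = LogisticRegression(max_iter=1000).fit(X, y.astype(int))
pred_clf = clf.predict(X)
```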

1

aidv t1_j57rrqa wrote

I believe this might well be true. I’ll tell you why:

I’ve lost count of how many times the voices of speakers in videos have sounded AI-generated to me.

Like, voices of people that I subscribe to.

I was convinced they were doing some AI fuckery, and this post pretty much confirms it.

It’s probably to save bandwidth and storage on their end, so it makes sense.

−2

axm92 t1_j57jfrk wrote

>My use case is more classification of datapoints (containing many seemingly unimportant features that may or may not have some relationship to each other; relationships that are captured in the knowledge graph)

Sounds eerily close to one of our papers: https://aclanthology.org/2021.emnlp-main.508.pdf

To solve commonsense reasoning questions, we first generate a graph that captures relationships between entities in the question (if you're thinking "chain-of-thought" prompting--yes, the idea is similar). Then, we jointly train a mixture-of-experts model with a classifier (RoBERTa) to do three things: i) learn to discard useless nodes, ii) pool node representations from useful nodes into a single graph embedding, and iii) classify using question + graph embeddings.
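In case it helps, a rough PyTorch sketch of the node-pooling + classification idea (my own simplification, not the paper's code; it leaves out the mixture-of-experts routing and joint-training details, and all names and shapes are made up):

```python
import torch
import torch.nn as nn

class GraphAugmentedClassifier(nn.Module):
    """Illustrative only: score nodes, softly down-weight useless ones,
    pool the rest into a graph embedding, classify from question + graph."""
    def __init__(self, hidden=768, num_classes=5):
        super().__init__()
        self.node_scorer = nn.Linear(hidden, 1)  # learns which nodes matter
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, question_emb, node_embs):
        # question_emb: (batch, hidden), e.g. an encoder's [CLS] output
        # node_embs:    (batch, nodes, hidden), one vector per graph node
        weights = torch.softmax(self.node_scorer(node_embs).squeeze(-1), dim=-1)
        graph_emb = (weights.unsqueeze(-1) * node_embs).sum(dim=1)  # pooled graph
        return self.classifier(torch.cat([question_emb, graph_emb], dim=-1))

# Hypothetical shapes: batch of 2 questions, graphs with 6 nodes each.
model = GraphAugmentedClassifier()
logits = model(torch.randn(2, 768), torch.randn(2, 6, 768))
```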


This video may give a good TLDR too.

8

suflaj t1_j57gnky wrote

Well, this is a regression task, not classification. You could classify 1, 2, 3, and 4 for each output, but it seems like they are continuous. You can always snap the result to a grade, e.g. with y = max(1, min(4, floor(x + 0.5))). With classification you could argmax over the classes, but then you'll overfit more easily. You would probably benefit from the bias the regression task itself provides, telling the algorithm that 2 is close to 3 and 1, but far away from 4.
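In Python, that snapping could look like this (quick sketch):

```python
import math

# Round a continuous prediction to the nearest grade, then clamp to [1, 4].
def to_grade(x: float) -> int:
    return max(1, min(4, math.floor(x + 0.5)))

print(to_grade(2.3), to_grade(4.9), to_grade(0.1))  # 2 4 1
```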

1