Recent comments in /f/MachineLearning
JimmyTheCrossEyedDog t1_j6nv3zg wrote
Reply to comment by Internal-Diet-514 in [D] Have researchers given up on traditional machine learning methods? by fujidaiti
This feels like a mix-up between the colloquial and mathematical definitions of dimension. Yes, NN approaches tend to work better on very high-dimensional data, but the dimension here refers to the number of input features. So, for a 416x416x3 image, that's >500k dimensions, far higher than the number of dimensions in almost all tabular datasets.
> image data 4D (extra dimension for batch)
The batch is an arbitrary parceling of data simply due to how NNs are typically trained for computational reasons. If I were to train a NN on tabular data, it'd also be batched, but it doesn't give it a new meaningful dimension (either in the colloquial sense or the sense that matters for ML)
Also, NNs are still the best option for computer vision even on greyscale data, which is spatially 2D but still has a huge number of dimensions.
edit: I'd also argue that high dimensionality isn't the biggest reason NNs work for computer vision, but something more fundamental - see qalis's point in this thread
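As a quick illustration of the feature-count sense of "dimension" (a toy sketch, assuming NumPy):

```python
import numpy as np

# A single 416x416 RGB image, e.g. the input size of some object detectors
image = np.zeros((416, 416, 3))

# Flattened into the feature vector a model actually consumes,
# every pixel channel is one input dimension
features = image.reshape(-1)
print(features.shape[0])  # 519168 input dimensions
```

Batching just stacks more such vectors; it never adds an input dimension.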
nucLeaRStarcraft t1_j6nunti wrote
Reply to comment by qalis in [D] Have researchers given up on traditional machine learning methods? by fujidaiti
There's also this survey of DL vs traditional methods for tabular data: https://arxiv.org/pdf/2110.01889.pdf
EduCGM OP t1_j6nrow0 wrote
Reply to A taxonomy of generative AI models [R] by EduCGM
Anyway, the purpose of this work was just to discuss a popular topic in a way that stays readable for a broad audience, and it was done by an undergraduate student. There is no peer-reviewed article yet; hopefully there will be soon, and I will be delighted to share it here as well.
Technical_Ad_9732 t1_j6nqru3 wrote
Not at all. That would make this all a sad one-trick pony, and frankly, not every fix calls for a hammer when you have an entire toolbox at your disposal.
[deleted] t1_j6now07 wrote
Reply to comment by Internal-Diet-514 in [D] Have researchers given up on traditional machine learning methods? by fujidaiti
[deleted]
ApprehensiveNature69 t1_j6nmxbc wrote
Reply to comment by qalis in [D] Have researchers given up on traditional machine learning methods? by fujidaiti
Tangential, but I am so happy cupy is still being developed even though Chainer died.
ktpr t1_j6nmsol wrote
Reply to comment by beanhead0321 in [D] Have researchers given up on traditional machine learning methods? by fujidaiti
Did they claim traditional ML explained the features engineered by the DL? If so, how did they explain the units of feature variables?
akrasia_here_I_come t1_j6nmd3k wrote
Reply to comment by PredictorX1 in A taxonomy of generative AI models [R] by EduCGM
But this is exactly the kind of content that gets neglected in academia because of the assumption that everyone reading them already knows the field very well. Lit reviews are wonderful when you can find them, but there's not a lot of incentive to publish those.
If there's a peer-reviewed article out there that covers this info, then by all means please share it! (And in that case, it may be justified to critique someone for sharing the non-peer-reviewed equivalent.) But if there's not, it seems pointlessly exclusionary to gatekeep the sharing of illuminating content just because it's from outside academia proper.
Grandexar t1_j6nlqx2 wrote
The hype for neural nets right now is just because the hardware has finally caught up
aschroeder91 t1_j6nliim wrote
Reply to comment by beanhead0321 in [D] Have researchers given up on traditional machine learning methods? by fujidaiti
the real innovation will be comfortably backpropagating through the hybrid model as a whole
Imaginary_Parfait944 t1_j6nkswu wrote
Reply to comment by bananonymos in [D] Have researchers given up on traditional machine learning methods? by fujidaiti
I hope so.
Brudaks t1_j6nj9z1 wrote
For most established tasks people have a good idea (based on empirical evidence) about the limits of particular methods for this task.
There are tasks where "traditional machine learning methods" work well, and people working on these tasks use them and will use them.
And there are tasks where they don't and deep learning gets far better results than we could otherwise achieve - and for those types of tasks, yes, it would be accurate to say that we have given up on traditional machine learning; if you're given an image classification or text analysis task, you'd generally use DL even for a simple baseline without even trying any of the "traditional" methods we used in earlier years.
arhetorical t1_j6nhean wrote
Hiya, great work again! Maybe I'm outing myself a little here, but the code doesn't work on Windows machines, apparently because the processes are spawned instead of forked. I'm not sure it's an easy fix and maybe not worth the time (it works fine on WSL), but I just thought I'd mention it in case you weren't aware!
On the ML side, should this scale up pretty straightforwardly to CIFAR100 or are there things to be aware of?
thevillagersid t1_j6nfjle wrote
Reply to comment by antodima in [D] Sparse Ridge Regression by antodima
You can still compute the estimator with sparse inputs because the regularization term ensures the denominator is full rank. If the zeros are standing in for missing values, however, your estimates will be biased.
As for your second question, W* computed from only columns 2 and 4 will only yield the same values as W in the unrestricted model if the columns of X are orthogonal. Could you work with an orthogonal transform (e.g. PCA projection) of the X matrix?
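A minimal sketch of that first point, assuming NumPy/SciPy and synthetic data: with the ridge penalty added, the regularized Gram matrix is invertible even when the sparse design matrix itself is rank-deficient.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
# Sparse design matrix: 100 samples x 10 features, ~70% structural zeros
X = sparse.random(100, 10, density=0.3, random_state=0)
y = rng.normal(size=100)

lam = 1.0  # ridge penalty; makes X^T X + lam*I full rank regardless of X
A = (X.T @ X).toarray() + lam * np.eye(10)
b = X.T @ y
W = np.linalg.solve(A, b)  # ridge estimator (X^T X + lam*I)^{-1} X^T y
print(W.shape)  # (10,)
```

The caveat stands: this treats the zeros as genuine zero measurements, not as missing values.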
Internal-Diet-514 t1_j6nep37 wrote
Deep learning is only really the better option with higher-dimensional data. If tabular data is 2D, time series is 3D, and image data is 4D (with an extra dimension for the batch), then deep learning is really only used for 3D and 4D data. As others have said, tree-based models will most of the time outperform deep learning on a 2D problem.
But I think the interesting thing is the reason we have to use deep learning in the first place. In higher-dimensional data we don't have something that is "a feature" in the sense that we do with 2D data. In a time series you have features, but they are taken over time, so really we need a feature that describes how that feature evolves over time. That's what CNNs do. CNNs are feature extractors, and at the end of the process they almost always put the data back into 2D format (when doing classification), which is sent through a neural net, but it could be sent through a random forest as well.
I think it's fair to compare a neural network to traditional ML, but when we get to a CNN that's not really a fair comparison. A CNN is a feature extraction method. The great thing is that we can optimize this step by connecting it to a neural network with a sigmoid (or whatever activation) output.
We don't have a way to connect traditional ML methods to a feature extraction step the way backpropagation connects a neural net to a CNN. If we could find a way to do that, maybe we would see a rise in the use of traditional ML for high-dimensional data.
EduCGM OP t1_j6ndee3 wrote
Reply to A taxonomy of generative AI models [R] by EduCGM
Got it.
pucklermuskau t1_j6nc5b5 wrote
Reply to comment by EduCGM in A taxonomy of generative AI models [R] by EduCGM
We'd prefer peer-review.
bananonymos t1_j6nc2fr wrote
Lol no. Have researchers given up on linear regression?
beanhead0321 t1_j6nbq1j wrote
Reply to comment by aschroeder91 in [D] Have researchers given up on traditional machine learning methods? by fujidaiti
I remember sitting in on a talk from a large insurance company who did this a number of years back. They used DL for feature engineering, but used traditional ML for the predictive model itself. This had to do with satisfying some regulatory requirements around model interpretability.
EduCGM OP t1_j6nbkep wrote
Reply to comment by PredictorX1 in A taxonomy of generative AI models [R] by EduCGM
?
PredictorX1 t1_j6nb8vs wrote
Reply to A taxonomy of generative AI models [R] by EduCGM
>This medium article is ...
Let me stop you right there.
PredictorX1 t1_j6naygg wrote
Reply to comment by [deleted] in [D] Have researchers given up on traditional machine learning methods? by fujidaiti
Deep learning and "shallow" machine learning are generally appropriate for different types of problems.
andreichiffa t1_j6n9lg6 wrote
Reply to comment by visarga in Few questions about scalability of chatGPT [D] by besabestin
A lot of the conclusions from that paper have been called into question by the discovery, a little less than a year later, that GPT-2 was actually memorizing a lot of information from the training dataset: https://arxiv.org/abs/2012.07805
About a year after that, Anthropic came out with a paper suggesting scaling laws under which undertrained larger models did not do that much better and actually did need more data: https://arxiv.org/pdf/2202.07785.pdf
Finally, more recent results from DeepMind did an additional pass on the topic and seem to suggest that the relationship between data and model size is much tighter than anticipated, and that a 4x smaller model trained for 4x the time would outperform the larger model: https://arxiv.org/pdf/2203.15556.pdf
Basically, the original OpenAI paper contradicted a lot of prior research on overfitting and generalization, and that seems to be due to an instance of Simpson's paradox in some of the batching they were doing.
SaifKhayoon t1_j6n9kb2 wrote
Nah, researchers haven't given up on traditional machine learning methods! They combine them with deep learning in lots of places, like image classification, speech recognition, and recommender systems.
Plus, traditional methods can be better for some tasks, like when you have a small dataset or want an explainable model or real-time predictions.
peatfreak t1_j6nvknw wrote
Reply to [D] Have researchers given up on traditional machine learning methods? by fujidaiti
No way. Machine learning has always been subject to ideas cycling around over time, booms and busts of interest. For example, neural nets have come and gone many times during the decades. I wouldn't be surprised if there is a revival of interest in support vector machines, for example, in 5-10 years' time.