Recent comments in /f/MachineLearning
[deleted] t1_j9jsgf6 wrote
Reply to comment by yldedly in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
What is "bounded interval" here?
inspired2apathy t1_j9jsbz6 wrote
Reply to comment by hpstring in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
Is that entirely accurate though? There are all kinds of explicit dimensionality reduction methods, and they can be combined with traditional ML models pretty easily for supervised learning. As I understand it, the unique thing DL gives us is just a massive embedding that can encode/"represent" something like language or vision.
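Something like this is what I mean by the combination, as a rough scikit-learn sketch (the dataset and models are just placeholders):

```python
# A rough sketch of "explicit dimensionality reduction + a traditional ML model"
# for supervised learning, using scikit-learn (dataset is just an example).
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)

# Reduce 64 pixel features to 16 components, then fit a plain linear classifier on them.
clf = make_pipeline(PCA(n_components=16), LogisticRegression(max_iter=1000))
print(cross_val_score(clf, X, y, cv=5).mean())
```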
GraciousReformer OP t1_j9jrhjd wrote
Reply to comment by yldedly in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
This is a great point, thank you. So do you mean that DL works for language models only when they are trained on a large amount of data?
yldedly t1_j9jr821 wrote
Reply to comment by GraciousReformer in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
This one is pretty good: https://arxiv.org/abs/2207.08815
GraciousReformer OP t1_j9jr4i4 wrote
Reply to comment by yldedly in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
"for example on tabular data where discontinuities are common, DL performs worse than alternatives, even if with more data it would eventually approximate a discontinuity." True. Is there references on this issue?
yldedly t1_j9jpuky wrote
Reply to comment by GraciousReformer in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
There are two aspects: scalability and inductive bias. DL is scalable because compositions of differentiable functions make backpropagation fast, and because those functions are mostly matrix multiplications, GPU acceleration is effective. Combine this with stochastic gradients, and you can train on very large datasets very quickly.
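As a minimal sketch of the stochastic-gradient part (toy numpy SGD on linear regression, not any particular framework's API): the cost of one update depends only on the minibatch, not on the dataset size.

```python
# Toy numpy sketch of SGD on linear regression: each update touches only a
# minibatch, so the per-step cost is independent of the total dataset size.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100_000, 20))
true_w = rng.normal(size=20)
y = X @ true_w + 0.1 * rng.normal(size=100_000)

w = np.zeros(20)
lr, batch = 0.1, 256
for _ in range(2_000):
    idx = rng.integers(0, len(X), size=batch)   # sample a minibatch
    Xb, yb = X[idx], y[idx]
    grad = 2 * Xb.T @ (Xb @ w - yb) / batch     # gradient of the minibatch MSE
    w -= lr * grad                              # one cheap update

print(np.linalg.norm(w - true_w))               # close to 0: SGD recovered the weights
```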
Inductive biases make DL effective in practice, not just in theory. While the universal approximation theorem guarantees that an architecture and weight-setting exist that approximate a given function, the bias of DL towards low-dimensional smooth manifolds reflects many real-world datasets, meaning that SGD will easily find a local optimum with these properties (and when it doesn't, for example on tabular data where discontinuities are common, DL performs worse than alternatives, even if with more data it would eventually approximate a discontinuity).
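As a toy illustration of that last point (a hypothetical scikit-learn comparison; exact numbers will vary): a step function is trivial for a shallow tree but gets rounded off by a small MLP.

```python
# Toy comparison on a discontinuous target: a single split nails the step,
# while a small MLP tends to smooth over the jump (exact numbers will vary).
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(2_000, 1))
y = (X[:, 0] > 0).astype(float)                  # hard discontinuity at 0

X_test = np.linspace(-0.1, 0.1, 201).reshape(-1, 1)
y_test = (X_test[:, 0] > 0).astype(float)

tree = DecisionTreeRegressor(max_depth=2).fit(X, y)
mlp = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2_000).fit(X, y)

print("tree MSE near the jump:", ((tree.predict(X_test) - y_test) ** 2).mean())
print("mlp  MSE near the jump:", ((mlp.predict(X_test) - y_test) ** 2).mean())
```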
GraciousReformer OP t1_j9jpu94 wrote
Reply to comment by LowLook in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
?
LowLook t1_j9jprdu wrote
Reply to comment by GraciousReformer in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
Inventing them
GraciousReformer OP t1_j9jppu7 wrote
Reply to comment by terminal_object in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
"Artificial neural networks are often (demeneangly) called "glorified regressions". The main difference between ANNs and multiple / multivariate linear regression is of course, that the ANN models nonlinear relationships."
uhules t1_j9jpkun wrote
Reply to comment by GraciousReformer in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
Before why, ask if. GBDTs are very widely used.
terminal_object t1_j9jp51j wrote
Reply to comment by GraciousReformer in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
You seem confused as to what you yourself are saying.
yldedly t1_j9jorh1 wrote
Reply to comment by ewankenobi in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
It's not from a paper, but it's pretty uncontroversial I think - though people like to forget about the "bounded interval" part, or at least what it implies about extrapolation.
chinguetti t1_j9joqfu wrote
Reply to comment by astonzhang in [R] Multimodal Chain-of-Thought Reasoning in Language Models - Amazon Web Services Zhuosheng Zhang et al - Outperforms GPT-3.5 by 16% (75%->91%) and surpasses human performance on ScienceQA while having less than 1B params! by Singularian2501
Will make a good story when you accept your Nobel prize. Well done.
Optimal-Asshole t1_j9jo26z wrote
Reply to comment by buyIdris666 in [D] Bottleneck Layers: What's your intuition? by _Arsenie_Boca_
Residual refers to the fact that the NN/bottleneck learns the residual left over after accounting for the entire input. Anyone calling the skip connections “residual connections” should stop though lol
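Roughly what I mean, as a toy PyTorch-style sketch (module and names are made up): the inner block only has to learn the residual, because the skip connection adds the input back at the end.

```python
# Toy residual bottleneck: the inner layers only have to learn the residual
# f(x), because the input x is added back by the skip connection.
import torch
import torch.nn as nn

class BottleneckResidual(nn.Module):
    def __init__(self, dim, bottleneck_dim):
        super().__init__()
        self.f = nn.Sequential(
            nn.Linear(dim, bottleneck_dim),   # squeeze
            nn.ReLU(),
            nn.Linear(bottleneck_dim, dim),   # expand back
        )

    def forward(self, x):
        return x + self.f(x)                  # output = input + learned residual

x = torch.randn(8, 64)
print(BottleneckResidual(64, 16)(x).shape)    # torch.Size([8, 64])
```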
1bir t1_j9jnu2h wrote
Reply to comment by GraciousReformer in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
>Whether the algorithms for training them do that as well as the ones for deep NNs in practice is a separate issue.
For supervised learning, the big problem with decision trees (RFs, GBTs, etc.) seems to be representation learning.
ktpr t1_j9jmqq7 wrote
I feel like recently ML boosters come to this subreddit, make large claims, and then use the ensuing discussion, time, and energy from others to correct their clickbait content at our expense.
elmcity2019 t1_j9jm4nq wrote
I have been an applied data scientist for 10 years. I have built over 100k models using Python, Databricks, and DataRobot. I have never seen a DL model outcompete all the other algorithms. Granted, I am largely working with structured business data, but nonetheless DL isn't really competitive.
ewankenobi t1_j9jl91t wrote
Reply to comment by yldedly in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
I like your wording. Did you come up with that definition yourself, or is it from a paper?
randomoneusername t1_j9jkzs7 wrote
Reply to comment by [deleted] in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
The stand-alone statement you have there is very vague. Can I assume it's talking about NLP or CV projects?
On tabular data, even with non-linear relationships, normal boosting and ensemble algorithms can scale and be top of the game.
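For example, something like scikit-learn's histogram-based GBDT is a strong, scalable tabular baseline (a rough sketch on synthetic data, purely illustrative):

```python
# Rough sketch of a scalable tabular baseline: scikit-learn's histogram-based
# gradient boosting on synthetic data (purely illustrative).
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=100_000, n_features=30,
                           n_informative=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = HistGradientBoostingClassifier(max_iter=200).fit(X_tr, y_tr)
print(clf.score(X_te, y_te))   # captures non-linear structure out of the box
```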
Zer0D0wn83 t1_j9jjq9v wrote
Reply to comment by Own_Quality_5321 in [P] The First Depthwise-separable Convolution Animation by Animated-AI
Do you teach online?
[deleted] t1_j9jt2vp wrote
Reply to comment by yldedly in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
>the bias of DL towards low-dimensional smooth manifolds
What is this? Got all the rest but that