Recent comments in /f/MachineLearning

inspired2apathy t1_j9jsbz6 wrote

Is that entirely accurate, though? There are all kinds of explicit dimensionality reduction methods. They can be combined with traditional ML models pretty easily for supervised learning. As I understand it, the unique thing DL gives us is a massive learned embedding that can encode/"represent" something like language or vision.

6

yldedly t1_j9jpuky wrote

There are two aspects, scalability and inductive bias. DL is scalable because compositions of differentiable functions make backpropagation fast, and because those functions are mostly matrix multiplications, which makes GPU acceleration effective. Combine this with stochastic gradients, and you can train on very large datasets very quickly.
Inductive biases make DL effective in practice, not just in theory. The universal approximation theorem only guarantees that an architecture and weight-setting *exist* that approximate a given function. What matters in practice is that DL's bias towards low-dimensional smooth manifolds matches many real-world datasets, so SGD will easily find a local optimum with these properties. When it doesn't, for example on tabular data where discontinuities are common, DL performs worse than alternatives, even if with more data it would eventually approximate a discontinuity.
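To make the scalability point concrete, here's a minimal sketch (my own toy example, not from the comment): a two-layer net is just a composition of matrix multiplies and a pointwise nonlinearity, so the backward pass is a few more matrix multiplies, and training uses stochastic (minibatch) gradients on a smooth target, which is DL's favourable case.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(256, 1))
y = np.sin(3 * X)                        # smooth target function

# Two-layer MLP: parameters are just matrices and bias vectors
W1 = rng.normal(0, 0.5, (1, 32)); b1 = np.zeros(32)
W2 = rng.normal(0, 0.5, (32, 1)); b2 = np.zeros(1)

lr = 0.1
for step in range(2000):
    idx = rng.integers(0, 256, size=32)  # minibatch = stochastic gradients
    x, t = X[idx], y[idx]
    h = np.tanh(x @ W1 + b1)             # forward: matmul + nonlinearity
    pred = h @ W2 + b2
    err = pred - t
    # backward: chain rule through the composition, again mostly matmuls
    gW2 = h.T @ err / len(x); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h ** 2)     # tanh'(z) = 1 - tanh(z)^2
    gW1 = x.T @ dh / len(x); gb1 = dh.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

mse = float(np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - y) ** 2))
```

Every operation here maps onto batched matrix multiplication, which is why the same recipe scales to GPUs and very large datasets.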

4

GraciousReformer OP t1_j9jppu7 wrote

"Artificial neural networks are often (demeneangly) called "glorified regressions". The main difference between ANNs and multiple / multivariate linear regression is of course, that the ANN models nonlinear relationships."

https://stats.stackexchange.com/questions/344658/what-is-the-essential-difference-between-a-neural-network-and-nonlinear-regressi

−3