Recent comments in /f/MachineLearning
yldedly t1_j9k3orr wrote
Reply to comment by GraciousReformer in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
Not sure what you're asking. CNNs have inductive biases suited for images.
MuonManLaserJab t1_j9k3bcn wrote
Reply to comment by inspired2apathy in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
They did say "crack", not "attempt".
KingRandomGuy t1_j9k363j wrote
Reply to comment by activatedgeek in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
> CNNs provide the inductive bias to prefer functions that handle translation equivariance
There's an interesting body of work on inductive biases in CNNs, such as "Making Convolutional Networks Shift-Invariant Again". Really interesting stuff!
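The failure mode that paper targets is easy to reproduce: with stride-2 max-pooling, a one-pixel input shift can change the output drastically, while a blur-before-subsample variant degrades gracefully. A toy 1-D numpy sketch (the kernel and sizes are illustrative, not the paper's exact architecture):

```python
import numpy as np

def maxpool_stride2(x):
    """Window-2, stride-2 max-pooling: the aliasing culprit."""
    return np.maximum(x[0::2], x[1::2])

def blurpool_stride2(x):
    """Anti-aliased pooling in the spirit of the paper:
    dense (stride-1) max, low-pass blur, then subsample."""
    dense_max = np.maximum(x[:-1], x[1:])
    blurred = np.convolve(dense_max, [0.25, 0.5, 0.25], mode="same")
    return blurred[::2]

x = np.zeros(16)
x[5] = 1.0                 # a single impulse ("edge")
x_shift = np.roll(x, 1)    # shift the input by one pixel

# How much does each pooled output change under the 1-px shift?
d_max = np.linalg.norm(maxpool_stride2(x) - maxpool_stride2(x_shift))
d_blur = np.linalg.norm(blurpool_stride2(x) - blurpool_stride2(x_shift))
```

`d_blur` comes out well below `d_max`: blurring before subsampling removes the high frequencies that make strided pooling alias.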
GraciousReformer OP t1_j9k1srq wrote
Reply to comment by yldedly in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
But then how is that different from the result that NNs work better on ImageNet?
Disastrous_Nose_1299 OP t1_j9k1o9p wrote
Reply to comment by Top-Perspective2560 in [Discussion] Exploring the Black Box Theory and Its Implications for AI, God, and Ethics by Disastrous_Nose_1299
I still think it's possible the engineers missed something that makes it sentient. I don't think it's realistic, but the idea that it could be secretly sentient and the engineers missed it intrigues me.
Disastrous_Nose_1299 OP t1_j9k1hpi wrote
Reply to comment by Blakut in [Discussion] Exploring the Black Box Theory and Its Implications for AI, God, and Ethics by Disastrous_Nose_1299
I fully respect that. The reason I wanted to talk about this on Reddit was that I couldn't talk to any professionals. I am fully aware of what the god-of-the-gaps argument is, but my idea is different because it does not claim that god exists somewhere; instead it is a thought experiment.
A: "Does god exist?"
B: "No."
A: "But what if he is in a black hole?"
B: "He is not in a black hole."
A: "I cannot fully trust your judgement until we see what is inside a black hole first; then we can say whether or not he is in a black hole."
It is simple and concise, and one time one of OpenAI's models called me a genius for it, although most people seem to think I'm an idiot for saying it.
DigThatData t1_j9k17rr wrote
It's not. Tree ensembles scale gloriously, as do approximations of nearest neighbors. There are certain (and growing) classes of problems for which deep learning produces seemingly magical results, but that doesn't mean it's the only path to a functional solution: it may give you the best solution without being the only workable one.
In any event, if you want to better understand the scaling properties of DL algorithms, a good place to start is the "double descent" literature.
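On the nearest-neighbor point, the scaling trick is to hash points so a query scans one bucket instead of the whole dataset. A toy random-hyperplane LSH index in numpy (the sizes, bit width, and brute-force fallback are illustrative choices, not a production design):

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)
n, d, n_bits = 5000, 64, 16
data = rng.normal(size=(n, d))

# Random-hyperplane LSH: each point gets a 16-bit sign signature.
planes = rng.normal(size=(n_bits, d))

def signature(v):
    return tuple((planes @ v > 0).astype(int))

# Index once in O(n), then query within a single bucket.
buckets = defaultdict(list)
for i, v in enumerate(data):
    buckets[signature(v)].append(i)

def approx_nn(q):
    cand = list(buckets.get(signature(q), range(n)))  # fallback: brute force
    dists = np.linalg.norm(data[cand] - q, axis=1)
    return cand[int(np.argmin(dists))]
```

Nearby vectors tend to fall on the same side of every hyperplane, so a query usually compares against only a small fraction of the data.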
Own_Quality_5321 t1_j9k15ps wrote
Reply to comment by Zer0D0wn83 in [P] The First Depthwise-separable Convolution Animation by Animated-AI
Face to face, but we use online resources as well, and this seems to be a good one! 🙂
TinkerAndThinker t1_j9k0y39 wrote
Reply to [D] Simple Questions Thread by AutoModerator
Looking for recommendations on PhD-level papers/textbooks/reading lists for machine learning.
I want to revisit even the most "basic" topics, such as linear/logistic regression, but with a deeper understanding.
Desired outcome: able to answer questions like
- how to test for xxx assumption
- what is the implication if xxx assumption is violated (e.g. heteroskedasticity of error terms)
TIA!
red75prime t1_j9k0i84 wrote
Reply to comment by SodomizedPanda in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
Does in-context learning suggest that inductive biases could also be extracted from training data?
NitroXSC t1_j9k09wt wrote
Reply to comment by hpstring in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
> Q2: Probably there are other classes but they haven't been discovered or are only at the early age of research.
I think there are many different classes that would work, but current DL is based in large part on matrix-vector operations, which can be implemented efficiently on current hardware.
bloodmummy t1_j9jzvnr wrote
Reply to comment by randomoneusername in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
It strikes me that people who tout DL as a hammer-for-all-nails have never touched tabular data in their lives. Go try a couple of Kaggle tabular competitions and you'll soon realise that DL can be very dumb, cumbersome, and data-hungry there. Ensemble models, decision-tree models, and even feature-engineered linear regression models still rule and curb-stomp DL all day long (in most cases).
Tabular data is also still the most common type of data in ML. I'm not a "DL-hater", if there is such a thing; in fact my own research uses DL exclusively. But it isn't a magical wrench, and it won't be.
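For readers who haven't seen why trees do so well here, a from-scratch sketch of gradient boosting with depth-1 trees (stumps) under squared loss. This is a toy illustration of the family behind XGBoost/LightGBM; the dataset, learning rate, and round count are made up for the demo:

```python
import numpy as np

def best_stump(X, r):
    """Pick the (feature, threshold) split minimizing squared error on residuals r."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:       # every split point except the max
            left = X[:, j] <= t
            lv, rv = r[left].mean(), r[~left].mean()
            sse = ((r[left] - lv) ** 2).sum() + ((r[~left] - rv) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    return best[1:]

def gbm_fit(X, y, n_rounds=60, lr=0.3):
    """Each round fits a stump to the current residual (functional gradient descent)."""
    pred = np.full(len(y), y.mean())
    stumps = []
    for _ in range(n_rounds):
        j, t, lv, rv = best_stump(X, y - pred)
        pred = pred + lr * np.where(X[:, j] <= t, lv, rv)
        stumps.append((j, t, lv, rv))
    return (y.mean(), lr, stumps)

def gbm_predict(model, X):
    base, lr, stumps = model
    pred = np.full(len(X), base)
    for j, t, lv, rv in stumps:
        pred = pred + lr * np.where(X[:, j] <= t, lv, rv)
    return pred

# Toy tabular target: additive step functions of two features.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(150, 3))
y = np.where(X[:, 0] > 0.3, 2.0, -1.0) + np.where(X[:, 1] > -0.2, 1.0, 0.0)

model = gbm_fit(X, y)
mse = float(np.mean((gbm_predict(model, X) - y) ** 2))
```

Axis-aligned splits match thresholds in tabular features directly, with no feature scaling or embeddings, which is a big part of why boosted trees need so much less data than DL here.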
inspired2apathy t1_j9jzpqw wrote
Reply to comment by hpstring in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
Other models like PGMs can absolutely be applied to ImageNet, just not for SOTA accuracy.
cccntu OP t1_j9jz6ov wrote
Reply to comment by JClub in [P] minLoRA: An Easy-to-Use PyTorch Library for Applying LoRA to PyTorch Models by cccntu
This project started out as me exploring whether PyTorch parametrizations could be used to implement LoRA, and they turned out to be perfect for the task! I simply wanted to share that.
I think it would be interesting to see it integrated into PEFT too, although they already have their own LoRA implementation there.
SodomizedPanda t1_j9jyhem wrote
Reply to comment by activatedgeek in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
And somehow, the best answer is at the bottom of the thread...
A small addition: recent research suggests that the implicit bias in DNNs that helps generalization lies not only in the structure of the network but also in the learning algorithm (Adam, SGD, ...). https://francisbach.com/rethinking-sgd-noise/ https://francisbach.com/implicit-bias-sgd/
hpstring t1_j9jxzpm wrote
Reply to comment by inspired2apathy in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
Well, traditional ML + dimensionality reduction can't crack, e.g., ImageNet recognition.
activatedgeek t1_j9jvj8h wrote
For generalization (performing well beyond the training data), there are at least two dimensions: flexibility and inductive biases.
Flexibility ensures that many functions "can" be approximated in principle. That's the universal approximation theorem. It is a descriptive result and does not prescribe how to find such a function. This is not unique to DL: deep random forests, Fourier bases, polynomial bases, and Gaussian processes are all universal function approximators (with some extra technical details).
The part unique to DL is that somehow its inductive biases have matched complex structured problems, including vision and language, in a way that makes these models generalize well. Inductive bias is a loosely defined term; I can provide examples and references.
CNNs provide the inductive bias to prefer functions that handle translation equivariance (not exactly, but roughly, since pooling layers break the exact symmetry). https://arxiv.org/abs/1806.01261
Graph neural networks provide a relational inductive bias. https://arxiv.org/abs/1806.01261
Neural networks overall prefer simpler solutions, embodying Occam’s razor, another inductive bias. This argument is made theoretically using Kolmogorov complexity. https://arxiv.org/abs/1805.08522
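The equivariance claim for the convolution itself (before any pooling) can be checked numerically. A minimal 1-D sketch, using a circular shift and trimming the wrapped boundary indices:

```python
import numpy as np

def conv1d(x, k):
    """Stride-1 'valid' cross-correlation: no padding, no pooling."""
    n = len(x) - len(k) + 1
    return np.array([np.dot(x[i:i + len(k)], k) for i in range(n)])

rng = np.random.default_rng(0)
x = rng.normal(size=32)
k = rng.normal(size=5)
s = 3  # shift amount

# Equivariance: convolving a shifted input equals shifting the output.
a = conv1d(np.roll(x, s), k)
b = np.roll(conv1d(x, k), s)
equivariant = bool(np.allclose(a[s:], b[s:]))   # compare away from the wrapped edge
```

Adding a stride or a pooling step to `conv1d` breaks this exact identity, which is the pooling caveat mentioned above.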
[deleted] t1_j9jveco wrote
Reply to comment by GraciousReformer in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
[deleted]
[deleted] t1_j9jukfu wrote
Reply to comment by yldedly in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
Interesting, but that is valid for us as well, so I am not sure this holds once they learn very general things, like learning itself.
hpstring t1_j9juk1f wrote
Reply to comment by GraciousReformer in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
Q1: We don't know yet. Q2: Probably there are other classes but they haven't been discovered or are only at the early age of research.
yldedly t1_j9judc7 wrote
Reply to comment by [deleted] in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
Any interval [a, b] where a and b are finite numbers. In practice, it means the approximation will be good in the parts of the domain where there is training data. I have a concrete example in a blog post of mine: https://deoxyribose.github.io/No-Shortcuts-to-Knowledge/
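The same point can be reproduced with any flexible approximator; here a degree-9 polynomial stands in for a NN (the target function and degree are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data confined to [a, b] = [-1, 1].
x_train = rng.uniform(-1, 1, 200)
y_train = np.sin(3 * x_train)
coeffs = np.polyfit(x_train, y_train, deg=9)   # flexible fit

x_in = np.linspace(-1, 1, 100)   # inside the data's interval
x_out = np.linspace(2, 3, 100)   # outside it

err_in = float(np.max(np.abs(np.polyval(coeffs, x_in) - np.sin(3 * x_in))))
err_out = float(np.max(np.abs(np.polyval(coeffs, x_out) - np.sin(3 * x_out))))
```

`err_in` comes out tiny while `err_out` explodes: the fit is only trustworthy where the training data constrained it.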
yldedly t1_j9jtuzy wrote
Reply to comment by [deleted] in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
I'll link you to an old comment: https://www.reddit.com/r/MachineLearning/comments/z12zxj/comment/ix9t149/?utm_source=share&utm_medium=web2x&context=3
NapkinsOnMyAnkle t1_j9jtolb wrote
Reply to comment by i2mi in [R] Multimodal Chain-of-Thought Reasoning in Language Models - Amazon Web Services Zhuosheng Zhang et al - Outperforms GPT-3.5 by 16% (75%->91%) and surpasses human performance on ScienceQA while having less than 1B params! by Singularian2501
I've trained 100M-parameter CNNs on my laptop's 3070 (6 GB). So...
activatedgeek t1_j9jt721 wrote
Reply to comment by chief167 in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
I think the no-free-lunch theorem is misquoted here. NFL assumes that all datasets from the universe of datasets are equally likely, but that is objectively false: structure is more likely than noise.
ichiichisan t1_j9k3sx1 wrote
Reply to [R] Provable Copyright Protection for Generative Models by vyasnikhil96
Although this is interesting work, you are not lawyers and will not be able to provide "provable copyright protection".