Recent comments in /f/MachineLearning

edjez t1_j7t9rp3 wrote

Another emergent capability - and this one depends on the model architecture; for example, I don't think Stable Diffusion could have it, but DALL-E does - is generating written letters / "captions" that look like gibberish to us but actually correspond to internal language embeddings for real-world clusters of concepts.

2

nielsrolf t1_j7stpek wrote

Parti (https://parti.research.google/) showed that being able to spell is an emergent ability. That is the only one I know of, but another I could imagine is compositional generalization (a blue box between a yellow sphere and a green box), though it's more likely that this is a data issue. Working out of distribution (a green dog) is another potential candidate. Interesting question.
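One way to probe these candidate abilities yourself is just to run such prompts through a public text-to-image pipeline. A minimal sketch using Hugging Face's diffusers, with a Stable Diffusion checkpoint as a stand-in since Parti itself isn't publicly released (the model name and prompts here are illustrative):

```python
import torch
from diffusers import StableDiffusionPipeline

# Illustrative checkpoint choice, not what Parti used
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompts = [
    "a blue box between a yellow sphere and a green box",  # compositionality
    "a green dog",                                         # out of distribution
    'a sign that says "EMERGENT"',                         # spelling
]
for prompt in prompts:
    image = pipe(prompt).images[0]
    image.save(prompt.replace(" ", "_").replace('"', "") + ".png")
```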

48

currentscurrents t1_j7sri62 wrote

SNN-ANN conversion is a kludge - not only do you have to train an ANN first, it also means your SNN is incapable of learning anything new.

Surrogate gradients are better! But they're still non-local and require backward passes, which means you're missing out on the massive parallelization you could achieve with local learning rules on the right hardware.
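For anyone curious, here's a minimal PyTorch sketch of the surrogate gradient idea (assumptions mine: a Heaviside spike in the forward pass, a fast-sigmoid derivative as the surrogate in the backward pass; the class name is made up):

```python
import torch

class SurrogateSpike(torch.autograd.Function):
    @staticmethod
    def forward(ctx, membrane_potential):
        ctx.save_for_backward(membrane_potential)
        # Non-differentiable Heaviside step: spike if potential > 0
        return (membrane_potential > 0).float()

    @staticmethod
    def backward(ctx, grad_output):
        (membrane_potential,) = ctx.saved_tensors
        # Surrogate: derivative of a fast sigmoid, 1 / (1 + |u|)^2
        surrogate_grad = 1.0 / (1.0 + membrane_potential.abs()) ** 2
        return grad_output * surrogate_grad

spike_fn = SurrogateSpike.apply  # use in place of a hard threshold
```

The forward pass stays a hard threshold, so the network still spikes; only the gradient is smoothed, which is exactly why this still needs a global backward pass.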

Local learning is the dream, and would have benefits for ANNs too: you could train a single giant model distributed across an entire datacenter or even multiple datacenters over the internet. Quadrillion-parameter models would be technically feasible - I don't know what happens at that scale, but I'd sure love to find out.
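To make "local" concrete, here's a toy sketch of a classic local rule, Oja's rule (a stabilized Hebbian update). Each weight update uses only pre- and post-synaptic activity and the current weights, no backward pass - that locality is what would allow the distributed training described above:

```python
import numpy as np

def oja_step(w, x, lr=0.01):
    """One local update for a single linear neuron: w += lr * y * (x - y * w)."""
    y = w @ x                  # post-synaptic activity (scalar)
    w += lr * y * (x - y * w)  # uses only local quantities: x, y, w
    return w

# Toy usage: the weight vector converges toward the first principal
# component of the input distribution.
rng = np.random.default_rng(0)
w = rng.normal(size=3)
for _ in range(1000):
    x = rng.multivariate_normal(np.zeros(3), np.diag([3.0, 1.0, 0.5]))
    w = oja_step(w, x)
```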

2

EyeSprout t1_j7sqjzc wrote

CNNs, and some very early optimizations for them that used to be useful but are no longer needed now that our computers are faster (like Gabor filters), are partly inspired by neuroscience research. Attention mechanisms were also floating around in neuroscience for quite a while, in models of memory and retrieval, before being streamlined and simplified into the form we see today.

In general, when ideas move from neuroscience to machine learning, they need a lot of stripping down to their actually relevant and useful components before they become workable. Neuroscientists have a lot of ideas for mechanisms, but not all of them are useful...
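For contrast, here's roughly the "streamlined" form attention ended up in - scaled dot-product attention (Vaswani et al., 2017) - as a minimal NumPy sketch:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: (n, d), K: (m, d), V: (m, d_v) -> (n, d_v)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                              # weighted "retrieval" of values
```

You can still read the memory-and-retrieval story in it - keys index a memory, queries probe it, values are what gets recalled - but everything else got stripped away.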

3

sonofmath t1_j7se4mx wrote

Reply to comment by mr_house7 in [D] List of RL Papers by C_l3b

Can't really speak for the Hugging Face course. It seems to touch on relatively advanced topics and challenging tasks. It certainly looks nice from a practitioner's side, and is very useful for learning the various tricks that make RL work.

Regarding Silver's course, it is a bit outdated indeed, but the focus is more on the basics of RL, whereas Levine focuses on deep RL and assumes a good understanding of the basics.

Now, some topics in Silver's course are a bit dated (e.g. TD(lambda) with eligibility traces, or linear function approximation) and would be better replaced in more modern courses by topics like DQN or AlphaGo (UCL also has a more recent series, which touches on deep RL). But Silver's explanations are very instructive, and it is one of the best-taught university courses I have seen (in general). I would for sure at least watch the first few lectures.
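For reference, here's a minimal sketch of tabular TD(lambda) with accumulating eligibility traces, the kind of thing Silver covers. The `env` interface is hypothetical (a Gym-style reset()/step() where the env steps under a fixed policy):

```python
import numpy as np

def td_lambda(env, n_states, episodes=500, alpha=0.1, gamma=0.99, lam=0.9):
    """Tabular TD(lambda) policy evaluation with accumulating traces."""
    V = np.zeros(n_states)
    for _ in range(episodes):
        e = np.zeros(n_states)      # eligibility traces
        s = env.reset()
        done = False
        while not done:
            # Hypothetical interface: env steps under a fixed policy
            s_next, r, done = env.step()
            delta = r + gamma * V[s_next] * (not done) - V[s]
            e[s] += 1.0             # accumulating trace for the visited state
            V += alpha * delta * e  # credit all recently visited states
            e *= gamma * lam        # decay traces toward zero
            s = s_next
    return V
```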

2