Recent comments in /f/MachineLearning

Gemabo t1_j5rwh5f wrote

Matlab has a Deep Learning Toolbox that makes it easy and efficient to train any type of model, including RNNs. That said, there is a good argument (and a famous paper) that anything you can do with an RNN you can do better with a CNN. Julia has deep learning libraries too, but don't expect nearly the same level of support and ease of use as Matlab. Matlab's DL toolbox is underrated.

4

toftinosantolama t1_j5rvxa3 wrote

Reply to comment by entarko in [D] CVPR Reviews are out by banmeyoucoward

Well, the ratings could be hidden... Not that this is the problem; the problem is that the reviewers are really entitled and not willing to stand corrected. I've seen this so many times. And I'd bet these kinds of reviewers are PhD students, and not very clever ones.

0

entarko t1_j5rvhgc wrote

Since there is a discussion period, not having access to the initial reviews would only waste time: as a reviewer, you would end up rewriting arguments that could simply have been read in the initial review.

2

entarko t1_j5rujuz wrote

There is an official discussion period between reviewers and the AC starting on the 31st of January. It would be weird not to know the other reviewers' ratings; that would be unprecedented, since they have been visible for at least the last three years.

2

toftinosantolama t1_j5rtegr wrote

I don't have an answer to your question, but given that the reviewers don't know each other's scores during the rebuttal, a good rebuttal could potentially raise all of them. Even with just two borderlines raised to weak accepts, an accept should be possible. Good luck, folks.

2

PredictorX1 t1_j5rb8gp wrote

>I was under the impression that two contiguous linear layers in a NN would be no better than only one linear layer.

This is correct: In terms of the functions they can represent, two consecutive linear layers are algebraically equivalent to one linear layer.
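A minimal NumPy sketch of the collapse (the shapes are arbitrary, chosen just for illustration): composing `y = W2 (W1 x + b1) + b2` yields a single layer with weight `W2 W1` and bias `W2 b1 + b2`.

```python
import numpy as np

# Hypothetical shapes, purely for illustration: 4-dim input,
# 5-dim hidden layer, 3-dim output.
rng = np.random.default_rng(0)
x = rng.standard_normal(4)
W1, b1 = rng.standard_normal((5, 4)), rng.standard_normal(5)
W2, b2 = rng.standard_normal((3, 5)), rng.standard_normal(3)

two_layers = W2 @ (W1 @ x + b1) + b2        # linear -> linear, no activation
one_layer = (W2 @ W1) @ x + (W2 @ b1 + b2)  # the collapsed single layer

assert np.allclose(two_layers, one_layer)
```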

1

arg_max t1_j5r8qe6 wrote

What do you mean by "function represented by a neural network"? If you are hinting at universal approximation, then yes, you can approximate any continuous function arbitrarily closely with a single hidden layer, sigmoid activations, and unbounded width. Similarly, there are results showing an analogous statement for width-limited, arbitrarily deep networks (the required depth is finite for any given function but depends on the function you want to approximate, and afaik it is unbounded over the space of continuous functions). In practice, we are far from either regime, so specific configurations can matter.
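A rough illustration of the width side of this claim. It's a sketch only: instead of training the hidden layer, it fixes random sigmoid features and fits just the output weights by least squares, and `sin` is an arbitrary target function. The approximation error shrinks as the width grows.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200)[:, None]
y = np.sin(x).ravel()

for width in (5, 50, 500):
    # random hidden weights/biases; fit only the output layer
    W, b = rng.standard_normal((1, width)), rng.standard_normal(width)
    H = sigmoid(x @ W + b)
    coef, *_ = np.linalg.lstsq(H, y, rcond=None)
    err = np.max(np.abs(H @ coef - y))
    print(f"width={width:4d}  max error={err:.4f}")
```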

1

HateRedditCantQuitit t1_j5r5f69 wrote

You can represent any `m x n` matrix as the product of an `m x k` matrix and a `k x n` matrix, so long as `k >= min(m, n)`. If `k` is less than that, you're basically adding regularization.

Imagine you have some optimal `M` in `Y = M X`. If `A` and `B` are the right shape (big enough in the `k` dimension), their product can represent that `M`; if they aren't big enough, they can't. And if the optimal `M` doesn't actually need a zillion degrees of freedom, then a small `k` bakes that restriction into the model, which amounts to regularization.

Look up linear bottlenecks.
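A small NumPy sketch of both cases (the `6 x 4` shape is arbitrary, and the exact factorization comes from the SVD, which is just one convenient choice):

```python
import numpy as np

# Any m x n matrix factors exactly as A @ B once k >= min(m, n);
# with k below the rank, A @ B can only reach a rank-k approximation.
rng = np.random.default_rng(0)
m, n = 6, 4
M = rng.standard_normal((m, n))  # generic matrix: rank = min(m, n) = 4

U, s, Vt = np.linalg.svd(M, full_matrices=False)

for k in (2, 4):
    A = U[:, :k] * s[:k]   # m x k factor
    B = Vt[:k]             # k x n factor
    err = np.linalg.norm(M - A @ B)
    print(f"k={k}: reconstruction error = {err:.2e}")
# k=4 recovers M exactly (up to float error); k=2 leaves a residual:
# the bottleneck acting as a capacity restriction.
```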

3