Recent comments in /f/MachineLearning

AnothaUselessComment t1_j9c9er6 wrote

Yikes, this may be tough.

I know you can try Bloom (like this blog post did) and let it download overnight, but you may run into problems. (I've heard the download takes forever.)

https://enjoymachinelearning.com/blog/gpt-3-vs-bloom/

Though I will say, it's probably worth whatever cost you're trying to dodge just to hit an API, even if your hardware is great.

2

AlmightySnoo OP t1_j9c56bx wrote

>It’s not even using the generative model for anything useful.

Thank you, that's literally what I meant in my second paragraph. They're literally training the GAN to learn Dirac distributions. The noise has no use, and the discriminator eventually ends up learning to do roughly the job of a simple squared loss.
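One way to write that argument down, as a sketch in notation assumed here rather than taken from the paper: let $u^*(x)$ be the single deterministic target for input $x$, $G(x, z)$ a generator with noise $z$, and $f$ a plain regression network.

```latex
p(u \mid x) = \delta\bigl(u - u^*(x)\bigr)
\;\Longrightarrow\;
G(x, z) = u^*(x) \ \text{for almost every } z,
\qquad
\arg\min_f \; \mathbb{E}_x \bigl\| f(x) - u^*(x) \bigr\|^2 = u^*.
```

If the target really is a Dirac mass, a generator that matches it cannot depend on $z$, and the squared-loss regression reaches the same map $u^*$ without a discriminator.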

−6

Optimal-Asshole t1_j9c4h8d wrote

Okay lol so I’m actually researching kinda similar things and I assumed this paper was related because it used similar tools but upon a closer look, nope nvm. It’s not even using the generative model for anything useful.

So their paper just shows that the basic idea of least squares PDE solving can be used for generative models. Okay now it’s average class project tier. I guess this demonstrates that yes these workshops accept literally anything.

Edit: it’s still not plagiarism. It’s just not very novel. Plagiarism is stealing ideas without credit. What they did was discuss an existing idea and extend it in a very small way experimentally only. Not plagiarism.

14

AlmightySnoo OP t1_j9c2trd wrote

>It doesn’t seem like plagiarism, since they do ample citation.

It is when you are pretending to do things differently while in practice you do the exact same thing and add a useless layer (the GAN) to give the false impression of novelty. Merely citing sources in such cases doesn't shield you from being accused of plagiarism.

>As far as the justification goes, there are some generative based approaches for solving parametric PDEs even now.

Not disputing that there might be papers out there where the use is justified, of course there are skilled researchers with academic integrity. But again, in this paper, and the ones I'm talking about in general, the setting is exactly as in my 2nd paragraph, where the use of GANs is clearly not justified at all.

>but I don’t think it’s that bad

Again, in the context of my second paragraph (because that's literally what they're doing), it is bad.

−17

Optimal-Asshole t1_j9c20cy wrote

I think these workshops accept every submission that is not incoherent or desk rejected.

From my quick glance, it doesn’t seem like plagiarism, since they do ample citation. As far as the justification goes, there are some generative based approaches for solving parametric PDEs even now. It doesn’t seem like the best paper ever, but I don’t think it’s that bad.

14

tdgros t1_j9bfds3 wrote

Just read the post!

>However, the paper itself builds a network that gets as an input 5D vectors (3 location coordinates+2 camera angles) and outputs color and volume density for each such coordinate. I don't understand where do I get those 5D coordinates from? My training data surely doesn't have those - I only have a collection of images.
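For what it's worth, here is a rough sketch of where such 5D inputs usually come from: each pixel is turned into a ray using the camera pose, and points are sampled along that ray. It assumes a pinhole camera and a known camera-to-world matrix `c2w` per image (in practice typically estimated with a structure-from-motion tool; that step is not shown). The function names, the `-z` viewing convention, and the `near`/`far` bounds are illustrative assumptions, not something stated in the post.

```python
import numpy as np

def get_rays(height, width, focal, c2w):
    """Per-pixel ray origins and unit directions in world space.

    c2w is an assumed, already-known camera-to-world matrix (3x4 or 4x4).
    """
    i, j = np.meshgrid(np.arange(width, dtype=np.float64),
                       np.arange(height, dtype=np.float64), indexing="xy")
    # Pinhole model; the camera is assumed to look down the -z axis.
    dirs = np.stack([(i - width * 0.5) / focal,
                     -(j - height * 0.5) / focal,
                     -np.ones_like(i)], axis=-1)
    rays_d = dirs @ c2w[:3, :3].T                      # rotate into world space
    rays_d /= np.linalg.norm(rays_d, axis=-1, keepdims=True)
    rays_o = np.broadcast_to(c2w[:3, 3], rays_d.shape)  # camera centre
    return rays_o, rays_d

def sample_5d(rays_o, rays_d, near, far, n_samples):
    """Stack (x, y, z, theta, phi) for n_samples points along every ray."""
    t = np.linspace(near, far, n_samples)
    points = rays_o[..., None, :] + rays_d[..., None, :] * t[:, None]
    theta = np.arccos(np.clip(rays_d[..., 2], -1.0, 1.0))  # polar angle
    phi = np.arctan2(rays_d[..., 1], rays_d[..., 0])        # azimuth
    angles = np.stack([theta, phi], axis=-1)
    angles = np.broadcast_to(angles[..., None, :], points.shape[:-1] + (2,))
    return np.concatenate([points, angles], axis=-1)

# Toy example: identity pose, 4x6 image, 8 samples per ray.
rays_o, rays_d = get_rays(4, 6, 5.0, np.eye(4))
inputs = sample_5d(rays_o, rays_d, near=2.0, far=6.0, n_samples=8)
```

So the images supply the colour targets for each pixel, while the 5D inputs are derived from the camera pose of each image rather than stored in the dataset.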

7

catch23 t1_j9b9upb wrote

Could try something like this: https://github.com/Ying1123/FlexGen

This was only released a few hours ago, so there's no way for you to have discovered this previously. Basically, it makes use of various offloading strategies if your machine has lots of normal CPU memory. The paper authors were able to fit a 175B parameter model on their lowly 16GB T4 GPU (with a machine with 200GB of normal memory).
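To get a feel for the numbers, here is a back-of-the-envelope sketch of how much memory the weights alone would need at a few precisions. It counts weights only (no activations or attention cache), and the bit widths are illustrative assumptions, not a claim about how the authors configured their run.

```python
def weight_memory_gb(n_params, bits_per_param):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

N_PARAMS = 175e9  # the 175B parameter model mentioned above

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: {weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

At 16 bits per parameter the weights alone exceed the 16GB of GPU memory plus 200GB of CPU memory, so a setup like that has to rely on something extra, such as compressing the weights or spilling part of them to disk.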

56

DevarshTare OP t1_j9b4u7s wrote

That's interesting. I was considering that purchase since it makes sense to run larger datasets or models on the RTX 3060. But it has significantly fewer Tensor cores. The GPU would definitely run much larger models, but at a lower speed, I assume?

How has your experience been with larger models? Especially video or image based models?

2