Recent comments in /f/MachineLearning
AnothaUselessComment t1_j9c9er6 wrote
Reply to [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
Yikes, this may be tough.
I know you can try BLOOM (like this blog post did) and let the download run overnight, but you may run into problems; I've heard the download takes forever.
https://enjoymachinelearning.com/blog/gpt-3-vs-bloom/
Though I will say, even if your hardware is great, it's probably worth whatever cost you're trying to dodge just to hit an API instead.
AlmightySnoo OP t1_j9c56bx wrote
Reply to comment by Optimal-Asshole in [D] On papers forcing the use of GANs where it is not relevant by AlmightySnoo
>It’s not even using the generative model for anything useful.
Thank you, that's literally what I meant in my second paragraph. They're literally training the GAN to learn Dirac distributions. The noise has no use, and the discriminator eventually ends up learning to do roughly the job of a simple squared loss.
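To spell out the collapse (just a sketch, using generic notation rather than anything from the paper): if the target conditional distribution is a Dirac, the optimal generator is deterministic, so the noise input is dead weight:

```latex
% Setting: supervised pairs (x, y) with a deterministic target y = f(x),
% i.e. the conditional law is p(y \mid x) = \delta\bigl(y - f(x)\bigr).
% Any generator G(x, z) whose output distribution matches this law must
% ignore the noise z and output f(x), which is exactly the minimizer of
% the plain squared loss:
\min_G \; \mathbb{E}_{x, z}\, \bigl\| G(x, z) - f(x) \bigr\|^2
% so the discriminator's job degenerates to (roughly) scoring this
% squared error, and the adversarial machinery adds nothing.
```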
EuphoricPenguin22 t1_j9c51t7 wrote
Reply to comment by catch23 in [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
Does that increase inference time?
Agreeable-Run-9152 t1_j9c4naa wrote
Yeah, I actually agree with your rant. However, there is a small chance they acted in good faith and did not see that the randomness in the GAN won't do anything.
Optimal-Asshole t1_j9c4h8d wrote
Reply to comment by AlmightySnoo in [D] On papers forcing the use of GANs where it is not relevant by AlmightySnoo
Okay lol, so I'm actually researching kinda similar things, and I assumed this paper was related because it used similar tools, but upon a closer look, nope, nvm. It's not even using the generative model for anything useful.
So their paper just shows that the basic idea of least-squares PDE solving can be reused for generative models. Okay, now it's average class-project tier. I guess this demonstrates that, yes, these workshops accept literally anything.
Edit: it’s still not plagiarism. It’s just not very novel. Plagiarism is stealing ideas without credit. What they did was discuss an existing idea and extend it in a very small way experimentally only. Not plagiarism.
AlmightySnoo OP t1_j9c2trd wrote
Reply to comment by Optimal-Asshole in [D] On papers forcing the use of GANs where it is not relevant by AlmightySnoo
>It doesn’t seem like plagiarism, since they do ample citation.
It is when you are pretending to do things differently while in practice you do the exact same thing and add a useless layer (the GAN) to give the false impression of novelty. Merely citing sources in such cases doesn't shield you from being accused of plagiarism.
>As far as the justification goes, there are some generative based approaches for solving parametric PDEs even now.
Not disputing that there might be papers out there where the use is justified, of course there are skilled researchers with academic integrity. But again, in this paper, and the ones I'm talking about in general, the setting is exactly as in my 2nd paragraph, where the use of GANs is clearly not justified at all.
>but I don’t think it’s that bad
Again, in the context of my second paragraph (because that's literally what they're doing), it is bad.
Optimal-Asshole t1_j9c20cy wrote
I think these workshops accept every submission that is not incoherent or desk rejected.
From my quick glance, it doesn't seem like plagiarism, since they do ample citation. As far as the justification goes, there are some generative-model-based approaches for solving parametric PDEs even now. It doesn't seem like the best paper ever, but I don't think it's that bad.
Artichoke-Lower t1_j9bnbgf wrote
Reply to [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
This also seems really promising: https://github.com/Ying1123/FlexGen
k3iter t1_j9bn6k3 wrote
tdgros t1_j9bfds3 wrote
Reply to comment by harharveryfunny in [D] Something basic I don't understand about Nerfs by alik31239
Just read the post!
>However, the paper itself builds a network that gets as an input 5D vectors (3 location coordinates+2 camera angles) and outputs color and volume density for each such coordinate. I don't understand where do I get those 5D coordinates from? My training data surely doesn't have those - I only have a collection of images.
harharveryfunny t1_j9bf30y wrote
Reply to comment by tdgros in [D] Something basic I don't understand about Nerfs by alik31239
OP's question seems to be how to get from 2D images to the 3D voxels, no? But anyway, if they've got their answer, that's good.
Edit: I guess they were talking about camera position for the photos, not mapping to 3D.
Purplekeyboard t1_j9bd1jg wrote
Reply to [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
Keep in mind, these smaller models are going to be a lot dumber than what you've likely seen in GPT-3.
pyepyepie t1_j9bbg1b wrote
Reply to [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
Try to use both GPUs with this one:
https://github.com/huggingface/accelerate
https://huggingface.co/docs/accelerate/usage_guides/big_modeling
https://huggingface.co/blog/accelerate-large-models
Maybe it will help (the last link is the clearest, IMHO).
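If it helps, a rough sketch of what that looks like. The model name and the memory budgets are placeholders I made up for illustration; only the small helper below runs without the libraries installed, the actual loading call is shown commented out:

```python
# Sketch: splitting a big model across two GPUs (plus CPU spill-over)
# with Hugging Face Accelerate's device_map="auto".

def max_memory_map(gpu_gb, n_gpus, cpu_gb):
    """Build the max_memory dict Accelerate expects:
    one entry per GPU index plus an optional 'cpu' spill-over budget."""
    mem = {i: f"{gpu_gb}GiB" for i in range(n_gpus)}
    mem["cpu"] = f"{cpu_gb}GiB"
    return mem

# Leave some headroom below the physical 8 GB per card.
mem = max_memory_map(gpu_gb=7, n_gpus=2, cpu_gb=24)

# With transformers + accelerate installed, loading would look like:
# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained(
#     "facebook/opt-6.7b",   # placeholder model name
#     device_map="auto",     # let Accelerate place layers across devices
#     max_memory=mem,        # cap per-device memory usage
# )
```

The `max_memory` caps matter more than they look: without them, Accelerate may pack the first GPU full and leave no room for activations.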
Janderhungrige t1_j9bantv wrote
I work for a data and AI consulting company. You can contact me on LinkedIn for a non-committal chat.
catch23 t1_j9b9upb wrote
Reply to [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
Could try something like this: https://github.com/Ying1123/FlexGen
This was only released a few hours ago, so there's no way you could have discovered it previously. It basically makes use of various offloading strategies if your machine has lots of ordinary CPU memory. The paper authors were able to fit a 175B-parameter model on their lowly 16GB T4 GPU (on a machine with 200GB of normal memory).
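For intuition, here's a toy illustration of the offloading idea (this is NOT FlexGen's actual API, and the layer sizes are made-up numbers; FlexGen also compresses weights, which is how 175B fits at all):

```python
# Toy sketch of weight offloading: given a per-layer weight size and the
# memory budgets, decide how many transformer layers live in VRAM and
# how many get parked in ordinary CPU RAM and streamed in on demand.

def split_layers(n_layers, layer_gb, vram_gb, ram_gb):
    """Return (layers_on_gpu, layers_on_cpu) for a simple greedy split."""
    on_gpu = min(n_layers, int(vram_gb // layer_gb))
    on_cpu = min(n_layers - on_gpu, int(ram_gb // layer_gb))
    if on_gpu + on_cpu < n_layers:
        raise MemoryError("model does not fit even with CPU offloading")
    return on_gpu, on_cpu

# Hypothetical 96-layer model at ~2 GB/layer, 16 GB GPU, 200 GB of RAM:
gpu_layers, cpu_layers = split_layers(n_layers=96, layer_gb=2.0,
                                      vram_gb=16, ram_gb=200)
```

The trade-off is that every CPU-resident layer has to cross the PCIe bus each forward pass, which is why throughput drops sharply once most of the model is offloaded.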
waffles2go2 t1_j9b9iiv wrote
Too generic a question - what data do you use for QC? Why do you think AI is a good fit?
There are a ton of "intro to AI" videos on YT that will explain the main domain areas (problems) where ML is a good fit. Start there.
deathisnear t1_j9b8yzh wrote
The original NeRF requires the camera poses. As /u/marixer commented, the typical approach is to approximate the camera poses using SfM tools like COLMAP. However, there has been some work that tries to tackle NeRF without known camera poses.
Last-Belt-4010 t1_j9b8gtl wrote
Reply to [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
Just a question: does this work with non-Nvidia GPUs, like Intel Arc and such?
[deleted] t1_j9b8duw wrote
Reply to comment by Emergency_Apricot_77 in [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
[deleted]
MediumOrder5478 t1_j9b7ln5 wrote
You need to use a program like COLMAP for sparse scene reconstruction to recover the camera intrinsics (focal length, lens distortions) and extrinsics (camera positions and orientations).
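To connect this back to OP's question about where the 5D inputs come from: once you have intrinsics and a camera-to-world pose per image, each pixel defines a ray, and samples along that ray give the 3D positions while the ray direction gives the two viewing angles. A minimal sketch, assuming a pinhole camera with no lens distortion and an OpenGL-style convention (camera looks down -z):

```python
import math

def pixel_to_ray(i, j, width, height, focal, c2w):
    """Turn pixel (i, j) into a world-space ray (origin, unit direction).
    c2w is a 3x4 camera-to-world matrix [R | t] as nested lists,
    e.g. recovered by an SfM tool such as COLMAP."""
    # Ray direction in camera coordinates (pinhole model, -z forward).
    d_cam = [(i - width / 2) / focal, -(j - height / 2) / focal, -1.0]
    # Rotate into world coordinates and normalize.
    d = [sum(c2w[r][k] * d_cam[k] for k in range(3)) for r in range(3)]
    n = math.sqrt(sum(x * x for x in d))
    d = [x / n for x in d]
    # The camera center (translation column) is the ray origin.
    o = [c2w[r][3] for r in range(3)]
    return o, d  # sample points are o + t*d; d supplies the view angles

# Identity rotation, camera at the origin: the center pixel's ray
# points straight down the -z axis.
o, d = pixel_to_ray(50, 50, 100, 100, focal=100.0,
                    c2w=[[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]])
```

So the training data really is just images: the 5D coordinates are manufactured per pixel from the recovered poses.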
walkingsparrow t1_j9b7j3d wrote
Reply to comment by radi-cho in [R] [N] In this paper, we show how a conversational model, 3.5x smaller than SOTA, can be optimized to outperform the baselines through Auxiliary Learning. Published in the ACL Anthology: "Efficient Task-Oriented Dialogue Systems with Response Selection as an Auxiliary Task." by radi-cho
I think I understand now. Thanks for the explanation.
Emergency_Apricot_77 t1_j9b68si wrote
Reply to comment by Rockingtits in [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
They literally asked for LARGE language models
DevarshTare OP t1_j9b4u7s wrote
Reply to comment by TruthAndDiscipline in [D] What matters while running models? by DevarshTare
That's interesting. I was considering that purchase since it makes sense to run larger datasets or models on the RTX 3060, but the tensor core count was significantly lower. The GPU would run much larger models, but at a lower speed, I assume?
How has your experience been with larger models, especially video- or image-based models?
luaks1337 t1_j9cajyf wrote
Reply to comment by EuphoricPenguin22 in [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
Yes, at least if I read the documentation correctly.