Recent comments in /f/MachineLearning
AnothaUselessComment t1_j9c9er6 wrote
Reply to [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
Yikes, this may be tough.
I know you can try BLOOM (like this blog post did) and let the download run overnight, but you may run into problems; I've heard the download takes forever.
https://enjoymachinelearning.com/blog/gpt-3-vs-bloom/
Though I will say, even if your hardware is great, it's probably worth whatever cost you're trying to dodge just to hit an API instead.
AlmightySnoo OP t1_j9c56bx wrote
Reply to comment by Optimal-Asshole in [D] On papers forcing the use of GANs where it is not relevant by AlmightySnoo
>It’s not even using the generative model for anything useful.
Thank you, that's literally what I meant in my second paragraph. They're literally training the GAN to learn Dirac distributions. The noise has no use, and the discriminator eventually ends up learning to do roughly the job of a simple squared loss.
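To spell out the collapse (just a sketch, using generic notation rather than anything from the paper): if the target conditional distribution is a Dirac, the optimal generator is deterministic, so the noise input is dead weight:

```latex
% Setting: supervised pairs (x, y) with a deterministic target y = f(x),
% i.e. the conditional law is p(y \mid x) = \delta\bigl(y - f(x)\bigr).
% Any generator G(x, z) whose output distribution matches this law must
% ignore the noise z and output f(x), which is exactly the minimizer of
% the plain squared loss:
\min_G \; \mathbb{E}_{x, z}\, \bigl\| G(x, z) - f(x) \bigr\|^2
% so the discriminator's job degenerates to (roughly) scoring this
% squared error, and the adversarial machinery adds nothing.
```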
EuphoricPenguin22 t1_j9c51t7 wrote
Reply to comment by catch23 in [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
Does that increase inference time?
Agreeable-Run-9152 t1_j9c4naa wrote
Yeah, I actually agree with your rant. However, there is a small chance they acted in good faith and did not see that the randomness in the GAN won't do anything.
Optimal-Asshole t1_j9c4h8d wrote
Reply to comment by AlmightySnoo in [D] On papers forcing the use of GANs where it is not relevant by AlmightySnoo
Okay lol, so I'm actually researching kinda similar things, and I assumed this paper was related because it used similar tools, but upon a closer look, nope, nvm. It's not even using the generative model for anything useful.
So their paper just shows that the basic idea of least-squares PDE solving can be reused for generative models. Okay, now it's average class-project tier. I guess this demonstrates that, yes, these workshops accept literally anything.
Edit: it’s still not plagiarism. It’s just not very novel. Plagiarism is stealing ideas without credit. What they did was discuss an existing idea and extend it in a very small way experimentally only. Not plagiarism.
AlmightySnoo OP t1_j9c2trd wrote
Reply to comment by Optimal-Asshole in [D] On papers forcing the use of GANs where it is not relevant by AlmightySnoo
>It doesn’t seem like plagiarism, since they do ample citation.
It is when you are pretending to do things differently while in practice you do the exact same thing and add a useless layer (the GAN) to give the false impression of novelty. Merely citing sources in such cases doesn't shield you from being accused of plagiarism.
>As far as the justification goes, there are some generative based approaches for solving parametric PDEs even now.
Not disputing that there might be papers out there where the use is justified, of course there are skilled researchers with academic integrity. But again, in this paper, and the ones I'm talking about in general, the setting is exactly as in my 2nd paragraph, where the use of GANs is clearly not justified at all.
>but I don’t think it’s that bad
Again, in the context of my second paragraph (because that's literally what they're doing), it is bad.
Optimal-Asshole t1_j9c20cy wrote
I think these workshops accept every submission that is not incoherent or desk rejected.
From my quick glance, it doesn't seem like plagiarism, since they do ample citation. As far as the justification goes, there are some generative-model-based approaches for solving parametric PDEs even now. It doesn't seem like the best paper ever, but I don't think it's that bad.
Artichoke-Lower t1_j9bnbgf wrote
Reply to [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
This also seems really promising: https://github.com/Ying1123/FlexGen
k3iter t1_j9bn6k3 wrote
tdgros t1_j9bfds3 wrote
Reply to comment by harharveryfunny in [D] Something basic I don't understand about Nerfs by alik31239
Just read the post!
>However, the paper itself builds a network that gets as an input 5D vectors (3 location coordinates+2 camera angles) and outputs color and volume density for each such coordinate. I don't understand where do I get those 5D coordinates from? My training data surely doesn't have those - I only have a collection of images.
harharveryfunny t1_j9bf30y wrote
Reply to comment by tdgros in [D] Something basic I don't understand about Nerfs by alik31239
OP's question seems to be how to get from 2D images to the 3D voxels, no? But anyway, if they've got their answer, that's good.
Edit: I guess they were talking about camera position for the photos, not mapping to 3D.
Purplekeyboard t1_j9bd1jg wrote
Reply to [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
Keep in mind, these smaller models are going to be a lot dumber than what you've likely seen in GPT-3.
pyepyepie t1_j9bbg1b wrote
Reply to [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
Try to use both GPUs with this one:
https://github.com/huggingface/accelerate
https://huggingface.co/docs/accelerate/usage_guides/big_modeling
https://huggingface.co/blog/accelerate-large-models
Maybe it will help (the last link is the clearest, IMHO).
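If it helps, a rough sketch of what that looks like. The model name and the memory budgets are placeholders I made up for illustration; only the small helper below runs without the libraries installed, the actual loading call is shown commented out:

```python
# Sketch: splitting a big model across two GPUs (plus CPU spill-over)
# with Hugging Face Accelerate's device_map="auto".

def max_memory_map(gpu_gb, n_gpus, cpu_gb):
    """Build the max_memory dict Accelerate expects:
    one entry per GPU index plus an optional 'cpu' spill-over budget."""
    mem = {i: f"{gpu_gb}GiB" for i in range(n_gpus)}
    mem["cpu"] = f"{cpu_gb}GiB"
    return mem

# Leave some headroom below the physical 8 GB per card.
mem = max_memory_map(gpu_gb=7, n_gpus=2, cpu_gb=24)

# With transformers + accelerate installed, loading would look like:
# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained(
#     "facebook/opt-6.7b",   # placeholder model name
#     device_map="auto",     # let Accelerate place layers across devices
#     max_memory=mem,        # cap per-device memory usage
# )
```

The `max_memory` caps matter more than they look: without them, Accelerate may pack the first GPU full and leave no room for activations.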
Janderhungrige t1_j9bantv wrote
I work for a data and AI consulting company. You can contact me on LinkedIn for a non-committal chat.
catch23 t1_j9b9upb wrote
Reply to [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
Could try something like this: https://github.com/Ying1123/FlexGen
This was only released a few hours ago, so there's no way you could have discovered it previously. It basically makes use of various offloading strategies if your machine has lots of ordinary CPU memory. The paper authors were able to fit a 175B-parameter model on their lowly 16GB T4 GPU (on a machine with 200GB of normal memory).
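For intuition, here's a toy illustration of the offloading idea (this is NOT FlexGen's actual API, and the layer sizes are made-up numbers; FlexGen also compresses weights, which is how 175B fits at all):

```python
# Toy sketch of weight offloading: given a per-layer weight size and the
# memory budgets, decide how many transformer layers live in VRAM and
# how many get parked in ordinary CPU RAM and streamed in on demand.

def split_layers(n_layers, layer_gb, vram_gb, ram_gb):
    """Return (layers_on_gpu, layers_on_cpu) for a simple greedy split."""
    on_gpu = min(n_layers, int(vram_gb // layer_gb))
    on_cpu = min(n_layers - on_gpu, int(ram_gb // layer_gb))
    if on_gpu + on_cpu < n_layers:
        raise MemoryError("model does not fit even with CPU offloading")
    return on_gpu, on_cpu

# Hypothetical 96-layer model at ~2 GB/layer, 16 GB GPU, 200 GB of RAM:
gpu_layers, cpu_layers = split_layers(n_layers=96, layer_gb=2.0,
                                      vram_gb=16, ram_gb=200)
```

The trade-off is that every CPU-resident layer has to cross the PCIe bus each forward pass, which is why throughput drops sharply once most of the model is offloaded.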
waffles2go2 t1_j9b9iiv wrote
Too generic a question - what data do you use for QC? Why do you think AI is a good fit?
There are a ton of "intro to AI" videos on YT that will explain the main domain areas (problems) where ML is a good fit. Start there.
deathisnear t1_j9b8yzh wrote
The original NeRF requires the camera poses. As /u/marixer commented, the typical approach is to approximate the camera poses using SfM tools like COLMAP. However, there has been some work that tries to tackle NeRF without known camera poses.
Last-Belt-4010 t1_j9b8gtl wrote
Reply to [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
Just a question: does this work with non-Nvidia GPUs, like Intel Arc and such?
[deleted] t1_j9b8duw wrote
Reply to comment by Emergency_Apricot_77 in [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
[deleted]
MediumOrder5478 t1_j9b7ln5 wrote
You need to use a program like COLMAP for sparse scene reconstruction to recover the camera intrinsics (focal length, lens distortions) and extrinsics (camera positions and orientations).
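To connect this back to OP's question about where the 5D inputs come from: once you have intrinsics and a camera-to-world pose per image, each pixel defines a ray, and samples along that ray give the 3D positions while the ray direction gives the two viewing angles. A minimal sketch, assuming a pinhole camera with no lens distortion and an OpenGL-style convention (camera looks down -z):

```python
import math

def pixel_to_ray(i, j, width, height, focal, c2w):
    """Turn pixel (i, j) into a world-space ray (origin, unit direction).
    c2w is a 3x4 camera-to-world matrix [R | t] as nested lists,
    e.g. recovered by an SfM tool such as COLMAP."""
    # Ray direction in camera coordinates (pinhole model, -z forward).
    d_cam = [(i - width / 2) / focal, -(j - height / 2) / focal, -1.0]
    # Rotate into world coordinates and normalize.
    d = [sum(c2w[r][k] * d_cam[k] for k in range(3)) for r in range(3)]
    n = math.sqrt(sum(x * x for x in d))
    d = [x / n for x in d]
    # The camera center (translation column) is the ray origin.
    o = [c2w[r][3] for r in range(3)]
    return o, d  # sample points are o + t*d; d supplies the view angles

# Identity rotation, camera at the origin: the center pixel's ray
# points straight down the -z axis.
o, d = pixel_to_ray(50, 50, 100, 100, focal=100.0,
                    c2w=[[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]])
```

So the training data really is just images: the 5D coordinates are manufactured per pixel from the recovered poses.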
walkingsparrow t1_j9b7j3d wrote
Reply to comment by radi-cho in [R] [N] In this paper, we show how a conversational model, 3.5x smaller than SOTA, can be optimized to outperform the baselines through Auxiliary Learning. Published in the ACL Anthology: "Efficient Task-Oriented Dialogue Systems with Response Selection as an Auxiliary Task." by radi-cho
I think I understand now. Thanks for the explanation.
Emergency_Apricot_77 t1_j9b68si wrote
Reply to comment by Rockingtits in [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
They literally asked for LARGE language models
DevarshTare OP t1_j9b4u7s wrote
Reply to comment by TruthAndDiscipline in [D] What matters while running models? by DevarshTare
That's interesting. I was considering that purchase since it makes sense to run larger datasets or models on the RTX 3060, but the tensor core count was significantly lower. The GPU would run much larger models, but at a lower speed, I assume?
How has your experience been with larger models, especially video- or image-based models?
luaks1337 t1_j9cajyf wrote
Reply to comment by EuphoricPenguin22 in [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
Yes, at least if I read the documentation correctly.