Recent comments in /f/MachineLearning

ggf31416 t1_j9a8p88 wrote

The 3070 and 3060 Ti both have 8GB, and while the 3070 will be a bit faster, most people will agree that the difference is not worth the price if you have a tight budget.

For training, the extra 4GB on the plain 3060 (12GB total) is quite useful, but for inference only you can run most small and medium models (such as Stable Diffusion) in 8GB, and the 3060 Ti will be faster.
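For example, a minimal sketch of running Stable Diffusion for inference in fp16 so it fits comfortably in 8GB (the checkpoint id is just an illustration, swap in whatever you actually use):

```python
import torch
from diffusers import StableDiffusionPipeline

# Load the weights in fp16 so the pipeline fits within ~8GB of VRAM
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # example checkpoint
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

image = pipe("a photo of an astronaut riding a horse").images[0]
image.save("astronaut.png")
```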

2

lemurlemur t1_j9a8gb7 wrote

Reply to comment by BarockMoebelSecond in [D] Please stop by [deleted]

Yes, this is how science works - you make a claim and show proof.

This is NOT how developing an idea works though, and this subreddit exists in part to help develop ideas. Developing an idea requires entertaining ideas that are not fully formed, and yes, this includes some ideas that may seem stupid or wrong.

−1

avocadoughnut t1_j9a64k1 wrote

Yup. I'd recommend using whichever RWKV model can fit in VRAM with fp16/bf16 (apparently 8-bit is 4x slower and lower accuracy). I've been running GPT-J on a 24GB GPU for months (longer contexts are possible using accelerate), and I noticed massive speed increases when using fp16 (or bf16? I don't remember) rather than 8-bit.
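For reference, a rough sketch of how the fp16 vs 8-bit loading choice looks with transformers + accelerate (the prompt and generation settings are just placeholders):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# fp16 load; device_map="auto" uses accelerate to place the weights on the GPU
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")

# 8-bit alternative (needs bitsandbytes); in my experience noticeably slower to generate
# model = AutoModelForCausalLM.from_pretrained(
#     "EleutherAI/gpt-j-6B", device_map="auto", load_in_8bit=True
# )

inputs = tokenizer("The meaning of life is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```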

16

ArmagedonAshhole t1_j9a1vq3 wrote

It depends mostly on settings, so no.

A small context like 200-300 tokens could work with 24GB, but then your AI will not remember and connect the dots well, which would make the model worse than a 13B one.

People are working right now on splitting the work between GPU (VRAM) and CPU (RAM) in 8-bit mode (see the sketch below). I think offloading about 10% to RAM would make the model work well on a 24GB VRAM card. It would be a bit slower but still usable.

If you want, you can always load the whole model into RAM and run it on the CPU, but it is very slow.
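A minimal sketch of the intermediate option, splitting weights between the GPU and CPU RAM with Hugging Face accelerate. This uses fp16 offload rather than the 8-bit splitting being worked on, and the model id and memory caps are only illustrative:

```python
import torch
from transformers import AutoModelForCausalLM

# Keep as much as fits on the 24GB GPU, spill the remaining layers to CPU RAM.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-30b",              # example of a model too big for 24GB at fp16
    torch_dtype=torch.float16,
    device_map="auto",               # accelerate decides the GPU/CPU placement
    max_memory={0: "22GiB", "cpu": "64GiB"},
)
```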

12

ggf31416 t1_j99y9e1 wrote

https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/

https://lambdalabs.com/gpu-benchmarks

How much VRAM you need will depend mostly on the number of parameters of the model, with some extra for the data. At FP32 precision each parameter needs 4 bytes, at FP16 or BF16 2 bytes, and at FP8 or INT8 only one byte. Almost all models can be run at FP16 without noticeable accuracy loss; FP8 sometimes works and sometimes doesn't, depending on the model.
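As a back-of-the-envelope check (the ~20% overhead factor for activations/KV cache is my own rough assumption, not a fixed rule):

```python
def vram_estimate_gb(params_billion: float, bytes_per_param: int, overhead: float = 1.2) -> float:
    """Weights-only VRAM estimate, padded ~20% for activations / KV cache."""
    return params_billion * bytes_per_param * overhead

print(vram_estimate_gb(6, 4))  # 6B params at FP32       -> ~28.8 GB
print(vram_estimate_gb(6, 2))  # 6B params at FP16/BF16  -> ~14.4 GB
print(vram_estimate_gb(6, 1))  # 6B params at INT8/FP8   ->  ~7.2 GB
```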

3