Recent comments in /f/MachineLearning
Mysterious-Career236 t1_jb57u9u wrote
Reply to comment by LeanderKu in [R] High-resolution image reconstruction with latent diffusion models from human brain activity by SleekEagle
I also bet the scope didn't allow them to try hard enough. It's possible you could generalise a little if you trained the model on numerous people and not only a handful.
von-hust OP t1_jb55rhp wrote
Reply to comment by JrdnRgrs in [R] We found nearly half a billion duplicated images on LAION-2B-en. by von-hust
I think the first version of SD was trained with duplicates, and they made some effort to remove duplicates for training v2 (people on Discord are saying pHash or something similar). I suppose it'd be interesting to see whether the same prompts still produce verbatim copies.
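For anyone curious what that kind of dedup roughly looks like, here's a dependency-free sketch of an average-hash ("aHash") check, a simpler cousin of the DCT-based pHash people are describing. The 8x8 grids of numbers stand in for downscaled grayscale images; everything here is a toy illustration, not the LAION pipeline.

```python
# Toy average-hash duplicate check. Images are 8x8 grayscale grids
# (lists of lists) so the sketch stays dependency-free; a real pipeline
# would decode and downscale actual images and use a DCT-based pHash.

def ahash(pixels):
    # One bit per pixel: 1 if the pixel is brighter than the mean.
    flat = [p for row in pixels for p in row]
    avg = sum(flat) / len(flat)
    return sum((1 << i) for i, p in enumerate(flat) if p > avg)

def hamming(a, b):
    # Number of differing bits between two hashes.
    return bin(a ^ b).count("1")

img = [[(r * 8 + c) % 256 for c in range(8)] for r in range(8)]
dup = [row[:] for row in img]   # exact copy
near = [row[:] for row in img]
near[0][0] += 3                 # tiny perturbation (near-duplicate)

exact_dist = hamming(ahash(img), ahash(dup))
near_dist = hamming(ahash(img), ahash(near))
```

A real dedup pass would then cluster hashes within some small Hamming distance; the threshold is a tuning choice, not anything reported for LAION.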
NoLifeGamer2 OP t1_jb559ia wrote
Reply to comment by Philpax in [D] Ethics of minecraft stable diffusion by NoLifeGamer2
Thx! If I do do it, I will probably just use it myself, or submit it anonymously.
AuspiciousApple t1_jb557q8 wrote
Reply to comment by JrdnRgrs in [R] We found nearly half a billion duplicated images on LAION-2B-en. by von-hust
Not obviously so.
First, de-duplicating text data didn't help much in the cramming paper. Second, even if the images are duplicates, the captions might be different so you still learn more than if you only had one copy of each image.
Finally, even with exact copies of text and image, it would just weigh those images more heavily than the rest - which could harm performance, not matter at all, or even help performance (for instance if those images tend to be higher quality/more interesting/etc.)
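The weighting point can be made concrete with a toy example: duplicating an image k times in the dataset gives the same mean loss as weighting that image's loss term by k. All the numbers below are invented purely for illustration.

```python
# Duplicating an example in the training set is equivalent to
# up-weighting its loss term by its copy count.

def mean_loss(losses):
    return sum(losses) / len(losses)

# Hypothetical per-example losses for three unique images.
unique = {"cat.jpg": 0.9, "dog.jpg": 0.4, "fox.jpg": 0.2}

# Dataset where cat.jpg appears 3 times (a duplicate cluster).
with_dupes = [0.9, 0.9, 0.9, 0.4, 0.2]

# Equivalent weighted average over unique images: weight = copy count.
weights = {"cat.jpg": 3, "dog.jpg": 1, "fox.jpg": 1}
weighted = sum(weights[k] * unique[k] for k in unique) / sum(weights.values())
```

Whether that implicit up-weighting hurts, helps, or washes out depends entirely on which images get duplicated, which is the point above.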
FeministNeuroNerd t1_jb54wql wrote
Reply to comment by Adamanos in [R] [N] Dropout Reduces Underfitting - Liu et al. by radi-cho
Superb interchange
SaifKhayoon t1_jb54pnw wrote
Is this why some checkpoints / safetensors make for better results than stable diffusion's 1.5 and 2.1 weights?
Was LAION-2B used to train the base model shared by all other "models"/weights?
Adamanos t1_jb542kw wrote
Reply to comment by Delster111 in [R] [N] Dropout Reduces Underfitting - Liu et al. by radi-cho
immaculate dialogue
JrdnRgrs t1_jb53xvx wrote
Very interesting, so what is the implication for stable diffusion?
Does this mean that if the data set was corrected for these duplicated images that a corrected model using this data set would be of even "higher quality"? Can't wait
Philpax t1_jb53nhe wrote
Reply to comment by I_will_delete_myself in [R] RWKV (100% RNN) can genuinely model ctx4k+ documents in Pile, and RWKV model+inference+generation in 150 lines of Python by bo_peng
As far as I can tell, the sparse documentation is just because they've been in pure R&D mode. I've played around with it in their Discord server and can confirm it does perform well, but I've struggled to get it working locally.
Philpax t1_jb53hvo wrote
Reply to [D] Ethics of minecraft stable diffusion by NoLifeGamer2
You are probably fine, but note that a) people will likely be very angry with you, whether or not the licensing permits it and b) this is a non-trivial problem and even more non-trivial to train.
Good luck, though!
Philpax t1_jb53cl8 wrote
Reply to comment by head_robotics in [D] Ethics of minecraft stable diffusion by NoLifeGamer2
The issue is less with the platform they're using and more with where they're sourcing the data from. They're asking if there are any issues with taking people's uploaded builds and using them to train a generative system.
I_will_delete_myself t1_jb532v5 wrote
Reply to comment by Philpax in [R] RWKV (100% RNN) can genuinely model ctx4k+ documents in Pile, and RWKV model+inference+generation in 150 lines of Python by bo_peng
Intelligence is the ability to distill complex information into a simple explanation that a child can understand.
It makes me skeptical when someone can't explain their choice beyond performance reasons. Most people just use the cloud because ML networks, regardless of size, drain a lot of battery.
Delster111 t1_jb4z3qm wrote
Reply to comment by askljof in [R] [N] Dropout Reduces Underfitting - Liu et al. by radi-cho
fantastic discourse
Hiitstyty t1_jb4wjnj wrote
Reply to comment by tysam_and_co in [R] [N] Dropout Reduces Underfitting - Liu et al. by radi-cho
It helps to think of the bias-variance trade-off in terms of the hypothesis space. Dropout trains subnetworks at every iteration. The hypothesis space of the full network will always contain (and be larger than) the hypothesis space of any subnetwork, because the full network has greater expressive capacity. Thus, the full network cannot be any more biased than any subnetwork. However, any subnetwork will have reduced variance because of its smaller relative hypothesis space. Thus, dropout helps because its reduction in variance offsets its increase in bias. However, as the dropout proportion is set increasingly higher, eventually the bias will be too great to overcome.
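A minimal sketch of what "training a subnetwork each iteration" means, using inverted dropout on a single linear unit. The weights and inputs are toy values, not anything from the paper.

```python
import random

def dropout_mask(n, p, rng):
    # Sample a binary mask: each unit is kept with probability 1 - p.
    return [0.0 if rng.random() < p else 1.0 for _ in range(n)]

def forward(x, w, mask, p):
    # One linear unit with inverted dropout on its inputs.
    # Scaling by 1/(1-p) keeps the expected activation unchanged.
    scale = 1.0 / (1.0 - p)
    return sum(xi * mi * scale * wi for xi, mi, wi in zip(x, mask, w))

rng = random.Random(0)
x = [1.0, 2.0, -1.0, 0.5]
w = [0.3, -0.2, 0.5, 0.1]
p = 0.5

# Each training step samples a different mask, i.e. a different
# subnetwork drawn from the full network's hypothesis space.
m1 = dropout_mask(len(x), p, rng)
m2 = dropout_mask(len(x), p, rng)

# At test time no units are dropped and no scaling is applied:
# the full network makes the prediction.
full = sum(xi * wi for xi, wi in zip(x, w))
```

Each sampled mask restricts the model to a smaller subnetwork (higher bias, lower variance), while the unmasked forward pass at test time uses the full capacity, which is the trade-off described above.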
shayanrc t1_jb4vbcz wrote
Reply to comment by iloveintuition in [D] Best way to run LLMs in the cloud? by QTQRQD
What config did you use?
l0g1cs t1_jb4tbu7 wrote
Reply to [D] Best way to run LLMs in the cloud? by QTQRQD
Check out Banana. They seem to do exactly that with "serverless" A100.
[deleted] t1_jb4qdyv wrote
Reply to comment by ggdupont in To RL or Not to RL? [D] by vidul7498
[deleted]
ggdupont t1_jb4onx8 wrote
Reply to comment by ThaGooInYaBrain in To RL or Not to RL? [D] by vidul7498
Anything in production yet?
ggdupont t1_jb4olgn wrote
Reply to comment by cantfindaname2take in To RL or Not to RL? [D] by vidul7498
My view is probably incomplete, but I worked in a very large hardware industry, and all the robots were using a classic optimal control approach (like the one Boston Dynamics uses); none were using RL.
alterframe t1_jb4nwrt wrote
Reply to comment by WandererXZZ in [R] [N] Dropout Reduces Underfitting - Liu et al. by radi-cho
Yes, I worded it badly.
ggf31416 t1_jb4j0uk wrote
Reply to [D] Best way to run LLMs in the cloud? by QTQRQD
Good luck getting an EC2 instance with a single A100; last time I checked, AWS only offered instances with 8 of them at a high price.
isaeef t1_jb4iz4f wrote
Reply to [D] Best way to run LLMs in the cloud? by QTQRQD
Or you could use a GPU-workload-specific provider like https://www.paperspace.com/
radi-cho OP t1_jb4edeo wrote
Reply to comment by kryptoklob in [P] diffground - A simplistic Android UI to access ControlNet and instruct-pix2pix. by radi-cho
Stable Diffusion conditioned with Controlnet-Scribble or Controlnet-canny. For the editing option, instruct-pix2pix.
head_robotics t1_jb4djzx wrote
Reply to [D] Ethics of minecraft stable diffusion by NoLifeGamer2
knight1511 t1_jb5975w wrote
Reply to comment by FeministNeuroNerd in [R] [N] Dropout Reduces Underfitting - Liu et al. by radi-cho
Exquisite deliberation