Recent comments in /f/MachineLearning
Mysterious-Career236 t1_jb57u9u wrote
Reply to comment by LeanderKu in [R] High-resolution image reconstruction with latent diffusion models from human brain activity by SleekEagle
I also bet the scope didn't allow them to try hard enough. It's possible you could generalise a little if you trained the model on numerous people and not only a handful.
von-hust OP t1_jb55rhp wrote
Reply to comment by JrdnRgrs in [R] We found nearly half a billion duplicated images on LAION-2B-en. by von-hust
I think the first version of SD was trained with duplicates, and they made some effort to remove duplicates for training v2 (people on Discord are saying pHash or something similar). I suppose it'd be interesting to see whether the same prompts still produce verbatim copies.
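For anyone curious what that kind of dedup roughly looks like, here's a dependency-free sketch of an average-hash ("aHash") check, a simpler cousin of the DCT-based pHash people are describing. The 8x8 grids of numbers stand in for downscaled grayscale images; everything here is a toy illustration, not the LAION pipeline.

```python
# Toy average-hash duplicate check. Images are 8x8 grayscale grids
# (lists of lists) so the sketch stays dependency-free; a real pipeline
# would decode and downscale actual images and use a DCT-based pHash.

def ahash(pixels):
    # One bit per pixel: 1 if the pixel is brighter than the mean.
    flat = [p for row in pixels for p in row]
    avg = sum(flat) / len(flat)
    return sum((1 << i) for i, p in enumerate(flat) if p > avg)

def hamming(a, b):
    # Number of differing bits between two hashes.
    return bin(a ^ b).count("1")

img = [[(r * 8 + c) % 256 for c in range(8)] for r in range(8)]
dup = [row[:] for row in img]   # exact copy
near = [row[:] for row in img]
near[0][0] += 3                 # tiny perturbation (near-duplicate)

exact_dist = hamming(ahash(img), ahash(dup))
near_dist = hamming(ahash(img), ahash(near))
```

A real dedup pass would then cluster hashes within some small Hamming distance; the threshold is a tuning choice, not anything reported for LAION.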
NoLifeGamer2 OP t1_jb559ia wrote
Reply to comment by Philpax in [D] Ethics of minecraft stable diffusion by NoLifeGamer2
Thx! If I do do it, I will probably just use it myself, or submit it anonymously.
AuspiciousApple t1_jb557q8 wrote
Reply to comment by JrdnRgrs in [R] We found nearly half a billion duplicated images on LAION-2B-en. by von-hust
Not obviously so.
First, de-duplicating text data didn't help much in the cramming paper. Second, even if the images are duplicates, the captions might be different so you still learn more than if you only had one copy of each image.
Finally, even with exact copies of text and image, it would just weigh those images more heavily than the rest - which could harm performance, not matter at all, or even help performance (for instance if those images tend to be higher quality/more interesting/etc.)
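The weighting point can be made concrete with a toy example: duplicating an image k times in the dataset gives the same mean loss as weighting that image's loss term by k. All the numbers below are invented purely for illustration.

```python
# Duplicating an example in the training set is equivalent to
# up-weighting its loss term by its copy count.

def mean_loss(losses):
    return sum(losses) / len(losses)

# Hypothetical per-example losses for three unique images.
unique = {"cat.jpg": 0.9, "dog.jpg": 0.4, "fox.jpg": 0.2}

# Dataset where cat.jpg appears 3 times (a duplicate cluster).
with_dupes = [0.9, 0.9, 0.9, 0.4, 0.2]

# Equivalent weighted average over unique images: weight = copy count.
weights = {"cat.jpg": 3, "dog.jpg": 1, "fox.jpg": 1}
weighted = sum(weights[k] * unique[k] for k in unique) / sum(weights.values())
```

Whether that implicit up-weighting hurts, helps, or washes out depends entirely on which images get duplicated, which is the point above.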
FeministNeuroNerd t1_jb54wql wrote
Reply to comment by Adamanos in [R] [N] Dropout Reduces Underfitting - Liu et al. by radi-cho
Superb interchange
SaifKhayoon t1_jb54pnw wrote
Is this why some checkpoints / safetensors make for better results than stable diffusion's 1.5 and 2.1 weights?
Was LAION-2B used to train the base model shared by all other "models"/weights?
Adamanos t1_jb542kw wrote
Reply to comment by Delster111 in [R] [N] Dropout Reduces Underfitting - Liu et al. by radi-cho
immaculate dialogue
JrdnRgrs t1_jb53xvx wrote
Very interesting, so what is the implication for stable diffusion?
Does this mean that if the data set was corrected for these duplicated images that a corrected model using this data set would be of even "higher quality"? Can't wait
Philpax t1_jb53nhe wrote
Reply to comment by I_will_delete_myself in [R] RWKV (100% RNN) can genuinely model ctx4k+ documents in Pile, and RWKV model+inference+generation in 150 lines of Python by bo_peng
As far as I can tell, the sparse documentation is just because they've been in pure R&D mode. I've played around with it in their Discord server and can confirm it does perform well, but I've struggled to get it working locally.
Philpax t1_jb53hvo wrote
Reply to [D] Ethics of minecraft stable diffusion by NoLifeGamer2
You are probably fine, but note that a) people will likely be very angry with you, whether or not the licensing permits it and b) this is a non-trivial problem and even more non-trivial to train.
Good luck, though!
Philpax t1_jb53cl8 wrote
Reply to comment by head_robotics in [D] Ethics of minecraft stable diffusion by NoLifeGamer2
The issue is less with the platform they're using and more with where they're sourcing the data from. They're asking if there are any issues with taking people's uploaded builds and using them to train a generative system.
I_will_delete_myself t1_jb532v5 wrote
Reply to comment by Philpax in [R] RWKV (100% RNN) can genuinely model ctx4k+ documents in Pile, and RWKV model+inference+generation in 150 lines of Python by bo_peng
Intelligence is the ability to distill complex information into a simple explanation that a child can understand.
It makes me skeptical when someone can't explain their choice beyond performance reasons. Most people just use the cloud because ML networks, regardless of size, drain a lot of battery.
Delster111 t1_jb4z3qm wrote
Reply to comment by askljof in [R] [N] Dropout Reduces Underfitting - Liu et al. by radi-cho
fantastic discourse
Hiitstyty t1_jb4wjnj wrote
Reply to comment by tysam_and_co in [R] [N] Dropout Reduces Underfitting - Liu et al. by radi-cho
It helps to think of the bias-variance trade-off in terms of the hypothesis space. Dropout trains subnetworks at every iteration. The hypothesis space of the full network will always contain (and be larger than) the hypothesis space of any subnetwork, because the full network has greater expressive capacity. Thus, the full network cannot be any more biased than any subnetwork. However, any subnetwork will have reduced variance because of its smaller relative hypothesis space. Thus, dropout helps because its reduction in variance offsets its increase in bias. However, as the dropout proportion is set increasingly higher, eventually the bias will be too great to overcome.
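A minimal sketch of what "training a subnetwork each iteration" means, using inverted dropout on a single linear unit. The weights and inputs are toy values, not anything from the paper.

```python
import random

def dropout_mask(n, p, rng):
    # Sample a binary mask: each unit is kept with probability 1 - p.
    return [0.0 if rng.random() < p else 1.0 for _ in range(n)]

def forward(x, w, mask, p):
    # One linear unit with inverted dropout on its inputs.
    # Scaling by 1/(1-p) keeps the expected activation unchanged.
    scale = 1.0 / (1.0 - p)
    return sum(xi * mi * scale * wi for xi, mi, wi in zip(x, mask, w))

rng = random.Random(0)
x = [1.0, 2.0, -1.0, 0.5]
w = [0.3, -0.2, 0.5, 0.1]
p = 0.5

# Each training step samples a different mask, i.e. a different
# subnetwork drawn from the full network's hypothesis space.
m1 = dropout_mask(len(x), p, rng)
m2 = dropout_mask(len(x), p, rng)

# At test time no units are dropped and no scaling is applied:
# the full network makes the prediction.
full = sum(xi * wi for xi, wi in zip(x, w))
```

Each sampled mask restricts the model to a smaller subnetwork (higher bias, lower variance), while the unmasked forward pass at test time uses the full capacity, which is the trade-off described above.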
shayanrc t1_jb4vbcz wrote
Reply to comment by iloveintuition in [D] Best way to run LLMs in the cloud? by QTQRQD
What config did you use?
l0g1cs t1_jb4tbu7 wrote
Reply to [D] Best way to run LLMs in the cloud? by QTQRQD
Check out Banana. They seem to do exactly that with "serverless" A100.
[deleted] t1_jb4qdyv wrote
Reply to comment by ggdupont in To RL or Not to RL? [D] by vidul7498
[deleted]
ggdupont t1_jb4onx8 wrote
Reply to comment by ThaGooInYaBrain in To RL or Not to RL? [D] by vidul7498
Anything in production yet?
ggdupont t1_jb4olgn wrote
Reply to comment by cantfindaname2take in To RL or Not to RL? [D] by vidul7498
My view is probably incomplete, but I worked in a very large hardware industry, and all the robots were using a classic optimal control approach (like the one Boston Dynamics uses); none were using RL.
alterframe t1_jb4nwrt wrote
Reply to comment by WandererXZZ in [R] [N] Dropout Reduces Underfitting - Liu et al. by radi-cho
Yes, I worded it badly.
ggf31416 t1_jb4j0uk wrote
Reply to [D] Best way to run LLMs in the cloud? by QTQRQD
Good luck getting an EC2 instance with a single A100; last time I checked, AWS only offered instances with 8 of them at a high price.
isaeef t1_jb4iz4f wrote
Reply to [D] Best way to run LLMs in the cloud? by QTQRQD
Or you could use a GPU-workload-specific provider like https://www.paperspace.com/
radi-cho OP t1_jb4edeo wrote
Reply to comment by kryptoklob in [P] diffground - A simplistic Android UI to access ControlNet and instruct-pix2pix. by radi-cho
Stable Diffusion conditioned with Controlnet-Scribble or Controlnet-canny. For the editing option, instruct-pix2pix.
head_robotics t1_jb4djzx wrote
Reply to [D] Ethics of minecraft stable diffusion by NoLifeGamer2
knight1511 t1_jb5975w wrote
Reply to comment by FeministNeuroNerd in [R] [N] Dropout Reduces Underfitting - Liu et al. by radi-cho
Exquisite deliberation