Recent comments in /f/MachineLearning

GoofAckYoorsElf t1_j6wbljw wrote

Both, actually. I can easily echo this question back to the people I call copyright warriors: do they care about what is right, or about what they like? Right would be for everyone to take an objective, unbiased look at the new technology and how to incorporate it into their work, instead of seeing only their crumbling business models and aggressively clinging to them.

2

DingusFamilyVacation t1_j6waih9 wrote

I'm excited to try this out. I'm doing most of the ML development on my team. I'll iterate on code development and retrain, multiple times over. Oftentimes, my team members will jump in and want to use a trained model to run some downstream analyses. If the library API has changed, or the model architecture has been tweaked, loading the state_dicts of earlier models becomes nearly impossible without checking out old commits. Even then, storing the results and associating them with commit numbers is super annoying.

Thanks for the tool!

1

SulszBachFramed t1_j6wa7ii wrote

You can make the same argument about lossy compression. Am I really infringing on copyright if I record an episode of House, re-encode it and redistribute it? It's not the 'original' episode, but a lossy copy of it. What if I compress it in a zip file and distribute that? In that case, I am only sharing something that can imperfectly recreate the original. The zip file itself does not resemble a video at all.

4

WikiSummarizerBot t1_j6w9h7w wrote

The Library of Babel

>"The Library of Babel" (Spanish: La biblioteca de Babel) is a short story by Argentine author and librarian Jorge Luis Borges (1899–1986), conceiving of a universe in the form of a vast library containing all possible 410-page books of a certain format and character set. The story was originally published in Spanish in Borges' 1941 collection of stories El jardín de senderos que se bifurcan (The Garden of Forking Paths). That entire book was, in turn, included within his much-reprinted Ficciones (1944).


2

Argamanthys t1_j6w9gal wrote

There is a short story called The Library of Babel about a near-infinite library that contains every possible permutation of a book with 1,312,000 characters. It is not hard to recreate that library in code. You can explore it if you want.
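The claim that the library is easy to recreate in code can be sketched: each possible book is just an integer written in base 25, one digit per character. Borges' format was 410 pages of 40 lines of 80 characters over a 25-symbol alphabet (22 letters plus space, comma, and period); the exact choice of letters below is an assumption.

```python
# Sketch of a computable Library of Babel: a book is an integer's
# base-25 expansion over a 25-symbol alphabet (22 letters + space,
# comma, period -- the specific letters here are illustrative).
ALPHABET = "abcdefghijklmnopqrstuv ,."
BOOK_LEN = 410 * 40 * 80  # 1,312,000 characters per book

def book_at(index, length=BOOK_LEN):
    """Return the book stored at a given index (its base-25 digits)."""
    chars = []
    for _ in range(length):
        index, digit = divmod(index, len(ALPHABET))
        chars.append(ALPHABET[digit])
    return "".join(chars)

def index_of(text):
    """Invert book_at: the index of the book beginning with this text."""
    index = 0
    for ch in reversed(text):
        index = index * len(ALPHABET) + ALPHABET.index(ch)
    return index
```

Every text over that alphabet already "exists" at some index; `index_of` just tells you where to look.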

Contained within that library is a copy of every book ever written, freely available to read.

Is that book piracy? It's right there if you know where to look.

That's pretty much what's going on here. They searched the latent space for an image and found it. But that's because the latent space, like the Library of Babel, is really big and contains not just that image but also near-infinite permutations of it.

3

sad_potato00 t1_j6w92uy wrote

So we had a similar problem, where building names were written in different ways (some abbreviations, full names, full name + building type). Something that worked for me was using Sentence-BERT and doing a cosine similarity. Deciding a cutoff value was easier than deciding how many clusters to use. Sadly, manual labeling and checking is still needed.
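A minimal sketch of that approach, with the embedding model left pluggable. The greedy grouping and the 0.8 cutoff are illustrative choices, not the commenter's exact code, and the model name in the usage note is an assumption.

```python
# Group near-duplicate names by thresholding pairwise cosine similarity
# of their embeddings. `embed` is any function mapping a string to a
# vector (e.g. a Sentence-BERT encoder).
import numpy as np

def cosine_sim(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def group_by_similarity(names, embed, cutoff=0.8):
    """Greedy grouping: each name joins the first group whose
    representative embedding is within the cosine cutoff."""
    groups = []  # list of (representative_embedding, member_names)
    for name in names:
        vec = embed(name)
        for rep, members in groups:
            if cosine_sim(rep, vec) >= cutoff:
                members.append(name)
                break
        else:
            groups.append((vec, [name]))
    return [members for _, members in groups]

# With sentence-transformers (model name is an assumption):
# from sentence_transformers import SentenceTransformer
# model = SentenceTransformer("all-MiniLM-L6-v2")
# groups = group_by_similarity(names, model.encode, cutoff=0.8)
```

Tuning the single cutoff on a handful of labeled pairs is usually easier than picking a cluster count up front, which matches the commenter's experience.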

1

jimmymvp t1_j6w4ezb wrote

Any application where you need exact likelihoods, flows are king. Such is the case, for example, if you're learning a sampling distribution for MCMC sampling, estimating normalizing constants (I believe there are a lot of these problems in physics), etc.
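The "exact likelihoods" property comes from the change-of-variables formula: for an invertible map x = f(z), log p(x) = log p_z(f⁻¹(x)) + log |det df⁻¹/dx|. A one-dimensional affine flow makes this concrete (the example is illustrative, not from the comment):

```python
# Exact log-density under a one-step affine flow x = scale * z + shift,
# with z ~ N(0, 1), via the change-of-variables formula.
import numpy as np

def affine_flow_logpdf(x, scale, shift):
    """Exact log p(x) for x = scale * z + shift, z standard normal."""
    z = (x - shift) / scale                        # invert the flow
    log_base = -0.5 * (z**2 + np.log(2 * np.pi))   # standard normal log-density
    log_det = -np.log(np.abs(scale))               # log |dz/dx| = -log |scale|
    return log_base + log_det
```

Because every layer of a flow is invertible with a tractable Jacobian determinant, this computation stays exact however many layers are stacked, unlike the lower bounds you get from VAEs.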

9

Ulfgardleo t1_j6vzpgz wrote

There is very little research. They are a nice theoretical idea, but the concept is very constraining, and numerical difficulties make experimenting hell.

I am not aware of any active research and I think they never were really big to begin with.

−4

Monoranos t1_j6vy1lg wrote

Am I the only one who finds it weird to make profits from what seems to be data stolen from the whole of humanity?

Edit: Well, I didn't think this was a controversial take. I feel like people just choose to ignore the whole aspect of consent and ethics around your data.

The GDPR further clarifies the conditions for consent in Article 7: https://gdpr.eu/gdpr-consent-requirements/

  1. Where processing is based on consent, the controller shall be able to demonstrate that the data subject has consented to processing of his or her personal data.

  2. If the data subject’s consent is given in the context of a written declaration which also concerns other matters, the request for consent shall be presented in a manner which is clearly distinguishable from the other matters, in an intelligible and easily accessible form, using clear and plain language. Any part of such a declaration which constitutes an infringement of this Regulation shall not be binding.

  3. The data subject shall have the right to withdraw his or her consent at any time. The withdrawal of consent shall not affect the lawfulness of processing based on consent before its withdrawal. Prior to giving consent, the data subject shall be informed thereof. It shall be as easy to withdraw as to give consent.

  4. When assessing whether consent is freely given, utmost account shall be taken of whether, inter alia, the performance of a contract, including the provision of a service, is conditional on consent to the processing of personal data that is not necessary for the performance of that contract.

−33