Recent comments in /f/MachineLearning

jess-plays-games t1_j5fy01j wrote

You could SLI a 1080 with a previous-generation card easily enough, but they don't share RAM; they just use one card's VRAM.

The 2000 series and later don't support SLI anymore and instead use NVLink, which does share VRAM between cards.

There's a handy lil program called "any sli" I used to use.

3

hey_look_its_shiny t1_j5fu53q wrote

You don't need to implement a full-scale LLM to degrade watermarks at scale, or even to mix and match watermarked inputs. People who aren't even trying get halfway there now with crappy synonym engines.

And before you ask, no, I'm not going to technically spec it for you. Instead I suggest using the upvote pattern from this expert community to run backprop on your beliefs. ;)

5

ardula99 t1_j5fsdsw wrote

That is what adversarial data points are: people have discovered that it is possible to "confuse" image models using attacks like these. Take a normal picture of a cat and feed it into an image model, and it'll label it correctly: "hey, I'm 97% sure there's a cat in this." Now change a small number of pixels using some algorithm (say, <1% of the entire image). To a human it will still look like a cat, but the image model now thinks it's a stop sign (or something equally unlikely) with 90%+ probability.
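The effect is easy to reproduce on a toy model. Below is a minimal FGSM-style sketch; everything in it (the linear "model", its weights, the 784-pixel "image") is invented for illustration, and real attacks target deep networks the same way:

```python
import numpy as np

# Toy "image model": a single linear unit plus a sigmoid on a flattened
# 28x28 image. All numbers here are made up for illustration.
rng = np.random.default_rng(0)
w = rng.normal(size=784)              # model weights (the "cat" direction)
x = rng.uniform(0, 1, size=784)       # a clean "cat" image, pixels in [0, 1]
b = 3.0 - w @ x                       # bias chosen so the clean image scores high

def cat_prob(img):
    """Model's confidence that the image contains a cat."""
    return 1.0 / (1.0 + np.exp(-(w @ img + b)))

# FGSM-style step: nudge every pixel a tiny amount against the class score.
# (This perturbs all pixels slightly rather than <1% of them, but the
# "imperceptible change, huge confidence swing" effect is the same.)
eps = 0.05
x_adv = np.clip(x - eps * np.sign(w), 0, 1)

print(cat_prob(x))      # high confidence on the clean image
print(cat_prob(x_adv))  # confidence collapses on the perturbed one
```

No pixel moves by more than 5% of the pixel range, yet the model's confidence swings from near-certain to near-zero, which is exactly the failure mode described above.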

2

doIneedtohaveone1 t1_j5fqzkf wrote

Does anyone know how to solve the PDE for it in Python? Any kind of reference material would be appreciated!

It's been a long time since I came across any PDEs and I've forgotten everything related to them.
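Since the thread doesn't say which PDE is involved, here is a hedged starting-point sketch: an explicit finite-difference solver for the 1D heat equation u_t = alpha * u_xx with zero boundary values, checked against the known exact solution. The grid sizes and coefficients are arbitrary choices:

```python
import numpy as np

alpha, L, nx, nt = 1.0, 1.0, 51, 2000
dx = L / (nx - 1)
dt = 0.4 * dx**2 / alpha          # respects the stability limit dt <= dx^2 / (2*alpha)

x = np.linspace(0, L, nx)
u = np.sin(np.pi * x)             # initial condition; u = 0 at both ends

for _ in range(nt):
    # second-difference approximation of u_xx on interior points
    u[1:-1] += alpha * dt / dx**2 * (u[2:] - 2 * u[1:-1] + u[:-2])

# This initial condition decays exactly as exp(-pi^2 * alpha * t)
t = nt * dt
err = np.max(np.abs(u - np.sin(np.pi * x) * np.exp(-np.pi**2 * alpha * t)))
print(err)
```

For anything beyond a toy problem, SciPy's `solve_ivp` with a method-of-lines discretization, or a dedicated PDE library such as FiPy, is a more robust route.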

1

hey_look_its_shiny t1_j5fnyt6 wrote

I'm not OP, but the words "won't work in the long term" from their original statement are not synonymous with "useless".

Your original comment was disrespectful, and while you have raised some valid points along the way, they're collectively misaligned with the original statement you were responding to. You've been fighting a strawman, and it shows in how the community received your comments.

16

GalaxyGoldMiner t1_j5flc0v wrote

Thinking out loud: instead of watermarking, you could just look at each token's conditional probability of being sampled given the prior tokens; if those probabilities are high in aggregate, the text likely came from a low-temperature GPT. This assumes that transformer models trained by different companies (on presumably overlapping data) will have different enough predictions over long sequences.
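A toy version of that scoring heuristic (the 20-token vocabulary, the random bigram "model", and both "texts" below are all invented; a real detector would score tokens under an actual LM's logits):

```python
import numpy as np

rng = np.random.default_rng(1)
V = 20
logits = rng.normal(size=(V, V))  # toy "model": next-token logits given prev token

def probs(prev, temperature):
    z = logits[prev] / temperature
    e = np.exp(z - z.max())
    return e / e.sum()

def sample_seq(n, temperature):
    seq = [0]
    for _ in range(n - 1):
        seq.append(rng.choice(V, p=probs(seq[-1], temperature)))
    return seq

def mean_logprob(seq):
    # average log p(token | prev token) under the base model (temperature 1)
    return np.mean([np.log(probs(p, 1.0)[t]) for p, t in zip(seq, seq[1:])])

cold = sample_seq(200, temperature=0.3)   # stand-in for low-temperature GPT text
hot = sample_seq(200, temperature=2.0)    # stand-in for less predictable text

print(mean_logprob(cold))  # high aggregate probability -> flagged as generated
print(mean_logprob(hot))   # noticeably lower under the same scoring model
```

The low-temperature sample keeps landing on tokens the scoring model also considers likely, so its average log-probability is markedly higher; that gap is the detection signal, and the open question is how much of it survives when the generator and the scorer are different models.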

1