Recent comments in /f/MachineLearning

dineNshine t1_j5esx4f wrote

Why would you want to do this? We can fake text without GPT, and we also have the means to prove authenticity by digital signatures. By limiting the technology artificially, you will end up limiting the end user, while organizations with more resources will still be able to circumvent these limitations by training their own models.
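For concreteness, the authenticity-by-signature idea can be sketched in a few lines. This uses Python's stdlib `hmac` as a stand-in for a real public-key signature scheme (a shared secret rather than a private/public key pair), and the key and function names are illustrative:

```python
import hmac
import hashlib

# Illustrative shared secret; a real scheme would sign with a private key
# and let anyone verify with the matching public key.
SECRET_KEY = b"author-held-key"

def sign_text(text: str) -> str:
    """Produce a hex tag vouching for the text's origin."""
    return hmac.new(SECRET_KEY, text.encode("utf-8"), hashlib.sha256).hexdigest()

def verify_text(text: str, tag: str) -> bool:
    """Check the tag; any edit to the text invalidates it."""
    return hmac.compare_digest(sign_text(text), tag)

message = "I wrote this paragraph myself."
tag = sign_text(message)
print(verify_text(message, tag))        # True
print(verify_text(message + "!", tag))  # False
```

The point being: proving a text *is* authentic is easy with signatures; watermarking tries to prove the opposite, which is much harder.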

To avoid limiting usability, desired limitations should be applied on top of the base model by the end user, not to the base model itself.

The sole consequence of attempts like the one OP suggests is further centralization of the technology, which is the worst imaginable outcome.

25

BitterAd9531 t1_j5erse4 wrote

Won't work in the long term. OpenAI might have been the first one to release, but we know other companies have better LLMs and others will catch up soon. When that happens, models without watermarks will be released and people who want output without a watermark will use that model.

And even if you somehow force all of them to implement a watermark, it would be trivial to combine outputs of different models to circumvent it. Not to mention that slight rewrites by a human would probably break most watermarks, the same way they break the current GPT detectors.

158

adt t1_j5erdiz wrote

Already in the works (Scott Aaronson is a scientist with OpenAI):

>we actually have a working prototype of the watermarking scheme, built by OpenAI engineer Hendrik Kirchner. It seems to work pretty well—empirically, a few hundred tokens seem to be enough to get a reasonable signal that yes, this text came from GPT.
>Now, this can all be defeated with enough effort. For example, if you used another AI to paraphrase GPT’s output—well okay, we’re not going to be able to detect that. On the other hand, if you just insert or delete a few words here and there, or rearrange the order of some sentences, the watermarking signal will still be there. Because it depends only on a sum over n-grams, it’s robust against those sorts of interventions.
https://scottaaronson.blog/?p=6823
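The "sum over n-grams" idea from the quote can be illustrated with a toy sketch. This is NOT OpenAI's actual scheme, just a minimal keyed n-gram scorer in plain Python: generation nudges each token choice toward n-grams that score high under a secret key, and detection averages those scores, which is why insertions and deletions only dilute the signal rather than erase it:

```python
import hashlib
import random

KEY = b"watermark-key"  # illustrative secret shared by generator and detector
N = 4                   # width of the n-grams the score is summed over

def ngram_score(ngram: tuple) -> float:
    """Keyed pseudorandom score in [0, 1) for one n-gram."""
    h = hashlib.sha256(KEY + " ".join(ngram).encode()).digest()
    return int.from_bytes(h[:8], "big") / 2**64

def detect(tokens: list) -> float:
    """Mean n-gram score: ~0.5 for ordinary text, well above it if watermarked."""
    grams = [tuple(tokens[i:i + N]) for i in range(len(tokens) - N + 1)]
    return sum(ngram_score(g) for g in grams) / len(grams)

# Toy "generation": at each step, among k random candidate words, pick the
# one whose resulting n-gram scores highest under the key (if watermarking).
vocab = [f"w{i}" for i in range(1000)]
rng = random.Random(0)

def generate(length: int, watermark: bool) -> list:
    tokens = rng.choices(vocab, k=N - 1)
    for _ in range(length):
        candidates = rng.choices(vocab, k=8)
        if watermark:
            best = max(candidates,
                       key=lambda w: ngram_score(tuple(tokens[-(N - 1):]) + (w,)))
        else:
            best = candidates[0]
        tokens.append(best)
    return tokens

plain = generate(300, watermark=False)
marked = generate(300, watermark=True)
print(round(detect(plain), 2), round(detect(marked), 2))  # marked scores well above 0.5
```

A paraphrasing AI would replace whole n-grams and wash the signal out, while scattered word edits leave most n-grams intact, matching the robustness trade-off Aaronson describes.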

177

drewkungfu t1_j5erafj wrote

Reply to comment by ardula99 in ChatGPT is not all you need [R] by EduCGM

I think there were major word-choice failures that perhaps auto-correct helped mask.

Here's my attempt at a fix:

“This work paper consists on an attempts to describe in a concise way the min models are and sectors ^of ^industry ^jobs that are affected by generative AI.

8

EmmyNoetherRing t1_j5er3xp wrote

I’d heard they had added one, actually, or were planning to. The concern they listed was that they didn’t want the model accidentally training on its own output as more of it shows up online.

I have to imagine this is a situation where security by obscurity is unavoidable though, so if they do have a watermark we might not hear much about it. Otherwise malicious users would just clean it back out again.

We may end up with a situation where only a few people internal to OpenAI know how the watermark works, and they occasionally answer questions for law enforcement with the proper paperwork.

51

AmputatorBot t1_j5eqn8m wrote

It looks like you shared an AMP link. These should load faster, but AMP is controversial because of concerns over privacy and the Open Web.

Maybe check out the canonical page instead: https://techcrunch.com/2022/12/10/openais-attempts-to-watermark-ai-text-hit-limits/


^(I'm a bot | )^(Why & About)^( | )^(Summon: u/AmputatorBot)

8

JackandFred t1_j5epyi0 wrote

It wouldn’t necessarily be easy. You say you want one detectable by some “key or other model”, but you can already design or use a model to detect whether text was generated by GPT, so if you’re using a model anyway you wouldn’t really need a watermark. And a more traditional watermark, like those used for digital pictures, could be removed very easily.

1

sabertoothedhedgehog t1_j5eneoh wrote

Love the topic of the paper. Absolutely HATE the figures showing taxonomies / example AI tools. These visualisations with boxes and arrows are really awful: the arrows are all over the place and meaningless, and the category boxes look the same as the application boxes.

It could have looked more like this: https://the-decoder.com/wp-content/uploads/2022/10/market_map_generative_AI-770x1027.png.webp

Or like this: https://www.sequoiacap.com/wp-content/uploads/sites/6/2022/09/genai-landscape-8.png

I don't even particularly like my examples, but there is no need for all these arrows, or for category boxes that look the same as the examples.

3

clemda2 t1_j5eliix wrote

You CAN batch train GCNs (or at least some of them are very amenable to it). Some of the most scalable GCNs rely on something like the GraphSAGE convolution, which doesn't require the whole graph Laplacian for updates; this approach is used by Wikipedia, Uber, and Pinterest to train highly scalable GCNs. Other convolutional operators like GAT can also be batch trained.

The PyTorch Geometric documentation is a good jumping-off point for reading about practical graph sub-sampling.
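In case it helps, here is a toy sketch (plain Python, not PyTorch Geometric's actual API) of the fixed-fanout neighbor sampling that makes GraphSAGE-style mini-batch training possible without touching the full graph:

```python
import random

def sample_neighbors(adj, batch_nodes, fanouts, rng):
    """GraphSAGE-style sampling: for each hop, keep at most `fanout`
    random neighbors per node, so a mini-batch never needs the whole graph."""
    layers = [list(batch_nodes)]
    frontier = set(batch_nodes)
    for fanout in fanouts:
        nxt = set()
        for node in frontier:
            neigh = adj.get(node, [])
            nxt.update(rng.sample(neigh, min(fanout, len(neigh))))
        layers.append(sorted(nxt))
        frontier = nxt
    return layers

# Toy graph: node i connects to its two neighbors on a ring of 100 nodes.
adj = {i: [(i - 1) % 100, (i + 1) % 100] for i in range(100)}
rng = random.Random(0)
layers = sample_neighbors(adj, batch_nodes=[0, 50], fanouts=[2, 2], rng=rng)
print(layers[0])  # the mini-batch itself: [0, 50]
```

Each sampled layer is what the corresponding convolution would aggregate over; libraries like PyTorch Geometric wrap this pattern in ready-made loaders.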

2

damc4 t1_j5e9bog wrote

OK, so that proves that someone has the skill. But when someone doesn't have a master's/PhD, that doesn't prove they lack the skill. In other words, if someone has no master's/PhD but has published a research paper, they have also proved to have that skill, so it should be possible for someone with a bachelor's (or no degree) to get a job in ML. Is that correct?

1