Recent comments in /f/MachineLearning

BitterAd9531 t1_j5fcby3 wrote

>If you think you can take two watermarked LLMs and "trivially" combine their output as you stated, explain in detail how you do that in an automated way.

No thank you, I'm not going to write an LLM from scratch for a Reddit argument. And FWIW, I suspect that even if I did, you'd find some way to convince yourself that you're not wrong. You not understanding how this works doesn't impact me nearly enough to care that much. Have a good one.

18

serverrack3349b t1_j5fc250 wrote

In a sense it is just copying and pasting from the web, just in a different order, but I get that that's not your question. Something I would try is running the text through an online plagiarism checker to see if an exact copy of it exists online. If there is, then you should be able to either attribute it to the right person or rewrite it a bit so it isn't plagiarism.

1

Historical-Coat5318 t1_j5fbhj5 wrote

If by fighting technological progress you mean controlling it to make sure it serves humanity in the safest, most optimal way, then yes, we've been doing this forever. When cars were first introduced, traffic police didn't exist. There is nothing retrograde or Luddite in thinking this way; it's what we've always done.

Obviously watermarking is futile but there are other methods that need to be considered which no one even entertains, for example the ones I mentioned in my first comment.

Also it should be trivially obvious that AI should never be open-source. That's the worst possible idea.

−5

BitterAd9531 t1_j5fal5s wrote

>no one seems to be even considering dealing with it in a serious way

Everyone has considered dealing with it, but everyone who understands the technology behind these models also knows that it's futile in the long term. The whole point of these LLMs is to mimic human writing as closely as possible, and the more they succeed, the more difficult detection becomes. They can be used to output both more precise and more varied text.

Countermeasures like watermarks will be trivial to circumvent while at the same time restricting the capabilities and performance of these models. And that's ignoring the elephant in the room, which is that once open-source models come out, it won't matter at all.

>this is the most pressing ethical issue in AI safety today

Why? It's long been known that the gap between AI and human capabilities will shrink over time. This is simply the direction we're going. Maybe it's time to adapt instead of trying to fight something inevitable; fighting technological progress has never worked before.

People banking on being able to distinguish between AI and humans are in for a bad time over the coming few years.

42

Historical-Coat5318 t1_j5f88m7 wrote

It seems to me ethically imperative to be able to discern human text from AI text, so it's really concerning when people hand-wave it away immediately as obviously futile, like Altman did in a recent interview. Obviously these detection methods would have to be more robust than a cryptographic key that can be circumvented just by changing a few words, but this is the most pressing ethical issue in AI safety today and no one seems to be considering it in a serious way.

One idea: couldn't you just train the AI to identify minor changes to the text, to the point where rewriting it would be too much of a hassle? Also, open up the server history as an anonymized (for privacy concerns) database so that everyone has access to all GPT (and all other LLM) output, and couple that with the cryptographic key Scott Aaronson introduced, plus adversarial solutions for reworded text. This, with other additional safety features, would make it too much of a hassle for anyone to try to bypass it; maybe add an infinitesimal cost to every GPT output to counteract spam, etc. A lot of regulation is needed for something so potentially disruptive.
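A minimal sketch of what that open output registry might look like, assuming a simple word n-gram overlap check so that lightly reworded text still matches. No such public registry or API actually exists; all names here are illustrative:

```python
# Hypothetical sketch of the "open output registry" idea: fingerprint every
# logged LLM output with word n-grams and flag candidate text whose overlap
# with any logged output is high, so light rewording still matches.
def ngrams(text: str, n: int = 5) -> set[str]:
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap(candidate: str, logged: str, n: int = 5) -> float:
    a, b = ngrams(candidate, n), ngrams(logged, n)
    return len(a & b) / max(len(a), 1)  # fraction of candidate n-grams seen before

# registry = load_all_logged_outputs()  # hypothetical store of LLM outputs
# suspicious = any(overlap(text, prior) > 0.3 for prior in registry)
```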

−39

BitterAd9531 t1_j5f5olr wrote

I think you're misunderstanding how these watermarks work. The watermark is encoded in the choice of tokens, so combining or rewriting outputs weakens it to the point where it can no longer be used for accurate detection. "Robust" means a few tokens can be changed, but changing enough tokens will eventually have an impact.

The semantics don't change because in language, there are multiple ways to describe the same thing without using the same (order of) words. That's literally what "rewriting" means.
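For anyone curious, here's a toy sketch of the "green list" flavour of token watermarking (in the spirit of Kirchenbauer et al., 2023) that shows why mixing or rewriting dilutes the signal. The 50/50 split and all function names are illustrative assumptions, not any real detector's API:

```python
# Toy green-list watermark detector: each previous token pseudo-randomly
# splits the vocabulary in half, and watermarked generation over-uses the
# "green" half. Detection counts green tokens and computes a z-score.
# Rewriting or mixing outputs pushes the green fraction back toward chance,
# which is exactly the weakening described above.
import hashlib
import math

def is_green(prev_token: int, token: int) -> bool:
    # The previous token seeds a pseudo-random 50/50 partition of the vocab.
    seed = int.from_bytes(hashlib.sha256(str(prev_token).encode()).digest()[:8], "big")
    return (token ^ seed) % 2 == 0

def watermark_zscore(tokens: list[int]) -> float:
    # Compare the green count to the Binomial(n, 0.5) expectation for
    # unwatermarked text; human text scores near 0, watermarked text high.
    n = len(tokens) - 1
    if n <= 0:
        return 0.0
    greens = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    return (greens - 0.5 * n) / math.sqrt(0.25 * n)
```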

21

stanteal t1_j5f0jaw wrote

As you said, you would need a variable number of outputs, which isn't feasible with a plain CNN. However, you could divide the image into a grid and, for each cell, predict the probability that the center of a circle lies within it, along with its x and y offsets (see the sketch below). Not sure if there are better resources available, but it might be worth looking at how YOLO or YOLOv2 implemented their outputs.
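A minimal PyTorch sketch of that fixed-size grid head; the channel count and shapes are illustrative assumptions:

```python
# YOLO-style trick: a fixed-size grid output sidesteps the need for a
# variable number of outputs, however many circles the image contains.
import torch
import torch.nn as nn

class CircleGridHead(nn.Module):
    """Predicts, per grid cell: [p(centre in cell), x offset, y offset]."""

    def __init__(self, in_channels: int = 256):
        super().__init__()
        # 3 values per cell, via a 1x1 convolution over the feature map.
        self.head = nn.Conv2d(in_channels, 3, kernel_size=1)

    def forward(self, features: torch.Tensor):
        # features: (B, C, S, S) feature map for an SxS grid.
        out = self.head(features)            # (B, 3, S, S)
        prob = torch.sigmoid(out[:, 0])      # per-cell centre probability
        offsets = torch.sigmoid(out[:, 1:])  # (x, y) offsets within the cell
        return prob, offsets
```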

1

conchoso t1_j5ez9w0 wrote

For comparison, the Stable Diffusion-based AI models that generate images from prompts DO have an invisible but detectable watermark embedded by default in those hundreds of dreamed-up images that get posted to Reddit every day now... but they included an option to turn it off. Steganography is far further along in digital images than in plain text, though.
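For the curious, a sketch using the `invisible-watermark` package (the one Stable Diffusion's reference scripts use) with its DWT+DCT method; the payload string and filenames are assumptions for the example:

```python
# Embed and recover an invisible watermark with the imwatermark package.
import cv2
from imwatermark import WatermarkEncoder, WatermarkDecoder

payload = 'test'                          # illustrative 4-byte payload
encoder = WatermarkEncoder()
encoder.set_watermark('bytes', payload.encode('utf-8'))
bgr = cv2.imread('generated.png')         # OpenCV loads images as BGR
cv2.imwrite('generated_wm.png', encoder.encode(bgr, 'dwtDct'))

decoder = WatermarkDecoder('bytes', len(payload) * 8)  # payload length in bits
recovered = decoder.decode(cv2.imread('generated_wm.png'), 'dwtDct')
print(recovered.decode('utf-8'))          # prints 'test'
```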

17

gunshoes t1_j5ey2s1 wrote

No. Many hiring managers use arbitrary criteria to reduce the number of applicants they need to evaluate for a job, and degree requirements are one of those. While there probably are a few people in ML jobs who don't meet the degree requirements, in general you're going to struggle without them.

1