Recent comments in /f/MachineLearning
mirrorcoloured t1_j5wwhn5 wrote
Reply to comment by dineNshine in [D] Couldn't devs of major GPTs have added an invisible but detectable watermark in the models? by scarynut
While I agree with your concluding sentiments against centralization and recommending use of signatures, I don't believe your initial premise holds.
Consider steganography in digital images, where extra information can be embedded without any noticeable loss in signal quality.
One could argue that any bits not used for the primary signal are 'limiting usability', but this seems pedantic to me. It seems perfectly reasonable that watermarking could be implemented with no noticeable impact, given the already massive amount of computing power required and the density of the information output.
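To make the analogy concrete, here is a minimal least-significant-bit steganography sketch in pure Python (toy 8-bit grayscale values, one hidden bit per pixel; purely illustrative, not how a model watermark would be implemented):

```python
def embed_bits(pixels, bits):
    """Hide one bit per pixel in the least-significant bit."""
    return [(p & ~1) | b for p, b in zip(pixels, bits)] + pixels[len(bits):]

def extract_bits(pixels, n):
    """Recover the first n hidden bits."""
    return [p & 1 for p in pixels[:n]]

pixels = [200, 113, 54, 255, 7, 90]   # toy 8-bit grayscale values
secret = [1, 0, 1, 1]
stego = embed_bits(pixels, secret)

assert extract_bits(stego, 4) == secret
# Each pixel changes by at most 1 out of 255, so the image is visually unchanged.
assert all(abs(a - b) <= 1 for a, b in zip(pixels, stego))
```

The point is that the payload rides in bits the primary signal barely uses, which is why the "watermarking limits usability" premise doesn't obviously hold.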
deeceeo t1_j5wnrf1 wrote
Reply to [R] Tsetlin Machine in Medical Research - Striking Differences Between Tsetlin Machine Interpretability and Deep Learning Attention by olegranmo
How would you compare Tsetlin Machines to other intrinsically interpretable models, like the sparse decision trees that Cynthia Rudin's group works on? Both in terms of capacity/expressiveness and interpretability.
NeoKov t1_j5wmjkr wrote
As a novice, I'm not understanding why the test loss continues to increase, both in general and in Fig. 8.2b, if anyone can explain. Does the model continue to update and (over)fit during testing? I thought it was static after training. And is the testing batch always the same size as the training batch? They don't occur simultaneously, right? So the test plot is only generated after the training plot.
shingekichan1996 OP t1_j5wlavz wrote
Reply to comment by melgor89 in [D] Self-Supervised Contrastive Approaches that don’t use large batch size. by shingekichan1996
I saw an implementation of that paper here: https://github.com/raminnakhli/Decoupled-Contrastive-Learning
I also saw that the same paper was rejected at NeurIPS'21 because its impact was similar to that of other methods like Barlow Twins, SimSiam, BYOL, etc.
However, at first glance at the re-implemented results, it does work well with small batch sizes.
maximalentropy t1_j5wl3gx wrote
Reply to [D] Self-Supervised Contrastive Approaches that don’t use large batch size. by shingekichan1996
Momentum-encoder-based approaches don't need a large batch size because they store negatives in a queue rather than taking them from the mini-batch.
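A rough numpy sketch of that queue idea (MoCo-style; all sizes and names here are made up for illustration, and the real method also involves a momentum-updated key encoder):

```python
import numpy as np

queue_size, dim, batch = 8, 4, 2
queue = np.random.randn(queue_size, dim)   # negatives carried over from past batches
ptr = 0

def enqueue(keys):
    """Replace the oldest queue entries with the current batch's key embeddings."""
    global ptr
    n = keys.shape[0]
    queue[ptr:ptr + n] = keys          # assumes queue_size is a multiple of n
    ptr = (ptr + n) % queue_size

def info_nce_logits(q, k_pos):
    """Positive logit from the paired key; negatives come from the queue."""
    l_pos = np.sum(q * k_pos, axis=1, keepdims=True)   # (batch, 1)
    l_neg = q @ queue.T                                # (batch, queue_size)
    return np.concatenate([l_pos, l_neg], axis=1)

q = np.random.randn(batch, dim)   # query encoder output
k = np.random.randn(batch, dim)   # momentum (key) encoder output
logits = info_nce_logits(q, k)
# The number of negatives is queue_size, independent of the mini-batch size.
assert logits.shape == (batch, 1 + queue_size)
enqueue(k)
```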
BigDreamx OP t1_j5wgoy0 wrote
Reply to comment by Red-Portal in [D] Publication Resume by BigDreamx
I see. So it's basically: don't do anything that will make your paper be all over the internet.
Sorry if I'm asking dumb questions lol, I'm new to this stuff.
Red-Portal t1_j5wgewn wrote
Reply to comment by BigDreamx in [D] Publication Resume by BigDreamx
"Yes, but don't make a fuss about it" is pretty much the guideline.
LetWrong1932 t1_j5wduib wrote
Reply to comment by avd4292 in [D] CVPR Reviews are out by banmeyoucoward
I heard that those scores got accepted last year, so maybe it's quite good!
LetWrong1932 t1_j5wdrdy wrote
Reply to comment by Ok-Yogurtcloset-4508 in [D] CVPR Reviews are out by banmeyoucoward
not except for 2012
koolaidman123 t1_j5wbk37 wrote
Reply to comment by [deleted] in [D] Self-Supervised Contrastive Approaches that don’t use large batch size. by shingekichan1996
That's not the same thing...
Gradient accumulation computes the loss on each micro-batch separately. That doesn't work with in-batch negatives, because you need to compare inputs from batch 1 against inputs from batch 2, hence offloading and caching the predictions and then calculating the loss over them as one batch.
That's why gradient accumulation doesn't simulate large batch sizes for contrastive learning.
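To see the point numerically, here is a minimal numpy sketch of InfoNCE with in-batch negatives (toy sizes, random "embeddings"): computing it on the full batch gives each anchor 3 negatives, while averaging it over two micro-batches gives each anchor only 1 negative, so the two losses (and hence their gradients) differ.

```python
import numpy as np

def info_nce(z1, z2):
    """InfoNCE with in-batch negatives: row i of z1 is positive with row i
    of z2, and every other row of z2 is a negative."""
    logits = z1 @ z2.T                                   # (n, n) similarities
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
z1, z2 = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))

full = info_nce(z1, z2)   # each anchor contrasts against 3 negatives

# "Gradient accumulation": loss per micro-batch, then averaged.
# Each anchor only ever sees 1 negative; no cross-micro-batch comparisons.
accum = 0.5 * (info_nce(z1[:2], z2[:2]) + info_nce(z1[2:], z2[2:]))

assert not np.isclose(full, accum)   # different objectives, not just variance
```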
BigDreamx OP t1_j5w8u6o wrote
Reply to comment by Red-Portal in [D] Publication Resume by BigDreamx
"A reviewer may be able to deduce the authors’ identities by using external resources, such as technical reports published on the web. The availability of information on the web that may allow reviewers to infer the authors’ identities does not constitute a breach of the double-blind submission policy. Reviewers are explicitly asked not to seek this information."
Doesn't this indicate that we can post whatever we want on the web? It's just the submission that has to be anonymous?
Red-Portal t1_j5w8thc wrote
Reply to comment by BigDreamx in [D] Publication Resume by BigDreamx
> Authors are allowed to post versions of their work on preprint servers such as arXiv. They are also allowed to give talks to restricted audiences on the work(s) submitted to ICML during the review. If you have posted or plan to post a non-anonymized version of your paper online before the ICML decisions are made, the submitted version must not refer to the non-anonymized version.
> ICML strongly discourages advertising the preprint on social media or in the press while under submission to ICML. Under no circumstances should your work be explicitly identified as ICML submission at any time during the review period, i.e., from the time you submit the paper to the communication of the accept/reject decisions.
Mate, it's stated in the call for papers.
BigDreamx OP t1_j5w83wc wrote
Reply to comment by Red-Portal in [D] Publication Resume by BigDreamx
Do you have a source for this? Also, if this is the case, am I allowed to put on ArXiv and link that to my resume tho?
Red-Portal t1_j5w7qtt wrote
Reply to [D] Publication Resume by BigDreamx
Yes, but I believe you mustn't state that it's under consideration for ICML.
Ready-Blacksmith-411 t1_j5w7hap wrote
Reply to [D] CVPR Reviews are out by banmeyoucoward
I got 1 weak accept and 2 weak rejects with only theoretical comments. Is there a chance to increase the scores with a good rebuttal? Really worried about that.
RealKillering t1_j5vu1t5 wrote
Reply to [D] Simple Questions Thread by AutoModerator
I just started working with Google Colab. I'm still learning and just used CIFAR-10 for the first time. I switched to Colab Pro and also switched the GPU class to Premium.
The thing is, training seems to take just as long as with the free GPU. What am I doing wrong?
shapul t1_j5vp3a3 wrote
Reply to comment by NadaBrothers in [R] Easiest way to train RNN's in MATLAB or Julia? by NadaBrothers
Note that MATLAB can directly call Python functions and scripts. This is built into core MATLAB; no extra toolbox or third-party code is needed.
squidward2022 t1_j5vmb95 wrote
Reply to [D] Self-Supervised Contrastive Approaches that don’t use large batch size. by shingekichan1996
(https://arxiv.org/pdf/2106.04156.pdf) This was a cool paper from NeurIPS 2021 that aims to theoretically explain the success of contrastive learning by relating it to spectral clustering. The authors present a loss very similar in form to InfoNCE, which they use for their theory; one of the upsides they found was that it works well with small batch sizes.
(https://arxiv.org/abs/2110.06848) I skimmed this work a while back; one of its main claims is that the approach works with small batch sizes.
dineNshine t1_j5vm4r5 wrote
Reply to comment by mirrorcoloured in [D] Couldn't devs of major GPTs have added an invisible but detectable watermark in the models? by scarynut
By definition. If you force the model to embed a watermark, you can only generate watermarked content. Since OP proposed embedding it into the model parameters, it would also likely degrade performance.
Limiting the end user this way is bad, for the reasons I stated above. The right approach is to train a model that fits the data well, and then condition it using the input prompt. Putting arbitrary limits on the model itself to prevent misuse is misguided at best, and ensures that only people in power will be able to utilize the technology to its fullest. It would also give people a false sense of security, since they might think that content generated with a model lacking a watermark is "genuine".
If AI advances to a point where the content it generates is indistinguishable from human-generated content and fake recordings become a problem, the only sensible thing we can really do is use signatures. This is a simple method that works well: for any piece of virtual content, you can quickly check whether it came from an entity known to you by verifying against their public key.
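The sign/verify flow looks roughly like this. Toy RSA sketch with tiny textbook primes, purely to show the mechanics; this is completely insecure, and a real system would use a vetted scheme like Ed25519 from an established library:

```python
import hashlib

# Toy RSA parameters (textbook example values; far too small for real use).
p, q = 61, 53
n = p * q                            # public modulus
e = 17                               # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (Python 3.8+)

def sign(message: bytes) -> int:
    """Creator signs a hash of the content with the private key."""
    h = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(h, d, n)

def verify(message: bytes, sig: int) -> bool:
    """Anyone can check the signature using only the public key (n, e)."""
    h = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(sig, e, n) == h

msg = b"generated-content"
sig = sign(msg)
assert verify(msg, sig)                     # authentic content checks out
assert not verify(b"tampered-content", sig)  # any edit breaks the signature
```

The watermark-free alternative: authenticity comes from the signer's key, not from anything baked into the generator.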
stargazer1Q84 t1_j5vazpd wrote
If you want to go the semantic search route, make sure to check out deepset.ai's Haystack framework in conjunction with a sentence-transformer. They make semantic document retrieval very easy to set up, and there are many high-performing pre-trained models for semantic search on Hugging Face.
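Under the hood, the retrieval step is just nearest-neighbor search over embeddings. A minimal numpy sketch with made-up 3-d vectors standing in for real sentence embeddings (in practice these would come from a model such as `sentence-transformers/all-MiniLM-L6-v2`, and Haystack wraps all of this for you):

```python
import numpy as np

def cosine_top_k(query_vec, doc_vecs, k=2):
    """Rank documents by cosine similarity to the query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q
    top = np.argsort(-scores)[:k]        # indices of the k best documents
    return top, scores[top]

# Stand-in document embeddings; rows are documents.
docs = np.array([[0.9, 0.1, 0.0],
                 [0.1, 0.9, 0.0],
                 [0.7, 0.3, 0.1]])
query = np.array([1.0, 0.0, 0.0])

idx, scores = cosine_top_k(query, docs)
assert idx[0] == 0   # doc 0 points in nearly the same direction as the query
```

The framework's job is the plumbing around this: embedding documents in bulk, indexing them, and pairing the retriever with a reader or generator.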
eamonnkeogh t1_j5x10g4 wrote
Reply to [R] Best service for scientific paper correction by Meddhouib10
May I suggest you look at this short checklist?
https://www.cs.ucr.edu/~eamonn/public/ChecklistforRevisingaDataMiningPaper.doc