Recent comments in /f/MachineLearning
Red-Portal t1_j8ke7vj wrote
It's literally called importance sampling in the SGD literature. You normally have to down-weight the "important" samples to counter the fact that you're sampling them more often. Whether this practice actually accelerates convergence was an open question in SGD until very recently. Check this paper.
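The reweighting is a couple of lines: sample with probability proportional to an "importance" score, then scale each sampled term by 1/(n·p) so the estimate stays unbiased. A toy numpy sketch (made-up scores standing in for per-example losses):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: per-example "importance" scores (e.g. loss magnitudes).
n = 1000
scores = rng.exponential(size=n)

# Sample indices proportionally to importance...
p = scores / scores.sum()
batch = rng.choice(n, size=32, p=p)

# ...then reweight each sampled example by 1 / (n * p_i) so the
# estimate stays unbiased w.r.t. the uniform distribution.
weights = 1.0 / (n * p[batch])

# Because importance here is proportional to the value itself, every
# weighted term equals the uniform mean exactly (the ideal,
# zero-variance case for importance sampling).
values = scores  # stand-in for per-example gradient magnitudes
estimate = np.mean(weights * values[batch])
```

Skip the `weights` line and the estimate is biased toward the frequently sampled examples, which is exactly the failure mode being described.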
RoboticJan t1_j8kdf3r wrote
Reply to comment by DLamikins in [R] Hitchhiker’s Guide to Super-Resolution: Introduction and Recent Advances by Maleficent_Stay_7737
Why? It's open access.
astonzhang t1_j8kcydh wrote
Main_Mathematician77 t1_j8kawas wrote
Reply to comment by Main_Mathematician77 in [D] Sentient AI Encryption by Prestigious_Tap8633
But like yeah, if you zipped the NPY files storing the weight matrices, it most likely would not run
Main_Mathematician77 t1_j8kaol4 wrote
Reply to [D] Sentient AI Encryption by Prestigious_Tap8633
Intelligence is compression
ComplexColor t1_j8kaasf wrote
Reply to [D] Sentient AI Encryption by Prestigious_Tap8633
You're not making a lot of sense. It's not clear you understand what software is, what an AI is, what encryption is. Why do you think encrypting an AI would affect it running? Do you have some specific encryption in mind?
Are you just high?
ntaylor- t1_j8k7v8y wrote
Reply to comment by bik1230 in [P] Introducing arxivGPT: chrome extension that summarizes arxived research papers using chatGPT by _sshin_
I had the same thought... I'm fairly sure any GPT-based model can only handle about 4k tokens.
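If that's right, the extension presumably has to chunk papers and summarize piecewise. A rough sketch of the obvious workaround (not arxivGPT's actual code; whitespace word count is a crude stand-in for a real BPE tokenizer, and the budget numbers are made up):

```python
def rough_token_count(text: str) -> int:
    # Crude proxy: real tokenizers split text differently, but a
    # whitespace word count is a workable first approximation.
    return len(text.split())

def chunk_for_context(text: str, budget: int = 3000) -> list[str]:
    """Split text into pieces that each fit a model's context budget,
    leaving headroom for the prompt and the generated summary."""
    words = text.split()
    return [" ".join(words[i:i + budget]) for i in range(0, len(words), budget)]

paper_text = ("word " * 10000).strip()
chunks = chunk_for_context(paper_text, budget=3000)
# Each chunk can be summarized independently, then the per-chunk
# summaries concatenated and summarized again (map-reduce style).
```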
DLamikins t1_j8k2f9n wrote
Reply to [R] Hitchhiker’s Guide to Super-Resolution: Introduction and Recent Advances by Maleficent_Stay_7737
Arxiv link?
GlorifiedPlumber100 t1_j8jrhpl wrote
Reply to [D] Sentient AI Encryption by Prestigious_Tap8633
If it really had human like intelligence the encryption key would be 123, so it would be easy to fix.
WikiSummarizerBot t1_j8jrghx wrote
Reply to comment by The-Last-Lion-Turtle in [D] Sentient AI Encryption by Prestigious_Tap8633
>Homomorphic encryption is a form of encryption that allows computations to be performed on encrypted data without first having to decrypt it. The resulting computations are left in an encrypted form which, when decrypted, result in an output that is identical to that produced had the operations been performed on the unencrypted data. Homomorphic encryption can be used for privacy-preserving outsourced storage and computation. This allows data to be encrypted and out-sourced to commercial cloud environments for processing, all while encrypted.
The-Last-Lion-Turtle t1_j8jrena wrote
Reply to [D] Sentient AI Encryption by Prestigious_Tap8633
It could work with a heavy efficiency penalty.
https://en.m.wikipedia.org/wiki/Homomorphic_encryption
Though I don't think gradient descent will select for something like this.
Far more likely is that it obfuscates how it works: while it's not encrypted, we learn little with the tools we have, and we would have an extremely difficult time verifying that something is absent.
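The additive flavor of the property is easy to demo: in the Paillier cryptosystem, multiplying two ciphertexts decrypts to the sum of the plaintexts. A toy sketch (tiny hard-coded primes, purely illustrative and nowhere near secure, and it says nothing about the cost of running a whole network homomorphically):

```python
import math
import random

# Toy Paillier cryptosystem (additively homomorphic): multiplying two
# ciphertexts mod n^2 decrypts to the SUM of the plaintexts.
# Tiny hard-coded primes for illustration only -- NOT secure.
p, q = 293, 433
n = p * q
n2 = n * n
g = n + 1
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)  # with g = n+1, L(g^lam mod n^2) = lam

def encrypt(m: int) -> int:
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c: int) -> int:
    x = pow(c, lam, n2)
    return ((x - 1) // n) * mu % n

a, b = 17, 25
c_sum = (encrypt(a) * encrypt(b)) % n2  # addition under encryption
assert decrypt(c_sum) == a + b
```

Raising a ciphertext to a power likewise decrypts to a scalar multiple of the plaintext, which is why linear layers are the "cheap" part of homomorphic inference; nonlinearities are where the heavy efficiency penalty comes from.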
[deleted] t1_j8jn8ja wrote
Reply to [D] Simple Questions Thread by AutoModerator
[deleted]
BrotherAmazing t1_j8jd23p wrote
Reply to [D] Is a non-SOTA paper still good to publish if it has an interesting method that does have strong improvements over baselines (read text for more context)? Are there good examples of this kind of work being published? by orangelord234
Yes.
“SoTA” is also often ill-defined and, while important, can sometimes be a bit overhyped IMO.
Most practitioners and engineers want something that is as good as it can be or is above some threshold in accuracy, given constraints that can often be severe. If a “SoTA” approach cannot meet these real-world constraints, I would argue it’s not “SoTA” for that particular problem of interest.
If you have something that performs very well under such real-world constraints and can demonstrate value to the practitioner, it should be considered for publication by the editors.
redmx t1_j8jbgo5 wrote
In deep reinforcement learning this is called prioritized experience replay: https://arxiv.org/abs/1511.05952
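The core of the proportional variant fits in a short sketch (list-based for clarity; the paper uses a sum-tree for O(log n) sampling, and while alpha=0.6 / beta=0.4 match the paper's proportional defaults, everything else here is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

class PrioritizedReplay:
    """Minimal proportional prioritized-replay sketch."""

    def __init__(self, capacity: int, alpha: float = 0.6):
        self.capacity = capacity
        self.alpha = alpha          # how strongly priorities skew sampling
        self.data = []
        self.priorities = []

    def add(self, transition, td_error: float):
        # Priority proportional to |TD error|, with a small epsilon so
        # zero-error transitions still have a chance of being replayed.
        self.data.append(transition)
        self.priorities.append((abs(td_error) + 1e-6) ** self.alpha)
        if len(self.data) > self.capacity:
            self.data.pop(0)
            self.priorities.pop(0)

    def sample(self, batch_size: int, beta: float = 0.4):
        p = np.asarray(self.priorities)
        p = p / p.sum()
        idx = rng.choice(len(self.data), size=batch_size, p=p)
        # Importance-sampling weights correct for the non-uniform draw,
        # normalized by the max weight as in the paper.
        w = (len(self.data) * p[idx]) ** (-beta)
        w = w / w.max()
        return [self.data[i] for i in idx], w, idx

buffer = PrioritizedReplay(capacity=100)
for t in range(50):
    buffer.add(transition=t, td_error=rng.normal())
batch, weights, idx = buffer.sample(8)
```

Same pattern as the importance-sampling discussion above: skew the sampling toward high-error transitions, then use the weights to undo the bias in the update.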
Pfohlol t1_j8j6pfb wrote
Reply to comment by d0cmorris in [D] Constrained Optimization in Deep Learning by d0cmorris
Here's one to get started https://proceedings.mlr.press/v98/cotter19a.html
tdgros t1_j8j2wbd wrote
Reply to [R] Hitchhiker’s Guide to Super-Resolution: Introduction and Recent Advances by Maleficent_Stay_7737
Unless I missed it, the paper mentions that the degradation mapping should be estimated, but does not detail or cite papers that do that (examples: KernelGAN, KernelNet, doubleDIP, or MetaKernelGAN...).
SleekEagle t1_j8ix4fz wrote
Reply to comment by MustBeSomethingThere in [R] [N] Toolformer: Language Models Can Teach Themselves to Use Tools - paper by Meta AI Research by radi-cho
Authors publish papers on research, experiments, findings, etc. They do not always release the code for the models they are studying.
lucidrains' repos implement the models, creating open-source implementations of the research.
The next step would then be to train the model, which requires a lot more than just the code (most notably, money). I assume you're referring to these trained weights when you say "the needed AI model". Even one of these models would require a huge amount of time and money for a team to train, never mind a single person, let alone a whole portfolio of them.
For this reason, it's not very reasonable to expect lucidrains or any other person to train these models - the open-source implementations are a great contribution on their own!
fl2ooo t1_j8itlks wrote
Oversampling
nerdimite t1_j8is883 wrote
This seems somewhat similar to hard example mining, except that here you already know which examples are hard.
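For comparison, online hard example mining usually means scoring a larger candidate batch and keeping only the top-k highest-loss examples for the gradient step. A toy numpy sketch (random losses standing in for a real forward pass):

```python
import numpy as np

rng = np.random.default_rng(0)

def select_hard_examples(losses: np.ndarray, k: int) -> np.ndarray:
    """Return indices of the k examples with the largest loss."""
    return np.argsort(losses)[-k:]

# Stand-in for per-example losses from a forward pass over a
# candidate batch of 256 examples.
losses = rng.exponential(size=256)
hard_idx = select_hard_examples(losses, k=32)
# Backprop would then run only on the selected hard subset.
```

The difference from the thread's setup is exactly as stated: here "hard" is discovered from the current losses each step, rather than known in advance.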
perta1234 t1_j8iqbah wrote
Reply to comment by Final-Rush759 in [R] What are some papers that describe TikTok's algorithm? by Thin-Shirt6688
There is the claim that any system can be (approximately) reverse engineered if one has access to the results of the system. Are those too hidden from the public?
What is "best" is subjective. At least I was reading last week that any moderate fitness related interest brings quite unhealthy content very quickly. But it has to be better than Amazon's system, anyway.
BossOfTheGame t1_j8ikmcj wrote
Because you have a small batch size, my feeling is that you probably want a very small dropout rate on the important items, if only to decrease the chance the network overfits to them. Maybe 1 in 100 batches excludes the important items and the rest include them. But perhaps it doesn't matter.
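That 1-in-100 idea is a one-liner to implement. A sketch (the rate and the guaranteed-inclusion behavior are assumptions to tune, not a prescription):

```python
import random

def make_batch(important, regular, batch_size, exclude_rate=0.01):
    """With probability exclude_rate, build the batch without the
    important items entirely (so the network occasionally sees a
    batch without them); otherwise guarantee they are included."""
    if random.random() < exclude_rate:
        return random.sample(regular, batch_size)
    return important + random.sample(regular, batch_size - len(important))

random.seed(0)
batch = make_batch(["hard1", "hard2"],
                   [f"x{i}" for i in range(20)],
                   batch_size=8)
```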
pramodhrachuri t1_j8ihgqs wrote
Reply to comment by Fast-for-a-starfish in [R] [P] LUCAS: LUng CAncer Screening dataset by kandalete
Well, I can't access it from India
[deleted] t1_j8if3mw wrote
Reply to comment by 2blazen in [Discussion] The need for noise in stable diffusion by AdministrationOk2735
[deleted]
lucidrage t1_j8kewo9 wrote
Reply to comment by drcopus in [R] [N] Toolformer: Language Models Can Teach Themselves to Use Tools - paper by Meta AI Research by radi-cho
> allow it to ~~generalise to~~ generate new ones!

FTFY, that's how you get skynet!