Recent comments in /f/MachineLearning
Red-Portal t1_j8ke7vj wrote
It's literally called importance sampling in the SGD literature. You normally have to down-weight the "important" samples to counter the fact that you're sampling them more often. Whether this practice actually accelerates convergence was an open question in SGD until very recently. Check this paper.
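The reweighting is a couple of lines: sample with probability proportional to an "importance" score, then scale each sampled term by 1/(n·p) so the estimate stays unbiased. A toy numpy sketch (made-up scores standing in for per-example losses):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: per-example "importance" scores (e.g. loss magnitudes).
n = 1000
scores = rng.exponential(size=n)

# Sample indices proportionally to importance...
p = scores / scores.sum()
batch = rng.choice(n, size=32, p=p)

# ...then reweight each sampled example by 1 / (n * p_i) so the
# estimate stays unbiased w.r.t. the uniform distribution.
weights = 1.0 / (n * p[batch])

# Because importance here is proportional to the value itself, every
# weighted term equals the uniform mean exactly (the ideal,
# zero-variance case for importance sampling).
values = scores  # stand-in for per-example gradient magnitudes
estimate = np.mean(weights * values[batch])
```

Skip the `weights` line and the estimate is biased toward the frequently sampled examples, which is exactly the failure mode being described.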
RoboticJan t1_j8kdf3r wrote
Reply to comment by DLamikins in [R] Hitchhiker’s Guide to Super-Resolution: Introduction and Recent Advances by Maleficent_Stay_7737
Why? It's open access.
astonzhang t1_j8kcydh wrote
Main_Mathematician77 t1_j8kawas wrote
Reply to comment by Main_Mathematician77 in [D] Sentient AI Encryption by Prestigious_Tap8633
But like yeah, if you zipped the NPY files storing the weight matrices, it most likely would not run
Main_Mathematician77 t1_j8kaol4 wrote
Reply to [D] Sentient AI Encryption by Prestigious_Tap8633
Intelligence is compression
ComplexColor t1_j8kaasf wrote
Reply to [D] Sentient AI Encryption by Prestigious_Tap8633
You're not making a lot of sense. It's not clear you understand what software is, what an AI is, what encryption is. Why do you think encrypting an AI would affect it running? Do you have some specific encryption in mind?
Are you just high?
ntaylor- t1_j8k7v8y wrote
Reply to comment by bik1230 in [P] Introducing arxivGPT: chrome extension that summarizes arxived research papers using chatGPT by _sshin_
I had the same thought... I'm fairly sure any GPT-based model can only handle about 4k tokens.
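If that's right, the extension presumably has to chunk papers and summarize piecewise. A rough sketch of the obvious workaround (not arxivGPT's actual code; whitespace word count is a crude stand-in for a real BPE tokenizer, and the budget numbers are made up):

```python
def rough_token_count(text: str) -> int:
    # Crude proxy: real tokenizers split text differently, but a
    # whitespace word count is a workable first approximation.
    return len(text.split())

def chunk_for_context(text: str, budget: int = 3000) -> list[str]:
    """Split text into pieces that each fit a model's context budget,
    leaving headroom for the prompt and the generated summary."""
    words = text.split()
    return [" ".join(words[i:i + budget]) for i in range(0, len(words), budget)]

paper_text = ("word " * 10000).strip()
chunks = chunk_for_context(paper_text, budget=3000)
# Each chunk can be summarized independently, then the per-chunk
# summaries concatenated and summarized again (map-reduce style).
```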
DLamikins t1_j8k2f9n wrote
Reply to [R] Hitchhiker’s Guide to Super-Resolution: Introduction and Recent Advances by Maleficent_Stay_7737
Arxiv link?
GlorifiedPlumber100 t1_j8jrhpl wrote
Reply to [D] Sentient AI Encryption by Prestigious_Tap8633
If it really had human like intelligence the encryption key would be 123, so it would be easy to fix.
WikiSummarizerBot t1_j8jrghx wrote
Reply to comment by The-Last-Lion-Turtle in [D] Sentient AI Encryption by Prestigious_Tap8633
>Homomorphic encryption is a form of encryption that allows computations to be performed on encrypted data without first having to decrypt it. The resulting computations are left in an encrypted form which, when decrypted, result in an output that is identical to that produced had the operations been performed on the unencrypted data. Homomorphic encryption can be used for privacy-preserving outsourced storage and computation. This allows data to be encrypted and out-sourced to commercial cloud environments for processing, all while encrypted.
The-Last-Lion-Turtle t1_j8jrena wrote
Reply to [D] Sentient AI Encryption by Prestigious_Tap8633
It could work with a heavy efficiency penalty.
https://en.m.wikipedia.org/wiki/Homomorphic_encryption
Though I don't think gradient descent will select for something like this.
Far more likely is that it obfuscates how it works: while it's not encrypted, we learn little with the tools we have, and we would have an extremely difficult time verifying that something is absent.
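The additive flavor of the property is easy to demo: in the Paillier cryptosystem, multiplying two ciphertexts decrypts to the sum of the plaintexts. A toy sketch (tiny hard-coded primes, purely illustrative and nowhere near secure, and it says nothing about the cost of running a whole network homomorphically):

```python
import math
import random

# Toy Paillier cryptosystem (additively homomorphic): multiplying two
# ciphertexts mod n^2 decrypts to the SUM of the plaintexts.
# Tiny hard-coded primes for illustration only -- NOT secure.
p, q = 293, 433
n = p * q
n2 = n * n
g = n + 1
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)  # with g = n+1, L(g^lam mod n^2) = lam

def encrypt(m: int) -> int:
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c: int) -> int:
    x = pow(c, lam, n2)
    return ((x - 1) // n) * mu % n

a, b = 17, 25
c_sum = (encrypt(a) * encrypt(b)) % n2  # addition under encryption
assert decrypt(c_sum) == a + b
```

Raising a ciphertext to a power likewise decrypts to a scalar multiple of the plaintext, which is why linear layers are the "cheap" part of homomorphic inference; nonlinearities are where the heavy efficiency penalty comes from.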
[deleted] t1_j8jn8ja wrote
Reply to [D] Simple Questions Thread by AutoModerator
[deleted]
BrotherAmazing t1_j8jd23p wrote
Reply to [D] Is a non-SOTA paper still good to publish if it has an interesting method that does have strong improvements over baselines (read text for more context)? Are there good examples of this kind of work being published? by orangelord234
Yes.
“SoTA” is also often ill-defined and, while important, can sometimes be a bit overhyped IMO.
Most practitioners and engineers want something that is as good as it can be or is above some threshold in accuracy, given constraints that can often be severe. If a “SoTA” approach cannot meet these real-world constraints, I would argue it’s not “SoTA” for that particular problem of interest.
If you have something that performs very well under such real-world constraints and can demonstrate value to the practitioner, it should be considered for publication by the editors.
redmx t1_j8jbgo5 wrote
In deep reinforcement learning this is called prioritized experience replay: https://arxiv.org/abs/1511.05952
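The core of the proportional variant fits in a short sketch (list-based for clarity; the paper uses a sum-tree for O(log n) sampling, and while alpha=0.6 / beta=0.4 match the paper's proportional defaults, everything else here is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

class PrioritizedReplay:
    """Minimal proportional prioritized-replay sketch."""

    def __init__(self, capacity: int, alpha: float = 0.6):
        self.capacity = capacity
        self.alpha = alpha          # how strongly priorities skew sampling
        self.data = []
        self.priorities = []

    def add(self, transition, td_error: float):
        # Priority proportional to |TD error|, with a small epsilon so
        # zero-error transitions still have a chance of being replayed.
        self.data.append(transition)
        self.priorities.append((abs(td_error) + 1e-6) ** self.alpha)
        if len(self.data) > self.capacity:
            self.data.pop(0)
            self.priorities.pop(0)

    def sample(self, batch_size: int, beta: float = 0.4):
        p = np.asarray(self.priorities)
        p = p / p.sum()
        idx = rng.choice(len(self.data), size=batch_size, p=p)
        # Importance-sampling weights correct for the non-uniform draw,
        # normalized by the max weight as in the paper.
        w = (len(self.data) * p[idx]) ** (-beta)
        w = w / w.max()
        return [self.data[i] for i in idx], w, idx

buffer = PrioritizedReplay(capacity=100)
for t in range(50):
    buffer.add(transition=t, td_error=rng.normal())
batch, weights, idx = buffer.sample(8)
```

Same pattern as the importance-sampling discussion above: skew the sampling toward high-error transitions, then use the weights to undo the bias in the update.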
Pfohlol t1_j8j6pfb wrote
Reply to comment by d0cmorris in [D] Constrained Optimization in Deep Learning by d0cmorris
Here's one to get started https://proceedings.mlr.press/v98/cotter19a.html
tdgros t1_j8j2wbd wrote
Reply to [R] Hitchhiker’s Guide to Super-Resolution: Introduction and Recent Advances by Maleficent_Stay_7737
Unless I missed it, the paper mentions that the degradation mapping should be estimated, but does not detail or cite papers that do that (examples: KernelGAN, KernelNet, doubleDIP, or MetaKernelGAN...).
SleekEagle t1_j8ix4fz wrote
Reply to comment by MustBeSomethingThere in [R] [N] Toolformer: Language Models Can Teach Themselves to Use Tools - paper by Meta AI Research by radi-cho
Authors publish papers on research, experiments, findings, etc. They do not always release the code for the models they are studying.
lucidrains' repos implement the models, creating open-source implementations of the research.
The next step would then be to train the model, which requires a lot more than just the code (most notably, money). I assume you're referring to these trained weights when you say "the needed AI model". Even one of these models would require a huge amount of time and money for a team to train, never mind a single person, let alone a whole portfolio of them.
For this reason, it's not very reasonable to expect lucidrains or any other person to train these models - the open-source implementations are a great contribution on their own!
fl2ooo t1_j8itlks wrote
Oversampling
nerdimite t1_j8is883 wrote
This seems somewhat similar to hard example mining, except that here you already know which examples are hard.
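For comparison, online hard example mining usually means scoring a larger candidate batch and keeping only the top-k highest-loss examples for the gradient step. A toy numpy sketch (random losses standing in for a real forward pass):

```python
import numpy as np

rng = np.random.default_rng(0)

def select_hard_examples(losses: np.ndarray, k: int) -> np.ndarray:
    """Return indices of the k examples with the largest loss."""
    return np.argsort(losses)[-k:]

# Stand-in for per-example losses from a forward pass over a
# candidate batch of 256 examples.
losses = rng.exponential(size=256)
hard_idx = select_hard_examples(losses, k=32)
# Backprop would then run only on the selected hard subset.
```

The difference from the thread's setup is exactly as stated: here "hard" is discovered from the current losses each step, rather than known in advance.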
perta1234 t1_j8iqbah wrote
Reply to comment by Final-Rush759 in [R] What are some papers that describe TikTok's algorithm? by Thin-Shirt6688
There is the claim that any system can be (approximately) reverse engineered if one has access to the results of the system. Are those too hidden from the public?
What is "best" is subjective. At least I was reading last week that any moderate fitness related interest brings quite unhealthy content very quickly. But it has to be better than Amazon's system, anyway.
BossOfTheGame t1_j8ikmcj wrote
Because you have a small batch size, my feeling is that you probably want a very small dropout rate on the important items, if only to decrease the chance the network overfits to them. Maybe 1 in 100 batches excludes the important items and the rest include them. But perhaps it doesn't matter.
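That 1-in-100 idea is a one-liner to implement. A sketch (the rate and the guaranteed-inclusion behavior are assumptions to tune, not a prescription):

```python
import random

def make_batch(important, regular, batch_size, exclude_rate=0.01):
    """With probability exclude_rate, build the batch without the
    important items entirely (so the network occasionally sees a
    batch without them); otherwise guarantee they are included."""
    if random.random() < exclude_rate:
        return random.sample(regular, batch_size)
    return important + random.sample(regular, batch_size - len(important))

random.seed(0)
batch = make_batch(["hard1", "hard2"],
                   [f"x{i}" for i in range(20)],
                   batch_size=8)
```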
pramodhrachuri t1_j8ihgqs wrote
Reply to comment by Fast-for-a-starfish in [R] [P] LUCAS: LUng CAncer Screening dataset by kandalete
Well, I can't access it from India
[deleted] t1_j8if3mw wrote
Reply to comment by 2blazen in [Discussion] The need for noise in stable diffusion by AdministrationOk2735
[deleted]
lucidrage t1_j8kewo9 wrote
Reply to comment by drcopus in [R] [N] Toolformer: Language Models Can Teach Themselves to Use Tools - paper by Meta AI Research by radi-cho
> allow it to ~~generalise to~~ generate new ones!

FTFY, that's how you get skynet!