Recent comments in /f/MachineLearning
ThunderySleep t1_j72y4qb wrote
Reply to comment by ooonurse in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
Are your friends?
edit: oh wait, you already told us they're not.
ooonurse t1_j72xui9 wrote
Reply to comment by ThunderySleep in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
r u ok hun?
ThunderySleep t1_j72xf6o wrote
Reply to comment by ooonurse in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
Why? I don't care about your friend's feelings.
This comment was a fine addition to the discussion until you thought you could tell me what to do.
[deleted] t1_j72u4c2 wrote
Reply to comment by Jurph in [D] Understanding Vision Transformer (ViT) - What are the prerequisites? by SAbdusSamad
[deleted]
PHEEEEELLLLLEEEEP t1_j72txfo wrote
Reply to comment by jimmymvp in [D] Normalizing Flows in 2023? by wellfriedbeans
Diffusion models can also generate exact likelihoods so maybe we'll see a shift to those in the future
Erosis t1_j72rzdl wrote
Reply to comment by SAbdusSamad in [D] Understanding Vision Transformer (ViT) - What are the prerequisites? by SAbdusSamad
You'll probably be fine learning transformers directly, but a better understanding of RNNs might make some of the NLP tutorials/papers containing transformers more easily comprehensible.
Attention is a very important component of transformers, but attention can be applied to RNNs, too.
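To make the comment above concrete, here is a minimal numpy sketch of scaled dot-product attention, the core operation in transformers (the shapes and toy data are illustrative, not from any particular model):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                               # weighted sum of values

# toy example: 2 queries attending over 3 key/value pairs, dim 4
rng = np.random.default_rng(0)
Q = rng.standard_normal((2, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (2, 4)
```

The same weighting idea is what attention-augmented RNNs (e.g. Bahdanau-style) apply over encoder hidden states.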
Noddybear t1_j72kce3 wrote
Reply to [p] I built an open source platform to deploy computationally intensive Python functions as serverless jobs, with no timeouts by seattleite849
Hey dude, this caught my eye before realising I spoke to you about this in person! I’ll have a play with it.
based_goats t1_j72jd9z wrote
Reply to comment by jimmymvp in [D] Normalizing Flows in 2023? by wellfriedbeans
There are some papers showing diffusion working better for high-dimensional data in likelihood-free inference, even just using an ELBO bound. I can dig them up later if wanted.
_Arsenie_Boca_ t1_j72g4g4 wrote
Reply to comment by alpha-meta in [D] Why do LLMs like InstructGPT and LLM use RL to instead of supervised learning to learn from the user-ranked examples? by alpha-meta
I'm not sure if they vary the sampling hyperparameters. The point is that language modelling objectives are to some degree ill-posed because we calculate the loss on intermediate results rather than the final output that we care about.
alpha-meta OP t1_j72dxx7 wrote
Reply to comment by bigabig in [D] Why do LLMs like InstructGPT and LLM use RL to instead of supervised learning to learn from the user-ranked examples? by alpha-meta
I think it's probably the non-differentiable nature of the sampling techniques. If it's just about limited training data and using the reward model, in that case you can also use weakly supervised learning with that reward model.
alpha-meta OP t1_j72dpto wrote
Reply to comment by _Arsenie_Boca_ in [D] Why do LLMs like InstructGPT and LLM use RL to instead of supervised learning to learn from the user-ranked examples? by alpha-meta
Good point, so you mean they incorporate things like beam search + changing temperature, top-k sampling, and nucleus sampling in the RL PPO-based optimization?
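For readers unfamiliar with the sampling strategies mentioned here, a minimal sketch of temperature plus top-k sampling over a logit vector (illustrative only, not the actual InstructGPT decoding code):

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=3, seed=0):
    """Temperature-scaled top-k sampling from a vector of logits."""
    rng = np.random.default_rng(seed)
    logits = np.asarray(logits, dtype=float) / temperature
    top = np.argsort(logits)[-top_k:]            # indices of the k best tokens
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()                         # renormalize over the top k
    return int(rng.choice(top, p=probs))

vocab_logits = [2.0, 0.5, -1.0, 3.0, 0.0, 1.5]
token = sample_next_token(vocab_logits, temperature=0.8, top_k=3)
print(token)  # one of the top-3 indices: 0, 3, or 5
```

Because this sampling step is discrete, it is non-differentiable, which is exactly why a policy-gradient method like PPO is used rather than backpropagating through the sampler.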
YOLOBOT666 t1_j72cncj wrote
Reply to [D] Querying with multiple vectors during embedding nearest neighbor search? by mostlyhydrogen
Iterative as in continuing until there’s no more neighbours left as you continuously add neighbours to your index and query?
SimonJDPrince t1_j72bw7l wrote
Explained in my forthcoming book:
https://udlbook.github.io/udlbook/
Should be a good place to start, and if it isn't then I'm really interested to know where you struggled so I can improve the explanation.
data_wizard_1867 t1_j72a77y wrote
Reply to comment by EducationalCicada in [D] What does a DL role look like in ten years? by PassingTumbleweed
I would even say machine learning is not the be all and end all of solving problems with data.
nicholsz t1_j728g2l wrote
Reply to comment by juanigp in [D] Understanding Vision Transformer (ViT) - What are the prerequisites? by SAbdusSamad
OS. Kernel. Bus. Processor. Transistor. p-n junction
mostlyhydrogen OP t1_j727t8z wrote
Reply to comment by Kacper-Lukawski in [D] Querying with multiple vectors during embedding nearest neighbor search? by mostlyhydrogen
Thanks for the link!
Kacper-Lukawski t1_j725avq wrote
Reply to [D] Querying with multiple vectors during embedding nearest neighbor search? by mostlyhydrogen
Qdrant has a recommendation API that allows doing exactly what you want, I suppose: https://qdrant.tech/documentation/search/#recommendation-api
Meddhouib10 t1_j72529u wrote
Reply to [p] Is it possible to add more classes to an already trained resnet image classifier model without the need to retrain it in all dataset again? [p] by YukkiiCode
Yes! You only need to change the last classifier layer (and initialize the added weights) to add more outputs, and then further train the model on data containing all the classes (including the new ones).
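The head-expansion step described above can be sketched with plain numpy weight matrices standing in for the ResNet's final linear layer (the dimensions and initialization scale here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-in for a trained classifier head: feature_dim=8, 3 existing classes
feature_dim, n_old, n_new = 8, 3, 2
W_old = rng.standard_normal((n_old, feature_dim))  # "trained" weights
b_old = rng.standard_normal(n_old)

# expanded head: copy the old rows, randomly initialize the new ones
W_new = np.vstack([W_old, 0.01 * rng.standard_normal((n_new, feature_dim))])
b_new = np.concatenate([b_old, np.zeros(n_new)])

features = rng.standard_normal(feature_dim)        # backbone output
logits = W_new @ features + b_new
print(logits.shape)  # (5,) -> old classes keep their logits, new ones start near 0
```

After the copy, fine-tuning on data that covers all five classes lets the new rows learn while the backbone and old rows only need small adjustments.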
blacksnowboader t1_j724p27 wrote
Reply to comment by comfytoday in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
Hey ChatGPT can you phrase this [sentence] to be politically correct?
mostlyhydrogen OP t1_j724ctr wrote
Reply to comment by RingoCatKeeper in [D] Querying with multiple vectors during embedding nearest neighbor search? by mostlyhydrogen
That was an interesting read, but I don't think it solves my problem. Their examples don't show joint vector searches: https://github.com/google-research/google-research/blob/master/scann/docs/example.ipynb
BiryaniSenpai t1_j7249ok wrote
Reply to comment by mostlyhydrogen in [D] Querying with multiple vectors during embedding nearest neighbor search? by mostlyhydrogen
I mean, pass your queries through a self-attention layer and then some fully connected layers, and have it output your final query vector.
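One way to read this suggestion is self-attention-style mixing of the query embeddings followed by pooling into a single search vector. A hypothetical numpy sketch (no learned weights; a real version would train the attention and FC layers):

```python
import numpy as np

def pool_queries(queries):
    """Mix several query embeddings via self-attention-style weighting,
    then mean-pool them into one search vector."""
    Q = np.asarray(queries, dtype=float)
    scores = Q @ Q.T / np.sqrt(Q.shape[-1])          # pairwise similarities
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)               # row-wise softmax
    attended = w @ Q                                 # each query mixed with the others
    return attended.mean(axis=0)                     # single pooled query

rng = np.random.default_rng(0)
qs = rng.standard_normal((4, 16))                    # four query embeddings
final_query = pool_queries(qs)
print(final_query.shape)  # (16,)
```

The pooled vector can then be used for a single nearest-neighbor lookup instead of issuing one query per embedding.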
mostlyhydrogen OP t1_j723ya3 wrote
Reply to comment by nobody202342 in [D] Querying with multiple vectors during embedding nearest neighbor search? by mostlyhydrogen
What about marking samples as "irrelevant"?
mostlyhydrogen OP t1_j723us3 wrote
Reply to comment by BiryaniSenpai in [D] Querying with multiple vectors during embedding nearest neighbor search? by mostlyhydrogen
What does it mean for a vector to attend to another vector?
new_name_who_dis_ t1_j723k5w wrote
Reply to comment by tripple13 in [D] Understanding Vision Transformer (ViT) - What are the prerequisites? by SAbdusSamad
I mean the more you understand the better obviously. But it's not necessary, it's just context for what we don't do anymore.
[deleted] t1_j731mih wrote
Reply to comment by ThunderySleep in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
[removed]