Recent comments in /f/MachineLearning
ThunderySleep t1_j72y4qb wrote
Reply to comment by ooonurse in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
Are your friends?
edit: oh wait, you already told us they're not.
ooonurse t1_j72xui9 wrote
Reply to comment by ThunderySleep in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
r u ok hun?
ThunderySleep t1_j72xf6o wrote
Reply to comment by ooonurse in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
Why? I don't care about your friend's feelings.
This comment was a fine addition to the discussion until you thought you could tell me what to do.
[deleted] t1_j72u4c2 wrote
Reply to comment by Jurph in [D] Understanding Vision Transformer (ViT) - What are the prerequisites? by SAbdusSamad
[deleted]
PHEEEEELLLLLEEEEP t1_j72txfo wrote
Reply to comment by jimmymvp in [D] Normalizing Flows in 2023? by wellfriedbeans
Diffusion models can also generate exact likelihoods so maybe we'll see a shift to those in the future
Erosis t1_j72rzdl wrote
Reply to comment by SAbdusSamad in [D] Understanding Vision Transformer (ViT) - What are the prerequisites? by SAbdusSamad
You'll probably be fine learning transformers directly, but a better understanding of RNNs might make some of the NLP tutorials/papers containing transformers more easily comprehensible.
Attention is a very important component of transformers, but attention can be applied to RNNs, too.
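To make the comment above concrete, here is a minimal numpy sketch of scaled dot-product attention, the core operation in transformers (the shapes and toy data are illustrative, not from any particular model):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                               # weighted sum of values

# toy example: 2 queries attending over 3 key/value pairs, dim 4
rng = np.random.default_rng(0)
Q = rng.standard_normal((2, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (2, 4)
```

The same weighting idea is what attention-augmented RNNs (e.g. Bahdanau-style) apply over encoder hidden states.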
Noddybear t1_j72kce3 wrote
Reply to [p] I built an open source platform to deploy computationally intensive Python functions as serverless jobs, with no timeouts by seattleite849
Hey dude, this caught my eye before realising I spoke to you about this in person! I’ll have a play with it.
based_goats t1_j72jd9z wrote
Reply to comment by jimmymvp in [D] Normalizing Flows in 2023? by wellfriedbeans
There are some papers showing diffusion working better for high-dimensional data in likelihood-free inference, even just using an ELBO bound. I can dig them up later if wanted.
_Arsenie_Boca_ t1_j72g4g4 wrote
Reply to comment by alpha-meta in [D] Why do LLMs like InstructGPT and LLM use RL to instead of supervised learning to learn from the user-ranked examples? by alpha-meta
I'm not sure if they vary the sampling hyperparameters. The point is that language modelling objectives are to some degree ill-posed because we calculate the loss on intermediate results rather than the final output that we care about.
alpha-meta OP t1_j72dxx7 wrote
Reply to comment by bigabig in [D] Why do LLMs like InstructGPT and LLM use RL to instead of supervised learning to learn from the user-ranked examples? by alpha-meta
I think it's probably the non-differentiable nature of the sampling techniques. If it's just about limited training data and using the reward model, in that case you can also use weakly supervised learning with that reward model.
alpha-meta OP t1_j72dpto wrote
Reply to comment by _Arsenie_Boca_ in [D] Why do LLMs like InstructGPT and LLM use RL to instead of supervised learning to learn from the user-ranked examples? by alpha-meta
Good point, so you mean they incorporate things like beam search + changing temperature, top-k sampling, and nucleus sampling in the RL PPO-based optimization?
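For readers unfamiliar with the sampling strategies mentioned here, a minimal sketch of temperature plus top-k sampling over a logit vector (illustrative only, not the actual InstructGPT decoding code):

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=3, seed=0):
    """Temperature-scaled top-k sampling from a vector of logits."""
    rng = np.random.default_rng(seed)
    logits = np.asarray(logits, dtype=float) / temperature
    top = np.argsort(logits)[-top_k:]            # indices of the k best tokens
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()                         # renormalize over the top k
    return int(rng.choice(top, p=probs))

vocab_logits = [2.0, 0.5, -1.0, 3.0, 0.0, 1.5]
token = sample_next_token(vocab_logits, temperature=0.8, top_k=3)
print(token)  # one of the top-3 indices: 0, 3, or 5
```

Because this sampling step is discrete, it is non-differentiable, which is exactly why a policy-gradient method like PPO is used rather than backpropagating through the sampler.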
YOLOBOT666 t1_j72cncj wrote
Reply to [D] Querying with multiple vectors during embedding nearest neighbor search? by mostlyhydrogen
Iterative as in continuing until there’s no more neighbours left as you continuously add neighbours to your index and query?
SimonJDPrince t1_j72bw7l wrote
Explained in my forthcoming book:
https://udlbook.github.io/udlbook/
Should be a good place to start, and if it isn't then I'm really interested to know where you struggled so I can improve the explanation.
data_wizard_1867 t1_j72a77y wrote
Reply to comment by EducationalCicada in [D] What does a DL role look like in ten years? by PassingTumbleweed
I would even say machine learning is not the be all and end all of solving problems with data.
nicholsz t1_j728g2l wrote
Reply to comment by juanigp in [D] Understanding Vision Transformer (ViT) - What are the prerequisites? by SAbdusSamad
OS. Kernel. Bus. Processor. Transistor. p-n junction
mostlyhydrogen OP t1_j727t8z wrote
Reply to comment by Kacper-Lukawski in [D] Querying with multiple vectors during embedding nearest neighbor search? by mostlyhydrogen
Thanks for the link!
Kacper-Lukawski t1_j725avq wrote
Reply to [D] Querying with multiple vectors during embedding nearest neighbor search? by mostlyhydrogen
Qdrant has a recommendation API that allows doing exactly what you want, I suppose: https://qdrant.tech/documentation/search/#recommendation-api
Meddhouib10 t1_j72529u wrote
Reply to [p] Is it possible to add more classes to an already trained resnet image classifier model without the need to retrain it in all dataset again? [p] by YukkiiCode
Yes! You only need to change the last classifier layer (and initialize the added weights) to add more outputs, and then further train the model on data containing all the classes (including the new ones).
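The head-expansion step described above can be sketched with plain numpy weight matrices standing in for the ResNet's final linear layer (the dimensions and initialization scale here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-in for a trained classifier head: feature_dim=8, 3 existing classes
feature_dim, n_old, n_new = 8, 3, 2
W_old = rng.standard_normal((n_old, feature_dim))  # "trained" weights
b_old = rng.standard_normal(n_old)

# expanded head: copy the old rows, randomly initialize the new ones
W_new = np.vstack([W_old, 0.01 * rng.standard_normal((n_new, feature_dim))])
b_new = np.concatenate([b_old, np.zeros(n_new)])

features = rng.standard_normal(feature_dim)        # backbone output
logits = W_new @ features + b_new
print(logits.shape)  # (5,) -> old classes keep their logits, new ones start near 0
```

After the copy, fine-tuning on data that covers all five classes lets the new rows learn while the backbone and old rows only need small adjustments.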
blacksnowboader t1_j724p27 wrote
Reply to comment by comfytoday in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
Hey ChatGPT can you phrase this [sentence] to be politically correct?
mostlyhydrogen OP t1_j724ctr wrote
Reply to comment by RingoCatKeeper in [D] Querying with multiple vectors during embedding nearest neighbor search? by mostlyhydrogen
That was an interesting read, but I don't think it solves my problem. Their examples don't show joint vector searches: https://github.com/google-research/google-research/blob/master/scann/docs/example.ipynb
BiryaniSenpai t1_j7249ok wrote
Reply to comment by mostlyhydrogen in [D] Querying with multiple vectors during embedding nearest neighbor search? by mostlyhydrogen
I mean, pass your queries through a self-attention layer and then some fully connected layers, and have it output your final query vector.
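One way to read this suggestion is self-attention-style mixing of the query embeddings followed by pooling into a single search vector. A hypothetical numpy sketch (no learned weights; a real version would train the attention and FC layers):

```python
import numpy as np

def pool_queries(queries):
    """Mix several query embeddings via self-attention-style weighting,
    then mean-pool them into one search vector."""
    Q = np.asarray(queries, dtype=float)
    scores = Q @ Q.T / np.sqrt(Q.shape[-1])          # pairwise similarities
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)               # row-wise softmax
    attended = w @ Q                                 # each query mixed with the others
    return attended.mean(axis=0)                     # single pooled query

rng = np.random.default_rng(0)
qs = rng.standard_normal((4, 16))                    # four query embeddings
final_query = pool_queries(qs)
print(final_query.shape)  # (16,)
```

The pooled vector can then be used for a single nearest-neighbor lookup instead of issuing one query per embedding.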
mostlyhydrogen OP t1_j723ya3 wrote
Reply to comment by nobody202342 in [D] Querying with multiple vectors during embedding nearest neighbor search? by mostlyhydrogen
What about marking samples as "irrelevant"?
mostlyhydrogen OP t1_j723us3 wrote
Reply to comment by BiryaniSenpai in [D] Querying with multiple vectors during embedding nearest neighbor search? by mostlyhydrogen
What does it mean for a vector to attend to another vector?
new_name_who_dis_ t1_j723k5w wrote
Reply to comment by tripple13 in [D] Understanding Vision Transformer (ViT) - What are the prerequisites? by SAbdusSamad
I mean the more you understand the better obviously. But it's not necessary, it's just context for what we don't do anymore.
[deleted] t1_j731mih wrote
Reply to comment by ThunderySleep in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
[removed]