Recent comments in /f/MachineLearning

cthorrez t1_ja70abd wrote

I find it a little weird that RLHF is considered to be reinforcement learning.

The human feedback is collected offline and forms a static dataset. They use the PPO objective, but it's really more a form of supervised learning. There is no agent interacting with an env: the "env" is just sampling text from a static dataset, and the reward is the score from a neural net that was itself trained on a static dataset.
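For what it's worth, here's roughly what that objective looks like stripped down; a minimal sketch (all function and variable names are mine, and the frozen reward model is abstracted into precomputed advantages):

```python
import torch

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    # Clipped PPO surrogate. The "advantages" come from a frozen reward
    # model scoring sampled text, not from interacting with an environment.
    ratio = torch.exp(logp_new - logp_old)  # pi_new / pi_old per sampled token
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()  # negate to maximize

# Toy usage: log-probs of 8 sampled tokens under the new and old policies.
logp_old = torch.randn(8)
logp_new = logp_old + 0.1 * torch.randn(8)
advantages = torch.randn(8)  # stand-in for (whitened) reward-model scores
loss = ppo_clip_loss(logp_new, logp_old, advantages)
```

Nothing in that loop touches a live environment; every ingredient is sampled from or scored by something trained on static data.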

15

_learn_faster_ OP t1_ja6zovh wrote

We have GPUs (e.g. A100s) but can only use one GPU per request (no multi-GPU). We are also willing to take a bit of an accuracy hit.

What do you think would be best for us?

When you say compression, do you mean things like pruning and distillation?
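(For context on what I mean by distillation: something like the standard soft-target loss; a rough sketch, with the names and the temperature value purely illustrative:)

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Student matches the teacher's softened output distribution
    # (soft-target KL, scaled by T^2 as in Hinton et al. 2015).
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)

# Toy usage: batch of 4 examples, 10-way logits.
student_logits = torch.randn(4, 10, requires_grad=True)
teacher_logits = torch.randn(4, 10)  # frozen large model's outputs
loss = distillation_loss(student_logits, teacher_logits)
```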

1

impossiblefork t1_ja6rt6s wrote

I doubt it's possible, but I imagine something like the DAN thing with ChatGPT.

Most likely you'd talk to the AI so that the rationality it has obtained from its training data makes it reason out things that its owner would rather it stay silent about.

1

darthstargazer t1_ja6qchw wrote

Haha, I'm sure you will get some good advice! I would say:
1. Enjoy the trip.
2. If you are aiming for postdocs, it's time to do some networking.
3. Enjoy the free food!
4. If you don't understand much about what other papers are about, don't stress 😊

1

CellWithoutCulture t1_ja6pjet wrote

Seems more like an AskML question.

But RL is for situations where you can't backprop through the loss (the reward signal isn't differentiable). Its gradient estimates are noisier than supervised learning's, so if you can use supervised learning, that's generally what you should use.
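To illustrate "can't backprop the loss": a score-function (REINFORCE) estimator only backprops through the action log-probabilities, never through the reward itself. A minimal sketch, with made-up shapes and names:

```python
import torch

def reinforce_loss(logits, actions, rewards):
    # Score-function (REINFORCE) gradient: the reward is treated as a
    # constant; gradients flow only through the action log-probabilities.
    logp = torch.log_softmax(logits, dim=-1)
    logp_taken = logp.gather(-1, actions.unsqueeze(-1)).squeeze(-1)
    return -(logp_taken * rewards).mean()

# Toy usage: 8 decisions over 5 actions, rewards from a black box.
logits = torch.randn(8, 5, requires_grad=True)
actions = torch.randint(0, 5, (8,))
rewards = torch.randn(8)  # non-differentiable feedback
reinforce_loss(logits, actions, rewards).backward()
```

The rewards enter only as scaling factors, which is exactly why these estimates are noisier than a supervised gradient.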

RL is still used, for example in the recent Gato and DreamerV3, or in training an LLM to use tools like in Toolformer. And also OpenAI's famous RLHF, which stands for reinforcement learning from human feedback. This is what they use to make ChatGPT "aligned", although in reality it doesn't fully get there.

12