Recent comments in /f/MachineLearning
UnusualClimberBear t1_j7pdue6 wrote
Reply to comment by EmbarrassedFuel in Model/paper ideas: reinforcement learning with a deterministic environment [D] by EmbarrassedFuel
This is because the information is in the books.
(free online) http://www.cds.caltech.edu/~murray/amwiki/index.php/Main_Page
https://www.amazon.com/Modern-Control-Systems-12th-Edition/dp/0136024580
Yet nonlinearity breaks everything there. The usual approach is to linearize around well-chosen operating points and compute the control using the closest linearization.
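A minimal numpy sketch of the linearization step, using finite differences to get local (A, B) matrices from nonlinear dynamics. The `pendulum` dynamics here are just a placeholder example, not from the thread:

```python
import numpy as np

def linearize(f, x0, u0, eps=1e-5):
    """Numerically linearize dx/dt = f(x, u) around the operating point (x0, u0)."""
    n, m = len(x0), len(u0)
    A = np.zeros((n, n))
    B = np.zeros((n, m))
    f0 = f(x0, u0)
    for i in range(n):
        dx = np.zeros(n); dx[i] = eps
        A[:, i] = (f(x0 + dx, u0) - f0) / eps  # partial derivatives w.r.t. state
    for j in range(m):
        du = np.zeros(m); du[j] = eps
        B[:, j] = (f(x0, u0 + du) - f0) / eps  # partial derivatives w.r.t. input
    return A, B

# Toy example: inverted pendulum, linearized at the upright equilibrium
def pendulum(x, u):
    theta, omega = x
    return np.array([omega, np.sin(theta) + u[0]])

A, B = linearize(pendulum, np.array([0.0, 0.0]), np.array([0.0]))
```

At runtime you would keep a small library of such (A, B) pairs and apply the controller designed for the linearization closest to the current state (gain scheduling).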
FHIR_HL7_Integrator t1_j7pcenv wrote
Reply to comment by NoSleep19 in [D] Should I focus on python or C++? by NoSleep19
I wouldn't worry so much about how much you need to know to be good. Sometimes you just need to know something well enough to get a job done, which is how I think of Python. The reality is that Python is a general-purpose language: it's useful in about a million different ways, and it's really not a difficult language. Everyone should have Python in their toolbox. So just start using Python. Use it for fun little projects that aren't necessarily school work. C++ is probably going to be the more difficult of the two. C++ is what you'll want when you're writing anything that needs to be really fast.
I'd take the classes in C++ and teach myself python through fun projects
Someone who is skilled with C++ is an asset and often a useful and desirable member of a research or implementation team.
wonderingandthinking OP t1_j7pbiy7 wrote
Reply to comment by Dr_Love2-14 in [Discussion] Is ChatGPT and/or OpenAI really the leader in the space? by wonderingandthinking
Purposefully vaguely defined so that I'd increase the chances of getting an answer like this. Thanks for the info. And don't underestimate or undervalue something that appears not well thought out or developed.
nmfisher t1_j7pawou wrote
Reply to comment by gunshoes in [D] Which is the fastest and lightweight ultra realistic TTS for real-time voice cloning? by akshaysri0001
That's why I added the qualifier "good" :)
gunshoes t1_j7p91py wrote
Reply to comment by nmfisher in [D] Which is the fastest and lightweight ultra realistic TTS for real-time voice cloning? by akshaysri0001
You can throw in GSTs (global style tokens) or use a speaker embedding to influence the energy/pitch outputs. The sound is meh, but it works.
[deleted] t1_j7p8yxl wrote
[deleted]
sodafizzer77 t1_j7p74jr wrote
Reply to [N] Microsoft announces new "next-generation" LLM, will be integrated with Bing and Edge by currentscurrents
Ha ha ha ha ha...wow the power of bing & edge......dude Microsoft stop. you lost.....
NoSleep19 OP t1_j7p57ga wrote
Reply to comment by FHIR_HL7_Integrator in [D] Should I focus on python or C++? by NoSleep19
Just one question: how much Python do I need to know to be considered good? Does that mean every popular feature plus design patterns? And vice versa for C++?
IamNotMike25 t1_j7p535w wrote
Reply to comment by Mescallan in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
The ChatGPT playground is (currently) free, and the ChatGPT API comes with $18 of free credits.
EmbarrassedFuel OP t1_j7p519o wrote
Reply to comment by jimmymvp in Model/paper ideas: reinforcement learning with a deterministic environment [D] by EmbarrassedFuel
Basically, given some predicted environment state going forward for, say, 100 time steps, we need to find a cost-optimal course of action. Although the environment state has been predicted, for the purposes of this task the agent can consider it deterministic. The agent has one variable of internal state and can take actions to increase or decrease this value based on interactions with the environment. We can then calculate the cost over the given time horizon by simulating the actions chosen at each step, but this simulation is fundamentally sequential and wouldn't allow backpropagation of gradients.
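The sequential rollout being described can be sketched as below. The `transition` and `step_cost` functions are hypothetical stand-ins for the problem-specific dynamics; the battery-style toy example is my own, not from the thread:

```python
def rollout_cost(env_states, actions, s0, step_cost, transition):
    """Evaluate the cost of an action sequence by sequential simulation.

    env_states: predicted (deterministic) environment state, one per step
    actions:    chosen action per step
    s0:         initial internal state (a single variable here)
    """
    s, total = s0, 0.0
    for e, a in zip(env_states, actions):
        s = transition(s, a, e)      # internal state update; depends on previous step
        total += step_cost(s, a, e)  # accumulate cost; inherently sequential
    return total

# Toy example: internal state is a level the agent charges toward a target of 1.0
cost = rollout_cost(
    env_states=[1.0] * 100,
    actions=[+0.1] * 100,
    s0=0.0,
    step_cost=lambda s, a, e: (s - e) ** 2,
    transition=lambda s, a, e: s + a,
)
```

The loop-carried dependence on `s` is exactly what blocks naive parallel evaluation, though gradients could still flow through a rollout like this if `transition` and `step_cost` were differentiable.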
>you can go with sampling approaches
What exactly do you mean by this? Something like REINFORCE?
> I guess it is if you're using a MILP approach.
Not sure I follow here, but I'm not using a MILP (as in a mixed-integer linear program). At the moment I'm using a linear programming approximation plus heuristics, which doesn't generalize well.
> some combination of MCTS with value function learning
I think this could work; however, without looking into it further I'm not sure it would be feasible at inference time in my resource-constrained setting.
MonsieurBlunt t1_j7p4o0j wrote
Neural networks were a successful idea.
C_l3b OP t1_j7p45n8 wrote
Reply to comment by sonofmath in [D] List of RL Papers by C_l3b
Thanks!
EmbarrassedFuel OP t1_j7p40eo wrote
Reply to comment by UnusualClimberBear in Model/paper ideas: reinforcement learning with a deterministic environment [D] by EmbarrassedFuel
Oh, also the model needs to run at inference time in a relatively short period of time on cheap hardware :)
EmbarrassedFuel OP t1_j7p3xc1 wrote
Reply to comment by UnusualClimberBear in Model/paper ideas: reinforcement learning with a deterministic environment [D] by EmbarrassedFuel
I haven't been able to find anything about optimal control with all of:
- non-linear dynamics/model
- non-linear constraints
- both discrete and continuously parameterized actions in the output space
But in general, discovery of papers/techniques in control theory seems to be much harder for some reason.
emerging-tech-reader t1_j7p3gn4 wrote
Reply to comment by WokeAssBaller in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
> Please without the transformer we would never be able to scale,
Without backpropagation we wouldn't have transformers. 🤷♂️
spacetimefrappachino t1_j7p14z7 wrote
Reply to comment by sonofmath in [D] List of RL Papers by C_l3b
This is an incredible resource, thank you!
Impossible-Manager-7 t1_j7p13qz wrote
Reply to [D] What do you think about this 16 week curriculum for existing software engineers who want to pursue AI and ML? by Imaginary-General687
Where’s this course being offered?
mikljohansson t1_j7p0o1o wrote
Reply to comment by ramv0001 in [Discussion] Best practices for taking deep learning models to bare metal MCUs by ramv0001
Nope, haven't used any emulators for this project. The ESP32 hardware I've been using is so cheap and convenient that there's been no need.
sonofmath t1_j7p0ml4 wrote
Reply to [D] List of RL Papers by C_l3b
Not up to date, but a solid basis is Spinning Up.
DustinKli t1_j7p03xt wrote
Reply to [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
I'm still waiting for GPT to be integrated into EXCEL.
None_365 t1_j7oyr25 wrote
Reply to [D] List of RL Papers by C_l3b
You should create a GitHub repo first.
jimmymvp t1_j7oybk9 wrote
Reply to Model/paper ideas: reinforcement learning with a deterministic environment [D] by EmbarrassedFuel
Ok, first off, I'm very curious what's the actual problem that you're solving. Can you describe it a bit more in detail or give a link?
If you have a perfect model that's cheap to compute, you can go with sampling approaches; I don't know what your constraints look like, though. If your state/action space is too big, you might want to reduce it somehow by learning an embedding.
Is the model differentiable? I guess it is if you're using a MILP approach.
I guess some combination of MCTS with value-function learning is plausible if your search space is big, as is done with AlphaZero etc. I find the hybrid aspect of it very interesting, though. It sounds like if you want to do amortized search, you need to combine MCTS with search in continuous space (sampling). That should be simple enough with a perfect model. Some ideas from MuZero would probably come in handy.
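One minimal sketch of the "sampling approaches with a perfect model" idea is random shooting: sample candidate action sequences, score each with the deterministic model, keep the best. The `simulate_cost` interface and the toy cost are hypothetical illustrations, assuming discrete actions:

```python
import numpy as np

def random_shooting(simulate_cost, horizon, n_samples=1000, n_actions=3, rng=None):
    """Plan by sampling action sequences and scoring them with a perfect model."""
    rng = rng or np.random.default_rng(0)
    best_seq, best_cost = None, np.inf
    for _ in range(n_samples):
        seq = rng.integers(0, n_actions, size=horizon)  # random candidate plan
        c = simulate_cost(seq)                          # sequential rollout cost
        if c < best_cost:
            best_seq, best_cost = seq, c
    return best_seq, best_cost

# Toy cost that prefers action 1 at every step
seq, cost = random_shooting(
    lambda s: float(np.sum((s - 1) ** 2)), horizon=4, n_actions=2
)
```

Replacing the uniform sampler with an iteratively refitted distribution gives the cross-entropy method, and replacing it with a learned policy prior plus tree search moves toward the AlphaZero/MuZero style.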
C_l3b OP t1_j7oy3z4 wrote
Reply to comment by JuryOk5543 in [D] List of RL Papers by C_l3b
I took courses about Machine Learning and Deep Learning at uni.
JuryOk5543 t1_j7oxtz8 wrote
Reply to [D] List of RL Papers by C_l3b
That kind of depends where you're starting from. What level are you at now?
mr_house7 t1_j7pgmqm wrote
Reply to comment by sonofmath in [D] List of RL Papers by C_l3b
This is really good, thanks!