Recent comments in /f/MachineLearning
Feeling_Card_4162 OP t1_j742xzl wrote
Reply to comment by blimpyway in [R] Topologically evolving new self-modifying multi-task learning algorithms by Feeling_Card_4162
Sorry, I don't think I understand your question.
blimpyway t1_j742oes wrote
Reply to [D] Why do LLMs like InstructGPT and LLM use RL to instead of supervised learning to learn from the user-ranked examples? by alpha-meta
I guess the point of the reward model is to approximate human feedback: instead of hiring humans to actually rank (e.g.) the 1 billion chats needed to update the LLM, you train a reward model on 1% of them, then use it to simulate human evaluators the other 99% of the time.
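A toy sketch of that sampling idea. Everything here is invented for illustration: a stand-in "human" scorer that prefers longer, polite responses, two hand-picked features, and a tiny Bradley-Terry-style pairwise reward model fit on only ~1% of the comparisons, then used to label the rest.

```python
import math
import random

random.seed(0)

# Hypothetical stand-in for a human rater: prefers longer, polite responses.
def human_prefers(a, b):
    score = lambda r: len(r) + 10 * ("please" in r)
    return score(a) > score(b)

responses = ["hi", "ok", "hello there", "sure thing",
             "please see below", "certainly, please find the details attached"]
pairs = [(random.choice(responses), random.choice(responses)) for _ in range(1000)]
pairs = [(a, b) for (a, b) in pairs if a != b]

# Step 1: pay for human labels on only ~1% of the pairs.
labeled = [(a, b, human_prefers(a, b)) for a, b in pairs[:10]]

# Step 2: fit a tiny Bradley-Terry reward model r(x) = w . features(x).
def feats(r):
    return (float(len(r)), 1.0 if "please" in r else 0.0)

w = [0.0, 0.0]
for _ in range(200):                      # gradient ascent on pairwise log-likelihood
    for a, b, a_wins in labeled:
        fa, fb = feats(a), feats(b)
        diff = [fa[i] - fb[i] for i in range(2)]
        z = sum(w[i] * diff[i] for i in range(2))
        p = 1.0 / (1.0 + math.exp(-z))    # P(a preferred over b)
        for i in range(2):
            w[i] += 0.01 * ((1.0 if a_wins else 0.0) - p) * diff[i]

# Step 3: the fitted reward model stands in for humans on the remaining ~99%.
def model_prefers(a, b):
    fa, fb = feats(a), feats(b)
    return sum(w[i] * (fa[i] - fb[i]) for i in range(2)) > 0

rest = pairs[10:]
agreement = sum(model_prefers(a, b) == human_prefers(a, b)
                for a, b in rest) / len(rest)
```

In this toy setup the model learns to rank almost exactly like the simulated human despite seeing only 10 labels; real reward models do the same thing with a neural network over response text instead of two hand-crafted features.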
ggf31416 t1_j741sxn wrote
Reply to [D] PC takes a long time to execute code, possibility to use a cloud/external device? by Emergency-Dig-5262
One possibility is GPU acceleration using the cuML framework, but if you must use a specific framework like sklearn, it won't be feasible. https://medium.com/rapids-ai/accelerating-random-forests-up-to-45x-using-cuml-dfb782a31bea
There are some alternatives such as Google, AWS, and Gradient, and you may be able to get student credits. Also, even if you don't need a GPU, you can rent an instance with many CPU cores at Vast.ai for cheap (even with the GPU it's cheaper than a CPU-only AWS instance with the same number of cores); for example, the cheapest instance with 16 vCPUs is < $0.20/hour and only needs a credit card. The main issue with Vast.ai is that you should save your results before shutting down the instance, because they are tied to the machine, which may become unavailable.
blimpyway t1_j73zt6s wrote
Reply to [R] Topologically evolving new self-modifying multi-task learning algorithms by Feeling_Card_4162
So what do they (it?) evolve for?
CatalyzeX_code_bot t1_j73ydy5 wrote
Reply to [R] Topologically evolving new self-modifying multi-task learning algorithms by Feeling_Card_4162
Found relevant code at https://github.com/marsggbo/automl_a_survey_of_state_of_the_art + all code implementations here
[deleted] t1_j73pvyi wrote
Reply to comment by fuscarili in [D] I'm at a crossroads: Bayesian methods VS Reinforcement Learning, which to choose? by fuscarili
[removed]
Jurph t1_j73ozbe wrote
Reply to comment by [deleted] in [D] Understanding Vision Transformer (ViT) - What are the prerequisites? by SAbdusSamad
Hey, I dove into "Progressive Growing of GANs" without knowing what weights were. And now here I am, four or five years later. I've trained my own classifiers based on ViTs and DNNs, written Python interfaces for them, and I'm working on tooling to make Automatic1111's GUI behave better with Stable Diffusion. We've all got to start somewhere.
DM-me-ur-tits-plz- t1_j73n2dw wrote
Reply to comment by anananananana in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
When they originally went closed-source they claimed it was because of the dangers that being open-sourced presented.
About a year later they dropped their non-profit status and sold out to Microsoft.
Love the company, but that's some crazy doublespeak there.
mostlyhydrogen OP t1_j73k4xe wrote
Reply to comment by YOLOBOT666 in [D] Querying with multiple vectors during embedding nearest neighbor search? by mostlyhydrogen
Not exactly. I have millions of points, most of which are unrelated to my query vectors. I want to iteratively refine my search: search, mark results as "relevant" or "irrelevant", then repeat the search with an updated query.
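One classic way to fold "relevant"/"irrelevant" marks back into the query is Rocchio-style relevance feedback: move the query vector toward the centroid of relevant results and away from the irrelevant ones, then search again. A minimal sketch with brute-force cosine search and made-up 2-D points (the α/β/γ weights are the conventional defaults, not anything from the thread):

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u)) or 1.0
    nv = math.sqrt(sum(b * b for b in v)) or 1.0
    return dot / (nu * nv)

def search(query, points, k=3):
    # Brute-force nearest neighbors by cosine similarity (stand-in for an ANN index).
    return sorted(points, key=lambda p: -cosine(query, p))[:k]

def rocchio(query, relevant, irrelevant, alpha=1.0, beta=0.75, gamma=0.25):
    # Shift the query toward relevant results and away from irrelevant ones.
    dim = len(query)
    def mean(vs):
        return [sum(v[i] for v in vs) / len(vs) if vs else 0.0 for i in range(dim)]
    mr, mi = mean(relevant), mean(irrelevant)
    return [alpha * query[i] + beta * mr[i] - gamma * mi[i] for i in range(dim)]

points = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.7, 0.7]]
q = [1.0, 0.5]
hits = search(q, points, k=3)

# User marks the results clustered near [1, 0] as relevant, the rest irrelevant.
relevant = [p for p in hits if p[0] > p[1]]
irrelevant = [p for p in hits if p[0] <= p[1]]

q2 = rocchio(q, relevant, irrelevant)
hits2 = search(q2, points, k=3)   # updated query now favors the [1, 0] cluster
```

Here the diagonal point [0.7, 0.7] is the top hit of the first search but drops to last place after one round of feedback, which is the iterative refinement loop described above.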
[deleted] t1_j73h9fl wrote
[removed]
asarig_ OP t1_j73g1ne wrote
Reply to comment by gdpoc in [R] Graph Mixer Networks by asarig_
Thanks for your interest. If you open an issue on GitHub about this, it will serve as a reminder, and I can share pre-trained weights at the appropriate time.
juanigp t1_j73a6z4 wrote
Reply to comment by nicholsz in [D] Understanding Vision Transformer (ViT) - What are the prerequisites? by SAbdusSamad
That was just my two cents: self-attention is a bunch of matrix multiplications, and it's 12 layers of the same thing, so it makes sense to understand why QK^T. If the question had been how to understand Mask R-CNN, the answer would have been different.
Edit: 12 layers in ViT-Base / BERT-Base
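The "bunch of matrix multiplications" view can be written out in a few lines. A minimal single-head sketch with toy 2-token inputs (no learned projections, batching, or masking): scores = QK^T, scale by sqrt(d_k), softmax each row, then a weighted sum of V.

```python
import math

def softmax(row):
    m = max(row)                              # subtract max for numerical stability
    exps = [math.exp(x - m) for x in row]
    s = sum(exps)
    return [e / s for e in exps]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def self_attention(Q, K, V):
    d_k = len(K[0])
    scores = matmul(Q, transpose(K))                          # QK^T
    scaled = [[s / math.sqrt(d_k) for s in row] for row in scores]
    weights = [softmax(row) for row in scaled]                # row-wise softmax
    return matmul(weights, V)                                 # weighted sum of values

# Toy example: 2 tokens, d_k = 2.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
out = self_attention(Q, K, V)
```

Each output row is a convex combination of the rows of V (the softmax weights sum to 1), which is the whole mechanism; a ViT-Base/BERT-Base encoder is 12 stacked layers of this plus MLPs and residual connections.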
[deleted] t1_j739rea wrote
Reply to comment by ISitAndWatch in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
They just need to call it "money hype train, we will fix it as we go." If a company ran that honest PR campaign I'd be a customer.
[deleted] t1_j739mc4 wrote
Reply to [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
CLIPPY MAKES HIS GLORIOUS RETURN!!?!?!!!!
ALL HAIL CLIPPY THE AI SENTIENT SUPER GOD
seattleite849 OP t1_j737hp8 wrote
Reply to comment by swappybizz in [p] I built an open source platform to deploy computationally intensive Python functions as serverless jobs, with no timeouts by seattleite849
I got a bunch of credits from cloud hosting providers, haha. Also, since this is a beta, I wanted a generous free tier. To connect with banana.dev, you'd need to sign up for your own account and pass your API key into the Python function that runs on Cakework.
swappybizz t1_j736x59 wrote
Reply to comment by seattleite849 in [p] I built an open source platform to deploy computationally intensive Python functions as serverless jobs, with no timeouts by seattleite849
Wow! How do you manage to stay afloat?
[deleted] t1_j736sam wrote
seattleite849 OP t1_j736n71 wrote
Reply to comment by swappybizz in [p] I built an open source platform to deploy computationally intensive Python functions as serverless jobs, with no timeouts by seattleite849
We are spinning up the serverless GPU hosting the model using banana.dev, btw (which I've really liked so far). Cakework spins up CPU-only microVMs for now, since the Firecracker virtual machine monitor runs only on CPUs.
swappybizz t1_j735029 wrote
Reply to comment by seattleite849 in [p] I built an open source platform to deploy computationally intensive Python functions as serverless jobs, with no timeouts by seattleite849
You have a sign-up!
seattleite849 OP t1_j734kwb wrote
Reply to comment by swappybizz in [p] I built an open source platform to deploy computationally intensive Python functions as serverless jobs, with no timeouts by seattleite849
Yup, that’s one of our examples! You can run this project to run a stable diffusion model on a serverless GPU: https://github.com/usecakework/cakework/tree/main/examples/image_generation
fermangas t1_j734867 wrote
Reply to comment by SimonJDPrince in [D] Understanding Vision Transformer (ViT) - What are the prerequisites? by SAbdusSamad
I was going to recommend this book. You beat me to it.
swappybizz t1_j7345xf wrote
gdpoc t1_j7337zm wrote
Reply to comment by asarig_ in [R] Graph Mixer Networks by asarig_
That is fascinating work.
I'd like to read the paper and will when I have the time; are the results promising?
It seems reasonable that a graph with a small branching factor could replicate the logarithmic search complexity of the input space, at least to some extent; I'm very interested in exploring this space.
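As a made-up illustration of that intuition (a 1-D skip-list analogy, not anything from the paper): give each node of a sorted point set links at exponentially spaced offsets, so the out-degree grows only logarithmically with n, and greedy descent toward a query reaches the nearest point in O(log n) hops because each hop at least halves the remaining distance.

```python
import math

# Toy 1-D "navigable graph": nodes are evenly spaced sorted points; each node
# links to neighbors at exponentially spaced index offsets (skip-list style).
n = 1024
points = [i / n for i in range(n)]

def neighbors(i):
    out = set()
    h = 1
    while h < n:                   # offsets 1, 2, 4, ..., 512 -> out-degree O(log n)
        if i - h >= 0:
            out.add(i - h)
        if i + h < n:
            out.add(i + h)
        h *= 2
    return out

def greedy_search(query, start=0):
    """Hop to whichever neighbor is closest to the query until no neighbor improves."""
    cur, steps = start, 0
    while True:
        best = min(neighbors(cur) | {cur}, key=lambda j: abs(points[j] - query))
        if best == cur:
            return cur, steps
        cur, steps = best, steps + 1

idx, steps = greedy_search(0.75)   # reaches index 768 in a handful of hops
```

On this regular structure the hop count is bounded by log2(n) + 1; navigable small-world indexes (e.g. HNSW) get similar behavior empirically on real high-dimensional data with a bounded per-node degree.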
seattleite849 OP t1_j732oth wrote
Reply to comment by Noddybear in [p] I built an open source platform to deploy computationally intensive Python functions as serverless jobs, with no timeouts by seattleite849
🙌 heck yeah!
blimpyway t1_j744ina wrote
Reply to comment by Feeling_Card_4162 in [R] Topologically evolving new self-modifying multi-task learning algorithms by Feeling_Card_4162
If you use an evolutionary algorithm like NEAT, what is the selection criterion?