chaosmosis t1_j4fpg6g wrote on January 15, 2023 at 11:30 AM

Reply to comment by jimmymvp in Why is Super Learning / Stacking used rather rarely in practice? [D] by Worth-Advance-1232

I'd love the reference if you can find it.

ml-research t1_j4fpav0 wrote on January 15, 2023 at 11:28 AM

Reply to [P] I built arxiv-summary.com, a list of GPT-3 generated paper summaries by niclas_wue

Thanks for sharing!

> The website works by fetching new papers daily from arxiv.org, using PapersWithCode to filter out the most relevant ones.

What do you mean by "relevant"? What kinds of papers do you fetch?

IcySnowy t1_j4filbx wrote on January 15, 2023 at 9:58 AM

Reply to [D] Has ML become synonymous with AI? by Valachio

I agree. At first I thought AI consisted of just ML and DL, but after reading the book Artificial Intelligence by Oxford I realized that AI is a much bigger field than that.

derpderp3200 t1_j4fhekc wrote on January 15, 2023 at 9:42 AM

Reply to comment by Emphasises_Words in [R] from a human motion sequence, SUMMON synthesizes physically plausible and semantically reasonable objects by t0ns0fph0t0ns

I know, but the jitter combined with the colorless subsurface-scattered model makes this look like straight-up David Lewandowski videos.

Emphasises_Words t1_j4fd7k6 wrote on January 15, 2023 at 8:46 AM

Reply to comment by derpderp3200 in [R] from a human motion sequence, SUMMON synthesizes physically plausible and semantically reasonable objects by t0ns0fph0t0ns

The movements are motion capture, not generated. They are performed by a real human.

jimmymvp t1_j4fcjly wrote on January 15, 2023 at 8:37 AM

Reply to comment by chaosmosis in Why is Super Learning / Stacking used rather rarely in practice? [D] by Worth-Advance-1232

Hm, I'm not sure about that. There's the mixture of experts idea, which is not exactly stacking, but rather specializes multiple models to parts of the data so that each data point gets assigned to a specific shallow model. What you need then is an assignment rule, mostly done by a classifier, and it's been shown that this is cheaper in terms of compute at evaluation time. I'm not sure if the idea is abandoned by now, but Google Brain published a paper on this and there were subsequent works.
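
To make the hard-assignment idea concrete, here is a minimal sketch in plain Python. It is not taken from any particular paper: the fixed `route` rule stands in for a trained classifier, the "shallow models" are least-squares lines, and all names and the toy data are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on one expert's share of the data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return a, my - a * mx


def route(x):
    # Hypothetical assignment rule. In practice this would be a learned
    # classifier (the "gate"); a fixed threshold keeps the sketch short.
    return 0 if x < 0 else 1


# Toy data: y = |x|, which no single line can fit well.
xs = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
ys = [abs(x) for x in xs]

# Each expert is trained only on the points routed to it.
experts = []
for k in (0, 1):
    part = [(x, y) for x, y in zip(xs, ys) if route(x) == k]
    experts.append(fit_line([p[0] for p in part], [p[1] for p in part]))


def predict(x):
    # Only the selected expert is evaluated, which is where the
    # evaluation-time compute saving comes from.
    a, b = experts[route(x)]
    return a * x + b
```

In stacking, every base model would be evaluated and a meta-learner would combine their outputs. Here only one expert runs per input.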

derpderp3200 t1_j4fa4wp wrote on January 15, 2023 at 8:05 AM

Reply to [R] from a human motion sequence, SUMMON synthesizes physically plausible and semantically reasonable objects by t0ns0fph0t0ns

Love how natural and human those movements seem.

Naive-Progress4549 t1_j4f9x8o wrote on January 15, 2023 at 8:02 AM

Reply to [D]: Are there models like CODEX but work in a reversed way? by GoodluckH

Professor Romain Robbes has a research group focusing on this. You might look at his papers or contact him!

MrHumun t1_j4f8i6q wrote on January 15, 2023 at 7:44 AM

Reply to comment by Apprehensive-Tax-214 in [P] Built an at-cost, pay per second, open-source API for Tortoise text-to-speech (best I've heard!) by Apprehensive-Tax-214

Ah, so you guys are running a GPU server 24x7? Isn't the cost too high that way? What do you do when there's no traffic?

nullbyte420 t1_j4f72e8 wrote on January 15, 2023 at 7:26 AM

Reply to comment by NoPause9252 in [D] Is MusicGPT a viable possibility? by markhachman

Guy doesn't know anything about it. There are many famous copyright lawsuits in music. Chuck Berry vs. The Beatles is a cool one, I think. Lana Del Rey vs. someone I can't remember is a more recent case 🙂 I'm sure you can find a list of famous copyright cases in music.

eyeofthephysics t1_j4f2w85 wrote on January 15, 2023 at 6:36 AM

Reply to comment by IamTimNguyen in [R] Greg Yang's work on a rigorous mathematical theory for neural networks by IamTimNguyen

>u/IamTimNguyen

Hi Tim, just to add on to your comment, Sho Yaida (one of the co-authors of PDLT) also wrote a paper on the various infinite width limits of neural nets, https://arxiv.org/abs/2210.04909. He was able to construct a family of infinite width limits and show that in some of them there is representation learning (and he also found agreement with Greg's existing work).

m98789 t1_j4f27pu wrote on January 15, 2023 at 6:28 AM

Reply to comment by WigglyHypersurface in [D] Is there any reason hugging face GPT2 would behave (fundamentally) differently from GPT-Neo? by GasZealousideal8691

Tokenizer is also my guess

WigglyHypersurface t1_j4f1r8b wrote on January 15, 2023 at 6:23 AM

Reply to [D] Is there any reason hugging face GPT2 would behave (fundamentally) differently from GPT-Neo? by GasZealousideal8691

Did you forget to change the tokenizer?

m98789 t1_j4f135j wrote on January 15, 2023 at 6:16 AM

Reply to comment by GoodluckH in [D]: Are there models like CODEX but work in a reversed way? by GoodluckH

Got it, this is how I believe it was implemented:

Stage 0: All code was split into chunks and had their embeddings taken, and saved into one table for lookups, e.g., code in one field and embedding in the adjacent field.
Stage 1: semantic search to find code. Take your query and encode it into an embedding. Then apply dot product over all the code embeddings in the table to find semantically similar code chunks.
Stage 2: combine all the top-K similar chunks into one string or list we can call the “context”.
Stage 3: stuff the context into a prompt as a preamble, then append the actual question you want to ask.
Stage 4: send the prompt to an LLM like GPT-3, collect the answer, and show it to the user.
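
The stages above can be sketched roughly like this. It's only a guess at the general pattern, not the actual implementation of any product: `embed` is a toy bag-of-words stand-in for a real embedding model, every function name is made up, and Stage 4 (the LLM call) is left out because it depends on whichever API you use.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: L2-normalised word counts."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(v * v for v in counts.values())) or 1.0
    return {tok: v / norm for tok, v in counts.items()}


def dot(a, b):
    # Dot product of two sparse vectors stored as dicts.
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


def build_index(chunks):
    # Stage 0: store each code chunk next to its embedding.
    return [(chunk, embed(chunk)) for chunk in chunks]


def top_k(query, index, k=2):
    # Stage 1: embed the query and rank all chunks by dot product.
    q = embed(query)
    ranked = sorted(index, key=lambda row: dot(q, row[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, chunks):
    # Stages 2-3: join the top chunks into a context preamble,
    # then append the actual question.
    context = "\n\n".join(chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


chunks = [
    "def add(x, y): return x + y",
    "def read_file(path): return open(path).read()",
    "def parse_json(text): return json.loads(text)",
]
index = build_index(chunks)
question = "How do I read a file from a path?"
hits = top_k(question, index, k=1)
prompt = build_prompt(question, hits)
# Stage 4 would send `prompt` to the LLM of your choice and show the reply.
```

The retrieval step only picks which chunks to show the model. Whatever comes back as an explanation is produced by the LLM in Stage 4.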

GoodluckH OP t1_j4evyhl wrote on January 15, 2023 at 5:23 AM

Reply to comment by m98789 in [D]: Are there models like CODEX but work in a reversed way? by GoodluckH

https://twitter.com/hwchase17/status/1611071272301260801?s=20&t=WFa0awEG43KTXfwV-Mb49Q

m98789 t1_j4eutfz wrote on January 15, 2023 at 5:13 AM

Reply to comment by GoodluckH in [D]: Are there models like CODEX but work in a reversed way? by GoodluckH

Can you please link me to the tweet you are referring to?

From my understanding, Q&A with LangChain can answer “what” questions like “What did XYZ say…” but not “why” questions, because the “what” questions are really just text similarity searching.

But maybe there is more to it, so I’d like to see the tweet.

GoodluckH OP t1_j4espcc wrote on January 15, 2023 at 4:53 AM

Reply to comment by m98789 in [D]: Are there models like CODEX but work in a reversed way? by GoodluckH

Wow, that's really cool. But I can actually ask things like "what does XYZ do?", and it can give me some explanations like ChatGPT.

Clearly, they are using more than OpenAI's embeddings to make this possible. I read on Twitter that GPTDuck also uses LangChain, which I'm not so familiar with.

Any idea how they're able to go from advanced search to conversational?

Thank you for your insight!

throwaway2676 t1_j4eqxk2 wrote on January 15, 2023 at 4:39 AM

Reply to comment by MegavirusOfDoom in [D] Simple Questions Thread by AutoModerator

GPT-4 should be coming out, right?

GasZealousideal8691 OP t1_j4eo0ov wrote on January 15, 2023 at 4:17 AM

Reply to comment by CKtalon in [D] Is there any reason hugging face GPT2 would behave (fundamentally) differently from GPT-Neo? by GasZealousideal8691

Oh sorry if that wasn’t clear, but the stuff I’m training on isn’t code, it’s natural language.

CKtalon t1_j4enpew wrote on January 15, 2023 at 4:14 AM

Reply to [D] Is there any reason hugging face GPT2 would behave (fundamentally) differently from GPT-Neo? by GasZealousideal8691

GPT2 was trained on a different dataset, with little code (other than whatever was obtained from the CommonCrawl). GPT Neo uses The Pile, which contains a lot of code.

H_P_D t1_j4e908a wrote on January 15, 2023 at 2:21 AM

Reply to [D] Is MusicGPT a viable possibility? by markhachman

It's definitely a viable possibility, and there are quite a few companies already doing it. If you want to explore doing it yourself, I'd check out https://web.mit.edu/music21/ and build some basic models using LSTMs etc. to have some fun with open source MIDI data sets like https://magenta.tensorflow.org/datasets/maestro .