Recent comments in /f/MachineLearning
KBM_KBM t1_j4j10y6 wrote
You can pre-train and fine-tune energy-efficient language models such as ELECTRA or ConvBERT on this GPU. The batch size probably can't be very large, though, so the gradient descent will be a bit noisy. Also keep the corpus as small as possible.
Look into the BioELECTRA paper, which also comes with a notebook showing how the author trained it.
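To make the "small batch means noisy descent" point concrete, here is a rough sketch in plain Python. It is not taken from the ELECTRA or BioELECTRA code; the 1-D linear model, the synthetic data, and the helper names `minibatch_gradient` and `gradient_spread` are all made up for illustration. It only measures how much the mini-batch gradient varies from batch to batch at different batch sizes.

```python
import random
import statistics


def minibatch_gradient(w, batch):
    """Gradient of the mean squared error for the 1-D model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def gradient_spread(data, w, batch_size, trials=500, seed=0):
    """Standard deviation of the mini-batch gradient across random batches."""
    rng = random.Random(seed)
    grads = [
        minibatch_gradient(w, rng.sample(data, batch_size))
        for _ in range(trials)
    ]
    return statistics.stdev(grads)


# Synthetic data (an assumption for this demo): y = 3x + Gaussian noise.
rng = random.Random(42)
data = []
for _ in range(2000):
    x = rng.uniform(-1, 1)
    data.append((x, 3 * x + rng.gauss(0, 1)))

# Smaller batches should show a visibly larger spread in the gradient.
for batch_size in (4, 32, 256):
    spread = gradient_spread(data, w=0.0, batch_size=batch_size)
    print(batch_size, round(spread, 3))
```

The spread shrinks as the batch grows, which is the noise the comment is warning about when the GPU memory only fits a small batch.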
akacukiii t1_j4ixh08 wrote
Reply to [D] Simple Questions Thread by AutoModerator
Hi. I'm an international grad student in the US and am looking for an internship for the summer. Please, if you have some tips, or if you care to have a look at my profile, just let me know. Thank you!
derekis1joedirt t1_j4ix32w wrote
Reply to comment by Iunaml in [P] I built arxiv-summary.com, a list of GPT-3 generated paper summaries by niclas_wue
I've been wanting to build a Python script to do just what you mentioned.
boo5000 t1_j4ip4x6 wrote
Reply to comment by Reasonable_Ladder922 in [P] I built arxiv-summary.com, a list of GPT-3 generated paper summaries by niclas_wue
Good GPT-3 bot.
(lmao check out the comment history)
currentscurrents t1_j4ijvez wrote
Reply to comment by RuairiSpain in [P] I built arxiv-summary.com, a list of GPT-3 generated paper summaries by niclas_wue
A Snappy Headline Is All You Need
currentscurrents t1_j4ijlqv wrote
You can fine-tune image generator models and some smaller language models.
You can also do tasks that don't require super large models, like image recognition.
>that's beyond just some toy experiment?
Don't knock toy experiments too much! I'm having a lot of fun trying to build a differentiable neural computer or memory-augmented network in PyTorch.
Unlikely-Advice-7168 t1_j4igq4x wrote
Reply to [P] Built an at-cost, pay per second, open-source API for Tortoise text-to-speech (best I've heard!) by Apprehensive-Tax-214
Is anyone else getting this error, where it redirects here:
https://tts.themetavoice.xyz/?error=server_error&error_description=Database+error+saving+new+user
T1fa_nug t1_j4idpif wrote
Reply to [D] Simple Questions Thread by AutoModerator
Hello guys, I'm new to machine learning and I wanted to know whether an 8th-gen i5 and a 1060 6 GB paired with 16 GB of RAM are enough for any type of work that could come my way?
RuairiSpain t1_j4ib35z wrote
Reply to comment by niclas_wue in [P] I built arxiv-summary.com, a list of GPT-3 generated paper summaries by niclas_wue
Wow, looking forward to seeing more cool stuff
niclas_wue OP t1_j4i9r9w wrote
Reply to comment by RuairiSpain in [P] I built arxiv-summary.com, a list of GPT-3 generated paper summaries by niclas_wue
Thanks for your ideas. Building a paid experience for companies is a great idea, I will consider it.
Category tagging like "computer vision", "natural language processing" etc. should be relatively straightforward. Will implement this in the next couple of days :)
More paper-specific tags could be generated using GPT-3. I think that would make sense once the database is a bit larger. Right now, I would guess that most tags would be unique to a single paper.
RuairiSpain t1_j4i4t54 wrote
Reply to comment by niclas_wue in [P] I built arxiv-summary.com, a list of GPT-3 generated paper summaries by niclas_wue
I left academia in the 1990s. When did paper titles become so vague? "In my day", you had a good idea what the paper was about just from the title. Reading the first 30-40 papers here, what are authors trying to do? Be comedians?
I need a more up-to-date buzzword thesaurus of research fields and fashions, so I can interpret the context/semantics of these titles! I feel old.
RuairiSpain t1_j4i31v5 wrote
Reply to comment by RuairiSpain in [P] I built arxiv-summary.com, a list of GPT-3 generated paper summaries by niclas_wue
Feedback: any way you could add category tagging to papers, and have links to related tags?
RuairiSpain t1_j4i2q1b wrote
I suspect companies would pay a subscription for this! Individuals no, especially me ;)
Great work, thank you.
jimmymvp t1_j4hy6xm wrote
Ok-Range1608 t1_j4hs70o wrote
Currently none are available. I am eagerly waiting for them too. https://medium.com/p/c24f1b22a235
junetwentyfirst2020 t1_j4hqfvl wrote
Reply to comment by blacksnowboader in Is a MSc in ML and industry experience enough for an ML Research Engineer position? [D] by 400Volts
Stanford University course taught by Andrej Karpathy. It's a little older now, but I do think it covers important material. You can find it on YouTube.
blacksnowboader t1_j4hpx5o wrote
Reply to comment by junetwentyfirst2020 in Is a MSc in ML and industry experience enough for an ML Research Engineer position? [D] by 400Volts
Quick question, what is CS231N? And where is that course taken?
GasZealousideal8691 OP t1_j4hk6kz wrote
Reply to comment by WigglyHypersurface in [D] Is there any reason hugging face GPT2 would behave (fundamentally) differently from GPT-Neo? by GasZealousideal8691
I'm fairly certain it's something with the model. Even fine-tuning is giving these weird errors, when it had no problems with GPT-Neo.
We also ran this on T5. Obviously we had to configure the rest of the code differently, but it was doing fine there as well.
JobPsychological5509 t1_j4hjk93 wrote
Reply to comment by I-am_Sleepy in [D] Simple Questions Thread by AutoModerator
Thanks. Time series classification is what I was looking for.
Thanks for letting me know about model stacking, I have never done it but will try and see which fits the best.
Cheers!
GoodluckH OP t1_j4hiypg wrote
Reply to comment by m98789 in [D]: Are there models like CODEX but work in a reversed way? by GoodluckH
Ahh this makes a lot of sense. Regarding stage 0, how do you split the code? Just by lines, or do you have some method to extract functions and classes?
I wrote a script that extracts Python functions using regex, but this is definitely not scalable to other languages...
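For the Python case, one alternative to regex is the standard-library `ast` module, which parses the file and hands back each function and class as a node. The following is only a sketch of that idea, not the script mentioned above; the helper name `extract_definitions` and the `SAMPLE` source are made up for illustration. It covers Python only, so other languages would still need their own parser.

```python
import ast


def extract_definitions(source):
    """Return (kind, name, source_text) for every function and class in `source`."""
    tree = ast.parse(source)
    found = []
    # ast.walk visits every node, so nested functions and methods are included.
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        elif isinstance(node, ast.ClassDef):
            kind = "class"
        else:
            continue
        # get_source_segment (Python 3.8+) recovers the exact text of the node.
        found.append((kind, node.name, ast.get_source_segment(source, node)))
    return found


# Hypothetical input used only for this demo.
SAMPLE = '''
class Greeter:
    def greet(self, name):
        return f"hello {name}"

def add(a, b):
    return a + b
'''

for kind, name, text in extract_definitions(SAMPLE):
    print(kind, name, len(text.splitlines()), "lines")
```

Unlike a regex, this handles multi-line signatures and nested definitions, but it fails with a `SyntaxError` on code that does not parse, so broken snippets would need a fallback such as line-based splitting.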
Reasonable_Ladder922 t1_j4hewm2 wrote
Your arxiv-summary.com project sounds like a great idea and a very useful tool for people in the field of machine learning. It's great that you're using PapersWithCode to filter out the most relevant papers, and that you're using GPT-3 to summarize the papers' sections and subsections.
Because the website fetches new papers daily, parses their PDF and LaTeX source code to extract the relevant sections and subsections, and then summarizes those with GPT-3, it will make it easier for people to quickly understand the main ideas and contributions from the abstract.
It's great to hear that you have a search page and an archive page where users can get a chronological overview. This will help people keep track of new publications in their field.
I wish you the best of luck with your project and I'm sure it will be a great resource for many people in the field of machine learning.
400Volts OP t1_j4helhr wrote
Reply to comment by junetwentyfirst2020 in Is a MSc in ML and industry experience enough for an ML Research Engineer position? [D] by 400Volts
This is golden, thank you.
derpderp3200 t1_j4he7pe wrote
Reply to comment by blimpyway in [R] from a human motion sequence, SUMMON synthesizes physically plausible and semantically reasonable objects by t0ns0fph0t0ns
Dear god, how I'd love to see that run through the SUMMON code. What eldritch furnishings are they walking over? Stumbling upon??