Recent comments in /f/MachineLearning
junetwentyfirst2020 t1_j4h8p07 wrote
Reply to comment by 400Volts in Is a MSc in ML and industry experience enough for an ML Research Engineer position? [D] by 400Volts
If you want a job with the title research in it, then you are 99% going to need top tier conference publications in your masters. Even one ICCV, ECCV, CVPR should be enough, but they are very competitive. I wish I knew that a masters was different from an undergrad because I was completely unready.
I’d suggest reading some research papers to gauge your math, especially. All of Computer Science for ML/DL is basically applied math contributions. Look up the papers noted in the course CS231N and if you can’t get through them, then you need to improve your math skills. I wish someone told me this before my masters because my math sucked and it held me back significantly, and it’s hard to try to both do a masters and then play catch up on math because the masters itself is a lot of work.
I have an undergrad and masters in CS, thesis on DL, and 3.5 years industry experience as a Machine Learning/Computer Vision Engineer, and I don’t even bother applying for jobs that say Research in the title because everyone in the world with a pub is applying for those same jobs.
You can do it if your math is solid (linear algebra, calculus, and probability). Knowing how to code is needed but is not the most important thing, as you can tell by the horrible research code out there, so don’t rely solely on your software engineering skills.
400Volts OP t1_j4h5ne2 wrote
Reply to comment by junetwentyfirst2020 in Is a MSc in ML and industry experience enough for an ML Research Engineer position? [D] by 400Volts
Not yet, I'm trying to gather as much information as possible to make the best move career-wise
SnooChipmunks2237 t1_j4h55pg wrote
Reply to Is a MSc in ML and industry experience enough for an ML Research Engineer position? [D] by 400Volts
Messaged you directly, but for others' awareness: we are hiring and we do very cutting-edge research in AI. If you are interested, I ask that you reach out to me.
junetwentyfirst2020 t1_j4h4p4s wrote
Reply to Is a MSc in ML and industry experience enough for an ML Research Engineer position? [D] by 400Volts
Do you already have a masters or not?
WigglyHypersurface t1_j4gzweu wrote
Reply to comment by GasZealousideal8691 in [D] Is there any reason hugging face GPT2 would behave (fundamentally) differently from GPT-Neo? by GasZealousideal8691
The GPT2 LM is causal. If you do AutoModelForCausalLM with gpt2 it works fine.
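A rough way to check this yourself, assuming `transformers` and PyTorch are installed. The tiny config values below are arbitrary placeholders so that nothing is downloaded and the weights are random; with real checkpoints you would use `from_pretrained` with the model names from this thread instead of `from_config`.

```python
from transformers import AutoModelForCausalLM, GPT2Config, GPTNeoConfig

# Tiny, arbitrary configs (assumed values, not the real model sizes) so the
# models are built locally with random weights and no download is needed.
gpt2_config = GPT2Config(vocab_size=100, n_positions=32, n_embd=8, n_layer=1, n_head=2)
neo_config = GPTNeoConfig(
    vocab_size=100,
    max_position_embeddings=32,
    hidden_size=8,
    num_layers=2,
    attention_types=[[["global", "local"], 1]],  # must expand to num_layers entries
    num_heads=2,
    window_size=8,
)

# Ask the Auto class for the causal LM of each architecture.
gpt2_model = AutoModelForCausalLM.from_config(gpt2_config)
neo_model = AutoModelForCausalLM.from_config(neo_config)

# Inspect which concrete classes the Auto class picked.
gpt2_class = type(gpt2_model).__name__
neo_class = type(neo_model).__name__
print(gpt2_class, neo_class)
```

If both come back as the LM-head / causal classes mentioned elsewhere in this thread, the head is probably not where the difference comes from.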
WigglyHypersurface t1_j4gzjvd wrote
Reply to comment by GasZealousideal8691 in [D] Is there any reason hugging face GPT2 would behave (fundamentally) differently from GPT-Neo? by GasZealousideal8691
If you're messing with the weights that deeply and directly I'm not sure. But it smells like a bug to me.
GasZealousideal8691 OP t1_j4gst0f wrote
Reply to comment by WigglyHypersurface in [D] Is there any reason hugging face GPT2 would behave (fundamentally) differently from GPT-Neo? by GasZealousideal8691
Don't think there is a causal version for GPT2.
GasZealousideal8691 OP t1_j4gs8gc wrote
Reply to comment by WigglyHypersurface in [D] Is there any reason hugging face GPT2 would behave (fundamentally) differently from GPT-Neo? by GasZealousideal8691
But would it affect it to this extent? To be clear, this is not just "bad performance", or "horrendous performance". Our project is loosely investigating the performance of different editing methods on LMs given some datasets we made, and none of the editing methods, from fine-tuning to gradient-methods, change the performance at all.
Furthermore, GPT2 outputs equal accuracy and specificity values (specificity is basically the degree to which it "remembers" other unrelated facts; the goal here is to minimize catastrophic forgetting), which makes absolutely 0 sense, because they aren't even measured on the same scale. Accuracy is usually >0, <1 and specificity is usually ~26 based on our measures.
It doesn't have anything to do with the way accuracy/specificity are computed, because the code for GPT-Neo is identical minus the model= and tokenizer= statements, and it works fine for GPT-Neo. So there is something fundamentally crazy going on with GPT2...
WigglyHypersurface t1_j4grftr wrote
Reply to comment by GasZealousideal8691 in [D] Is there any reason hugging face GPT2 would behave (fundamentally) differently from GPT-Neo? by GasZealousideal8691
I think those are the same, but load both as the causal version and see.
WigglyHypersurface t1_j4gr979 wrote
Reply to comment by GasZealousideal8691 in [D] Is there any reason hugging face GPT2 would behave (fundamentally) differently from GPT-Neo? by GasZealousideal8691
The amount of code in the training data might affect specific task performance, even if the task itself involves no code. This seems to be particularly the case for tasks requiring attention to long-range dependencies and abstract reasoning.
GasZealousideal8691 OP t1_j4gpu8j wrote
Reply to comment by WigglyHypersurface in [D] Is there any reason hugging face GPT2 would behave (fundamentally) differently from GPT-Neo? by GasZealousideal8691
GPT Neo is GPTNeoForCausalLM, and GPT2 is GPT2LMHeadModel. Like I said, I am not 100% familiar with these, but the huggingface docs listed both as “GPT-neo/GPT2 with an LM head”, so I figured they were analogous.
WigglyHypersurface t1_j4gpm5i wrote
Reply to comment by GasZealousideal8691 in [D] Is there any reason hugging face GPT2 would behave (fundamentally) differently from GPT-Neo? by GasZealousideal8691
What kind of head is on the models for the task?
RuairiSpain t1_j4gk0re wrote
Reply to comment by hisglasses66 in [News] "Once $92 billion in profit plus $13 billion in initial investment are repaid (to Microsoft) and once the other venture investors earn $150 billion, all of the equity reverts back to OpenAI." by Gmroo
Search and integration into Office products would be big revenue generators. Killing Google and revenue would be a double whammy for the Tech sector, it would destabilise a main competitor and put MS at the front of the Tech arms race for the next decade or two.
I foresee Google losing search market share, which is looking more and more likely, given their terrible search results and spammed top results. That leaves Google with Android and YouTube, which are dependent on a good search engine for revenue.
If MS can move the needle on Bing market share, it could bring them back into the B2C market.
Imagine ChatGPT integrated into Word, PowerPoint, Excel and SharePoint! It would be a middle manager's wet dream to waste even more time on documents and paperwork 😜
RuairiSpain t1_j4giuhl wrote
Reply to comment by The-Unstable-Writer in [News] "Once $92 billion in profit plus $13 billion in initial investment are repaid (to Microsoft) and once the other venture investors earn $150 billion, all of the equity reverts back to OpenAI." by Gmroo
Do you think ChatGPT will be able to fix the ambiguity in later responses? And improve the partial gibberish that it can add?
I'm not sure people have looked closely at the ChatGPT semantics. Debugging where the model goes wrong when it adds gibberish is a big step in ML. The first hurdle is to get explainability into the model results. I've not seen much discussion on this with ChatGPT.
chaitanya1123 t1_j4gg8fx wrote
Reply to comment by Iunaml in [P] I built arxiv-summary.com, a list of GPT-3 generated paper summaries by niclas_wue
Why use many word when few word do trick
niclas_wue OP t1_j4g8v9w wrote
Reply to comment by Iunaml in [P] I built arxiv-summary.com, a list of GPT-3 generated paper summaries by niclas_wue
Haha, sounds good, make sure to send me an invite :)
GasZealousideal8691 OP t1_j4g8djf wrote
Reply to comment by WigglyHypersurface in [D] Is there any reason hugging face GPT2 would behave (fundamentally) differently from GPT-Neo? by GasZealousideal8691
No, both use the GPT2 tokenizer. GPT-Neo uses GPT2Tokenizer.from_pretrained('EleutherAI/gpt-neo-1.3B'), and GPT2 uses GPT2Tokenizer.from_pretrained('gpt2-xl').
Iunaml t1_j4g88q7 wrote
Sometimes I wish we could just write papers directly in a summarized manner (and not automatically).
I'm starting my bullet-point conference soon..
niclas_wue OP t1_j4fyzia wrote
Reply to comment by transgalpower in [P] I built arxiv-summary.com, a list of GPT-3 generated paper summaries by niclas_wue
Yes, in the long run, there needs to be some sort of monetization to afford the API tokens. For now, I just want to see if people find it useful at all.
Thanks for letting me know, for me it works on mobile, but I will look into that.
transgalpower t1_j4fwztn wrote
It could be nice if you let people donate compute power. That way, as a community, we could keep it running.
Also would be nice if it worked on mobile. Idk why but it says the domain isn't safe.
Unlikely-Advice-7168 t1_j4fuzop wrote
Reply to comment by Apprehensive-Tax-214 in [P] Built an at-cost, pay per second, open-source API for Tortoise text-to-speech (best I've heard!) by Apprehensive-Tax-214
very cool. Trying to log in to it now and when I click "sign in with github" the link changes to this: https://tts.themetavoice.xyz/?error=server_error&error_description=Database+error+saving+new+user
Apprehensive-Tax-214 OP t1_j4ftp26 wrote
Reply to comment by MrHumun in [P] Built an at-cost, pay per second, open-source API for Tortoise text-to-speech (best I've heard!) by Apprehensive-Tax-214
Nope, a new container is spun up when someone sends a request and spun down when the request is over. This is why we're able to provide at-cost and charge per-second.
niclas_wue OP t1_j4fqqy6 wrote
Reply to comment by ml-research in [P] I built arxiv-summary.com, a list of GPT-3 generated paper summaries by niclas_wue
Thanks for asking! My first prototype collected all new arxiv papers in certain ML-related categories via the API; however, I quickly realized that this would be way too costly. Right now, I collect all papers from PapersWithCode's "Top" (last 30 days) and the "Social" Tab, which is based on Twitter likes and retweets. Finally, I filter using this formula:
p.number_of_likes + p.number_of_retweets > 20 or p.number_github_stars > 100
In rare cases, when the paper is really long or not parsable with "grobid", I will exclude the paper for now.
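As a rough sketch, the filter above could look like this in Python. Only the three field names come from the formula; the `Paper` class, the `passes_filter` helper, and the sample papers are made up for illustration and are not the site's actual code.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    # Hypothetical record; only the three count fields appear in the formula.
    title: str
    number_of_likes: int = 0
    number_of_retweets: int = 0
    number_github_stars: int = 0


def passes_filter(p: Paper) -> bool:
    # Addition binds tighter than the comparisons, and the comparisons bind
    # tighter than `or`, so this reads as
    # (likes + retweets > 20) or (stars > 100).
    return p.number_of_likes + p.number_of_retweets > 20 or p.number_github_stars > 100


# Made-up sample data to exercise both branches and the strict thresholds.
papers = [
    Paper("popular on social", number_of_likes=15, number_of_retweets=6),
    Paper("exactly at social threshold", number_of_likes=15, number_of_retweets=5),
    Paper("popular on github", number_github_stars=101),
    Paper("exactly at star threshold", number_github_stars=100),
]

selected = [p.title for p in papers if passes_filter(p)]
print(selected)
```

Both comparisons are strict, so a paper sitting exactly at 20 social interactions or 100 stars does not pass.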
ApolloniusOfPerga420 t1_j4fqa2g wrote
You could probably just do this with Codex. Its zero-shot performance is very high.
blimpyway t1_j4hbgtt wrote
Reply to comment by derpderp3200 in [R] from a human motion sequence, SUMMON synthesizes physically plausible and semantically reasonable objects by t0ns0fph0t0ns
> David Lewandowski videos It looks like they search for some furniture