Recent comments in /f/MachineLearning

junetwentyfirst2020 t1_j4h8p07 wrote

If you want a job with "research" in the title, then you are 99% going to need top-tier conference publications during your master’s. Even one ICCV, ECCV, or CVPR paper should be enough, but they are very competitive. I wish I had known that a master’s was different from an undergrad, because I was completely unready.

I’d suggest reading some research papers to gauge your skills, especially your math. Contributions in Computer Science for ML/DL are basically applied math. Look up the papers noted in the course CS231N, and if you can’t get through them, then you need to improve your math skills. I wish someone had told me this before my master’s, because my math sucked and it held me back significantly. It’s hard to do a master’s and play catch-up on math at the same time, because the master’s itself is a lot of work.

I have an undergrad and master’s in CS, a thesis on DL, and 3.5 years of industry experience as a Machine Learning/Computer Vision Engineer, and I don’t even bother applying for jobs that say Research in the title, because everyone in the world with a pub is applying for those same jobs.

You can do it if your math is solid (linear algebra, calculus, and probability). Knowing how to code is needed, but it is not the most important thing, as you can tell by the horrible research code out there, so don’t rely solely on your software engineering skills.

3

GasZealousideal8691 OP t1_j4gs8gc wrote

But would it affect it to this extent? To be clear, this is not just "bad performance", or "horrendous performance". Our project is loosely investigating the performance of different editing methods on LMs given some datasets we made, and none of the editing methods, from fine-tuning to gradient-methods, change the performance at all.

Furthermore, GPT2 outputs equal accuracy and specificity values (specificity is basically the degree to which it "remembers" other unrelated facts; the goal here is to minimize catastrophic forgetting), which makes absolutely 0 sense, because they aren't even measured on the same scale. Accuracy is usually between 0 and 1, and specificity is usually ~26 based on our measures.

It doesn't have anything to do with the way accuracy/specificity are computed, because the code for GPT-Neo is identical minus the model= and tokenizer= statements, and it works fine for GPT-Neo. So there is something fundamentally crazy going on with GPT2...

1

RuairiSpain t1_j4gk0re wrote

Search and integration into Office products would be big revenue generators. Killing Google and its revenue would be a double whammy for the tech sector: it would destabilise a main competitor and put MS at the front of the tech arms race for the next decade or two.

I foresee Google losing search market share, which is looking more and more likely given their terrible search results and spammed top results. That leaves Google with Android and YouTube, which are dependent on a good search engine for revenue.

If MS can move the needle on Bing market share, it could bring them back into the B2C market.

Imagine ChatGPT integrated into Word, PowerPoint, Excel and SharePoint! It would be a middle manager's wet dream to waste even more time on documents and paperwork 😜

1

RuairiSpain t1_j4giuhl wrote

Do you think ChatGPT will be able to fix the ambiguity in later responses? And improve the partial gibberish that it can add?

I'm not sure people have looked closely at the ChatGPT semantics. Debugging where the model goes wrong when it adds gibberish would be a big step in ML. The first hurdle is to get explainability into the model results. I've not seen much discussion of this with ChatGPT.

1

niclas_wue OP t1_j4fqqy6 wrote

Thanks for asking! My first prototype collected all new arxiv papers in certain ML-related categories via the API; however, I quickly realized that this would be way too costly. Right now, I collect all papers from PapersWithCode's "Top" (last 30 days) and the "Social" tab, which is based on Twitter likes and retweets. Finally, I filter using this formula:

p.number_of_likes + p.number_of_retweets > 20 or p.number_github_stars > 100

In rare cases, when the paper is really long or not parsable with "grobid", I will exclude the paper for now.
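
For anyone who wants to try the same filter, here is a minimal sketch of how the formula could be applied in Python. The attribute names come from the formula; the `Paper` class, `passes_filter`, and `select_papers` are hypothetical names made up for illustration and are not taken from the actual project.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Paper:
    # Hypothetical container; only the three counters appear in the formula.
    title: str
    number_of_likes: int = 0
    number_of_retweets: int = 0
    number_github_stars: int = 0


def passes_filter(p: Paper) -> bool:
    """Keep a paper with enough Twitter traction or enough GitHub stars."""
    # Both comparisons are strict, so exactly 20 or exactly 100 does not pass.
    return (
        p.number_of_likes + p.number_of_retweets > 20
        or p.number_github_stars > 100
    )


def select_papers(papers: List[Paper]) -> List[Paper]:
    """Return only the papers that satisfy the filter, in their original order."""
    return [p for p in papers if passes_filter(p)]
```

Note that Python evaluates `+` before `>` and `>` before `or`, so the formula works as written without extra parentheses.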

10