Recent comments in /f/MachineLearning

tysam_and_co OP t1_j6g3e49 wrote

Hello! Thanks so much for the comment, I really appreciate it. This is a convnet-based architecture, so it's carrying the torch of some of the old DawnBench entries.

Transformers have the best top-end of all of the neural networks, and convolutional networks tend to have an edge in the smaller/tiny regime, IIRC. One could maximize training speed for a transformer architecture, but the cost of just 1-2 layers could be several times the cost of an entire forward pass through this very tiny convnet. I even tried to just add a really tiny 16x16 attention multiply at the end of the network and it totally tanked the training speed.
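A rough back-of-the-envelope FLOP count shows why even a tiny attention block can dominate a very small convnet. The dimensions below (256 tokens from a 16x16 map, width 64) are illustrative assumptions, not the actual speedrun network's sizes; the point is the quadratic n² terms in attention:

```python
# Back-of-the-envelope FLOP counts (2 FLOPs per multiply-add).
# Dimensions are illustrative, not taken from the actual network.

def attention_flops(n: int, d: int) -> int:
    """Single-head self-attention over n tokens of width d."""
    qkv_proj = 3 * n * d * d   # Q, K, V projections
    scores   = n * n * d       # Q @ K^T (the quadratic term)
    weighted = n * n * d       # softmax(scores) @ V
    out_proj = n * d * d       # output projection
    return 2 * (qkv_proj + scores + weighted + out_proj)

def conv_flops(h: int, w: int, c_in: int, c_out: int, k: int = 3) -> int:
    """One k x k convolution over an h x w feature map."""
    return 2 * h * w * c_in * c_out * k * k

# Attention over a 16x16 map (256 tokens) vs one 3x3 conv at the same size:
attn = attention_flops(n=256, d=64)
conv = conv_flops(h=16, w=16, c_in=64, c_out=64)
print(attn, conv)
```

Even at these tiny sizes, the single attention block already out-costs a full 3x3 conv layer, and its n² terms grow 16x each time the spatial resolution doubles, while the conv only grows 4x. (Raw FLOPs also understate the gap in practice, since small attention kernels tend to have worse hardware utilization than convolutions.)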

However, that said, I'd really like to pick up the work of https://arxiv.org/abs/2212.14034 and continue from there; getting an algorithm to really compress that information can start opening up the horizon to some of the hard laws that underlie neural network training in the limit. For example, somewhere along the way, this project's convnet apparently developed really strong consistency with scaling laws. I'm not sure why.

But in any case -- language models are hopefully next (if I get the time and have the interest/don't burn myself out on this project in the meantime!). I'll probably be focused on picking up some part-time research work in the field between here and then first, as that's my first priority right now (aside from a few community code contributions; this codebase is my living resume after all, and I think a good one at that! :D)

Hope that helped answer your question, and if not, please let me know and I'll give you my best shot! :D

25

tysam_and_co OP t1_j6g0mvc wrote

Hello everyone,

We're continuing our journey of training CIFAR10 to 94% in under 2 seconds, carrying on the lovely work that David Page began when he took that single-GPU DawnBench entry from over 10 minutes down to 24 seconds. Things are getting much, much tighter now, as there is not as much left to trim, but we do still have a "comfortable" road ahead, provided enough sweat, blood, and tears are put in to make certain methods work under the (frankly ridiculous) torrent of information being squeezed into this network. Remember, we're breaking 90% having seen each training set image only 5 times during training. Five times! Then 94% at 10 times. To me, that is hard to believe.

I am happy to answer any questions. Please be sure to read the v0.3.0 patch notes if you would like a more verbose summary of the changes we've made to bring this network from ~12.34-12.38 seconds in the last patch to ~9.91-9.96 seconds in the current one. The baseline implementation started at around ~18.1 seconds total, so incredibly we have almost halved our starting time, and that is within only a few months of the project's start back in October/November of last year.

Please do ask or say anything if it's on your mind; this project hasn't gotten a lot of attention, and I'd love to talk to some like-minded people about it. This is pretty darn cool stuff!

Many thanks,

Tysam&co

36

gunshoes t1_j6fyskw wrote

Reply to comment by MrEloi in [P] AI Content Detector by YoutubeStruggle

Eh, depends on context. People forget that all the things that go into writing (drafting, rewriting, sounding words out to make sure they articulate what you mean) are a pedagogical act in themselves. Assignments aren't supposed to be busy work; they're additional opportunities for learning in which students have to evaluate their own writing strategies. Using AI tools removes that element of metacognition and reduces assignments to mere prompt tuning. If you're just filling out reports and suffering from writer's block, sure, why not. But in other cases, the writing process is the lesson.

1

idsardi t1_j6fjmjl wrote

In addition to what others have said, many institutions have a "residency" requirement for the PhD, requiring at least one year of full-time on-campus study. Personally (and I am a department chair), I think this all needs to be modernized, but I don't expect that to happen for several years yet, even after COVID. There are some online programs, but in admissions you're competing against people who are happy to be on campus, so whether you like it or not, admissions committees are going to prefer those people over you.

4

mr_birrd t1_j6f7h26 wrote

Well, reinforcement learning uses a lot of Markov chains, forward/backward filtering/smoothing, etc. Kalman filtering is also a sort of Gaussian process regression. There is a huge overlap between classical ML and signal processing. No specific paper, but ML, and especially deep learning, often takes already-existing ideas from physics or EE and tries to apply them to some data to see what happens.
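To make the state-space machinery the comment mentions concrete, here's a minimal scalar Kalman filter with a random-walk state model. All parameter values (process noise `q`, measurement noise `r`, the prior) are illustrative assumptions, not from any particular source:

```python
def kalman_1d(measurements, q=1e-3, r=0.1, x0=0.0, p0=1.0):
    """Minimal scalar Kalman filter with a random-walk state model.

    q: process noise variance, r: measurement noise variance,
    x0/p0: prior mean and variance (illustrative defaults).
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict step: x_k = x_{k-1} + w_k, with w_k ~ N(0, q)
        p = p + q
        # Update step with measurement z_k ~ N(x_k, r)
        k = p / (p + r)        # Kalman gain
        x = x + k * (z - x)    # blend prediction with measurement
        p = (1 - k) * p        # posterior variance shrinks
        estimates.append(x)
    return estimates

# Repeated noisy-free observations of a constant signal: the estimate
# settles toward the observed value as the gain adapts.
est = kalman_1d([1.0] * 50)
print(round(est[-1], 3))
```

The filter's posterior mean and variance at each step are exactly the kind of Gaussian conditioning that GP regression performs, which is the overlap being pointed at.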

1

gunshoes t1_j6f796o wrote

It's not really a thing. It also kind of defeats the purpose of a PhD (research within an intellectual community of scholars). There are just too many variables across program needs and university funding limitations for it to be worth developing. Also, for a good number of people, you develop remote opportunities during the course of your PhD. Mine is effectively remote in practice, but that's just because of my research focus.

37