maizeq t1_j69vuec wrote on January 28, 2023 at 8:18 PM

Reply to comment by Taenk in [R] SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot by Secure-Technology-78

Nice. How are you converting between dataset size and number of tokens?

Doesn’t common crawl get deduplicated and that’s why the number of usable tokens decreases - or is it also curation? How much of that 380TiB is actually utilisable.

Given the ostensibly impressive performance of the bilingual GLM-130B (Chinese+English) model that came out of Tsinghua university that might very well be the case.

pandasiloc t1_j69uj8v wrote on January 28, 2023 at 8:09 PM

Reply to comment by Featureless_Bug in [D] Interviewer asked to code neural network from scratch with plain python on a live call. Reasonable? by OkAssociation8879

The human brain doesn’t work like this. It’s not a question about “being smart” or simply having learned something previously. In order to perform an implementation of this on the spot in a stressful situation, the relevant theory needs to be very fresh in your memory.

I highly doubt you would be able to reproduce a proof of the Fundamental Theorem of Algebra on the spot, even though it’s a simple concept that many people learn in middle school.

I would probably fail this question because I haven’t worked with deep learning much since I graduated 4 years ago. I majored in math at an Ivy League school and graduated with a pretty good GPA, so I don’t think my math is ‘weak’, either.

This kind of question does not make sense to ask on a live call unless someone claims to be working with deep learning architectures as part of their daily work.

jobeta t1_j69u8a4 wrote on January 28, 2023 at 8:07 PM

Reply to [P] Launching my first ever open-source project and it might make your ChatGPT answers better by Vegetable-Skill-9700

Sweet! That dashboard looks really nice too!

currentscurrents OP t1_j69u2gb wrote on January 28, 2023 at 8:06 PM

Reply to comment by Lord_of_Many_Memes in [D] Could forward-forward learning enable training large models with distributed computing? by currentscurrents

> I tried that on gpt and wikitext it just doesn’t converge on real problems

Would you be able to share your code? How were you generating negative data?

STEMeducator1 t1_j69tx9w wrote on January 28, 2023 at 8:05 PM

Reply to comment by Dontgooo in [R] META presents MAV3D — text to 3D video by SpatialComputing

It'll literally be like lucid dreaming.

Mechanical_Number t1_j69tl7z wrote on January 28, 2023 at 8:02 PM

Reply to [D] Interviewer asked to code neural network from scratch with plain python on a live call. Reasonable? by OkAssociation8879

I don't think it is great question generally but this also depends on the position, the level of technical aptitude and seniority expected.

It is not something I would expect most people to rock up with in 15' while someone is looking over their shoulder. Probably it is a question that will help me to distinguish a kick-ass junior than someone who has a standard Keras syntax understanding but for more seniors roles this is likely a bad indicator. Far less senior engineer tasks fail because the person couldn't code backprop from scratch than because the wrong architecture was chosen, or they didn't know what part of an existing pipeline to optimise, or where to look for potential bugs given a particular unexpected behaviour, etc.

In general, I think it was more a point of "showing your thought process" than actually getting the code right. I would "abstract" things quite a bit first and then "start coding". But as others said, if they absolutely need to "split hairs" that is a way to do it too.

Best of luck with your interview in any case!

xorbinant_ranchu t1_j69tasv wrote on January 28, 2023 at 8:00 PM

Reply to comment by Featureless_Bug in [D] Interviewer asked to code neural network from scratch with plain python on a live call. Reasonable? by OkAssociation8879

Would be interested to know what kind of experience you have?

I think literally none of the ML engineers I work with (myself very much included) could pull a chain rule implementation out in 10 mins.

90% of this job is just finding an existing implementation of something to make work.

Lord_of_Many_Memes t1_j69sr57 wrote on January 28, 2023 at 7:56 PM

Reply to [D] Could forward-forward learning enable training large models with distributed computing? by currentscurrents

my general feeling is even if it works, it will take more steps to get to the same loss than backprop, which in some sense cancel out the hardware advantage of the forward forward setting. I tried that on gpt and wikitext it just doesn’t converge on real problems, maybe something crucial is still missing.

TywinASOIAF t1_j69sa8l wrote on January 28, 2023 at 7:53 PM

Reply to [D] Interviewer asked to code neural network from scratch with plain python on a live call. Reasonable? by OkAssociation8879

Stupid question. There is no time to code whole NN from scratch that actually can optimize a function.
Interviewer should ask use pytorch/tf to code a nn frame.

Hannekiii t1_j69peit wrote on January 28, 2023 at 7:33 PM

Reply to comment by deathtosquishy in [R] META presents MAV3D — text to 3D video by SpatialComputing

Probably not is the answer

vivehelpme t1_j69okrj wrote on January 28, 2023 at 7:27 PM

Reply to [D] Interviewer asked to code neural network from scratch with plain python on a live call. Reasonable? by OkAssociation8879

Just have chatGPT open at a side monitor and type in the prompt on a silent keyboard.

OkAssociation8879 OP t1_j69oa79 wrote on January 28, 2023 at 7:25 PM

Reply to comment by Featureless_Bug in [D] Interviewer asked to code neural network from scratch with plain python on a live call. Reasonable? by OkAssociation8879

You are right. Interviewers generally ask about backpropagation. They should definitely test me on neural network concepts. But do you not think, asking to code entire neural network from scratch was an overdo for the interview?

Featureless_Bug t1_j69nsjq wrote on January 28, 2023 at 7:22 PM

Reply to comment by OkAssociation8879 in [D] Interviewer asked to code neural network from scratch with plain python on a live call. Reasonable? by OkAssociation8879

>It's definitely an easy question if it was a common question and hence featured on leetcode, where candidates would practice it before the interview.

I mean, if it was on leetcode, it wouldn't make sense to ask it in the interview, because then you will get prepared answers.

>Someone with 2 years of experience don't remember the knitty gritty maths to implement NN from scratch

If you cannot apply chain rule, your math is very weak. If your math is very weak, you probably won't be a great ML engineer. It's not that you need a lot of math, but you need a broad general understanding of what can work and what can't quite often, actually.

OkAssociation8879 OP t1_j69n96y wrote on January 28, 2023 at 7:18 PM

Reply to comment by Featureless_Bug in [D] Interviewer asked to code neural network from scratch with plain python on a live call. Reasonable? by OkAssociation8879

It's definitely an easy question if it was a common question and hence featured on leetcode, where candidates would practice it before the interview.

Someone with 2 years of experience don't remember the knitty gritty maths to implement NN from scratch. This question is more suited for someone fresh out of college, in my opinion.

OkAssociation8879 OP t1_j69movh wrote on January 28, 2023 at 7:14 PM

Reply to comment by seba07 in [D] Interviewer asked to code neural network from scratch with plain python on a live call. Reasonable? by OkAssociation8879

Everything. Loss function, optimizer, backpropagation, update weights

Featureless_Bug t1_j69mojw wrote on January 28, 2023 at 7:14 PM

Reply to comment by marcingrzegzhik in [D] Interviewer asked to code neural network from scratch with plain python on a live call. Reasonable? by OkAssociation8879

I mean, it is kind of a very basic question and it takes like 15 minutes at most if you understand what you are doing. It is similar to leetcode-style questions for SE, it is not something that you will do on the job, but if you are smart, you will pass easily, and if you are not, you will struggle - so a great interview task

seba07 t1_j69mjy3 wrote on January 28, 2023 at 7:13 PM

Reply to [D] Interviewer asked to code neural network from scratch with plain python on a live call. Reasonable? by OkAssociation8879

What does that even mean? Just the matrix multiplication of a perceptron or including the back propagation of an optimizer?

SteffenGO t1_j69kd3w wrote on January 28, 2023 at 6:58 PM

Reply to [D] Interviewer asked to code neural network from scratch with plain python on a live call. Reasonable? by OkAssociation8879

I think often times with these absurdly complicated interview questions, they’re less interested in the final answer and more interested that you have the correct knowledge and problem solving skills to work through how your would attempt it. Often times with highly competitive positions they’re splitting hairs for the best candidate and the most extreme questions can nudge their decision one way or another when on paper candidates are fairly equivalent. Super stressful to be put on the spot like that nonetheless.

marcingrzegzhik t1_j69k6qd wrote on January 28, 2023 at 6:57 PM

Reply to [D] Interviewer asked to code neural network from scratch with plain python on a live call. Reasonable? by OkAssociation8879

It's definitely a valid interview question, but it's not something you should be asked to do during a live call. It's too much to tackle in the limited time of a call and it's not a fair way to assess your skills. I would suggest asking to review a code sample you've written in the past that demonstrates your knowledge and experience. That would be a better way to assess your skills and it would be much less stressful. Good luck!

[deleted] t1_j69ivw1 wrote on January 28, 2023 at 6:48 PM

Reply to comment by deathtosquishy in [R] META presents MAV3D — text to 3D video by SpatialComputing

[deleted]

mocny-chlapik t1_j69h5ud wrote on January 28, 2023 at 6:37 PM

Reply to [N] OpenAI has 1000s of contractors to fine-tune codex by yazriel0

More and more information is popping up about the huge human annotation efforts going on at OpenAI. It seems that the secret ingredient missing was money, that could buy you lots of relevant data. This has several implications: (1) It might be impossible to replicate some of these models without millions of dollars invested in similar data collection efforts, (2) The range of applications can actually be broader than thought previously, if we are willing to pay people to generate the data. (3) They were not able to find significant improvements with scaling anymore. The scaling era might be nearly over.

SaifKhayoon t1_j69e65n wrote on January 28, 2023 at 6:17 PM

Reply to [R] META presents MAV3D — text to 3D video by SpatialComputing

They had a problem sourcing labeled training data of 3D videos, you can tell this tech is still early from the shield in the bottom right example

They could generate a labeled 3D environments from 2D images using InstantNGP and GET3D with Laion's labeled dataset of 5.85 billion CLIP-filtered image-text pairs to create a useful dataset for training because this currently relies on a workaround of only being trained on text-image pairs and unlabeled videos due to lack of labeled 3D training data.

theoryanddata t1_j69dx28 wrote on January 28, 2023 at 6:15 PM

Reply to [D] Could forward-forward learning enable training large models with distributed computing? by currentscurrents

I remember reading about this type of concept, and iirc it does seem that there is quite a bit of local learning in biological neural networks. But global convergence of the model seems like a challenge with this type of scheme. Maybe there's some way to incorporate a periodic global backprop to address that? Has anyone tried it? Or maybe you don't even need it and the problem will disappear with enough scale

Dontgooo t1_j69ciw8 wrote on January 28, 2023 at 6:05 PM

Reply to comment by kiteguycan in [R] META presents MAV3D — text to 3D video by SpatialComputing

Or a virtual reality you could step in to.. why you think meta is going hard at VR?

[deleted] t1_j69ci5w wrote on January 28, 2023 at 6:05 PM

Reply to [R] META presents MAV3D — text to 3D video by SpatialComputing

[removed]

Recent comments in /f/MachineLearning