Recent comments in /f/MachineLearning
JustOneAvailableName t1_j7v99gd wrote
If the model is sufficiently large (if not, you don't really need to wait long anyway) and no expensive CPU pre/post-processing is done, the 3090 will be the bottleneck.
A single 3090 might not have enough memory to train GPT-2 Large, but it's probably close.
Fully training an LLM from scratch on a single 3090 is impossible, but you could finetune one.
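For a rough sense of why GPT-2 Large is borderline on a 24 GB card: with plain fp32 Adam you pay roughly 16 bytes per parameter for weights, gradients, and optimizer state, before counting any activations. A back-of-envelope sketch (the 774M parameter count and the 16 B/param rule of thumb are rough estimates, not measurements):

```python
# Back-of-envelope GPU memory estimate for full finetuning with Adam.
def training_mem_gb(n_params, bytes_per_param=16):
    # fp32 Adam: 4 B weights + 4 B grads + 8 B optimizer state = 16 B/param,
    # before activations, which depend on batch size and sequence length.
    return n_params * bytes_per_param / 1024**3

gpt2_large = 774e6  # approximate GPT-2 Large parameter count
print(f"{training_mem_gb(gpt2_large):.1f} GB")  # ~11.5 GB of states alone
```

Activations on top of that can easily push a real batch past 24 GB, which is why it is "probably close".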
RitsusHusband t1_j7v7crt wrote
Reply to [D] Similarity b/w two vectors by TKMater
Could you take the squared difference?
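A minimal sketch of that option next to the other common choice, cosine similarity (plain Python, no particular library assumed):

```python
import math

def squared_distance(a, b):
    # Sum of squared differences; 0 means identical, larger means less similar.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Angle-based similarity in [-1, 1]; ignores vector magnitude.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

u, v = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
print(squared_distance(u, v))   # 14.0
print(cosine_similarity(u, v))  # 1.0 (same direction, different magnitude)
```

Which one is appropriate depends on whether magnitude should count as dissimilarity.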
arg_max t1_j7v4mzf wrote
When I went to university, a course with a similar syllabus was also my first experience with ML, and it went fine. This is basically the content of any "old-school" ML lecture, and as long as you know your maths you'll be fine.
_Greasy_D t1_j7v4fpr wrote
Doing that right now! As long as you have basic programming experience you’ll do great.
loadage t1_j7v1lpn wrote
You'll do fine. I'm in my final quarter of a two-year master's in CS with no undergrad experience in CS. My undergrad was in engineering, which helped, but my friend's undergrad was in psychology. Enjoy the ride!
schludy t1_j7v11vj wrote
Reply to comment by zanzagaes2 in [P] Creating an embedding from a CNN by zanzagaes2
The individual steps sound OK. However, given that you're projecting 20,000 dimensions down to 2D, the results you got look very reasonable. I'm not sure about UMAP, but for t-SNE it's recommended to start from low-dimensional input, something on the order of 32 features. I would probably try to adjust the architecture, as other comments have suggested.
Feeling_Card_4162 t1_j7v0c5a wrote
Reply to comment by LeadershipComplex958 in Taking a ML Grad class without any ML experience? [D] by LeadershipComplex958
Wouldn’t know. That was back in 2012 so they only offered the one class at my school. It would make sense to me though that the ML side is more math heavy
ConcertoConta t1_j7uzv7c wrote
Was in the same situation as a math/physics undergrad. I took an ML class with the same curriculum as yours and it was easier for me than most of the CS majors in my class, as it was focused on algorithms rather than implementation. You should be fine!
Dubgarden t1_j7uzghg wrote
Reply to comment by womenrespecter-69 in Taking a ML Grad class without any ML experience? [D] by LeadershipComplex958
This.
mr_house7 t1_j7uza9c wrote
Reply to comment by sonofmath in [D] List of RL Papers by C_l3b
I'm so sorry to bother you again. Just one final question.
Do you know if the Spinning Up algos are worthwhile? Since I'm on Windows, it seems a little more challenging to install on my local machine. Is there an alternative to a local install, like Colab?
LeadershipComplex958 OP t1_j7uz1kl wrote
Reply to comment by Feeling_Card_4162 in Taking a ML Grad class without any ML experience? [D] by LeadershipComplex958
Cool cool. I've heard (at least at my college) that the ML class is more about the math and the AI class is more on the programming side. Is that true?
Tober447 t1_j7uyy1n wrote
Reply to comment by zanzagaes2 in [P] Creating an embedding from a CNN by zanzagaes2
>I guess I can use the encoder-decoder to create a very low-dimensional embedding and use the current one (~500 features) to find similar images to a given one, right?
Exactly. :-)
Feeling_Card_4162 t1_j7uyqsa wrote
I took a graduate AI class with a similar curriculum in undergrad and did great with it. I think you’ll do fine if you’re good with math.
womenrespecter-69 t1_j7uxna0 wrote
You'll do great if you're good at math and stats.
RemindMeBot t1_j7uxemq wrote
Reply to comment by deadknxght in What are the best resources to stay up to date with latest news ? [D] by [deleted]
I will be messaging you in 1 day on 2023-02-10 16:18:20 UTC to remind you of this link
deadknxght t1_j7uxba9 wrote
!remindme 24 hours
pommedeterresautee t1_j7uwa71 wrote
Reply to comment by Available_Lion_652 in [D] RTX 3090 with i7 7700k, training bottleneck by Available_Lion_652
At the start, the weights are moved to the GPU. Then, during training, the tokenizer converts your strings to int64 tensors. Those are quite light, and they are moved to the GPU as training proceeds. What you need is not the fastest CPU but one that can feed the GPU faster than it consumes the data. In GPT-2's case, a CPU like the 7700 won't be an issue. Images or sound (TTS, ASR) may need more demanding preprocessing during training.
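To see why token-ID batches are light, a quick back-of-envelope (the batch and sequence sizes here are illustrative assumptions, not numbers from the thread):

```python
# Rough size of one GPT-2-style batch of token IDs moved CPU -> GPU.
batch, seq_len, bytes_per_id = 8, 1024, 8  # int64 = 8 bytes per token ID
batch_bytes = batch * seq_len * bytes_per_id
print(f"{batch_bytes / 1024:.0f} KiB per batch")  # 64 KiB

# PCIe 3.0 x16 moves roughly 16 GB/s, so this transfer takes microseconds:
pcie_bytes_per_s = 16e9
print(f"{batch_bytes / pcie_bytes_per_s * 1e6:.1f} us per transfer")
```

Compare that with image batches, where a single batch can be tens of megabytes plus decode/augmentation work on the CPU.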
logsinh t1_j7uw4wb wrote
Reply to comment by CeFurkan in [D] Are there any AI model that I can use to improve very bad quality sound recording? Removing noise and improving overall quality by CeFurkan
Processing with a sliding window would solve your problem; see e.g. https://colab.research.google.com/github/asteroid-team/asteroid/blob/master/notebooks/04_ProcessLargeAudioFiles.ipynb
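The idea in its simplest form: cut the long recording into fixed-size windows, run the enhancement model on each, and stitch the results back together. A stripped-down sketch with non-overlapping windows and a hypothetical `process` callable standing in for the model (the linked notebook additionally overlaps windows to avoid boundary artifacts):

```python
import numpy as np

def process_in_chunks(signal, process, win=16000):
    # Split a long 1-D signal into fixed-size chunks, run the model on each
    # chunk, and concatenate the outputs back into one signal.
    out = [process(signal[i:i + win]) for i in range(0, len(signal), win)]
    return np.concatenate(out)

audio = np.random.randn(50_000).astype(np.float32)   # stand-in recording
cleaned = process_in_chunks(audio, lambda chunk: chunk)  # identity "model"
print(cleaned.shape)  # same length as the input
```

This keeps peak memory bounded by the window size instead of the full recording length.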
SnooHesitations8849 t1_j7uuh1w wrote
Reply to comment by anders987 in [P] Get 2x Faster Transcriptions with OpenAI Whisper Large on Kernl by pommedeterresautee
Yep. I am using it and it is really good for inference.
zanzagaes2 OP t1_j7uual3 wrote
Reply to comment by Tober447 in [P] Creating an embedding from a CNN by zanzagaes2
Yes, that's a great idea. I guess I can use the encoder-decoder to create a very low-dimensional embedding and use the current one (~500 features) to find similar images to a given one, right?
Your perspective has been really helpful, thank you
Tober447 t1_j7uq41s wrote
Reply to comment by zanzagaes2 in [P] Creating an embedding from a CNN by zanzagaes2
You would take the output of a layer of your choice from the trained CNN (as you do now) and feed it into a new model, the autoencoder. So yes, the weights from your model are kept, but you will have to train the autoencoder from scratch. Something like CNN (inference only, no backprop) --> Encoder --> Latent Space --> Decoder for training, and at inference time you take the latent space (the encoder output) and use it for visualization or similarity.
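A minimal PyTorch sketch of that setup, with random tensors standing in for the frozen CNN features (the 512-feature input and 2-D latent size are assumptions for illustration):

```python
import torch
import torch.nn as nn

feat_dim, latent_dim = 512, 2  # assumed CNN feature size and latent size

encoder = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                        nn.Linear(64, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                        nn.Linear(64, feat_dim))

cnn_features = torch.randn(32, feat_dim)  # stand-in for frozen CNN outputs
opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], lr=1e-3)

for _ in range(5):  # a few reconstruction steps; train much longer in practice
    z = encoder(cnn_features)                 # latent space
    recon = decoder(z)                        # reconstruction of the features
    loss = nn.functional.mse_loss(recon, cnn_features)
    opt.zero_grad()
    loss.backward()
    opt.step()

embedding_2d = encoder(cnn_features).detach()  # (32, 2): ready to scatter-plot
```

The CNN itself is never updated here; only the small autoencoder trains.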
zanzagaes2 OP t1_j7unuq2 wrote
Reply to comment by Tober447 in [P] Creating an embedding from a CNN by zanzagaes2
May I use some part of the trained model to avoid retraining from scratch? The current model has very decent precision and I have generated some other visualizations for it (like heatmaps) so doing work around this model would be very convenient.
Edit: I have added an image of the best embedding I have found until now as a reference
zanzagaes2 OP t1_j7unjuw wrote
Reply to comment by schludy in [P] Creating an embedding from a CNN by zanzagaes2
I haven't found a very convincing embedding yet. I have tried several, ranging from ~500 features (class activation map) to ~20,000 features (output of the last convolutional layer before pooling), all generated from the full training set (~30,000 samples).
In all cases I do the same: I use PCA to reduce the vectors to 1,000 features, then UMAP or t-SNE (I usually try both) to get a 2D vector I can scatter-plot. I have tried using UMAP for the full process but it doesn't scale well enough. Is this a good approach?
Edit: I have added an image of the best embedding I have found until now as a reference
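That pipeline, sketched with scikit-learn on random stand-in data (the shapes are placeholders for the real CNN activations):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

features = np.random.randn(300, 2000).astype(np.float32)  # stand-in features

reduced = PCA(n_components=50).fit_transform(features)    # cheap linear step
embedded = TSNE(n_components=2, init="pca",
                random_state=0).fit_transform(reduced)
print(embedded.shape)  # (300, 2) -> scatter-plot
```

Note that the usual recommendation (e.g. in scikit-learn's t-SNE docs) is to reduce to ~50 dimensions before t-SNE rather than 1,000, since t-SNE's distance computations degrade in very high dimensions.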
schludy t1_j7v9pkm wrote
Reply to comment by zanzagaes2 in [P] Creating an embedding from a CNN by zanzagaes2
I think you're underestimating the curse of dimensionality. In 500 dimensions, most vectors will be far away from each other. You can't just use the L2 norm when comparing vectors in that high-dimensional space.
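A quick sketch of what that looks like: on random uniform data, the ratio between the nearest and farthest neighbour distance approaches 1 as the dimension grows, so L2 distances stop discriminating (purely illustrative, not your CNN features):

```python
import numpy as np

# Distance concentration: in high dimensions, the nearest and farthest
# points sit at almost the same L2 distance from a query point.
rng = np.random.default_rng(0)
ratios = {}
for d in (2, 500):
    x = rng.random((200, d))                       # 200 random points in d dims
    dists = np.linalg.norm(x[0] - x[1:], axis=1)   # distances from one point
    ratios[d] = dists.min() / dists.max()
print(ratios)  # ratio near 0 in 2D, close to 1 in 500D
```

When that ratio is near 1, "nearest neighbour" in raw L2 is mostly noise, which is one reason to compress to a lower-dimensional latent first.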