Recent comments in /f/MachineLearning
Humble_Amphibian7448 t1_j5ykzbw wrote
Reply to [P] Diffusion models best practices by debrises
Hello, dear sirs, I am from Kyrgyzstan and I have invented a perpetual motion machine. It runs on air, and heating can be obtained without fire, absolutely without fire. Sirs, the first one must be built so that people can see it and believe in the perpetual motion machine. Sirs, there is no such system in the world yet. Sirs, I will build the perpetual motion machine; no one has ever succeeded. My phone numbers in Kyrgyzstan are +996 707 52 42 17 or +996 770 77 77 44, Keneshbek. Please call or send an SMS, sirs. I am currently ill from a stroke, so my speech is hard to understand.
suntehnik t1_j5ykwei wrote
Reply to comment by besabestin in Few questions about scalability of chatGPT [D] by besabestin
Just speculation here: maybe they store generated text in a buffer, and when they run out of memory the buffer can be flushed to reclaim the allocation for other tasks.
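Purely as an illustration of that speculation (nothing here reflects how OpenAI's serving stack is actually documented to work), a buffer that flushes its oldest generations under memory pressure might look like:

```python
from collections import OrderedDict

class GenerationBuffer:
    """Toy LRU buffer for generated text, flushed when it grows too large."""

    def __init__(self, capacity_chars=1000):
        self.capacity_chars = capacity_chars
        self._store = OrderedDict()

    def put(self, key, text):
        self._store[key] = text
        self._store.move_to_end(key)
        self._flush_if_needed()

    def get(self, key):
        return self._store.get(key)

    def _flush_if_needed(self):
        # When the buffer exceeds capacity, drop the oldest entries
        # to reclaim the allocation for other tasks.
        while sum(len(v) for v in self._store.values()) > self.capacity_chars:
            self._store.popitem(last=False)

buf = GenerationBuffer(capacity_chars=10)
buf.put("a", "hello")
buf.put("b", "world!")
# total would be 11 chars > 10, so "a" was flushed to make room
```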
OmgMacnCheese t1_j5yk6si wrote
Reply to comment by [deleted] in [P] Diffusion models best practices by debrises
OP, do not do this. This person has no idea how HIPAA compliance works.
randomrushgirl t1_j5yjww3 wrote
Reply to comment by Perfect_Finance7314 in [D] Simple Questions Thread by AutoModerator
Hey! I had a very similar question and was hoping you could provide some insight. I came across the CLIP Guided Diffusion Colab notebook by Katherine Crowson. It's really cool and I've played with it a little.
I want to know whether I can generate the same image over and over again. I've tried setting the seed, but I'm new to this, so could someone give me some intuition or links to related work in this area? Any help would be appreciated.
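Not specific to Crowson's notebook, but the usual recipe for reproducibility is to seed every RNG the sampling loop touches. A minimal stdlib sketch of the idea (in a real PyTorch notebook you would additionally call `torch.manual_seed(seed)` and `torch.cuda.manual_seed_all(seed)`, and may need `torch.use_deterministic_algorithms(True)`):

```python
import random

def seeded_sample(seed, steps=5):
    """Stand-in for a diffusion sampling loop: with the same seed,
    every 'noise' draw is identical, so the output is identical too."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(steps)]

a = seeded_sample(42)
b = seeded_sample(42)
assert a == b  # same seed, same draws, same "image"
```

Note that GPU kernels can still introduce nondeterminism even with seeds fixed, which is often why a seeded notebook does not reproduce exactly.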
[deleted] t1_j5yjc8u wrote
Reply to comment by veb101 in [P] Diffusion models best practices by debrises
[removed]
Kacper-Lukawski t1_j5yineq wrote
Reply to comment by keisukegoda3804 in [D] Efficient retrieval of research information for graduate research by [deleted]
I do not know any benchmark that would measure that. It would also be quite challenging to compare to SaaS like Pinecone (it should be running on the same infrastructure to have comparable results). When it comes to Milvus, as far as I know, they use prefiltering for filtered search (https://github.com/milvus-io/milvus/discussions/12927). So they need to store the ids of matching entries somewhere during the vector search phase, possibly even all the ids if your filtering criteria do not exclude anything.
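A rough sketch of that prefiltering strategy, with hypothetical data in plain Python rather than Milvus itself:

```python
def prefilter_search(vectors, metadata, query, predicate, top_k=2):
    """Prefiltering: first collect the ids of entries whose metadata
    matches the filter, then run vector search over only those ids.
    If the predicate excludes nothing, this id set is the whole index."""
    allowed_ids = [i for i, m in enumerate(metadata) if predicate(m)]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    scored = sorted(allowed_ids, key=lambda i: dot(vectors[i], query),
                    reverse=True)
    return scored[:top_k]

vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
metadata = [{"lang": "en"}, {"lang": "de"}, {"lang": "en"}]
hits = prefilter_search(vectors, metadata, query=[1.0, 0.0],
                        predicate=lambda m: m["lang"] == "en")
# only ids 0 and 2 (the "en" entries) were ever scored
```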
zaptrem t1_j5yg0ej wrote
Reply to comment by debrises in [P] Diffusion models best practices by debrises
Yes
SimonJDPrince OP t1_j5yc4n2 wrote
Reply to comment by NeoKov in [P] New textbook: Understanding Deep Learning by SimonJDPrince
You are correct -- they don't usually occur simultaneously. Usually, you would train and then test afterwards, but I've shown the test performance as a function of the number of training iterations, just so you can see what happens with generalization.
(Sometimes people do examine curves like this using validation data, so they can see when the best time to stop training is.)
The test loss goes back up because the model classifies some of the test examples wrong. With more training iterations, it becomes more certain about its answers (e.g., it pushes the likelihood of its chosen class from 0.9 to 0.99 to 0.999, etc.). For the training data, where everything is classified correctly, that makes the true labels more likely and decreases the loss. For the cases in the test data that are classified wrong, it makes the true labels less likely, so the loss starts to go back up.
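A quick numeric check of that effect: cross-entropy is -log p(true class), so growing confidence shrinks the loss on correctly classified examples but blows it up on misclassified ones.

```python
import math

def cross_entropy(p_correct):
    # Loss contribution of one example: -log of the probability
    # the model assigns to the TRUE class.
    return -math.log(p_correct)

# Correctly classified example: confidence grows, loss shrinks.
for p in (0.9, 0.99, 0.999):
    print(f"correct, p(true)={p}: loss={cross_entropy(p):.4f}")

# Misclassified test example: as the model grows more confident in its
# (wrong) choice, p(true class) shrinks and the loss grows without bound.
for p in (0.1, 0.01, 0.001):
    print(f"wrong,   p(true)={p}: loss={cross_entropy(p):.4f}")
```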
Hope this helps. I will try to clarify in the book. It's always helpful to learn where people are getting confused.
yldedly t1_j5ybg0x wrote
FallUpJV t1_j5ya6t5 wrote
Reply to comment by manubfr in Few questions about scalability of chatGPT [D] by besabestin
This is something I often read, that other LLMs are undertrained, but how come OpenAI's is the only one that isn't? Datasets? Computing power?
besabestin OP t1_j5ya2af wrote
Reply to comment by vivehelpme in Few questions about scalability of chatGPT [D] by besabestin
I see. Interesting. I thought it was generating one token at a time like that. I wonder why it sometimes encounters an error after generating a long text and just stops halfway through the task, which has happened to me frequently.
CKtalon t1_j5y9deu wrote
Reply to comment by manubfr in Few questions about scalability of chatGPT [D] by besabestin
There's also the rumor mill that Whisper was used to gather a bigger text corpus from videos to train GPT-4.
manubfr t1_j5y8mo0 wrote
Reply to comment by CKtalon in Few questions about scalability of chatGPT [D] by besabestin
You're right, it could be that 3.5 is already using that approach. I guess the emergent cognition tests haven't yet been published for GPT-3.5 (or have they?), so it's hard for us to measure performance as individuals. I guess someone could test text-davinci-003 on a bunch of cognitive tasks in the Playground, but I'm far too lazy to do that :)
CKtalon t1_j5y87e5 wrote
Reply to comment by manubfr in Few questions about scalability of chatGPT [D] by besabestin
People often quote Chinchilla about performance, claiming that there's still a lot of performance to be unlocked, when we do not know how GPT-3.5 was trained. GPT-3.5 could very well be Chinchilla-optimal, even though the first version of davinci was not. We know that OpenAI has retrained GPT-3, given that the context length increased from 2048 to 4096 tokens, and apparently to around 8000 tokens for ChatGPT.
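For reference, the rule of thumb usually taken from the Chinchilla paper is roughly 20 training tokens per parameter. A back-of-the-envelope check for a GPT-3-sized model:

```python
def chinchilla_optimal_tokens(n_params, tokens_per_param=20):
    """Rough Chinchilla heuristic: compute-optimal training uses about
    20 tokens per parameter (the exact ratio depends on the fit used)."""
    return n_params * tokens_per_param

gpt3_params = 175e9
print(f"{chinchilla_optimal_tokens(gpt3_params) / 1e12:.1f}T tokens")
# ~3.5T tokens, versus the ~300B tokens the original GPT-3 paper
# reported training on -- the sense in which it was "undertrained".
```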
vivehelpme t1_j5y70zt wrote
>what is very special about the model than the large data and parameter set it has
OpenAI have a good marketing department and the web interface is user-friendly. But yeah, there's really no secret sauce to it.
The model generates the text snippet in a batch; it just prints it a character at a time for dramatic effect (and to keep you occupied for a while so you don't overload the horribly computationally expensive cloud service it runs on with multiple queries in quick succession). So yeah, there are definitely scaling questions to answer before it could be run as a general-purpose casual search engine replacing Google.
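The "prints a character at a time" effect described here is trivial to reproduce client-side, whether or not that is what the real service does:

```python
import sys
import time

def typewriter(text, delay=0.0):
    """Stream an already-generated string one character at a time for
    dramatic effect. (Token-by-token streaming in a real deployment can
    also reflect genuinely incremental autoregressive decoding.)"""
    out = []
    for ch in text:
        out.append(ch)
        sys.stdout.write(ch)
        sys.stdout.flush()
        time.sleep(delay)
    sys.stdout.write("\n")
    return "".join(out)

typewriter("This reply was fully generated before you saw it.", delay=0.0)
```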
manubfr t1_j5y6wko wrote
Google (and DeepMind) actually have better LLM tech and models than OpenAI (if you believe their published research, anyway). They had a significant breakthrough last year in terms of scalability: https://arxiv.org/abs/2203.15556
Existing LLMs were found to be undertrained, and with some tweaks you can create a smaller model that outperforms larger ones. Chinchilla is arguably the most performant model we've heard of to date (https://www.jasonwei.net/blog/emergence), but it hasn't been pushed to any consumer-facing application AFAIK.
This should be powering their ChatGPT competitor Sparrow, which might be released this year. I am pretty sure that OpenAI will also implement these ideas in GPT-4.
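The scalability result boils down to the parametric loss fit from the Chinchilla paper, L(N, D) = E + A/N^alpha + B/D^beta (constants below are the approximate values reported in the paper): loss falls with both parameter count N and token count D, so a smaller, better-fed model can beat a larger, undertrained one.

```python
def chinchilla_loss(n_params, n_tokens,
                    E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28):
    """Chinchilla parametric loss fit (approximate published constants).
    Loss decreases with both model size N and training tokens D."""
    return E + A / n_params**alpha + B / n_tokens**beta

big_undertrained = chinchilla_loss(175e9, 300e9)   # GPT-3-like budget
small_welltrained = chinchilla_loss(70e9, 1.4e12)  # Chinchilla-like budget
print(big_undertrained, small_welltrained)
# the smaller model trained on more tokens ends up with lower loss
```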
steve-phan t1_j5y6u4i wrote
Reply to [D] CVPR Reviews are out by banmeyoucoward
1 borderline and 2 weak rejects. Is there any chance?
leviaker t1_j5y6jh3 wrote
I am making a solver in pytorch :p
debrises OP t1_j5y2gs4 wrote
Reply to comment by zaptrem in [P] Diffusion models best practices by debrises
Is default unit variance considered low enough?
Meddhouib10 t1_j5y2ete wrote
Reply to [D] Pretraining for CNN by Dense-Smf-6032
There are: check the ConvNeXt V2 paper.
[deleted] t1_j5y1tby wrote
Reply to comment by debrises in [P] Diffusion models best practices by debrises
[removed]
debrises OP t1_j5y1jn4 wrote
Reply to comment by [deleted] in [P] Diffusion models best practices by debrises
Thanks for the suggestion, but I don't think legal will approve. It's medical data.
debrises OP t1_j5y1h2l wrote
Reply to comment by zaptrem in [P] Diffusion models best practices by debrises
X-rays
hellrail t1_j5y0ok5 wrote
Reply to comment by its_ya_boi_Santa in [D] Couldn't devs of major GPTs have added an invisible but detectable watermark in the models? by scarynut
Wrong, I am not in any rut.
New accounts, being in a rut, saying "wrong" just for the sake of saying something even though nothing was wrong...
If I look at your behaviour, it clearly shows that you are fighting your own inner demons instead of really replying to what somebody has said (otherwise you wouldn't put so many self-fantasized allegations in your post).
I hope this kind of self-therapy works out for you, but I doubt it helps with anything.
LetWrong1932 t1_j5ymzk4 wrote
Reply to comment by Gershel in [D] CVPR Reviews are out by banmeyoucoward
Try your best to explain and convince the weak reject. Also, the two weak accepts might help you convince them post-rebuttal, so spend a lot of effort on it!