Recent comments in /f/MachineLearning
CuriousCesarr OP t1_j54lapx wrote
Reply to comment by NamerNotLiteral in [P] Looking for someone with good NN/ deep learning experience for a paid project by CuriousCesarr
Sorry for the late reply but I had a very busy period. In the end, I found a small Greek ML company that was excited about the project and we entered deeper discussions. I also updated my post to reflect this. Have a great day! :)
CuriousCesarr OP t1_j54l9iq wrote
Reply to comment by Legitimate_Light7143 in [P] Looking for someone with good NN/ deep learning experience for a paid project by CuriousCesarr
Sorry for the late reply but I had a very busy period. In the end, I found a small Greek ML company that was excited about the project and we entered deeper discussions. I also updated my post to reflect this. Have a great day! :)
Ok-Cartoonist8114 t1_j54l5yh wrote
Reply to comment by IntrepidTieKnot in [D] is it time to investigate retrieval language models? by hapliniste
Your pipeline is fine! Cherche is not fancy; it just lets you create hybrid pipelines that rely on both language models and lexical matching, which can help a lot. Also, Cherche is primarily designed for computing embeddings with Sentence Transformers, which have a better ratio of precision to number of parameters.
Hobohome t1_j54l1jy wrote
It may not be exactly what you are looking for, but Deep Image Priors work similarly and have been around for a while.
Rolling_Pig t1_j54gq7q wrote
Reply to [D] ICLR 2023 results. by East-Beginning9987
Coming soon
stardust-sandwich t1_j54em1w wrote
Reply to [D] Simple Questions Thread by AutoModerator
I want to pull data from an API (done) and use NLP to categorize that information. Then I want to push those results into a webpage or GUI tool that highlights the text and asks, "Is this correct?" That way I can use the GUI to "teach" the model how to classify text.
e.g.
Category 1 - words 1, words 2, words 3 and similar
Category 2 - words 4, words 5, words 6 and so on
Then it will go and try that, come back and ask me to tune it again, rinse and repeat. Once the model is trained, I want to use it later in a different script, for example by pointing it at a news article so that it spits out the data I need.
How can I achieve this, please? What are the best tools and services to get this done? Ideally open source if possible; if not, I'm happy to use a commercial service if it's cheap, as this is just a personal project of mine.
Thanks in advance.
Seankala t1_j549ygz wrote
Reply to [D] Simple Questions Thread by AutoModerator
Are there any Slack channels or Discord Servers for ML practitioners to talk about stuff?
_Arsenie_Boca_ t1_j5492g7 wrote
I think the idea is great! How long does it take to execute a query on the ArXiv set? Have you considered making a huggingface space out of this?
IntrepidTieKnot t1_j547lq2 wrote
Reply to comment by Ok-Cartoonist8114 in [D] is it time to investigate retrieval language models? by hapliniste
I made a tool that chops documents into chunks, creates embeddings for the chunks via GPT-3, and stores the embeddings in a Redis database. When I make a query, I create an embedding for it and look up my stored embeddings via cosine similarity.
My question is: isn't that the same as what your tool does? In other words, what can I do with Cherche that I cannot do with the setup I described? Is it that I don't need GPT-3 for the same result? Or what is it?
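The chunk, embed, store, and cosine-lookup pipeline described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the commenter's tool: `toy_embed` is a hypothetical bag-of-words stand-in for a GPT-3 embedding call, a plain Python list stands in for the Redis database, and all function names are made up for the example.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=50):
    """Split a document into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text):
    """Hypothetical stand-in for an embedding API: a sparse bag-of-words vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either one is empty."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents):
    """Chunk every document and keep (chunk, embedding) pairs (list stands in for Redis)."""
    return [(chunk, toy_embed(chunk)) for doc in documents for chunk in chunk_text(doc)]


def search(index, query, top_k=3):
    """Embed the query and return the top_k (score, chunk) pairs by cosine similarity."""
    query_vec = toy_embed(query)
    scored = [(cosine(query_vec, emb), chunk) for chunk, emb in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


docs = [
    "Retrieval language models look up passages in an external database.",
    "Tortoise is a text to speech system.",
]
index = build_index(docs)
results = search(index, "retrieval database passages", top_k=1)
```

Swapping `toy_embed` for a dense model changes only the vector type; the surrounding loop stays the same, which is why the choice of embedding model is largely independent of the retrieval logic.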
kvutxdy t1_j531msz wrote
I asked ChatGPT and it said RNN is used in the system as well. (probably not true)
tennismlandguitar OP t1_j52zysi wrote
Reply to comment by Omnes_mundum_facimus in [D] ML Researchers/Engineers in Industry: Why don't companies use open source models more often? by tennismlandguitar
What about finetuning those models to make sure the performance is satisfactory?
Double-Swimmer3495 t1_j52zvqv wrote
Reply to [D] ICLR 2023 results. by East-Beginning9987
Jan 21 '23 02:00 AM UTC
drumnation t1_j52yrfo wrote
The API docs don't seem clear on how to recreate the same session memory in the main app. It looks to me as if it uses stop words to achieve this, but I'm still trying to figure out how to emulate conversation memory.
Unlikely-Advice-7168 t1_j52rc7s wrote
Reply to comment by Apprehensive-Tax-214 in [P] Built an at-cost, pay per second, open-source API for Tortoise text-to-speech (best I've heard!) by Apprehensive-Tax-214
For clarity, I've tried it with Chrome, Brave, and Firefox on mobile and on 2 laptops, with two different GitHub accounts: one I've used in the past and a new one I just made to test it.
Unlikely-Advice-7168 t1_j52r43f wrote
Reply to comment by kdr4t3 in [P] Built an at-cost, pay per second, open-source API for Tortoise text-to-speech (best I've heard!) by Apprehensive-Tax-214
Tried it on a separate device using a different GitHub login, same error.
Ok-Cartoonist8114 t1_j52mjrw wrote
Here is a great paper from IBM following the retriever-reader paradigm. Love those "light" models that can be specialized by switching index.
IMO the loss of ChatGPT is still interesting for retriever-reader approaches, to generate either human-like or structured answers from input documents.
Here is a tool I made to create a retriever-reader pipeline in a minute: Cherche. I would also recommend Haystack on GitHub!
dancingnightly t1_j52k7sv wrote
Yup, I fully believe retrieval of sources will go up in value over time, in addition to the benefits you have outlined. When lots of things are AI-generated, being able to see and trust a source has value (even for, say, an AI summary answer).
Omnes_mundum_facimus t1_j52i6fr wrote
Reply to [D] ML Researchers/Engineers in Industry: Why don't companies use open source models more often? by tennismlandguitar
1) Because lawyers, and 2) because performance on academic datasets doesn't translate into good performance on whatever domain-specific problem we might be having.
currentscurrents t1_j525hto wrote
Reply to comment by hapliniste in [D] is it time to investigate retrieval language models? by hapliniste
Retrieval language models do have some downsides. Keeping a copy of the training data around is suboptimal for a couple reasons:
- Training data is huge. Retro's retrieval database is 1.75 trillion tokens. This isn't a very efficient way of storing knowledge, since a lot of the text is irrelevant or redundant.
- Training data is still a mix of knowledge and language. You haven't achieved separation of the two types of information, so it doesn't help you perform logic on ideas and concepts.
- Most training data is copyrighted. It's currently legal to train a model on copyrighted data, but distributing a copy of the training data with the model puts you on much less firm ground.
Ideally I think you want to condense the knowledge from the training data down into a structured representation, perhaps a knowledge graph. Knowledge graphs are easy to perform logic on and can be human-editable. There's also already an entire sub-field studying them.
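As a rough illustration of why a knowledge graph is easy to run logic on and easy to edit by hand, here is a minimal sketch of an in-memory triple store. The `TripleStore` class, its method names, and the example triples are all hypothetical and chosen for illustration; real knowledge-graph systems are far richer.

```python
from collections import defaultdict


class TripleStore:
    """Tiny in-memory knowledge graph of (subject, predicate, object) triples."""

    def __init__(self):
        self._edges = defaultdict(set)  # subject -> {(predicate, object), ...}

    def add(self, subject, predicate, obj):
        self._edges[subject].add((predicate, obj))

    def remove(self, subject, predicate, obj):
        # Editing the graph is just deleting a triple; no retraining involved.
        self._edges[subject].discard((predicate, obj))

    def objects(self, subject, predicate):
        """All objects linked to `subject` by `predicate`."""
        return {o for p, o in self._edges.get(subject, ()) if p == predicate}

    def reachable(self, start, predicate):
        """Follow `predicate` edges transitively from `start` (simple inference)."""
        seen, stack = set(), [start]
        while stack:
            for nxt in self.objects(stack.pop(), predicate):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen


# Illustrative triples only.
kg = TripleStore()
kg.add("Retro", "is_a", "retrieval language model")
kg.add("retrieval language model", "is_a", "language model")
kg.add("Retro", "uses", "retrieval database")

ancestors = kg.reachable("Retro", "is_a")
```

The transitive `is_a` lookup is the kind of logic that is awkward to perform directly on raw text but trivial on explicit triples.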
EmmyNoetherRing t1_j5253a8 wrote
Reply to comment by mycall in [R] Massive Language Models Can Be Accurately Pruned in One-Shot by starstruckmon
>Softmax activation function
Ok, got it. Huh (on reviewing Wikipedia). So, to rephrase the quoted paragraph: they find that the divergence between the training and testing distributions (between the compressed versions of the training and testing data sets, in my analogy) starts decreasing smoothly as the scale of the model increases, long before the actual final task performance locks into place successfully.
Hm. I think that says something more about task complexity (maybe, in some computability sense, a fundamental task complexity that we don't have well defined for those types of tasks yet?) than about imagination. But I'm still with you on imagination being a factor, and of course the paper and the blog post both leave the cliff problem unsolved. Possibly there's a definition of imagination such that we can say degree X of it is needed to successfully complete those tasks.
emreddit0r t1_j523nlc wrote
One thing I find glossed over/lacking in the diffusion model materials is the contribution of the UNet.
Coming from someone that is just trying to catch up on what's going on, the UNet seems to play a huge role (if I understand right, this is where the convolutional neural networks are discovering 2d features.)
Relatively speaking, CNNs are kind of old news.. but they're a big deal. Unless I have something wrong? Do you know where I can learn more about how the UNet aspect works in depth?
EmmyNoetherRing t1_j522inn wrote
Reply to comment by mycall in [R] Massive Language Models Can Be Accurately Pruned in One-Shot by starstruckmon
So I'm in a different flavor of data science, which means I've got the basic terminology, but not the specifics. I know what a loss function is and what entropy is. What role does "cross" play here? A cross between what?
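For context, the standard textbook definition (not taken from the thread) answers the "cross between what?" question: the cross is between two distributions over the same outcomes, the true distribution $p$ and the model's predicted distribution $q$. The symbols $p$, $q$, and $x$ are notation assumed here for illustration.

```latex
H(p, q) = -\sum_{x} p(x) \log q(x)
        = \underbrace{-\sum_{x} p(x) \log p(x)}_{H(p)}
        + \underbrace{\sum_{x} p(x) \log \frac{p(x)}{q(x)}}_{D_{\mathrm{KL}}(p \,\|\, q)}
```

Since $H(p)$ is fixed by the data, minimizing cross-entropy as a loss amounts to minimizing the KL divergence from $p$ to $q$.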
mycall t1_j51zz0w wrote
I wonder how pruning the sparsity affects emergent abilities in scaling parameters.
mycall t1_j51zmqh wrote
Reply to comment by EmmyNoetherRing in [R] Massive Language Models Can Be Accurately Pruned in One-Shot by starstruckmon
https://ai.googleblog.com/2022/11/characterizing-emergent-phenomena-in.html
This is another paper worth looking at.
CuriousCesarr OP t1_j54lden wrote
Reply to comment by BitterAd9531 in [P] Looking for someone with good NN/ deep learning experience for a paid project by CuriousCesarr
Sorry for the late reply but I had a very busy period. In the end, I found a small Greek ML company that was excited about the project and we entered deeper discussions. I also updated my post to reflect this. Have a great day! :)