CuriousCesarr OP t1_j54lden wrote on January 20, 2023 at 10:28 AM

Reply to comment by BitterAd9531 in [P] Looking for someone with good NN/ deep learning experience for a paid project by CuriousCesarr

Sorry for the late reply but I had a very busy period. In the end, I found a small Greek ML company that was excited about the project and we entered deeper discussions. I also updated my post to reflect this. Have a great day! :)

CuriousCesarr OP t1_j54lapx wrote on January 20, 2023 at 10:27 AM

Reply to comment by NamerNotLiteral in [P] Looking for someone with good NN/ deep learning experience for a paid project by CuriousCesarr

Sorry for the late reply but I had a very busy period. In the end, I found a small Greek ML company that was excited about the project and we entered deeper discussions. I also updated my post to reflect this. Have a great day! :)

CuriousCesarr OP t1_j54l9iq wrote on January 20, 2023 at 10:26 AM

Reply to comment by Legitimate_Light7143 in [P] Looking for someone with good NN/ deep learning experience for a paid project by CuriousCesarr

Sorry for the late reply but I had a very busy period. In the end, I found a small Greek ML company that was excited about the project and we entered deeper discussions. I also updated my post to reflect this. Have a great day! :)

Ok-Cartoonist8114 t1_j54l5yh wrote on January 20, 2023 at 10:25 AM

Reply to comment by IntrepidTieKnot in [D] is it time to investigate retrieval language models? by hapliniste

Your pipeline is fine! Cherche is not fancy; it just allows you to create hybrid pipelines that rely on both language models and lexical matching, which can help a lot. Also, Cherche is primarily designed for computing embeddings with Sentence Transformers, which have a better ratio <precision / number of parameters>.

Hobohome t1_j54l1jy wrote on January 20, 2023 at 10:23 AM

Reply to [D] Question about using diffusion to denoise images by CurrentlyJoblessFML

While it may not be exactly what you are looking for, Deep Image Priors work similarly and have been around for a while.

Rolling_Pig t1_j54gq7q wrote on January 20, 2023 at 9:23 AM

Reply to [D] ICLR 2023 results. by East-Beginning9987

Coming soon

stardust-sandwich t1_j54em1w wrote on January 20, 2023 at 8:53 AM

Reply to [D] Simple Questions Thread by AutoModerator

I want to pull data from an API (done) and use NLP to categorize that information. Then I want to push the results into a webpage or GUI tool that highlights the text and asks, "Is this correct?" I can then use this GUI to "teach" the model how to classify text.

e.g.

Category 1 - word 1, word 2, word 3 and similar

Category 2 - word 4, word 5, word 6 and so on

It will then try that, come back, and ask me to tune it again; rinse and repeat. Once the model is trained, I want to use it later in a different script: point it at a news article, for example, and have it spit out the data I need.

How can I achieve this, please? What are the best tools and services to get this done? Ideally open source; if not, I'm happy to use a commercial service if it's cheap, as this is just a personal project of mine.

Thanks in advance.
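
For what it's worth, the predict / review / retrain loop described above can be sketched in a few lines. This is only a rough illustration under stated assumptions: `KeywordClassifier`, `review` and the `ask` callback are hypothetical names made up for this example (not from any library), the "model" is a plain word-count scorer rather than a real NLP model, and the reviewer is simulated with a dictionary where a webpage or GUI prompt would go.

```python
import re
from collections import Counter, defaultdict


class KeywordClassifier:
    """Tiny word-count classifier that learns from labeled examples (toy model)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z0-9]+", text.lower())

    def teach(self, text, category):
        """Record a human-confirmed (or corrected) label for a text."""
        self.word_counts[category].update(self.tokenize(text))

    def predict(self, text):
        """Return the category whose known words overlap the text most, or None."""
        tokens = self.tokenize(text)
        scores = {
            cat: sum(counts[t] for t in tokens)
            for cat, counts in self.word_counts.items()
        }
        best = max(scores, key=scores.get, default=None)
        return best if best is not None and scores[best] > 0 else None


def review(model, text, ask):
    """Predict, show the guess to a reviewer, and learn from the answer."""
    guess = model.predict(text)
    label = ask(text, guess)  # reviewer returns the correct category
    model.teach(text, label)
    return guess, label


model = KeywordClassifier()
model.teach("stocks bonds market rally", "finance")
model.teach("football match goal striker", "sport")

# Simulated reviewer: in a real tool this would be a web page or GUI prompt.
answers = {
    "the striker scored a late goal": "sport",
    "central bank raises interest rates": "finance",
}
results = [review(model, t, lambda text, guess: answers[text]) for t in answers]
after = model.predict("bank rates and bonds")
```

In practice the word-count scorer would presumably be swapped for a trained text classifier, with the same loop around it: predict, ask the human, store the corrected label, retrain.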

Seankala t1_j549ygz wrote on January 20, 2023 at 7:52 AM

Reply to [D] Simple Questions Thread by AutoModerator

Are there any Slack channels or Discord Servers for ML practitioners to talk about stuff?

_Arsenie_Boca_ t1_j5492g7 wrote on January 20, 2023 at 7:41 AM

Reply to [P] paper-hero: Yet Another Paper Search Tool by Spico197

I think the idea is great! How long does it take to execute a query on the ArXiv set? Have you considered making a huggingface space out of this?

IntrepidTieKnot t1_j547lq2 wrote on January 20, 2023 at 7:23 AM

Reply to comment by Ok-Cartoonist8114 in [D] is it time to investigate retrieval language models? by hapliniste

I made a tool that chops documents in chunks, creates embeddings for the chunks via GPT-3 and stores the embeddings in a REDIS database. When I make a query, I create an embedding for that and look up my stored embeddings via cosine similarity.

My question is: isn't that the same as what your tool does? In other words, what can I do with Cherche that I cannot do with the setup I described? Is it that I don't need GPT-3 for the same result? Or what is it?
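
For readers following along, the chunk / embed / store / cosine-lookup pipeline described above looks roughly like the sketch below. It is a minimal stand-in under loud assumptions: `embed` is a toy bag-of-words counter in place of GPT-3 embeddings, a plain Python list replaces the Redis store, and every function name here is hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text, size=8):
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents, size=8):
    """Chunk every document and store (chunk, embedding) pairs in a list."""
    return [(c, embed(c)) for doc in documents for c in chunk(doc, size)]


def search(index, query, k=1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [c for c, _ in ranked[:k]]


docs = [
    "Redis is an in-memory key value store often used as a cache.",
    "Cosine similarity compares the angle between two embedding vectors.",
]
index = build_index(docs, size=12)
top = search(index, "how does cosine similarity compare vectors")
```

Only `embed` and the storage layer would change when moving to a real embedding model and a real database; the chunking and ranking logic stays the same.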

kvutxdy t1_j531msz wrote on January 20, 2023 at 1:19 AM

Reply to [D] Inner workings of the chatgpt memory by terserterseness

I asked ChatGPT and it said RNN is used in the system as well. (probably not true)

tennismlandguitar OP t1_j52zysi wrote on January 20, 2023 at 1:07 AM

Reply to comment by Omnes_mundum_facimus in [D] ML Researchers/Engineers in Industry: Why don't companies use open source models more often? by tennismlandguitar

What about finetuning those models to make sure the performance is satisfactory?

Double-Swimmer3495 t1_j52zvqv wrote on January 20, 2023 at 1:07 AM

Reply to [D] ICLR 2023 results. by East-Beginning9987

Jan 21 '23 02:00 AM UTC

drumnation t1_j52yrfo wrote on January 20, 2023 at 12:59 AM

Reply to [D] Inner workings of the chatgpt memory by terserterseness

The API docs don't seem clear on how to recreate the session memory of the main app. It appeared to me as if it uses stop words to achieve this, but I'm still trying to figure out how to emulate conversation memory.

Unlikely-Advice-7168 t1_j52rc7s wrote on January 20, 2023 at 12:06 AM

Reply to comment by Apprehensive-Tax-214 in [P] Built an at-cost, pay per second, open-source API for Tortoise text-to-speech (best I've heard!) by Apprehensive-Tax-214

For clarity, I've tried it with Chrome, Brave and Firefox on mobile and on 2 laptops, with two different GitHub accounts: one I've used in the past and a new one I just made to test it.

Unlikely-Advice-7168 t1_j52r43f wrote on January 20, 2023 at 12:05 AM

Reply to comment by kdr4t3 in [P] Built an at-cost, pay per second, open-source API for Tortoise text-to-speech (best I've heard!) by Apprehensive-Tax-214

Tried it on a separate device using a different github login, same error.

Ok-Cartoonist8114 t1_j52mjrw wrote on January 19, 2023 at 11:33 PM

Reply to [D] is it time to investigate retrieval language models? by hapliniste

Here is a great paper from IBM following the retriever-reader paradigm. Love those "light" models that can be specialized by switching index.

IMO the loss of ChatGPT is still interesting for retriever-reader approaches to generate either human-like or structured answers from input documents.

Here is a tool I made to create a retriever-reader pipeline in a minute: Cherche. I would also recommend Haystack on GitHub!

dancingnightly t1_j52k7sv wrote on January 19, 2023 at 11:16 PM

Reply to [D] is it time to investigate retrieval language models? by hapliniste

Yup, I fully believe retrieval of sources will go up in value over time, in addition to the benefits you have outlined. When lots of things are AI generated, being able to see and trust a source has value (even for an AI-generated summary answer, say).

Omnes_mundum_facimus t1_j52i6fr wrote on January 19, 2023 at 10:57 PM

Reply to [D] ML Researchers/Engineers in Industry: Why don't companies use open source models more often? by tennismlandguitar

1) Because lawyers, and 2) because performance on academic data sets doesn't translate into good performance on whatever domain-specific problem we might be having.

currentscurrents t1_j525hto wrote on January 19, 2023 at 9:33 PM

Reply to comment by hapliniste in [D] is it time to investigate retrieval language models? by hapliniste

Retrieval language models do have some downsides. Keeping a copy of the training data around is suboptimal for a couple reasons:

Training data is huge. Retro's retrieval database is 1.75 trillion tokens. This isn't a very efficient way of storing knowledge, since a lot of the text is irrelevant or redundant.
Training data is still a mix of knowledge and language. You haven't achieved separation of the two types of information, so it doesn't help you perform logic on ideas and concepts.
Most training data is copyrighted. It's currently legal to train a model on copyrighted data, but distributing a copy of the training data with the model puts you on much less firm ground.

Ideally I think you want to condense the knowledge from the training data down into a structured representation, perhaps a knowledge graph. Knowledge graphs are easy to perform logic on and can be human-editable. There's also already an entire sub-field studying them.

EmmyNoetherRing t1_j5253a8 wrote on January 19, 2023 at 9:31 PM

Reply to comment by mycall in [R] Massive Language Models Can Be Accurately Pruned in One-Shot by starstruckmon

>Softmax activation function

Ok, got it. Huh (on reviewing Wikipedia). So, to rephrase the quoted paragraph: they find that the divergence between the training and testing distributions (between the compressed versions of the training and testing data sets in my analogy) starts decreasing smoothly as the scale of the model increases, long before the actual final task performance locks into place successfully.

Hm. Says something more about task complexity (maybe in some computability sense, a fundamental task complexity, that we don't have well defined for those types of tasks yet?). Rather than imagination I think, but I'm still with you on imagination being a factor, and of course the paper and the blog post both leave the cliff problem unsolved. Possibly there's a definition of imagination such that we can say degree X of it is needed to successfully complete those tasks.

emreddit0r t1_j523nlc wrote on January 19, 2023 at 9:22 PM

Reply to [D] The Illustrated Stable Diffusion (Video) by jayalammar

One thing I find glossed over/lacking in the diffusion model materials is the contribution of the UNet.

Coming from someone that is just trying to catch up on what's going on, the UNet seems to play a huge role (if I understand right, this is where the convolutional neural networks are discovering 2d features.)

Relatively speaking, CNNs are kind of old news.. but they're a big deal. Unless I have something wrong? Do you know where I can learn more about how the UNet aspect works in depth?

EmmyNoetherRing t1_j522inn wrote on January 19, 2023 at 9:16 PM

Reply to comment by mycall in [R] Massive Language Models Can Be Accurately Pruned in One-Shot by starstruckmon

So I'm in a different flavor of data science, which means I've got the basic terminology, but not the specifics. I know what a loss function is and what entropy is. What role does "cross" play here? A cross between what?

mycall t1_j51zz0w wrote on January 19, 2023 at 9:01 PM

Reply to [R] Massive Language Models Can Be Accurately Pruned in One-Shot by starstruckmon

I wonder how pruning for sparsity affects the emergent abilities that appear as parameter counts scale.

mycall t1_j51zmqh wrote on January 19, 2023 at 8:59 PM

Reply to comment by EmmyNoetherRing in [R] Massive Language Models Can Be Accurately Pruned in One-Shot by starstruckmon

https://ai.googleblog.com/2022/11/characterizing-emergent-phenomena-in.html

This is another paper worth looking at.

Recent comments in /f/MachineLearning