Recent comments in /f/MachineLearning

suflaj t1_j71c8j2 wrote

Generally, no. It would be better to just use all the classes you need now, and then use masks to regulate which classes are being tested at a given moment. The thing you are suggesting, even when done correctly, would not let the model learn about the relationships between different classes.
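
For the masking idea, a minimal sketch of what that could look like with a PyTorch classifier - the sizes and the active_classes tensor are just illustrative assumptions:

import torch
import torch.nn.functional as F

num_total_classes = 100                              # reserve room for every class you expect to need
active_classes = torch.tensor([0, 1, 2, 3])          # classes present in the data so far

logits = torch.randn(8, num_total_classes)           # stand-in for the model's output
labels = torch.randint(0, 4, (8,))                   # labels only come from active classes

# Push inactive classes to (effectively) zero probability so they get no gradient signal.
mask = torch.full((num_total_classes,), -1e9)
mask[active_classes] = 0.0
loss = F.cross_entropy(logits + mask, labels)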

With neural network surgery, it's trivial to downscale, but fairly hard to upscale.

One thing you could try, e.g., is clustering your images with vanilla pretrained ResNet features. Then, once you need to add new classes, you can look at which images from the new class are most similar to those from existing classes, and you can maybe get away with fine-tuning only on that subset instead of the whole dataset.

Obviously, finalization will include doing at least one epoch on the whole dataset, but that might not be viable to do n times, whereas the similarity method will be, since you can just adjust the similarity threshold.
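
A rough sketch of the similarity trick, assuming a recent torchvision with a pretrained ResNet-50 as the feature extractor; the image tensors and the 0.7 threshold are placeholders:

import torch
import torch.nn.functional as F
import torchvision.models as models

# Vanilla pretrained ResNet with the classifier removed acts as the feature extractor.
resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
resnet.fc = torch.nn.Identity()
resnet.eval()

@torch.no_grad()
def extract(images):                                    # (N, 3, 224, 224), already normalised
    return F.normalize(resnet(images), dim=1)

old_images = torch.rand(16, 3, 224, 224)                # stand-ins for real, preprocessed batches
new_images = torch.rand(4, 3, 224, 224)

old_feats = extract(old_images)                         # images of existing classes
new_feats = extract(new_images)                         # images of the class being added

# Cosine similarity of each existing image to its nearest new-class image.
sim = (old_feats @ new_feats.T).max(dim=1).values

# Fine-tune only on the most similar existing images; adjust the threshold as needed.
subset_idx = (sim > 0.7).nonzero(as_tuple=True)[0]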

3

jimmymvp t1_j71bvhf wrote

The problem with diffusion from an SDE view is that you still don't have exact likelihoods, because you're again not computing the exact Jacobian (to keep it tractable) and you have ODE-solver errors. People mostly resort to the Hutchinson trace estimator, since otherwise it would be too expensive to compute, so I don't think diffusion in this form is going to enter the MCMC world anytime soon.
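
For context, a minimal sketch of the Hutchinson estimator being referred to - it replaces the exact Jacobian trace with an average of v^T J v over random probe vectors; the toy function and probe count are just illustrative assumptions:

import torch

def hutchinson_trace(f, x, n_probes=100):
    # Estimate tr(df/dx) at x without materialising the full Jacobian.
    estimate = torch.zeros(())
    for _ in range(n_probes):
        v = torch.randn_like(x)                              # Gaussian probe vector, E[v v^T] = I
        y = f(x)
        (vjp,) = torch.autograd.grad(y, x, grad_outputs=v)   # v^T J in one backward pass
        estimate = estimate + (vjp * v).sum()
    return estimate / n_probes

# Toy sanity check against the exact trace of a linear map f(x) = A x
A = torch.randn(5, 5)
x = torch.randn(5, requires_grad=True)
print(hutchinson_trace(lambda z: A @ z, x), torch.trace(A))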

1

tripple13 t1_j71b0xh wrote

Certainly, one hundred per cent agree, if I understand you correctly.

Don't know about human entitlement, but from a simple time/energy-limitation perspective:

  • The more time and energy you have in surplus, the more you're able to achieve. Like, what is stopping humankind from populating the universe?

I'm sure time and energy are some of the reasons.

1

EnzoTrent t1_j719xal wrote

I'm aware it is a set of principles.

I keep having the same conversations - it's like you're talking to me about the 2022 pre-season right now in February, right before the Super Bowl. I'm having a hard time with where everyone seems to be at.

I'm sick of explaining things, so I'll assume you're fairly familiar with Data Science.

An AI is going to do the cherry-picking of our lives now - not a human being, or even an algorithm, but a new thing.

Do you believe it is going to look at our data like a human would?

Do you not understand the immensity of what that means for Data Science?

So much new data is about to be collected out of the same world we collect data in now, AND all of the data we collect now is about to be completely re-analyzed - which will also generate new data. All of this new data generated by the AI will then be managed by the AI - people won't be making sense of how they see the world fast enough to keep up, or at all.

The way all of that data is then cross-tabulated, and that data cross-tabulated again - how long do you think human beings are going to be able to understand what is happening? The data won't look anything like the data we see now, but it will be far more accurate.

What if it pulls something like the Meta AI and says, "oh, I see how you structure data - I'm going to do it like this"? The Meta AI created a further breakdown of time to meet its ends more easily - how much harder do you think that made it for any human who now has to account for a new unit of time? I'm assuming it's actually something Meta devs deal with very little - which is my point, but I really do want to stress that we do not understand something that can adopt a new subdivision of time on a whim.

What will AI code that only an AI will ever interact with look like? There is no reason to assume it will look anything like what we would do.

I'm trying to put perspective on the scale and speed. I'm still hung up that you called this hype.

1

A_Again t1_j716fo0 wrote

You could always correlate the existing weights to the existing classes in the dataset and wipe the lowest-N correlated weights from each layer while adding a new output with new weights. This could catastrophically impact performance, but it would also guarantee that you minimize the impact on existing classes ...

I work with AI but can't guarantee this works, since you have no notion of how weights earlier in the network impact later layers....
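
As a very rough sketch of the surgery part (with magnitude pruning standing in for the correlation criterion above, and a plain nn.Linear head assumed):

import torch
import torch.nn as nn

def extend_head(old_head: nn.Linear, n_new: int, prune_frac: float = 0.1) -> nn.Linear:
    # New head with room for n_new extra classes; copy old weights across.
    new_head = nn.Linear(old_head.in_features, old_head.out_features + n_new)
    with torch.no_grad():
        w = old_head.weight.clone()
        # Stand-in for "wipe the lowest-N correlated weights": zero the smallest-magnitude
        # fraction of weights in each existing output unit.
        k = max(1, int(prune_frac * w.shape[1]))
        idx = w.abs().topk(k, dim=1, largest=False).indices
        w.scatter_(1, idx, 0.0)
        new_head.weight[: old_head.out_features] = w
        new_head.bias[: old_head.out_features] = old_head.bias
    return new_head

head = extend_head(nn.Linear(512, 10), n_new=1)          # toy sizes: 10 old classes, 1 new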

1

CyberDainz t1_j715ayh wrote

use trainable normalization

self._in_beta = nn.Parameter(torch.zeros(in_ch), requires_grad=True)     # per-channel shift, init 0
self._in_gamma = nn.Parameter(torch.ones(in_ch), requires_grad=True)     # per-channel scale, init 1
...
self._out_gamma = nn.Parameter(torch.ones(out_ch), requires_grad=True)
self._out_beta = nn.Parameter(torch.zeros(out_ch), requires_grad=True)

...

# broadcast over (N, C, H, W): shift and scale the input channels
x = x + self._in_beta[None, :, None, None]
x = x * self._in_gamma[None, :, None, None]
...
# and the output channels after the block
x = x * self._out_gamma[None, :, None, None]
x = x + self._out_beta[None, :, None, None]

1

devl82 t1_j713box wrote

No, for a couple of reasons, the most important being that software is almost never developed in isolation. You need to interact with other libraries, other engineers, and clients. Software engineering is not about writing the most exotic tree data structure in the least amount of time. I can look that up on Stack Overflow, and I'd argue that (currently) it is faster than writing a whole prompt about it.

2

hblarm t1_j712p9u wrote

For tasks like summarisation and abstractive question answering, there is no single correct way to phrase the target sequence/answer.

“Some of the cups contained brown liquid” means almost the same as “A few vessels had brown fluid in them”. Now imagine how many different ways you could phrase a 4 paragraph essay on globalisation.

In SL, the model is forced to learn the precise answer you feed it, and metrics like ROUGE penalise the use of synonyms. This causes models to perform badly when testing for human preference. The only reliable way to train/evaluate a model to impress humans is to directly incorporate human preferences into training.
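
To make the ROUGE point concrete, here's a tiny, self-contained ROUGE-1 F1 (just unigram overlap, not the official scorer) applied to the two paraphrases above:

from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    pred, ref = Counter(prediction.lower().split()), Counter(reference.lower().split())
    overlap = sum((pred & ref).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / sum(pred.values()), overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

reference = "Some of the cups contained brown liquid"
paraphrase = "A few vessels had brown fluid in them"
print(rouge1_f1(paraphrase, reference))    # ~0.13 - heavily penalised despite equivalent meaning
print(rouge1_f1(reference, reference))     # 1.0 - only the exact wording scores perfectly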

This doesn’t lend itself to SL very well, due to the unlimited possible phrasings of sentences, so instead the authors train a reward function that can estimate human preference, and use RL to update model weights to create better and better predictions. Any valid, nicely written phrasing will now get a good score.
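
A toy sketch of that RL step (REINFORCE with a dummy reward standing in for the learned preference model - the actual papers use PPO and a trained reward model, so this is only illustrative):

import torch
import torch.nn as nn

vocab_size, hidden, seq_len = 50, 32, 12

class TinyPolicy(nn.Module):
    # Stand-in for the summariser: emits one token at a time.
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.cell = nn.GRUCell(hidden, hidden)
        self.out = nn.Linear(hidden, vocab_size)

    def sample(self):
        h = torch.zeros(1, hidden)
        tok = torch.zeros(1, dtype=torch.long)                 # fixed start token
        tokens, log_probs = [], []
        for _ in range(seq_len):
            h = self.cell(self.embed(tok), h)
            dist = torch.distributions.Categorical(logits=self.out(h))
            tok = dist.sample()
            tokens.append(tok)
            log_probs.append(dist.log_prob(tok))
        return torch.cat(tokens), torch.stack(log_probs).sum()

def reward_model(tokens):
    return torch.rand(())                                      # dummy preference score in [0, 1)

policy = TinyPolicy()
optim = torch.optim.Adam(policy.parameters(), lr=1e-3)

for _ in range(100):
    tokens, log_prob = policy.sample()
    loss = -reward_model(tokens) * log_prob                    # REINFORCE: reward-weighted log-likelihood
    optim.zero_grad()
    loss.backward()
    optim.step()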

Importantly, the model they start with is almost SOTA on the summarisation tasks they are learning. So RL can take them further and further towards human preferences.

In a nutshell, RL allows human preference to be trained on directly, which allows the model to exhibit remarkable creativity.

2

visarga t1_j7127ha wrote

> It’s not about ‘threatening’ jobs, but improving certain aspects of it.

Jobs don't just exist by themselves; it's people demanding products and services that causes jobs to exist. In other words, they are a function of human needs and desires.

The question is - can automation satiate all our desires? I don't think so. We will invent new jobs and tasks because we will desire things automation can't provide yet. In a contest between human entitlement and AI advancement, I think entitlement will always win - we will treat everything we have as just basic stuff and want something more. If you asked people from 300 years ago what they think about our lifestyles, they would think we had already reached the singularity, but we know we haven't, because we already feel entitled to what we have.

1

visarga t1_j711jhe wrote

One year ago I tried information extraction from invoices with GPT-3 and it worked very well. Our team had been working on this project for years - collecting data, building labelling tools, training models, etc. - and now this AI does it without any specific training. We shivered, fearing for our future.

Now I've started using GPT-3, and let me tell you - it's not as easy as it looks in the playground. If you use GPT-3, you need to think about prompt design, demonstrations, prompt evaluation, data pre-processing and post-processing (is the extracted text actually present in the source?), using justifications, CoT, or self-consistency. In the end I have so much work I don't know what to do first.
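
As one example of that post-processing, a minimal sketch of checking whether an extracted value actually appears in the source document (the normalisation here is just an assumption):

import re

def normalize(s: str) -> str:
    # Collapse whitespace and case so formatting differences don't cause false rejections.
    return re.sub(r"\s+", " ", s).strip().lower()

def is_grounded(extracted: str, source: str) -> bool:
    # Flag values the model hallucinated rather than copied from the document.
    return normalize(extracted) in normalize(source)

source = "Invoice #10293\nTotal due: 1,250.00 EUR\nDue date: 2023-02-28"
print(is_grounded("1,250.00 EUR", source))   # True  - present verbatim
print(is_grounded("1250 EUR", source))       # False - send for review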

AI will assume a number of tasks and open up other tasks around it so the total amount of work will remain the same - which is as much as people can handle. Software is a weird field - it has been cannibalising itself for decades and decades and yet developers are growing in numbers and compensation. That is a testament to our infinite desire for more.

1

singularineet t1_j710h53 wrote

No matter how hard they try to whack-a-mole them, the biases of the model will come through, particularly by omission. Example? It's super bad about minimizing Jewish history, or saying awful things about the Holocaust like that it was harmful to both the victims and the perpetrators. It's basically like working with a raging racist who's trying to follow a list of very specifically worded instructions from a woke but low functioning autistic HR dept.

0

WikiSummarizerBot t1_j70z6dw wrote

Differential privacy

>Differential privacy (DP) is a system for publicly sharing information about a dataset by describing the patterns of groups within the dataset while withholding information about individuals in the dataset. The idea behind differential privacy is that if the effect of making an arbitrary single substitution in the database is small enough, the query result cannot be used to infer much about any single individual, and therefore provides privacy.
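
For a concrete (toy) illustration of that idea, the classic Laplace mechanism releases a count plus noise scaled to the query's sensitivity, so one person's presence or absence barely changes the answer:

import numpy as np

def private_count(data, predicate, epsilon=0.1):
    # Adding or removing one individual changes a count by at most 1 (the sensitivity),
    # so Laplace noise with scale 1/epsilon gives epsilon-differential privacy.
    true_count = sum(predicate(x) for x in data)
    return true_count + np.random.laplace(scale=1.0 / epsilon)

ages = [23, 35, 41, 52, 67, 29]
print(private_count(ages, lambda a: a > 40))   # noisy answer to "how many people are over 40?"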


1