Recent comments in /f/MachineLearning
_Arsenie_Boca_ OP t1_j9gix7q wrote
Reply to comment by Professional_Poet489 in [D] Bottleneck Layers: What's your intuition? by _Arsenie_Boca_
If I understand you correctly, that would mean bottlenecks are only interesting when
a) you further use the lower-dimensional features as an output, like in autoencoders, or b) you are interested in knowing whether your features have a lower intrinsic dimension
Neither holds in many cases, such as standard ResNets. Could you elaborate on how you believe bottlenecks act as regularizers?
notdelet t1_j9gija3 wrote
Reply to comment by notdelet in [D] On papers forcing the use of GANs where it is not relevant by AlmightySnoo
For future reference: blocking someone right after replying to them prevents them from responding to your reply. This gives everyone else the false impression that I chose not to respond, when in fact I can't.
marcus_hk t1_j9gij1a wrote
Looks great. Might not be intelligible to those who don't know what they're looking at, though. Maybe include labels of, say, filters, what each slice of input represents, etc.?
Would like to see the same for normalization layers. And RNNs. And transformers. Keep it up!
_Arsenie_Boca_ OP t1_j9ghq1m wrote
Reply to comment by MediumOrder5478 in [D] Bottleneck Layers: What's your intuition? by _Arsenie_Boca_
That makes a lot of sense. Following that train of thought, bottlenecks are somewhat specific to CNNs, right? Or do you see similar reasoning applying in fully connected networks or transformers?
Professional_Poet489 t1_j9gh652 wrote
The theory is that bottlenecks are a compression/regularization mechanism. If the bottleneck has far fewer parameters than the net overall, yet you still get high-quality results from the output, then the bottleneck layer must be capturing the information required to drive the output to the correct results. The fact that these intermediate layers are often used as embeddings indicates that this is a real phenomenon.
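A toy illustration of that compression argument (a hypothetical numpy sketch with made-up dimensions, not anyone's actual architecture): if a 16-dim code can be decoded back into a 256-dim signal, the code must be carrying the information that matters.

```python
import numpy as np

# Toy linear "autoencoder" bottleneck: 256 -> 16 -> 256.
# If reconstructions from the 16-dim code are good, that code must
# capture the information needed to drive the output -- which is why
# such intermediate layers get reused as embeddings.
rng = np.random.default_rng(0)
d_in, d_code = 256, 16

W_enc = rng.normal(scale=d_in ** -0.5, size=(d_in, d_code))
W_dec = rng.normal(scale=d_code ** -0.5, size=(d_code, d_in))

x = rng.normal(size=(32, d_in))   # batch of 32 feature vectors
code = x @ W_enc                  # compressed representation, shape (32, 16)
recon = code @ W_dec              # reconstruction, shape (32, 256)

print(code.shape, recon.shape)
```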
MediumOrder5478 t1_j9ggg6y wrote
Usually it is to increase the receptive field of the network at a given location (more spatial context). Higher resolution features are then recovered via skip connections if necessary
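The receptive-field effect can be made concrete with the standard recurrence (a small sketch; the layer configurations are made up): each conv layer adds (k − 1)·j input positions to the receptive field, where j is the cumulative stride so far.

```python
def receptive_field(layers):
    """Receptive field of a conv stack given (kernel_size, stride) pairs."""
    r, j = 1, 1                # current receptive field, cumulative stride
    for k, s in layers:
        r += (k - 1) * j       # each extra tap reaches (k-1)*j further back
        j *= s
    return r

# three 3x3 convs: stride 1 everywhere vs. stride 2 (downsampling)
print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # 7
print(receptive_field([(3, 2), (3, 2), (3, 2)]))  # 15
```

Same depth, but the strided stack sees more than twice the spatial context, which is the "more context per location" trade that skip connections then compensate for.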
_Arsenie_Boca_ OP t1_j9gg06n wrote
Reply to comment by aMericanEthnic in [D] Bottleneck Layers: What's your intuition? by _Arsenie_Boca_
Thanks for your comment. Could you elaborate? Do you mean bottlenecks don't have any benefit? If so, why would people use them?
yaosio t1_j9gfypb wrote
Reply to [D] Maybe a new prompt injection method against newBing or ChatGPT? Is this kind of research worth writing a paper? by KakaTraining
This has very limited use, as they already have the tools to deal with it. There's a second bot of some kind that reads the chat and deletes things if it doesn't like what it sees. Adding the ability to detect when commands are given through a webpage would close it off. Then you would need some extra-clever methods of working around it, such as putting the page in a format Sydney can read but the moderation bot can't.
aMericanEthnic t1_j9gf0l3 wrote
A bottleneck is typically a point that lies outside of your control. Purposefully implementing one can only be explained as an attempt at ambiguity, in the sense that it tries to create the feel of a real-world constraint. These "bottlenecks" are unnecessary and should be removed...
[deleted] t1_j9gepzg wrote
brucebay t1_j9g8al3 wrote
Thank you for this. I've never used LoRA except as part of Stable Diffusion training. You linked the MS LoRA library too. What are the differences between yours and theirs?
blablanonymous t1_j9g840u wrote
“Can we use ChatGPT to use ChatGPT?” GTFO
[deleted] t1_j9g70ay wrote
Reply to comment by notdelet in [D] On papers forcing the use of GANs where it is not relevant by AlmightySnoo
[removed]
ThrowRA39384749 t1_j9g6g2j wrote
Reply to comment by Dovermore in [D] Simple Questions Thread by AutoModerator
Genomics data can be
notdelet t1_j9g627c wrote
Reply to comment by [deleted] in [D] On papers forcing the use of GANs where it is not relevant by AlmightySnoo
> Assuming Gaussianity and then using maximum likelihood yields an L2 error minimization problem.

Incorrect; that's only true if you fix the scale parameter. I normally wouldn't nitpick like this, but your unnecessary use of bold made me.
> (if you interpret training as maximum likelihood estimation)
> a squared loss does not "hide a Gaussian assumption".
It does... if you interpret training as (conditional) MLE. Give me a non-Gaussian distribution whose MLE estimator yields MSE loss. Also, residuals are explicitly not orthogonal projections whenever the variables are dependent.
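For reference, the fixed-scale case the two sides are arguing over: with the variance $\sigma^2$ held constant, the Gaussian negative log-likelihood over $n$ samples is squared error up to an additive constant and a positive scale,

```latex
-\log p(y \mid x, \theta)
  = \sum_{i=1}^{n} \frac{\bigl(y_i - f_\theta(x_i)\bigr)^2}{2\sigma^2}
  + \frac{n}{2}\log\bigl(2\pi\sigma^2\bigr),
```

so $\arg\min_\theta$ of the NLL coincides with $\arg\min_\theta$ of the MSE. With an input-dependent scale $\sigma(x)$, the per-sample weighting and the $\log\sigma(x)$ terms change the objective, and the equivalence breaks.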
SchweeMe t1_j9g5qqt wrote
You've been looking for a week; I've been trying to create such a model for two years, still nothing.
OpeningVariable t1_j9g5llt wrote
Reply to [D] Maybe a new prompt injection method against newBing or ChatGPT? Is this kind of research worth writing a paper? by KakaTraining
I don't think it can make a "real" research paper, but it's surely interesting to know. Writing it up as a short workshop paper could work. And if you continue working on this and collect multiple instances of observations and injections over time, it could maybe become an overview article, something that could go in a journal.
marcus_hk t1_j9g5hns wrote
Reply to [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
Seems it shouldn't be too difficult to run one stage or layer at a time and cache intermediate results.
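A hypothetical sketch of that idea (numpy matrices stand in for weight shards loaded from disk; all names are made up): keep only one layer resident at a time, carry the cached intermediate activation forward, and free each layer's weights before loading the next. This trades latency for a much smaller memory footprint.

```python
import numpy as np

rng = np.random.default_rng(0)
# stand-ins for per-layer weight shards stored on disk
weight_files = [rng.normal(scale=0.1, size=(8, 8)) for _ in range(4)]

def run_layer_at_a_time(x):
    for i in range(len(weight_files)):
        w = weight_files[i]            # "load" one layer's weights
        x = np.maximum(x @ w, 0.0)     # apply the layer; x is the cached
                                       # intermediate result
        del w                          # "free" the layer before the next
    return x

out = run_layer_at_a_time(rng.normal(size=(1, 8)))
print(out.shape)
```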
Ok-Assignment7469 t1_j9g51o4 wrote
Reply to comment by cat_91 in [D] Maybe a new prompt injection method against newBing or ChatGPT? Is this kind of research worth writing a paper? by KakaTraining
These models are mainly based on reinforcement learning, and the goal is to give you the answer that makes you happiest. If you keep bugging it, eventually it will tell you the password, because you are asking for it. The bot's main goal is to satisfy your questions via probability, not reasoning, because it was not designed to behave reasonably.
currentscurrents t1_j9g3sj6 wrote
Interesting! I feel like one of the biggest uses for LLMs will be controlling other systems using plain English instructions.
BrandonBilliard t1_j9g3111 wrote
Reply to [D] Simple Questions Thread by AutoModerator
Hey,
Many of the proposed legal regulations for systems such as autonomous vehicles mention the need for explainability or transparency in the decision-making processes of said vehicles. My understanding, however, was that due to their reliance on deep learning, this is either extremely hard or impossible to achieve?
Is my understanding correct? Or is explainability possible in deep-learning systems?
friend_of_kalman t1_j9g2f5h wrote
Reply to comment by PassionatePossum in Best free and open Math AI? [D] by lorentzofthetwolakes
>WolframAlpha
The only good answer
friend_of_kalman t1_j9g1k6t wrote
>ChatGPT seems to have good skills to calculate
No, it doesn't. It's a statistical language model, not a rule-based calculator.
3rrr6 t1_j9fxs15 wrote
Reply to comment by bitemenow999 in [D] Is there any AI model that predict stocks for the next minute? by SandraPlugged
Amateur minute.
abnormal_human t1_j9gjycf wrote
Reply to comment by pyepyepie in [D] What would be the ideal map for "learning" machine learning? by Ashb0rn3_
I guarantee you have used stochastic gradient descent before if you've done any significant amount of ML work. This technique and other optimization methods like it are rooted in differential equations.
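A minimal toy illustration (not any particular library's optimizer): gradient descent, the deterministic core that SGD stochastically approximates, is the Euler discretization of the gradient-flow ODE dw/dt = -∇f(w), which is where the differential-equations connection comes from.

```python
# Minimize f(w) = (w - 3)^2 by repeatedly stepping against the gradient.
def gradient_descent(grad, w, lr=0.1, steps=100):
    for _ in range(steps):
        w -= lr * grad(w)   # Euler step of the ODE dw/dt = -grad f(w)
    return w

# gradient of (w - 3)^2 is 2 * (w - 3)
w_star = gradient_descent(lambda w: 2.0 * (w - 3.0), w=0.0)
print(round(w_star, 4))  # converges to 3.0
```

SGD differs only in that `grad` is evaluated on a random mini-batch of data each step, making it a noisy estimate of the true gradient.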