Recent comments in /f/MachineLearning
_Arsenie_Boca_ OP t1_j9gix7q wrote
Reply to comment by Professional_Poet489 in [D] Bottleneck Layers: What's your intuition? by _Arsenie_Boca_
If I understand you correctly, that would mean bottlenecks are only interesting when
a) you further use the lower-dimensional features as an output, like in autoencoders, or b) you are interested in knowing whether your features have a lower intrinsic dimension
Neither holds in many cases, such as standard ResNets. Could you elaborate on how you believe bottlenecks act as regularizers?
notdelet t1_j9gija3 wrote
Reply to comment by notdelet in [D] On papers forcing the use of GANs where it is not relevant by AlmightySnoo
For future reference: blocking someone right after replying to them prevents them from responding to your reply. This gives everyone else the false impression that I chose not to respond, when in fact I can't.
marcus_hk t1_j9gij1a wrote
Looks great. Might not be intelligible to those who don't know what they're looking at, though. Maybe include labels of, say, filters, what each slice of input represents, etc.?
Would like to see the same for normalization layers. And RNNs. And transformers. Keep it up!
_Arsenie_Boca_ OP t1_j9ghq1m wrote
Reply to comment by MediumOrder5478 in [D] Bottleneck Layers: What's your intuition? by _Arsenie_Boca_
That makes a lot of sense. Following that train of thought, bottlenecks are somewhat specific to CNNs, right? Or do you see similar reasoning applying in fully connected networks or transformers?
Professional_Poet489 t1_j9gh652 wrote
The theory is that bottlenecks are a compression/regularization mechanism. If the bottleneck has far fewer parameters than the net overall, yet you still get high-quality results from the output, then the bottleneck layer must be capturing the information required to drive the output to the correct results. The fact that these intermediate layers are often used as embeddings indicates that this is a real phenomenon.
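A toy illustration of that compression argument (a hypothetical numpy sketch with made-up dimensions, not anyone's actual architecture): if a 16-dim code can be decoded back into a 256-dim signal, the code must be carrying the information that matters.

```python
import numpy as np

# Toy linear "autoencoder" bottleneck: 256 -> 16 -> 256.
# If reconstructions from the 16-dim code are good, that code must
# capture the information needed to drive the output -- which is why
# such intermediate layers get reused as embeddings.
rng = np.random.default_rng(0)
d_in, d_code = 256, 16

W_enc = rng.normal(scale=d_in ** -0.5, size=(d_in, d_code))
W_dec = rng.normal(scale=d_code ** -0.5, size=(d_code, d_in))

x = rng.normal(size=(32, d_in))   # batch of 32 feature vectors
code = x @ W_enc                  # compressed representation, shape (32, 16)
recon = code @ W_dec              # reconstruction, shape (32, 256)

print(code.shape, recon.shape)
```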
MediumOrder5478 t1_j9ggg6y wrote
Usually it is to increase the receptive field of the network at a given location (more spatial context). Higher resolution features are then recovered via skip connections if necessary
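The receptive-field effect can be made concrete with the standard recurrence (a small sketch; the layer configurations are made up): each conv layer adds (k − 1)·j input positions to the receptive field, where j is the cumulative stride so far.

```python
def receptive_field(layers):
    """Receptive field of a conv stack given (kernel_size, stride) pairs."""
    r, j = 1, 1                # current receptive field, cumulative stride
    for k, s in layers:
        r += (k - 1) * j       # each extra tap reaches (k-1)*j further back
        j *= s
    return r

# three 3x3 convs: stride 1 everywhere vs. stride 2 (downsampling)
print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # 7
print(receptive_field([(3, 2), (3, 2), (3, 2)]))  # 15
```

Same depth, but the strided stack sees more than twice the spatial context, which is the "more context per location" trade that skip connections then compensate for.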
_Arsenie_Boca_ OP t1_j9gg06n wrote
Reply to comment by aMericanEthnic in [D] Bottleneck Layers: What's your intuition? by _Arsenie_Boca_
Thanks for your comment. Could you elaborate? Do you mean bottlenecks don't have any benefit? If so, why would people use them?
yaosio t1_j9gfypb wrote
Reply to [D] Maybe a new prompt injection method against newBing or ChatGPT? Is this kind of research worth writing a paper? by KakaTraining
This has very limited use, as they already have the tools to deal with it. There's a second bot of some kind that reads the chat and deletes things if it doesn't like what it sees. Adding the ability to detect when commands are given through a webpage would close it off. Then you would need some extra-clever methods of working around it, such as putting the page in a format Sydney can read but the moderation bot can't.
aMericanEthnic t1_j9gf0l3 wrote
A bottleneck is typically a point that lies outside of your control. Purposefully implementing one can only be explained as an attempt at ambiguity, in the sense that it tries to create the feel of a real-world constraint. These "bottlenecks" are unnecessary and should be removed...
[deleted] t1_j9gepzg wrote
brucebay t1_j9g8al3 wrote
Thank you for this. I've never used LoRA except as part of Stable Diffusion training. You linked the MS LoRA library too. What are the differences between yours and theirs?
blablanonymous t1_j9g840u wrote
“Can we use ChatGPT to use ChatGPT?” GTFO
[deleted] t1_j9g70ay wrote
Reply to comment by notdelet in [D] On papers forcing the use of GANs where it is not relevant by AlmightySnoo
[removed]
ThrowRA39384749 t1_j9g6g2j wrote
Reply to comment by Dovermore in [D] Simple Questions Thread by AutoModerator
Genomics data can be
notdelet t1_j9g627c wrote
Reply to comment by [deleted] in [D] On papers forcing the use of GANs where it is not relevant by AlmightySnoo
> Assuming Gaussianity and then using maximum likelihood yields an L2 error minimization problem.

Incorrect; that's only true if you fix the scale parameter. I normally wouldn't nitpick like this, but your unnecessary use of bold made me.
> (if you interpret training as maximum likelihood estimation)
> a squared loss does not "hide a Gaussian assumption".
It does... if you interpret training as (conditional) MLE. Give me a non-Gaussian distribution whose MLE estimator yields MSE loss. Also, residuals are explicitly not orthogonal projections whenever the variables are dependent.
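For reference, the fixed-scale case the two sides are arguing over: with the variance $\sigma^2$ held constant, the Gaussian negative log-likelihood over $n$ samples is squared error up to an additive constant and a positive scale,

```latex
-\log p(y \mid x, \theta)
  = \sum_{i=1}^{n} \frac{\bigl(y_i - f_\theta(x_i)\bigr)^2}{2\sigma^2}
  + \frac{n}{2}\log\bigl(2\pi\sigma^2\bigr),
```

so $\arg\min_\theta$ of the NLL coincides with $\arg\min_\theta$ of the MSE. With an input-dependent scale $\sigma(x)$, the per-sample weighting and the $\log\sigma(x)$ terms change the objective, and the equivalence breaks.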
SchweeMe t1_j9g5qqt wrote
You've been looking for a week; I've been trying to create such a model for two years, still nothing.
OpeningVariable t1_j9g5llt wrote
Reply to [D] Maybe a new prompt injection method against newBing or ChatGPT? Is this kind of research worth writing a paper? by KakaTraining
I don't think it can make a "real" research paper, but it's surely interesting to know. Writing it up as a short workshop paper could work. And if you continue working on this and collect multiple instances of observations and injections over time, it could maybe become an overview article, something that could go in a journal.
marcus_hk t1_j9g5hns wrote
Reply to [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
Seems it shouldn't be too difficult to run one stage or layer at a time and cache intermediate results.
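A hypothetical sketch of that idea (numpy matrices stand in for weight shards loaded from disk; all names are made up): keep only one layer resident at a time, carry the cached intermediate activation forward, and free each layer's weights before loading the next. This trades latency for a much smaller memory footprint.

```python
import numpy as np

rng = np.random.default_rng(0)
# stand-ins for per-layer weight shards stored on disk
weight_files = [rng.normal(scale=0.1, size=(8, 8)) for _ in range(4)]

def run_layer_at_a_time(x):
    for i in range(len(weight_files)):
        w = weight_files[i]            # "load" one layer's weights
        x = np.maximum(x @ w, 0.0)     # apply the layer; x is the cached
                                       # intermediate result
        del w                          # "free" the layer before the next
    return x

out = run_layer_at_a_time(rng.normal(size=(1, 8)))
print(out.shape)
```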
Ok-Assignment7469 t1_j9g51o4 wrote
Reply to comment by cat_91 in [D] Maybe a new prompt injection method against newBing or ChatGPT? Is this kind of research worth writing a paper? by KakaTraining
These models are mainly based on reinforcement learning, and the goal is to give you the answer that makes you happiest. If you keep bugging it, eventually it will tell you the password, because you are asking for it. The bot's main goal is to satisfy your questions via probability, not reasoning, because it was not designed to behave reasonably.
currentscurrents t1_j9g3sj6 wrote
Interesting! I feel like one of the biggest uses for LLMs will be controlling other systems using plain English instructions.
BrandonBilliard t1_j9g3111 wrote
Reply to [D] Simple Questions Thread by AutoModerator
Hey,
Many of the proposed legal regulations for systems such as autonomous vehicles mention the need for explainability or transparency in the decision-making processes of said vehicles. My understanding, however, was that due to their reliance on deep learning, this is either extremely hard or impossible to achieve?
Is my understanding correct? Or is explainability possible in deep-learning systems?
friend_of_kalman t1_j9g2f5h wrote
Reply to comment by PassionatePossum in Best free and open Math AI? [D] by lorentzofthetwolakes
>WolframAlpha
The only good answer
friend_of_kalman t1_j9g1k6t wrote
>ChatGPT seems to have good skills to calculate
No, it doesn't. It's a statistical language model, not a rule-based calculator.
3rrr6 t1_j9fxs15 wrote
Reply to comment by bitemenow999 in [D] Is there any AI model that predict stocks for the next minute? by SandraPlugged
Amateur minute.
abnormal_human t1_j9gjycf wrote
Reply to comment by pyepyepie in [D] What would be the ideal map for "learning" machine learning? by Ashb0rn3_
I guarantee you have used stochastic gradient descent before if you've done any significant amount of ML work. This technique and other optimization methods like it are rooted in differential equations.
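A minimal toy illustration (not any particular library's optimizer): gradient descent, the deterministic core that SGD stochastically approximates, is the Euler discretization of the gradient-flow ODE dw/dt = -∇f(w), which is where the differential-equations connection comes from.

```python
# Minimize f(w) = (w - 3)^2 by repeatedly stepping against the gradient.
def gradient_descent(grad, w, lr=0.1, steps=100):
    for _ in range(steps):
        w -= lr * grad(w)   # Euler step of the ODE dw/dt = -grad f(w)
    return w

# gradient of (w - 3)^2 is 2 * (w - 3)
w_star = gradient_descent(lambda w: 2.0 * (w - 3.0), w=0.0)
print(round(w_star, 4))  # converges to 3.0
```

SGD differs only in that `grad` is evaluated on a random mini-batch of data each step, making it a noisy estimate of the true gradient.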