Recent comments in /f/MachineLearning
liquiddandruff t1_j989luo wrote
Reply to comment by thecodethinker in [R] neural cloth simulation by LegendOfHiddnTempl
the stochastic parrot argument is a weak one; we are stochastic parrots
the phenomenon of "reasoning ability" may be an emergent one that arises out of the recursive identification of structural patterns in input data--which chatgpt is shown to do.
prove that "understanding" is not and cannot ever be reducible to "statistical modelling" and only then is your null position intellectually defensible
CoderHD t1_j989j2g wrote
Reply to comment by MadScientist-1214 in [D] Lion , An Optimizer That Outperforms Adam - Symbolic Discovery of Optimization Algorithms by ExponentialCookie
In my limited testing on a UNet-like CNN, it doesn't even come close to the performance of Adam, sadly. With that said, I might be doing something wrong.
Kumacyin t1_j9889u0 wrote
Reply to [R] neural cloth simulation by LegendOfHiddnTempl
what about clipping? from the users' point of view, we're gonna focus on the stuff we can notice right away, and one of the biggest issues is clipping, which shows up when you mix large motions and object collisions
nuclear_knucklehead t1_j9876bt wrote
Reply to comment by Flag_Red in [R] neural cloth simulation by LegendOfHiddnTempl
Think of the zillions of FEA and CFD simulations done in the engineering world that a fast-running physics model would greatly accelerate and improve. These things are often less visible to the general audience than the high profile stuff you mention, but still have potentially billions of dollars in economic impact and productivity improvements.
liquiddandruff t1_j984iw5 wrote
Reply to comment by Ulfgardleo in [D] Please stop by [deleted]
the point you're missing is we're seeing surprising emergent behaviour from LLMs
ToM is not sentience but it is a necessary condition of sentience
> it is also not clear whether what we measured here is theory of mind
crucially, since we can define ToM, definitionally this is in fact what is being observed
none of the premises you've used are sufficiently strong to preclude LLMs attaining sentience
-
it is not known if interaction with the real world is necessary for the development of sentience
-
memory is important to sentience, but LLMs do have a form of working memory as part of their attention architecture and inference process. is this sufficient though? no one knows
-
sentience, if it has it at all, may be fleeting and strictly limited to the inference stage of the LLM
mind you i agree it's exceedingly unlikely that current LLMs are sentient
but given these weak premises, combined with our lack of understanding of sentience, a confident conclusion like "LLMs cannot ever achieve sentience" is just unwarranted.
the intellectually defensible position is to say you don't know.
blablanonymous t1_j982taj wrote
Reply to [R] neural cloth simulation by LegendOfHiddnTempl
Damn, the more you know… what does the loss function look like for this problem?
W_O_H t1_j97zesm wrote
Reply to [D] Lack of influence in modern AI by I_like_sources
You can fine-tune Stable Diffusion, TTS, and NLP models. You can't expect authors to tend to every user's need; they gave you the tool and have no obligation to teach you how to use it. Yes, some models can't be fine-tuned, but in 99% of cases there is a different one you can fine-tune.
If you really don't like what's out there make your own, the papers exist.
TeamRocketsSecretary t1_j97xsud wrote
Reply to comment by pyepyepie in [D] Please stop by [deleted]
Why overparameterized networks work at all is still an open theoretical question, but the fact that we don't have the full answer doesn't mean the weights are performing "human-like" processing, the same way that classical mechanics pre-Einstein didn't make the corpuscular theory of light any more valid. You all just love to anthropomorphize anything, and the amount of metaphysical mental snake oil that ChatGPT has generated is ridiculous.
But sure. ChatGPT is mildly sentient 🤷‍♂️
squidward2022 t1_j97veu5 wrote
Reply to [D] Relu + sigmoid output activation by mrwafflezzz
Shifting the domain of the sigmoid S from (-infty, infty) to (0, infty) is going to be kind of weird. In the original case we have S(-infty) = 0, S(0) = 1/2, S(infty) = 1, so for any finite logit value w your network outputs, S(w) gives something meaningful. Now if you mentally shift S to be defined on (0, infty), you get S(0) = 0 and S(infty) = 1. What value w would be needed to achieve S(w) = 1/2? infty / 2? It seems important that the sigmoid is defined on the open interval (-infty, infty), not just because we want the logits to be arbitrary valued, but also because we want S to be "expressive" around the logit values we see in practice, which must be finite.
Here is something you could do that doesn't require a shifted sigmoid: you have a network f(x) = w which maps an input x to a score w. Take tanh(f(x)) and you get something with range (-1, 1); any negative w is mapped to a negative value in (-1, 0). Now just take the ReLU of this, relu(tanh(f(x))), and all negative values from the tanh, which come from negative w's, go to 0, while all the positive values, which come from positive w's, are unaffected.
In this way we have: negative w --> (-1, 0) --> 0, and positive w --> (0, 1) --> (0, 1).
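The relu(tanh(·)) composition above can be sketched in a few lines (plain Python for a scalar score w; in practice you'd apply the same composition to a tensor of network outputs):

```python
import math

def squash(w):
    """Map a raw network score w to [0, 1):
    negative scores collapse to exactly 0,
    positive scores land in (0, 1) via tanh."""
    return max(0.0, math.tanh(w))
```

Note that exactly 1 is never reached, since tanh only approaches 1 asymptotically.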
currentscurrents t1_j97v09x wrote
Reply to comment by Cheap_Meeting in [R] [N] In this paper, we show how a conversational model, 3.5x smaller than SOTA, can be optimized to outperform the baselines through Auxiliary Learning. Published in the ACL Anthology: "Efficient Task-Oriented Dialogue Systems with Response Selection as an Auxiliary Task." by radi-cho
In Bulgaria, no less.
__lawless t1_j97v07m wrote
Reply to [D] Relu + sigmoid output activation by mrwafflezzz
Easiest solution: no sigmoid, no ReLU in the last layer, just clamp the output between 0 and 1. Works surprisingly well.
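That clamping idea, sketched in plain Python for a single raw output (in a framework you'd use the equivalent clamp op on the output tensor):

```python
def clamp01(w):
    """Clamp a raw linear output into [0, 1] with no sigmoid/ReLU.
    Caveat: the gradient is zero outside [0, 1], so samples that
    land outside the range get no learning signal from this layer."""
    return min(1.0, max(0.0, w))
```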
bremen79 t1_j97sb9r wrote
Reply to [D] Relu + sigmoid output activation by mrwafflezzz
The sigmoid will make it effectively very hard for the network to produce values close to 1, because that would require a pre-activation value close to infinity. Would this be good behavior in your application?
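A quick numeric check of that saturation (a sketch added here, plain Python):

```python
import math

def sigmoid(z):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))

# Even a pre-activation of 10 only reaches about 0.99995;
# output exactly 1 would require the pre-activation to go to infinity.
```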
Ol_OLUs22 t1_j97s5vv wrote
adversarial examples
fasttosmile t1_j97r2fc wrote
Reply to [D] Things you wish you knew before you started training on the cloud? by I_will_delete_myself
byobu > tmux
Repulsive_Tart3669 t1_j97p211 wrote
Reply to [D] Relu + sigmoid output activation by mrwafflezzz
I believe a common approach is to use a linear activation function for regression problems unless target variable has certain semantics that suggest the use of other non-linearities (sigmoid, tanh etc.). Also consider rescaling your targets instead of trying to match the desired output with activation functions.
From your description (I might be wrong though), it seems like the 0 output is a special case. In that case you might want to first use a binary classifier to split input samples into two classes. For class 0 the output is 0; for class 1 you use a second model (a regressor) that outputs the prediction.
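The two-stage idea could look roughly like this, where `classifier` and `regressor` are hypothetical trained models passed in as callables (purely illustrative, not from the original comment):

```python
def predict(x, classifier, regressor):
    """Two-stage prediction: a binary classifier first decides
    whether the target is the special value 0; only class-1
    inputs are passed on to the regressor."""
    if classifier(x) == 0:
        return 0.0
    return regressor(x)
```

Usage with stand-in models, e.g. `predict(x, lambda x: 0 if x < 0 else 1, my_regressor)`.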
Ulfgardleo t1_j97nb2q wrote
Reply to [D] Relu + sigmoid output activation by mrwafflezzz
sigmoid of 0 is 0.5
tooquickforwords t1_j97lef7 wrote
Reply to comment by iacolippo in [D] Does langchain upload all user’s data to Openai? by westeast1000
If you use the Azure version, the data does not get used elsewhere. It has the same enterprise guarantees as most of Azure.
labloke11 t1_j97hz6s wrote
Reply to [D] Is Google a language transformer like ChatGPT except without the G (Generative) part? by Lets_Gooo_123
Can we remove this posting?
GaseousOrchid t1_j97gxhy wrote
Reply to [D] Simple Questions Thread by AutoModerator
What are some good tools for data pipelines that scale well? I'm locked into Jax/Flax for work, but would like to disconnect from TensorFlow to the greatest extent possible. I was looking at the huggingface dataloaders, does anyone have experience with those?
Appropriate_Ant_4629 t1_j97gjhy wrote
Reply to comment by royalemate357 in [D] Things you wish you knew before you started training on the cloud? by I_will_delete_myself
>egress fees / data transfer fees
On the bright side, ingress is often free.
It costs surprisingly little to stream live video ***into*** the cloud and spew back tiny embedding vectors from models running there.
trnka t1_j97ftj9 wrote
Reply to comment by not_mig in [D] Simple Questions Thread by AutoModerator
I haven't seen a guide on that, but I remember it being challenging! Feel free to post one that's giving you trouble.
buyIdris666 t1_j97eom6 wrote
Reply to comment by currentscurrents in [D] what are some open problems in computer vision currently? by Fabulous-Let-822
Interesting! I didn't realize that
Sandy_dude OP t1_j97eij3 wrote
Reply to comment by PhoibusApollo in [R] Looking for papers which are modified variational autoencoder (VAE) by Sandy_dude
Thanks!
tripple13 t1_j97cxap wrote
Reply to comment by IDefendWaffles in [D] Is Google a language transformer like ChatGPT except without the G (Generative) part? by Lets_Gooo_123
Yes, indeed. While the lightbulb may contain properties which may or may not exhibit the Quantum Tunnel Effect (QTE), one must take great care not to confuse this with the Superposition Lightspeed Diffraction (SDL), as it is of paramount importance, that we do not make light of such phenomena - Essentially making all of humanity into sub-particle atoms in the progress towards enlightenment.
millenial_wh00p t1_j98b5ma wrote
Reply to [R] Using AI/ML for Quality Control for a factory? by aumzzzz
I apologize for how this post might come across, but your question is actually a very deep one, and it will probably take a lot of up-front work to get you an answer. AI/ML is not like cinnamon: you can't just sprinkle it on your business process and expect it to improve.
First you need to start with instrumenting your processes and building your data warehouse. Is your production flow instrumented for quality and efficiency measurement? If so, are the instruments verified? Do you have baseline performance metrics defined and expectations for improvement? Do you currently conduct any statistical process control? All of these questions have books that go with them, and we haven’t even built a trainable model yet.
I would start with some industrial engineering and applied statistics textbooks and go from there. That should give you some idea of how to formulate a hypothesis and determine a method to validate it. From there you can move on to the classics, like An Introduction to Statistical Learning by James et al. and Introduction to Machine Learning by Alpaydin.