Recent comments in /f/MachineLearning

blablanonymous t1_j7b3vcx wrote

That’s a very narrow perspective. Not all technological progress is inherently good; it depends on what you do with it. These new tools have the potential to create extremely useful applications, but also to destroy many jobs and concentrate wealth even further in the hands of a small population, very rapidly. That can have profound effects on this generation and is definitely worth thinking about. Think of the socioeconomic mess that big tech brought to San Francisco, but at a global scale. SF was heaven 20 years ago. Now it’s hell on earth.

0

race2tb t1_j7az723 wrote

I mean, AIs will be able to generate their own unique art styles just as humans can, and copyright them instantly. Copyright is over once these generative models are doing pretty much everything and reasoning out their own unique solutions. It's time to start thinking about how to restructure society away from human creators toward AI creators. I have no idea how the patent office is going to keep up, honestly, without an AI doing the approvals. I'm pretty sure patents and property rights will no longer be functional concepts in a society where AI is producing everything.

Even politicians' jobs are going to end up being done by AIs in the end: data-driven decision makers with some oversight from human validators.

3

matth0x01 t1_j7ayc9e wrote

Seems that you're more interested in the crawling and ETL side.

Maybe you should look more into the data-warehouse or data-lake literature, especially the paradigm shift from ETL (extract, transform, load) to ELT (extract, load, transform), i.e. schema-on-read.
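For concreteness, here's a toy sketch of the difference (the record fields and function names are made up for illustration): ETL fixes the schema before storage, while ELT stores raw records verbatim and applies a schema only when you read.

```python
import json

# ETL: transform before loading -- only the curated columns survive.
def etl_load(raw_records):
    """Apply a fixed schema up front, then store the cleaned rows."""
    return [{"url": r["url"], "title": r.get("title", "")} for r in raw_records]

# ELT / schema-on-read: load raw data as-is, decide the schema when querying.
def elt_store(raw_records):
    return [json.dumps(r) for r in raw_records]  # keep everything verbatim

def elt_read(stored, fields):
    """Project onto a schema only at read time."""
    return [{f: json.loads(s).get(f) for f in fields} for s in stored]

raw = [{"url": "https://example.com", "title": "Home", "html": "<html>...</html>"}]
print(etl_load(raw))                      # schema was fixed at load time
print(elt_read(elt_store(raw), ["url"]))  # schema chosen at read time
```

The practical upshot for scraping: with ELT you never lose fields you didn't anticipate needing, which matters when the pages you crawl change shape.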

2

---AI--- t1_j7au2sj wrote

The Chinese room experiment is proof that a Chinese room can be sentient. There's no difference between a Chinese room and a human brain.

> It doesn't consider the context of the problem because it has no context.

I don't know what you mean here, so could you please give a specific example of a question that you think ChatGPT and similar models will never be able to answer correctly?

2

Dr_Love2-14 t1_j7aqm6x wrote

During model training, I imagine the model would benefit from some form of "self-reflection" at recurring intervals, similar to human sleep. As a crude workflow, one could design the model to recall, through auto-prompting onto a context window, everything it's learned that is relevant to the newly exposed training data. The model then makes a rational decision (following a constant pre-encoded prompt) to restate the information and classify it as factual or non-factual, and this self-generated text is backpropagated into the model.
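Roughly, the loop I have in mind looks something like this toy sketch (`DummyModel` and its methods are placeholders standing in for a real LM and optimizer, not any actual API):

```python
# A toy, runnable sketch of the "self-reflection" loop described above.
REFLECT_PROMPT = (
    "Recall what you know that is relevant to the text above; "
    "restate each claim and label it FACTUAL or NON-FACTUAL."
)

class DummyModel:
    def __init__(self):
        self.training_log = []  # stands in for the model's weights

    def generate(self, context):
        # A real model would produce labeled restatements here.
        return f"[reflection on: {context[:30]}...]"

    def train_step(self, texts):
        # Stands in for backpropagating on the self-generated text.
        self.training_log.extend(texts)

def reflection_step(model, new_batch):
    reflections = []
    for example in new_batch:
        # Auto-prompt: new data plus a fixed instruction in the context window.
        reflections.append(model.generate(example + "\n" + REFLECT_PROMPT))
    model.train_step(reflections)  # learn from its own restatements
    return reflections

m = DummyModel()
reflection_step(m, ["The sky is green.", "Water boils at 100 C."])
print(len(m.training_log))  # one self-generated reflection per example
```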

(Disclaimer: I follow ML research as a layman)

1

Ggronne OP t1_j7aj3co wrote

I have written small web scrapers for different applications, but none were based on theory. An upcoming project requires more extensive information retrieval, so I'd like to build a better foundation.

I will start with Introduction to Information Retrieval, thanks!

1

jimmymvp t1_j7aend6 wrote

Indeed, if your model is bad at modeling the data, there's not much use in computing likelihoods, and if you just want to sample images that look cool, you don't care much about likelihoods. However, there are certain use cases where we care about exact likelihoods: estimating normalizing constants and providing guarantees for MCMC. Granted, you can always run MCMC with something close to a proposal distribution, but obtaining nice guarantees on convergence and mixing times (correctness?) is difficult then; I don't know how you are supposed to do this when using a proposal for which you can't evaluate the likelihood. Similarly, with importance sampling you can only obtain correct weights if you have the correct likelihoods; otherwise the error is not just in the model but also in the estimator.

This is how I see it, at least, but I'll be sure to read the aforementioned paper. I'm also not sure how much having only a lower bound hurts you in estimation.
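To make the importance-sampling point concrete, here's a minimal self-contained sketch (toy 1-D Gaussians chosen for illustration): the weights are exactly p(x)/q(x), so if you can only evaluate a bound or approximation of p, the bias lands in the estimator itself, not just the model.

```python
import math
import random

random.seed(0)

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def importance_estimate(f, p, q_sample, q_pdf, n=100_000):
    """Estimate E_p[f(x)] using samples from the proposal q."""
    total = 0.0
    for _ in range(n):
        x = q_sample()
        total += (p(x) / q_pdf(x)) * f(x)  # weight = p(x) / q(x), needs exact p
    return total / n

# Target p = N(1, 1), proposal q = N(0, 2); estimate E_p[x] (true value: 1).
p = lambda x: normal_pdf(x, 1.0, 1.0)
q_pdf = lambda x: normal_pdf(x, 0.0, 2.0)
q_sample = lambda: random.gauss(0.0, 2.0)

est = importance_estimate(lambda x: x, p, q_sample, q_pdf)
print(round(est, 2))  # close to 1.0
```

Replace `p` with `c * p` for an unknown constant `c` (which is effectively what an unnormalized or bounded likelihood gives you) and the estimate is scaled by `c`: the estimator is wrong even if the model is perfect.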

2

Myxomatosiss t1_j7abejl wrote

That's a fantastic question. ChatGPT is a replication of associative memory with an attention mechanism: it has associated strings with other strings based on a massive amount of experience. However, it doesn't contain a buffer that it works through. We have a working space in our heads where we can replay information; ChatGPT does not. In fact, when you pump in an input, it cycles through the associative calculations, comes to an instantaneous answer, and then ceases to function until another call is made. It doesn't consider the context of the problem because it has no context. Any context it has is inherited from its training set.

To compare it with the Chinese room experiment, imagine that those reading the output of the Chinese room found it to have some affect. Maybe it has a dry sense of humor, or is a bit of an airhead. That affect would come exclusively from the data set, and not from some bias in the room.

I really encourage you to read more about neuroscience if you'd like to learn more. There have been brilliant minds considering intelligence since long before we were born, and every ML accomplishment has been inspired by their work.
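The "ceases to function until another call is made" point can be illustrated with a toy stand-in (a hypothetical interface, not the real API): the model is a pure function of its input, and any apparent memory exists only because the caller re-sends the transcript each time.

```python
# Toy illustration of statelessness: the "model" keeps nothing between calls.
def model(prompt: str) -> str:
    # Stand-in for one forward pass: input in, answer out, nothing retained.
    return f"answer to: {prompt.splitlines()[-1]}"

transcript = []
for user_msg in ["What is 2+2?", "And doubled?"]:
    transcript.append(user_msg)
    reply = model("\n".join(transcript))  # the whole history goes back in
    transcript.append(reply)

# The function itself held no state between the two calls; only the
# caller-maintained transcript did.
print(transcript)
```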

1