Recent comments in /f/MachineLearning

MediterraneanPirate t1_j77bh7n wrote

Calculus, linear algebra, and statistics are a must in ML, and they're almost always taught in 11th or 12th grade (at least in my country). So I'd recommend not worrying about the math immediately. The best thing you can do is see what you can build with what you know TODAY. If that doesn't satisfy you, you'll immediately know what to learn next. As an added bonus, anything math-related you learn for ML will probably help you at school too.

4

ThirdMover t1_j77bf6z wrote

> I think it's likely the ability to determine what is true and what isn't will come from a capability of the model rather than it being told what is and isn't true. It's not possible to mark text as true or not true, as this assumes whoever is marking these things is the sole authority on the truth and never makes mistakes.

I think there's a bit of a misunderstanding here. The issue isn't that GPT-3 has wrong opinions about things; the issue is that it has no opinions whatsoever about what is or isn't real. Of course any future AI will operate on limited and flawed information, and will thus hold opinions that aren't perfectly true. But before we can even get to that point, a model needs to have "real" and "not real" as fundamental categories in the first place. For GPT-3, everything is just text: Harry Potter is as real as Obama. Maybe I'm wrong and inference can actually get you there through pure consistency checks, as you say. We'll have to see.

5

Dry_Painter9816 t1_j77agm0 wrote

DM me. I can help you create a model so you can see for yourself. Honestly, you just need to simplify your data (it sounds like an Excel spreadsheet) and learn how to interpret the graphs: confusion matrix, ROC curve, feature rankings, MDA values, etc. Basically three or four chapters out of a stats book. I also have a high-level machine learning project, with interpretation by professionals (myself included), that I can share. All in all, knowing the math of course deepens your understanding, but it isn't strictly necessary for creating and interpreting a model. I'm saying this as someone currently completing an AI and ethics grad cert, along with other AI/ML classes.
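Something like this is all I mean. A rough sketch (sklearn's built-in breast cancer dataset stands in for your spreadsheet; swap in your own X/y):

```python
# Rough sketch: fit a model, then read off the numbers/graphs you'd interpret.
# The built-in dataset is a stand-in for your own (simplified) spreadsheet.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)

print(confusion_matrix(y_test, clf.predict(X_test)))           # confusion matrix
print(roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))  # area under the ROC curve
# Permutation importance is one MDA-style way to rank features.
print(permutation_importance(clf, X_test, y_test, random_state=0).importances_mean)
```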

−4

oldkottor t1_j779n81 wrote

You can use common tools and frameworks without math knowledge. Math is needed, though, if you want to make a breakthrough (or even have a chance at one).

You can also concentrate on computer science instead of math and go into optimizing training and inference, which is in high demand right now.

7

Lopsided-Factor-780 t1_j7743wv wrote

Question from a noob:
When they say H_Fuse is fed into the decoder model, such that Y = Decoder(H_Fuse), how is it fed in? Is it fed in like the encoder output in an encoder-decoder transformer with cross-attention, or is it something else?

Also, if there are separate encoder and decoder components, are they trained together or separately?
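In a vanilla encoder-decoder setup I'd imagine the fused states simply replace the encoder output that the decoder cross-attends to, something like this (hedged sketch with HuggingFace T5 and a random placeholder for H_Fuse, not the paper's actual code):

```python
# Guess at the mechanics: substitute the fused states for the encoder output,
# so the decoder cross-attends to H_Fuse instead of the usual encoder states.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer
from transformers.modeling_outputs import BaseModelOutput

model = T5ForConditionalGeneration.from_pretrained("t5-small")
tokenizer = T5Tokenizer.from_pretrained("t5-small")

H_fuse = torch.randn(1, 20, model.config.d_model)  # placeholder fused representation

labels = tokenizer("a target sequence", return_tensors="pt").input_ids
out = model(encoder_outputs=BaseModelOutput(last_hidden_state=H_fuse), labels=labels)
print(out.loss)  # if this loss backprops into H_Fuse, joint training seems natural
```

Is that roughly right?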

3

EmbarrassedHelp OP t1_j76zkur wrote

The future of open source AI seems to be up in the air right now, with the EU potentially seeking to place heavy restrictions on generative AI that would severely hamper or outright ban open source projects.

The EU industry chief Thierry Breton wants generative AI like ChatGPT to be considered "high risk" and thus tightly controlled (including downstream applications), which would make open source versions extremely difficult or even impossible to release: https://www.reuters.com/technology/eus-breton-warns-chatgpt-risks-ai-rules-seek-tackle-concerns-2023-02-03/

26

__lawless t1_j76xpgk wrote

Just 2 points: a) They fine-tuned this model to death, whereas GPT-3.5 only had a handful of examples to tune from. b) This is a multimodal model that consumes the image directly, whereas GPT can only consume text, so they fed it a caption of the image.

26

yaosio t1_j76vwr2 wrote

I think it's likely the ability to determine what is true and what isn't will come from a capability of the model rather than it being told what is and isn't true. It's not possible to mark text as true or not true, as this assumes whoever is marking these things is the sole authority on the truth and never makes mistakes.

At a certain level of capability, the AI will be able to use all of its knowledge to determine what is and isn't true. For example, if you know enough about physics and the Earth, you'll know that the sky is blue without seeing it. For something that can't be confirmed or denied, such as "Bob puts his shoes on before his pants," the AI could estimate the likelihood of such an action based on what it knows about Bob, pants, and shoes.

If it's trained on lies, it could determine they are lies because the data is not consistent. If I teach you that any number plus another number is a number, but that 2+2 is special and equals "chair," you could determine I'm lying because that isn't consistent with the data as a whole.

Truth has a consistency to it that lies don't have, and a model can learn that.

18

dancingnightly t1_j76uuee wrote

Toward this goal, you may find Mixture of Experts architectures interesting (rough sketch at the end of this comment).

I like your idea. I've always thought, too, that in ML we're trying to replicate one human on one task using the world's data for that task, or, more recently, one human on many tasks.

But older ideas, replicating societies and communication for one or many tasks, could be equally or more effective, and that's the direction this heads in. There's a library called GeNN that's pretty useful for these experiments, although it's a little slow due to its deliberately true-to-biology design.
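Not GeNN, but to make the MoE idea concrete, a minimal PyTorch sketch (names and sizes are made up): a gating network softly routes each input across a small "society" of expert networks:

```python
# Minimal Mixture-of-Experts sketch (illustrative, not from any particular paper):
# a learned gate weights the outputs of several independent "expert" networks.
import torch
import torch.nn as nn

class MoE(nn.Module):
    def __init__(self, dim, n_experts=4):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.gate = nn.Linear(dim, n_experts)

    def forward(self, x):
        weights = torch.softmax(self.gate(x), dim=-1)             # (batch, n_experts)
        outs = torch.stack([e(x) for e in self.experts], dim=-1)  # (batch, dim, n_experts)
        return (outs * weights.unsqueeze(1)).sum(-1)              # gate-weighted mix

x = torch.randn(8, 32)
print(MoE(32)(x).shape)  # torch.Size([8, 32])
```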

3

badabummbadabing t1_j76tfqt wrote

Fully agree with you from a technical perspective.

The difference is that, at best, you only get the likelihood under your model of choice. If that happens to be a bad model of reality (which I'd argue is the case more often than not with NFs), you might be better off using some approximate likelihood (or ELBO) from a more powerful model.
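For reference, the ELBO here is the standard lower bound on the model's log-likelihood (θ are the model parameters, q_φ the variational posterior):

```latex
\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right] - \mathrm{KL}\left(q_\phi(z \mid x) \,\|\, p(z)\right)
```

The gap is exactly KL(q_φ(z|x) ‖ p_θ(z|x)), so a better variational posterior gives a tighter likelihood estimate.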

But I am not an expert in MCMC models, so I might be talking out of my depth here. I was mainly using these models for MAP estimation.

1

dancingnightly t1_j76t0gh wrote

In theory, training T5 alongside the image embedding models they use (primarily DETR?) shouldn't take much more than a 3090 or a Colab Pro GPU. You could train T5s on even high-end consumer GPUs in 2020, for example, but the DETR image model probably needs to be run on each image at the same time, which together might take up quite a bit of GPU memory. The `main.py` script looks like a nice, fairly short, typical training script: you could quickly run it by downloading their repo, pulling the ScienceQA dataset, and passing the training args to see if it crashes.

2

matth0x01 t1_j76dt6k wrote

Depends a bit on your skill level and what you want to achieve.

I started with the Introduction to Information Retrieval (2008) book, which felt quite math-heavy at the time, but I learned a lot and found it a good starting point.

You get the concepts of decompounding, the inverted index, ranking functions, etc.
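The inverted index in particular is simple enough to sketch in a few lines of Python:

```python
# Toy inverted index: map each term to the set of documents containing it.
from collections import defaultdict

docs = {0: "cats and dogs", 1: "dogs bark", 2: "cats purr"}

index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

# Boolean query: documents containing both "cats" and "dogs".
print(index["cats"] & index["dogs"])  # {0}
```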

Newer IR strategies use word2vec-style methods for item representation instead of handcrafted ones, or learn the search ranking function directly, which is a different beast compared to traditional search engines.
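If you want to poke at that side, gensim gives you a basic word2vec in a couple of lines (toy corpus, illustrative parameters):

```python
# Toy word2vec item representations with gensim (parameters are illustrative).
from gensim.models import Word2Vec

corpus = [["red", "shoes", "sale"], ["blue", "shoes", "discount"], ["red", "hat"]]
model = Word2Vec(sentences=corpus, vector_size=32, window=2, min_count=1, epochs=50)

print(model.wv["shoes"][:5])         # learned vector for an item/term (first 5 dims)
print(model.wv.most_similar("red"))  # nearest terms by cosine similarity
```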

1

larswl1 t1_j7688oq wrote

I don't know about the newer books, but these seem important to me to start with: they set out the main tasks of information retrieval. And for specific problems there are many individual articles, for example from conferences such as SIGIR.

2

HunteronX t1_j761xqh wrote

The economics are getting there for these models to be big news...
The key features of this work seem to be:

  1. A multimodal embedding representation obtained from individual modality encoders (patch-level for images, token-level for text), combined via attention (rough sketch at the end of this comment).

  2. Rationales are generated first, then answers are inferred from them, to avoid the drop in answer accuracy seen otherwise.
    (Not an expert, but is the greater % of hallucinated rationales in the baseline case, with no vision features, due to the large 'context' needed for both rationale + answer without those features?)

It seems that multimodal representations (language + n=? other modalities) may be important for introducing a loose physical grounding, avoiding the hallucination of plausible-sounding ideas/suggestions while still representing the remaining ideas efficiently.
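Rough sketch of point 1 as I understand it (shapes and the gating are my guesses, not the paper's code): text tokens attend over image patches, then a learned gate mixes the attended features back into the text states:

```python
# Hedged sketch of attention-based fusion: text tokens (queries) attend over
# image patches (keys/values); a learned gate mixes the result into the text states.
import torch
import torch.nn as nn

d = 256
H_text = torch.randn(1, 20, d)   # token-level text encoder output
H_image = torch.randn(1, 49, d)  # patch-level image encoder output

attn = nn.MultiheadAttention(d, num_heads=8, batch_first=True)
H_attn, _ = attn(query=H_text, key=H_image, value=H_image)  # (1, 20, d)

gate = torch.sigmoid(nn.Linear(2 * d, d)(torch.cat([H_text, H_attn], dim=-1)))
H_fuse = (1 - gate) * H_text + gate * H_attn  # fused states, fed on to the decoder
print(H_fuse.shape)  # torch.Size([1, 20, 256])
```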

15