Recent comments in /f/MachineLearning

Soundwave_47 t1_j8fu3r6 wrote

Somewhat, and no.

We generally define AGI as an intelligence (which, in the current paradigm, would be a set of algorithms) that has decision-making and inference capabilities across a broad set of areas, and is able to improve its understanding of what it does not know. Think of it like school subjects: it might not be an expert in all of {math, science, history, language, economics}, but it has some notion of how to do basic work in all of those areas.

This is extremely vague and not universally agreed upon (for example, some say it should exceed peak human capabilities in all tasks).

1

farmingvillein t1_j8ftdg9 wrote

Some helpful gut checks:

  1. Do you have reason to believe that your method will scale (with parameters and data)? Maybe (probably) you can't actually test things at Google scale--but if you have good theoretical reasons to believe that your method would be accretive at scale, that is a major +. (One cheap sanity check is sketched after this list.)

Yes, getting things to run really well at small scale can be of (sometimes extreme!) value--but on its own it's simply going to draw less interest from reviewers. There have been a bazillion hacky ML methods that turn out to be entirely irrelevant once you scale up substantially, and people are wary of such papers/discussions.

If you've got to go down this path, then make sure to position it explicitly as hyper-optimizing small-scale models (like for mobile).

  2. Do you have good reasons to believe that the "top" paper plus your method would further boost SOTA? Even better, can you test it to confirm?

If your method is--at its theoretical core--simply a twist on a subset of the methods from that SOTA used, then you're going to see much less paper interest, unless you can promise significant improvements in simplicity/efficiency.

> But this "SOTA" paper uses some methods that just don't seem practical for applications at all.

  3. Can you demonstrate the superiority of your method on some of these other applications--e.g., establishing SOTA on some subset? That can be helpful.
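
On point 1, one cheap sanity check: fit a power law loss ~ a * N^(-b) to a handful of small-scale runs and compare the fitted exponent for your method against the baseline. A rough sketch (all numbers made up for illustration):

```python
import numpy as np

# Hypothetical eval losses at model sizes you can actually afford to train.
params = np.array([1e6, 3e6, 1e7, 3e7])
loss_baseline = np.array([4.1, 3.6, 3.2, 2.9])
loss_yours = np.array([4.0, 3.4, 2.9, 2.5])

def scaling_exponent(n, loss):
    # Linear fit in log-log space: log(loss) = log(a) - b * log(n).
    slope, _ = np.polyfit(np.log(n), np.log(loss), 1)
    return -slope  # b: larger means loss falls faster with scale

print(scaling_exponent(params, loss_baseline))  # baseline exponent
print(scaling_exponent(params, loss_yours))     # yours--larger is encouraging
```

A steeper fitted exponent is weak evidence rather than proof--curves can cross--but it's the kind of thing reviewers respond to.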
2

AdamAlexanderRies t1_j8fsdyh wrote

Oh, yes! My mistake. Definitely check out Discord. PM me here if you want to add me there :)

A couple public servers you should probably glance at:

https://discord.com/invite/openai

https://discord.com/invite/midjourney

You can use the Midjourney bot to make your own images if you go to one of their "newbie-##" rooms and type "/imagine [prompt]".

1

farmingvillein t1_j8frv87 wrote

> not to use language models to interact with the world (which seems trivial to me, sorry),

The best argument here is that "true" intelligence requires "embedded" agents, i.e., agents that can interact with our world (or at least "a" world) in order to learn.

Obviously, no one actually knows what will make AGI work, if anything...but it isn't a unique/fringe view that OP is suggesting.

1

Remarkable_Ad9528 t1_j8frhma wrote

Right now I'm just writing updates, but every publication includes a new tool or code snippet. I just started last week, so it's evolving. Next week I'm going to add more AI tutorial videos to my YouTube channel that run through how to use LangChain to wire different tools together and use them with an LLM for some application. I'm thinking I'll do a lot of small tutorials in Jupyter notebooks and push them to a public repo on GitHub, then link to the script I'm referencing in the engineering section of the email I send out. I have to poll my audience to see if that's something they're interested in first. I think it would be though…
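
To give a flavor of what those notebooks will cover--a minimal sketch, assuming the classic LangChain agent API and an OPENAI_API_KEY set in the environment (the question is just a placeholder):

```python
from langchain.agents import initialize_agent, load_tools
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)                # the LLM that does the reasoning
tools = load_tools(["llm-math"], llm=llm)  # a calculator tool the agent can call
agent = initialize_agent(
    tools, llm, agent="zero-shot-react-description", verbose=True
)
agent.run("What is 2**12 divided by 7, rounded to two decimal places?")
```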

1

cd_1999 t1_j8fmlej wrote

Have you heard of Searle's Chinese Room?

Some people (sorry, I can't give you references off the top of my head) argue there's something special about the biological nervous system, so the material substrate is not irrelevant. (Sure, you could reverse-engineer the whole biological system, but that would probably take much longer.)

4

coolmlgirl t1_j8fmfpi wrote

I used the OctoML platform (https://octoml.ai/) to optimize your model and got your average inference latency down to 2.14 ms on an AWS T4 GPU. On an Ice Lake CPU I can get the latency down to 27.47 ms. I'm assuming shapes of [1, 128] for your inputs "input_ids", "attention_mask", and "token_type_ids", but I want to confirm your actual shapes so that we're comparing apples to apples. Do you know what shapes you're using?
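
If you want to reproduce the measurement on your side, here's roughly the harness I'd use--a sketch assuming an ONNX export of your model ("model.onnx" and the vocab size are placeholders; swap in CUDAExecutionProvider for the GPU numbers):

```python
import time
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
batch, seq_len = 1, 128  # the [1, 128] shapes assumed above
feed = {
    "input_ids": np.random.randint(0, 30522, size=(batch, seq_len), dtype=np.int64),
    "attention_mask": np.ones((batch, seq_len), dtype=np.int64),
    "token_type_ids": np.zeros((batch, seq_len), dtype=np.int64),
}
for _ in range(10):  # warm-up runs
    sess.run(None, feed)
n = 100
start = time.perf_counter()
for _ in range(n):
    sess.run(None, feed)
print(f"avg latency: {(time.perf_counter() - start) / n * 1e3:.2f} ms")
```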

1

patrickkidger t1_j8fdtx2 wrote

Heads-up that my newer jaxtyping project now exists.

Despite the name, it supports both PyTorch and JAX; it is also substantially less hacky than TorchTyping! As such, I recommend jaxtyping over TorchTyping regardless of your framework.

(jaxtyping is now widely used internally.)
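
A minimal sketch of what the annotations look like, here with PyTorch tensors (the function itself is just an illustration):

```python
import torch
from jaxtyping import Float

# Dimension names like "batch" must be consistent across arguments.
def linear(
    x: Float[torch.Tensor, "batch features_in"],
    w: Float[torch.Tensor, "features_in features_out"],
) -> Float[torch.Tensor, "batch features_out"]:
    return x @ w
```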

3

patrickkidger t1_j8fde35 wrote

On static shape checking: have a look at jaxtyping, which offers shape annotations for JAX/PyTorch/etc. (enforced at runtime via a typechecker).

(Why "JAX"typing? Because it originally only supported JAX. But it now supports other frameworks too! In particular I now recommend jaxtyping over my older "TorchTyping" project, which is pretty undesirably hacky.)
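
For actually enforcing the annotations at runtime, they pair with a typechecker--a minimal sketch with JAX and beartype, following the decorator stacking the docs describe (the function is illustrative):

```python
import jax.numpy as jnp
from beartype import beartype
from jaxtyping import Array, Float, jaxtyped

@jaxtyped   # checks the shape annotations...
@beartype   # ...with beartype doing the runtime type checking
def add_bias(
    x: Float[Array, "batch dim"],
    b: Float[Array, "dim"],
) -> Float[Array, "batch dim"]:
    return x + b

add_bias(jnp.ones((4, 3)), jnp.zeros((3,)))  # OK; a shape mismatch raises an error
```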

In terms of fitting this kind of stuff into a proper language: that'd be lovely. I completely agree that the extent to which we have retrofitted Python is pretty crazy!

1

Borrowedshorts t1_j8fc7bk wrote

I'd say it's the opposite: 2 million members didn't sign up for this sub for academic-only discussions. If you want that, it would be best to start a subreddit expressly for that purpose. ChatGPT is changing the world, so calling those posts low quality is just gatekeeping discussions away from what people actually want to participate in.

−2
