Recent comments in /f/MachineLearning
Soundwave_47 t1_j8fu3r6 wrote
Reply to comment by kaityl3 in [R] [N] Toolformer: Language Models Can Teach Themselves to Use Tools - paper by Meta AI Research by radi-cho
Somewhat, and no.
We generally define AGI as an intelligence (which, in the current paradigm, would be a set of algorithms) that has decision-making and inference capabilities across a broad set of areas, and that is able to improve its understanding of what it does not know. Think of it like school subjects: it might not be an expert in all of {math, science, history, language, economics}, but it has some notion of how to do basic work in all of those areas.
This is extremely vague and not universally agreed upon (for example, some say it should exceed peak human capabilities in all tasks).
farmingvillein t1_j8ftdg9 wrote
Reply to [D] Is a non-SOTA paper still good to publish if it has an interesting method that does have strong improvements over baselines (read text for more context)? Are there good examples of this kind of work being published? by orangelord234
Some helpful gut checks:
- Do you have reason to believe that your method will scale (with parameters and data)? Maybe (probably) you can't actually test things at Google scale--but if you have good theoretical reasons to believe that your method would be accretive at scale, that is a major plus.
Yes, getting things to run really well at small scale can be of (sometimes extreme!) value--but you're simply going to see less interest from reviewers on its own. There have been a bazillion hacky ML methods that turn out to be entirely irrelevant once you scale up substantially, and people are wary of such papers/discussions.
If you've got to go down this path, then make sure to position it explicitly as hyper-optimizing small-scale models (like for mobile).
- Do you have good reasons to believe that the "top" paper plus your method would further boost SOTA? Even better, can you test it to confirm?
If your method is--at its theoretical core--simply a twist on a subset of the methods that SOTA paper used, then you're going to see much less interest in your paper, unless you can promise significant improvements in simplicity/efficiency.
> But this "SOTA" paper uses some methods that just don't seem practical for applications at all.
- Can you demonstrate the superiority of your method on some of these other applications? So that you can, e.g., create an SOTA in some sort of subset? That can be helpful.
AdamAlexanderRies t1_j8fsdyh wrote
Reply to comment by daking999 in [D] Quality of posts in this sub going down by MurlocXYZ
Oh, yes! My mistake. Definitely check out discord. PM me here if you want to add me there :)
A couple public servers you should probably glance at:
https://discord.com/invite/openai
https://discord.com/invite/midjourney
You can use the Midjourney bot to make your own images by going to one of their "newbie-##" rooms and typing "/imagine [prompt]"
farmingvillein t1_j8frv87 wrote
Reply to comment by pyepyepie in [R] [N] Toolformer: Language Models Can Teach Themselves to Use Tools - paper by Meta AI Research by radi-cho
> not to use language models to interact with the world (which seems trivial to me, sorry),
The best argument here is that "true" intelligence requires "embedded" agents, i.e., agents that can interact with our (or, at least, "a") world (to learn).
Obviously, no one actually knows what will make AGI work, if anything...but it isn't a unique/fringe view OP is suggesting.
Remarkable_Ad9528 t1_j8frhma wrote
Reply to comment by ilovethrills in [D] What ML or ML-powered projects are you currently building? by TikkunCreation
Right now I’m just writing updates, but every publication includes a new tool or code snippet. I just started last week, so it’s evolving. Next week I’m going to add more AI tutorial videos to my YouTube channel that run through how to use LangChain to wire different tools together and use them with an LLM for some application. I’m thinking I’ll do a lot of small tutorials in Jupyter notebooks and push them to a public repo on GitHub, then include links to the script I’m referencing in the engineering section of the email I send out. I have to poll my audience first to see if that’s something they’re interested in. I think it would be, though…
starfries t1_j8fp30e wrote
Reply to comment by tysam_and_co in [D] Quality of posts in this sub going down by MurlocXYZ
What's the current understanding of why/when batch norm works? I haven't kept up with the literature but I had the impression there was no real consensus.
cd_1999 t1_j8fmlej wrote
Reply to comment by BashsIash in [R] [N] Toolformer: Language Models Can Teach Themselves to Use Tools - paper by Meta AI Research by radi-cho
Have you heard of Searle's Chinese Room?
Some people (sorry I can't give you references off the top of my head) argue there's something special about the biological nervous system, so the material substrate is not irrelevant. (Sure you could reverse engineer the whole biological system, but that would probably take much longer).
coolmlgirl t1_j8fml7y wrote
Reply to comment by coolmlgirl in [D] Speed up HuggingFace Inference Pipeline by [deleted]
My results above are for this model: https://huggingface.co/ayameRushia/bert-base-indonesian-1.5G-sentiment-analysis-smsa
It's pretty easy to use that platform to automatically do the same for your other model too--we can discuss that one later, once we've figured out this one.
coolmlgirl t1_j8fmfpi wrote
Reply to comment by askingforhelp1111 in [D] Speed up HuggingFace Inference Pipeline by [deleted]
I'm using the OctoML platform (https://octoml.ai/) to optimize your model and I got your average inference latency down to 2.14ms on an AWS T4 GPU. On an Ice Lake CPU I can get your latency down to 27.47ms. I'm assuming shapes of [1,128] for your inputs "input_ids," "attention_mask," and "token_type_ids," but want to confirm your actual shapes so that we're comparing apples to apples. Do you know what shapes you're using?
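For anyone wanting to sanity-check latency numbers like these on their own hardware, here is a minimal stdlib-only harness assuming the [1, 128] shapes mentioned above. The `dummy_model` stand-in and all names are invented for illustration; swap in your real model's forward pass.

```python
import time

SEQ_LEN = 128  # assumed input shape: [1, 128]

def make_inputs(seq_len=SEQ_LEN):
    # Dummy BERT-style inputs: token ids, attention mask, token type ids.
    return {
        "input_ids": [[101] + [0] * (seq_len - 1)],
        "attention_mask": [[1] * seq_len],
        "token_type_ids": [[0] * seq_len],
    }

def benchmark(model_fn, inputs, warmup=10, runs=100):
    """Return average latency of model_fn(inputs) in milliseconds."""
    for _ in range(warmup):      # warm up caches / JIT before timing
        model_fn(inputs)
    start = time.perf_counter()
    for _ in range(runs):
        model_fn(inputs)
    return (time.perf_counter() - start) / runs * 1000.0

# Stand-in "model" so the harness runs end to end:
dummy_model = lambda inputs: sum(inputs["attention_mask"][0])
avg_ms = benchmark(dummy_model, make_inputs())
```

Averaging over many runs after a warmup phase matters: single-shot timings on GPUs especially are dominated by startup overhead.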
themusicdude1997 t1_j8fm0tf wrote
Reply to comment by sunbunnyprime in [D] Critique of statistics research from machine learning perspectives (and vice versa)? by fromnighttilldawn
:D
patrickkidger t1_j8fdtx2 wrote
Reply to comment by 0x00A0C0 in [D] Have their been any attempts to create a programming language specifically for machine learning? by throwaway957280
Heads-up that my newer jaxtyping project now exists.
Despite the name, it supports both PyTorch and JAX; it is also substantially less hackish than TorchTyping! As such, I recommend jaxtyping over TorchTyping regardless of your framework.
(jaxtyping is now widely used internally.)
BedroomScientist92 t1_j8fdlbs wrote
Reply to comment by FastestLearner in [D] Is a non-SOTA paper still good to publish if it has an interesting method that does have strong improvements over baselines (read text for more context)? Are there good examples of this kind of work being published? by orangelord234
That is very true. Connectionists were told to go home and stop pursuing that avenue. Great example!
patrickkidger t1_j8fde35 wrote
Reply to [D] Have their been any attempts to create a programming language specifically for machine learning? by throwaway957280
On static shape checking: have a look at jaxtyping, which offers compile-time shape checks for JAX/PyTorch/etc.
(Why "JAX"typing? Because it originally only supported JAX. But it now supports other frameworks too! In particular I now recommend jaxtyping over my older "TorchTyping" project, which is pretty undesirably hacky.)
In terms of fitting this kind of stuff into a proper language: that'd be lovely. I completely agree that the extent to which we have retrofitted Python is pretty crazy!
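To make the retrofitting concrete, here is a toy runtime shape checker in the spirit of what jaxtyping does (the real library uses annotations like `Float[Array, "batch dim"]`; the `shape_checked` decorator and its API below are invented for illustration):

```python
import functools
import inspect

import numpy as np

def shape_checked(**expected):
    """Toy shape checker: named dims (e.g. "batch") must agree across
    all annotated arguments; integer dims must match exactly."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            bound = inspect.signature(fn).bind(*args, **kwargs)
            dims = {}  # named dimension -> concrete size seen so far
            for name, spec in expected.items():
                shape = np.shape(bound.arguments[name])
                if len(shape) != len(spec):
                    raise TypeError(f"{name}: rank {len(shape)}, expected {len(spec)}")
                for actual, want in zip(shape, spec):
                    if isinstance(want, int):
                        if actual != want:
                            raise TypeError(f"{name}: dim {actual}, expected {want}")
                    elif dims.setdefault(want, actual) != actual:
                        raise TypeError(f"{name}: dim '{want}' is {dims[want]}, got {actual}")
            return fn(*args, **kwargs)
        return wrapper
    return decorator

@shape_checked(x=("batch", "dim"), w=("dim", 10))
def project(x, w):
    return x @ w
```

Because Python only sees these checks at call time, mismatches surface as runtime errors rather than compile-time ones, which is exactly the limitation a purpose-built language could remove.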
Borrowedshorts t1_j8fc7bk wrote
Reply to [D] Quality of posts in this sub going down by MurlocXYZ
I'd say it's the opposite. 2 million members didn't sign up to this sub for academic only discussions. If you want that, it would be best to start a subreddit expressly for that purpose. ChatGPT is changing the world, so to say those posts are low quality is just gatekeeping discussions away from what people actually want to participate in.
JackBlemming t1_j8f6v8c wrote
Reply to comment by EducationalCicada in [R] [N] Toolformer: Language Models Can Teach Themselves to Use Tools - paper by Meta AI Research by radi-cho
Schmidhuber actually already did this in the '90s.
mindmech t1_j8f44bh wrote
Reply to comment by CumbrianMan in [D] Quality of posts in this sub going down by MurlocXYZ
Yeah, I have no idea how to do that. I tried following some data scientists, but they kept posting about politics.
bkaz t1_j8f2v8t wrote
Reply to comment by big_gondola in [R] [N] Toolformer: Language Models Can Teach Themselves to Use Tools - paper by Meta AI Research by radi-cho
That's called MoE: mixture of experts: https://en.wikipedia.org/wiki/Mixture_of_experts
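The core of MoE is simple: a gating network weights the outputs of several expert networks. A stdlib-only sketch (the toy `experts` and `gate` functions stand in for trained networks):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mixture_of_experts(x, experts, gate):
    """Combine expert outputs, weighted by the gating function.

    `experts` is a list of callables; `gate` maps the input to one
    logit per expert. In a real MoE both are learned networks, and the
    gate is often made sparse (top-k) so only a few experts run.
    """
    weights = softmax(gate(x))
    return sum(w * expert(x) for w, expert in zip(weights, experts))

# Two toy experts and a gate that prefers the first for small inputs:
experts = [lambda x: 2 * x, lambda x: x + 100]
gate = lambda x: [1.0, -1.0] if x < 10 else [-1.0, 1.0]
y = mixture_of_experts(3.0, experts, gate)
```

The "different language models for coding vs. poetry" idea maps onto this directly: each expert specializes, and the gate routes each input toward the experts that handle it best.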
daking999 t1_j8f0a6l wrote
Reply to comment by AdamAlexanderRies in [D] Quality of posts in this sub going down by MurlocXYZ
Oh, sorry! I meant that I should check out Discord.
I've used ChatGPT for a few tasks and it's been helpful (not perfect), e.g. summarizing a long document. The main issue at the moment is just that it's overloaded! Haven't tried code writing or brainstorming yet.
SummerFruits2 t1_j8eykje wrote
Reply to comment by extracensorypower in [R] [N] Toolformer: Language Models Can Teach Themselves to Use Tools - paper by Meta AI Research by radi-cho
Haha, had a good laugh! Thanks for that!
CyberDainz t1_j8eycl2 wrote
Reply to [R] DIGIFACE-1M — synthetic dataset with one million images for face recognition by t0ns0fph0t0ns
112x112 resolution. Completely useless in 2023.