pyepyepie t1_j95f3m2 wrote on February 19, 2023 at 11:10 AM

Reply to comment by _Minos in [D] Toolformer implementation using only few-shot prompting by MysteryInc152

I actually think your approach shows the idea better than the original paper. However, the original paper can be implemented with smaller language models which might be better for people who want to deploy it. All over, I think the application is almost trivial and I am not surprised it worked well for you (due to the crazy power of LLMs).

Great work!

pyepyepie t1_j95eqm5 wrote on February 19, 2023 at 11:04 AM

Reply to comment by Optimal-Asshole in [D] Please stop by [deleted]

Virology hits me hard, I might have been the idiot once or twice (always did what I was told though, I mean in discussions).

pyepyepie t1_j95e9ka wrote on February 19, 2023 at 10:58 AM

Reply to comment by TeamRocketsSecretary in [D] Please stop by [deleted]

I implemented GPT-like (transformers) models almost since it was out (not exactly but worked with the decoder in the context of NMT and with encoders a lot like everyone who does NLP, so yeah not GPT-like but I understand the tech) - I also argue you guys are just guessing. Do you understand how funny it looks when people claim what it is and what it isn't? Did you talk with the weights?

Edit: what I agree with is that this discussion is a waste of time in this sub.

I_like_sources OP t1_j95d7e4 wrote on February 19, 2023 at 10:42 AM

Reply to comment by derek_ml in [D] Lack of influence in modern AI by I_like_sources

Both are not well related. You seem to argue that you should let AI do it's thing, what it's good at, without interfering, yet keep in mind that results are for humans, not for computers.

andreichiffa t1_j95d303 wrote on February 19, 2023 at 10:40 AM

Reply to [D] Lion , An Optimizer That Outperforms Adam - Symbolic Discovery of Optimization Algorithms by ExponentialCookie

I really think we need an intermediate between conference papers and arxiv, to just evaluate how reproducible/sane the paper is without evaluating whether it is important or not.

Because at this stage I genuinely can't tell if that's a press release, a report in a paper form, or an actual paper.

DevarshTare t1_j95ct2p wrote on February 19, 2023 at 10:37 AM

Reply to [D] Simple Questions Thread by AutoModerator

What matters while running models?

hey guys, I'm new to machine learning and just learning from the basics. I am planning to buy a GPU soon for running pre-built models from google colab.

My question is after you build a model what matters for the models runtime? Is it the Memory, the bandwidth or the cuda core you utilize?

Basically what makes an already trained model run faster when using in application? I can imagine it may vary from application to application, but just wanted to learn what matters the most when running pre trained models?

NS-19 t1_j95clpy wrote on February 19, 2023 at 10:34 AM

Reply to [D] Simple Questions Thread by AutoModerator

If i were to start learning ML today suggest some online resources that guide you along the way on a project by project basis

I_like_sources OP t1_j95ckjv wrote on February 19, 2023 at 10:33 AM

Reply to comment by sharockys in [D] Lack of influence in modern AI by I_like_sources

It's not about wanting a free open box tool. It's about the lack of transparency and accountability in the AI community. Developers need to take responsibility for their creations and provide support and feedback to users, particularly in critical applications like healthcare or finance. By providing more transparency and support, we can improve the quality and reliability of AI systems, which benefits everyone in the field.

derek_ml t1_j95cgy8 wrote on February 19, 2023 at 10:32 AM

Reply to [D] Lack of influence in modern AI by I_like_sources

I empathize, but I'd be curious what you think about this line of thought: http://www.incompleteideas.net/IncIdeas/BitterLesson.html

Morteriag t1_j95b954 wrote on February 19, 2023 at 10:14 AM

Reply to comment by Morteriag in [D] bounding box or instance segmentation by Old_Scallion2173

Last I checked Roboflow only had point-to-point vector masks for segmentation. In my experience that makes getting quality annotations a pain. In Labelbox, you can also hold in the mouse button. Hasty.ai focus on auto annotations, and by the look of the image you posted, it might be a good fit for your usecase.

skippy_nk t1_j95aurm wrote on February 19, 2023 at 10:08 AM

Reply to [D] Things you wish you knew before you started training on the cloud? by I_will_delete_myself

The discovery of tmux was one of my greatest achievements of the early 2022

Morteriag t1_j95aodr wrote on February 19, 2023 at 10:06 AM

Reply to comment by Old_Scallion2173 in [D] bounding box or instance segmentation by Old_Scallion2173

That size would do well as a PoC, not much more, and you should be able to annotate all the data within a day or two. Automation does not make that much sense at this scale. I love Roboflow for bounding boxes, but LabelBox has superior tools for segmentation. Sure, with this small data set you can use cross validation, although a hold out test set is also preferable. I would almost consider hand-picking the test set at this scale to make sure you get a sense of how it performa on challenging examples. What is the pixel size of your images? I know microscopy/histology images typically can cover large areas and one image could in fact be considered a mosaic of many “normal” sized images.

_Minos t1_j95amf3 wrote on February 19, 2023 at 10:05 AM

Reply to comment by blueSGL in [D] Toolformer implementation using only few-shot prompting by MysteryInc152

Hey, creator of above implementation here.

You're right that there's lots of ways accuracy could feasibly be improved, by using more varied APIs, navigating to search results and creating embeddings of the resulting website etc. Ultimately, a lot of this kind of more advanced chaining of LLM and API requests can be done with libraries like langchain.

For this one, i wanted to show how effective a much more simple approach can be. For search results, i simply chain together the returned google "snippets" and inject the resulting string back into the prompt. Often times, this means there can actually be conflicting information, such as for example dates talking about events adjacent to but ultimately irrelevant to the search query. However, this is where GPT is generally doing an excellent job of picking out the correct bit of info, so no more sophisticated filtering or parsing by the app is required. Just giving a raw dump of the search results to the model.

sharockys t1_j95agar wrote on February 19, 2023 at 10:03 AM

Reply to [D] Lack of influence in modern AI by I_like_sources

It seems that someone is upset by not having free open box tool and requires the others to work more for free for his/her own purpose.

Old_Scallion2173 OP t1_j95a3nm wrote on February 19, 2023 at 9:57 AM

Reply to comment by Morteriag in [D] bounding box or instance segmentation by Old_Scallion2173

I see, currently I'm using roboflow as it is convenient and does have a polygonal labelling tool. By the way, Do you think I should do transfer learning and/or k-fold cross validation too since my dataset is small (325 images)?

[deleted] OP t1_j9585ee wrote on February 19, 2023 at 9:30 AM

Reply to [D] Please stop by [deleted]

[deleted]

I_like_sources OP t1_j957v1f wrote on February 19, 2023 at 9:26 AM

Reply to comment by A1-Delta in [D] Lack of influence in modern AI by I_like_sources

What are your contributions to enabling users customizability of the result without retraining?

>Not a huge fan of massive edits to original posts after people have started responding.

I am not here to make you happy.

Lifaux t1_j957q53 wrote on February 19, 2023 at 9:24 AM

Reply to [D] Things you wish you knew before you started training on the cloud? by I_will_delete_myself

If you're having to debug code, VSCode has really good integrations for running on your remote server. Unless you're already very familiar with vim, it's going to be quicker to set this up.

Ensure you've got rsync experience - no one wants to include venv when pulling your changes back from the remote side.

Run the image you're using remotely locally via docker first. Check your code works, you don't want to be messing around with fixes while your GPUs sit idle.

If you're running compiled code, check the CPU architecture. I wasted a day debugging a fault that was due to compiling starspace on a build server that had different architecture to our remote server.

Tmux is a godsend.

RideOrDieRemember t1_j957mmn wrote on February 19, 2023 at 9:22 AM

Reply to [D] Things you wish you knew before you started training on the cloud? by I_will_delete_myself

Is there a trick to spot instances on aws? In the past when I tried to spot instance a gpu it was never available.

Lifaux t1_j9572rl wrote on February 19, 2023 at 9:15 AM

Reply to comment by Demortus in [D] Things you wish you knew before you started training on the cloud? by I_will_delete_myself

Alternatively, you can always use WSL2 if you don't want to dual boot.

A1-Delta t1_j9572gt wrote on February 19, 2023 at 9:15 AM

Reply to comment by I_like_sources in [D] Lack of influence in modern AI by I_like_sources

Tweaking a CNN without retraining makes it sound like you want a no-code option on your machine learning.

Totally agree that model interpretability is a challenge, but there is a whole subsection of our field working on that. The fundamental design of deep learning sort of precludes what you’re talking about - at least given our current understanding of model interpretation. At best, a model may be trained to give options on certain aspects based on its input (we see this all the time), but that doesn’t sound like what you want. It sounds like you want to be able to target specific and arbitrary components of an output and intuitively modify the weights of all nodes contributing to that part of the output - presumably in isolation.

I think your challenge might lie with a fundamental lack of understanding of how these models actually work. I don’t mean that as a dig - they’re complicated. I just want to help bring you to a place of understanding about why the field is how you’re experiencing it.

Not a huge fan of massive edits to original posts after people have started responding. Your newly added recommendations put an onerous responsibility on any open source authors who might make their work public as a hobby rather than a career.

Rocksolidbubbles t1_j956kre wrote on February 19, 2023 at 9:08 AM

Reply to comment by TeamRocketsSecretary in [D] Please stop by [deleted]

>To suggest that it’s performing human like processing of emotions because the internal states of a regression model resemble some notion of intermediate mathematical logic is ridiculous especially in light of research showing these autoregressive models struggle with symbolic logic

Not only that. The debate on 'sentience' won't go away, but it will definitely be a lot more grounded when people who are expert in - for example, physiology of behaviour, cognitive linguistics, anthropology, philosophy, sociology, psychology, chemistry get involved.

For one thing they might mention things like neurotransmitters, and microbiomes, and epigenetics, or cultural relativity, or how perception can be relative.

The human brain is embodied and can't be separated from it - and if it were it would stop thinking like a human would. There's a really good case to be made (embodied cognition theory) that human cognition partly lies upon a metaphorical framework made of euclidean geometrical shapes that were derived from the way a body interacts with an environment.

Our environment is classical physics - up and down, in and out, together and apart - it's all straight lines, boxes cylinders. We're out of control, out of our minds, in love - self control, minds and love are conceived of as containers. Even chimps associate the direction UP with the abstract idea of being more superior in the heirarchy. You'll be hard pressed to find any western cultures where Up doesn't mean good or more or better, and DOWN doesn't mean bad or less or worse.

The point being, IF this hypothesis is true, and IF you want something to think at least a little bit like a human, it MAY require a mobile body that can interact with the environment and respond to feeback from jt.

This is just one if the many hypotheses non hard science fields can add to the debate - it really feels they're too absent in ai related subs

issam_28 t1_j956cim wrote on February 19, 2023 at 9:05 AM

Reply to [D] Please stop by [deleted]

Agreed. Mods need to ban all low quality posts.

I_like_sources OP t1_j9565gb wrote on February 19, 2023 at 9:02 AM

Reply to comment by A1-Delta in [D] Lack of influence in modern AI by I_like_sources

Good questions. Machine learning models are usually black boxes, that work or don't as expected.

There is no detail tweaking possible, only re-training. And the specification for good training data is vague at best.

That causes unnecessary frustration, time-consumption and is similar to the blind leading the blind.

The attitude is "offer just more and more data and hope the ai will figure things out, if not, offer more". I am sure I am not the only one who sees fault in this approach.

A1-Delta t1_j955sgx wrote on February 19, 2023 at 8:57 AM

Reply to [D] Lack of influence in modern AI by I_like_sources

I’m not sure I’m following you. Are you concerned that machine learning models are not easily customizable enough?

Is your trouble with the fundamental concept of transfer learning, that data selection and preparation is difficult, that convolutional neural networks are “black boxes”, or something else?

Recent comments in /f/MachineLearning