Recent comments in /f/MachineLearning
deliciously_methodic t1_jad1h8m wrote
Reply to comment by MysteryInc152 in [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152
What does “scale up” mean in this context? In an ML hardware context I use “scale up” vs. “scale out” to mean “making a CPU/GPU more powerful” vs. “adding more GPUs”, but I’m not clear whether the same analogy applies to AI models, or if you simply mean “the model will get bigger”.
Far-Butterscotch-436 t1_jad1g2v wrote
Just about every discussion I get notifications for is deleted. What's up with that?
bluebolt789 t1_jad0st0 wrote
Reply to comment by professorlust in [Discussion] Can you use a model trained on tweets/product reviews to do sentiment analysis on IT support tickets? by [deleted]
I don’t get your point, could you be more specific please?
[deleted] t1_jad01a4 wrote
Reply to comment by mediocregradstudent in [D] More stable alternative to wandb? by not_particulary
[removed]
Frankystew1 t1_jaczw27 wrote
How do I use the code you posted on GitHub?
ichiichisan t1_jacz676 wrote
Reply to [D] More stable alternative to wandb? by not_particulary
Neptune.ai. I personally find wandb unbearable; they also collect all your training code by default, which I find extremely shady.
[deleted] t1_jacygxs wrote
Reply to [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152
Any idea when we will be able to use the model?
[deleted] t1_jacx9ai wrote
Reply to comment by abnormal_human in [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152
That’s about 100x less than what I’d expected.
bigfish_in_smallpond t1_jacwddt wrote
Reply to comment by blackkettle in [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152
The Internet is the primordial soup for AGI.
zykezero t1_jacvr1g wrote
Reply to [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152
Finally, Kosmos has arrived. We need her help to fight the Gnosis.
MysteryInc152 OP t1_jacswnq wrote
Reply to comment by farmingvillein in [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152
There's pretty much no way it won't scale up.
Clicketrie t1_jacr62e wrote
Reply to [D] More stable alternative to wandb? by not_particulary
I work for Comet, but Comet has a super robust free tier. You can check out my public workspace with some runs of yolov5 here: https://www.comet.com/kristenkehrer/dogs-and-cats/view/new/panels
Tons of graphics; it stores code, hyperparameters, metrics, data lineage, system metrics, etc.
professorlust t1_jacqyy8 wrote
Reply to [Discussion] Can you use a model trained on tweets/product reviews to do sentiment analysis on IT support tickets? by [deleted]
Have you looked at any of the sentiment analysis work from the last 5 years?
ggf31416 t1_jacq8pl wrote
Reply to comment by ahiddenmessi2 in [D] Training transformer on RTX2060 by ahiddenmessi2
For reference, an RTX 3090 can be rented for as low as ~$0.25/hour at vast.ai with just a credit card if you are in a hurry (AWS and GCP require a quota increase to use GPUs), or you may be able to get free research credits at major cloud providers.
farmingvillein t1_jacq4fn wrote
Reply to [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152
The language-only performance was pretty meh, comparing the versions with and without images. We'll have to see whether scaling up helps here (other research suggests yes?... but we still need to see proof).
[deleted] t1_jackh6d wrote
Reply to comment by [deleted] in [D] More stable alternative to wandb? by not_particulary
[removed]
abnormal_human t1_jacjmrj wrote
Reply to [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152
Am I reading right that this is a 1.6B parameter model?
ahiddenmessi2 OP t1_jaciwqg wrote
Reply to comment by ggf31416 in [D] Training transformer on RTX2060 by ahiddenmessi2
Thanks for your reply. My goal is to train the transformer to read a specific programming language, so I guess there is no pre-trained model available. Seems I have to train it from scratch on my laptop GPU :(
Edit: and yes, it has only 6GB.
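To gauge whether a from-scratch transformer fits in 6GB, a rough back-of-envelope estimate helps. This is a hypothetical sketch (not from the thread), assuming fp32 weights and an Adam-style optimizer that keeps two extra state tensors per parameter; activation memory is additional and grows with batch size and sequence length:

```python
def training_memory_gb(n_params, bytes_per_param=4, optimizer_multiplier=3):
    """Rough estimate of GPU memory for weights + gradients + optimizer state.

    In fp32 with Adam: weights (1x) + gradients (1x) + two moment tensors (2x)
    comes to roughly 4x the parameter count in bytes-per-param units.
    Activations are NOT included and often dominate for long sequences.
    """
    total_bytes = n_params * bytes_per_param * (1 + optimizer_multiplier)
    return total_bytes / 1024**3

# A ~25M-parameter model: weights + grads + Adam state alone
print(round(training_memory_gb(25_000_000), 2))  # ≈ 0.37 GB
```

By this estimate a small model (tens of millions of parameters) leaves most of a 6GB card free for activations, so a modest transformer trained from scratch is plausible on a laptop GPU.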
Riatekk t1_jaciqmg wrote
Reply to [D] More stable alternative to wandb? by not_particulary
I have already used Comet.ml
ahiddenmessi2 OP t1_jacin2n wrote
Reply to comment by KingsmanVince in [D] Training transformer on RTX2060 by ahiddenmessi2
My dataset size can be varied because the data can be generated. I will also consider using gradient accumulation to improve performance. Thanks!
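Gradient accumulation can be sketched framework-agnostically. This toy linear-model example (hypothetical, not from the thread) shows the key idea: sum scaled micro-batch gradients and update the weights only once, reproducing a full-batch step without ever holding the full batch in memory:

```python
def grad(w, x, y):
    # gradient of the squared error 0.5 * (w*x - y)**2 with respect to w
    return (w * x - y) * x

def step_accumulated(w, batch, lr=0.1, accum_steps=4):
    """One optimizer step with gradient accumulation over micro-batches."""
    micro = len(batch) // accum_steps  # assumes len(batch) is divisible by accum_steps
    acc = 0.0
    for i in range(accum_steps):
        chunk = batch[i * micro:(i + 1) * micro]
        g = sum(grad(w, x, y) for x, y in chunk) / len(chunk)  # micro-batch mean
        acc += g / accum_steps  # scale so the total equals the full-batch mean
    return w - lr * acc

def step_full_batch(w, batch, lr=0.1):
    """Reference: one step on the full batch at once."""
    g = sum(grad(w, x, y) for x, y in batch) / len(batch)
    return w - lr * g
```

In a real framework you would instead call the backward pass on each micro-batch (which accumulates into the stored gradients automatically), divide each micro-batch loss by `accum_steps`, and invoke the optimizer step only every `accum_steps` iterations.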
ahiddenmessi2 OP t1_jachs8r wrote
Reply to comment by aigoritma-1 in [D] Training transformer on RTX2060 by ahiddenmessi2
Thank you, I will look into it.
professorlust t1_jacfxvl wrote
Reply to comment by step21 in [D] What is the most "opaque" popular machine learning model in 2023? by fromnighttilldawn
Regarding ChatGPT, I believe OP is frustrated not by the Transformer architecture but by the improvements made to the inference functionality.
That's the real "black box" of GPT-style LLMs, and the least open part.
pyepyepie t1_jad28c2 wrote
Reply to [P] Built my first ever open-source project by Comfortable-Rest-373
I think it's a cool effort. Regarding feedback: personally, I would use an independent project in production only if I had no alternative. For example, using SHAP was really painful even though that problem is narrower than this one and the project had many contributors. That being said, it's a cool educational tool.