Recent comments in /f/MachineLearning
deliciously_methodic t1_jad1h8m wrote
Reply to comment by MysteryInc152 in [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152
What does “scale up” mean in this context? In an ML hardware context I use “scale up” vs. “scale out” to mean “making a CPU/GPU more powerful” vs. “adding more GPUs”, but I’m not clear whether the same analogy applies to AI models, or if you simply mean “the model will get bigger”.
Far-Butterscotch-436 t1_jad1g2v wrote
Just about every discussion I get notifications for is deleted. What's up with that?
bluebolt789 t1_jad0st0 wrote
Reply to comment by professorlust in [Discussion] Can you use a model trained on tweets/product reviews to do sentiment analysis on IT support tickets? by [deleted]
I don’t get your point, could you be more specific please?
[deleted] t1_jad01a4 wrote
Reply to comment by mediocregradstudent in [D] More stable alternative to wandb? by not_particulary
[removed]
Frankystew1 t1_jaczw27 wrote
How do I use the code you posted on GitHub?
ichiichisan t1_jacz676 wrote
Reply to [D] More stable alternative to wandb? by not_particulary
Neptune.ai. I personally find wandb unbearable; they also collect all your training code by default, which I find extremely shady.
[deleted] t1_jacygxs wrote
Reply to [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152
Any idea when we will be able to use the model?
[deleted] t1_jacx9ai wrote
Reply to comment by abnormal_human in [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152
That’s about 100x less than what I’d expected.
bigfish_in_smallpond t1_jacwddt wrote
Reply to comment by blackkettle in [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152
The Internet is the primordial soup for AGI.
zykezero t1_jacvr1g wrote
Reply to [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152
Finally, Kosmos has arrived. We need her help to fight the Gnosis.
MysteryInc152 OP t1_jacswnq wrote
Reply to comment by farmingvillein in [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152
There's pretty much no way it won't scale up.
Clicketrie t1_jacr62e wrote
Reply to [D] More stable alternative to wandb? by not_particulary
I work for Comet, but Comet has a super robust free tier. You can check out my public workspace with some runs of yolov5 here: https://www.comet.com/kristenkehrer/dogs-and-cats/view/new/panels
Tons of graphics; it stores code, hyperparameters, metrics, data lineage, system metrics, etc.
professorlust t1_jacqyy8 wrote
Reply to [Discussion] Can you use a model trained on tweets/product reviews to do sentiment analysis on IT support tickets? by [deleted]
Have you looked at any of the sentiment analysis work from the last 5 years?
ggf31416 t1_jacq8pl wrote
Reply to comment by ahiddenmessi2 in [D] Training transformer on RTX2060 by ahiddenmessi2
For reference, an RTX 3090 can be rented for as low as ~$0.25/hour at vast.ai with just a credit card if you are in a hurry (AWS and GCP require a quota increase to use GPUs), or you may be able to get free research credits at major cloud providers.
farmingvillein t1_jacq4fn wrote
Reply to [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152
The language-only performance was pretty meh, comparing the versions with and without images. We'll have to see whether scaling up helps here (other research suggests yes?... but we still need to see proof).
[deleted] t1_jackh6d wrote
Reply to comment by [deleted] in [D] More stable alternative to wandb? by not_particulary
[removed]
abnormal_human t1_jacjmrj wrote
Reply to [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152
Am I reading right that this is a 1.6B parameter model?
ahiddenmessi2 OP t1_jaciwqg wrote
Reply to comment by ggf31416 in [D] Training transformer on RTX2060 by ahiddenmessi2
Thanks for your reply. My goal is to train the transformer to read a specific programming language, so I guess there is no pre-trained model available. Seems I have to train it from scratch on my laptop GPU :(
Edit: and yes, it has only 6GB.
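To gauge whether a from-scratch transformer fits in 6GB, a rough back-of-envelope estimate helps. This is a hypothetical sketch (not from the thread), assuming fp32 weights and an Adam-style optimizer that keeps two extra state tensors per parameter; activation memory is additional and grows with batch size and sequence length:

```python
def training_memory_gb(n_params, bytes_per_param=4, optimizer_multiplier=3):
    """Rough estimate of GPU memory for weights + gradients + optimizer state.

    In fp32 with Adam: weights (1x) + gradients (1x) + two moment tensors (2x)
    comes to roughly 4x the parameter count in bytes-per-param units.
    Activations are NOT included and often dominate for long sequences.
    """
    total_bytes = n_params * bytes_per_param * (1 + optimizer_multiplier)
    return total_bytes / 1024**3

# A ~25M-parameter model: weights + grads + Adam state alone
print(round(training_memory_gb(25_000_000), 2))  # ≈ 0.37 GB
```

By this estimate a small model (tens of millions of parameters) leaves most of a 6GB card free for activations, so a modest transformer trained from scratch is plausible on a laptop GPU.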
Riatekk t1_jaciqmg wrote
Reply to [D] More stable alternative to wandb? by not_particulary
I have already used Comet.ml
ahiddenmessi2 OP t1_jacin2n wrote
Reply to comment by KingsmanVince in [D] Training transformer on RTX2060 by ahiddenmessi2
My dataset size can be varied because the data can be generated. I will also consider using gradient accumulation to improve performance. Thanks!
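Gradient accumulation can be sketched framework-agnostically. This toy linear-model example (hypothetical, not from the thread) shows the key idea: sum scaled micro-batch gradients and update the weights only once, reproducing a full-batch step without ever holding the full batch in memory:

```python
def grad(w, x, y):
    # gradient of the squared error 0.5 * (w*x - y)**2 with respect to w
    return (w * x - y) * x

def step_accumulated(w, batch, lr=0.1, accum_steps=4):
    """One optimizer step with gradient accumulation over micro-batches."""
    micro = len(batch) // accum_steps  # assumes len(batch) is divisible by accum_steps
    acc = 0.0
    for i in range(accum_steps):
        chunk = batch[i * micro:(i + 1) * micro]
        g = sum(grad(w, x, y) for x, y in chunk) / len(chunk)  # micro-batch mean
        acc += g / accum_steps  # scale so the total equals the full-batch mean
    return w - lr * acc

def step_full_batch(w, batch, lr=0.1):
    """Reference: one step on the full batch at once."""
    g = sum(grad(w, x, y) for x, y in batch) / len(batch)
    return w - lr * g
```

In a real framework you would instead call the backward pass on each micro-batch (which accumulates into the stored gradients automatically), divide each micro-batch loss by `accum_steps`, and invoke the optimizer step only every `accum_steps` iterations.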
ahiddenmessi2 OP t1_jachs8r wrote
Reply to comment by aigoritma-1 in [D] Training transformer on RTX2060 by ahiddenmessi2
Thank you, I will look into it.
professorlust t1_jacfxvl wrote
Reply to comment by step21 in [D] What is the most "opaque" popular machine learning model in 2023? by fromnighttilldawn
Regarding ChatGPT, I believe OP is frustrated not by the Transformer architecture but by the improvements made to the inference functionality.
That's the real "black box" of GPT-style LLMs, and the least open part.
pyepyepie t1_jad28c2 wrote
Reply to [P] Built my first ever open-source project by Comfortable-Rest-373
I think it's a cool effort. Regarding feedback: personally, I would use an independent project in production only if I had no alternative. For example, using SHAP was really painful even though that problem is narrower than this one and the project had many contributors. That being said, it's a cool educational tool.