Recent comments in /f/MachineLearning
BoiElroy t1_j991yix wrote
If you haven't already started I'd say begin with the standard engineering statistical quality control (SQC) stuff. AI/ML is great but honestly only when the existing classical techniques are no longer sufficient.
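For anyone wondering what that SQC starting point looks like in practice, here's a minimal Shewhart-style control check; the data, baseline size, and 3-sigma limits are just illustrative:

```python
import numpy as np

# Minimal Shewhart-style control check: flag points outside mean +/- 3 sigma,
# with limits estimated from an in-control historical baseline.
rng = np.random.default_rng(1)
baseline = rng.normal(10.0, 0.5, size=200)   # historical, in-control data
center, sigma = baseline.mean(), baseline.std(ddof=1)
ucl, lcl = center + 3 * sigma, center - 3 * sigma  # upper/lower control limits

new_batch = np.array([10.1, 9.8, 12.5, 10.0])
out_of_control = (new_batch > ucl) | (new_batch < lcl)  # flags the 12.5 reading
```

Classical charts like this are cheap to run and easy to explain, which is part of why they're worth exhausting before reaching for ML.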
TeamDman t1_j98z6ou wrote
Reply to comment by MysteryInc152 in [D] Toolformer implementation using only few-shot prompting by MysteryInc152
Mobile is a little wonky
ixent t1_j98xi5g wrote
Reply to comment by Kumacyin in [R] neural cloth simulation by LegendOfHiddnTempl
Same concern for me. All the great cloth simulations I've seen in games have weird clipping issues.
liquiddandruff t1_j98v6ko wrote
Reply to comment by thecodethinker in [R] neural cloth simulation by LegendOfHiddnTempl
it's an open question and lots of interesting work is happening at a frenetic pace here
- Language Models Can (kind of) Reason: A Systematic Formal Analysis of Chain-of-Thought https://openreview.net/forum?id=qFVVBzXxR2V
- Emergent Abilities of Large Language Models https://arxiv.org/abs/2206.07682
A favourite discussed recently:
- Theory of Mind May Have Spontaneously Emerged in Large Language Models https://arxiv.org/abs/2302.02083
currentscurrents t1_j98uu8u wrote
Reply to comment by NotARedditUser3 in [P] Looking to use Chat-GPT for your business? Data-Centric Fine-Tuning Is All You Need! by Only-Caterpillar4057
There's been a lot of SEO garbage posted around here lately. I think there's really only one active mod, so it's basically a free-for-all if he isn't online at the moment.
h-dot t1_j98u1gl wrote
To echo the top comment, it is very difficult to implement AI/ML. Beyond the obvious technical challenges (learning how to code, problem framing, data warehousing, identifying signal in data, model fitting, tuning, retraining, etc etc) there’s the entire business implementation/change management required to actually capitalize on the predictions you’re receiving. I’m in the field and would also recommend a consultant to even see if it’s a worthwhile endeavor for your business.
[deleted] t1_j98tcmd wrote
Reply to [D] Simple Questions Thread by AutoModerator
[deleted]
easy_peazy t1_j98tbgs wrote
Reply to comment by guaranteednotabot in [D] Simple Questions Thread by AutoModerator
They charge per 1000 tokens which is about 750 words. The rate for 1000 tokens is a few cents.
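As a back-of-the-envelope sketch of what that pricing means (the $0.002 per 1,000 tokens used here is an illustrative placeholder, not a quoted rate; check current pricing):

```python
# Rough API cost estimate, assuming ~750 words per 1,000 tokens and a
# hypothetical rate of $0.002 per 1,000 tokens.
def estimate_cost(word_count, rate_per_1k_tokens=0.002):
    tokens = word_count / 750 * 1000
    return tokens / 1000 * rate_per_1k_tokens

print(f"${estimate_cost(7500):.3f}")  # 7,500 words is roughly 10K tokens
```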
Ferocious_Armadillo t1_j98sbqm wrote
Reply to comment by I_like_sources in [D] Lack of influence in modern AI by I_like_sources
I think I’m gonna have to respectfully disagree on a lot of this. You’re right that it largely comes down to the training data used. What jumps out at me in your examples and in your point (1), though, is this: while large networks like the ones you suggest do need large amounts of training data, you also want to avoid overfitting your model to that data in the pursuit of accuracy, reliability, or whatever metric you choose to determine how “good” your model is against some ground truth.
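The overfitting point can be shown in a toy numpy sketch: a model with enough capacity to memorize the training set drives training error to essentially zero while typically doing worse on held-out points (the data, seed, and polynomial degrees here are arbitrary):

```python
import numpy as np

# Ten noisy samples of a sine wave as "training data"
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, size=x.shape)

# Held-out points from the same underlying function
x_val = np.linspace(0.05, 0.95, 10)
y_val = np.sin(2 * np.pi * x_val)

def train_val_mse(degree):
    coeffs = np.polyfit(x, y, degree)
    train = np.mean((np.polyval(coeffs, x) - y) ** 2)
    val = np.mean((np.polyval(coeffs, x_val) - y_val) ** 2)
    return train, val

train3, val3 = train_val_mse(3)  # modest capacity
train9, val9 = train_val_mse(9)  # enough capacity to interpolate all 10 points
```

The degree-9 fit passes through every training point (near-zero training error) but chases the noise, which is exactly the failure mode you risk when you optimize a single accuracy metric too hard.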
And while on the surface NNs can definitely seem like “black boxes” whose structure or workings “we can’t accurately describe”, that’s largely untrue. In fact, I would claim it’s precisely because we can design and model NN structure (in terms of number of layers, connectedness between them, inputs, weights, biases, activation functions, etc.) and choose the structure best suited to a given purpose that the field has come as far as it has, and that the NNs in your examples could be built in the first place.
Sorry about the rant… I didn’t realize I get so passionate about NNs.
NotARedditUser3 t1_j98rxwn wrote
Reply to comment by [deleted] in [P] Looking to use Chat-GPT for your business? Data-Centric Fine-Tuning Is All You Need! by Only-Caterpillar4057
That is not ChatGPT, but a ripoff site using the same name and trying to charge money for it. Reported.
SwordOfVarjo t1_j98rqc2 wrote
Reply to [P] Looking to use Chat-GPT for your business? Data-Centric Fine-Tuning Is All You Need! by Only-Caterpillar4057
Oh wow, MIT PhDs and memes! Ok, I'm sold.
I_will_delete_myself OP t1_j98ql8h wrote
Reply to comment by No_Goat277 in [D] Things you wish you knew before you started training on the cloud? by I_will_delete_myself
You can get free credits online if you ask; for research they can run into the thousands of dollars:
https://www.microsoft.com/en-us/azure-academic-research/
https://edu.google.com/intl/ALL_us/programs/credits/research/?modal_active=none
The cloud vs local debate depends on your needs though.
I_will_delete_myself t1_j98pw8k wrote
Get a consultant and they can show you how. It depends on your processes.
No_Goat277 t1_j98pvwn wrote
Reply to comment by I_will_delete_myself in [D] Things you wish you knew before you started training on the cloud? by I_will_delete_myself
Thank you. I have a scientific team, and our PhD is requesting GPUs for SD training. Our other team is using Midjourney; they’re happy with it, but there is no API for it, so we can’t move forward.
thecodethinker t1_j98puob wrote
Reply to comment by liquiddandruff in [R] neural cloth simulation by LegendOfHiddnTempl
Where has chat gpt been rigorously shown to have reasoning ability? I’ve heard that it passed some exams, but that could just be the model regurgitating info in its training data.
Admittedly, I haven’t looked too deeply into the reasoning abilities of LLMs, so any references would be appreciated :)
I_will_delete_myself OP t1_j98p0vg wrote
Reply to comment by No_Goat277 in [D] Things you wish you knew before you started training on the cloud? by I_will_delete_myself
I’ve been running an A100 the entire weekend and so far it’s cost me under 20 bucks. If you only need it for around an hour, it would probably cost you between 1 and 3 dollars.
I would recommend you plan a budget before you get started; on a yearly basis it will almost always be cheaper. Try Colab first and see if you’ll need it for longer than 12 hours.
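A quick way to do that budgeting; the dollars-per-hour figure below is illustrative, not a quoted cloud price:

```python
# Back-of-the-envelope cloud GPU budget: weekly usage times an hourly rate.
def monthly_cost(hours_per_week, dollars_per_hour):
    weeks_per_month = 4.33  # average weeks in a month
    return hours_per_week * weeks_per_month * dollars_per_hour

# e.g. 10 GPU-hours/week at a hypothetical $1.50/hour
print(round(monthly_cost(10, 1.50), 2))
```

Running the same numbers at on-demand vs. spot/preemptible rates, and comparing against the amortized cost of buying cards, is usually enough to make the cloud-vs-local call.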
No_Goat277 t1_j98oklc wrote
Reply to [D] Things you wish you knew before you started training on the cloud? by I_will_delete_myself
What is the total cost of cloud vs running your own servers on prem? I need to start a project with 2/4 RTX cards to train my Stable Diffusion model.
[deleted] t1_j98inpo wrote
[deleted] t1_j98ezul wrote
[removed]
dmart89 t1_j98eltr wrote
I think what you're asking is how to implement ML instead of building something from the ground up. I don't know your industry, but there are lots of suppliers and startups that would happily partner with you to help you adopt these capabilities without you needing to hire a team to build your own infrastructure. Many other industries already do!
NotARedditUser3 t1_j98cwel wrote
Reply to [P] Looking to use Chat-GPT for your business? Data-Centric Fine-Tuning Is All You Need! by Only-Caterpillar4057
This is an idiotic and unrealistic ad that makes sweeping, untrue generalizations about the current state of other LLMs. I think it would be hard for anyone to read all the way through without dismissing it as the rantings of an agitated, smaller competitor that can’t distinguish itself, and is instead making untrue statements to put down the other players in the market and prop up its own value.
Perhaps you should use an LLM to improve your ad copy.
walkingsparrow t1_j98c2qw wrote
Reply to comment by radi-cho in [R] [N] In this paper, we show how a conversational model, 3.5x smaller than SOTA, can be optimized to outperform the baselines through Auxiliary Learning. Published in the ACL Anthology: "Efficient Task-Oriented Dialogue Systems with Response Selection as an Auxiliary Task." by radi-cho
I am a bit confused. Overall, we want the generated response to be as close as possible to the ground truth. But the paper adds a selection loss that distinguishes the generated response from the ground truth, which would push the generated response to be as different as possible from it. How could this help the main task of making the two responses as close as possible?
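For what it's worth, the usual shape of such an auxiliary-task setup is two heads on a shared encoder, with the auxiliary loss shaping the shared representation rather than directly fighting the generation objective. A toy PyTorch sketch (all sizes, heads, and the 0.5 weighting are invented for illustration, not taken from the paper):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Shared encoder feeding a generation head and an auxiliary selection head.
vocab, hidden = 100, 32
encoder = nn.Linear(16, hidden)
gen_head = nn.Linear(hidden, vocab)  # next-token prediction
sel_head = nn.Linear(hidden, 1)      # scores whether a candidate is the real response

x = torch.randn(4, 16)
h = encoder(x)

# Main task: generation loss against ground-truth tokens
gen_targets = torch.randint(0, vocab, (4,))
gen_loss = F.cross_entropy(gen_head(h), gen_targets)

# Auxiliary task: binary classification of ground-truth vs generated candidates
sel_labels = torch.randint(0, 2, (4,)).float()
sel_loss = F.binary_cross_entropy_with_logits(sel_head(h).squeeze(-1), sel_labels)

# Both losses backprop into the shared encoder
loss = gen_loss + 0.5 * sel_loss
loss.backward()
```

The intuition is that gradients from the selection head push the shared encoder toward features that expose what makes generated responses distinguishable from real ones, which the generation head can then exploit.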
[deleted] t1_j98bp1g wrote
Reply to comment by thecodethinker in [R] neural cloth simulation by LegendOfHiddnTempl
[removed]
SupplyChainPhd t1_j98b9h0 wrote
There are so many things you need to look into to get something set up. I don’t even know where I’d suggest you start if you want to take this on as a personal project. TBH, it might be worthwhile to find someone to set it up for you and teach you how to monitor it.
[deleted] t1_j994ep5 wrote
Reply to comment by [deleted] in [R] difference between UAI and AISTATS ? by ArmandDerech
[removed]