Recent comments in /f/MachineLearning
LeanderKu t1_j8vxlqa wrote
Reply to comment by Red-Portal in [D] Lion , An Optimizer That Outperforms Adam - Symbolic Discovery of Optimization Algorithms by ExponentialCookie
I think learned optimizers have potential, but this is disappointing. Nothing revolutionary here: there are already sign-based optimizers, and this is just a slightly different take. I see learned optimizers as a way to get unintuitive results, but this could have been thrown together by some grad student. Random, but not surprising.
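For context on "sign-based": the update the paper describes is, as far as I can tell, roughly the following. This is a minimal NumPy sketch of my reading of it, not the authors' code, and the hyperparameter names are my own:

```python
import numpy as np

def lion_step(w, g, m, lr=0.1, beta1=0.9, beta2=0.99, wd=0.0):
    # Interpolate momentum and gradient, then keep only the sign.
    update = np.sign(beta1 * m + (1.0 - beta1) * g)
    # Apply the sign update plus decoupled weight decay.
    w = w - lr * (update + wd * w)
    # Refresh the momentum with a second interpolation.
    m = beta2 * m + (1.0 - beta2) * g
    return w, m

w = np.array([1.0, -2.0])
m = np.zeros_like(w)
w, m = lion_step(w, g=np.array([0.5, -0.5]), m=m)
print(w)  # each weight moves by exactly lr, regardless of gradient magnitude
```

The sign makes every coordinate's step the same size, which is why it behaves like existing sign-based methods (signSGD etc.) with a momentum twist.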
Cocaaah t1_j8vwcl7 wrote
This paper might also be relevant.
justundertheblack OP t1_j8vw3f2 wrote
Reply to comment by No_Dust_9578 in [P] NLP Model for sentiment analysis by justundertheblack
This seems good, I'll look into it.
No_Dust_9578 t1_j8vw3cm wrote
Reply to comment by KarmaQueenOfficial in [D] Simple Questions Thread by AutoModerator
An Introduction to Statistical Learning (2nd edition) is a free and amazing resource.
No_Dust_9578 t1_j8vw117 wrote
Reply to comment by justundertheblack in [P] NLP Model for sentiment analysis by justundertheblack
Ah, I see. In that case, something like this should give you a good direction.
justundertheblack OP t1_j8vvwgw wrote
Reply to comment by No_Dust_9578 in [P] NLP Model for sentiment analysis by justundertheblack
By the way, this is a school project, so we have to train our own model, and we have the dataset for it too. Do you know any good ones?
justundertheblack OP t1_j8vvt1i wrote
Reply to comment by No_Dust_9578 in [P] NLP Model for sentiment analysis by justundertheblack
Thanks for this man
jimliu741523 OP t1_j8vvi7k wrote
Reply to comment by popollytw in [R] The Table Feature Transformation Library Release by jimliu741523
Thanks for the heads up. HJ aims to connect datasets across the whole world and be a next-generation feature engineering method.
No_Dust_9578 t1_j8vv85i wrote
A few things. Don't make a model from scratch; use a pre-trained one. There are plenty on Hugging Face. Later on, if you have your own data, you can use it to fine-tune those models to better suit your task. This is the general approach for ML applications where data isn't available or isn't sufficient.

Side note, speaking from experience: the large sentiment models out there do have great performance, but some of them were trained on large sentiment datasets with inconsistencies. For instance, I once had to manually validate performance on my data and noticed that the pre-trained models predicted the following sentence as POSITIVE sentiment, though to a human it is not positive: "oh yay, I love cold food...". So be careful and set up some sanity checks. Don't blindly assume the predictions are accurate.
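To make the sanity-check idea concrete, here's a tiny sketch. The `naive_model` below is a deliberately dumb stand-in for whatever pre-trained model you pick (it is not a real Hugging Face model); the point is just to keep a small hand-labeled set of tricky sentences and diff the model's predictions against it:

```python
# Hypothetical stand-in for a pre-trained sentiment model: keyword matching.
def naive_model(text):
    positive_words = {"love", "great", "yay"}
    return "POSITIVE" if any(w in text.lower() for w in positive_words) else "NEGATIVE"

# Hand-labeled sanity set, including sarcasm that trips keyword-style cues.
SANITY_SET = [
    ("I love this product, it works great", "POSITIVE"),
    ("oh yay, I love cold food...", "NEGATIVE"),
]

def sanity_check(predict, cases):
    # Return every (text, gold, predicted) triple where the model disagrees.
    return [(t, g, predict(t)) for t, g in cases if predict(t) != g]

failures = sanity_check(naive_model, SANITY_SET)
for text, gold, pred in failures:
    print(f"MISMATCH: {text!r} gold={gold} pred={pred}")
```

Swap `naive_model` for your actual model's predict function and grow the sanity set as you find failure cases like the "cold food" one.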
popollytw t1_j8vv4bn wrote
Looks super cool!
No-Intern2507 t1_j8vtfqh wrote
Dude, it's not too smart to train on its own wonky, trashy results, you know? Actually give it some thought instead of spouting sentience fantasies.
baffo32 t1_j8vt8ob wrote
Reply to comment by [deleted] in [D] HuggingFace considered harmful to the community. /rant by drinkingsomuchcoffee
if we start one we’ll either make something good or bump into the project we accidentally duplicated as we get popular
baffo32 t1_j8vsq9s wrote
Reply to comment by drinkingsomuchcoffee in [D] HuggingFace considered harmful to the community. /rant by drinkingsomuchcoffee
Looks like there is emotional or funded influence here: counterintuitive votes, strange statements stated as facts.
Duplicated code makes a very very _unhackable project_ because one has to learn the code duplicating systems and add functionality to them for every factorization. It does make _hackable examples_ but the codebase doesn’t seem to understand where to draw the line at all.
The library looks like it was made entirely without an experienced lead software engineer. As a corporation they should have one.
HuggingFace, please understand that software developers find DRY to be hackable. The two terms usually go together. It reads like a contradiction, like fake news trying to manipulate people by ignoring facts, to state it the other way around.
idly t1_j8vsaao wrote
Reply to [Discussion] Time Series methods comparisons: XGBoost, MLForecast, Prophet, ARIMAX? by RAFisherman
Look up the M5 forecasting conference/competition, there's papers discussing the results - maybe helpful.
jimliu741523 t1_j8vrlj0 wrote
Reply to [Discussion] Time Series methods comparisons: XGBoost, MLForecast, Prophet, ARIMAX? by RAFisherman
Unfortunately, by the no free lunch theorem, "no single machine learning algorithm is universally the best-performing algorithm for all problems". That is, you just quickly try each algorithm on your task and pick the best one under proper validation.
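A sketch of what "try each and validate" can look like, with two toy forecasters standing in for real algorithms (the forecasters and data here are made up for illustration):

```python
def mse(preds, actual):
    # Mean squared error over the validation horizon.
    return sum((p - a) ** 2 for p, a in zip(preds, actual)) / len(actual)

# Two toy "algorithms": predict the training mean vs. the last training value.
def mean_forecaster(train, horizon):
    mean = sum(train) / len(train)
    return [mean] * horizon

def naive_forecaster(train, horizon):
    return [train[-1]] * horizon

series = [1, 2, 3, 4, 5, 6, 7, 8]
train, valid = series[:6], series[6:]  # hold out the last points for validation

candidates = {"mean": mean_forecaster, "naive": naive_forecaster}
scores = {name: mse(f(train, len(valid)), valid) for name, f in candidates.items()}
best = min(scores, key=scores.get)
print(best, scores)
```

In practice the candidates would be XGBoost, ARIMAX, Prophet, etc., and the split would be a proper rolling/expanding-window backtest rather than a single holdout.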
baffo32 t1_j8vrc4d wrote
HuggingFace recently implemented a PEFT library that reimplements the core functionality of AdapterHub. AdapterHub had reached out to them to contribute and integrate work, but this failed in February of last year ( https://github.com/adapter-hub/adapter-transformers/issues/65#issuecomment-1031983053 ). Hugging Face was asked how the new work related to the old, and it was sad to see they had done it completely independently, completely ignoring the past outreach ( https://github.com/huggingface/peft/issues/92#issuecomment-1431227939 ). The reply reads to me as if they are implementing the same feature set, unaware that it is the same one.
I would like to know why this didn't go better. The person who spearheaded AdapterHub for years appears to be one of the most prominent PEFT researchers, with published papers. It looks as if they were tossed out in the snow. I can only imagine management never learned of the outreach, or, equally likely, they have no idea how to work with other projects to refactor concepts from multiple codebases together, or don't consider doing so worthwhile. It would have been nice to at least see lip service paid.
The library and hub are not complex. Is there a community alternative conducive to code organization or do we need to start yet another?
Sometimes I think it would make sense to train language models to transform the code, organize it, merge things, using techniques like langchain and chatgpt, to integrate future work into a more organized system.
Projects where everyone can work together are best.
andreichiffa t1_j8vqd42 wrote
It's a RedHat for ML, and especially LLMs. You want clean internals and things that work? You pay the consulting/on-premises fees. In the meantime, they are pushing forward FOSS models and supporting sharing and experimentation on established models.
I really don’t think you realize how much worse the domains that don’t have their HuggingFace are doing.
pyfreak182 t1_j8vpx4e wrote
Reply to [Discussion] Time Series methods comparisons: XGBoost, MLForecast, Prophet, ARIMAX? by RAFisherman
In case you are not familiar, there is also Time2Vec embeddings for Transformers. It would be interesting to see how that architecture compares as well.
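For reference, the Time2Vec embedding is simple to sketch: the first component is a linear function of time and the rest are periodic. A minimal version, with arbitrary parameter values for illustration (in the paper the frequencies and phases are learned):

```python
import math

def time2vec(tau, omega, phi):
    # Time2Vec: component 0 is linear in time; the rest pass through sin,
    # letting the model capture both trend and periodic patterns.
    out = [omega[0] * tau + phi[0]]
    out += [math.sin(w * tau + p) for w, p in zip(omega[1:], phi[1:])]
    return out

vec = time2vec(2.0, omega=[1.0, math.pi, 0.5], phi=[0.0, 0.0, 0.0])
print(vec)  # [linear term 2.0, sin(2*pi) ~ 0, sin(1.0)]
```

These vectors replace (or augment) positional encodings so a Transformer can attend over irregular timestamps.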
Sandy_dude OP t1_j8vlgrm wrote
Reply to comment by BlazeObsidian in [R] Looking for papers which are modified variational autoencoder (VAE) by Sandy_dude
Thanks! I will have a read!
BlazeObsidian t1_j8vl4hj wrote
Not sure if it matches your requirements but look into VQ-VAE which is basically a vector quantised VAE. https://ml.berkeley.edu/blog/posts/vq-vae/
Some more ideas are explored in more detail here: https://lilianweng.github.io/posts/2018-08-12-vae/
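The core VQ-VAE trick from the first link is just a nearest-neighbour lookup into a learned codebook. A minimal sketch of that bottleneck (codebook values here are made up; a real implementation also needs the straight-through gradient and the codebook/commitment losses):

```python
import numpy as np

def vector_quantize(z, codebook):
    # Map each encoder output to its nearest codebook entry (the VQ bottleneck).
    # d has shape (batch, K): squared L2 distance from each z to each code.
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d.argmin(axis=1)
    return codebook[idx], idx

codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 0.5]])  # K=3 codes, dim 2
z = np.array([[0.9, 1.2], [0.1, -0.2]])  # two encoder outputs
zq, idx = vector_quantize(z, codebook)
```

During training, gradients are copied straight through the quantization step to the encoder, and the codebook is pulled toward the encoder outputs.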
AbsoluteCondui t1_j8viwrj wrote
Reply to comment by borisfin in [D] Compare open source LLMs by President_Xi_
thanks
casino_alcohol t1_j8vgo4o wrote
Reply to [D] Simple Questions Thread by AutoModerator
Is there a subreddit for finding specific machine learning projects?
I’d like to find something that can read text in my voice. I make a lot of recordings and it would save me tons of time if I could just have it done.
[deleted] t1_j8v5lyb wrote
[deleted]
drinkingsomuchcoffee OP t1_j8v3y80 wrote
Reply to comment by fasttosmile in [D] HuggingFace considered harmful to the community. /rant by drinkingsomuchcoffee
>You cant just copy paste a file if it’s centralized, you’ll have to copy paste multiple, and the main issue is it’s gonna take a while to understand which ones (and you'll have to modify the imports etc., unless you copy the entire repo! are you seriously suggesting that lmao)
Yep apparently they themselves claim to do this for every module. Thank you for pointing out how crazy this is and proving my point.
>Your definition of hackable is almost it. What’s missing is that being decentralized makes things much, much easier to understand because the code is very straightforward and doesn’t have to take 10 different things into account.
Oh really? I think those files depend on pytorch functions and also numpy. Should they copy those entire libraries into the file to be more "hackable"? Lmao
hark_in_tranquillity t1_j8vzo0i wrote
Reply to [Discussion] Time Series methods comparisons: XGBoost, MLForecast, Prophet, ARIMAX? by RAFisherman
The docs link you shared is not about Prophet handling exogenous variables; it's about handling holidays, which is a separate "feature".
Nevertheless, Prophet's exogenous influence impact/explainability is bad. Another problem with Prophet's regressor (exog features) functionality: say you have 10 exog vars. You'd have to go through every possible combination of the 10 vars to find the best one, so compute grows exponentially.
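To put a number on that combinatorial blow-up (regressor names here are hypothetical):

```python
from itertools import combinations

exog = [f"x{i}" for i in range(10)]  # 10 hypothetical exogenous regressors

# Every non-empty subset of regressors is a candidate model to fit and validate.
subsets = [c for r in range(1, len(exog) + 1) for c in combinations(exog, r)]
print(len(subsets))  # 2**10 - 1 = 1023 candidate regressor sets
```

That's 1023 full model fits for just 10 variables, and each extra variable doubles it.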
ML algorithms, on the other hand, are nice for exactly this reason: if you do the data pre-processing right and take care of multicollinearity and endogeneity to some extent, the influence of exog features is much more explainable.
As someone else mentioned, do check out the M5 competition; you'll find a lot of reasons why ML-based approaches that learn on panel data are SOTA right now. Don't skip experimentation, though.