Recent comments in /f/MachineLearning
tblume1992 t1_j8y9oti wrote
Reply to [Discussion] Time Series methods comparisons: XGBoost, MLForecast, Prophet, ARIMAX? by RAFisherman
- MLForecast treats it more like a time series: it uses differencing and moving averages as levels to encode the general level of each series along with the AR lags. That isn't strictly necessary - you can just scale each series (a standard scaler or even a Box-Cox transform at the series level), pass a series 'id' as a categorical variable to LightGBM, and outperform MLForecast, although it is pretty snappy with how they have it written.
- I honestly just wouldn't use Prophet in general... But if you have 50 regressors, it (I believe) fits them with a normal prior, which is equivalent to ridge regression: it shrinks the coefficients, but you are stuck with this 'average' effect.
- ARIMAX absolutely still has a place, but it really all comes down to your features. If you have good-quality predictive features, it is usually better to do ML and 'featurize' the time pieces. You lose the time component but gain a lot from the features; the catch is that you may now have to forecast those features too. If your features are bad, then you are usually stuck with standard time series methods. So it really is 100% dependent on your data and on whether there is value in learning across multiple time series or not.
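To make the scale-and-pass-an-id trick in the first bullet concrete, here's a minimal sketch (pandas only; the data and column names are made up, and the LightGBM call is left as a comment since the exact model setup is up to you):

```python
import pandas as pd

# Hypothetical long-format panel: one row per (series_id, step).
df = pd.DataFrame({
    "series_id": ["a"] * 6 + ["b"] * 6,
    "y": [10, 12, 11, 13, 12, 14, 100, 120, 110, 130, 120, 140],
})

# Scale each series to a common level (instead of MLForecast's differencing).
g = df.groupby("series_id")["y"]
df["y_scaled"] = (df["y"] - g.transform("mean")) / g.transform("std")

# AR lags as plain columns; the series id becomes a categorical feature.
for lag in (1, 2):
    df[f"lag_{lag}"] = g.shift(lag)
df["series_id"] = df["series_id"].astype("category")

# df (minus the rows with NaN lags) can then go straight into
# lightgbm.LGBMRegressor as a single pooled training set.
```

The point is that one pooled model sees every series at a comparable level, with the id column letting the trees learn per-series quirks.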
An alternative view is hierarchical forecasting, which sometimes works well for taking advantage of higher-level seasonalities and trends that may be harder to see at the lower level; in my experience it outperforms ML a good chunk of the time unless you have good regressors.
As many are saying - the SOTA is boosted trees with time features. If the features are bad, then it's TS stuff like ARIMAX. The best way to find out is to test each.
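For what "time features" usually means in practice, here's a quick sketch of featurizing a date index for a boosted tree (the column choices are just illustrative):

```python
import pandas as pd

# Turn the time index into plain columns a tree model can split on.
idx = pd.date_range("2023-01-01", periods=10, freq="D")
feats = pd.DataFrame({
    "dayofweek": idx.dayofweek,
    "month": idx.month,
    "day": idx.day,
    "is_month_start": idx.is_month_start.astype(int),
}, index=idx)

# Concatenate feats with your lags/regressors and feed the result
# to XGBoost or LightGBM as an ordinary tabular regression problem.
```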
Edit: In regards to M5 - there was a lot of 'trickery' done to maximize the cost function there so it might not be 100% super useful, at least in my experience.
PassionatePossum t1_j8y9oam wrote
Reply to comment by [deleted] in [D] Coauthor Paper? by [deleted]
I assume that you are based in the U.S. I'm not really familiar with the U.S. system of "grad school" so take what I say with a grain of salt.
Publishing a paper is certainly a good way to show your professor that you are capable of doing research, but it's probably not absolutely necessary. Having a reputation as a reliable and capable student should also go a long way toward convincing your professor that you are a good candidate.
Working with one of the PhD students on their research project should also be a good way to earn your professor's trust.
dancingnightly t1_j8y81v9 wrote
Reply to comment by pyfreak182 in [Discussion] Time Series methods comparisons: XGBoost, MLForecast, Prophet, ARIMAX? by RAFisherman
Do you know of any similar encoding where you vectorise relative time? As multiple proportions of completeness, if that makes sense?
Say, completeness within a paragraph, within a chapter, within a book? (Besides sinusoidal embeddings, which push up the number of examples you need.)
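Something like this is what I mean - a toy sketch of the encoding I'm describing (entirely hypothetical, not an existing scheme):

```python
def completeness_encoding(tok_i, para_len, para_i, chap_len, chap_i, book_len):
    """Relative time as proportions of completeness at nested levels:
    position within the paragraph, the chapter, and the book.
    (A hypothetical scheme, not a standard embedding.)"""
    return (
        tok_i / para_len,    # fraction of the paragraph completed
        para_i / chap_len,   # fraction of the chapter completed
        chap_i / book_len,   # fraction of the book completed
    )

# e.g. token 5 of a 10-token paragraph, paragraph 2 of 4, chapter 1 of 20
vec = completeness_encoding(5, 10, 2, 4, 1, 20)
```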
dancingnightly t1_j8y7fny wrote
"If you look at the internals, it's a nightmare. A literal nightmare."
Yes, the copy paste button is heavily rinsed at HF HQ.
But you won't believe how much easier they made it to run, tokenize and train models in 2018-19, and at that, train compatible models.
We probably owe a month of NLP progress just to them coming in with those one liners and sensible argument API surfaces.
Now, yes, it's getting crazy - but if there's a new paradigm, a new complex way to code, then a similar library will simplify it, and we'll mostly jump there except for legacy code. It'll become like scikit-learn (which still holds up for most real ML tasks): lots of fine-grained detail and a slightly questionable number of edge cases (looking at the clustering algorithms in particular), but easy as pie to keep going.
I personally couldn't ask for more. I was worried they were going to push auto-switching models to their API at some point, but they've been brilliant. There are bugs, but I've never seen them in inference(besides your classic CUDA OOM), and like Fit_Schedule5951 says, it's all about that with HF.
Appropriate_Ant_4629 t1_j8y3koe wrote
Reply to comment by ckperry in [N] Google is increasing the price of every Colab Pro tier by 10X! Pro is 95 Euro and Pro+ is 433 Euro per month! Without notifying users! by FreePenalties
> Hi, I lead product for Colab.
Thanks for your responses here!
And thank google's management chain above you for allowing you to represent the product here.
Your comments here just saved a number of subscriptions that would have otherwise canceled.
cubej333 t1_j8y25e7 wrote
Reply to comment by [deleted] in [D] Coauthor Paper? by [deleted]
I would expect that good recommendations from known people in the field, corroborated by research productivity, would be excellent for getting into graduate school. Maybe not for getting a great job after graduate school, but you would have all of graduate school to get first-author papers.
Arguably if you have a number of first author papers out of undergrad, you don't need graduate school.
[deleted] OP t1_j8xxtxa wrote
Reply to comment by velcher in [D] Coauthor Paper? by [deleted]
[deleted]
[deleted] OP t1_j8xxr1v wrote
Reply to comment by cubej333 in [D] Coauthor Paper? by [deleted]
[deleted]
[deleted] OP t1_j8xxora wrote
Reply to comment by PassionatePossum in [D] Coauthor Paper? by [deleted]
[deleted]
DigThatData t1_j8xxnpp wrote
Reply to comment by ckperry in [N] Google is increasing the price of every Colab Pro tier by 10X! Pro is 95 Euro and Pro+ is 433 Euro per month! Without notifying users! by FreePenalties
unrelated to OP: what is the "best practice" way for a notebook to test whether it's running in a Colab environment? I think the method I'm currently using is something like

probably_colab = False
try:
    import google.colab
    probably_colab = True
except ImportError:
    pass

which I'm not a fan of for a variety of reasons. What would you recommend?
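For what it's worth, a common alternative floating around (not an official recommendation) is to check sys.modules rather than importing:

```python
import sys

# Colab's runtime reportedly loads google.colab at startup, so checking
# sys.modules detects it without triggering the import ourselves.
probably_colab = "google.colab" in sys.modules
```

This avoids the import side effects and the try/except, though it relies on the runtime having already loaded the package.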
Mental-Reference8330 t1_j8xup7w wrote
Reply to comment by tdgros in [D] Constrained Optimization in Deep Learning by d0cmorris
in the early days, researchers considered the architecture itself to be a form of regularization. LeCun didn't invent the idea, but he popularized the view that a convolutional layer (as in LeNet) is like a fully-connected layer constrained to only allow solutions where the layer weights can be expressed in terms of a convolution kernel. In their introduction, ResNets were likewise motivated as being "constrained" to start from better minima, even though you could also convert a ResNet model to a fully-connected model without loss of precision.
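The "convolution as a constrained fully-connected layer" claim is easy to check numerically - a small sketch with a 1-D kernel (cross-correlation, as deep-learning "convolution" usually is; the kernel and input here are made up):

```python
import numpy as np

# A dense layer constrained so every row is the same kernel, shifted:
# exactly the weight-sharing constraint a conv layer imposes.
rng = np.random.default_rng(0)
kernel = np.array([1.0, -2.0, 0.5])
x = rng.standard_normal(8)

n_out = len(x) - len(kernel) + 1
W = np.zeros((n_out, len(x)))
for i in range(n_out):
    W[i, i:i + len(kernel)] = kernel  # shared weights along each shift

dense_out = W @ x                                   # "fully-connected" view
conv_out = np.correlate(x, kernel, mode="valid")    # convolutional view
```

The two outputs agree exactly, which is the "without loss of precision" conversion in the other direction.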
ckperry t1_j8xtufq wrote
Reply to comment by FreePenalties in [N] Google is increasing the price of every Colab Pro tier by 10X! Pro is 95 Euro and Pro+ is 433 Euro per month! Without notifying users! by FreePenalties
lol please do complain very loudly if we 10x your prices! and thank you!!
in this case it appears only the messaging was affected, and nobody was charged the 94 euros thankfully. I'll update when we get our page fixed. thanks again!
aCuRiOuSguuy t1_j8xsce1 wrote
Reply to [D] Simple Questions Thread by AutoModerator
I am currently a graduate student in Computer Science and am taking a class that talks about the foundation of Machine Learning. The class is very math rigorous in nature.
The textbook that we use is Foundations of Machine Learning by M. Mohri, A. Rostamizadeh, A. Talwalkar.
I am seeking a paid private tutor to help me with the content and homework of the class. Pay is negotiable!
FreePenalties OP t1_j8xrm5m wrote
Reply to comment by ckperry in [N] Google is increasing the price of every Colab Pro tier by 10X! Pro is 95 Euro and Pro+ is 433 Euro per month! Without notifying users! by FreePenalties
Thank you very much for the response - I'll edit the post to be less alarmist. I'd also like to say thank you for making a great platform for data science collaboration, and for finally bringing Pro to Scandinavia :D It's a great-value product and I'm very happy to pay 9 euro for it, but 94 would definitely have been too much.
DrunkOrInBed t1_j8xpz05 wrote
Reply to comment by Ronny_Jotten in [N] Google is increasing the price of every Colab Pro tier by 10X! Pro is 95 Euro and Pro+ is 433 Euro per month! Without notifying users! by FreePenalties
wow, nice find
[deleted] t1_j8xphu8 wrote
Reply to comment by hpstring in [D] HuggingFace considered harmful to the community. /rant by drinkingsomuchcoffee
[deleted]
weeeeeewoooooo t1_j8xoy8u wrote
Reply to comment by BenXavier in [Discussion] Time Series methods comparisons: XGBoost, MLForecast, Prophet, ARIMAX? by RAFisherman
This is a great question. Steve Brunton has some great videos about dynamical systems and their properties that are very accessible. This one I think does a good job showing the behavioral relationship between the eigenvalues and the underlying system: https://youtu.be/XXjoh8L1HkE
Recursive application of a system (model) over a "long" period of time gets rid of transients, so the trajectory falls onto the system's governing attractors, which are generally dictated by its eigenvalues. Recursive application also isolates the system, so you are observing the model running autonomously rather than being driven by external inputs. This helps you tease out how expressive your model actually is versus how dependent it is on being fed the target system's observations, which helps reduce overfitting and bias.
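A tiny numerical illustration of the point (linear system with a made-up matrix): iterate x ← Ax long enough and the eigenvalues decide where you end up.

```python
import numpy as np

# Recursively applying a linear system x_{t+1} = A x_t: transients die
# out and the long-run behavior is governed by the eigenvalues of A.
A = np.array([[0.9, 0.2],
              [0.0, 0.5]])
x = np.array([1.0, 1.0])
for _ in range(200):
    x = A @ x

# Both eigenvalues (0.9 and 0.5) are inside the unit circle, so the
# autonomous system decays to the fixed-point attractor at the origin.
```

With an eigenvalue outside the unit circle the same loop would diverge instead - the recursion exposes which regime the model actually lives in.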
outthemirror t1_j8xotjc wrote
This is like complaining Linux is bad because you have to debug various things
cubej333 t1_j8xo81r wrote
Reply to [D] Coauthor Paper? by [deleted]
Generally the most important thing is your letters of recommendation, which should be good if the professors are putting you on the paper (the paper corroborates that). Being first author on an important paper is probably better, but if someone was first author on an important paper yet had lousy letters of recommendation, it would be a red flag.
Electronic_Medicine7 t1_j8xm8fp wrote
Reply to [N] Google is increasing the price of every Colab Pro tier by 10X! Pro is 95 Euro and Pro+ is 433 Euro per month! Without notifying users! by FreePenalties
That’s why I don’t use Google. I wouldn’t give them the time of day.
medwatt OP t1_j8xlq1b wrote
Reply to comment by Academic-Poetry in [D] Short survey of optimization methods by medwatt
Thanks for the recommendation, but that's a very long book.
athos45678 t1_j8xjcfb wrote
Reply to comment by Sid_b23692 in [N] Google is increasing the price of every Colab Pro tier by 10X! Pro is 95 Euro and Pro+ is 433 Euro per month! Without notifying users! by FreePenalties
Storage depends on your plan, but any overage is $0.02 USD per GB/month and the max is 10 TB.
Drive mounting isn’t exactly there, but you can pull any file with wkentaro’s gdown easily enough.
athos45678 t1_j8xj0gi wrote
Reply to [N] Google is increasing the price of every Colab Pro tier by 10X! Pro is 95 Euro and Pro+ is 433 Euro per month! Without notifying users! by FreePenalties
Paperspace, lads. Paperspace is where it’s at. Except for the storage limitations, my experience there has been so much better than Colab.
ckperry t1_j8xhmdx wrote
Reply to comment by rafaelcm33 in [N] Google is increasing the price of every Colab Pro tier by 10X! Pro is 95 Euro and Pro+ is 433 Euro per month! Without notifying users! by FreePenalties
We made an update so that our advertised prices in the EU reflect tax-inclusive pricing (standard EU practice). This did not change the actual prices paid - you were already paying taxes before.
daking999 t1_j8yao4j wrote
Reply to comment by ckperry in [N] Google is increasing the price of every Colab Pro tier by 10X! Pro is 95 Euro and Pro+ is 433 Euro per month! Without notifying users! by FreePenalties
Oh nice. Please can you get invoice based billing set up so I can have my university pay for my students' Colab subscription!