Recent comments in /f/MachineLearning
BenXavier OP t1_j4zd9jv wrote
Reply to comment by mickman_10 in [R] Researchers out there: which are current research directions for tree-based models? by BenXavier
Hey guys, thank you for the great responses!
- "Accurate Intelligible Models with Pairwise Interactions" seems great, as far as I can understand it. That's what I was referring to with "adding structure to models". Crazy how thin the reference section is: lots of exciting work to do!
- Please do correct me if I'm being too naive, but are there other approaches for "building sub-models" at the splitting point?
- u/mickman_10, u/TheFlyingDrildo, are you also aware of any connection with Symbolic Regression or Association Rule extraction?
Taenk t1_j4zcu0e wrote
Reply to [P] RWKV 14B Language Model & ChatRWKV : pure RNN (attention-free), scalable and parallelizable like Transformers by bo_peng
Do I understand correctly that I could run this model at home on a graphics card with 8GB VRAM?
Anjum48 t1_j4zazrm wrote
Reply to comment by CaptainDifferent3116 in [D] Do you know of any model capable of detecting generative model(GPT) generated text ? by CaptainDifferent3116
Oops - didn't realise that. Apologies
Skirlaxx t1_j4z95ow wrote
Reply to [D] Do you know of any model capable of detecting generative model(GPT) generated text ? by CaptainDifferent3116
Yeah, there's a detector on the Hugging Face Hub. It's not always correct, and its confidence is always extreme, something like 99.99% or 0.01%. But it usually works.
dataslacker t1_j4z8zm4 wrote
Reply to comment by JClub in [R] A simple explanation of Reinforcement Learning from Human Feedback (RLHF) by JClub
Yes, your explanations are clear and match how I understood the paper, but I feel like some motivation for the RL training is missing. Why not "pseudo labeling"? Why is the RL approach better? Also, the reward score is non-differentiable because it was designed that way, but they could have designed it to be differentiable. For example, instead of decoding the log probs, why not train the reward model on them directly? You can still obtain the labels by decoding; that doesn't mean the decoded text has to be the input to the reward model. There are a number of design choices the authors made that are not motivated in the paper. I haven't read the references, so maybe they are motivated elsewhere in the literature, but RL seems like a strange choice for this problem since there isn't a dynamic environment that the agent is interacting with.
Agitated-Purpose-171 t1_j4z7iz5 wrote
Reply to [D] Simple Questions Thread by AutoModerator
Hi everybody, I have a question about VLAD that came up while reading this CVPR paper (Aggregating local descriptors into a compact image representation).
My question is why VLAD works.
Aggregating local descriptors into a compact image representation paper links:
https://lear.inrialpes.fr/pubs/2010/JDSP10/jegou_compactimagerepresentation.pdf
In this paper there is a method, VLAD, that turns the local features (N*D dimensions) into a global feature (k*D dimensions).
Below is my understanding of the operations of VLAD, step by step.
=> input: N*D dimension local feature.
(i) use k-means to find the k clusters and the central feature for each cluster.
(ii) for each cluster find a residual sum.
V = summation of ( each local feature in the cluster minus the central feature).
V = sum (Xi - C)
V: residual sum of the cluster
X: local feature in the cluster
C: Central feature of the cluster
(iii) concatenate the residual sum then get the global feature.
global feature = [V1,V2,....Vk]
(V1 is the residual sum of cluster 1, V2 is the residual sum of cluster 2... and so on.)
=> output: k*D dimension global feature.
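The steps above can be sketched in NumPy as follows. This is only a minimal illustration, not the paper's implementation: the function name `vlad`, the random data, and the choice to pass the k centroids in as a precomputed `codebook` (rather than running k-means on these same features) are all assumptions made for the example.

```python
import numpy as np

def vlad(local_features, centroids):
    """Aggregate N x D local features into a flat k*D vector of residual sums."""
    # Squared distance from every feature to every centroid: shape (N, k).
    dists = ((local_features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    # Step (i), assignment part: index of the nearest centroid for each feature.
    assignment = dists.argmin(axis=1)
    k, d = centroids.shape
    v = np.zeros((k, d))
    for j in range(k):
        members = local_features[assignment == j]
        if len(members):
            # Step (ii): V_j = sum over the cluster of (X_i - C_j).
            v[j] = (members - centroids[j]).sum(axis=0)
    # Step (iii): concatenate [V_1, ..., V_k] into one k*D vector.
    return v.reshape(-1)

rng = np.random.default_rng(0)
codebook = rng.normal(size=(4, 8))         # hypothetical centroids, fixed in advance
image_features = rng.normal(size=(50, 8))  # hypothetical local features of one image
global_feature = vlad(image_features, codebook)
```

If the centroids passed in are exactly the per-cluster means of these same features, every residual sum is zero; with centroids fixed in advance from other data, they generally are not.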
My question is why the residual sum of each cluster is "not" zero.
After all, the central feature that k-means finds for each cluster is the average of the local features in that cluster.
The central feature of cluster 1 = average of the local features in cluster 1.
C1 = (X1 + X2 + X3 + ...+ Xm) / m
The residual sum of cluster 1 = (X1-C1) + (X2-C1) + (X3-C1) + ... + (Xm-C1) = V1
Based on the above equation, I think the residual sum of each cluster is zero. So the global feature will be a zero matrix = [V1, V2,..., Vk] = [zero vector, zero vector, ..., zero vector].
The only reason I can think of is that k-means has not run for enough iterations, so the central feature of each cluster is not equal to the average of the local features in the cluster. Am I right?
Could anybody let me know why the residual sum is not a zero vector? Thanks a lot.
FastestLearner OP t1_j4z74l7 wrote
Reply to comment by Philpax in [D] Idea: SponsorBlock with a neural net as backend by FastestLearner
Yes. I agree that a large model is not required for detecting simple phrases like "Please subscribe to our channel" or "Here is the sponsor of our video". I also have another idea that I think should help in getting better accuracy: use the channel's unique identifier (UID) or the channel's name as an input (and generate probabilities conditioned on the channel's UID). This should help because any particular YouTube channel almost always uses the same phrase to introduce its sponsors in almost all of its videos. Think of LinusTechTips: you always hear the same thing, "here's the segue to our sponsor yada yada." So this should definitely allow the model to do more accurate inference. Alternatively, you can just reduce the model complexity to save the client's resources.
As for the other thing you mentioned, the average user not hitting the right arrow two times: I think (and this is my hypothesis) the number of users running adblocking software just increases monotonically, because once a user gets to savour the internet without ads, they don't go back. Only elderly folks and the absolutely-not-computer-savvy people don't use adblockers, and IMO that population is decreasing and, in the (near) future, will simply vanish. This is similar to what Steve Jobs said when he was asked whether people would ever use the mouse. Look at it now: everyone uses the mouse. Coming to sponsor blocking, not hitting the right arrow at all is just more convenient than hitting it two times. Sometimes hitting it x number of times does not get the job done and you need to hit it further. Also, you might miss the beginning of the non-sponsored segment, so you need to hit the left arrow once too. All of this is made convenient by the current SOTA SponsorBlock extension. It has just begun its journey, and I have no doubt that, just like the adblocking extensions, sponsor blocking is going to take off and see exponential growth.
[deleted] t1_j4z5kat wrote
Reply to comment by junetwentyfirst2020 in [D] Do you know of any model capable of detecting generative model(GPT) generated text ? by CaptainDifferent3116
[removed]
JClub OP t1_j4z5ciu wrote
Reply to comment by JoeHenzi in [R] A simple explanation of Reinforcement Learning from Human Feedback (RLHF) by JClub
This package is pretty simple to use! https://github.com/lvwerra/trl
It supports decoder-only models like GPT and it is in the process of supporting enc-dec like T5.
JClub OP t1_j4z57kr wrote
Reply to comment by dataslacker in [R] A simple explanation of Reinforcement Learning from Human Feedback (RLHF) by JClub
Yes, the reward model can rank model outputs but it does that by giving a score to each output. You want to train with this score, not with "pseudo labeling" as you are stating. But the reward score is non-differentiable, and RL helps to construct a differentiable loss. Does that make sense?
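A toy illustration of that last point, assuming a plain REINFORCE-style (score-function) estimator and a softmax policy over three candidate outputs. This is not the PPO procedure used in the paper or in trl, and `score_function_gradient` and all the numbers are made up for the example. The reward only ever appears as a number multiplying the gradient of a log-probability, so nothing has to be differentiated through the reward itself.

```python
import numpy as np

def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

def score_function_gradient(logits, rewards):
    """Exact expectation of reward * grad(log p(output)) w.r.t. the logits."""
    p = softmax(logits)
    grad = np.zeros_like(logits)
    for i, r in enumerate(rewards):
        one_hot = np.eye(len(logits))[i]
        # For a softmax policy, the gradient of log p_i w.r.t. the logits
        # is (one_hot - p); the reward r is treated as a plain constant.
        grad += p[i] * r * (one_hot - p)
    return grad

logits = np.array([0.2, -1.0, 0.5])   # hypothetical policy parameters
rewards = np.array([1.0, 0.0, 3.0])   # hypothetical reward-model scores
grad = score_function_gradient(logits, rewards)
```

Stepping the logits along `grad` raises the expected reward, even though the rewards themselves were never differentiated.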
SnooWords6686 t1_j4z34s8 wrote
Reply to comment by UltimateGPower in [P] featureimpact: A Python package for estimating the impact of features on ML models by cblume
Thanks 👍
monkeysingmonkeynew OP t1_j4z1fum wrote
Reply to comment by Equivalent-Way3 in [D] Is it possible to update random forest parameters with new data instead of retraining on all data? by monkeysingmonkeynew
Thanks! Do you have any more info on how to do it with XGBoost?
junetwentyfirst2020 t1_j4z0oyt wrote
Reply to [D] ML Researchers/Engineers in Industry: Why don't companies use open source models more often? by tennismlandguitar
🤫 they do. But there tend to be licensing issues, so they don’t.
kdr4t3 t1_j4yy99a wrote
Reply to comment by Apprehensive-Tax-214 in [P] Built an at-cost, pay per second, open-source API for Tortoise text-to-speech (best I've heard!) by Apprehensive-Tax-214
I have a verified GitHub email and it doesn't work for me either.
[deleted] t1_j4yx0ra wrote
Reply to comment by ThrillHouseofMirth in [D] Do you know of any model capable of detecting generative model(GPT) generated text ? by CaptainDifferent3116
[removed]
Beautiful-Lock-4303 t1_j4yvklh wrote
Reply to [D] Do you know of any model capable of detecting generative model(GPT) generated text ? by CaptainDifferent3116
If you could, you could just make GPT better through a GAN architecture, and then you couldn’t anymore.
niclas_wue OP t1_j4yukoz wrote
Reply to comment by randomusername11010 in [P] I built arxiv-summary.com, a list of GPT-3 generated paper summaries by niclas_wue
Yes, it is possible to use citations as a measure of a paper's impact. However, when a paper is newly published, there are typically no citations yet, so this would result in a delayed signal. Retweets and GitHub stars provide a faster indication of a paper's impact. I believe that speed is important because, as a paper becomes older, there are already many reviews and articles written by humans that (at least for now) provide a better summary of the paper.
Category-Basic t1_j4yu5ck wrote
Reply to comment by WistfulSonder in [Discussion] If ML is based on data generated by humans, can it truly outperform humans? by groman434
That is the million-dollar question. A lot of clever people seem to be finding new ways all the time. I think that, at this point, it is safe to say that any task that has sufficient relevant data can probably be modeled and subjected to ML. I might not be able to figure out how, but I am sure someone could.
dataslacker t1_j4yraoc wrote
Reply to comment by JClub in [R] A simple explanation of Reinforcement Learning from Human Feedback (RLHF) by JClub
Sorry, I think I didn’t do a great job asking the question. The reward model, as I understand it, will rank the N generated responses from the LLM. So why not take the top-ranked response as ground truth (or a weak label, if you’d like) and train in a supervised fashion, predicting the next token? This would avoid the RL training, which I understand is inefficient and unstable.
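To make the proposal concrete, here is a minimal sketch of the data-collection half of it, under stated assumptions: `generate` and `reward` are hypothetical callables standing in for the LLM sampler and the reward model, and `build_pseudo_labels` is a name invented for this example. It only builds the (prompt, best response) pairs; the supervised fine-tuning step on those pairs is not shown.

```python
from typing import Callable, List, Tuple

def build_pseudo_labels(
    prompts: List[str],
    generate: Callable[[str], str],        # hypothetical: samples one response
    reward: Callable[[str, str], float],   # hypothetical: scores (prompt, response)
    n: int = 4,
) -> List[Tuple[str, str]]:
    """For each prompt, sample n responses and keep the highest-scoring one."""
    dataset = []
    for prompt in prompts:
        candidates = [generate(prompt) for _ in range(n)]
        # The top-ranked candidate becomes the weak label for this prompt.
        best = max(candidates, key=lambda response: reward(prompt, response))
        dataset.append((prompt, best))
    return dataset
```

The resulting pairs could then be used as ordinary next-token prediction targets.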
kyoko9 t1_j4yqrka wrote
Reply to [D] Do you know of any model capable of detecting generative model(GPT) generated text ? by CaptainDifferent3116
I'm sorry, I don't know of any model that can detect GPT-generated text.
JoeHenzi t1_j4yowtu wrote
Taking a look - wanting to implement this in my application to explore parameter space, shoot for optimal, but actually am finding ChatGPT gets very cagey on the topic lately. Explored the topic of Genetic Algorithms, which it suggested would be less computationally expensive, then decided to not help me really get to coding it.
EDIT: This is exactly my use case...
randomusername11010 t1_j4yljwd wrote
Could you parse the citations to find which papers are cited the most to determine the most relevant papers rather than relying on papers with code?
crazymonezyy t1_j4yjtuz wrote
Reply to comment by dataslacker in [R] A simple explanation of Reinforcement Learning from Human Feedback (RLHF) by JClub
Amongst other things, RL's major benefit is learning from a sequence of rewards rather than simply "a reward", which would be the assumption when you treat this as an SL problem. Do remember that IID observations are one of the fundamental premises of SL.
Pavarottiy t1_j4yf6x8 wrote
Hough Transform would work fine in this given case. It is based on setting up a hypothesis and evaluating the points in a voting scheme.
After lines are acquired, one can project on a 2d image from a given viewpoint.
You can check this repo which also provides visualization: https://github.com/cdalitz/hough-3d-lines
BenXavier OP t1_j4zdd7b wrote
Reply to comment by SnooHesitations8849 in [R] Researchers out there: which are current research directions for tree-based models? by BenXavier
can you elaborate a bit?
Are you talking about this? https://arxiv.org/abs/1702.08835