Recent comments in /f/MachineLearning
nateharada OP t1_j4ngy65 wrote
Reply to comment by Zealousideal_Low1287 in [P] A small tool that shuts down your machine when GPU utilization drops too low. by nateharada
It's actually almost entirely ready now, I just need to alter a few things. I'll go ahead and push it soon! Need to do some final tests.
EDIT: The above code should work! See the README on GitHub for a complete example.
nadhsib t1_j4ngqgj wrote
Reply to [D] Can ChatGPT flag it's own writings? by MrSpotgold
I've tried that, but no joy.
As it's designed to emulate different styles of writing I doubt it's a "coming soon" thing.
dandandanftw t1_j4nf5i3 wrote
Reply to [D] Model for detecting rectangle corners? by hundley10
Run a corner detector; then, depending on how many corners you get, you can brute-force every possible rectangle. You can also use Hough line detection to limit the number of corners, or a simple model like an SVM to compare the corners and patterns of the given images. You should also check out GLCM for preprocessing the pattern.
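A minimal sketch of the brute-force step, assuming the corner detector has already produced a short list of (x, y) candidates. The function names and sample points are hypothetical, and the tolerance would need to be much looser (a few pixels) on real detections. It relies on the fact that four distinct points are the corners of a rectangle exactly when they are all the same distance from their centroid.

```python
from itertools import combinations
from math import hypot


def is_rectangle(points, tol=1e-6):
    """True if four distinct 2D points are the corners of a rectangle.

    Four distinct points form a rectangle exactly when they are all
    equidistant from their centroid (equal diagonals that bisect each other).
    """
    cx = sum(x for x, _ in points) / 4.0
    cy = sum(y for _, y in points) / 4.0
    dists = [hypot(x - cx, y - cy) for x, y in points]
    return dists[0] > tol and max(dists) - min(dists) <= tol


def find_rectangles(corners, tol=1e-6):
    """Brute-force every 4-subset of candidate corners; O(n^4), fine for small n."""
    return [quad for quad in combinations(corners, 4) if is_rectangle(quad, tol)]


# Hypothetical detector output: four true card corners plus two spurious points.
candidates = [(0, 0), (4, 0), (4, 2), (0, 2), (1, 5), (7, 3)]
rectangles = find_rectangles(candidates)
```

Because the cost grows as n^4, pruning candidates first (for example with the Hough line step) matters once the detector returns more than a few dozen points.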
Zealousideal_Low1287 t1_j4neybb wrote
Reply to comment by nateharada in [P] A small tool that shuts down your machine when GPU utilization drops too low. by nateharada
Yeah, that’s something which would be useful indeed. Don’t worry yourself though, I can put in a PR.
actualsnek t1_j4neego wrote
Reply to comment by Acrobatic-Name5948 in [D] What's your opinion on "neurocompositional computing"? (Microsoft paper from April 2022) by currentscurrents
NECSTransformer appears to be a generalization of the TP-Transformer presented by Schlag et al. 2019 with implementation available at this GitHub repo.
nateharada OP t1_j4ne979 wrote
Reply to comment by Zealousideal_Low1287 in [P] A small tool that shuts down your machine when GPU utilization drops too low. by nateharada
Nice! Right now you can use the end_process trigger to just return 0 when the trigger is hit from the process, but it should be fairly straightforward to externalize the API a little bit more. This would let you do something like this in your script:
import time

from gpu_sentinel import Sentinel, get_gpu_usage

# my_callback_fn is whatever you want to run when the kill trigger fires.
sentinel = Sentinel(
    arm_duration=10,
    arm_threshold=0.7,
    kill_duration=60,
    kill_threshold=0.7,
    kill_fn=my_callback_fn,
)

while True:
    gpu_usage = get_gpu_usage(device_ids=[0, 1, 2, 3])
    sentinel.tick(gpu_usage)
    time.sleep(1)
Is that something that would be useful? You can define the callback function yourself so maybe you trigger an alert, etc.
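For anyone wondering what the arm/kill logic behind an interface like that might look like, here is a rough sketch. It is not the actual gpu_sentinel implementation: the class name `SentinelSketch`, the choice to count durations in ticks, and averaging usage across GPUs are all assumptions made for illustration.

```python
from typing import Callable, Sequence


class SentinelSketch:
    """Hypothetical arm-then-kill state machine (not the real gpu_sentinel class)."""

    def __init__(self, arm_duration: int, arm_threshold: float,
                 kill_duration: int, kill_threshold: float,
                 kill_fn: Callable[[], None]) -> None:
        self.arm_duration = arm_duration
        self.arm_threshold = arm_threshold
        self.kill_duration = kill_duration
        self.kill_threshold = kill_threshold
        self.kill_fn = kill_fn
        self.armed = False
        self.fired = False
        self._streak = 0  # consecutive ticks satisfying the current condition

    def tick(self, gpu_usage: Sequence[float]) -> None:
        """Feed one sample of per-GPU utilization (fractions in [0, 1])."""
        if self.fired or not gpu_usage:
            return
        mean_usage = sum(gpu_usage) / len(gpu_usage)
        if not self.armed:
            # Arm only after usage stays high, so start-up idle time is ignored.
            self._streak = self._streak + 1 if mean_usage >= self.arm_threshold else 0
            if self._streak >= self.arm_duration:
                self.armed = True
                self._streak = 0
        else:
            # Once armed, fire after usage stays low for kill_duration ticks.
            self._streak = self._streak + 1 if mean_usage < self.kill_threshold else 0
            if self._streak >= self.kill_duration:
                self.fired = True
                self.kill_fn()


events = []
sentinel = SentinelSketch(arm_duration=2, arm_threshold=0.7,
                          kill_duration=3, kill_threshold=0.7,
                          kill_fn=lambda: events.append("killed"))
for usage in [0.1, 0.9, 0.9, 0.8, 0.2, 0.1, 0.0, 0.0]:
    sentinel.tick([usage])
```

Passing the callback in, rather than hard-coding a shutdown, is what would let the same loop send a notification instead of powering the machine off.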
SearchAtlantis t1_j4ne6jj wrote
Reply to comment by timdettmers in [D] Tim Dettmers' GPU advice blog updated for 4000 series by init__27
For what it's worth, I think 15% seems low. Having just finished an MS with deep learning in my thesis, I used it about 25% of the time over the course of a year: quick, shallow tests of architecture and other changes, then runs of those changes at full depth for comparison.
moschles t1_j4nczb1 wrote
Reply to comment by pm_me_your_pay_slips in [D] Bitter lesson 2.0? by Tea_Pearce
Or worse, is "Foundation Model" just a contemporary buzzword replacement for unsupervised training?
moschles t1_j4nch5w wrote
Reply to [D] Bitter lesson 2.0? by Tea_Pearce
> Seems to be derived by observing that the most promising work in robotics today (where generating data is challenging) is coming from piggy-backing on the success of large language models (think SayCan etc).
There is nothing really magical being claimed here. The LLMs are undergoing unsupervised training, essentially by creating distortions of the text. (One type of "distortion" is Cloze deletion, but there are others in the panoply of distorted text.)
Unsupervised training avoids the bottleneck of having to manually pre-label your dataset.
When we translate unsupervised training to the robotics domain, what does that look like? Perhaps "next word prediction" is analogous to "next second prediction" of a physical environment. And Cloze Deletion has an analogy to probabilistic "in-painting" done by existing diffusion models.
That's the way I see it. I'm not particularly sold on this idea that the pretraining would be a literal LLM trained on text, ported seamlessly to the robotics domain. If I'm wrong, set me straight.
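To make the two kinds of self-made training signal concrete, here is a small sketch of how raw text yields next-word and Cloze-style examples with no manual labels. The function names, the `[MASK]` token, and the masking probability are illustrative assumptions, not any particular model's recipe.

```python
import random


def next_word_examples(tokens):
    """Each prefix becomes an input; the word that follows it is the target."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def cloze_example(tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """Hide a random subset of words; the hidden words are the targets."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(mask_token)
            targets[i] = tok  # position -> original word the model must recover
        else:
            masked.append(tok)
    return masked, targets


tokens = "the robot picks up the red block".split()
pairs = next_word_examples(tokens)
masked, targets = cloze_example(tokens, mask_prob=0.3)
```

The robotics analogue would swap the token list for a sequence of sensor frames: predict the next frame from the previous ones, or fill in frames that were hidden.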
learn-deeply t1_j4na276 wrote
Note: graphs comparing GPUs are not actual benchmarks but theoretical results. Nvidia likes to arbitrarily add restrictions to their non-datacenter GPUs, so it's not clear what the real-world performance is.
avocadoughnut t1_j4n8bp2 wrote
Reply to comment by LetGoAndBeReal in [D] Fine-tuning open source models on specific tasks to compete with ChatGPT? by jaqws
Well, there are projects like WebGPT (by OpenAI) that make use of external knowledge sources. I personally think that's the future of these models: moderated databases of documents. The knowledge is much more interpretable and modifiable that way.
chief167 t1_j4n7wdv wrote
A little bit dangerous, because the A100 is a beast when you look at performance/kWh. If you run a lot of heavy workloads, it's your best option, but on this chart it looks like the worst
TCO != Purchase price
No_Research5050 t1_j4n77xh wrote
spam filter.
LetGoAndBeReal t1_j4n6rfa wrote
Reply to comment by avocadoughnut in [D] Fine-tuning open source models on specific tasks to compete with ChatGPT? by jaqws
Wow, that seems awfully ambitious given that GPT3.5 requires something like 700GB of RAM and the apparent unlikeliness that SoTA model sizes will get smaller anytime soon. Interesting project to watch, though.
junetwentyfirst2020 t1_j4n6amt wrote
Reply to [P] A small tool that shuts down your machine when GPU utilization drops too low. by nateharada
👍 very cool
faker10101891 OP t1_j4n6a3b wrote
Reply to comment by sayoonarachu in [D] What kinds of interesting models can I train with just an RTX 4080? by faker10101891
Thanks, I'll check that out!
[deleted] t1_j4n5wxx wrote
Reply to [D] Model for detecting rectangle corners? by hundley10
[removed]
avocadoughnut t1_j4n5sp8 wrote
Reply to comment by LetGoAndBeReal in [D] Fine-tuning open source models on specific tasks to compete with ChatGPT? by jaqws
From what I've heard, they want a model small enough to run on consumer hardware. I don't think that's currently possible (probably not enough knowledge capacity). But I haven't heard that a decision has been made on this end. The most important part of the project at the moment is crowdsourcing good data.
hundley10 OP t1_j4n5p0y wrote
Reply to comment by robobub in [D] Model for detecting rectangle corners? by hundley10
Edited post with some example pics. I've been leaning toward #3 if I can't find a better solution, but can you provide more info about #2? My labels are the (x,y) coordinates of each corner of the cards.
bubudumbdumb t1_j4n54nk wrote
Reply to comment by hundley10 in [D] Model for detecting rectangle corners? by hundley10
The key here is that by detecting keypoints you don't need to detect the corners per se: you detect at least a dozen points from the pattern on the card, then, assuming the card is a rectangle on a plane, you can identify the corners.
In other words, this can be very robust to occlusions: you might see no more than half of the card and still be able to identify where the corners are.
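A sketch of the geometry behind that, assuming you already have matched keypoints between a flat reference image of the card and the photo. The card size, the synthetic transform, and the helper names are made up for the demo. It fits a homography with a plain direct linear transform in NumPy; with real, noisy matches you would normally want a robust estimator instead, such as OpenCV's `cv2.findHomography` with RANSAC.

```python
import numpy as np


def fit_homography(src, dst):
    """Direct linear transform: find H (3x3) with dst ~ H @ src, from >= 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]


def project(h, pts):
    """Apply a homography to an (N, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ h.T
    return homog[:, :2] / homog[:, 2:3]


# Hypothetical setup: a 63 x 88 card (reference coordinates) seen under a known
# perspective transform, with keypoints visible only on its left half.
true_h = np.array([[0.9, 0.1, 30.0],
                   [-0.2, 1.1, 40.0],
                   [1e-4, 2e-4, 1.0]])
rng = np.random.default_rng(0)
ref_keypoints = rng.uniform([0, 0], [31, 88], size=(12, 2))
img_keypoints = project(true_h, ref_keypoints)

h_est = fit_homography(ref_keypoints, img_keypoints)
ref_corners = np.array([[0, 0], [63, 0], [63, 88], [0, 88]], dtype=float)
img_corners = project(h_est, ref_corners)  # includes the two occluded corners
```

The two right-hand corners are recovered even though no keypoint came from that half of the card, which is the occlusion robustness described above.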
hundley10 OP t1_j4n3tz2 wrote
Reply to comment by bubudumbdumb in [D] Model for detecting rectangle corners? by hundley10
Thanks for the suggestions. I edited my post to give some examples of the detection that needs to be performed... notice how sometimes corners can be obscured, and the background can make "simple" rectangle detection a poor fit. I will check out ORB though.
robobub t1_j4n3gcm wrote
Reply to [D] Model for detecting rectangle corners? by hundley10
A couple of options off the top of my head:
- Add orientation prediction to the bounding box
- Add keypoints for the 4 actual corners as a prediction
- Postprocess boxes with classical techniques, looking for the outermost corners that fit certain properties
- Do everything classically, and deal with the difficulties you have mentioned in your comment.
The first two require annotations of attributes for each box, and will be predicted directly by the model. Though note that you don't have to do this for every label, you can just not train parts of the model when certain attributes are unlabeled.
Both will require some care in modeling, e.g. orientation wraps around at 360 degrees, which your loss will need to handle, and keypoint regression can be done well or badly, so refer to how corners are modeled. And then of course you'll need to postprocess the model's outputs to align/visualize on an image.
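On the orientation point, here is a small sketch of two common ways to deal with the jump at 360 degrees: wrap the error before penalizing it, or regress (sin, cos) instead of the raw angle. The function names are illustrative only, and this is plain Python rather than a loss for any specific framework.

```python
import math


def wrapped_angle_error(pred_deg, target_deg):
    """Smallest signed difference between two angles, in degrees, in [-180, 180)."""
    return (pred_deg - target_deg + 180.0) % 360.0 - 180.0


def sincos_encode(angle_deg):
    """Alternative: regress (sin, cos) so the target itself has no jump at 360."""
    rad = math.radians(angle_deg)
    return math.sin(rad), math.cos(rad)


def sincos_decode(s, c):
    """Recover the angle in [0, 360) from a (sin, cos) prediction."""
    return math.degrees(math.atan2(s, c)) % 360.0


naive = abs(359.0 - 1.0)                         # 358.0: punishes a near-perfect guess
wrapped = abs(wrapped_angle_error(359.0, 1.0))   # 2.0: the true angular distance
```

If the card looks the same upside down, the same idea applies with a 180-degree period instead of 360.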
sayoonarachu t1_j4n2w5j wrote
Quite a bit, and even more if you use optimized frameworks and packages like voltaml, PyTorch Lightning, colossalai, bitsandbytes, xformers, etc. Those are just the ones I am familiar with.
Some libraries allow balancing between cpu, gpu, and memory, though obviously, that will come at a cost of speed.
As a general rule, the more parameters the model has, the higher the memory cost. So, unless you're planning to train from scratch or fine-tune models with billions of parameters, you'll be fine.
It's gonna take playing around with hyperparameters, switching between 32-, 16-, and 8-bit quantization with PyTorch or other Python packages, testing offloading weights between GPU and CPU, etc. to get a feel for what you can and can't do.
Also, if I remember correctly, PyTorch 2.0 should benefit the consumer Nvidia 40 series to some extent once it is more mature.
Edit: p.s. supposedly a new Forward Forward algorithm can be "helpful" for large models since there's no back propagation
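The parameter-count rule of thumb above can be turned into back-of-the-envelope arithmetic. This is a rough sketch only: it counts weights (plus gradients and Adam moments for training) and ignores activations, framework overhead, and fragmentation, so real usage will be higher. The helper names are made up for illustration.

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params, dtype="fp32"):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def adam_training_memory_gb(num_params):
    """Rough fp32 training footprint with Adam: weights + gradients + two
    optimizer moments = 4 copies of the parameters. Activations are extra."""
    return 4 * weight_memory_gb(num_params, "fp32")


VRAM_GB = 16  # RTX 4080

for n in (125e6, 1.3e9, 7e9):
    line = ", ".join(f"{d} {weight_memory_gb(n, d):.1f} GB" for d in BYTES_PER_PARAM)
    fits = "fits" if adam_training_memory_gb(n) <= VRAM_GB else "does not fit"
    print(f"{n:.3g} params: {line}; naive Adam training {fits} in {VRAM_GB} GB")
```

This is why dropping to 16-bit or 8-bit, or offloading optimizer state to the CPU, changes which model sizes are practical on a single card.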
Zealousideal_Low1287 t1_j4n2ahm wrote
Reply to [P] A small tool that shuts down your machine when GPU utilization drops too low. by nateharada
Looks nice. I probably wouldn’t use it for shutting down or anything, but a notification on failure might be useful!
dmart89 t1_j4nio9p wrote
Reply to comment by lumin0va in [D] Can ChatGPT flag it's own writings? by MrSpotgold
Idk, I guess the point is that if text is 100% gpt written and not reviewed by a human, then there is a risk that gpt learns from bad gpt examples. If you review and modify it to remove the watermark, then it is effectively human reviewed/labelled content and ok for re-ingestion in future iterations.
But tbh the guys at openai are pretty capable, I'm sure they'll think of something. I don't know anything more than the headline I read.