Recent comments in /f/MachineLearning
glitteringpenny t1_j5nfl5t wrote
Reply to comment by glitteringpenny in [D] With more compute could it be easy to quickly un Mask all the people on Reddit by using text correlations to non masked publicly available text data? by Loquzofaricoalaphar
Yea, I’m being salty because I was viewing LastPass before this subreddit
glitteringpenny t1_j5nfjzq wrote
Reply to [D] With more compute could it be easy to quickly un Mask all the people on Reddit by using text correlations to non masked publicly available text data? by Loquzofaricoalaphar
Start by getting your hands on LastPass's customer data. Bet you could unmask a good amount of us. 25 million users' data… unleashed
chip_0 t1_j5naknl wrote
Reply to [P] RWKV 14B Language Model & ChatRWKV : pure RNN (attention-free), scalable and parallelizable like Transformers by bo_peng
Have you used RL with Human Feedback to fine-tune it yet?
I have an idea about how to use RLHF without expensive human annotation. Let me know if you would like to collaborate on that!
like_a_tensor t1_j5n7akg wrote
Very nice work! Do you plan to release any solutions to the problems?
Oceanboi t1_j5n6p7b wrote
Reply to comment by [deleted] in [D] Simple Questions Thread by AutoModerator
My advice is to proceed. It's cool to know the math underneath, but just go implement stuff, dude; if it doesn't work you can always remote into or rent a GPU. What I did for my thesis was google tutorials and re-implement them using my dataset. Through all the bugs and the elbow grease, you will know enough to at least speak the language. Just do it and don't procrastinate with these types of posts (I do this too sometimes)
EDIT: A lot can be done on Colab these days regarding neural networks and Hugging Face. Google the Hugging Face documentation! I implemented a Hugging Face transformer model to do audio classification (and I'm a total noob, I just copied a tutorial). It was a total misuse of the model and accuracy was bad, but at least I learned, and given a real problem I could at least find my way forward.
Oceanboi t1_j5n6ed0 wrote
Reply to comment by trnka in [D] Simple Questions Thread by AutoModerator
Can you expand on why one might ever want to use a neural network for a linear regression problem? It feels like bringing a machine gun to a knife fight.
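One way to see why it's overkill: a single linear "neuron" trained by gradient descent on mean squared error converges to exactly the closed-form least-squares solution, just with far more work. A minimal sketch (the data and step counts below are made up for illustration):

```python
# Simple 1-D data: roughly y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.02, 2.98, 5.01, 7.0, 8.99]
n = len(xs)

# Closed-form least squares
mx = sum(xs) / n
my = sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx

# The same model as a "one-neuron network": gradient descent on MSE
w, b, lr = 0.0, 0.0, 0.01
for _ in range(20000):
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    w, b = w - lr * dw, b - lr * db

# Both routes land on the same line
print(abs(w - slope) < 1e-4 and abs(b - intercept) < 1e-4)  # True
```

The gradient-descent version only earns its keep once you add nonlinearities, which is exactly when the closed form stops existing.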
fkrhvfpdbn4f0x t1_j5n3ypm wrote
Reply to comment by aristotle137 in [P] New textbook: Understanding Deep Learning by SimonJDPrince
>u/SimonJDPrince
Could you share a link to a CV textbook?
SufficientType1794 t1_j5n1c3y wrote
Reply to comment by Fabulous-Possible758 in [D] Couldn't devs of major GPTs have added an invisible but detectable watermark in the models? by scarynut
That's just training a GAN with extra steps /s
aamir23 t1_j5mzuw2 wrote
There's another book with the same title: Understanding Deep Learning.
FastestLearner t1_j5mwz47 wrote
Reply to comment by ArnoF7 in [D] Multiple Different GPUs? by Maxerature
It is possible, but it would require you to write custom code for every memcopy operation that you want to perform, i.e. tensor.to(device), which you can get away with on a smaller project but could become prohibitively cumbersome on a large one. Also, you'd still need to do two forward passes (one with the data already on the 3080, and another with the data that starts on the 1080, after it has been transferred to the 3080). Whether or not this is beneficial boils down to the difference in transfer rates between the RAM-3080 route and the RAM-1080-3080 route. I wouldn't be able to tell which one is faster without benchmarking.
DeepSpeed handles the RAM-3080 to-and-fro transfers for large batch sizes automatically.
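The route comparison above comes down to simple arithmetic once you've benchmarked the links. A back-of-envelope sketch (every bandwidth and batch-size number below is a hypothetical placeholder, not a measured value):

```python
GB = 1e9
batch_bytes = 2 * GB     # hypothetical batch size

ram_to_3080 = 12 * GB    # host -> 3080 over PCIe (hypothetical)
ram_to_1080 = 12 * GB    # host -> 1080 over PCIe (hypothetical)
gpu_to_gpu = 10 * GB     # 1080 -> 3080 copy (hypothetical)

# Time to get one batch onto the 3080 via each route
direct = batch_bytes / ram_to_3080
staged = batch_bytes / ram_to_1080 + batch_bytes / gpu_to_gpu

print(f"direct RAM->3080: {direct:.3f}s, staged RAM->1080->3080: {staged:.3f}s")
```

As raw latency, the staged route always adds a second hop, so it can only win if the 1080-to-3080 copy overlaps with compute on the 3080; that's the part only a real benchmark can settle.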
sweetlou357 t1_j5mtytp wrote
Wow this looks like an amazing resource!
suflaj t1_j5mnzq3 wrote
Reply to comment by WigglyHypersurface in [D] Embedding bags for LLMs by WigglyHypersurface
Why would this matter?
If such examples are present in the training set and adequately expressed, then the model will learn whatever it needs to learn from those words.
If they are not in the training set, you should not expect the model to understand them the same way you do.
I realize this defeats the point of generalization, but LLMs learn to mimic generalization through exposure, not by actually learning to understand the underlying principles. These models do not analyze text like we humans do, but they have been shown to outperform the average human despite that.
Ultimately, to do what you are doing, you would need a tokenizer that has all the syntactic knowledge for a given subset of the input language embedded within itself. Wasn't AlexNet, a decade ago, enough to convince you to always relegate these kinds of tasks to the DL model, which will always beat a human provided it has the capacity and the data?
profjonathanbriggs t1_j5mljdr wrote
Added to my reading stack. Thanks for this. Will revert with comments.
Pavarottiy t1_j5mhsjq wrote
Reply to comment by EducationalLayer1051 in [D] Automated Extraction of Building Geometry by EducationalLayer1051
Yes, it is a general line detection method, so any point cloud with lines in it can be used as input. 3D scans have some sensor noise, so a robust implementation of this method would be better suited there. But for a perfect CAD model like the example, even the one I provided in my previous comment should work fine. Do you only want to detect lines, or planes and lines? For planes, there are plane-fitting approaches, and one can then find the intersections of planes (as an alternative to Hough).
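The plane-intersection alternative mentioned above can be sketched in a few lines for the noise-free case (a real scan would need robust fitting such as RANSAC; the points below are made up):

```python
def cross(a, b):
    # Cross product of two 3-vectors
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def plane_normal(p0, p1, p2):
    # Normal of the plane through three non-collinear points
    return cross(sub(p1, p0), sub(p2, p0))

# Two planes from a clean model: z = 0 and x = 0
n1 = plane_normal((0, 0, 0), (1, 0, 0), (0, 1, 0))  # (0, 0, 1)
n2 = plane_normal((0, 0, 0), (0, 1, 0), (0, 0, 1))  # (1, 0, 0)

# The intersection line's direction is perpendicular to both normals
direction = cross(n1, n2)
print(direction)  # (0, 1, 0): the y-axis, as expected for z=0 ∩ x=0
```

With noisy data you would fit each plane to many points (least squares or RANSAC) instead of taking three exact points, but the intersection step stays the same.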
shadeofmyheart t1_j5mhq5a wrote
Reply to [D] Couldn't devs of major GPTs have added an invisible but detectable watermark in the models? by scarynut
You mean like ChatGPT being able to recognize the tokens in its own output?
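One published family of schemes ("green list" statistical watermarks) works roughly like that: generation is biased toward a pseudo-random subset of the vocabulary keyed on the previous token, and detection just counts hits, with no need for the model to recognize its own prose. A toy sketch (the vocabulary and hashing scheme here are made up for illustration):

```python
import hashlib

VOCAB = [f"tok{i}" for i in range(1000)]  # toy vocabulary (made up)

def green_list(prev_token, vocab, fraction=0.5):
    # Pseudo-randomly mark ~`fraction` of the vocab as "green",
    # keyed on the previous token.
    green = set()
    for tok in vocab:
        digest = hashlib.sha256((prev_token + "|" + tok).encode()).digest()
        if digest[0] < 256 * fraction:
            green.add(tok)
    return green

# A watermarking generator biases sampling toward green tokens;
# here we fake one that always picks a green token.
tokens = ["tok0"]
for _ in range(20):
    tokens.append(sorted(green_list(tokens[-1], VOCAB))[0])

# Detection: how often does a token fall in the previous token's green list?
hits = sum(tok in green_list(prev, VOCAB) for prev, tok in zip(tokens, tokens[1:]))
rate = hits / (len(tokens) - 1)
print(rate)  # 1.0 for this fake generator; ~0.5 for unwatermarked text
```

The statistical test is on the text alone, so anyone with the key can run it, but paraphrasing the output degrades the signal.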
K_is_for_Karma t1_j5mh00k wrote
How recent is your chapter on generative models? I'm starting to pursue research in the area and need to get up to date.
SimonJDPrince OP t1_j5mfyr1 wrote
Reply to comment by Nhabls in [P] New textbook: Understanding Deep Learning by SimonJDPrince
I'll give people the choice in the end...
PabloEs_ t1_j5mdolq wrote
Looks good and fills a gap; imo there is no good DL book out there. What could be better: state results more clearly, as theorems with all the needed assumptions.
NihonNoRyu t1_j5md892 wrote
Will you add a section for Forward-forward algorithm?
muchcharles t1_j5mbjtb wrote
Reply to comment by Avelina9X in [D] Did YouTube just add upscaling? by Avelina9X
> Dedicated graphics is completely idle during this.
Are you sure fixed function decoder/upscale stuff is reported in GPU utilization graphs?
[deleted] t1_j5m9vib wrote
[removed]
Comfortable_End5976 t1_j5m4w76 wrote
Having a skim, it looks good, mate. I like your writing style. Please let us know once it's published so we can pick up a physical copy.
Nhabls t1_j5m4lxa wrote
Obviously I haven't had the time to read through it, and this is a clear nitpick, but I really don't like when sites like this force you to download the files rather than display them in the browser by default.
arsenyinfo t1_j5m33oc wrote
As a practitioner, I am surprised to see no chapter on finetuning
hiptobecubic t1_j5nfprv wrote
Reply to comment by artsybashev in [D] Couldn't devs of major GPTs have added an invisible but detectable watermark in the models? by scarynut
.. yes?