Recent comments in /f/MachineLearning
mrconter1 t1_j4wq1zs wrote
Reply to comment by bo_peng in [P] RWKV 14B Language Model & ChatRWKV : pure RNN (attention-free), scalable and parallelizable like Transformers by bo_peng
How does the memory scale with the context window size?
suflaj t1_j4wndsx wrote
Reply to comment by Acceptable-Cress-374 in [D] Do you know of any model capable of detecting generative model(GPT) generated text ? by CaptainDifferent3116
Using a black box model for this kind of stuff looks like a nice way to get sued
stablebrick t1_j4wl8xn wrote
Equivalent-Way3 t1_j4wjuxe wrote
Reply to [D] Is it possible to update random forest parameters with new data instead of retraining on all data? by monkeysingmonkeynew
XGBoost can do this and you can set its hyperparameters so that it's a random forest
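A minimal sketch of what this could look like, assuming binary classification and the `xgboost` Python package. The helper names `random_forest_params` and `fit_then_update` are made up for illustration, and the specific values (0.8 subsampling, depth 6) are placeholders rather than recommendations. Note that the second call adds a new forest as an extra boosting round on top of the first, which is not the same thing as retraining one random forest on all the data.

```python
def random_forest_params(n_trees=100, seed=0):
    """XGBoost parameters that make a single boosting round act like a random forest."""
    return {
        "booster": "gbtree",
        "objective": "binary:logistic",  # assumption: binary classification
        "num_parallel_tree": n_trees,    # number of trees grown in each round
        "subsample": 0.8,                # row subsampling for each tree
        "colsample_bynode": 0.8,         # column subsampling at each split
        "learning_rate": 1.0,            # no shrinkage, as in a plain forest
        "max_depth": 6,
        "seed": seed,
    }


def fit_then_update(X_old, y_old, X_new, y_new, n_trees=100):
    """Fit a forest on the old data, then add a second forest fitted on new data only."""
    import xgboost as xgb  # imported lazily so the helper above has no dependencies

    params = random_forest_params(n_trees)
    forest = xgb.train(params, xgb.DMatrix(X_old, label=y_old), num_boost_round=1)
    # Passing xgb_model continues training from the existing booster.
    updated = xgb.train(
        params,
        xgb.DMatrix(X_new, label=y_new),
        num_boost_round=1,
        xgb_model=forest,
    )
    return updated
```

Whether the updated model is good enough compared with a full retrain is something to check on held-out data.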
Anjum48 t1_j4wjh0m wrote
Reply to [D] Do you know of any model capable of detecting generative model(GPT) generated text ? by CaptainDifferent3116
I came across this one last week which the author says is a fine-tuned BERT model: https://originality.ai/
notdelet t1_j4wi2gy wrote
Reply to [R] Researchers out there: which are current research directions for tree-based models? by BenXavier
Well, there is the stuff that Rudin is doing with Rashomon Sets/small explainable trees. Then there is the stuff on optimal decision trees using mixed-integer programs. I'm not working in the area at the moment, but those are the things I have heard people talk about recently.
numpee t1_j4whr9r wrote
Hi u/timdettmers, I had a great time reading your blog post :) I just wanted to point out something that might be worth mentioning: the issue with the 4090 (and probably the 4080 as well) is that they won't fit in servers, specifically 4U rack mounted servers. In rack mounted servers, the PCIe slots are placed at the bottom (facing upwards), so the GPUs are placed "vertically" (PCIe pointing downwards). The 4090s are too tall for the 4U server, which makes them unusable (plus, 3.5 slots for a single GPU complicates things further).
CaptainDifferent3116 OP t1_j4wendu wrote
Reply to [D] Do you know of any model capable of detecting generative model(GPT) generated text ? by CaptainDifferent3116
Also, did someone build a recent dataset with ChatGPT examples for this?
CaptainDifferent3116 OP t1_j4wedjz wrote
Reply to comment by Acceptable-Cress-374 in [D] Do you know of any model capable of detecting generative model(GPT) generated text ? by CaptainDifferent3116
I'll try and share in a small article how I'm testing with the dataset's details.
TiredOldCrow t1_j4wdufa wrote
Reply to [D] Do you know of any model capable of detecting generative model(GPT) generated text ? by CaptainDifferent3116
Nothing works consistently, especially if an attacker tests their own outputs against the open source detectors, or makes manual tweaks to the outputs.
junetwentyfirst2020 t1_j4wcwxz wrote
Reply to comment by Acceptable-Cress-374 in [D] Do you know of any model capable of detecting generative model(GPT) generated text ? by CaptainDifferent3116
It’s important to remember that these models are statistically robust. So while you may get a false positive or false negative, it does not reflect on the robustness of the model.
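To make the point about aggregate error rates concrete, here is a small sketch of how false positive and false negative rates could be measured over a labeled sample instead of judging a detector by one example. The labels and scores below are invented for illustration and do not come from any real detector.

```python
def error_rates(labels, scores, threshold=0.5):
    """Return (false_positive_rate, false_negative_rate).

    labels: 1 = machine-generated, 0 = human-written.
    scores: detector's estimated probability that the text is machine-generated.
    """
    fp = fn = tp = tn = 0
    for label, score in zip(labels, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


# Hypothetical labels and scores, made up for illustration only.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.98, 0.91, 0.40, 0.75, 0.10, 0.62, 0.05, 0.30]

fpr, fnr = error_rates(labels, scores)
print(f"false positive rate: {fpr:.2f}, false negative rate: {fnr:.2f}")
```

A single misclassified post moves these rates only slightly, but the rates themselves are what matter when deciding whether a detector is safe to act on.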
Acceptable-Cress-374 t1_j4wcavd wrote
Reply to comment by sfhsrtjn in [D] Do you know of any model capable of detecting generative model(GPT) generated text ? by CaptainDifferent3116
I tested this with text-davinci-003.
Prompt: Write a post about liking fruity icecream, in the style of a reddit post. Make some minor grammar mistakes while writing the post.
> hey everyone, so I'm totally loving this fruity icecream lately! I got it from a store near my house and its been so good. Its really sweet but not overly sickeningly sweet, like some other ice cream. Plus the fact that its fruity makes it even better in my opinion. Does anyone else like fruity icecream too? It'd be awesome to hear some of your thoughts on it!
This site gave me this:
> Your text is likely human generated!
RoboiosMut t1_j4wanh2 wrote
Reply to [D] Do you know of any model capable of detecting generative model(GPT) generated text ? by CaptainDifferent3116
Wondering if you can build a GAN on top of GPT
Acceptable-Cress-374 t1_j4w9ssn wrote
Reply to comment by CaptainDifferent3116 in [D] Do you know of any model capable of detecting generative model(GPT) generated text ? by CaptainDifferent3116
From the model's card:
> Direct Use
> The model is a classifier that can be used to detect text generated by GPT-2 models. However, it is strongly suggested not to use it as a ChatGPT detector for the purposes of making grave allegations of academic misconduct against undergraduates and others, as this model might give inaccurate results in the case of ChatGPT-generated input.
Just for lols, I tested this with text-davinci-003.
Prompt: Write a post about liking fruity icecream, in the style of a reddit post. Make some minor grammar mistakes while writing the post.
Response: hey everyone, so I'm totally loving this fruity icecream lately! I got it from a store near my house and its been so good. Its really sweet but not overly sickeningly sweet, like some other ice cream. Plus the fact that its fruity makes it even better in my opinion. Does anyone else like fruity icecream too? It'd be awesome to hear some of your thoughts on it!
The above detector:
> Computation time on Intel Xeon 3rd Gen Scalable cpu: 0.090 s
>
> Real 0.984
mickman_10 t1_j4w9jyx wrote
Reply to [R] Researchers out there: which are current research directions for tree-based models? by BenXavier
I know that Rich Caruana at Microsoft has been pushing interpretable tree-based models for a little while now, and there’s still probably ongoing research there. For example, this paper and this project.
shellyturnwarm t1_j4w61tf wrote
Reply to [R] Researchers out there: which are current research directions for tree-based models? by BenXavier
Ha! I was about to link that exact paper you mention after seeing the title. Gael and his team are doing great work.
sfhsrtjn t1_j4w5dy0 wrote
Reply to [D] Do you know of any model capable of detecting generative model(GPT) generated text ? by CaptainDifferent3116
Please be aware of this one as well:
>Edward Tian's app at GPTZero.me
Also cannot vouch for this, just trying to be a bit helpful :)
CaptainDifferent3116 OP t1_j4w06rk wrote
Reply to [D] Do you know of any model capable of detecting generative model(GPT) generated text ? by CaptainDifferent3116
The best performing one so far would be: https://huggingface.co/roberta-base-openai-detector
LetGoAndBeReal t1_j4vz8hv wrote
Reply to [D] Simple Questions Thread by AutoModerator
Companies can fine-tune top-performing LLMs to condition the LLM's output, but not to embody the knowledge contained in proprietary data. The current best approach for incorporating this custom knowledge is through data-augmented generation techniques and technologies such as what LangChain offers.
I am trying to decide whether to invest time building an expertise in these techniques and technologies. I may not wish to do so if the ability to add custom knowledge properly in the LLMs will arrive in short order.
I would like to know from those steeped in LLM R&D how soon such capabilities might be expected. Is this the right place to ask?
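For readers unfamiliar with the data-augmented generation idea mentioned above, here is a toy sketch of the general pattern: retrieve the most relevant proprietary snippet, then place it in the prompt sent to the LLM. This is not LangChain's API. The function names, the bag-of-words similarity, and the example documents are all assumptions made up for illustration, and a real system would typically use learned embeddings and a vector store.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, documents, k=1):
    """Assemble a prompt that carries the retrieved context to the LLM."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical proprietary snippets, invented for this example.
docs = [
    "The refund window for enterprise customers is 45 days.",
    "Support tickets are answered within one business day.",
    "The office cafeteria closes at 3 pm on Fridays.",
]

prompt = build_prompt("How long is the refund window?", docs)
print(prompt)
```

The model's weights never change in this setup. The custom knowledge lives entirely in the retrieved text, which is why it works with LLMs that cannot be retrained on proprietary data.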
CaptainDifferent3116 OP t1_j4vz4wd wrote
Reply to comment by sfhsrtjn in [D] Do you know of any model capable of detecting generative model(GPT) generated text ? by CaptainDifferent3116
The first one doesn't seem to work (at least the live test)
The second one is garbage...
sfhsrtjn t1_j4vxu76 wrote
Reply to [D] Do you know of any model capable of detecting generative model(GPT) generated text ? by CaptainDifferent3116
https://huggingface.co/spaces/openai/openai-detector
https://huggingface.co/spaces/Hello-SimpleAI/chatgpt-detector-single
Tried these already? I have not, so I can't speak to their quality
wildCatInMass t1_j4vv1tj wrote
Reply to comment by Zestyclose-Check-751 in [D] I’m a Machine Learning Engineer for FAANG companies. What are some places I can get started doing freelance work for ML? by doctorjuice
As a consultant myself, I can say you're reasonably accurate with your assessment. Oftentimes it works by a consulting firm beautifully presenting to non-technical execs at a company about how they can turn the company's data into money. Lots of over-confidence and some nicely "massaged" benchmark figures.
If said company hires the consulting firm, then the firm staffs the project with people who basically have to figure out how to build what was sold... typically in a short time, with a new team, none of whom understand the company's data landscape and the nuances associated with the company. You might be thinking that this is a recipe for disaster. You'd be right. It's why job satisfaction for data scientists and ML engineers in consulting is super low. But success stories do exist when the stars align and a team is able to build things like recommendation engines or supply chain optimization models, or conduct a good segmentation analysis, for companies that don't have sufficient talent to execute these internally.
gunshoes t1_j4voru9 wrote
Reply to comment by Avelina9X in [D] Has any work been done on VQ-VAE Language Models? by Avelina9X
Trade secret for ML: your problem is always an alteration of a preexisting CV/speech/NLP framework
dojoteef t1_j4vnho4 wrote
Reply to comment by Avelina9X in [D] Has any work been done on VQ-VAE Language Models? by Avelina9X
Note that the authors have an earlier paper introducing discrete latents for NLP and there are a number of follow-up papers to this one as well. So if you're interested in a deep dive, you should investigate the citation graph of this paper. Good luck!