Recent comments in /f/MachineLearning

numpee t1_j4whr9r wrote

Hi u/timdettmers, I had a great time reading your blog post :) I just wanted to point out something that might be worth mentioning: the issue with the 4090 (and probably the 4080 as well) is that they won't fit in servers, specifically 4U rack-mounted servers. In rack-mounted servers, the PCIe slots are placed at the bottom (facing upwards), so the GPUs are placed "vertically" (PCIe connector pointing downwards). The 4090s are too tall for a 4U server, which makes them unusable (plus, taking up 3.5 slots for a single GPU complicates things further).

3

Acceptable-Cress-374 t1_j4wcavd wrote

I tested this with text-davinci-003.

Prompt: Write a post about liking fruity icecream, in the style of a reddit post. Make some minor grammar mistakes while writing the post.

> hey everyone, so I'm totally loving this fruity icecream lately! I got it from a store near my house and its been so good. Its really sweet but not overly sickeningly sweet, like some other ice cream. Plus the fact that its fruity makes it even better in my opinion. Does anyone else like fruity icecream too? It'd be awesome to hear some of your thoughts on it!

This site gave me this:

> Your text is likely human generated!

11

Acceptable-Cress-374 t1_j4w9ssn wrote

From the model's card:

> Direct Use

> The model is a classifier that can be used to detect text generated by GPT-2 models. However, it is strongly suggested not to use it as a ChatGPT detector for the purposes of making grave allegations of academic misconduct against undergraduates and others, as this model might give inaccurate results in the case of ChatGPT-generated input.

Just for lols, I tested this with text-davinci-003.

Prompt: Write a post about liking fruity icecream, in the style of a reddit post. Make some minor grammar mistakes while writing the post.

Response: hey everyone, so I'm totally loving this fruity icecream lately! I got it from a store near my house and its been so good. Its really sweet but not overly sickeningly sweet, like some other ice cream. Plus the fact that its fruity makes it even better in my opinion. Does anyone else like fruity icecream too? It'd be awesome to hear some of your thoughts on it!

The above detector:

> Computation time on Intel Xeon 3rd Gen Scalable cpu: 0.090 s
>
> Real 0.984

22

LetGoAndBeReal t1_j4vz8hv wrote

Companies can fine-tune top-performing LLMs to condition the LLM's output, but not to embody the knowledge contained in proprietary data. The current best approach for incorporating this custom knowledge is through data-augmented generation techniques and technologies such as those LangChain offers.

I am trying to decide whether to invest time building expertise in these techniques and technologies. I may not wish to do so if the ability to properly add custom knowledge to the LLMs themselves will arrive in short order.

I would like to know from those steeped in LLM R&D how soon such capabilities might be expected. Is this the right place to ask?
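For anyone unfamiliar with the term, here is a minimal sketch of what data-augmented generation looks like in practice: instead of retraining the model, you retrieve the most relevant snippets of your own data and paste them into the prompt. This is not LangChain's API. The function names, the word-overlap scoring, and the sample documents are all assumptions made up for illustration, and a real setup would typically use embeddings and a vector store, then send the prompt to an LLM.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, k=2):
    """Return up to k documents that share words with the question,
    ranked by bag-of-words cosine similarity (a toy stand-in for
    embedding search)."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(question, documents, k=2):
    """Assemble a prompt that carries the retrieved context along with
    the question. The prompt wording is hypothetical."""
    context = retrieve(question, documents, k)
    context_block = "\n".join(f"- {c}" for c in context)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical "proprietary" snippets, invented for this example.
DOCS = [
    "The refund window for enterprise customers is 45 days.",
    "Support tickets are triaged within four business hours.",
    "The cafeteria serves lunch from 11:30 to 14:00.",
]

if __name__ == "__main__":
    # The resulting string is what would be sent to the LLM.
    print(build_prompt("How long is the refund window?", DOCS, k=1))
```

The point of the design is that the model's weights never change: the custom knowledge lives in the document store and reaches the model only through the prompt, which is why it works with models you can't retrain.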

1

wildCatInMass t1_j4vv1tj wrote

As a consultant myself, I can say your assessment is reasonably accurate. Oftentimes it starts with a consulting firm giving a beautiful presentation to non-technical execs at a company about how they can turn the company's data into money. Lots of overconfidence and some nicely "massaged" benchmark figures.

If the company hires the consulting firm, the firm staffs the project with people who basically have to figure out how to build what was sold... typically in a short time, with a new team, none of whom understand the company's data landscape or its nuances. You might be thinking that this is a recipe for disaster. You'd be right. It's why job satisfaction for data scientists and ML engineers in consulting is super low. But success stories do exist when the stars align and a team is able to build things like recommendation engines and supply chain optimization models, or conduct a good segmentation analysis, for companies that don't have sufficient talent to execute these internally.

2