Recent comments in /f/dataisbeautiful

aluvus t1_jag1qou wrote

ChatGPT is not a data source, even for data about itself. There is no underlying "thinking machine" that can, say, meaningfully assign numeric scores like this. It is essentially a statistical language machine. It is a very impressive parrot.

There is nothing inside of it that can autonomously reach a conclusion that a topic is too difficult to comment on; in fact, many people have noted that it will generate text that sounds very confident and is very wrong. It does not have a "mental model" by which it can actually be uncertain about claims in the way that a human can.

The first question you asked (100 topics) is perhaps one that it can answer in a meaningful way, but only inasmuch as it reveals things that the programmers programmed it not to talk about. But the others reflect only, at best, how complex people have said a topic is, in the corpus of text that was used to train its language model.

Regarding the plot itself, I would suggest "uncertainty about topic" and "complexity of topic" for the labels, as the single-word labels were difficult for me to make sense of. I would also suggest reordering the labels, since complexity should be the thing that leads to uncertainty (for a human; for ChatGPT they are essentially unrelated).

12

elijahmeeks OP t1_jag0zz8 wrote

I think it reflects the inherent biases in the design, training and implementation of the ML algorithms that drive ChatGPT. Science topics are considered "less uncertain but more complex" because its source material, creators and developers believe that. It also has controls in place to avoid saying controversial things, and topics like ghosts, history, art & religion are all much more likely to involve controversy and therefore be more "uncertain" to ChatGPT when it comes to giving answers.

2

elijahmeeks OP t1_jag0qj8 wrote

Not a stupid question. It's using the "nice" scales functionality in D3. Because the low end doesn't land on a "nice" value, it's hidden (which can be really frustrating sometimes). The chart is generated via a dataviz library I created called Semiotic, which uses D3 under the hood for things like this. You can see the chart interactively on the original notebook that I link to in my comment if you want to play with it.
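
For anyone unfamiliar with "nice" scales, here is a rough Python sketch of the general idea: choose a round step (1, 2, or 5 times a power of ten) and push the domain endpoints out to multiples of it. This is a simplified illustration, not D3's or Semiotic's actual code; the function names `tick_step` and `nice_domain` and the sample values are made up for the example.

```python
import math

def tick_step(start, stop, count=10):
    """Pick a round step: 1, 2, or 5 times a power of ten (hypothetical helper)."""
    raw = (stop - start) / count              # ideal, usually non-round step
    base = 10 ** math.floor(math.log10(raw))  # nearest power of ten below it
    ratio = raw / base
    if ratio >= math.sqrt(50):
        return base * 10
    if ratio >= math.sqrt(10):
        return base * 5
    if ratio >= math.sqrt(2):
        return base * 2
    return base

def nice_domain(start, stop, count=10):
    """Extend [start, stop] outward to the nearest multiples of the round step."""
    step = tick_step(start, stop, count)
    return (math.floor(start / step) * step, math.ceil(stop / step) * step)

# Made-up data range, only to show the rounding behaviour.
print(nice_domain(0.13, 9.7))  # (0.0, 10.0)
```

With the made-up range 0.13 to 9.7, the step comes out as 1, so the domain is widened to 0 through 10. Labels are only drawn at multiples of the step, which is why an endpoint that is not a "nice" value gets no label of its own.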

2

elijahmeeks OP t1_jafwspx wrote

The short answer: These ratings come from ChatGPT.

The long answer: You'd have to read through the full exploration to see the entire picture. Basically, I asked ChatGPT to give me 100 topics too uncertain or complex to discuss without misleading users, and to give me subject areas for them and ratings on complexity and uncertainty. So this plot shows the aggregate complexity and uncertainty values of those 100 topics, grouped by the subject area of each topic. You can see it all in much more detail in the notebook I link to in my comment.

5

elijahmeeks OP t1_jafu0ax wrote

I asked ChatGPT to give me a list of topics that were too complex or uncertain for it to discuss without being misleading, and made a bunch of charts out of the results. This is a nice overview dumbbell plot of the subject areas of the 100 topics that ChatGPT gave me, but if you want to see some treemaps and word clouds you can check out the whole thing here: https://app.noteable.io/f/8e355d65-cd94-4afe-bb81-b9aa24a457dc

I used a Noteable notebook, which is Python+SQL. The data visualization uses their built-in tool DEX, which uses JavaScript with Semiotic and D3 under the hood.

Edited to add: This is a live notebook that you can interact with and explore, and you can export the dataset that I had ChatGPT create for me.

3

Alternative-Sock-444 t1_jaf31x2 wrote

My favorite so far is easily my customer's E39 M5 that I've been restoring for the last year. We had a high-output engine built by Partee Racing mated to a diffsonline race-prepped transmission. We recently got it dialed in on the dyno and the thing is a riot. As far as current gen cars go, the new i4 M50 is an amazing car, as is the new i7. The i4 is just stupid fast and nimble, whereas the i7 is a big luxury boat with more tech squeezed into it than you could imagine. The new M cars are cool, but they don't do a whole lot for me. They feel too numb for what they're supposed to be. At least with the i cars, being electric, I expect them to be more numb and reserved feeling, which they are. But it doesn't take away from the fun of driving them. The only newer M car that I really enjoy is the M2 comp. It's a much more analog feeling car. I'd love to take one around a track one day.

5

Alternative-Sock-444 t1_jaf103u wrote

I daily an E39 540i. Mid 90s to early 00s was peak BMW. Also, as a BMW tech, if I made a chart like this, it would be 90% BMWs, and I'm kind of regretting not keeping track over the years. But it would probably be a couple thousand cars 😅

6