Recent comments in /f/dataisbeautiful
Accomplished-Owl3330 t1_jag1a4r wrote
Reply to comment by elijahmeeks in [OC] Complexity and Uncertainty of Topics that ChatGPT Claims to be Difficult to Discuss by elijahmeeks
Ahh got it. Thanks for taking the time!
elijahmeeks OP t1_jag0zz8 wrote
Reply to comment by yourmamaman in [OC] Complexity and Uncertainty of Topics that ChatGPT Claims to be Difficult to Discuss by elijahmeeks
I think it reflects the inherent biases in the design, training and implementation of the ML algorithms that drive ChatGPT. Science topics are considered "less uncertain but more complex" because its source material, creators and developers believe that. It also has controls in place to avoid saying controversial things, and topics like ghosts, history, art & religion are all much more likely to be controversial and therefore more "uncertain" to ChatGPT when it comes to giving answers.
elijahmeeks OP t1_jag0qj8 wrote
Reply to comment by Accomplished-Owl3330 in [OC] Complexity and Uncertainty of Topics that ChatGPT Claims to be Difficult to Discuss by elijahmeeks
Not a stupid question. It's using the "nice" functionality in D3 scales. Because the low end doesn't land on a "nice" value, it's hidden (which can be really frustrating sometimes). The chart is generated via a dataviz library I created called Semiotic, which uses D3 under the hood for things like this. You can see the chart interactively on the original notebook that I link to in my comment if you want to play with it.
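For anyone curious how "nice" tick values can leave the low end of an axis unlabeled, here is a rough sketch in Python. It is not D3's or Semiotic's actual code: the function names are made up for illustration, the thresholds follow the commonly described 1/2/5/10 step rule, and the domain 13 to 97 is a hypothetical example, not the chart's real data.

```python
import math

def nice_step(lo, hi, count=10):
    """Pick a step of the form 1, 2, 5 or 10 times a power of ten."""
    raw = (hi - lo) / max(1, count)          # ideal step if any value were allowed
    power = math.floor(math.log10(raw))      # order of magnitude of that step
    error = raw / 10 ** power                # how far raw is above the power of ten
    if error >= math.sqrt(50):
        factor = 10
    elif error >= math.sqrt(10):
        factor = 5
    elif error >= math.sqrt(2):
        factor = 2
    else:
        factor = 1
    return factor * 10.0 ** power

def ticks(lo, hi, count=10):
    """Tick values inside [lo, hi] that are whole multiples of the nice step."""
    step = nice_step(lo, hi, count)
    first = math.ceil(lo / step)
    last = math.floor(hi / step)
    return [i * step for i in range(first, last + 1)]

def nice_domain(lo, hi, count=10):
    """Widen [lo, hi] outward so both ends are multiples of the nice step."""
    step = nice_step(lo, hi, count)
    return math.floor(lo / step) * step, math.ceil(hi / step) * step

# Hypothetical domain: the low end (13) is not a multiple of the step,
# so the first labeled tick is 20 and the axis start has no label.
print(ticks(13, 97))
print(nice_domain(13, 97))
```

In this sketch the raw low end gets no tick of its own, which matches the "hidden" low end described above, while widening the domain gives both ends a labeled value.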
[deleted] t1_jag0jat wrote
Reply to comment by [deleted] in [OC] Complexity and Uncertainty of Topics that ChatGPT Claims to be Difficult to Discuss by elijahmeeks
[removed]
Accomplished-Owl3330 t1_jag08aq wrote
Reply to [OC] Complexity and Uncertainty of Topics that ChatGPT Claims to be Difficult to Discuss by elijahmeeks
Sorry if this comes across as a stupid question, but how are the scales and the intervals determined?
[deleted] t1_jafzqys wrote
yourmamaman t1_jafyyvw wrote
Reply to [OC] Complexity and Uncertainty of Topics that ChatGPT Claims to be Difficult to Discuss by elijahmeeks
What would be your hypothesis for why fields like parapsychology are different from, say, computer science in this context? (The bars are flipped, and the difference is large.)
elijahmeeks OP t1_jafwspx wrote
Reply to comment by Winterstorm8932 in [OC] Complexity and Uncertainty of Topics that ChatGPT Claims to be Difficult to Discuss by elijahmeeks
The short answer: These ratings come from ChatGPT.
The long answer: You'd have to read through the full exploration to see the entire picture. Basically, I asked ChatGPT to give me 100 topics too uncertain or complex to discuss without misleading users, and to give me subject areas for them and ratings on complexity and uncertainty. So this plot shows the aggregate complexity/uncertainty value of those 100 topics by subject area of the topic. You can see it all in much more detail in the notebook I link to in my comment.
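The per-subject aggregation described above could look roughly like this in plain Python. This is only a sketch: the rows are made-up placeholders rather than values from the notebook, the field names are hypothetical, and the mean is an assumption since the comment only says "aggregate".

```python
from collections import defaultdict
from statistics import mean

# Placeholder rows for illustration only; NOT the ratings ChatGPT produced.
rows = [
    {"subject": "Science", "complexity": 8, "uncertainty": 4},
    {"subject": "Science", "complexity": 6, "uncertainty": 5},
    {"subject": "Religion", "complexity": 5, "uncertainty": 9},
]

def aggregate_by_subject(rows):
    """Average complexity and uncertainty per subject area (mean is assumed)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["subject"]].append(row)
    return {
        subject: {
            "complexity": mean(r["complexity"] for r in items),
            "uncertainty": mean(r["uncertainty"] for r in items),
        }
        for subject, items in grouped.items()
    }

summary = aggregate_by_subject(rows)
for subject, scores in summary.items():
    print(subject, scores["complexity"], scores["uncertainty"])
```

Each subject then contributes one pair of values, which is what the two ends of a dumbbell in the plot would encode.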
Winterstorm8932 t1_jafwc1e wrote
Reply to [OC] Complexity and Uncertainty of Topics that ChatGPT Claims to be Difficult to Discuss by elijahmeeks
How is complexity judged, especially considering how so many of these subjects overlap?
[deleted] t1_jafw4k2 wrote
Reply to comment by harrrrrrjinderrrrr in [OC] Complexity and Uncertainty of Topics that ChatGPT Claims to be Difficult to Discuss by elijahmeeks
[removed]
harrrrrrjinderrrrr t1_jafvsgn wrote
Reply to [OC] Complexity and Uncertainty of Topics that ChatGPT Claims to be Difficult to Discuss by elijahmeeks
Interesting! What was the data source you used?
[deleted] t1_jafvqv9 wrote
elijahmeeks OP t1_jafu4ac wrote
Reply to [OC] Complexity and Uncertainty of Topics that ChatGPT Claims to be Difficult to Discuss by elijahmeeks
Dumbbell Plot or Barbell Plot? Will we ever figure out this, the most critical question in modern data visualization?
elijahmeeks OP t1_jafu0ax wrote
Reply to [OC] Complexity and Uncertainty of Topics that ChatGPT Claims to be Difficult to Discuss by elijahmeeks
I asked ChatGPT to give me a list of topics that were too complex or uncertain for it to discuss without being misleading and made a bunch of charts out of the results. This is a nice overview Dumbbell plot of the subject areas of the 100 topics that ChatGPT gave me but if you want to see some treemaps and word clouds you can check out the whole thing here: https://app.noteable.io/f/8e355d65-cd94-4afe-bb81-b9aa24a457dc
I used a Noteable notebook, which is Python+SQL. The data visualization uses their built-in tool DEX, which uses JavaScript with Semiotic and D3 under the hood.
Edited to add: This is a live notebook that you can interact with, explore, and export the dataset that I had ChatGPT create for me.
GulfKiwi t1_jaf4i4f wrote
Reply to [OC] How well-liked are the most famous actors according to the British public? by down_vote_magnet
I feel like Jeff Bridges' location explains a lot about why things are difficult in Britain right now.
Norwester77 t1_jaf4fu9 wrote
Reply to comment by Habalaa in [OC] The Evolution of the European-Indian-Iranian language family by Pluto_and_Charon
The exact position of Germanic is controversial and really hard to pin down.
cox_ph t1_jaf433z wrote
Reply to comment by Rob1150 in [OC] How well-liked are the most famous actors according to the British public? by down_vote_magnet
She made Natasha Richardson move to California and made McFly move to New York.
OnceiwasaDJ t1_jaf3yrg wrote
Reply to [OC] How well-liked are the most famous actors according to the British public? by down_vote_magnet
Seeing Jack Black on the bottom-left corner is a bit sad
pierced_mirror t1_jaf33ar wrote
Reply to comment by NoWayNotThisAgain in [OC] New Mexico Now Produces More Oil Than Mexico & Venezuela by latinometrics
And the capital of the Aztec empire was in the Valley of Mexico. No other STATE called itself Mexico. We are talking about political creations, not the names of Valleys. What does it feel like to not even know how to think clearly and come to logical conclusions, clown?
Alternative-Sock-444 t1_jaf31x2 wrote
Reply to comment by crlogic in [OC] Every car I’ve ever driven, their country of origin, make model and transmission type. My spreadsheet goes into even more detail by crlogic
My favorite so far is easily my customer's E39 M5 that I've been restoring for the last year. We had a high output engine built by Partee Racing mated to a diffsonline race prepped transmission. We recently got it dialed in on the Dyno and the thing is a riot. As far as current gen cars go, the new i4 M50 is an amazing car, as is the new i7. The i4 is just stupid fast and nimble, whereas the i7 is a big luxury boat with more tech squeezed into it than you could imagine. The new M cars are cool, but they don't do a whole lot for me. They feel too numb for what they're supposed to be. At least with the i cars, being electric, I expect them to be more numb and reserved feeling, which they are. But it doesn't take away from the fun of driving them. The only newer M car that I really enjoy is the M2 comp. It's a much more analog feeling car. I'd love to take one around a track one day.
[deleted] t1_jaf2l0o wrote
Reply to comment by Rob1150 in [OC] How well-liked are the most famous actors according to the British public? by down_vote_magnet
[deleted]
crlogic OP t1_jaf276y wrote
Reply to comment by Alternative-Sock-444 in [OC] Every car I’ve ever driven, their country of origin, make model and transmission type. My spreadsheet goes into even more detail by crlogic
Totally agree! The other BMWs in my list are an E83 X3 3.0si, F10 535i XDrive and an E90 328i.
Yours would have been awesome. What would be your favourite?
Alternative-Sock-444 t1_jaf103u wrote
Reply to comment by crlogic in [OC] Every car I’ve ever driven, their country of origin, make model and transmission type. My spreadsheet goes into even more detail by crlogic
I daily an E39 540i. Mid 90s to early 00s was peak BMW. Also, as a BMW tech, if I made a chart like this, it would be 90% BMWs, and I'm kind of regretting not keeping track over the years. But it would probably be a couple thousand cars 😅
Pluto_and_Charon OP t1_jaf0xou wrote
Reply to comment by makingthematrix in [OC] The Evolution of the European-Indian-Iranian language family by Pluto_and_Charon
that is not what this statistical analysis recovered
the paleo-balkan branch is thought to be one of the oldest and most archaic branches of indo-european
aluvus t1_jag1qou wrote
Reply to comment by elijahmeeks in [OC] Complexity and Uncertainty of Topics that ChatGPT Claims to be Difficult to Discuss by elijahmeeks
ChatGPT is not a data source, even for data about itself. There is no underlying "thinking machine" that can, say, meaningfully assign numeric scores like this. It is essentially a statistical language machine. It is a very impressive parrot.
There is nothing inside of it that can autonomously reach a conclusion that a topic is too difficult to comment on; in fact, many people have noted that it will generate text that sounds very confident and is very wrong. It does not have a "mental model" by which it can actually be uncertain about claims in the way that a human can.
The first question you asked (100 topics) is perhaps one that it can answer in a meaningful way, but only inasmuch as it reveals things that the programmers programmed it not to talk about. But the others reflect only, at best, how complex people have said a topic is, in the corpus of text that was used to train its language model.
Regarding the plot itself, I would suggest "uncertainty about topic" and "complexity of topic" for the labels, as the single-word labels were difficult for me to make sense of. I would also suggest reordering the labels, since complexity should be the thing that leads to uncertainty (for a human; for ChatGPT they are essentially unrelated).