Recent comments in /f/dataisbeautiful

VikThorior t1_jazm9f9 wrote

When you fit data, you must have a model in mind. You don't just take something that seems to fit well. Otherwise, a 547th degree polynomial will do the job, but it's really not useful.

Here, your fit seems to suggest that, when the wind is strong, fossil fuel usage increases again. What is the model, the hypothesis, which would explain that?

Also, have you checked if every coefficient of the model is statistically significant? I'd guess that the 3rd isn't.

My guess for the best fit would be something resembling a logistic function: when the wind tends to infinity, fossil fuels tend to 0. In your model, fossil fuels would tend to infinity, which is... unlikely.
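To make the limiting-behaviour point concrete, here is a minimal sketch comparing a cubic with a decreasing logistic curve. Every coefficient and both function names (`cubic`, `logistic_decay`) are made up for illustration; none of them come from the OP's data or fit.

```python
import math

# Hypothetical coefficients, chosen only so that the cubic dips and then
# rises again, the shape the fitted curve in the post appears to have.
def cubic(x, c0=50.0, c1=-3.0, c2=0.05, c3=0.0005):
    return c0 + c1 * x + c2 * x**2 + c3 * x**3

# Decreasing logistic: close to `top` when x is small, tends to 0 as x grows.
# `top`, `k` and `x0` are illustrative assumptions, not fitted values.
def logistic_decay(x, top=50.0, k=0.2, x0=20.0):
    z = k * (x - x0)
    if z >= 0:
        # Rewritten with exp(-z) so that a large x cannot overflow.
        e = math.exp(-z)
        return top * e / (1.0 + e)
    return top / (1.0 + math.exp(z))

for wind in (0, 20, 50, 100, 1000):
    print(wind, round(cubic(wind), 3), round(logistic_decay(wind), 6))
```

With these made-up numbers the cubic bottoms out and then climbs without bound as wind increases, while the logistic keeps shrinking towards zero, which is the behaviour you would expect from fossil fuel usage.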

If you don't want to come up with a model, you have options: a moving average or a local regression like LOESS, which has the advantage of giving a confidence interval.
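A model-free smoother can be very short. The sketch below is a plain centred moving average over points sorted by x; it is not LOESS and gives no confidence interval. The function name `moving_average` and the `window` parameter are illustrative assumptions.

```python
def moving_average(xs, ys, window=5):
    """Centred moving average of ys after sorting the points by x.

    The window shrinks near the two ends, so edge values are averaged
    over fewer points than interior ones.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    pairs = sorted(zip(xs, ys))
    half = window // 2
    smoothed = []
    for i in range(len(pairs)):
        lo = max(0, i - half)
        hi = min(len(pairs), i + half + 1)
        chunk = [y for _, y in pairs[lo:hi]]
        smoothed.append((pairs[i][0], sum(chunk) / len(chunk)))
    return smoothed
```

For example, `moving_average([1, 2, 3], [3, 6, 9], window=3)` averages each point with its immediate neighbours. A larger window gives a smoother curve but blurs real structure.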

Conclusion: regressions need to mean something. They must not be chosen without a model, even just a hypothetical one, in mind.

8

sisiredd t1_jazgb7d wrote

So they say that when the wind blows, they can generate up to 63 GW from wind power (that's what capacity means). What's wrong with that? Of course they have to rely on other energy sources when there's no wind.

2

okwaitno t1_jazc1oe wrote

Sorry, but I prefer not to share that, as it will then be clear which pages I am referring to. And my level of knowledge is not visible to a mod anyway. They make decisions on their own terms regardless. I’m sure you are aware of these issues; it’s hardly a new thing. Wikipedia ceased to be easy to edit long ago.

1

asyrin25 t1_jaz2rx0 wrote

I agree with the post you're responding to.

A misleading infographic that explains that it's misleading is still misleading.

Putting these events together in a visualization is suggesting to your consumer that they're comparable, even if you point out why they're not in four different places.

A zoomed-in line graph that grossly exaggerates changes on the Y axis is still misleading even if the Y axis is clearly labeled. Even if the title of the graph is "Zoomed-in Line Graph".

5

kompootor t1_jayy21o wrote

The title and thesis of the infographic are, to me, clear: that the number of annual automotive traffic deaths exceeds the largest number of deaths ever from a single disaster event in each category.

Though perhaps, now that the issue is raised, it would be more poignant to take something like the worst year of deaths for each category, instead of the worst single event; the only one on the list I'd expect to get markedly worse from this amendment would be flooding, but it would pre-empt this possible objection. You could, if you like, denote the difference between the worst single event and other events that year with a slightly different color shade in the same box area.

50

databeautifier OP t1_jayxv1u wrote

Per the source, it actually was the failure of Banqiao Dam in 1975 and includes deaths due to subsequent diseases and hunger in addition to the initial flood. I can see the perspective that it should be categorized as a flood since it did include one, but the cause was a structural collapse and I put it in the visualization as one because the source (Wikipedia) categorized it that way.

13

Jackdaw99 t1_jaysd0c wrote

No, of course it's not. But -- of course -- I knew the backgrounds and paths of those classmates who were my friends, as well as friends from high school who had much the same sort of choices. Nevertheless, I'm not claiming this is dispositive: all I said was that the OP's contention wasn't my experience. If anything, his evidence, with an apparent sample size of 0, is even thinner than mine.

6