So, I was thinking more about the Million-Dollar Murray article. It raises an interesting question.
When things have a power law distribution of some sort, it is not obvious whether it is more useful to think in numbers of people or in percentages of the population. You might think they’re the same; they’re not.
Imagine that you have a population of 100 homeless people. Of these people, 99 stay in a shelter for a single night, and one stays in the shelter for 99 nights.
1% of the people are in the shelter more than once. But! 50% of the person-shelter-nights (99 of 198) are associated with that one recurring visitor, and only the other 50% with all the one-time visitors combined.
So is it more useful to think of the recurring-homelessness in this case as being 1% of the problem, or 50% of the problem? Both are at least potentially useful ways to think about it. I’m not so much arguing that there is a single right answer as that it’s important to remember that both answers exist.
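For anyone who wants to check the arithmetic in the shelter example, here is a minimal Python sketch. The variable names are mine, and the population is just the made-up one described above, not real data.

```python
# Nights stayed by each person in the hypothetical shelter population:
# 99 one-time visitors, plus one person who stays 99 nights.
nights = [1] * 99 + [99]

people = len(nights)
total_nights = sum(nights)

# "Recurring" here means anyone who stayed more than one night.
recurring = [n for n in nights if n > 1]

share_of_people = len(recurring) / people
share_of_nights = sum(recurring) / total_nights

print(f"{share_of_people:.0%} of people, {share_of_nights:.0%} of nights")
# prints "1% of people, 50% of nights"
```

Same population, same definition of "recurring"; the only thing that changes between the two numbers is the denominator.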
The application of this I’m most aware of is in talking about life expectancies for people with illnesses which typically kill either quite quickly or quite slowly. If half the people who get a disease live for 6 months, and half live for 10 years, then a snapshot of living people who have the disease will typically show many more than half who have had it over six months, despite the 50% mortality rate by six months. The long-term survivors stay in the living population twenty times as long, so they pile up in it.
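To put a rough number on "many more than half", here is a back-of-the-envelope sketch in Python. It rests on simplifying assumptions that are mine, not part of the example: new cases arrive at a constant rate, and each patient survives exactly 6 months or exactly 10 years. The function name `snapshot_share_over` is made up for illustration.

```python
from fractions import Fraction

def snapshot_share_over(threshold, groups):
    """Share of currently living patients diagnosed more than `threshold` years ago.

    groups: list of (share_of_new_cases, survival_years) pairs.
    Assumes a constant rate of new cases, so the number of living patients
    from each group is proportional to share * survival_years, and their
    time since diagnosis is spread evenly over [0, survival_years].
    """
    alive = sum(share * years for share, years in groups)
    past_threshold = sum(
        share * max(years - threshold, 0) for share, years in groups
    )
    return past_threshold / alive

# Half of new patients live 6 months, half live 10 years.
groups = [
    (Fraction(1, 2), Fraction(1, 2)),
    (Fraction(1, 2), Fraction(10)),
]
share = snapshot_share_over(Fraction(1, 2), groups)
print(share, float(share))
```

Under those assumptions the share comes out to 19/21, or roughly 90% of living patients, even though half of everyone diagnosed is dead by the six-month mark.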
When using statistics, you always have to stop and ask what exactly it is that you think you’re measuring.