I’m a researcher with an eye (or two) on medical imaging and psychophysics: I’m interested in the ways we can measure the image and video quality of medical imaging devices using subjective and task-based approaches.
AbnormalData posts will mostly be about the tools I love, rants that need to be made public, and some stats I use.
The name AbnormalData came about because most of the subjective human data I collect are not at all normally distributed (well, usually it’s the residuals that are misbehaving). The image on the left is a representation of a Q-Q plot of non-normally distributed data.
If the basic assumptions of the statistical tools used to analyze these data are violated, there may be serious consequences (such as bankrupting the green jelly bean industry because of a flawed analysis). This seems like a useful thing to write about – I mean the statistics part, not the jelly beans.
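For the curious, here is a minimal sketch of what a normal Q-Q plot actually computes, using only the Python standard library. The helper names `qq_points` and `qq_correlation` are made up for this illustration, the data are simulated rather than real observer scores, and the `(i - 0.5) / n` plotting positions are just one common convention. The idea: sort the sample, pair each value with the matching quantile of a standard normal, and see how close the pairs fall to a straight line.

```python
import random
from statistics import NormalDist, mean


def qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    ordered = sorted(sample)
    n = len(ordered)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n stay strictly inside (0, 1),
    # which inv_cdf requires. Other conventions exist.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


def qq_correlation(sample):
    """Pearson correlation of the Q-Q points; close to 1 means a nearly straight line."""
    pairs = qq_points(sample)
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Simulated stand-ins for real data (assumption: 500 points each, fixed seed).
rng = random.Random(0)
normal_like = [rng.gauss(0, 1) for _ in range(500)]
skewed = [rng.expovariate(1.0) for _ in range(500)]

print("normal-ish sample:", round(qq_correlation(normal_like), 3))
print("skewed sample:    ", round(qq_correlation(skewed), 3))
```

The skewed sample should score visibly lower than the normal-ish one, which is the numerical version of the points bending away from the line. This is only a rough diagnostic, not a formal test, and in practice it is usually the residuals of a fitted model that go into it rather than the raw data.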
In general, “abnormal data” may refer to any one of the following:
- Data that are not normally distributed
- Data with crazy outliers
- Data that make you go “Hmmmm, that’s funny…”
- Data that are just corrupt and make you want to cry
But here we’ll focus on the first three points – except for the occasional foray into despair when the fourth point strikes. Which is why you shouldn’t forget to make regular backups of your data.