I’m a researcher at the Image Processing and Interpretation (IPI) group at Ghent University with an eye (or two) on image quality: I’m interested in ways to measure and optimize the image quality of medical imaging devices using subjective and objective methods.
AbnormalData posts will mostly be about the tools I love, rants that need to be made public, and some stats I use.
The name AbnormalData came about because most of the subjective human data I collect are not at all normally distributed (well, usually it’s the residuals that are misbehaving). The image on the left is a representation of a Q-Q plot of non-normally distributed data.
If the basic assumptions of the statistical tools used to analyze these data are violated, there may be serious consequences (such as causing the green jelly bean industry to go bankrupt based on flawed analysis). This seems like a useful thing to write about – I mean the statistics part, not the jelly beans.
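For the curious, here is a minimal sketch of what sits behind a normal Q-Q plot, using only the Python standard library. The function names `qq_points` and `qq_correlation` are made up for this illustration, the plotting positions `(i - 0.5) / n` are just one common convention, and the two samples are synthetic rather than real study data. The idea is to pair each sorted observation with the quantile a standard normal distribution would put at that rank: if the data are roughly normal, the pairs fall close to a straight line.

```python
import math
from statistics import NormalDist, mean


def qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    n = len(sample)
    observed = sorted(sample)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n: an assumed convention, others exist.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))


def qq_correlation(sample):
    """Pearson correlation of the Q-Q pairs; values near 1 mean a straight line."""
    pairs = qq_points(sample)
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Synthetic examples (not real data): an idealised normal sample and a
# right-skewed version of it obtained by exponentiating each value.
normal_like = [NormalDist().inv_cdf((i - 0.5) / 50) for i in range(1, 51)]
skewed = [math.exp(v) for v in normal_like]

print("normal-like:", round(qq_correlation(normal_like), 3))
print("skewed:     ", round(qq_correlation(skewed), 3))
```

The correlation is only a rough summary of how straight the Q-Q line is. It is not a formal normality test, and looking at the plot itself (of the residuals, ideally) usually tells you more.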
In general, “abnormal data” may refer to any one of the following:
- Data that are not normally distributed
- Data with crazy outliers
- Data that make you go “Hmmmm, that’s funny…”
- Data that are just corrupt and make you want to cry
But here we’ll focus on the first three points, except for the occasional foray into despair when the fourth point strikes, which is why you shouldn’t forget to make regular backups of your data.