I am trying to find outliers using the interquartile range (IQR) and am having a few problems. I have read this article to understand how to find outliers, and I understood most of it.
Now I am trying to apply this method in a program, but it does not seem to work well with my data. The data set has more than 4000 rows and can be found at this link.
The minimum value is: 951,723112057644
The maximum value is: 1588,93458298046
Q1 Median is: 1273,39127623714
Q3 Median is: 1273,52543277336
IQR is: 0,13415653622
With the above results, my inner fences are at 1273,19004143281018 and 1273,726667577769002. That means a lot of my data points are candidates to be removed, i.e. they would be called outliers. The data set has a lot of values between 1273 and 1274; I believe more than 50% of them fall in this range.
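The fence arithmetic above can be sketched as follows. This is only a minimal Python sketch, assuming the common 1.5 × IQR rule for the inner fences. The names `iqr_fences` and `flag_outliers` are illustrative, the toy sample is made up, and the quartile method (`inclusive`) is an assumption that may differ from the one used to produce the figures above.

```python
from statistics import quantiles


def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) inner fences for the given quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """Return the values that fall outside the inner fences."""
    # Assumption: quartiles use the 'inclusive' method; other conventions
    # give slightly different Q1/Q3 on small samples.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    lower, upper = iqr_fences(q1, q3, k)
    return [v for v in values if v < lower or v > upper]


# Quartiles reported above, written with a decimal point instead of a comma.
lower, upper = iqr_fences(1273.39127623714, 1273.52543277336)
print(lower, upper)

# Hypothetical toy sample (not the real data set): values clustered near
# 1273.4 to 1273.5 plus two extremes similar to the reported min and max.
sample = [1273.40, 1273.45, 1273.50, 1273.52, 1273.39, 951.72, 1588.93]
print(flag_outliers(sample))
```

With quartiles this close together the fences are very narrow, so any value more than about 0.2 away from Q1 or Q3 is flagged.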
Is IQR a suitable method for finding outliers in this data, or is there another method that I should use instead?