    Percentile of a dataset having a lot of similar values: RFM Analysis

    Jay Parikh

      I'm trying to do RFM analysis, and I'm confused about how to divide the frequency values into percentile groups.

      I have a dataset of about 50,000 rows.

      For frequency, I have:

      54% of customers have 1 order

      20% of customers have 2 orders

      and the rest are spread across values from 3 through 54 orders.


      If I split the data as-is into groups of 20% each, customers with 1 order end up with 3 different weights.
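
      To make the problem concrete, here is a minimal Python sketch. The data is a hypothetical 100-customer sample that only mimics the proportions above (54% with 1 order, 20% with 2, the rest between 3 and 54), and `rank_quintiles` is an illustrative helper, not a library function. It splits the sorted rows into five equal-sized groups without looking at ties.

```python
from collections import defaultdict


def rank_quintiles(values):
    """Score rows 1..5 by position in sorted order, ignoring ties."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0] * n
    for rank, idx in enumerate(order):
        # The first 20% of rows get score 1, the next 20% score 2, and so on.
        scores[idx] = rank * 5 // n + 1
    return scores


# Hypothetical toy data (assumption): 100 customers, not the real 50,000 rows.
frequency = [1] * 54 + [2] * 20 + [3] * 10 + [4] * 6 + [5] * 4 + [8] * 3 + [20] * 2 + [54]

scores = rank_quintiles(frequency)

# Collect every score that each order count received.
scores_by_value = defaultdict(set)
for orders, score in zip(frequency, scores):
    scores_by_value[orders].add(score)

for orders in sorted(scores_by_value):
    print(orders, sorted(scores_by_value[orders]))
```

      On this toy sample, customers with 1 order fall into three different groups, which is exactly the behavior described above: the cut points land in the middle of a run of identical values.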

      Should I just put all 54% of the one-time buyers into the first percentile group, i.e. give them all a weight of 1? Would this be a fair way to do it?

      Or should I put the 1's and 2's together into the first percentile group, i.e. give them both a weight of 1?
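
      The two options can be written as explicit cut-offs on the order count, so that every customer with the same number of orders gets the same weight. The sketch below is only an illustration: the toy data is a hypothetical 100-customer sample matching the proportions above, and the upper bounds for weights 2 through 5 are made-up placeholders, not recommended values.

```python
from bisect import bisect_left
from collections import Counter


def score_by_cutoffs(orders, upper_bounds):
    """Return a 1-based weight: the first bin whose upper bound is >= orders."""
    return bisect_left(upper_bounds, orders) + 1


# Hypothetical toy data (assumption): 54% with 1 order, 20% with 2, rest 3..54.
frequency = [1] * 54 + [2] * 20 + [3] * 10 + [4] * 6 + [5] * 4 + [8] * 3 + [20] * 2 + [54]

# Option A: only one-time buyers get weight 1 (later bounds are placeholders).
option_a = [1, 2, 4, 8]
# Option B: customers with 1 or 2 orders get weight 1 (later bounds are placeholders).
option_b = [2, 4, 8, 15]

share_a = Counter(score_by_cutoffs(f, option_a) for f in frequency)
share_b = Counter(score_by_cutoffs(f, option_b) for f in frequency)

# With 100 customers, each count is also a percentage.
print("Option A:", dict(sorted(share_a.items())))
print("Option B:", dict(sorted(share_b.items())))
```

      On this sample, weight 1 covers 54% of customers under the first option and 74% under the second, so the trade-off is between keeping one-time and two-time buyers distinguishable and leaving fewer customers for the remaining weights.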