Order Statistics from a Discrete Population

Basic Concepts

In a sample taken from a population, the kth order statistic is the kth smallest element in the sample. Suppose that the population has a discrete distribution with pdf (i.e. probability mass function) f(z) and cdf F(z). First, we describe the cumulative distribution function Fk(x) of the kth order statistic in a sample of size n taken from the population.

We assume that the population is {0, 1, 2, …}.

Cdf and Pdf Properties

Property 1: The cdf of the kth order statistic in the sample is

$$F_k(x) = \sum_{j=k}^{n} \binom{n}{j} F(x)^j (1-F(x))^{n-j}$$

Proof:

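The kth order statistic is at most x exactly when at least k of the n sample elements are at most x. Since each element is independently at most x with probability F(x), the number of such elements has a binomial distribution with parameters n and F(x), and so

$$F_k(x) = P(x_{(k)} \le x) = P(\text{at least } k \text{ of the } n \text{ elements are} \le x)$$

$$= \sum_{j=k}^{n} \binom{n}{j} F(x)^j (1-F(x))^{n-j}$$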

Observation: Property 1 is based on the binomial distribution. The following is an equivalent version of the cdf based on the negative binomial distribution:

$$F_k(x) = F(x)^k \sum_{j=0}^{n-k} \binom{k+j-1}{k-1} (1-F(x))^j$$

We also note that if G(x) is the cdf of the beta distribution Bet(k, n-k+1), then

$$F_k(x) = G(F(x))$$
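The equivalence of the binomial and negative binomial forms can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names are illustrative and are not part of Real Statistics, and the argument p stands for the population cdf value F(x).

```python
from math import comb

def order_cdf_binomial(p, k, n):
    """cdf of the kth order statistic, binomial form, where p = F(x):
    sum over j = k..n of C(n, j) * p^j * (1 - p)^(n - j)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def order_cdf_negbinomial(p, k, n):
    """Same cdf, negative binomial form:
    p^k * sum over j = 0..n-k of C(k + j - 1, k - 1) * (1 - p)^j."""
    return p**k * sum(comb(k + j - 1, k - 1) * (1 - p)**j
                      for j in range(n - k + 1))

if __name__ == "__main__":
    p, k, n = 0.3, 2, 8  # arbitrary illustrative values
    print(order_cdf_binomial(p, k, n))
    print(order_cdf_negbinomial(p, k, n))
```

The two printed values should agree up to floating-point rounding. The beta form is left out of the sketch because the standard library has no regularized incomplete beta function.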

Property 2: The pdf of the kth order statistic in the sample for x ≥ 0 is

$$f_k(x) = F_k(x) - F_k(x-1)$$

where F_k(-1) = 0.

Proof:

$$f_k(x) = P(x_{(k)} = x) = P(x_{(k)} \le x) - P(x_{(k)} \le x-1) = F_k(x) - F_k(x-1)$$

Pdf and Cdf Example

Example 1: Calculate the probability that the 2nd order statistic for a sample of size 8 from the Poisson distribution with mean 5 is less than or equal to 3 (i.e. the cdf at 3). Also, what is the pdf at 3?

Figure 1 shows four ways of calculating this value, namely 66.9%, as shown in cells D11, F12, I2 and I4.

Figure 1 – cdf and pdf for 2nd order statistic

The formula in D11 is =SUM(D3:D9), where cell D3 contains the formula

=COMBIN($B$4,C3)*(1-POISSON.DIST($B$2,$B$3,TRUE))^($B$4-C3)*POISSON.DIST($B$2,$B$3,TRUE)^C3

The formula in cell F12 is =SUM(F2:F8)*POISSON.DIST(B2,B3,TRUE)^B5, where cell F2 contains the formula

=COMBIN($B$5+E2-1,$B$5-1)*(1-POISSON.DIST($B$2,$B$3,TRUE))^E2

Cell I2 contains the formula

=BETA.DIST(POISSON.DIST(B2,B3,TRUE),B5,B4-B5+1,TRUE)

Cell I4 contains the formula =ORDER_DIST(B2,B5,B4,TRUE,"poisson",B3)

Note too that cell I5 contains the formula

=ORDER_DIST(B2-1,B5,B4,TRUE,"poisson",B3)

and so the pdf at 3 is 40.66% as shown in cell I6 using the formula =I4-I5.
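The values in Example 1 can also be reproduced outside Excel. The sketch below uses only the Python standard library and the binomial form of the cdf; the function names are illustrative and this is not the implementation of ORDER_DIST.

```python
from math import comb, exp, factorial

def poisson_cdf(x, mean):
    """Poisson cdf F(x) = P(X <= x); returns 0 for x < 0."""
    if x < 0:
        return 0.0
    return exp(-mean) * sum(mean**i / factorial(i) for i in range(x + 1))

def order_cdf(x, k, n, mean):
    """cdf of the kth order statistic of n Poisson draws (binomial form)."""
    p = poisson_cdf(x, mean)
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def order_pdf(x, k, n, mean):
    """pdf of the kth order statistic: F_k(x) - F_k(x - 1)."""
    return order_cdf(x, k, n, mean) - order_cdf(x - 1, k, n, mean)

if __name__ == "__main__":
    # x = 3, k = 2, n = 8, Poisson mean 5, as in Example 1
    print(f"cdf at 3: {order_cdf(3, 2, 8, 5):.4f}")
    print(f"pdf at 3: {order_pdf(3, 2, 8, 5):.4f}")
```

The printed values should match the 66.9% and 40.66% reported for cells I4 and I6.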

Expected Value

For a discrete distribution, the expected value of the kth order statistic for a sample of size n is

$$E[x_{(k)}] = \sum_{x=0}^{\infty} x f_k(x)$$

The following is an alternative way of expressing the expected value of the kth order statistic for a discrete population.

Property 3: The expected value of the kth order statistic for a sample of size n is

$$E[x_{(k)}] = \sum_{x=0}^{\infty} (1 - F_k(x))$$

Proof: First observe that

$$f_k(x) = P(x_{(k)} = x) = P(x_{(k)} > x-1) - P(x_{(k)} > x)$$

Thus, for any m > 0

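$$\sum_{x=0}^{m} x f_k(x) = \sum_{x=1}^{m} x \left[ P(x_{(k)} > x-1) - P(x_{(k)} > x) \right]$$

$$= \sum_{x=0}^{m-1} (x+1) P(x_{(k)} > x) - \sum_{x=1}^{m} x P(x_{(k)} > x)$$

$$= \sum_{x=0}^{m-1} P(x_{(k)} > x) - m P(x_{(k)} > m)$$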

As m → ∞, mP(x(k) > m) → 0 (provided the expected value is finite), and so

$$E[x_{(k)}] = \sum_{x=0}^{\infty} P(x_{(k)} > x) = \sum_{x=0}^{\infty} (1 - F_k(x))$$

Observation: The expected values of the first and last order statistics are therefore

$$E[x_{(1)}] = \sum_{x=0}^{\infty} (1 - F(x))^n$$

$$E[x_{(n)}] = \sum_{x=0}^{\infty} (1 - F(x)^n)$$

Expected Value Example

Example 2: Find the expected value of the 2nd order statistic for a sample of size 8 from a Poisson distribution with a mean of 5.

Figure 2 shows four ways of calculating the expected value, with the results shown in cells L2, L7, Q19 and T19.

Figure 2 – Expected value of 2nd order statistic

The value of 3.083 (cell L2) is based on a simulation of 1,000 samples using the Real Statistics formula =ORDER_SIM(B5,B4,TRUE,,,"poisson",B3). See Confidence Intervals for Order Statistics, Medians and Percentiles for more information about the ORDER_SIM function. We see that the standard error for this estimate is a little over 1 with a 95% confidence interval of [1, 5].
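A comparable simulation can be sketched in Python with the standard library alone. This is not how ORDER_SIM is implemented; it is a plain Monte Carlo estimate, with the Poisson sampler (Knuth's multiplication method), the function names and the seed all chosen here for illustration.

```python
import random
from math import exp

def poisson_sample(mean, rng):
    """Draw one Poisson variate using Knuth's multiplication method."""
    limit = exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_order_mean(k, n, mean, trials=1000, seed=0):
    """Average of the kth smallest value over `trials` samples of size n."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        sample = sorted(poisson_sample(mean, rng) for _ in range(n))
        total += sample[k - 1]  # kth order statistic (1-based k)
    return total / trials

if __name__ == "__main__":
    # k = 2, n = 8, Poisson mean 5, as in Example 2
    print(simulate_order_mean(2, 8, 5))
```

The estimate varies with the seed and the number of trials, so it will only approximately match the value in cell L2.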

Cell L7 contains the estimate of 3.10495 from the =ORDER_MEAN(B5,B4,,"poisson",B3) formula. See Distribution of Order Statistics from a Continuous Population for more information about the ORDER_MEAN function.

We can see from column Q of Figure 2 that fk(x) is effectively 0 for x > 11, and so we only need to sum from x = 0 to x = 11 to obtain the value shown in cell Q19. Note that cell O3 contains the formula =ORDER_DIST(N3,$B$5,$B$4,TRUE,"poisson",$B$3). Cells P3, Q3 and Q19 contain the formulas =O3-O2, =N3*P3 and =SUM(Q2:Q17), respectively.

Finally, we can obtain the same value for the expected kth order statistic by using Property 3, as shown in cell T19. Note that cell T3 contains the formula

=1-ORDER_DIST(S3,$B$5,$B$4,TRUE,"poisson",$B$3)

and cell T19 contains the formula =SUM(T2:T17).
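The two exact calculations in Example 2 can be cross-checked with a short Python sketch that uses only the standard library. The function names and the truncation point `upper` are assumptions made for illustration; the terms beyond the truncation point are negligible for a Poisson population with mean 5.

```python
from math import comb, exp, factorial

def poisson_cdf(x, mean):
    """Poisson cdf F(x) = P(X <= x); returns 0 for x < 0."""
    if x < 0:
        return 0.0
    total = exp(-mean) * sum(mean**i / factorial(i) for i in range(x + 1))
    return min(1.0, total)  # guard against rounding slightly above 1

def order_cdf(x, k, n, mean):
    """cdf of the kth order statistic of n Poisson draws (binomial form)."""
    p = poisson_cdf(x, mean)
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def mean_from_pdf(k, n, mean, upper=60):
    """Sum of x * f_k(x), with f_k(x) = F_k(x) - F_k(x - 1)."""
    return sum(x * (order_cdf(x, k, n, mean) - order_cdf(x - 1, k, n, mean))
               for x in range(upper + 1))

def mean_from_tail(k, n, mean, upper=60):
    """Sum of 1 - F_k(x), as in Property 3."""
    return sum(1 - order_cdf(x, k, n, mean) for x in range(upper + 1))

if __name__ == "__main__":
    # k = 2, n = 8, Poisson mean 5, as in Example 2
    print(f"sum of x*f_k(x):    {mean_from_pdf(2, 8, 5):.5f}")
    print(f"sum of 1 - F_k(x):  {mean_from_tail(2, 8, 5):.5f}")
```

Both sums should agree with each other and with the value 3.10495 reported for cell L7.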

Property 4: The variance of the kth order statistic for a sample of size n is

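$$\mathrm{Var}(x_{(k)}) = E[x_{(k)}^2] - (E[x_{(k)}])^2 = \sum_{x=0}^{\infty} x^2 f_k(x) - \left( \sum_{x=0}^{\infty} x f_k(x) \right)^2$$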

Real Statistics Support

The ORDER_DIST and ORDER_MEAN functions described in Order Statistics from a Continuous Population also support discrete populations.

Examples Workbook

Click here to download the Excel workbook with the examples described on this webpage.

