Sorting and Removing Duplicates

Excel provides standard capabilities for sorting and eliminating duplicates. These can be accessed by Data > Sort & Filter|Sort and Data > Data Tools|Remove Duplicates. See Sorting and Filtering for more details.

Sorting functions have only been provided in the most recent releases of Excel (see Sorting and Filtering Functions). For earlier versions of Excel, sorting using Excel functions is possible using a little ingenuity, as described in Sorting using Excel Formulas.

The Real Statistics Resource Pack provides a number of sorting functions available for all releases of Excel. We describe these capabilities on this webpage. These capabilities can also be accessed via Real Statistics data analysis tools.

Worksheet Functions

Real Statistics Functions: The following array functions contained in the Real Statistics Resource Pack can be helpful in sorting data and removing duplicates.

QSORT(R1, ascend, nrows, ncols) – returns an nrows ⨯ ncols array with the data in range R1 in sorted order (arranged by columns) ); ascend is an optional parameter (default = TRUE); if ascend is TRUE then the sort is in ascending order and if ascend is FALSE (or 0) the sort is in descending order

SortUnique(R1, filler) – fills the highlighted range with the data in range R1 in ascending sorted order eliminating any duplicates (arranged by columns); the highlighted range should only have one column

ExtractUnique(R1, filler) – fills the highlighted range with the data in range R1 in the order found in R1 (arranged by columns) eliminating any duplicates; the highlighted range should only have one column

NODUPES(R1, filler, b) – fills the highlighted range with the data in range R1 eliminating any duplicates (arranged by columns); if b = TRUE then the data in R1 is sorted first, and if b = FALSE then it is assumed that range R1 is already in sorted order.

The string filler is used as a filler in case the output range has more cells than R1. This second argument is optional and defaults to the error value #N/A.

For QSORT, if nrows = 0 or is omitted then the output array is the highlighted range on the active worksheet. If nrows < 0 then the output is an array of the same size and shape as R1; in both these cases, ncols  is not used and can be omitted; otherwise ncols defaults to 1.

For the above functions, data in a range is assumed to be ordered first by column and then by row. E.g. for the range A1:B2, the presumed order is A1, A2, B1, B2.

Examples

The following two figures provide examples of the use of the above worksheet functions. Each example displays the use of an array formula in the range within the indicated rectangular border.

Numerical sorting examples

Figure 1 – Numeric examples

E.g. the range E5:F10 contains the array formula =QSORT(A5:C8). Note that range M12:N15 is not large enough to hold all 9 unique values and so 15 is omitted.

Alphanumeric sorting examples

Figure 2 – Alphanumeric examples

Dynamic Real Statistics Functions

There are also the following functions that are useful with dynamic arrays:

SortsUnique(R1) – outputs a column array with the data in range R1 in ascending sorted order eliminating any duplicates

ExtractsUnique(R1) – outputs a column array with the data in range R1 in the order found in R1 eliminating any duplicates

Example

Figure 3 provides examples of the use of the above two functions, plus QSORT when the row and column lengths are specified. For these examples, Excel 365 uses can enter each formula in the upper left corner of the highlighted rectangle and press the Enter key.

Dynamic array sorting examples

Figure 3 – Dynamic array formulas

Data Analysis Tool

Real Statistics Data Analysis Tool: The Real Statistics Resource Pack contains the Reformatting a Data Range data analysis tool. This tool provides a variety of reformatting capabilities (including those provided by QSORT and SortsUnique).

See Example 5 from Reformatting Capabilities for more details.

Count Worksheet Functions

Real Statistics Functions: The Real Statistics Resource Pack also provides the following non-array worksheet functions related to the  ExtractsUnique function.

COUNTU(R1) = count of the number of unique numeric elements in R1

COUNTAU(R1) = count of the number of unique non-blank elements in R1

R1 is an array or cell range that doesn’t contain any error values.

For example, referring to Figure 1, the formula =COUNTU(B5:D8) returns the value 9. Similarly, referring to Figure 2, the formula =COUNTAU(A3:B12) returns the value 6.

Examples Workbook

Click here to download the Excel workbook with the examples described on this webpage.

References

Mellor, L. (2020) Sort and filter using dynamic array functions in Microsoft Excel
https://www.laramellortraining.co.uk/how-to-sort-and-filter-using-dynamic-array-functions-in-microsoft-excel

Stack Overflow (2021) Sortby function combined with Unique combined with filter functions
https://stackoverflow.com/questions/68246608/sortby-function-combined-with-unique-and-filter-functions-how-to-use-it-correcl

12 thoughts on “Sorting and Removing Duplicates”

  1. Thanks, this was useful for my work. However…

    You’ve lost data! In your numeric example your sorted-without-duplicates data no longer contains a 2. This is because your method is not robust for the final row.

    The same thing happened in my work and because I’m in a hurry I did an inelegant work around but perhaps you could fix elegantly for others?

    Reply
    • Ed,
      Thank you very much for finding this mistake. I have just corrected the error. The instructions now say that you should put the formula =COUNTIF(I7:I$16,”=”&I6) in cell I6.
      I really appreciate your diligence and help in making the website better.
      Charles

      Reply
    • Mitchell,

      I used the following formulas:

      cell O6 =IF(COUNTIF(N7:N$15,”=”&N6)=0,N6,””)
      cell P6 =SMALL(O$6:O$15,ROW(N6)-ROW(N$6)+1)
      cell R6 =IF(ISNUMBER(P6),P6,””)

      An alternative form for cell R6 is

      cell R6 =IFERROR(SMALL(O$6:O$15,ROW(N6)-ROW(N$6)+1),””)

      This approach only requires the original data (in column N) and column O.

      Charles

      Reply
  2. Hi Sir, Can we sort out data when column B in your example 1 is filled with with formula,

    Like B1 = COUNTIF(X$1:X$100,X1) and B is Spread to B100

    It is showing #ref

    Reply
    • Leela,

      I start with the following data in column A (starting in cell A1). I then insert the formula =COUNTIF($A$1:$A$7,A1) in cell B1. Next I copy this formula down (i.e. I highlight range B1:B7 and then press Ctrl-D). Finally, I highlight the range C1:C7, insert the array formula =QSORT(B1:B7) and press Ctrl-Shft-Enter. The result is that the values in column B are sorted.

      A B C
      34 2 1
      12 3 1
      12 3 2
      25 1 2
      34 2 3
      15 1 3
      12 3 3

      If I can do this with 7 rows, you should be able to do it with 100 rows.

      Charles

      Reply
  3. Figure 5 sample formulas…
    D36 — return only 0’s
    E36–return only 0’s
    F36– error
    G36 — error

    Can you please reply with full set of formula from A36 to G36 for sorting alphanumeric values… or send me excel sample file.. I will grateful to you.

    thanks
    Muza

    Reply
  4. Sir

    It seems there are something wrong with functions: SortUnique(R1, s); ExtractUnique(R1, s); NODUPES(R1, s, b). When the last number of R1 is a duplicate, e.g. R1=(4,5,2,9,2,-1,4,7,0,2), these three functions treat “2” as a unique number.

    Colin

    Reply
    • Colin,
      I have tried using SortUnique and NODUPES with the data range you have listed and the functions do what they are supposed to do (i.e. 2 is not repeated in the output). Please clarify or send me a worksheet with the example.
      Charles

      Reply

Leave a Comment