Reformatting Capabilities

The Real Statistics Resource Pack provides a variety of capabilities used for reformatting data.

These include the sorting and duplicate removal functions QSORT, SortUnique, ExtractUnique, and NODUPES (see Sorting and Removing Duplicates). They also include the functions DELBLANK and DELNonNum (see Dealing with Missing Data) as well as SHUFFLE and RANDOMIZE (see Sampling using Real Statistics Capabilities), and their dynamic array versions DELROWS, SHUFFLES, and RANDOMIZES.

Worksheet Functions

Real Statistics Functions: The following array functions contained in the Real Statistics Resource Pack can be used to reformat data, or to change the shape of an array or the order of elements in an array.

RESHAPE(R1, filler, nrows, ncols) – fills the highlighted range with the data in range R1 (in column order)

REVERSE(R1, filler) – fills the highlighted range with the data in range R1 in reverse order (by columns).

DELROW(R1, k) – returns an array identical to R1 except that the kth row has been removed

MERGE(R1, R2): outputs an m × n1+n2 range which contains the values in the m2 × n2 range R2 adjoined to the right of the  m1 × n1 range R1, where m = max(m1, m2) and any missing values are filled with empty cells

The string filler is used as a filler in case the output range has more cells than R1. This argument is optional and defaults to the error value #N/A.

Example

Example 1: Reformat the data in range B5:D8 of Figure 1, as illustrated in the rest of the figure.

Reformatting data examples

Figure 1 – Reformatting examples

This is accomplished by using the following array formulas:

=RESHAPE(B5:D8) in range F5:F16

=REVERSE(B5:D8) in range H5:J8

=DELNonNum(B5:D8,”nil”) in range L5:M9

=QSORT(B5:D8) in range O5:O16

=NODUPES(B5:D8) in range Q5:Q16

=NODUPES(O5:O16,,FALSE) in range S5:S16

In addition, there are the following non-array formulas:

=MDETERM(RESHAPE(F5:F8,,2,2)) in cell J12

=COUNTU(B5:D8) in cell J14

=COUNTUA(B5:D8) in cell J15

Other versions of RESHAPE and REVERSE

There is also a version of the RESHAPE function that takes the form

RESHAPE(R1, filler, nrows, ncols): outputs an nrows × ncols array with the data in range R1

This version of the function can be called from within another function. E.g. suppose that you have a column of data in the range A1:A9, you could calculate the determinant of a 3 × 3 version of this data by using the array formula =MDETERM(RESHAPE(A1:A9,,3,3)).

Similarly, the REVERSE function can also take the following form:

REVERSE(R1, filler, nrows, ncols): outputs an nrows × ncols array with the data in range R1 in reverse order.

If nrows < 0 then the output is an array of the same size and shape as R1; in this case, ncols is not used and can be omitted; otherwise ncols defaults to 1.

Dynamic array versions

The Real Statistics Resource Pack also provides versions of these functions that are especially useful when using dynamic arrays:

RESHAPES(R1, nrows, ncols, bycol): returns an nrows × ncols array with the elements from R1. If the nrows (ncols) argument is missing or 0, then nrows (ncols) is set to the smallest value such that all the elements in R1 are output. If nrows and ncols are both missing or 0, then a column array with all the elements in R1 is returned. If bycol = TRUE (default), then elements are returned in column order; otherwise, they are in row order. If R1 doesn’t contain a sufficient number of elements then #N/A is used as a filler.

REVERSES(R1): outputs an array of the same size and shape as R1 with the data in range R1 in reverse order.

Other Worksheet Functions

The Real Statistics Resource Pack also supports the following array functions:

MCONST(s, nrows, ncols): outputs an nrows × ncols array filled with the value s

SEQ(nrows, ncols, start, incr): outputs an nrows × ncols array containing the sequence of values start, start+incr, start+2*incr, start+3*incr… (in row order)

Here, start and incr default to 1. If nrows and/or ncols is omitted then the dimensions of the highlighted range are used.

For example, the array formula =SEQ(2,3,5,4) outputs an array with 2 rows and 3 columns; the first row contains the values 5, 9, 13 and the second row contains the values 17, 21, 25. The output would be the same if you placed the array formula =SEQ(,,5,4) in range A1:C2.

SUBMATRIX(R1, row1, col1, nrows, ncols): outputs an nrows × ncols subarray of the array R1 (nrows defaults to m – row1 + 1 and ncols defaults to n – col1 + 1) starting at the cell in the row1th row and col1th column of the m × n range R1.

Here, row1 and col1 default to 1.

Summary Table

Real Statistics Functions: The Real Statistics Resource Pack contains the following array function that creates a summary table.

COMPACT(R1, head): returns an m × n array which is a summary table for the data in the h × k array R1. The first k-1 columns of the output consist of the unique values in the first k-1 values of R1. The remaining columns consist of the counts of rows in R1 that match the non-negative integer in the kth column of R1.

If the minimum value in the kth column of R1 is mn and the maximum value is mx, then the output contains an additional mx-mn+1 columns.

If head = TRUE (default FALSE) then the first row of R1 is assumed to contain column headings, in which case the elements in the first k-1 columns of the first row of R1 are retained as headings in the output and the headings for the remaining columns of the output are mnmn+1, …, mx. If mx-mn > 24, then an error is generated.

Note that MLogitSummary(R1, head) and OLogitSummary(R1, head) are equivalent to COMPACT(R1, head). See Multinomial Regression Support and Ordinal Regression Support.

Binary version

The Real Statistics Resource Pack also provides a binary-choice version of COMPACT, defined as follows.

BCOMPACT(R1, head): returns an m × (k+1) array which is a summary table for the data in the h × k array R1 whose kth column can only contain the values 0 or 1. The first k-1 columns of the output consist of the unique values in the first k-1 values of R1. The kth column consists of the counts of the matching rows from R1 that contain 1 in the kth column and the k+1th column consists of the counts of the matching rows from R1 that contain 0 in the kth column.

If head = TRUE (default FALSE) then the first row of R1 is assumed to contain column headings, in which case the elements in the first k-1 columns of the first row of R1 are retained as headings in the output, and the kth and k+1th columns of the first row contain the values “Success” and “Failure”.

Note that LogitSummary(R1, head) is identical to BCOMPACT(R1, head). See Logistic Regression Support.

Data Analysis Tool

Real Statistics Data Analysis Tool: The Real Statistics Resource Pack contains the Reformatting a Data Range data analysis tool. This tool provides a variety of reformatting capabilities.

Example 2: Copy the data from range B5:D8 in Figure 1 into a 6 × 2 range removing any empty cells by using the Reformatting a Data Range data analysis tool.

Press Ctrl-m and choose the Reformatting Data Range option (from the Desc tab if using the multipage user interface). A dialog box will appear as shown in Figure 2. Fill in the dialog box as indicated and click on the OK button. The output will appear in range L5:M10, as shown in Figure 1.

Dialog box

Figure 2 – Dialog box for Reformatting a Data Range

You don’t need to fill in the # of Rows field since the data analysis tool will automatically set its value to 6 (since 4 × 3 = 12 = 6 × 2).

4 thoughts on “Reformatting Capabilities”

  1. Hi!
    So I came across your wonderful stat toolpak and I am eager to put it to good use. I have one question, I have noticed for analyzing, their is only one option which puts labels sorted as columns, however due to using my data in other programs for graphing and such, all of my data is organized as groups in rows. I was wondering if there is an easy way to use your tool pack without reformatting all of my data. I have about 6 sheets, each with 3-5 groups and 3 time-points. Thank you in advance!

    Annette

    Reply
    • Annette,
      I can’t think of a way without reformatting your data. This can be done by using Excel’s TRANSPOSE function or by copying and pasting using the Transpose option.
      Charles

      Reply
  2. After opening the sample files and updating to link to Real Stats there are many values which are now #NUM or #NAME and when looking at the formula it is changed to include path to real stats as in:
    =’C:\Users\xx\AppData\Roaming\Microsoft\AddIns\RealStats.xlam’!COUNTU(A5:C8)
    If I remove the path part I get message that there is some error in formula. How can I fix these. The above example came from Table Reformat 1 and except for original values all values in table are #NAME. Suggestions? Thank you

    Reply

Leave a Comment