ARRAY SORT statement

Purpose

Sort all or part of a given array.

Syntax

Numeric array:

ARRAY SORT darray([index]) [FOR count] [,TAGARRAY tarray()] [,{ASCEND | DESCEND}]

String array:

ARRAY SORT darray([index]) [FOR count] [,FROM start TO end] [,COLLATE {UCASE | cstring}] [,TAGARRAY tarray()] [,{ASCEND | DESCEND}]

Custom sort array:

ARRAY SORT darray([index]) [FOR count] [,TAGARRAY tarray()] ,CALL custfunc()

Remarks

ARRAY SORT sorts all or part of darray, an n-dimensional array, in ascending or descending order.  tarray is a tag-along array whose elements are swapped in the same order as those in darray as the sort proceeds (you could sort an array of names and have an array of corresponding addresses tag along, for example).  tarray must have at least as many elements as darray, since corresponding elements of tarray will be swapped during the sort.

Note that tarray does not have to be of the same type as darray.  For example, you could have a numeric array containing account numbers tag along with a string array containing user names:

DIM Users$(100 TO 500), AcctNum&(100 TO 500)

ARRAY SORT Users$(), TAGARRAY AcctNum&()
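As a small worked illustration of the tag-along behavior described above, the following sketch fills two three-element arrays and shows where each element ends up.  The array contents are invented for the example; only the ARRAY SORT line comes from the text above.

```powerbasic
#COMPILE EXE
#DIM ALL

FUNCTION PBMAIN () AS LONG
  ' Sample data, invented for this illustration
  DIM Users(1 TO 3) AS STRING
  DIM AcctNum(1 TO 3) AS LONG

  Users(1) = "Linda" : AcctNum(1) = 1001
  Users(2) = "Ann"   : AcctNum(2) = 1002
  Users(3) = "Jerry" : AcctNum(3) = 1003

  ' Sort the names; each account number follows its name
  ARRAY SORT Users(), TAGARRAY AcctNum()

  ' After the sort:
  '   Users(1) = "Ann"    AcctNum(1) = 1002
  '   Users(2) = "Jerry"  AcctNum(2) = 1003
  '   Users(3) = "Linda"  AcctNum(3) = 1001
END FUNCTION
```

The tag array itself is not sorted by value; its elements simply follow the swaps made in the primary array.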

Together, index and count specify the portion of darray to be sorted.  index specifies the element at which the sort is to begin, while count specifies the number of consecutive elements to be sorted.  If index is omitted, the sort begins at the first element of darray.  If count is omitted or is zero, the array is sorted from element index to the last element of darray.  If both are omitted, the entire array is sorted:

DIM A&(1 TO 99)

ARRAY SORT A&(5)        'sorts elements 5..99 of A&

ARRAY SORT A&() FOR 10  'sorts elements 1..10 of A&

ARRAY SORT A&(9) FOR 20 'sorts elements 9..28 of A&

ARRAY SORT A&()         'sorts elements 1..99 of A&

Sorting numeric arrays

By default, arrays are sorted in ascending order.  To sort in descending order, include the DESCEND keyword:

ARRAY SORT A&(), DESCEND  ' descending order

ARRAY SORT A&(), ASCEND   ' ascending order

ARRAY SORT A&()           ' ascending order

Sorting string arrays

When sorting a string array, the sort is performed in ascending order by default.  In addition to DESCEND, ARRAY SORT provides the COLLATE UCASE and COLLATE string options.

COLLATE UCASE treats all lowercase letters as equal to their uppercase counterparts during the sort (elements "Bob" and "BOB" would be considered equal, for example):

DIM A$(1 TO 5)

A$(1) = "Bob"

A$(2) = "Jan"

A$(3) = "Linda"

A$(4) = "Ann"

A$(5) = "Jerry"

ARRAY SORT A$(), COLLATE UCASE, DESCEND

'sorts A$() in descending order; case-insensitive
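The difference between the default sort and COLLATE UCASE is easiest to see with mixed-case data.  The sketch below uses invented names and assumes the default weights described later in this topic, where each character sorts by its ASCII code:

```powerbasic
#COMPILE EXE
#DIM ALL

FUNCTION PBMAIN () AS LONG
  ' Sample data, invented for this illustration
  DIM Names(1 TO 4) AS STRING
  Names(1) = "bob"
  Names(2) = "Ann"
  Names(3) = "Zed"
  Names(4) = "carl"

  ' Default sort: uppercase codes (65-90) come before lowercase (97-122)
  ARRAY SORT Names()
  ' Names() is now: "Ann", "Zed", "bob", "carl"

  ' Case-insensitive sort: lowercase letters weigh the same as uppercase
  ARRAY SORT Names(), COLLATE UCASE
  ' Names() is now: "Ann", "bob", "carl", "Zed"
END FUNCTION
```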

COLLATE cstring is used to specify an entirely new sorting order.  This can be used for a variety of purposes, the most obvious of which is the case of international character sets.  The collate string cstring must contain exactly 256 characters, one for each of the ASCII codes 0-255.  It is not a list of characters in the order they should sort; instead, each byte holds the sort weight of the ASCII code that corresponds to its position.

Each position in the string, counted from 0, represents the ASCII code of that value.  The contents of the byte at that position tell PowerBASIC the "weight" or importance factor of that particular ASCII code.  The default is that position 0 has a weight of 0, position 1 has a weight of 1, etc., so that CHR$(0) sorts first, CHR$(1) sorts next, and so on through CHR$(255).

Suppose you want the special character "ä" to have the same weight as the standard character "a".  It's easy: construct a string of 256 characters, 0-255; then go to the position of "ä" (ASCII code 132), and change the contents of that byte so it is exactly equal to the code for "a" (97).  The following code fragment constructs just such a collate string:

' Create a 256-character string:

FOR ix = 0 TO 255

  C$ = C$ + CHR$(ix)

NEXT

MID$(C$, 132 + 1) = CHR$(97)

We add one to the ASC value for MID$ because string positions start at 1, not 0.  We can also use the expanded CHR$ function to create the same collating string using less code:

C$ = CHR$(0 TO 131, 97, 133 TO 255)

It is most important to remember the rule for creating a collating string, as it is easy to make an intuitive jump to the wrong conclusion.  Each position in the string (1-256) represents the ASCII code with that value minus one (CHR$(0) to CHR$(255)).  The contents of the byte at that position tell the ARRAY SORT procedure the new "weight" or importance factor for that particular code.  This is exactly the technique used by the 80x86-assembler opcode XLAT.

Suppose you want CHR$(0) to sort at the very end of the sequence.  To do that, you would set the byte at position 0+1 to CHR$(255) and the bytes at positions 0+2 to 0+256 to the values 0 to 254.  The ASCII sequence in the collating string would appear like this: 255,0,1,2,3,4…254.  Using the expanded CHR$ function, this is straightforward:

C$ = CHR$(255, 0 TO 254)

To sort upper case and lower case alphabetic characters as exactly equal, just set the bytes for codes 97 to 122 (a-z), which are string positions 98 to 123, to the values 65-90 (A-Z).  This is precisely how COLLATE UCASE is handled.  With the collating method implemented by this procedure in PowerBASIC, it is possible for two or more ASCII codes to have equal "weight".

As mentioned earlier, many programmers make a common, fatal mistake by intuitively creating a collating string that is simply a list of ASCII codes, in the sequence they wish to sort.  That is, they expect the byte which appears first in the string to sort first, the byte which appears next to sort second, so that creating a collate string from the BASIC code:

CHR$(65) + CHR$(66) + CHR$(67) + ...

…might cause the characters "ABC..." to be sorted first.  This technique will never work with the ARRAY SORT statement and must be carefully avoided.  We describe it here only because it is a common error.  While it is arguably more intuitive than the technique implemented in PowerBASIC, the reason it does not work is that it doesn't allow two or more ASCII codes to have the same "weight".
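If you prefer to think in terms of "the order I want the characters to sort in", one option is to convert such a list into the weight table that COLLATE expects.  The following is a minimal sketch of that conversion.  OrderToWeights is a hypothetical helper written for this illustration, not a PowerBASIC built-in, and it assumes the order string lists each of the 256 codes exactly once.

```powerbasic
#COMPILE EXE
#DIM ALL

' Hypothetical helper (not part of PowerBASIC): takes a string listing
' all 256 codes in the desired sort order and returns the equivalent
' weight table for use with COLLATE.
FUNCTION OrderToWeights (BYVAL sOrder AS STRING) AS STRING
  LOCAL sWeights AS STRING
  LOCAL i AS LONG

  sWeights = CHR$(0 TO 255)            ' start from the default weights
  FOR i = 1 TO LEN(sOrder)
    ' The code listed i-th in sOrder receives weight i - 1.
    ' Add one to the code because string positions start at 1.
    MID$(sWeights, ASC(sOrder, i) + 1, 1) = CHR$(i - 1)
  NEXT
  FUNCTION = sWeights
END FUNCTION

FUNCTION PBMAIN () AS LONG
  LOCAL sCollate AS STRING
  DIM A(1 TO 3) AS STRING

  ' Desired order: every code except the digits, then the digits 0-9
  sCollate = OrderToWeights(CHR$(0 TO 47, 58 TO 255, 48 TO 57))

  ' Sample data, invented for this illustration
  A(1) = "9 lives"
  A(2) = "apple"
  A(3) = "Zebra"

  ARRAY SORT A(), COLLATE sCollate
  ' A() is now: "Zebra", "apple", "9 lives"
END FUNCTION
```

Because every code appears exactly once in the order string, every code receives a distinct weight.  To give two codes equal weight, edit the resulting weight table directly, as shown in the examples elsewhere in this topic.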

The following code builds a collating string compatible with the American OEM ASCII character set.  For the fastest operation, this code should be run only once and the collating string should be made global.

GLOBAL cu AS STRING

FOR x = 0 TO 255

  cu = cu + CHR$(x)

NEXT

MID$(cu, 97+1, 26) = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

MID$(cu, 129+1, 6) = "ueaaaa"     ' üéâäàå

MID$(cu, 136+1, 9) = "eeeiiiAAE"  ' êëèïîìÄÅÉ

MID$(cu, 147+1, 8) = "ooouuyOU"   ' ôöòûùÿÖÜ

MID$(cu, 161+1, 5) = "iounN"      ' íóúñÑ

MID$(cu, 168+1, 1) = "?"          ' ¿

[ your code goes here ]

ARRAY SORT MyArray$(), COLLATE cu

An alternative arrangement using the expanded CHR$ function may look like this:

cu = CHR$(0 TO 96, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", _

        123 TO 128, "ueaaaa", _     ' üéâäàå

        135,        "eeeiiiAAE", _  ' êëèïîìÄÅÉ

        145 TO 146, "ooouuyOU", _   ' ôöòûùÿÖÜ

        155 TO 160, "iounN", _      ' íóúñÑ

        166 TO 167, "?", _          ' ¿

        169 TO 255)

For example, the normal ascending ASCII sort order would be described by a string containing ASCII codes 0 through 255 in order:

C$ = CHR$(0 TO 255)

ARRAY SORT A$(), COLLATE C$

The normal descending ASCII sort order would be described by a collating string containing the reverse of the above:

C$ = STRREVERSE$(CHR$(0 TO 255))

ARRAY SORT A$(), COLLATE C$

COLLATE string can also be used with the ASCEND or DESCEND option.  With ASCEND, the sort is performed in the order specified by COLLATE string; DESCEND sorts using the reverse of the order specified by COLLATE string:

ARRAY SORT A$(), COLLATE C$, DESCEND

The COLLATE string option is provided as a flexible means with which to specify the sorting order for strings containing international characters or other special symbols.  Please keep in mind that the characters with ASCII code above CHR$(127) may have different meanings in different countries.  The examples here assume that the default American OEM ASCII code page is in use.

When sorting a string array, all characters of each element of the array are normally considered when performing comparisons.  To limit the comparison to a specific subset of characters, use FROM to specify the start position, and TO to specify the end position that ARRAY SORT will consider within each array element.  For example, you could sort based on the zip code contained in the last 5 characters of a 40-character address string:

ARRAY SORT A$()                ' sorts all chars

ARRAY SORT A$(), FROM 36 TO 40 ' sorts on chars 36 - 40 only

By using the FROM..TO keywords, it also becomes possible to sort an array of User-Defined Types.  In this case, ARRAY SORT can sort the array as if it were an array of fixed-length strings.
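As an illustration of sorting a UDT array with FROM..TO, the sketch below mirrors the zip code example above.  The type AddrType, its members, and the sample data are assumptions made for this example; the byte positions given to FROM..TO are derived from the member sizes.

```powerbasic
#COMPILE EXE
#DIM ALL

' Hypothetical 40-byte record, defined only for this illustration
TYPE AddrType
  Street AS STRING * 35   ' occupies bytes  1 to 35
  Zip    AS STRING * 5    ' occupies bytes 36 to 40
END TYPE

FUNCTION PBMAIN () AS LONG
  DIM Addr(1 TO 3) AS AddrType

  Addr(1).Street = "12 Oak Lane"   : Addr(1).Zip = "90210"
  Addr(2).Street = "7 Elm Street"  : Addr(2).Zip = "10001"
  Addr(3).Street = "300 Pine Road" : Addr(3).Zip = "60601"

  ' Compare only bytes 36 to 40 (the Zip member) of each element
  ARRAY SORT Addr(), FROM 36 TO 40

  ' After the sort, whole records have moved:
  '   Addr(1) = "7 Elm Street",  "10001"
  '   Addr(2) = "300 Pine Road", "60601"
  '   Addr(3) = "12 Oak Lane",   "90210"
END FUNCTION
```

This approach compares the selected bytes as characters, so it suits string members.  For numeric members, a custom sort function is the safer choice.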

Sorting custom arrays

In most cases, the standard numeric and string sorts should serve your needs very well.  However, in the case of more complex data, it is frequently necessary to create multi-key sorts, or other unusual data sequences.  Generally speaking, a multi-key sort is used when you wish to order data based upon multiple sections of a string or UDT.  For example, you may wish to have customers sequenced by name -- but in the case of duplicate names, order each set of duplicates by ZIP code.  With the custom array option, you can sort by any number of keys, in any sequence you may desire.

A custom array may be user-defined types, fixed-length strings, or nul-terminated strings.  With a custom array sort, you can write your own simple function to tell PowerBASIC the correct sequence for any two array elements.  In the following example, the array MyType() is sorted based upon the code you write in the user-written function named MyFunc().

ARRAY SORT MyType(), CALL MyFunc()

As PowerBASIC proceeds through the sort, each time it needs to compare two array elements, it calls your custom function (in this case named MyFunc) to determine the correct sequence of the two elements.  The custom function you write must always have exactly two ByRef parameters of an appropriate data type, normally the same data type as the sorted array.  For nul-terminated and FIELD strings, the parameters must contain the length.  Your custom function must return a long integer to tell the correct sequence.  It returns -1 if the first parameter should precede the second parameter.  It returns +1 if the second parameter should precede the first.  It returns 0 if the parameters are equal.

This affords the PowerBASIC programmer the ultimate tool in sorting capabilities.  You can have any number of keys.  You can sort ascending, descending, or some other special sequence.  The conditions are now totally under your control.  The following example shows how easy it is to create a multi-key sort, even one based upon non-string members of a UDT.

Type TheType
  LastName   as String * 40
  FirstName  as String * 20
  BalanceDue as Currency
End Type
[statements]
Dim MyType(100) as TheType
[statements]

Array Sort MyType(), Call MyFunc()
[statements]

Function MyFunc(Param1 as TheType, Param2 as TheType) As Long
  If Param1.LastName < Param2.LastName Then
    Function = -1 : Exit Function
  End If
  If Param1.LastName > Param2.LastName Then
    Function = +1 : Exit Function
  End If
  If Param1.FirstName < Param2.FirstName Then
    Function = -1 : Exit Function
  End If
  If Param1.FirstName > Param2.FirstName Then
    Function = +1 : Exit Function
  End If
  If Param1.BalanceDue < Param2.BalanceDue Then
    Function = +1 : Exit Function
  End If
  If Param1.BalanceDue > Param2.BalanceDue Then
    Function = -1 : Exit Function
  End If
End Function

Notice that this function first sorts by last name in ascending sequence. If the last names are equal, it then sorts by first name in ascending sequence.  If both names are equal, it then sorts by Balance Due in descending sequence so that the accounts with the highest balance appear first.  This descending sequence is accomplished by switching the values -1/+1 in the final tests.

The array to be sorted, and the function parameters, must be fixed-length strings, nul-terminated strings, or user-defined types.  PowerBASIC verifies that the size of the data and parameters are identical.  However, to allow maximum flexibility, it does not require that the data types be the same. Therefore, for example, it's possible to sort an array of fixed-length strings using a function with UDT parameters as long as the data size is identical.  It is the programmer's responsibility to ensure accuracy.

Sorting a multi-dimensional array

When sorting a multi-dimensional array, the array is treated as a single-dimension array containing all of the elements of the multi-dimensional array, in linear column-major order.  That is, the elements in which every dimension except the first is held at its minimum bound come first in memory.  These are immediately followed by the elements where the second dimension is set to its next consecutive index value, etc.

For example, the elements of a two-dimensional array (i.e., DIM A(n,x)) would be stored in consecutive memory locations like this:

(0,0), …, (n,0), (0,1), …, (n,1), …, (0,x), …, (n,x)

In this case, ARRAY SORT A(0,0) FOR n+1 would sort only elements (0,0)...(n,0), while ARRAY SORT A(0,0) would sort the entire array: elements (0,0)…(n,x).

Be very careful when using ARRAY SORT with multi-dimensional arrays so as not to disrupt the organization of the data in the arrays.
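One way to keep a multi-dimensional sort under control is to restrict it to a single run of consecutive elements.  The sketch below, using invented data in a 3 x 2 array, sorts only the elements whose second index is 1, following the storage order described above:

```powerbasic
#COMPILE EXE
#DIM ALL

FUNCTION PBMAIN () AS LONG
  ' Stored in memory as (0,0), (1,0), (2,0), (0,1), (1,1), (2,1)
  DIM A(0 TO 2, 0 TO 1) AS LONG

  ' Sample data, invented for this illustration
  A(0, 0) = 30 : A(1, 0) = 10 : A(2, 0) = 20
  A(0, 1) = 3  : A(1, 1) = 1  : A(2, 1) = 2

  ' Start at element (0,1) and sort 3 consecutive elements:
  ' (0,1), (1,1), (2,1)
  ARRAY SORT A(0, 1) FOR 3

  ' After the sort:
  '   A(0,1) = 1, A(1,1) = 2, A(2,1) = 3
  '   A(0,0), A(1,0), A(2,0) are unchanged (30, 10, 20)
END FUNCTION
```

Note that this breaks any pairing between A(i,0) and A(i,1); if the two slices hold related data, that pairing is exactly the kind of organization the warning above refers to.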

Options

The options for ARRAY SORT can be specified in any order, as long as the FOR option, if it is present, directly follows the closing parenthesis of the name of darray.

Restrictions

ARRAY SORT cannot be used on arrays within UDT structures or on an array of Interfaces.  However, ARRAY SORT can be used with arrays of UDT structures - simply treat them as if they were an array of fixed-length strings.

To use ARRAY SORT on an embedded UDT array, use DIM..AT to dimension a regular array (of the same type) directly "over the top" of the UDT array, and use ARRAY SORT on that array.  For example:

TYPE SalesType

  OrderNum AS LONG

  PartNumber(1 TO 20) AS STRING * 20

END TYPE

[statements]

DIM Sales AS SalesType

[statements]

DIM Temp(1 TO 20) AS STRING * 20 AT VARPTR(Sales.PartNumber(1))

ARRAY SORT Temp()

ERASE Temp()

See also

ARRAY ASSIGN, ARRAY DELETE, ARRAY INSERT, ARRAY SCAN, CHR$, DIM, LBOUND, REDIM, UBOUND, Array Data Types

Example

ARRAY SORT A&(5) FOR 10, TAGARRAY B$(), DESCEND

Sorts elements 5 through 14 of array A& in descending order, tagging along elements 5 through 14 of array B$.

ARRAY SORT A#()

Sorts all elements of array A# in ascending order, using no tag-along array.

ARRAY SORT A$(5) FOR 10, FROM 16 TO 25, COLLATE C$, TAGARRAY D()

Sorts elements 5 to 14 of array A$, considering only characters 16 to 25 of each element, using the sort order specified by collating string C$, tagging along elements 5 to 14 of array D.

ARRAY SORT A$()

Sorts all elements of array A$ in ascending order, considering all characters of each element, using no tag-along array.

ARRAY SORT MYTYPE(), CALL MYFUNC()

Sorts all elements of the UDT array MYTYPE, using the custom UDT comparison function MYFUNC() to determine the sequence.