Range Notation

Business Case: A record set consists of sequential values that must be represented in standard range notation. For example, values include 1, 2, 3, 5, 6, 8, 9, 10, 11, 14, 15, and 16; these values should be represented as a single, comma-separated string with sequential values compressed into ranges, as follows: “1-3,5,6,8-11,14-16”.

Environment: There are no environmental factors with a relevant influence on this business case.

Approach: We’ll attempt to meet this requirement in a series of steps that build upon each other. First, we’ll identify sequence gaps in the record set. With that information we’ll determine how many values are in each range. If there are more than two sequential values, we’ll compress the range into range notation indicating the first value and the last value in the range. Finally, we’ll incorporate the single-values and the ranges and convert them into a comma-separate string using the FOR XML clause.

Solution: Let’s see the solution in action, starting with creation of our sample data set.

-- setup sample data
VALUES (1), (2), (3)
    , (5), (6)
    , (8), (9), (10), (11)
    , (14), (15), (16);

With the same data set created, we’ll define the pieces necessary to reach the desired output. For this example, we’ll use common table expressions to enhance readability of the solution. Note that it can be done without CTEs, but it would be more difficult to follow due to the nested sub-queries.

Our initial step is to identify the gaps in sequencing; we’ll do this by joining the data set to itself, with the sequence value offset by one. Whenever the offset record set is NULL, we know we have a gap in the sequence. This step is coded in the “cte” common table expression below.

Once the sequence gaps have been identified, we can join the record set of gaps to the original record set of sequence numbers and add the range start value and range notation text to the range end (sequence gap) value. This is accomplished in the “cteRange” common table expression below. Note that we are grouping the records and determining those ranges that should be converted to range notation based on having a count > 2; in other words, ranges with three or more sequential values.

We’re almost there! Next we join the range data to the original record set of sequence numbers (again) and group the result based on range. In the event a sequence number is not part of a range (does not have a corresponding record in “cteRange”), it serves as its own range value. Finally, in order to convert this multi-record result into a single, comma-separated value, we use the FOR XML clause. (Haven’t done that before? Check here for more explanation, or here for another variation.) This entire process is demonstrated in the “put it all together” section of the code below.

-- create common table expressions to improve readability
WITH cte AS ( -- find range end points
    SELECT a.ID AS 'RangeEnd'
    FROM @tbl AS a LEFT OUTER JOIN @tbl AS b ON a.ID = b.ID - 1
, cteRange AS ( -- group values based on range end points
    SELECT MIN(t.ID) AS 'RangeStart'
        , oa.RangeEnd, CONVERT(VARCHAR
        , MIN(t.ID)) + '-' + CONVERT(VARCHAR, oa.RangeEnd) AS 'RangeText'
    FROM @tbl AS t
        OUTER APPLY (
            SELECT TOP 1 RangeEnd
            FROM cte
            WHERE cte.RangeEnd >= t.ID
            ORDER BY RangeEnd) AS oa
    GROUP BY oa.RangeEnd
    HAVING COUNT(*) > 2)

-- put it all together
    (SELECT ',' + ISNULL(cteRange.RangeText, t.ID)
    FROM @tbl AS t
        LEFT OUTER JOIN cteRange
            ON t.ID >= cteRange.RangeStart
            AND t.ID <= cteRange.RangeEnd
    GROUP BY ISNULL(cteRange.RangeStart, t.ID), ISNULL(cteRange.RangeText, t.ID)
    ORDER BY ISNULL(cteRange.RangeStart, t.ID)
    FOR XML PATH('')))
    , 1, 1, '') AS 'RangeNotation';

Does anyone have a suggestion on how this can be accomplished more easily? Or any alternate approaches? Please share your feedback.

Anyone who is a regular reader of this blog is probably waiting for my reminder to clean-up your work area. However, since this example uses a table variable and does not materialize anything in the database… there is nothing to clean-up! So you won’t hear anything about clean-up from me. 😉

Post a comment or leave a trackback: Trackback URL.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: