Midrange News for the IBM i Community


Automatic Sequencing - SQL Identity Column Published by: Bob Cozzi on 01 Feb 2011 view comments(2)
© 2011 Robert Cozzi, Jr. All rights reserved.

© Robert Cozzi, Jr. All rights reserved. Reproduction/Redistribution prohibited.
A midrangeNews.com Publication

Welcome to The Premiere Issue of RPG Report

Bob Cozzi Launches Subscription-based Newsletter for i Professionals

NOTE: RPG Report Free Preview Has Ended

Our two February issues require no membership, our two March issues require a free membership, and all subsequent issues of RPG Report require a Premium (paid) membership. Learn more about our membership choices here.

February marks the launch of RPG Report, Bob Cozzi's twice-monthly technical tips and announcements newsletter. Those who subscribe receive the newsletter and discounts on other materials, such as online seminars, RPG IV training on DVD and links to free downloads and code examples. Here's the best part, you can receive this great content delivered to your home or office email inbox for just $2/month and you may cancel anytime. Our two February issues of RPG Report are free, but subscriptions are now being accepting.

To subscribe, visit midrangenews.com and click the Sign Up link on the top right side of the page header, or click here to create a user ID now. After you've created your free user ID, click on the RPG Report link on the home page and then click "Follow". By following the newsletter, you are effectively subscribing to it. Whenever a new issue of RPG Report is created, you will receive an email notification. You don't have to follow every issue of RPG Report, just the entry page. RPG Report is published twice each month by midrangeNews.com. While much of midrangeNews.com content is free, subscribing to our Premium content, such as RPG Report requires a Premium (paid) membership of $2/month. You may cancel anytime.

Sponsored by: BCD

RPG Report - Auto-Increment a Database Key Field

RPG routines to increment a key field when adding a new record to a file have evolved over the years. When I first started programming RPG, most shops used a so called header record to store the "next key" for the file. Typically this was a record whose own key was something like 99999. You would "chain out to customer 99999" and a second field was used to store the next available customer number. You'd grab its value, increment it by 1 and update that master record. Then write out a new record to the file with the new key value.

Later, when System/38 Data Areas came into vogue, programmers moved from that 99999 record method to a data area that contained the next key value. Some shops used a single, large data area for all "next key" values, while others created a data area with the same name as the database file containing various attributes for processing the file. This included the next key value. While not my favorite technique, this method certainly insures that the data area and hence the next key value was locked until another job could retrieve and modify its value.

Another method we've seen used in many shops includes the use of a "read last record" technique. This method incorporates a SETLL *END opcode followed by a READP opcode. After reading the last or highest key value, it is incremented and then a new record is written to the file (releasing the previous last record). This does have some of the characteristics of the Data Area method--it locks the record and therefore the next key value. There are some odd downsides to this technique. But they aren't what we're talking about today.

Identity Column

A better way to create a self-incrementing field was introduced into DB2 UDB (i.e., "the database") in v5r2. Thanks in part to Oracle implementing a similar feature, IBM implemented the so called IDENTITY. Interestingly, both were after MS Access and SQL Server had similar features; competition is good.

Originally, Oracle illustrated that if an "auto incrementing" key were needed, a database trigger could be used along with the database SEQUENCE feature. But writing a trigger requires that it be done for every database file to which you want to include this type of auto-incrementing field. Great for job security, but not so much for day-to-day development and design. So they came up with a better approach: Identity Columns.

The IBM DB2 UDB Identity Column automatically increments to the next value when a record is written to the file; no user-written code is needed. In fact, you simply write a record to the file using the native RPG I/O WRITE opcode or the SQL INSERT statement, and it just works. The database performs the necessary routines to determine the next available value and then generates it during the write or insert operation. It even works when you use the CPYF (copy file) command.

As I mentioned, this type of field is referred to as an Identity Column. However, no DDS keyword exists to allow you to add the Identity attribute to a field in a database file. You might think an IDENTITY keyword would be all they'd need to implement but IBM has effectively frozen DDS for database. This means any cool new database features, such as the Identity Column attribute, will not be implemented in DDS. You will need to use SQL to define a file that contains an Identity column.

The SQL CREATE TABLE statement is used to create a file (i.e., table) that includes an IDENTITY column. If you want to add an Identity column to a file or change an existing field in a file to an Identity column, you must first convert the file's DDS to SQL data definition language (DDL). Then add the "GENERATED AS IDENTITY" clause to the SQL for the field in question. Fortunately generating DDS from an existing file is no problem thanks to my handy RTVSQLSRC (Retrieve SQL Source) command, download the RTVSQLSRC.ZIP file now.

Example IDENTITY

A typical Contacts database file, originally created with DDS might look like this:

     A          R CONTACT                                         
     A            CUSTNO         7P 0       ALIAS(CUSTOMER_NUMBER)
     A            COMPANY       32A         ALIAS(COMPANY_NAME)   
     A            ADDRESS       32A                               
     A            CITY          20A                               
     A            STATE          4A                               
     A            ZIPCODE       10A                               
     A            EMAIL         10A                               
     A            PHONE         11P 0                             
     A          K CUSTNO                                          	
	

This file contains 8 fields and 1 key field. Using the RTVSQLSRC CL command, the generated SQL equivalent of the DDS used to create the file is as follows:

     CREATE TABLE RPGREPORT/CONTACTS (                                          
       CUSTOMER_NUMBER FOR COLUMN CUSTNO  DECIMAL(7, 0),                      
       COMPANY_NAME FOR COLUMN COMPANY CHAR(32) NOT NULL DEFAULT '' ,  
       ADDRESS CHAR(32) CCSID 37 NOT NULL DEFAULT '' ,
       CITY CHAR(20) CCSID 37 NOT NULL DEFAULT '' ,
       STATE CHAR(4) CCSID 37 NOT NULL DEFAULT '' ,
       ZIPCODE CHAR(10) CCSID 37 NOT NULL DEFAULT '' ,
       EMAIL CHAR(10) CCSID 37 NOT NULL DEFAULT '' ,
       PHONE DECIMAL(11, 0) NOT NULL DEFAULT 0,
         PRIMARY KEY (CUSTNO) )
       RCDFMT CONTACT ;  
	

The SQL generated by RTVSQLSRC produces some very interesting results. You not only see the field definitions, but you can see how SQL handles the ALIAS keyword (important if you're moving to IBM i v7r1 or later).

Key Fields in SQL

The key field is visible on the 2nd to last line in the above source; the PRIMARY KEY keywords identify the field as the key field. The RTVSQLSRC command also likes to embed the CCSID keyword which clutters things up, but I'll leave it in there for now. (FYI, it is not the RTVSQLSRC command that inserts the CCSID keyword, but the underlying API that does.)

Since we are now working in SQL, we have to use SQL-speak. The term "field" is replaced with the SQL term "COLUMN", the term "file" is replaced with the SQL term "TABLE" (although I still say "file" most of the time) and the term "record" is referred to as a "ROW". Wait a minute! I can't say "ROW" I'm not using a spreadsheet, I'm using a database file, ah, I mean table. I can't seem to embrace the term "row". SQL did not embrace the database term tuple instead of row, so I won't be embracing row over record until I successfully rewire my brain; but I digress.

To add an auto-incrementing IDENTITY attribute to the CUSTNO column, we need to add "GENERATED ALWAYS AS IDENTITY" clause to the SQL definition, as follows:

     CREATE TABLE RPGREPORT/CONTACTS ( 
       CUSTOMER_NUMBER FOR COLUMN CUSTNO  DECIMAL(7, 0)
>>         GENERATED ALWAYS AS IDENTITY,

"CUSTOMER_NUMBER FOR COLUMN CUSTNO" defines a column (field) named CUSTNO with an alias of CUSTOMER_NUMBER as a 7-digit packed decimal column. By adding "GENERATED ALWAYS AS IDENTITY" the database no longer allows you to assign a value to the column. Instead, the database assigns the value when a new record is written to file ("table" in SQL speak). The first record added will have a CUSTNO value of 1, the second will have value of 2 and so on.

When the Identity column is also a keyfield, the identity column must contain unique values ("unique key"). This too is handled by the system, preventing the use of certain IDENTITY features such as the CYCLE keyword. When an Identity column is not a keyfield, it may contain duplicate values. This occurs when you, for example set a range limit for the field and specify the CYCLE keyword. The CYCLE keyword causes the Identity column to wrap around to the beginning value when the maximum value is reached. The max value is specified using the MAXVALUE clause (there is also a MINVALUE clause).

That is really all there is to it. You can create the example table I've illustrated now and give it a try, but wait... there's more.

Setting the Increment Value

You are allowed to control the starting Identity value and its increment value the default is to start at 1 and increment by 1. But what if you want to start at some other value and/or increment by 2 or 10 or 500 instead?

After the "GENERATED ALWAYS AS IDENTITY" keywords, you may insert a parenthetical statement that controls the starting point and increment value. The keywords needed to implement this are "START WITH" and "INCREMENT BY". Here's an example that starts with a value of 1200 and increments by 10:

     CREATE TABLE RPGREPORT/CONTACTS (
       CUSTOMER_NUMBER FOR COLUMN CUSTNO  DECIMAL(7, 0)
           GENERATED ALWAYS AS IDENTITY
>>       (START WITH 1200, INCREMENT BY 10),

With these settings, the first record written into the file will have a CUSTNO value of 1200, the 2nd record will have a value of 1210, the 3rd will be 1220 and so on. If record 1220 is the last record added and it is deleted before any other records are added, the next WRITE or INSERT operation creates a record with a CUSTNO value of 1230. That is, the database does not reuse the Identity values of deleted records.

To give you the full picture, I've summarized the changes to the SQL data definition language (DDL) source for the CONTACTS table in the following example:

     CREATE TABLE RPGREPORT/CONTACTS (
       CUSTOMER_NUMBER FOR COLUMN CUSTNO  DECIMAL(7, 0)
            GENERATED ALWAYS AS IDENTITY
              (START WITH 1200, INCREMENT BY 10),
       COMPANY_NAME FOR COLUMN COMPANY CHAR(32) NOT NULL DEFAULT '' ,  
       ADDRESS CHAR(32) CCSID 37 NOT NULL DEFAULT '' ,
       CITY CHAR(20) CCSID 37 NOT NULL DEFAULT '' ,
       STATE CHAR(4) CCSID 37 NOT NULL DEFAULT '' ,
       ZIPCODE CHAR(10) CCSID 37 NOT NULL DEFAULT '' ,
       EMAIL CHAR(10) CCSID 37 NOT NULL DEFAULT '' ,
       PHONE DECIMAL(11, 0) NOT NULL DEFAULT 0,
         PRIMARY KEY (CUSTNO) )
       RCDFMT CONTACT ;  

If your application is one of those 20+ year old systems that doesn't care if the invoice or order numbers wrap around to zero and start over after several years, this Identity capability could also help in this situation. There is another keyword named CYCLE that causes the system to wrap around when it hits the top value that will fit into the column. If you're working with a 5-digit packed decimal column, when it hits 99999 if the CYCLE keyword has been specified in the SQL DDL, the Identity value will wrap around to 0 on the next insert operation. However, you may not use a column as a PRIMARY KEY when the CYCLE keyword is specified on this column. But you may build an INDEX, VIEW, Logical File or specify the ORDER BY clause and get the results in the expected sequence.

When CYCLE is not specified (which is normally the case) and you reach the highest value that can fit into the column, an exception is generated and the SQLSTATE variable is set to a value greater than '02000'.

Using Identity in RPG

The database itself handles the assignment of the Identity column's value. Using a file with an Identity column in RPG is the same as any other file, except the column must be treated as read-only field in your program. This means you cannot move a value into the Identity column, but you can retrieve it's value using the normal RPG or SQL techniques. The RPG WRITE opcode is smart enough to handle Identity column correctly--no additional work required.

An advantage to using SQL INSERT over the WRITE opcode is that after an SQL INSERT you can retrieve the Identity column value generated by that INSERT operation. To do that use the SQL VALUES statement as follows:

         exec SQL  INSERT into CONTACTS
             (company, address, city, state, zipcode, email, phone) 
             VALUES(:c.company, :c.address, :c.city,
                    :c.state, :c.zipcode, :c.email, :c.phone);
        if (SQLSTATE < '02000')); // Successful?

 >>        exec SQL values IDENTITY_VAL_LOCAL() into :c.CustNo;

        endif;

Following the INSERT, the VALUES statement is performed (identified by the line marked with >> symbols). Using the special function IDENTITY_VAL_LOCAL(), the last Identity is returned. The INTO keyword is used to identify the RPG IV variable into which the Identity value is stored. This must be a numeric field. In this example, I used a Qualified Data Structure Subfield named CUSTNO in the data structure named C.

In the above example, the SQL INSERT statement is used to add a new record to the CONTACTS database file. Since a value for the CUSTNO may not be specified on an INSERT statement, all columns except CUSTNO are listed on the VALUES keyword of the INSERT statement. If CUSTNO were specified on the list of columns, then the VALUES keyword would have to include the special SQL DEFAULT keyword. This keyword tells SQL to use the column's default value (identified with the DEFAULT keyword on the CREATE statement) but for IDENTITY columns it means "insert the next sequence number". Here's a modified version of the previous INSERT statement, but this time I've included the CUSTNO column and the DEFAULT keyword for its value.

         exec SQL  INSERT into CONTACTS
             (custno, company, address, city,
               state, zipcode, email, phone) 
>>           VALUES(DEFAULT, :c.company, :c.address, :c.city,
                    :c.state, :c.zipcode, :c.email, :c.phone);
        if (SQLSTATE < '02000')); // Successful?

           exec SQL values IDENTITY_VAL_LOCAL() into :c.CustNo;

        endif;

By including the CUSTNO column in the list of fields, I had to add the DEFAULT keyword to the VALUES clause, no other changes were made. Why would you do this? Some people like to avoid specifying the list of column names on an INSERT statement and instead specify only the VALUES clause. When this happens, you need to specify a value for every column of the file/table. Once again, here is a modified version of the previous example to illustrate this point:

         exec SQL  INSERT into CONTACTS
>>           VALUES(DEFAULT, :c.company, :c.address, :c.city,
                    :c.state, :c.zipcode, :c.email, :c.phone);
        if (SQLSTATE < '02000')); // Successful?

           exec SQL values IDENTITY_VAL_LOCAL() into :c.CustNo;

        endif;

In this example I've completely removed the list of column names. This means I have to specify a value for every column. This is similar to what happens when you specify "SELECT *" where all columns are selected. (Disclaimer: SQL pundits  will tell you to "never use 'SELECT *' because it is bad form" and I agree with them. But that's a topic for another day.)

Identity Caveats

A few other Identity column options include the ability to set the min/max values for the column, a cache size used in very high-transactions per minute situations, and the wrap or cycle feature that I mentioned early.

In addition, there is the ability to override the Identity value generated by the system. Yes this seems contrary to what I've said so far. You can override the automatic Identity value by including your own value and the "OVERRIDING SYSTEM VALUE" clause on the INSERT or UPDATE SQL statements, for example:

         exec SQL  INSERT into CONTACTS
               VALUES(9408, :c.company, :c.address, :c.city,
                      :c.state, :c.zipcode, :c.email, :c.phone)
>>               OVERRIDING SYSTEM VALUE;
        if (SQLSTATE < '02000')); // Successful?

           exec SQL values IDENTITY_VAL_LOCAL() into :c.CustNo;

        endif;

Normally the database would assign the value, but in this example I've overridden the Identity value to 9408. To enable this, I had to included the "OVERRIDING SYSTEM VALUE" clause on the INSERT statement.

Adding the Identity Attribute to an Existing Database File

If you are adding IDENTITY to a file/table, the Identity attribute may only be specified when the file/table is initially created. The SQL ALTER TABLE statement does not support adding the Identity attribute to a column/field in an existing file/table. (At least I wasn't able to make that work on my v6r1 system.)

Restarting an Identity Sequence

 ALTER TABLE contacts ALTER COLUMN custno RESTART;

If an ALTER TABLE / ALTER COLUMN with the RESTART option is used, the Identity column is reset to its starting value, but the existing data in the file/table is not deleted. The next INSERT statement will assign the first Identity value (identified by the "START WITH" clause in the SQL DDL) to the record. If "START WITH" is not specified, 1 is the next value inserted.

Resetting the Next Identity Sequence

 ALTER TABLE contacts ALTER COLUMN custno RESTART WITH 9404;

Suppose to you create a new file and want to copy the original data to that new location. When you issue the CPYF command, the Identity fields are copied along with the other fields in the file. A problem will occur when you go to insert the next record; a duplicate key violation will occur (i.e., a duplicate identity). This will be caused due to the START WITH clause on the new file being the same as the original file. To solve that issue, after the CPYF command runs, run a query on the file and order by the Identity column in descending order (to view the highest current Identity value). Then issue a RESTART WITH clause using the ALTER TABLE, as illustrated above, that sets the next key value to (in this example) 9404.

Removing an Identity Attribute

ALTER TABLE contacts ALTER COLUMN custno DROP IDENTITY;

If you remove the Identity attribute from an existing column, the database no longer generates the value.

Setting a Minimum and/or Maximum Identity Range

ALTER TABLE contacts ALTER COLUMN custno MINVALUE 10, MAXVALUE 99999;

Even though the CUSTNO column has a length of 7-digits, I limit its value to 99999. Note that this applies only to Identity columns. You can also specify the MINVALUE and MAXVALUE keywords on the original column definition as follows:

     CREATE TABLE RPGREPORT/ITEMMAST (
       ITEMNO CHAR(5),
       ORDNBR INTEGER   GENERATED ALWAYS AS IDENTITY
>>              (MINVALUE 2, MAXVALUE 10, CYCLE),
       TITLE CHAR(32) CCSID 37 NOT NULL DEFAULT '' ,
       DESC  VARCHAR(2000) ALLOCATE(30) CCSID 37 NOT NULL DEFAULT '' ,
         PRIMARY KEY (ITEMNO) 
     );
Ad

This last example may be a bit contrived, but its purpose is to illustrate the syntax more than anything else. Note that when you use MIN/MAXVALUE and CYCLE you can't use the column as a primary key. You may specify it in an ORDER BY clause, but not as the PRIMARY KEY on the Table definition statement.

Experiment Yourself

Try it yourself. Extract the DDS for an existing file on your system. I like to store SQL source in the QSQLSRC source file. Then add the IDENTITY clause and re-create the file in your work-library (be careful not to replace any live production database files). Then try out different situations using both SQL and native RPG IV opcodes (it also works in RPGIII if you still have some code that hasn't yet been converted).

I find the best way to learn much of this stuff is to play with it (i.e., experiment) and see if it does what you expect. While it may not always do what you want, it will always do what it's supposed to do.

Call Me

That's all for this issue. If you have any questions on any area of RPG IV, Web development, API, C/C++ or anything else IBM i development related (except subfiles, data area, and RPGII/III because I don't care about that stuff) write a comment to me on our website and I may include a response in an upcoming issue of RPG Report.

Also, please remember to subscribe to RPG Report and tell your friends.

-Bob Cozzi
Follow Bob on Twitter

Return to midrangenews.com home page.
Sort Ascend | Descend

COMMENTS

(Sign in to Post a Comment)
Posted by: jonathancw
Premium member *
Comment on: RPG Report - 01 Feb 2011
Posted: 10 years 8 months 29 days 7 hours 41 minutes ago

Hi There,

 

it seems as though generated by default doesnt work as expected.

when using the write opcode in RPG it defaults the generated by default column to zero. so for some reason RPG

is looking at it as though i am specifying zero as a value because it is an integer column. is it possible to get the default auto generated number for this field when writing to it?

 

create table test    

(

Col1 INTEGER NOT NULL           

 GENERATED BY DEFAULT AS IDENTITY

(START WITH 100                  

 ,INCREMENT BY 1                 

 ,NO CYCLE                       

 ,ORDER)

 )                 

 

any assistance will be greatly appreciated.

Posted by: bobcozzi
Site Admin ****
Chicagoland
Comment on: RPG Report - 01 Feb 2011
Posted: 9 years 1 months 6 days 23 hours 20 minutes ago

Interesting observation. I find that on V7R1 it does not do what you describe, but now I'm wondering if on prior releases, RPG had issues with IDENTITY columns. I'm going to have to test this out on a V5R4 system.

Also, I did discover that ALTER TABLE can be used on DDS-based files starting with V7R1. Prior that then, you had to either RTVSQLSRC and recreate the table or go without identity columns on DDS-created files.