Midrange News for the IBM i Community

Posted by: Chris Proctor
Programmer Analyst
Columbia Sports Company
Portland, OR
Processing a LARGE number of rcds quickly??
Published: 01 Jun 2015
Revised: 05 Jun 2015



Good morning. I have a situation where we need to process a large number of records in a short period of time, and I'm wondering if anyone has been in the same situation and come up with a good solution. The program is simple, but it currently processes approximately 16M records. Here's a brief look at what the program needs to do:




  SETGT <On key fields> FILEB
  READPE FILEB    <-- Needed because the key includes year and week, and
                      all may not exist, so we need to grab the latest
                      available record
    CHAIN <key> FILEC
    IF %FOUND
       WRITE FILEC
    ENDIF

Any suggestions would be greatly appreciated. I was actually trying to think of ways to possibly break it down into multiple jobs being submitted, or something like that. I'd be interested to hear any other ideas. Thanks, all!!
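To make the multiple-jobs idea concrete, here is one way it could be sketched in CL. The program name PRCWEEK and the key ranges are illustrative only; the real split would depend on your key fields:

```cl
/* Split the 16M-record batch into four key ranges and run them in    */
/* parallel. PRCWEEK is a hypothetical driver program that accepts a  */
/* from/to key range as parameters.                                   */
SBMJOB CMD(CALL PGM(PRCWEEK) PARM('A' 'F')) JOB(PRCWEEK1)
SBMJOB CMD(CALL PGM(PRCWEEK) PARM('G' 'M')) JOB(PRCWEEK2)
SBMJOB CMD(CALL PGM(PRCWEEK) PARM('N' 'S')) JOB(PRCWEEK3)
SBMJOB CMD(CALL PGM(PRCWEEK) PARM('T' 'Z')) JOB(PRCWEEK4)
```

Each submitted job would then position to its own range with SETGT and stop at the range boundary.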


Posted by: starbuck5250
Premium member *
Comment on: Processing a LARGE number of rcds quickly??
Posted: 8 years 11 months 18 days 2 hours 22 minutes ago

Two generic concepts apply to all I/O optimisation:

1) Reduce the number of rows

2) Reduce the number of columns

One can reduce the number of rows in several ways:

1) Break the batch into pieces and run them in parallel

2) Process the rows at a different time.  Instead of waiting to process 16M rows, process them one at a time, when they are written into that 16M row table, for instance.  Triggers might help here.

3) Use a logical file that omits unnecessary (deleted / inactive / last year) rows.
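Point 3 can also be expressed in SQL terms as a view that filters out the rows the batch never needs. Table and column names here are illustrative, not from the original post:

```sql
-- Hypothetical names: SALHST is the 16M-row history table.
-- The view exposes only current, active rows to the batch program.
CREATE VIEW SALHST_ACTIVE AS
  SELECT *
  FROM SALHST
  WHERE STATUS <> 'D'        -- skip deleted/inactive rows
    AND SALEYEAR >= 2014;    -- skip prior-year rows
```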

Reducing the number of columns can be pretty easy.  Imagine this 16M row file is a sales history file, and that it has customer, item, description, weight, serial number, quantity, price, extension, date sold.  Now imagine that all you care about is totals by customer/item number.  Every CHAIN brings in all the columns; only 3 are needed.  Say the record length is 100 bytes.  For 16M rows, you are moving 1.6B bytes of data from disk to your program.  If you had a logical file which only contained cust, item, qty, then a CHAIN to that file would need to move far fewer bytes - say 15 - per record.  Now, moving through 16M rows would only require the transfer of 240M bytes, which is quite a reduction, and without a single change to the logic in your program.
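The column-reduction idea above, sketched as an SQL view (the names CUSTNO, ITEMNO, QTY, and SALHST are assumptions for illustration; a DDS logical file with a field list would achieve the same thing):

```sql
-- Expose only the three columns the program actually uses, so each
-- read moves roughly 15 bytes instead of the full 100-byte record.
CREATE VIEW SALHST_SLIM AS
  SELECT CUSTNO, ITEMNO, QTY
  FROM SALHST;
```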

Posted by: bobcozzi
Site Admin ****
Comment on: Processing a LARGE number of rcds quickly??
Posted: 8 years 11 months 17 days 20 hours 13 minutes ago
Edited: Fri, 05 Jun, 2015 at 06:46:41 (3270 days ago)

Figure out the SQL statement that will do what you want and use it. It'll finish faster than your RPG program.
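For the flow in the original post, a set-based equivalent might look like the statement below. The file and key names (FILEB, FILEC, CUSTNO, ITEMNO, SALEYEAR, SALEWEEK, QTY) are assumptions based on the pseudocode, and DB2 for i's ROW_NUMBER function stands in for the SETGT/READPE "latest available year/week" logic:

```sql
-- One statement replaces the SETGT/READPE/CHAIN/WRITE loop:
-- for each customer/item key, pick the most recent year/week row
-- from FILEB and insert it into FILEC.
INSERT INTO FILEC (CUSTNO, ITEMNO, SALEYEAR, SALEWEEK, QTY)
  SELECT CUSTNO, ITEMNO, SALEYEAR, SALEWEEK, QTY
  FROM (SELECT B.CUSTNO, B.ITEMNO, B.SALEYEAR, B.SALEWEEK, B.QTY,
               ROW_NUMBER() OVER (PARTITION BY B.CUSTNO, B.ITEMNO
                                  ORDER BY B.SALEYEAR DESC,
                                           B.SALEWEEK DESC) AS RN
        FROM FILEB B) T
  WHERE T.RN = 1;
```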

Also, if SQL is not an option, then look at the OVRDBF command and its SEQONLY parameter. Make the blocking factor as large as possible, e.g. SEQONLY(*YES 50000). There is also an NBRRCDS parameter on that command that can be useful.
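A minimal sketch of the override Bob describes, applied around the batch program (MYPGM is a placeholder name; FILEB is the input file from the original post):

```cl
/* Request sequential-only processing with a large blocking factor,  */
/* so records are moved to the program in blocks of up to 50000.     */
OVRDBF FILE(FILEB) SEQONLY(*YES 50000)
CALL   PGM(MYPGM)
DLTOVR FILE(FILEB)
```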

Posted by: Ringer
Premium member *
Comment on: Processing a LARGE number of rcds quickly??
Posted: 8 years 11 months 14 days 1 hours 23 minutes ago

When you use READE/READPE, the compiler disallows record blocking on that file. If you use SETLL/SETGT/CHAIN/READ/READP and then check for key breaks yourself (IF statements), it will run tons faster. And you need to add BLOCK(*YES) to those F-specs. Run the job, display the job, and you will see the I/O count go down dramatically if you did it right.
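Ringer's approach could be sketched in free-form RPG like this. FILEB is taken from the pseudocode in the original post; the field names CUSTNO and SAVCUST are assumptions for illustration:

```rpgle
       // BLOCK(*YES) asks the compiler to block reads. Plain READ plus a
       // manual key-break test keeps blocking eligible, where READE/READPE
       // would disallow it.
       dcl-f FILEB keyed block(*yes);
       dcl-s savCust like(custNo);

       setll *loval FILEB;
       read FILEB;
       dow not %eof(FILEB);
          if custNo <> savCust;            // key break detected by hand
             // ... process the completed key group here ...
             savCust = custNo;
          endif;
          read FILEB;
       enddo;
```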

Chris Ringer