Processing a LARGE number of rcds quickly??

Posted by: Chris Proctor

Programmer Analyst

Columbia Sports Company

Portland, OR

Processing a LARGE number of rcds quickly??
has no ratings.

Published: 01 Jun 2015
Revised: 05 Jun 2015 - 3247 days ago
Last viewed on: 23 Apr 2024 (3527 views)

Main URL:

Processing a LARGE number of rcds quickly?? Published by: Chris Proctor on 01 Jun 2015 view comments(3)

Good morning. I have a situation where we need to process a large number of rcds in a short period of time and I'm wondering if anyone might have been in this same situation and came up with a good solution. The program is simple, but it is currently processing approx 16M rcds. Here's a brief look at what the program needs to do.

READ FILEA

DOW NOT %EOF

SETGT <On key fields> FILEB

READPE FILEB <-- The reason this is needed is because the key includes year and week and all may not exist, so we need to grab the latest available

IF NOT %EOF

CHAIN <key> FILEC

IF %FOUND

ADD FIELDS TO TOTALS

UPDATE FILEC

ELSE

MOVE FIELDS AND TOTALS TO RCD

WRITE FILEC

ENDIF

READ FILEA

ENDDO

Any suggestions would be greatly appreciated. I was actually trying to think of ways to possibly break it down into multiple jobs being submitted, or something like that. I'd be interested to hear any other ideas. Thanks, all!!

Sort Ascend | Descend

COMMENTS

(Sign in to Post a Comment)

Posted by: starbuck5250

Premium member *

Comment on: Processing a LARGE number of rcds quickly??
Posted: 8 years 10 months 24 days 21 hours 17 minutes ago

Two generic concepts apply to all I/O optimisation:

1) Reduce the number of rows

2) Reduce the number of columns

One can reduce the number of rows in several ways:

1) Break the batch into pieces and run them in parallel

2) Process the rows at a different time. Instead of waiting to process 16M rows, process them one at a time, when they are written into that 16M row table, for instance. Triggers might help here.

3) Use a logical file that omits unnecessary (deleted / inactive / last year) rows.

Reducing the number of columns can be pretty easy. Imagine this 16M row file is a sales history file, and that it has customer, item, de!--script--ion, weight, serial number, quantity, price, extension, date sold. Now imagine that all you care about is totals by customer/item number. Every CHAIN brings in all the columns; only 3 are needed. Say the record length is 100 bytes. For 16M rows, you are moving 1.6B bytes of data from disk to your program. If you had a logical file which only contained cust, item, qty then a CHAIN to that file would need to move way fewer bytes - say 15 - per record. Now, moving through 16M rows would only require the transfer of 240M bytes, which is quite a reduction, and without a single change to the logic in your program.

Posted by: bobcozzi

Site Admin ****

Chicagoland

www.cozTools.com/qryfile

Comment on: Processing a LARGE number of rcds quickly??
Posted: 8 years 10 months 24 days 15 hours 7 minutes ago
Edited: Fri, 05 Jun, 2015 at 06:46:41 (3247 days ago)

Figure out the SQL statement that will do what you want and use it. It'll finish faster than your RPG program.

Also if SQL is not an option, then look at the OVRDBF command and SEQONLY parameter. Make that as large as possible SEQONLY(*YES 50000) There is also another NBRRCDS parameter somewhere in there, that can be useful.

Posted by: Ringer

Premium member *

Comment on: Processing a LARGE number of rcds quickly??
Posted: 8 years 10 months 20 days 20 hours 17 minutes ago

When you use READE/READPE, the compiler disallows record blocking on that table. If you use SETLL/SETGT/CHAIN/READ/READP and then check for key breaks yourself (IF stmts), it will run tons faster. And you need to add BLOCK(*YES) to those F-Specs. Run the job, dispay the job and you will see the I/O count go down dramatically if you did it right.

Chris Ringer