Midrange News for the IBM i Community


Posted by: Chris Proctor
Programmer Analyst
Columbia Sports Company
Portland, OR
Handling large amounts of data????
has no ratings.
Published: 26 May 2017
Revised: 14 Jul 2017 - 2450 days ago
Last viewed on: 28 Mar 2024 (2819 views) 

Using IBM i? Need to create Excel, CSV, HTML, JSON, PDF, SPOOL reports? Learn more about the fastest and least expensive tool for the job: SQL iQuery.

Handling large amounts of data???? Published by: Chris Proctor on 26 May 2017 view comments(6)

Good morning, all!

I've been tasked with trying to "speed up" the Order Inquiry process used by our customer service team here at Columbia Sportswear. As you can imagine, they need to be able to search on a number of different values, such as name, address, email, order number, etc. or even a wildcard of any of the values.

Obviously, the files I'm dealing with have millions of records which doesn't help speed up the process. Anyway, I was wondering if anyone here might have ideas that would help, other than the usual "create a logical or index" answer.

I'm not one to give up, so hopefully someone will have a "revolutionary" :-) idea to try! Lol

Thanks!

Return to midrangenews.com home page.
Sort Ascend | Descend

COMMENTS

(Sign in to Post a Comment)
Posted by: bobcozzi
Site Admin ****
Chicagoland
Comment on: Handling large amounts of data????
Posted: 6 years 10 months 3 days 18 hours 3 minutes ago

I assume you ran the tasks in Debug mode to get the SQL Optimizer reports and recommendations?

Can you suggest that the change out their Disk Drives for Solid State Drives? That would make a big, big difference. Also increase the main storage (i.e., RAM) in the partition will help, and of course there is the SQL Server Job capabilities when you're running lots of queries over and over, it does help keep some of the junk SQL needs around.

I had a similar issue with NAPA (auto parts) and basically created Indexes for a lot of things, but what actually helped was creating a VIEW with the primary Query view -- that is the SELECT statement used by the search engine -- I created a view that contained the mapping so that it didn't need to do that at runtime.

Posted by: Ringer
Premium member *
Comment on: Handling large amounts of data????
Posted: 6 years 10 months 1 hours 36 minutes ago

If you using SQL for searches, look at the INDEX ADVISOR in System i Navigator.

Start the navigator desktop app and click on: 

My Connections

Your system name

Databases

Your system name

Schemas

Right click on PF/Tables Library and select Index Advisor, Index Advisor 

Then look at the "Table" names and "Keys Advised" columns. You can sort a column by clicking on it. Create an index based on the suggestions and counts. 

Ringer 

 

Posted by: Ringer
Premium member *
Comment on: Handling large amounts of data????
Posted: 6 years 9 months 30 days 20 hours 45 minutes ago

(Not sure why you want us to exclude building indexes, that's usually the answer) 

Posted by: clbirk
Premium member *
Comment on: Handling large amounts of data????
Posted: 6 years 9 months 30 days 20 hours 11 minutes ago

ringer has a good point, why do you want to exclude building indexes because if you use the inav index advisor and you see a high number of inquiries with a partcular path, then it would be advisable to build that index.

From what I understand, lets say it shows you two of them:

cust#

cust# name

 

if you build the second one, SQE is smart enough to know that it can use that index to also achieve the first one.

 

Of course I think we are assuming that you are using SQL to do such and not the traditional setLL and reads, etc. to access...  Is this being done in RPG or in like php?

You said the files have millions and millions of records, but in reality when the file has 1000 records or 100,000,000, with the proper index on such, the access time should be nearly the same if the query engine is designed correctly. It goes back to the stone age of binary searches, etc. that was taught nearly 50 years ago.

Sure if you are reading all those records in the program, well... and so if somebody says give me all the people whose last name starts with S that could be a big group and so maybe limiting that search with a fetch only x number of records... might be helpful.

DB2 can be pretty fast.

If you haven't used the index advisor, put a few indexes on the file, clear out the advisor and start over again and see what builds.  For example, say that you have 200,000 uses where having a particular index would be helpful and another with just 3. Forget the 3 and do the 200,000.

I also think the version of i5/os you are on, might help you as they continue to refine SQE so that it does more and more of the queries and doesn't use CQE. I mean if you were on 5.4, a move to 7.3 well...

indexes are where you will see the most speed improvement as long as the resultant set isn't something silly like my example of all folks with the last name starting with S (where that might return several million).

 

chris

 

 

 

Posted by: SteveCCNJ
Premium member *
Egg Harbor Twp., NJ
Comment on: Handling large amounts of data????
Posted: 6 years 9 months 17 days 17 hours 14 minutes ago

The 2 things that most impact query performance are Record Selection and Result Sorting. 

How your SQL is structured, in indexes, in WHERE clauses, and in ON clauses have a direct influence on how the Query Mgr. arrives at your result set.  Put your most exclusionary terms first.  For example, if a table has a Y/N flag and about half the rows are Y or N, that would help the Query Mgr. cut out half the rows right at the start.  Lesser terms, such as a date range should be put last when the Mgr. has already cut out most of the other possible matches and can do the date search on a much reduced result set. 

The same thing goes for Sorting - GROUP BY - terms.  Match existing indexes or keys, or create new ones if necessary.

If your Selection or Sorting terms can match existing indexes or Primary Keys, your code will perform much better.  There is a huge performance difference between DB2 having to do a table scan versus being able to do an index scan to get your results.  Avoid table scans like they lead to the unemployment line.......      ;-)

 

 

Posted by: clbirk
Premium member *
Comment on: Handling large amounts of data????
Posted: 6 years 8 months 16 days 21 minutes ago

So what did you do?