Federated Database Performance (Part-2)

In part-1 of Federated Database Performance, we discussed what federation is, how it works and a few tips at the query level to improve the federated database performance.

In part-2, we will discuss few configuration parameters that will help us in performance tuning of the federated systems.

The performance of the query which includes data access on the remote databases mainly depends on the PUSHDOWN mechanism. Almost all parameters that IBM provides for improving the performance of the SHIP operation is dependent on this mechanism.

PUSHDOWN Mechanism:

Using pushdown mechanism, DB2 optimizer tries to push as many operations as possible to the remote database server so that filtering of data can be done at the remote server and only the required data is transferred to the federated server. So using pushdown, we can filter the data remotely instead of bringing all data to the local DB2 server and then perform filtering. This will improve the performance of the database system by reducing the network traffic and data transfer cost and time.

SYSCAT.SERVEROPTIONS view has some interesting parameters which can be helpful in federated environments. SYSCAT.SERVEROPTIONS view catalogs the federated server settings for each remote data source. Each row specifies the server-specific options and corresponding values.

Few basic and important parameters are discussed below.

1) PUSHDOWN

Setting this parameter to ‘Y’ will enable the filtering of data at the remote data source. It will push all the filtering and sorting operations to remote data source thereby reducing the network traffic. If this parameter is set to ‘N’, then DB2 optimizer will select all the data from remote data objects and predicates will be applied locally which will degrade the performance of the query.

Here is the sample example that I tried on my local setup.

SELECT * FROM ORA_EMPLOYEE
WHERE CUSTOMERID = ‘053141245124’;

Cardinality of ORA_EMPLOYEE table: 42 Millions
Original Execution Time: 29 secs

Then I set the the PUSHDOWN parameter to ‘Y’ using alter statement.

ALTER SERVER ora_server
OPTIONS (ADD PUSHDOWN ‘Y’);
 ALTER SERVER ora_server
                OPTIONS (ADD PUSHDOWN ‘Y’);

Final Execution Time: < 1 sec

This simple example proves the importance of this parameter in the federated environment.

2) DB2_MAXIMAL_PUSHDOWN

This parameter determines whether to use the execution plan based on cost , or to favor pushing down the maximum number of operations to the remote data source. Setting this parameter to ‘Y’ will choose the execution plan with the lowest number of SHIP operators.

Decision will not consider the statistical information of nicknames, nor will it consider the CPU_RATIO and IO_RATIO settings. That means this parameter will force optimizer to use the execution plans favoring pushdown mechanism. Most of the times these plans are the optimal execution plans but there can be exceptions.

SET SERVER OPTION DB2_MAXIMAL_PUSHDOWN
TO ‘Y’ FOR SERVER ora_server;

3) DB2_SELECT_SCALAR_SQ

During the initial stage of project on the federated environment, we had problems with some queries that had sub-queries in their SELECT list.

Consider the following example –

SELECT emp_name, (SELECT job_desc from job
    where emp_no =123) from ora_emp;

In such cases, if there is a remote object referred in the sub-queries, the DB2 optimizer ignores pushdown, and the performance of such queries is degraded. We had many such queries, used for reporting purpose, where we had serious performance bottlenecks. When we made a contact with the IBM Labs, they suggested setting this parameter to ‘Y’ which allowed pushdown of such queries thereby improving the performance of the system.

 SET SERVER OPTION db2_select_scalar_SQ TO ‘Y’
for server  ora_server;

NOTE: One needs to be careful while using this parameter since it can be prove dangerous and may send the unsupported SQL for pushdown thereby causing FAILURE.

4) DB2_SCALARFUNC_SCALAR_SQ

In a federated environment, when we want to use a scalar function that was created in the remote data source, we cannot use it directly. We need to create a function template in the local database and then map this template with definition of the function in the remote data source. Here are the commands for creating a function template and mapping the user with the remote function.

Creating the template for remote functions with no compatible counterpart –

 CREATE FUNCTION localFunc_tmpl(INTEGER)
RETURNS INTEGER AS TEMPLATE
DETERMINISTIC
NO EXTERNAL ACTION;

Mapping the template with remote function

CREATE FUNCTION MAPPING func_mapping
FOR localFunc_tmpl SERVER ORACLE_SERVER OPTIONS(REMOTE_NAME ‘RemoteFuncName’);

This parameter is used to tell the DB2 optimizer that sub-query is allowed as a parameter to the function. After creating the function template and mapping it with the remote function we can use that function in our queries. But in any case if we need to pass the parameter in the form of a sub-query, we need to set this parameter to ‘Y’ so that operation will not cause any FAILURE.

Consider the following sample query.

SELECT localFunc_tmpl
(select empno from ora_employee WHERE employee_name=‘PRASAD’);

To make such queries workable, we need to set DB2_SCALARFUNC_SCALAR_SQ to ‘Y’.

ALTER SERVER ORACLE_SERVER
OPTIONS(ADD DB2_SCALARFUNC_SCALAR_SQ ‘Y’);
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s