Oracle8(TM) Server Tuning
Release 8.0

A54638-01

Library

Product

Contents

Index

Prev Next

9
Tuning Distributed Queries

Oracle supports transparent distributed queries to access data from multiple databases. It also provides many other distributed features, such as transparent distributed transactions and a transparent fully automatic two-phase commit. This chapter explains how the Oracle8 optimizer decomposes SQL statements, and how this affects performance of distributed queries. The chapter provides guidelines on how to influence the optimizer and avoid performance bottlenecks.

Topics include:

Distributed Queries: Decomposition of SQL Statements

If a SQL statement references one or more remote tables, the optimizer first determines whether all remote tables are located at the same site. If all tables are located at the same remote site, Oracle sends the entire query to the remote site for execution. The remote site sends the resulting rows back to the local site. This is called a "remote-mapped statement". If the tables are located at more than one site, the optimizer decomposes the query into separate SQL statements to access each of the remote tables. The site where the query is executed, called the "driving site," is normally the local site.

This section describes:

Remote Data Dictionary Information

If a SQL statement references multiple tables, then the optimizer must determine which columns belong to which tables before it can decompose the SQL statement. For example, with

SELECT DNAME, ENAME 
FROM DEPT, EMP@REMOTE
WHERE DEPT.DEPTNO = EMP.DEPTNO

the optimizer must first determine that the DNAME column belongs to the DEPT table and the ENAME column to the EMP table. Once the optimizer has the data dictionary information of all remote tables it can build the decomposed SQL statements.

Column and table names in decomposed SQL statements appear between double quotes. Column and table names which contain special characters, reserved words or spaces must be enclosed by double quotes.

This mechanism also replaces an asterisk (*) in the select list with the actual column names. For example:

SELECT * FROM DEPT@REMOTE 

results in the decomposed SQL statement

SELECT A1."DEPTNO", A1."DNAME", A1."LOC" FROM "DEPT" A1 

Note: For the sake of simplicity double quotes are not used in the remainder of this chapter.

EXPLAIN PLAN and SQL Decomposition

EXPLAIN PLAN not only gives information about the overall execution plan of SQL statements, but also about the way in which the optimizer decomposes SQL statements. EXPLAIN PLAN stores information in the PLAN_TABLE table. If remote tables are used in a SQL statement the OPERATION column will contain the value REMOTE to indicate that a remote table is referenced, and the OTHER column will contain the decomposed SQL statement which will be sent to the remote database. For example:

EXPLAIN PLAN FOR SELECT DNAME FROM DEPT@REMOTE 
SELECT OPERATION, OTHER FROM PLAN_TABLE
OPERATION OTHER
--------- -------------------------------------
REMOTE SELECT A1."DNAME" FROM "DEPT" A1

Note the table alias and double quotes around the column and table names.

See Also: Chapter 21, "The EXPLAIN PLAN Command" for more information.

Accessing All Tables from the Same Remote Database

If the entire SQL statement is sent to the remote database, the optimizer will use table aliases A1, A2, and so on, for all tables and columns in the query, in order to avoid possible naming conflicts. For example:

SELECT DNAME, ENAME 
FROM DEPT@REMOTE, EMP@REMOTE
WHERE DEPT.DEPTNO = EMP.DEPTNO

is sent to the remote database as

SELECT A2.DNAME, A1.ENAME 
FROM DEPT A2, EMP A1
WHERE A1.DEPTNO = A2.DEPTNO

Accessing Tables from Different Databases

When executing a query that accesses data on one or more databases, one site will "drive" the execution of the query. This is known as the "driving site"; it is here that the data is joined, grouped and ordered. Additionally, there is a hint called DRIVING_SITE which forces the driving site to be a particular site

The decomposition of SQL statements is important since it determines the number of records or even tables which must be sent through the network. A knowledge of how the optimizer decomposes SQL statements can help you achieve optimum performance for distributed queries.

If a SQL statement references one or more remote tables, the optimizer must decompose the SQL statement into separate queries to be executed on the different databases. For example:

SELECT DNAME, ENAME 
FROM DEPT, EMP@REMOTE
WHERE DEPT.DEPTNO = EMP.DEPTNO

might be decomposed into

SELECT DEPTNO, DNAME FROM DEPT 

which is executed locally, and

SELECT DEPTNO, ENAME FROM EMP 

which is sent to the remote database. The data from both tables is joined locally. All this is done automatically and transparently for the user or application.

In some cases, however, it might be better to send the local table to the remote database and join the two tables on the remote database. This can be achieved either by creating a view, or by using the DRIVING_SITE hint.

If you decide to create a view on the remote database, a database link from the remote database to the local database is also needed.

For example (on the remote database):

CREATE VIEW DEPT_EMP AS 
SELECT DNAME, ENAME
FROM DEPT@LOCAL, EMP
WHERE DEPT.DEPTNO = EMP.DEPTNO

Then select from the remote view instead of the local and remote tables

SELECT * FROM DEPT_EMP@REMOTE 

Now the local DEPT table is sent through the network to the remote database, joined on the remote database with the EMP table, and the result is sent back to the local database.

See Also: "DRIVING_SITE" on page 8-25 for details about this hint.

Rule-based Optimization

Rule-based optimization does not have information about indexes for remote tables. It will never, therefore, generate a nested loops join between a local table and a remote table with the local table as the outer table in the join. It will use either a nested loops join with the remote table as the outer table or a sort merge join, depending on the indexes available for the local table.

Cost-based Optimization

Cost-based optimization can consider more execution plans than rule-based optimization. Cost-based optimization knows whether indexes on remote tables are available, and in which cases it would make sense to use them. Cost-based optimization considers index access of the remote tables as well as full table scans, whereas rule-based optimization only considers full table scans.

Which execution plan and table access cost-based optimization will choose depends on the table and index statistics. For example, with

SELECT DNAME, ENAME 
FROM DEPT, EMP@REMOTE
WHERE DEPT.DEPTNO = EMP.DEPTNO

the optimizer might choose the local DEPT table as the driving table and access the remote EMP table using an index. In that case the decomposed SQL statement becomes

SELECT ENAME FROM EMP WHERE DEPTNO = :1 

This decomposed SQL statement is used for a nested loops operation.

Using Views

If tables are on more than one remote site, it can be more effective to create a view than to use the DRIVING_SITE hint. If not all tables are on the same remote database the optimizer will access each remote table separately. For example:

SELECT D.DNAME, E1.ENAME, E2.JOB 
FROM DEPT D, EMP@REMOTE E1, EMP@REMOTE E2
WHERE D.DEPTNO = E1.DEPTNO
AND E1.MGR = E2.EMPNO

results in the decomposed SQL statements

SELECT EMPNO, ENAME FROM EMP 

and

SELECT ENAME, MGR, DEPTNO FROM EMP 

If you want to join the two EMP tables remotely, you can use a view to accomplish this. Create a view with the join of the remote tables on the remote database. For example (on the remote database):

CREATE VIEW EMPS AS 
SELECT E1.DEPTNO, E1.ENAME, E2.JOB
FROM EMP E1, EMP E2
WHERE E1.MGR = E2.EMPNO

and now select from the remote view instead of the remote tables:

SELECT D.DNAME, E.ENAME, E.JOB 
FROM DEPT D, EMPS@REMOTE E
WHERE D.DEPTNO = E.DEPTNO

This will result in the decomposed SQL statement

SELECT DEPTNO, ENAME, JOB FROM EMPS 

Using Hints

In a distributed query, all hints are supported for local tables. For remote tables, however, only join order and join operation hints can be used. (Hints for access methods, parallel hints, and so on, will have no effect.) For remote mapped queries, all hints are supported.

See Also: "Hints for Join Orders" on page 8-21.
"Hints for Join Operations" on page 8-22.

Transparent Gateways

The Transparent Gateways are used to access data from other data sources (relational databases, hierarchical databases, file systems, and so on). These data sources usually do not support all the functions and operators that Oracle8 supports. The Transparent Gateways therefore tell Oracle (at connect time) which functions and operators they do support. If the other data source does not support a function or operator, Oracle performs it. In this case Oracle obtains the data from the other data source and applies the function or operator locally. This affects the way in which the SQL statements are decomposed and can affect performance, especially if Oracle is not on the same machine as the other data source.

Distributed Query Restrictions

Distributed queries within the same version of Oracle have these restrictions:

Summary: Optimizing Performance of Distributed Queries

There are several ways to improve performance of distributed queries:

In many cases there are several SQL statements which give the same result; with all tables on the same database the difference in performance between these SQL statements might be minimal, but if the tables are located on different databases the difference in performance might be bigger.
Cost-based optimization can use indexes on remote tables and considers more execution plans than rule-based optimization and generally will give better results. With cost-based optimization performance of distributed queries is generally satisfactory. Only in rare occasions is it necessary to change SQL statements, create views or use procedural code.
Views can in some situations be used to improve performance of distributed queries; for example:
In some rare occasions it can be more efficient to replace a distributed query by procedural code, such as a PL/SQL procedure or a precompiler program; note that this option is only mentioned here for completeness, not because it is often needed.




Prev

Next
Oracle
Copyright © 1997 Oracle Corporation.

All Rights Reserved.

Library

Product

Contents

Index