Oracle8i Utilities
Release 8.1.5

A67792-01

Library

Product

Contents

Index

Prev Next

5
SQL*Loader Control File Reference

This chapter describes the SQL*Loader control file syntax. The following topics are included:

SQL*Loader's Data Definition Language (DDL)

SQL*Loader's Control File: Load Configuration

SQL*Loader's Control File: Loading Data

SQL*Loader's Control File: Loading Objects, LOBs, and Collections

SQL*Loader's Data Definition Language (DDL) Syntax Diagrams

You use SQL*Loader's data definition language (DDL) to control how SQL*Loader performs a data load into your database. You can also use DDL to manipulate the data you are loading.

The SQL*Loader Control File

The SQL*Loader control file is a repository that contains the DDL instructions that you have created to control where SQL*Loader will find the data to load, how SQL*Loader expects that data to be formatted, how SQL*Loader will be configured (memory management, rejecting records, interrupted load handling, etc.) as it loads the data, and how it will manipulate the data being loaded. You create the SQL*Loader control file and its contents using a simple text editor like vi, or xemacs.

The rest of this chapter explains how to use DDL to achieve your required data load.

SQL*Loader DDL Syntax Diagram Notation

The SQL*Loader DDL diagrams (sometimes called "railroad diagrams") used in this chapter to illustrate syntax use standard SQL syntax notation. For more information about the syntax notation used in this chapter, see the PL/SQL User's Guide and Reference or the preface in the Oracle8i SQL Reference.

See Control File Basics for more information

High-Level Syntax Diagrams

The following diagrams of DDL syntax are shown with certain clauses collapsed (position_spec, into_table clause, etc.). These diagrams are expanded and explained in more detail in Expanded DDL Syntax.

Options Clause


Load Statement


Note: The characterset specified does not apply to data in the control file.



infile_clause



os_file_proc_clause


Important: The syntax above is specific to the Unix platform. Please see your Oracle operating system-specific documentation for the syntax required by your platform.

concatenate_clause


into_table_clause








cond_spec
            

delim_spec


full_fieldname


Note: full_fieldname is the full name of a field specified using dot notation. If the field col2 is an attribute of a column object col1, when referring to col2 in one of the directives, you must use the notation col1.col2. The column_name and the full_fieldname referencing/naming the same entity can be different because column_name never includes the full name of the entity (no dot notation).

termination_spec



Note: Only fields which are loaded from a LOBFILE can be terminated by EOF.

enclosure_spec



Note: Fields terminated by EOF cannot be enclosed.

OID_spec


  

Note: If the table uses primary key OIDs instead of system-generated OIDs, do not specify an OID clause.

SID_spec
 

field_list
    


Note: The column_name and the fieldname referencing/naming the same entity can be different because column_name never includes the full name of the entity (no dot notation).

d_gen_fld_spec


REF_spec
 


Notes:

init_spec
 


Note: No field_condition can be based on fields in a secondary data file (SDF).

BFILE_spec



Note: The first argument to the BFILE directive contains the DIRECTORY OBJECT (the server_directory alias). The second argument contains the filename.

filler_fld_spec



Note: Conventional path loading does piecing when necessary. During direct path loads, piecing is done automatically, therefore, it is unnecessary to specify the PIECED keyword.

scalar_fld_spec



Note: Conventional path loading does piecing when necessary. During direct path loads, piecing is done automatically, therefore, it is unnecessary to specify the PIECED keyword. Note also that you cannot specify sql_string for LOB fields (regardless of whether LOBFILE_spec is specified).

LOBFILE_spec



Notes:

pos_spec



datatype_spec


col_obj_fld_spec


collection_fld_spec


nested_table_spec


Note: field_list cannot contain a collection_fld_spec

VARRAY_spec


Notes: A col_obj_spec nested within a VARRAY cannot contain a collection_fld_spec.

The <column_name> specified as part of the field_list must be the same as the <column_name> preceding the keyword VARRAY.

SDF_spec



Notes:

count_spec



Note: If a field is specified as the argument to the COUNT clause, that field must be mapped into the datafile data which is convertible to an integer (e.g. the string of characters "124").

Expanded DDL Syntax

Position Specification
pos_spec

A position specification (pos_spec) provides the starting location for a field and, optionally, the ending location. pos_spec syntax is:

The position must be surrounded by parentheses. The starting location can be specified as a column number, as * (next column), or *+n (next column plus an offset). The start and end locations can be separated with a colon (:) or a dash (-).

Field Condition
field_condition

A field condition compares a named field or an area of the record to a specified value. When the condition evaluates to true, the specified function is performed. For example, a true condition might cause the NULLIF function to insert a NULL data value, or cause DEFAULTIF to insert a default value. field_condition syntax is:

char_string and hex_string can be enclosed in either single quotation marks or double quotation marks. hex_string is a string of hexadecimal digits, where each pair of digits corresponds to one byte in the field. The BLANKS keyword allows you to test a field to see if it consists entirely of blanks. BLANKS is required when you are loading delimited data and you cannot predict the length of the field, or when you use a multi-byte character set that has multiple blanks.

There must not be any spaces between the operator and the operands. For example:

(1)='x'

is legal, while

(1) = 'x'

generates an error.

Column Name
column_name

The column name you specify in a field condition must be one of the columns defined for the input record. It must be specified with double quotation marks if its name is a reserved word. See Specifying Filenames and Objects Names for more details.

Precision vs. Length
precision
length

The precision of a numeric field is the number of digits it contains. The length of a numeric field is the number of byte positions on the record. The byte length of a ZONED decimal field is the same as its precision. However, the byte length of a (packed) DECIMAL field is (p+1)/2, rounded up, where p is the number's precision, because packed numbers contain two digits (or digit and sign) per byte.

Date Mask

The date mask specifies the format of the date value. For more information, see the DATE datatype.

Delimiter Specification
delimiter_spec

delimiter_spec can specify a termination delimiter, enclosure delimiters, or a combination of the two, as shown below:

Note: Only fields which are loaded from a LOB file can be TERMINATED BY EOF.

Note: Fields TERMINATED BY EOF cannot also be enclosed.

For more information, see Specifying Delimiters.

Control File Basics

The following sections describe the various DDL entries in the SQL*Loader control file and their function. All statements use the data definition language syntax described in the previous sections. The control file statements are presented in the approximate order they would appear in the control file.

Comments in the Control File

Comments can appear anywhere in the command section of the file, but they should not appear within the data. Precede any comment with two hyphens. For example,

--This is a Comment

All text to the right of the double hyphen is ignored, until the end of the line. Appendix Case 3: Loading a Delimited, Free-Format File shows comments in a control file.

Specifying Command-Line Parameters in the Control File

The OPTIONS statement is useful when you typically invoke a control file with the same set of options. The OPTION statement precedes the LOAD DATA statement.

OPTIONS

The OPTIONS parameter allows you to specify runtime arguments in the control file, rather than on the command line. The following arguments can be specified using the OPTIONS parameter. These arguments are described in greater detail in Chapter 6, "SQL*Loader Command-Line Reference".

SKIP = n 
LOAD = n 
ERRORS = n 
ROWS = n 
BINDSIZE = n 
SILENT = {FEEDBACK | ERRORS | DISCARDS | ALL} 
DIRECT = {TRUE | FALSE} 
PARALLEL = {TRUE | FALSE}

For example:

OPTIONS (BINDSIZE=100000, SILENT=(ERRORS, FEEDBACK) )

Note: Values specified on the command line override values specified in the OPTIONS statement in the control file.

Specifying Filenames and Objects Names

SQL*Loader follows the SQL standard for specifying object names (for example, table and column names). This section explains certain exceptions to that standard and how to specify database objects and filenames in the SQL*Loader control file that require special treatment. It also shows how the escape character is used in quoted strings.

Filenames that Conflict with SQL and SQL*Loader Reserved Words

SQL and SQL*Loader reserved words must be specified within double quotation marks. The reserved words most likely to be column names are:

COUNT     DATA     DATE     FORMAT 
OPTIONS   PART     POSITION 

So, if you had an inventory system with columns named PART, COUNT, and DATA, you would specify these column names within double quotation marks in your SQL*Loader control file. For example:

INTO TABLE inventory 
(partnum   INTEGER, 
"PART"     CHAR(15), 
"COUNT"    INTEGER, 
"DATA"     VARCHAR2(30)) 

See Appendix A, "SQL*Loader Reserved Words", for a complete list of reserved words.

You must use double quotation marks if the object name contains special characters other than those recognized by SQL ($, #, _), or if the name is case sensitive.

Specifying SQL Strings

You must specify SQL strings within double quotation marks. The SQL string applies SQL operators to data fields. See Applying SQL Operators to Fields for more information.

Restrictions

A control file entry cannot specify a SQL string for any field in the control file that uses a BFILE, SID, OID, or REF directive.

SQL strings cannot be used with column objects or collections, or attributes of column objects or collections.

Operating System Considerations

Specifying a Complete Path

If you encounter problems when trying to specify a complete pathname, it may be due to an operating system-specific incompatibility caused by special characters in the specification. In many cases, specifying the pathname within single quotation marks prevents errors.

If not, please see your operating system-specific documentation for possible solutions.

The Backslash Escape Character

In DDL syntax, you can place a double quotation mark inside a string delimited by double quotation marks by preceding it with the escape character, "\" (if the escape is allowed on your operating system). The same rule applies when single quotation marks are required in a string delimited by single quotation marks.

For example, homedir\data"norm\myfile contains a double quotation mark. Preceding the double quote with a backslash indicates that the double quote is to be taken literally:

INFILE 'homedir\data\"norm\mydata'

You can also put the escape character itself into a string by entering it twice:

For example:

"so'\"far"     or  'so\'"far'     is parsed as   so'"far 
"'so\\far'"    or  '\'so\\far\''  is parsed as  'so\far' 
"so\\\\far"    or  'so\\\\far'        is parsed as   so\\far 

Note: A double quote in the initial position cannot be escaped, therefore you should avoid creating strings with an initial quote.

Non-Portable Strings

There are two kinds of character strings in a SQL*Loader control file that are not portable between operating systems: filename and file processing option strings. When converting to a different operating system, these strings will likely need to be modified. All other strings in a SQL*Loader control file should be portable between operating systems.

Escaping the Backslash

If your operating system uses the backslash character to separate directories in a pathname and if the version of Oracle running on your operating system implements the backslash escape character for filenames and other non-portable strings, then you must specify double backslashes in your pathnames and use single quotation marks.

Additional Information: Please see your Oracle operating system-specific documentation for information about which escape characters are required or allowed.

Escape Character Sometimes Disallowed

The version of Oracle running on your operating system may not implement the escape character for non-portable strings. When the escape character is disallowed, a backslash is treated as a normal character, rather than as an escape character (although it is still usable in all other strings). Then pathnames such as:

INFILE 'topdir\mydir\myfile' 

can be specified normally. Double backslashes are not needed.

Because the backslash is not recognized as an escape character, strings within single quotation marks cannot be embedded inside another string delimited by single quotation marks. This rule also holds for double quotation marks: A string within double quotation marks cannot be embedded inside another string delimited by double quotation marks.

Identifying Data in the Control File with BEGINDATA

If your data is contained in the control file itself and not in a separate datafile, you must include it following the load configuration specifications.

Specify the BEGINDATA keyword before the first data record. The syntax is:

BEGINDATA data

BEGINDATA is used in conjunction with the INFILE keyword, as described by specifying INFILE *. Case 1: Loading Variable-Length Data provides an example.

Notes:

INFILE: Specifying Datafiles

You use the INFILE keyword to specify a datafile or datafiles fully followed by a file-processing options string. You can specify multiple files by using multiple INFILE keywords. You can also specify the datafile from the command line, using the DATA parameter described in Command-Line Keywords.

Note: the command-line parameter DATA overrides the INFILE keyword contained in the control file.

Naming the File

To specify a file that contains the data to be loaded, use the INFILE keyword, followed by the filename and optional processing options string. Remember that a filename specified on the command line overrides the first INFILE keyword in the control file. If no filename is specified, the filename defaults to the control filename with an extension or file type of DAT.

If the control file itself contains the data to be loaded, specify an asterisk (*). This specification is described in Identifying Data in the Control File with BEGINDATA .

Note: IDDN has been retained for compatibility with DB2.

where:

INFILE or INDDN  

(Use INDDN when DB2 compatibility is required.) This keyword specifies that a datafile specification follows.  

filename  

Name of the file containing the data.

Any spaces or punctuation marks in the filename must be enclosed in single quotation marks. See Specifying Filenames and Objects Names.  

*  

If your data is in the control file itself, use an asterisk instead of the filename. If you have data in the control file as well as datafiles, you must specify the asterisk first in order for the data to be read.  

processing_options  

This is the file-processing options string. It specifies the datafile format. It also optimizes datafile reads. See Specifying Datafile Format and Buffering.  

Specifying Multiple Datafiles

To load data from multiple datafiles in one SQL*Loader run, use an INFILE statement for each datafile. Datafiles need not have the same file processing options, although the layout of the records must be identical. For example, two files could be specified with completely different file processing options strings, and a third could consist of data in the control file.

You can also specify a separate discard file and bad file for each datafile. However, the separate bad files and discard files must be declared after each datafile name. For example, the following excerpt from a control file specifies four datafiles with separate bad and discard files:

INFILE  mydat1.dat  BADFILE  mydat1.bad  DISCARDFILE mydat1.dis 
INFILE  mydat2.dat 
INFILE  mydat3.dat  DISCARDFILE  mydat3.dis 
INFILE  mydat4.dat  DISCARDMAX  10 0

Examples

Data Contained In The Control File Itself


INFILE  *
Data Contained in File WHIRL with Default Extension .dat


INFILE  WHIRL 
Data in File datafile.dat: Full Path Specified


INFILE 'c:/topdir/subdir/datafile.dat' 

Note: Filenames that include spaces or punctuation marks must be enclosed in single quotation marks. For more details on filename specification, see Specifying Filenames and Objects Names.

Specifying READBUFFERS

The READBUFFERS keyword controls memory usage by SQL*Loader. This keyword is used for direct path loads only. For more information, Maximizing Performance of Direct Path Loads.

Specifying Datafile Format and Buffering

When configuring SQL*Loader, you can specify an operating system-dependent file processing options string in the control file to control file processing. You use this string to specify file format and buffering.

Additional Information: For details on the syntax of the file processing options string, see your Oracle operating system-specific documentation.

File Processing Example

For example, suppose that your operating system has the following option-string syntax:

where RECSIZE is the size of a fixed-length record, and BUFFERS is the number of buffers to use for asynchronous I/O.

To declare a file named MYDATA.DAT as a file that contains 80-byte records and instruct SQL*Loader to use eight I/O buffers, using this syntax you would use the following control file entry:

INFILE 'mydata.dat' "RECSIZE 80 BUFFERS 8" 

Note: This example uses the recommended convention of single quotation marks for filenames and double quotation marks for everything else. See Specifying Filenames and Objects Names for more details.

BADFILE: Specifying the Bad File

When SQL*Loader executes, it can create a file called a bad file or reject file in which it places records that were rejected because of formatting errors or because they caused Oracle errors. If you have specified that a bad file is to be created, the following applies:

Additional Information: On some systems a new version of the file is created if a file with the same name already exists. See your Oracle operating system-specific documentation to find out if this is the case on your system.

To specify the name of this file, use the BADFILE keyword (or BADDN for DB2 compatibility), followed by the bad file filename. If you do not specify a name for the bad file, the name defaults to the name of the datafile with an extension or file type of BAD. You can also specify the bad file from the command line with the BAD parameter described in Command-Line Keywords.

A filename specified on the command line is associated with the first INFILE or INDDN clause in the control file, overriding any bad file that may have been specified as part of that clause.

The bad file is created in the same record and file format as the datafile so that the data can be reloaded after corrections. The syntax is

where:

BADFILE or BADDN  

(Use BADDN when DB2 compatibility is required.) This keyword specifies that a filename for the badfile follows.  

bad_filename  

Any valid filename specification for your platform.

Any spaces or punctuation marks in the filename must be enclosed in single quotation marks. See Specifying Filenames and Objects Names.  

Examples

A bad file with filename UGH and default file extension or file type of .bad:

BADFILE UGH 

A bad file with filename BAD0001 and file extension or file type of .rej:

BADFILE BAD0001.REJ 
BADFILE '/REJECT_DIR/BAD0001.REJ' 

Rejected Records

A record is rejected if it meets either of the following conditions:

If the data can be evaluated according to the WHEN-clause criteria (even with unbalanced delimiters) then it is either inserted or rejected.

If a record is rejected on insert, then no part of that record is inserted into any table. For example, if data in a record is to be inserted into multiple tables, and most of the inserts succeed, but one insert fails, then all the inserts from that record are rolled back. The record is then written to the bad file, where it can be corrected and reloaded. Previous inserts from records without errors are not affected.

The log file indicates the Oracle error for each rejected record. Case 4: Loading Combined Physical Records demonstrates rejected records.

Note: During a multi-table load, SQL*Loader ensures that, if a row is rejected from one table, it is also rejected from all other tables. This is to ensure that the row can be repaired in the bad file and reloaded to all tables consistently. Also, if a row is loaded into one table, it should be loaded into all other tables which don't filter it out. Otherwise, reloading a fixed version of the row from the bad file could cause the data to be loaded into some tables twice.

Therefore, when SQL*Loader encounters the maximum number of errors allowed for a multi-table load, it continues loading rows to ensure that valid rows loaded into previous tables are either loaded into all tables or filtered out of all tables.

LOB Files and Secondary Data Files

Data from LOB files or secondary data files are not written to a bad file when there are rejected rows. If there is an error loading a LOB, the row is not rejected, rather the LOB field is left empty (not NULL with a length of zero (0) bytes).

Specifying the Discard File

During SQL*Loader execution, it can create a discard file for records that do not meet any of the loading criteria. The records contained in this file are called discarded records. Discarded records do not satisfy any of the WHEN clauses specified in the control file. These records differ from rejected records. Discarded records do not necessarily have any bad data. No insert is attempted on a discarded record.

A discard file is created according to the following rules:

To create a discard file, use any of the following syntax:

In a Control File     On the Command Line  

DISCARDFILE filename  

 

DISCARD  

DISCARDDN filename (DB2)  

 

DISCARDMAX  

DISCARDS  

 

 

DISCARDMAX  

 

 

Note that you can specify the discard file directly with a parameter specifying its name, or indirectly by specifying the maximum number of discards.

Specifying the Discard File in the Control-File

To specify the name of the file, use the DISCARDFILE or DISCARDDN (for DB2-compatibility) keyword, followed by the filename.

where:

DISCARDFILE or DISCARDDN  

(Use DISCARDDN when DB2 compatibility is required.) This keyword specifies that a discard filename follows.  

discard_filename  

Any valid filename specification for you platform.

Any spaces or punctuation marks in the filename must be enclosed in single quotation marks. See Specifying Filenames and Objects Names.  

The default filename is the name of the datafile, and the default file extension or file type is DSC. A discard filename specified on the command line overrides one specified in the control file. If a discard file with that name already exists, it is either overwritten or a new version is created, depending on your operating system.

The discard file is created with the same record and file format as the datafile. So it can easily be used for subsequent loads with the existing control file, after changing the WHEN clauses or editing the data.

Examples

A discard file with filename CIRCULAR and default file extension or file type of .dsc:

DISCARDFILE  CIRCULAR 

A discard file named notappl with the file extension or file type of .may:

DISCARDFILE NOTAPPL.MAY 

A full path to the discard file forget.me:

DISCARDFILE  '/DISCARD_DIR/FORGET.ME' 

Discarded Records

If there is no INTO TABLE keyword specified for a record, the record is discarded. This situation occurs when every INTO TABLE keyword in the SQL*Loader control file has a WHEN clause; and either the record fails to match any of them or all fields are null.

No records are discarded if an INTO TABLE keyword is specified without a WHEN clause. An attempt is made to insert every record into such a table. So records may be rejected, but none are discarded.

Case 4: Loading Combined Physical Records provides an example of using a discard file.

LOB Files and Secondary Data Files

Data from LOB files or secondary data files are not written to a discard file when there are discarded rows.

Limiting the Number of Discards

You can limit the number of records to be discarded for each datafile:

where n must be an integer. When the discard limit is reached, processing of the datafile terminates and continues with the next datafile, if one exists.

You can specify a different number of discards for each datafile. Alternatively, if the number of discards is only specified once, then the maximum number of discards specified applies to all files.

If you specify a maximum number of discards, but no discard filename, SQL*Loader creates a discard file with the default filename and file extension or file type. Case 4: Loading Combined Physical Records provides an example.

Using a Command-Line Parameter

You can specify the discard file from the command line, with the parameter DISCARDFILE described in Command-Line Keywords.

A filename specified on the command line overrides any bad file that you may have specified in the control file.

Handling Different Character Encoding Schemes

SQL*Loader supports different character encoding schemes (called character sets, or code pages). SQL*Loader uses Oracle's NLS (National Language Support) features to handle the various single-byte and multi-byte character encoding schemes available today. See the Oracle8i National Language Support Guide for information about supported character encoding schemes. The following sections provide a brief introduction to some of the supported schemes.

Multi-Byte (Asian) Character Sets

Multi-byte character sets support Asian languages. Data can be loaded in multi-byte format, and database objects (fields, tables, and so on) can be specified with multi-byte characters. In the control file, comments and object names may also use multi-byte characters.

Input Character Conversion

SQL*Loader has the capacity to convert data from the datafile character set to the database character set, when they differ.

When using a conventional path load, data is converted into the session character set specified by the NLS_LANG initialization parameter for that session. The data is then loaded using SQL INSERT statements. The session character set is the character set supported by your terminal.

During a direct path load, data converts directly into the database character set. The direct path load method, therefore, allows data in a character set that is not supported by your terminal to be loaded.

Note: When data conversion is required, it is essential that the target character set contains a representation of all characters that exist in the data. Otherwise, characters that have no equivalent in the target character set are converted to a default character, with consequent loss of data.

When using the direct path, load method the database character set should be a superset of, or equivalent to, the datafile character sets. Similarly, during a conventional path load, the session character set should be a superset of, or equivalent to, the datafile character sets.

The character set used in each input file is specified with the CHARACTERSET keyword.

CHARACTERSET Keyword

You use the CHARACTERSET keyword to specify to SQL*Loader which character set is used in each datafile. Different datafiles can be specified with different character sets. However, only one character set can be specified for each datafile.

Using the CHARACTERSET keyword causes character data to be automatically converted when it is loaded into the database. Only CHAR, DATE, and numeric EXTERNAL fields are affected. If the CHARACTERSET keyword is not specified, then no conversion occurs.

The CHARACTERSET syntax is:

CHARACTERSET character_set_spec 

where character_set_spec is the acronym used by Oracle to refer to your particular encoding scheme.

Additional Information: For more information on supported character sets, code pages, and the NLS_LANG parameter, see the Oracle8i National Language Support Guide.

Control File Characterset

The SQL*Loader control file itself is assumed to be in the character set specified for your session by the NLS_LANG parameter. However, delimiters and comparison clause values must be specified to match the character set in use in the datafile. To ensure that the specifications are correct, it may be preferable to specify hexadecimal strings, rather than character string values.

Any data included after the BEGINDATA statement is also assumed to be in the character set specified for your session by the NLS_LANG parameter. Data that uses a different character set must be in a separate file.

Loading into Empty and Non-Empty Tables

You can specify one of the following methods for loading tables:

Loading into Empty Tables

If the tables you are loading into are empty, use the INSERT option.

Loading into Non-Empty Tables

If the tables you are loading into already contain data, you have three options:

Warning: When the REPLACE or TRUNCATE keyword is specified, the entire table is replaced, not just individual rows. After the rows are successfully deleted, a commit is issued. You cannot recover the data that was in the table before the load, unless it was saved with Export or a comparable utility.

Note: This section corresponds to the DB2 keyword RESUME; users of DB2 should also refer to the description of RESUME in Appendix B, "DB2/DXT User Notes".

APPEND

If data already exists in the table, SQL*Loader appends the new rows to it. If data doesn't already exist, the new rows are simply loaded. You must have SELECT privilege to use the APPEND option. Case 3: Loading a Delimited, Free-Format File provides an example.

REPLACE

All rows in the table are deleted and the new data is loaded. The table must be in your schema, or you must have DELETE privilege on the table. Case 4: Loading Combined Physical Records provides an example.

The row deletes cause any delete triggers defined on the table to fire. If DELETE CASCADE has been specified for the table, then the cascaded deletes are carried out, as well. For more information on cascaded deletes, see the "Data Integrity" chapter of Oracle8i Concepts.

Updating Existing Rows

The REPLACE method is a table replacement, not a replacement of individual rows. SQL*Loader does not update existing records, even if they have null columns. To update existing rows, use the following procedure:

  1. Load your data into a work table.

  2. Use the SQL language UPDATE statement with correlated subqueries.

  3. Drop the work table.

For more information, see the "UPDATE" statement in Oracle8i SQL Reference.

TRUNCATE

Using this method, SQL*Loader uses the SQL TRUNCATE command to achieve the best possible performance. For the TRUNCATE command to operate, the table's referential integrity constraints must first be disabled. If they have not been disabled, SQL*Loader returns an error.

Once the integrity constraints have been disabled, DELETE CASCADE is no longer defined for the table. If the DELETE CASCADE functionality is needed, then the contents of the table must be manually deleted before the load begins.

The table must be in your schema, or you must have the DELETE ANY TABLE privilege.

Notes:

Unlike the SQL TRUNCATE option, this method re-uses a table's extents.

INSERT is SQL*Loader's default method. It requires the table to be empty before loading. SQL*Loader terminates with an error if the table contains rows. Case 1: Loading Variable-Length Data provides an example.

Continuing an Interrupted Load

If SQL*Loader runs out of space for data rows or index entries, the load is discontinued. (For example, the table might reach its maximum number of extents.) Discontinued loads can be continued after more space is made available.

State of Tables and Indexes

When a load is discontinued, any data already loaded remains in the tables, and the tables are left in a valid state. If the conventional path is used, all indexes are left in a valid state.

If the direct path load method is used, any indexes that run out of space are left in direct load state. They must be dropped before the load can continue. Other indexes are valid provided no other errors occurred. (See Indexes Left in Index Unusable State for other reasons why an index might be left in direct load state.)

Using the Log File

SQL*Loader's log file tells you the state of the tables and indexes and the number of logical records already read from the input datafile. Use this information to resume the load where it left off.

Dropping Indexes

Before continuing a direct path load, inspect the SQL*Loader log file to make sure that no indexes are in direct load state. Any indexes that are left in direct load state must be dropped before continuing the load. The indexes can then be re-created either before continuing or after the load completes.

Continuing Single Table Loads

To continue a discontinued direct or conventional path load involving only one table, specify the number of logical records to skip with the command-line parameter SKIP. If the SQL*Loader log file says that 345 records were previously read, then the command to continue would look like this:

SQLLDR USERID=scott/tiger CONTROL=FAST1.CTL DIRECT=TRUE SKIP=345

Continuing Multiple Table Conventional Loads

It is not possible for multiple tables in a conventional path load to become unsynchronized. So a multiple table conventional path load can also be continued with the command-line parameter SKIP. Use the same procedure that you would use for single-table loads, as described in the preceding paragraph.

Continuing Multiple Table Direct Loads

If SQL*Loader cannot finish a multiple-table direct path load, the number of logical records processed could be different for each table. If so, the tables are not synchronized and continuing the load is slightly more complex.

To continue a discontinued direct path load involving multiple tables, inspect the SQL*Loader log file to find out how many records were loaded into each table. If the numbers are the same, you can use the previously described simple continuation.

CONTINUE_LOAD

If the numbers are different, use the CONTINUE_LOAD keyword and specify SKIP at the table level, instead of at the load level. These statements exist to handle unsynchronized interrupted loads.

Instead of specifying:

LOAD DATA... 

at the start of the control file, specify:

SKIP

Then, for each INTO TABLE clause, specify the number of logical records to skip for that table using the SKIP keyword:

... 
INTO TABLE emp 
SKIP 2345 
... 
INTO TABLE dept 
SKIP 514 
... 

Combining SKIP and CONTINUE_LOAD

The CONTINUE_LOAD keyword is only needed after a direct load failure because multiple table loads cannot become unsynchronized when using the conventional path.

If you specify CONTINUE_LOAD, you cannot use the command-line parameter SKIP. You must use the table-level SKIP clause. If you specify LOAD, you can optionally use the command-line parameter SKIP, but you cannot use the table-level SKIP clause.

Assembling Logical Records from Physical Records

Since Oracle8i supports user-defined record sizes larger than 64k (see READSIZE (read buffer)), the need to fragment logical records into physical records is reduced. However, there may still be situations in which you may want to do so.

You can create one logical record from multiple physical records using one of the following two clauses, depending on your data:

CONCATENATE
CONTINUEIF 

CONCATENATE is appropriate in the simplest case, when SQL*Loader should always add the same number of physical records to form one logical record.

The syntax is:

CONCATENATE  n 

where n indicates the number of physical records to combine.

If the number of physical records to be continued varies, then CONTINUEIF must be used. The keyword CONTINUEIF is followed by a condition that is evaluated for each physical record, as it is read. For example, two records might be combined if there were a pound sign (#) in character position 80 of the first record. If any other character were there, the second record would not be added to the first. The full syntax for CONTINUEIF adds even more flexibility:

where:

THIS  

If the condition is true in the current record, then the next physical record is read and concatenated to the current physical record, continuing until the condition is false. If the condition is false, then the current physical record becomes the last physical record of the current logical record. THIS is the default.  

NEXT  

If the condition is true in the next record, then the current physical record is concatenated to the current record, continuing until the condition is false.  

pos_spec  

Specifies the starting and ending column numbers in the physical record.

Column numbers start with 1. Either a hyphen or a colon is acceptable (start-end or start:end).

If you omit end, the length of the continuation field is the length of the byte string or character string. If you use end, and the length of the resulting continuation field is not the same as that of the byte string or the character string, the shorter one is padded. Character strings are padded with blanks, hexadecimal strings with zeroes.  

LAST  

This test is similar to THIS but the test is always against the last non-blank character. If the last non-blank character in the current physical record meets the test, then the next physical record is read and concatenated to the current physical record, continuing until the condition is false. If the condition is false in the current record, then the current physical record is the last physical record of the current logical record.  

operator  

The supported operators are equal and not equal.

For the equal operator, the field and comparison string must match exactly for the condition to be true. For the not equal operator, they may differ in any character.  

char_string  

A string of characters to be compared to the continuation field defined by start and end, according to the operator. The string must be enclosed in double or single quotation marks. The comparison is made character by character, blank padding on the right if necessary.  

X'hex-string'  

A string of bytes in hexadecimal format used in the same way as the character string described above. X'1FB033 would represent the three bytes with values 1F, b), and 33 (hex).  

Note: The positions in the CONTINUEIF clause refer to positions in each physical record. This is the only time you refer to character positions in physical records. All other references are to logical records.

For CONTINUEIF THIS and CONTINUEIF NEXT, the continuation field is removed from all physical records before the logical record is assembled. This allows data values to span the records with no extra characters (continuation characters) in the middle. Two examples showing CONTINUEIF THIS and CONTINUEIF NEXT follow:

CONTINUEIF THIS			
CONTINUEIF NEXT 
(1:2) = '%%'  			(1:2) ='%%' 

Assume physical data records 12 characters long and that a period means a space:

%%aaaaaaaa....		..aaaaaaaa.... 
%%bbbbbbbb....		%%bbbbbbbb.... 
..cccccccc....		%%cccccccc.... 
%%dddddddddd..		..dddddddddd.. 
%%eeeeeeeeee..		%%eeeeeeeeee.. 
..ffffffffff..		%%ffffffffff.. 

The logical records would be the same in each case:

aaaaaaaa....bbbbbbbb....cccccccc.... 
dddddddddd..eeeeeeeeee..ffffffffff.. 

Notes:

Using CONTINUEIF

In the first example, you specify that if the current physical record (record1) has an asterisk in column 1. Then the next physical record (record2) should be appended to it. If record2 also has an asterisk in column 1, then record3 is appended also.

If record2 does not have an asterisk in column 1, then it is still appended to record1, but record3 begins a new logical record.

CONTINUEIF THIS (1) = "*" 

In the next example, you specify that if the current physical record (record1) has a comma in the last non-blank data column. Then the next physical record (record2) should be appended to it. If a record does not have a comma in the last column, it is the last physical record of the current logical record.

CONTINUEIF LAST = "," 

In the last example, you specify that if the next physical record (record2) has a "10" in columns 7 and 8. Then it should be appended to the preceding physical record (record1). If a record does not have a "10" in columns 7 and 8, then it begins a new logical record.

CONTINUEIF NEXT (7:8) = '10' 

Case 4: Loading Combined Physical Records provides an example of the CONTINUEIF clause.

Loading Logical Records into Tables

This section describes the way in which you specify:

Specifying Table Names

The INTO TABLE keyword of the LOAD DATA statement allows you to identify tables, fields, and datatypes. It defines the relationship between records in the datafile and tables in the database. The specification of fields and datatypes is described in later sections.

INTO TABLE

Among its many functions, the INTO TABLE keyword allows you to specify the table into which you load data. To load multiple tables, you include one INTO TABLE clause for each table you wish to load.

To begin an INTO TABLE clause, use the keywords INTO TABLE, followed by the name of the Oracle table that is to receive the data.

The table must already exist. The table name should be enclosed in double quotation marks if it is the same as any SQL or SQL*Loader keyword, if it contains any special characters, or if it is case sensitive.

INTO TABLE SCOTT."COMMENT"
INTO TABLE SCOTT."Comment" 
INTO TABLE SCOTT."-COMMENT" 

The user running SQL*Loader should have INSERT privileges on the table. Otherwise, the table name should be prefixed by the username of the owner as follows:

INTO TABLE SOPHIA.EMP 

Table-Specific Loading Method

The INTO TABLE clause may include a table-specific loading method (INSERT, APPEND, REPLACE, or TRUNCATE) that applies only to that table. Specifying one of these methods within the INTO TABLE clause overrides the global table-loading method. The global table-loading method is INSERT, by default, unless a different method was specified before any INTO TABLE clauses. For more information on these options, see Loading into Empty and Non-Empty Tables.

Table-Specific OPTIONS keyword

The OPTIONS keyword can be specified for individual tables in a parallel load. (It is only valid for a parallel load.) For more information, see Parallel Data Loading Models.

Choosing which Rows to Load

You can choose to load or discard a logical record by using the WHEN clause to test a condition in the record.

The WHEN clause appears after the table name and is followed by one or more field conditions.

For example, the following clause indicates that any record with the value "q" in the fifth column position should be loaded:

WHEN (5) = 'q' 

A WHEN clause can contain several comparisons provided each is preceded by AND. Parentheses are optional, but should be used for clarity with multiple comparisons joined by AND. For example

WHEN (DEPTNO = '10') AND (JOB = 'SALES') 

To evaluate the WHEN clause, SQL*Loader first determines the values of all the fields in the record. Then the WHEN clause is evaluated. A row is inserted into the table only if the WHEN clause is true.

Field conditions are discussed in detail in Specifying Field Conditions. Case 5: Loading Data into Multiple Tables provides an example of the WHEN clause.

Using The WHEN Clause with LOB Files and Secondary Data Files

If a WHEN directive fails on a record, that record is discarded (skipped). Note also that, the skipped record is assumed to be contained completely in the main datafile, therefore, a secondary data file will not be affected if present.

Specifying Default Data Delimiters

If all data fields are terminated similarly in the datafile, you can use the FIELDS clause to indicate the default delimiters. The syntax is:

Note: Terminators are strings not limited to a single character. Also, TERMINATED BY EOF applies only to loading LOBs from SSDFs.

Note: Enclosure strings do not have to be a single character.

You can override the delimiter for any given column by specifying it after the column name. Case 3: Loading a Delimited, Free-Format File provides an example. See Specifying Delimiters for more information on delimiter specification.

Handling Short Records with Missing Data

When the control file definition specifies more fields for a record than are present in the record, SQL*Loader must determine whether the remaining (specified) columns should be considered null or whether an error should be generated.

If the control file definition explicitly states that a field's starting position is beyond the end of the logical record, then SQL*Loader always defines the field as null. If a field is defined with a relative position (such as DNAME and LOC in the example below), and the record ends before the field is found; then SQL*Loader could either treat the field as null or generate an error. SQL*Loader uses the presence or absence of the TRAILING NULLCOLS clause to determine the course of action.

TRAILING NULLCOLS

TRAILING NULLCOLS tells SQL*Loader to treat any relatively positioned columns that are not present in the record as null columns.

For example, if the following data

10 Accounting 

is read with the following control file

INTO TABLE dept 
    TRAILING NULLCOLS 
( deptno CHAR TERMINATED BY " ", 
  dname  CHAR TERMINATED BY WHITESPACE, 
  loc    CHAR TERMINATED BY WHITESPACE 
) 

and the record ends after DNAME. The remaining LOC field is set to null. Without the TRAILING NULLCOLS clause, an error would be generated due to missing data.

Case 7: Extracting Data from a Formatted Report provides an example of TRAILING NULLCOLS.

Index Options

This section describes the SQL*Loader options that control how index entries are created.

SORTED INDEXES Option

The SORTED INDEXES option applies to direct path loads. It tells SQL*Loader that the incoming data has already been sorted on the specified indexes, allowing SQL*Loader to optimize performance. Syntax for this feature is given in High-Level Syntax Diagrams. Further details are in SORTED INDEXES Statement.

SINGLEROW Option

The SINGLEROW option is intended for use during a direct path load with APPEND on systems with limited memory, or when loading a small number of rows into a large table. This option inserts each index entry directly into the index, one row at a time.

By default, SQL*Loader does not use SINGLEROW when APPENDing rows to a table. Instead, index entries are put into a separate, temporary storage area and merged with the original index at the end of the load. This method achieves better performance and produces an optimal index, but it requires extra storage space. During the merge, the original index, the new index, and the space for new entries all simultaneously occupy storage space.

With the SINGLEROW option, storage space is not required for new index entries or for a new index. The resulting index may not be as optimal as a freshly sorted one, but it takes less space to produce. It also takes more time, since additional UNDO information is generated for each index insert. This option is suggested for use when:

Specifying Field Conditions

A field condition is a statement about a field in a logical record that evaluates as true or false. It is used in the NULLIF and DEFAULTIF clauses, as well as in the WHEN clause.

A field condition is similar to the condition in the CONTINUEIF clause, with two important differences. First, positions in the field condition refer to the logical record, not to the physical record. Second, you may specify either a position in the logical record or the name of a field that is being loaded.

where:

start  

Specifies the starting position of the comparison field in the logical record.  

end  

Specifies the ending position of the comparison field in the logical record. Either start-end or start:end is acceptable, If you omit end the length of the field is determined by the length of the comparison string. If the lengths are different, the shorter field is padded. Character strings are padded with blanks, hexadecimal strings with zeroes.  

full_field_name  

full_field_name is the full name of a field specified using dot notation. If the field col2 is an attribute of a column object col1, when referring to col2 in one of the directives, you must use the notation col1.col2. The column_name and the fieldname referencing/naming the same entity can be different because column_name never includes the full name of the entity (no dot notation).  

operator  

A comparison operator for either equal or not equal.  

char_string  

A string of characters enclosed in single or double quotes that is compared to the comparison field. If the comparison is true, the current row is inserted into the table.  

X'hex_string'  

A byte string in hexadecimal format that is used in the same way as char_string, described above.  

BLANKS  

A keyword denoting an arbitrary number of blanks. See below.  

Comparing Fields to BLANKS

The BLANKS keyword makes it possible to determine easily if a field of unknown length is blank.

For example, use the following clause to load a blank field as null:

full_field_name ... NULLIF column_name=BLANKS 

The BLANKS keyword only recognizes blanks, not tabs. It can be used in place of a literal string in any field comparison. The condition is TRUE whenever the column is entirely blank.

The BLANKS keyword also works for fixed-length fields. Using it is the same as specifying an appropriately-sized literal string of blanks. For example, the following specifications are equivalent:

fixed_field CHAR(2) NULLIF (fixed_field)=BLANKS 
fixed_field CHAR(2) NULLIF (fixed_field)="  " 

Note: There can be more than one "blank" in a multi-byte character set. It is a good idea to use the BLANKS keyword with these character sets instead of specifying a string of blank characters.

The character string will match only a specific sequence of blank characters, while the BLANKS keyword will match combinations of different blank characters. For more information on multi-byte character sets, see Multi-Byte (Asian) Character Sets.

Comparing Fields to Literals

When a data field is compared to a shorter literal string, the string is padded for the comparison; character strings are padded with blanks; for example:

NULLIF (1:4)="_" 

compares the data in position 1:4 with 4 blanks. If position 1:4 contains 4 blanks, then the clause evaluates as true.

Hexadecimal strings are padded with hexadecimal zeroes. The clause

NULLIF (1:4)=X'FF' 

compares position 1:4 to hex 'FF000000'.

Specifying Columns and Fields

You may load any number of a table's columns. Columns defined in the database, but not specified in the control file, are assigned null values (this is the proper way to insert null values).

A column specification is the name of the column, followed by a specification for the value to be put in that column. The list of columns is enclosed by parentheses and separated with commas as follows:

( columnspec,  columnspec,  ... ) 

Each column name must correspond to a column of the table named in the INTO TABLE clause. A column name must be enclosed in quotation marks if it is a SQL or SQL*Loader reserved word, contains special characters, or is case sensitive.

If the value is to be generated by SQL*Loader, the specification includes the keyword RECNUM, the SEQUENCE function, or the keyword CONSTANT. See Generating Data.

If the column's value is read from the datafile, the data field that contains the column's value is specified. In this case, the column specification includes a column name that identifies a column in the database table, and a field specification that describes a field in a data record. The field specification includes position, datatype, null restrictions, and defaults.

It is not necessary to specify all attributes when loading column objects. Any missing attributes will be set to NULL.

Specifying Filler Fields

Filler fields have names but they are not loaded into the table. However, filler fields can be used as arguments to init_specs (foe example, NULLIF and DEFAULTIF) as well as to directives (for example, SID, OID, REF, BFILE). Also, filler fields can occur anyplace in the data file. They can be inside of the field list for an object or inside the definition of a VARRAY. See New SQL*Loader DDL Behavior and Restrictions for more information on filler fields and their use.

The following an sample filler field specification:

 field_1_count FILLER char,
 field_1 varray count(field_1_count)
 (
    filler_field1  char{2},
    field_1  column object
    (
      attr1 char(2),
      filler_field2  char(2),
      attr2 char(2),
    )
    filler_field3  char(3),
 )
 filler_field4 char(6)

Specifying the Datatype of a Data Field

A field's datatype specification tells SQL*Loader how to interpret the data in the field. For example, a datatype of INTEGER specifies binary data, while INTEGER EXTERNAL specifies character data that represents a number. A CHAR field, however, can contain any character data.

You may specify one datatype for each field; if unspecified, CHAR is assumed.

SQL*Loader Datatypes describes how SQL*Loader datatypes are converted into Oracle datatypes and gives detailed information on each SQL*Loader's datatype.

Before the datatype is specified, the field's position must be specified.

Specifying the Position of a Data Field

To load data from the datafile SQL*Loader must know a field's location and its length. To specify a field's position in the logical record, use the POSITION keyword in the column specification. The position may either be stated explicitly or relative to the preceding field. Arguments to POSITION must be enclosed in parentheses, as follows:

where:

start  

The starting column of the data field in the logical record. The first character position in a logical record is 1.  

end  

The ending position of the data field in the logical record. Either start-end or start:end is acceptable. If you omit end, the length of the field is derived from the datatype in the datafile. Note that CHAR data specified without start or end is assumed to be length 1. If it is impossible to derive a length from the datatype, an error message is issued.  

*  

Specifies that the data field follows immediately after the previous field. If you use * for the first data field in the control file, that field is assumed to be at the beginning of the logical record. When you use * to specify position, the length of the field is derived from the datatype.  

+n  

You can use an on offset, specified as +n, to offset the current field from the previous field. A number of characters as specified by n are skipped before reading the value for the current field.  

You may omit POSITION entirely. If you do, the position specification for the data field is the same as if POSITION(*) had been used.

For example

ENAME  POSITION (1:20)  CHAR 
EMPNO  POSITION (22-26) INTEGER EXTERNAL 
ALLOW  POSITION (*+2)   INTEGER EXTERNAL TERMINATED BY "/" 

Column ENAME is character data in positions 1 through 20, followed by column EMPNO, which is presumably numeric data in columns 22 through 27. Column ALLOW is offset from the end of EMPNO by +2. So it starts in column 29 and continues until a slash is encountered.

Using POSITION with Data Containing TABs

When you are determining field positions, be alert for TABs in the datafile. The following situation is highly likely when using SQL*Loader's advanced SQL string capabilities to load data from a formatted report:

These kinds of errors occur when the data contains TABs. When printed, each TAB expands to consume several columns on the paper. In the datafile, however, each TAB is still only one character. As a result, when SQL*Loader reads the datafile, the POSITION specifications are wrong.

To fix the problem, inspect the datafile for tabs and adjust the POSITION specifications, or else use delimited fields.

The use of delimiters to specify relative positioning of fields is discussed in detail in Specifying Delimiters. Especially note how the delimiter WHITESPACE can be used.

Using POSITION with Multiple Table Loads

In a multiple table load, you specify multiple INTO TABLE clauses. When you specify POSITION(*) for the first column of the first table, the position is calculated relative to the beginning of the logical record. When you specify POSITION(*) for the first column of subsequent tables, the position is calculated relative to the last column of the last table loaded.

Thus, when a subsequent INTO TABLE clause begins, the position is not set to the beginning of the logical record automatically. This allows multiple INTO TABLE clauses to process different parts of the same physical record. For an example, see the second example in Extracting Multiple Logical Records.

A logical record may contain data for one of two tables, but not both. In this case, you would reset POSITION. Instead of omitting the position specification or using POSITION(*+n) for the first field in the INTO TABLE clause, use POSITION(1) or POSITION(n).

Examples

SITEID  POSITION (*) SMALLINT 
SITELOC POSITION (*) INTEGER 

If these were the first two column specifications, SITEID would begin in column1, and SITELOC would begin in the column immediately following.

ENAME  POSITION (1:20)  CHAR 
EMPNO  POSITION (22-26) INTEGER EXTERNAL 
ALLOW  POSITION (*+2)   INTEGER EXTERNAL TERMINATED BY "/" 

Column ENAME is character data in positions 1 through 20, followed by column EMPNO which is presumably numeric data in columns 22 through 26. Column ALLOW is offset from the end of EMPNO by +2, so it starts in column 28 and continues until a slash is encountered.

Using Multiple INTO TABLE Statements

Multiple INTO TABLE statements allow you to:

In the first case, it is common for the INTO TABLE statements to refer to the same table. This section illustrates the different ways to use multiple INTO TABLE statements and shows you how to use the POSITION keyword.

Note: A key point when using multiple INTO TABLE statements is that field scanning continues from where it left off when a new INTO TABLE statement is processed. The remainder of this section details important ways to make use of that behavior. It also describes alternative ways using fixed field locations or the POSITION keyword.

Extracting Multiple Logical Records

Some data storage and transfer media have fixed-length physical records. When the data records are short, more than one can be stored in a single, physical record to use the storage space efficiently.

In this example, SQL*Loader treats a single physical record in the input file as two logical records and uses two INTO TABLE clauses to load the data into the EMP table. For example, if the data looks like

1119 Smith      1120 Yvonne 
1121 Albert     1130 Thomas 

then the following control file extracts the logical records: