ETL and Data Transfer utilities in Teradata
This article introduces the ETL and data transfer utilities in Teradata.
Teradata Database provides several utilities for loading data into tables and exporting data from them. The commonly used ones are listed below.
- FastLoad
- MultiLoad
- FastExport
- TPT (Teradata Parallel Transporter)
Each of these utilities has its own characteristics and restrictions, so you should choose the one that best fits the requirement at hand.
Apart from these utilities, Teradata Data Mover copies data from one system to another, and the backup/restore utilities (ARC and DSC) can be used for the same purpose.
This article focuses mainly on the FastLoad utility.
FastLoad is used to load large amounts of data into empty tables.
What sets FastLoad apart is the very high speed at which it loads large volumes of data.
FastLoad does not use the transient journal, which is one reason it is so fast. It also discards duplicate rows, even when the target table is a MULTISET table.
A single FastLoad job can load only one table at a time.
How it works:
- In Phase 1, the parsing engines read the records from the input file and send a block to each AMP.
- Each AMP stores the blocks of records it receives.
- The AMPs then hash each record and redistribute it to the AMP that owns it.
- At the end of Phase 1, each AMP has its rows, but they are not yet in row hash sequence.
- Phase 2 starts when FastLoad receives the END LOADING statement.
- Each AMP sorts its records on row hash and writes them to disk.
- The locks on the target table are released and the error tables are dropped if they are empty.
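The two phases can be pictured with a small simulation. This is only a sketch under stated assumptions: the AMP count, the block size, and the use of zlib.crc32 as a stand-in for Teradata's internal row-hash function are all hypothetical, and real AMPs work in parallel rather than in a loop.

```python
import zlib
from collections import defaultdict

NUM_AMPS = 4  # hypothetical AMP count, chosen only for the illustration


def row_hash(row, key_index=0):
    """Hash the primary-index value of a row.

    zlib.crc32 is a stand-in; it is NOT Teradata's row-hash function.
    """
    return zlib.crc32(str(row[key_index]).encode("utf-8"))


def owner_amp(row):
    """Pick the AMP that owns a row from its hash (simplified mapping)."""
    return row_hash(row) % NUM_AMPS


def phase1(records, block_size=2):
    """Deal blocks of records to the AMPs, then redistribute rows by hash."""
    received = defaultdict(list)
    for start in range(0, len(records), block_size):
        amp = (start // block_size) % NUM_AMPS
        received[amp].extend(records[start:start + block_size])

    owned = defaultdict(list)
    for rows in received.values():
        for row in rows:
            owned[owner_amp(row)].append(row)
    return owned  # rows sit on the right AMP but are still unsorted


def phase2(owned):
    """Sort each AMP's rows by row hash, dropping exact duplicate rows."""
    table = {}
    for amp, rows in owned.items():
        table[amp] = sorted(set(rows), key=lambda r: (row_hash(r), r))
    return table


if __name__ == "__main__":
    # Hypothetical records: (employee number, joined date)
    sample = [("101", "2005-03-27"), ("102", "2006-04-25"),
              ("103", "2007-03-21"), ("101", "2005-03-27")]
    for amp, rows in sorted(phase2(phase1(sample)).items()):
        print(amp, rows)
```

The sketch keeps the split the list describes: phase1 only places rows on their owning AMP, and all sorting (and the removal of the repeated row) is deferred to phase2.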
Below is a sample FastLoad script that loads data from employee.txt into the Employee_Stg table:
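/* Logon values are placeholders; target column names are assumed from the in_ field names. */
LOGON tdpid/username,password;
DATABASE tduser;
BEGIN LOADING tduser.Employee_Stg
   ERRORFILES Employee_ET, Employee_UV;
SET RECORD VARTEXT ",";
DEFINE in_EmployeeNo (VARCHAR(10)),
   in_BirthDate (VARCHAR(10)),
   in_JoinedDate (VARCHAR(10)),
   FILE = employee.txt;
INSERT INTO Employee_Stg (EmployeeNo, BirthDate, JoinedDate)
VALUES (:in_EmployeeNo,
   :in_BirthDate (FORMAT 'YYYY-MM-DD'),
   :in_JoinedDate (FORMAT 'YYYY-MM-DD'));
END LOADING;
LOGOFF;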
Sample input text file for the above script:
How to execute a FastLoad job:
Once the input file employee.txt has been created and the FastLoad script has been saved as EmployeeLoad.fl, you can run the script with the following command on UNIX and Windows.
fastload < EmployeeLoad.fl
When the command is executed, the FastLoad script runs and produces a log. The log shows the number of records FastLoad processed and the status code.
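If the job has to be started from a program rather than from the shell, the same redirection can be reproduced with Python's subprocess module. The following is a minimal sketch: the function name run_fastload and its command parameter are hypothetical, and it assumes the fastload executable is on the PATH.

```python
import subprocess


def run_fastload(script_path, command=("fastload",)):
    """Run the equivalent of `fastload < script_path`.

    Returns (return_code, log_text). The `command` parameter exists only
    so that a different executable can be substituted, for example in tests.
    """
    with open(script_path, "rb") as script:
        completed = subprocess.run(
            list(command),
            stdin=script,              # feed the script on standard input
            stdout=subprocess.PIPE,    # capture the log
            stderr=subprocess.STDOUT,  # keep error output in the same log
            text=True,
        )
    return completed.returncode, completed.stdout
```

Calling run_fastload("EmployeeLoad.fl") returns the status code together with the log text, so the caller can store the log or search it for the record counts.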
FastLoad terminology:
The following commands are commonly used in a FastLoad script.
LOGON − Logs into Teradata and initiates one or more sessions.
DATABASE − Sets the default database.
BEGIN LOADING − Identifies the table to be loaded.
ERRORFILES − Identifies the two error tables that need to be created or updated.
CHECKPOINT − Defines how often a checkpoint is taken.
SET RECORD − Specifies whether the input file is formatted, binary, text, or unformatted.
DEFINE − Defines the input file layout.
FILE − Specifies the input file name and path.
INSERT − Inserts the records from the input file into the target table.
END LOADING − Initiates Phase 2 of FastLoad, in which the sorted rows are written to the target table.
LOGOFF − Ends all sessions and terminates FastLoad.
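As a quick sanity check before submitting a job, a script can be scanned for these commands. The sketch below is a naive illustration, not a FastLoad parser: treating the six commands in REQUIRED_ORDER as the minimum for a load job is an assumption, and splitting on semicolons will misread a script whose field delimiter or password contains one.

```python
import re

# Assumption of this sketch: these six commands, in this order, form the
# minimum skeleton of a load job.
REQUIRED_ORDER = ["LOGON", "BEGIN LOADING", "DEFINE", "INSERT",
                  "END LOADING", "LOGOFF"]


def find_commands(script_text):
    """Return the known commands that start a statement, in script order."""
    # Remove /* ... */ comments, then split into statements on ';'.
    without_comments = re.sub(r"/\*.*?\*/", " ", script_text, flags=re.S)
    found = []
    for statement in without_comments.split(";"):
        words = " ".join(statement.split()).upper()
        for command in REQUIRED_ORDER:
            if words == command or words.startswith(command + " "):
                found.append(command)
                break
    return found


def check_script(script_text):
    """Return (missing_commands, commands_are_in_expected_order)."""
    found = find_commands(script_text)
    missing = [c for c in REQUIRED_ORDER if c not in found]
    expected = [c for c in REQUIRED_ORDER if c in found]
    return missing, found == expected
```

A script that passes this check can still fail at run time; the check only catches a missing or misplaced command.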
Limitations of FastLoad:
FastLoad does not support target tables that have secondary indexes, join indexes, or foreign key references.