NOPI( NO PRIMARY INDEX) table in Teradata

Share via:

NOPI( NO PRIMARY INDEX) table in Teradata 

Dear Readers,

In this article, we will learn about No Primary key indexes in Teradata

  • As the name suggests, No primary index table means there is no primary index defined on Teradata table.
  • This feature has been introduced in V2R13 release.
  • The main purpose of this feature which allows us to create tables with no primary index in Teradata is to improve           the performance of FastLoad and Tpump  array insert data loading operations.
  • Organizing/sorting rows based on row hash is therefore avoided, which speeds up the load operation.
  • NoPI tables are typically used as staging tables for initial load by FastLoad or TPump Array Inserts, then an           Insert/Select is used to apply the updates to a populated table.
  • NO PRIMARY INDEX is specified in the CREATE TABLE syntax.

Syntax for NOPI table:

NO PRIMARY INDEX;

In this case table ktexperts will act as a source table and it can be a PI table or NOPI table.

If a primary key or a UNIQUE constraint is defined when creating the NOPI table, Teradata creates a unique secondary index to ensure uniqueness.

Data Distribution:

  • As there is no primary index involved in NOPI table, its rows are not hashed to AMP based on primary index value. Teradata database, either hashes on query id or use some different algorithm to assign the rows to its destination AMP.
  • Once the row reaches to the destination AMP, Teradata generates a ROWID for it by selecting a random hash bucket which is AMP local.
  • Within the AMP, there are no row-ordering constraints and as rows are inserted into a NoPI table, rows are always appended at the end of the table as if it were a spool table and never inserted in the middle of a hash sequence.
  • Each row in a NoPI table has a hash bucket value that is internally generated.
  • A NoPI table is internally treated as a hashed table; it is just that typically all the rows on one AMP will have the same hash bucket value.

Changing NOPI table to PI table:

Once NOPI table is created, we can change it to PI table in two ways

Solution 1:

Note :Ktexperts is a NOPI table.

Solution 2:

Create a new table with PI and execute INSERT-SELECT from NOPI table to PI table.

Note :Ktexperts is a NOPI table.

Benefits of NOPI table:

  • A NoPI table will reduce skew in intermediate ETL tables which have no natural PI.
  • Loads (FastLoad and TPump array insert) into a NoPI staging table are faster.
  • You can use NoPI tables as intermediate worktables where PI access is not a consideration. The system automatically balances the distribution of NoPI tables across AMPs.

NoPI tables are designed for these circumstances:

  • Temporary use
  • When you have not yet determined a suitable primary index
  • Column-partitioning (which cannot be done on tables with a primary index)

In all other circumstances, Teradata recommends that you specify a primary index when you create a table.

  • Applications can also be benefited by using NoPI tables in the following ways:

–>As a log file

–>As a sandbox table to store data until an appropriate indexing method is determined

  • A query that accesses the data in a NoPI table results in a full-table scan unless you define a secondary index on the NoPI table and use the columns that are indexed in the query.

How are the rows of NOPI tables distributed across the AMPs?

The rows are distributed evenly but randomly across all AMPs. Rows can therefore not be found without the presence of a secondary index.

How can Teradata mark the rows of a NOPI table if no hashing is used?

  • For a NOPI table, a full table scan is always necessary if there is no other index that can be used. The AMP on which the searched rows are located is not known.
  • To identify the rows within the data blocks of the AMP a ROWID is needed. While for PI tables this is composed of the ROWHASH and a UNIQUENESS value, for NOPI tables the hash bucket of the AMP and a UNIQUENESS value is used to form the ROWID.
  • Since NOPI tables are a design change that was only introduced in one of the newer releases of Teradata, it was decided to use the bytes usually being the ROWHASH of PI tables to extend the value range of the UNIQUENESS value for NOPI tables.

How does Teradata determine which AMP gets which NOPI row?

  • A random generator is used. This is available on the AMPs as well as on the parsing engines.
  • The AMP random generator is used for fast load and the random generator on the parsing enginefor SQL requests.
  • During fast load, the random generator of the AMPs distributes the rows in the round-robin procedure.
  • For SQL requests, the DBQL QueryId of the request is used to generate a hash value which is used to distribute the rows.

Can NOPI tables never be skewed?

  • NOPI tables can also be skewed, but only in certain cases.
  • If a small table is loaded with a fast load, skew inevitably occurs because the data is not sent to the AMPs row by row but in blocks.
  • If the table is small, it is possible that all rows are sent in one block and end up on one single AMP.

When executing an “INSERT…SELECT” statement, from a primary index table into a NOPI table, AMP-local copying of the rows will be applied. All rows stay on their current AMP, and if the primary index table is skewed the NOPI table will be skewed as well.

This can be avoided by using the HASH BY RANDOM option. Data blocks are distributed randomly and give an even distribution.

If you want to distribute on row level to achieve even better randomness you can use this statement:

Furthermore, skewing of a NOPI table can happen, by executing an archive job on a source system, and a restore of the backup on a different target system.

How to find all the NOPI tables in a database?

Use the below SQL:

Teradata NOPI Table Limitations

  • No SET tables allowed
  • No Identity columns allowed
  • No Row-Level Partitioning allowed
  • No UPDATE, MERGE INTO, or UPSERT allowed
  • No Permanent Journaling possible
  • Can’t be defined as Queue Tables
  • No UPDATE triggers allowed
  • No Hash Index allowed
  • No Multiload

Teradata NOPI Tables – What’s allowed?

  • Fallback protection
  • Secondary Indexes (USI and NUSI)
  • Join Indexes
  • CHECK and UNIQUE constraints
  • Triggers
  • COLLECT STATISTICS
  • Global Temporary Tables
  • Volatile Tables
  • CLOB and BLOB Data Types
  • Fastload but duplicates are not filtered

NoPI tables should be used only for the following purpose

  • Temporary/Staging purpose
  • Column-partitioning

Advantage for Fastload when loading a NoPI table

Let’s look at the phases of the Fastload for a primary index table

      • Phase 1: All incoming rows are randomly distributed to all AMPs.
      • Phase 2: Rows are hashed through the primary index value and forwarded to the appropriate AMPs. And the appropriate AMPs will sort the received rows by ROWID.

The Fastload for NoPI tables ends after phase 1 as soon as the rows have been randomly distributed to the AMPs. There is no hashing, no sorting and no redistribution of rows.

Another advantage of NoPI tables is that the rows are appended to the end of the data blocks of the respective table. This avoids any overhead that is normally caused by sorting the rows by row hash into the data blocks.

Is it possible to create NOPI table as default?

  • In earlier releases if the PI is not specified, a PI is chosen automatically by the system. However, we can choose at the system level whether or not to create NOPI tables by default.
  • If we choose NOPI tables then by default when we don’t specify the PI, the table would be a NOPI table.
  • A new DBSControl Flag determines if the PI or NOPI table is created when we don’t specify following in the CREATE TABLE DDL :
    1. PRIMARY INDEX clause.
    2. NO PRIMARY index clause.
    3. PRIMARY KEY OR UNIQUE constraints.

This DBS control field is Field 53 and is named as ‘Primary Index Default’ .

Possible values for this field are:

  1. D –> Teradata Default. Currently Default is same as Option P.
  2. P –> “First Column NUPI”. Creates tables with 1st column as NUPI.
  3. N –> “No Primary Index”. Creates tables without PI.

However, note that if this option is chosen as N, and we create a table without explicit PI but table has UNIQUE or PRIMARY KEY defined , then  UNIQUE and PRIMARY KEY take precedence over the ‘N’ option and the table is created with a Unique Primary index.

Having said this also note that if we explicitly specify ‘NO PRIMARY INDEX’ while creating the table, it overrides any PRIMARY KEY or UNIQUE constraints. They get implemented as USI and the table is a NOPI table.

How is NOPI table implemented?

  • The rows are distributed between AMP’s using the Random Generator. Within a AMP rows are just added in a sequential order.
  • The Random generator is designed in such a way to balance out the rows between amps.
  • Normal tables have 64-bit ROWID (32-bit uniqueness value + 32 bit row hash).
  • NOPI tables will also have 64-bit ROWIDs ,but they are different than the normal table ROWIDs.
  • 1st part of the ROWID is based on the hash bucket value(Hash bucket value can be 16 bit or 20 bit. With TD 12 and forward the typical hash bucket size will be 20 bits and that means over 1 mill hash bucket entries.) that is internally generated and controlled by AMP.
  • Typically, all rows on the AMP will have same hash bucket value , but will have different uniqueness values.

There are two steps involved:

Step1 : The Random generator chooses the hash bucket which in turn determines the AMP to which the row is sent to.

Step 2: Internally each AMP selects a Hash bucket value that the AMP owns and uses  it as the 1st part of the row id (16 or 20 bits).

Thank you for giving your valuable time to read the above information.
Follow us on 
Website  www.ktexperts.com
Facebook Page KTExperts Facebook
Linkedin Page : KT EXPERTS Linkedin

Share via:
Note: Please test scripts in Non Prod before trying in Production.
1 Star2 Stars3 Stars4 Stars5 Stars (No Ratings Yet)
Loading...

Add Comment