Partitioned Primary Index in Teradata

Share via:

Dear Readers,

In this article, we will learn about Partitioned Primary Index in teradata.

Partitioned Primary Index

  • A primary index can be either
      1. Partitioned OR
      2. Non-partitioned
  • Primary indexes for join and hash indexes can also either be hash- or value-ordered, while primary indexes for all other table types are only hash-ordered.
  • The decision to define which of the two choices for a table depends on how its rows are most frequently accessed

The Concept behind Partitioning a Table

  1. Each Table in Teradata has a Primary Indexunless it is a NoPI table.
  2. The Primary Index is the mechanism that allows Teradata to physically distributethe rows of a table across the AMPs.
  3. AMPs Sort their rows by the Row-ID, so the system can perform a lightning fast Binary Search since the rows are in Row-ID Order.
  4. Partitioning merely tells the AMP to sort its tables’ rows by the Partition first, but then sort the rows by Row-ID within the partition.
  5. Partitioning queries will involve all AMPs, but partitioned tables are designed to prevent …

In this page, let us discuss further on Partitioned Primary Index (PPI)

Partitioned Primary Indexes

  • A partitioned primary index is defined to be either single-level or multilevel depending on how many partitioning expressions are defined for the table.
  • Partitioned Primary Index (PPI) is one of the powerful features of Teradata which allows a user to access a part of the table rather than full table access.
  • This helps in improving the performance as full table scan is eliminated.
  • PPI works same as Primary Index for the data distribution but creates partitions according to range or case as specified in the table.
  • Optimal partitioning expressions are typically coded using CASE_N or RANGE_N expressions based on exact numeric, character or DateTime columns.
  • When rows are inserted into a table, they are stored in an AMP and arranged by their row hash order. When a table is defined with PPI, the rows are sorted by their partition number. Within each partition, they are arranged by their row hash. Rows are assigned to a partition based on the partition expression defined.
  • DateTime expressions can include the BEGIN and END bound functions and the DATE, CURRENT_DATE, and CURRENT_TIMESTAMP functions.
  • The test value you specify with a RANGE_N function must result in either of the following :
      1. BYTEINT
      2. SMALLINT
      3. INTEGER
      4. BIGINT
      5. DATE
      6. TIMESTAMP(n)
      7. TIMESTAMP(n) WITH TIME ZONE
      8. CHARACTER
      9. VARCHAR
      10. GRAPHIC
      11. VARCHAR(n) CHARACTER SET GRAPHIC data type.
  • Support for the BEGIN and END bound functions also includes support for the IS [NOT] UNTIL_CHANGED and IS [NOT] UNTIL_CLOSED functions.

Advantages of PPI :

  • Avoid full table scan for certain queries.
  • Avoid using secondary index that requires additional physical structure and additional I/O maintenance.
  • Access a subset of a large table quickly.
  • Drop the old data quickly and add new data.

FOUR TYPES OF PPI:

  1. Case partitioning
  2. Range based partitioning
  3. Multi-level partitioning
  4. Character based partitioning.

Case_N Partitioning :

Rows will be placed into the partitions derived using condition mentioned in the Case statement(It will process the first met case condition only).

Example :

The UNKNOWN Partition is for a PRODUCT_VALUE with a NULL value. The NO CASE Partition is for partitions that did not meet the CASE criteria.

For example, if a PRODUCT_VALUE is greater than 20,000 it wouldn‘t fall into any of the partitions so it goes to the NO CASE partition.

RANGE_N Partitioning :  

  • Rows will be placed into the partitions derived using the range expression. Ranges are always listed in incremental order and ranges cannot be overlapped. Value to the RANGE_N can only be numerical (i.e Date or Number)
  • When we apply Range Query, that means it uses the keyword BETWEEN. The BETWEEN keyword in Teradata means find everything in the range BETWEEN this date and this other date. To avoid full table scan, we will Partition the table.

A Partitioned Table is designed to eliminate a Full Table Scan, especially on Range Queries.

Example : PARTITION BY clause is used to define the partition.

In the above example, the table is partitioned by PRODUCT_MFGD_DATE column. There will be one separate partition for each day.

Multi-Level Partitioning:
You can have up to 15 levels of partitions within partitions.

Example :

 Character Based Partitioning:

There are four new data types available for Character Based PPI. They are :

1.CHAR

2.VARCHAR

3.GRAPHIC AND

4.VARGRAPHIC

Example : RANGE based character PPI

Example : CASE based character PPI

Multilevel Partitioned Primary Index(MLPPI):

Multilevel Partitioned primary index allows each partition at a given level to be again partitioned into sub-partitions. MLPPI performs efficient searches by using partition elimination at the various levels or combinations of levels.

Example :

Below table is first partitioned by year and then within each year rows will be partitioned(sub-partition) by month.

Note: If unknown values are not handled properly and users try to insert unknown values, then users will get the below error.

Partitioning Rules:

  • A table can have up to 65,535 Partitions.
  • Partitioning never determines which AMP gets row.
  • Partitioning determines how an AMP will sort the row on its own.
  • Table can have up to 15 levels of partitions.
  • A table cannot have an UPI as the Primary Index if the Partition table does not include PI.
  • Total there are three forms of Partitioning : Simple, RANGE and CASE.

Advantages of partitioned tables:

  • They provide efficient searches by using partition elimination at the various levels or combination of levels.
  • They reduce the I/O for range constraint queries
  • They take advantage of dynamic partition elimination
  • They provide multiple access paths to the data, and an MLPPI provides even more partition elimination and more partitioning expression choices, (i.e., you can use last name or some other value that is more readily available to query on.)
  • The Primary Index may be either a UPI or a NUPI; a NUPI allows local joins to other similar entities
  • Row hash locks are used for SELECT with equality conditions on the PI columns.
  • Partitioned tables allow for fast deletes of data in a partition.
  • They allow for range queries without having to use a secondary index.
  • Specific partitions maybe archived or deleted.
  • May be created on Volatile tables; global temp tables, base tables, and non-compressed join indexes.
  • May replace a Value Ordered NUSI for access.

Disadvantages of partitioned tables: 

  • Rows in a partitioned table are 2 bytes longer.
  • Access via the Primary Index may take longer.
  • Full table joins to a NPPI table with the same PI may take longer.
  • Partitioning can make single row (primary index) accesses to the table slower if partitioning column is not a part of the primary index column.

PPI Characteristics:

  • Partitioning can be done on volatile, global temporary and permanent tables.
  • Partitioning column doesn’t have to be part of non-unique Primary Index.
  • Partitioning column must be part of unique Primary Index.
  • Partitioned column/s will not decide which AMP data should go,it is solely dependent on Primary Index.
  • Rows within AMP will be sorted by partitioned column/s and not by rowId.
  • PPI can be defined at multi-level i.e sub-partition within partition (called Multilevel PPI).
  • Teradata uses partition elimination strategy which helps to avoid Full Table Scan(FTS) for partitioned tables.
  • Partitioned tables may reduce I/O for range queries by reducing the number of rows to be processed .
  • Details about the partitions, constraints & conditions can be retrieved from dbc.PartitioningConstraintsV view.

Thank you for giving your valuable time to read the above information.
Follow us on 
Website  www.ktexperts.com
Facebook Page KTExperts Facebook
Linkedin Page : KT EXPERTS Linkedin

Share via:
Note: Please test scripts in Non Prod before trying in Production.
1 Star2 Stars3 Stars4 Stars5 Stars (1 votes, average: 5.00 out of 5)
Loading...

Add Comment