Primary Indexes define the how the data is distributed for a particular table. There are two types of primary indexes:

Unique Primary Index (UPI): Unique Primary Index are those which has all of its rows as unique. This ensures even distribution of data i.e. all amps have same amount of data. This is an ideal situation which will not have any skewed distribution. With a UPI, there is no duplicate row checking done during a load which makes it a faster operation. Accessing the row with a primary index (whether UPI or NUPI) is the most desirable operation which is only a one amp operation.




As you can see above when UPI value is passed to hashing algorithm, it tells the parser that the data is in amp1 and the data is directly selected from one amp thereby making it a one amp operation and only one row is returned.

Non Unique Primary Index (NUPI):  When it is not possible to have all unique values to be chosen as a PI, we choose a PI which is almost unique i.e. lesser duplicates which is called as non unique primary index. NUPI will have duplicate rows on the same amp thereby leading to skewed distribution of data and the amp having highest skewed data will be slower in serving requests when compared to other amps. Accessing a NUPI is also a one amp operation but the number of rows returned might be more than 1.

Below is the diagram of NUPI access



As seen from above diagram, NUPI access is also a single amp operation and thus queries having filters on the PI columns are always one amp operation and are the fastest in terms of performance.



Note: Please test scripts in Non Prod before trying in Production.
1 Star2 Stars3 Stars4 Stars5 Stars (23 votes, average: 5.00 out of 5)

6 thoughts on “Teradata : TYPES OF PRIMARY INDEXES (UPI AND NUPI)

Add Comment