Share via:
##### SAMPLE FUNCTION

As the name suggests, SAMPLE function will an idea on the data stored in the table/view by returning few numbers of rows specified either as a list of fractions of the total number of rows or as a list of numbers of rows from the SELECT query.

The function can be specified in two ways  :

• SAMPLE n – Will return a sample of n rows.
• If the number n is greater than the number of rows in the table, the sample will consist of the number of rows in the        table.

Sample Syntax

Here n can either of the below two :

• An actual number of rows.
• A percentage of tables. (n can be any value between 1.00 and .00 )

If the number n is greater than the number of rows in the table, sample n will return all rows from the table.

Example :

Consider the following table.

 STUDENT_ID FIRST_NAME LAST_NAME DEPARTMENT 1 Alex Green IT 2 Monica Adams Sales 3 Peter Williams IT 4 Chandler Bing HR 5 Ross Buffay Finance

Select a sample of2 records from the student table :

Explain plan :

Explanation

——————————————————————–

1) First, we lock KTExp.student in TD_MAP1 for read on a reserved RowHash to prevent global deadlock.

2) Next, we lock KTExp.student in TD_MAP1 for read.

3) We do an all-AMPs SAMPLING step in TD_MAP1 from KTExp.student by way of an all-rows scan with no residual conditions into Spool1(group_amps), which is built locally on the AMPs.  Samples arespecified as a number of rows.  The size of Spool 1 isestimatedwith high confidence to be 2 rows (184 bytes).

4) Finally, we send out an END TRANSACTION step to all AMPs Involvedin processing the request.

-> The contents of Spool 1 are sent back to the user as the result ofstatement 1.

1. Select 30% sample records from the student table.

We have a total of 5 rows in the Student table. 50% percent of total rows is 2.5 rows. So, the output is 3 rows.

Fractional result less than .4999 will not add any row as shown below.

SAMPLE using SAMPLEID

SAMPLEID is an extension for SAMPLE to get multiple set of samples in a single query.

Example

In the example above , the table has a total of 5 rows whereas we have requested to provide sample of 7, since there are no enough rows to return , we get a WARNING message in BTEQ stating “7473 Requested sample is larger than table rows. All rows returned“.

Note:This warning is not seen in ODBC.

Explain Plan:

Explanation

1) First, we lock KTexp.STUDENT in TD_MAP1 for read on a reservedRowHash to prevent global deadlock.

2) Next, we lock KTExp.STUDENT in TD_MAP1 for read.

3) We do an all-AMPs SAMPLING step in TD_MAP1 from TEST.STUDENT by way of an all-rows scan with no residual conditions into Spool 1(group_amps), which is built locally on the AMPs.Then we do ASORT to order Spool 1 by the sort key in spool field1 ( KTExp.STUDENT.Field_1, KTExp.STUDENT.STUDENT_ID).  Samples are specified as a number of rows.  The size of Spool 1 is estimated with high confidence to be 5 rows (255 bytes).

4) Finally, we send out an END TRANSACTION step to all AMPs Involvedin processing the request.

-> The contents of Spool 1 are sent back to the user as the result ofstatement 1.

We can also use percentage with SAMPLEID as shown below :

Thank you for giving your valuable time to read the above information.     (No Ratings Yet) Loading...