There are two kinds of errors that we might receive in Teradata, or in any other database for that matter.
The first kind is the ordinary error messages that we normally get (syntax errors and so on). These are handled errors, i.e. Teradata's code recognizes them and returns the error if the query fails. But what happens when a query hits an exception or error that Teradata does not handle? In that case a snapshot dump is generated and the code module where the error occurred is logged. Teradata Support then analyzes the dump to find out which part of the code has the problem and provides a workaround or a code fix.
This kind of exception can happen either because of a bug in Teradata’s source code or because of a hardware issue that causes a file system error or data corruption.
Whenever such an exception happens in the file system and a query hits the table on the affected file system sector, Teradata gets an exception and the query fails, generating a snapshot dump. The region where the exception or file system issue was found is then marked as down in that particular table and is called a down region. So if this happens to a table named employee, we can say that the employee table has a down region marked on AMP x.
The hardware issue can have any number of causes (an error in a disk drive, for example), but it needs to be fixed, otherwise any future query targeting that region will fail again and mark a down region. The down region has a counter, and its threshold is defined in the dbscontrol parameter ‘MaxDownRegion’. The default value of this parameter is 3. So, if we keep getting a down region in the same table at the same sector until the defined threshold is reached, the table is marked as down. When the table is marked down, all queries running against that table fail and the table becomes inaccessible.
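The counter behaviour described above can be sketched as a small toy model. This is only an illustration of the logic in this article, not of Teradata's internals: the class TableState, its methods, and the returned strings are hypothetical names made up for this example, and the threshold of 3 is simply the default value quoted above.

```python
from dataclasses import dataclass

# Assumed threshold: the default value quoted in the article for MaxDownRegion.
MAX_DOWN_REGION = 3


@dataclass
class TableState:
    """Toy model of one table's down-region bookkeeping (hypothetical, not Teradata code)."""

    name: str
    down_region_count: int = 0
    is_down: bool = False

    def record_down_region(self, threshold: int = MAX_DOWN_REGION) -> None:
        """Count one more down region and mark the table down once the threshold is reached."""
        self.down_region_count += 1
        if self.down_region_count >= threshold:
            self.is_down = True

    def run_query(self, hits_bad_region: bool) -> str:
        """Return a short status string for a query against this table."""
        if self.is_down:
            # Once the table is down, every query fails, whatever region it touches.
            return "failed: table is down"
        if hits_bad_region:
            self.record_down_region()
            return "failed: down region marked"
        return "ok"


employee = TableState("employee")
# Three queries in a row hit the damaged region.
results = [employee.run_query(hits_bad_region=True) for _ in range(3)]
# A later query that does not touch the damaged region still fails.
after = employee.run_query(hits_bad_region=False)
```

In this model the first three queries each fail and mark a down region, and from then on even a query that avoids the damaged region fails, which is the bulk-failure situation the article warns about.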
To fix this, we first need to fix the hardware error and then make sure that scandisk and checktable run clean on the affected table. More details on scandisk and checktable will be shared in the next article.
Once the hardware error is fixed, we reset the down region counter of the table using the following SQL:
ALTER TABLE TABLENAME RESET DOWN;
This resets the down region counter of the table to 0. It is advisable to fix the hardware issue and reset the counter before the table is marked as down and causes bulk failures on the box.
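If several tables were affected by the same hardware problem, the reset statement has to be issued once per table. A minimal sketch of generating those statements is shown here. It only builds the SQL text and does not connect to any database. The helper reset_down_statement and the table names hr.employee and hr.department are hypothetical, and the name check is deliberately stricter than what Teradata itself allows in an identifier.

```python
import re

# Conservative check (an assumption for this sketch): a plain name,
# optionally qualified as database.table.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def reset_down_statement(table_name: str) -> str:
    """Build the reset statement from the article for one table."""
    if not _IDENTIFIER.match(table_name):
        raise ValueError(f"unexpected table name: {table_name!r}")
    return f"ALTER TABLE {table_name} RESET DOWN;"


# Hypothetical list of tables that had down regions on the repaired hardware.
affected_tables = ["hr.employee", "hr.department"]
statements = [reset_down_statement(t) for t in affected_tables]

for statement in statements:
    print(statement)
```

The generated statements should only be run after the hardware is fixed and scandisk and checktable have come back clean, as described above.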
This is a brief explanation of what a down region and a down table are. Please post your questions in the comments.