In this article, we will learn about the features and high-level architecture of Teradata.
Why Teradata and Teradata Architecture
- Teradata offers flexible, sophisticated, and cost-effective solutions that complement existing technology and scale as the organization grows to capture and interpret large data sets.
- It is a unified hybrid architecture that lets organizations capture, store, and analyse structured, semi-structured (XML, JSON, etc.), and unstructured data.
- Although Teradata has been around for many years, its developers considered many details from the very beginning that still keep it competitive today.
- If we look at modern database systems such as Amazon Redshift or Netezza, we can recognize many concepts that Teradata used first.
- Teradata was designed from the beginning for parallelism down to the smallest detail, and it is therefore still among the top RDBMSs for data warehousing.
High level features of Teradata
Transparent Access :
- SQL-H provides a robust interface for run-time, self-service data access from Aster Database to Hadoop, as well as from Teradata Database to Hadoop.
- SQL Assistant provides a user-friendly SQL creation front-end for a consistent experience across Teradata Database and Teradata Aster Database.
- Unity Director automatically routes users and queries between Teradata systems based on the context of the query and system availability.
Seamless Data Movement :
- Connectors provide easy-to-use, high-speed data movement between Teradata Database and Teradata Aster Database, Teradata Aster Database and Hadoop, and Teradata Database and Hadoop.
- Smart Loader for Hadoop gives users and administrators a friendly point-and-click, drag-and-drop interface for bi-directional data movement between Teradata Database and Hortonworks Hadoop.
- Unity Data Mover delivers intelligent, high-speed data movement between Teradata systems. Its command-line or GUI-driven interface, coupled with automatic selection of the load utility, gives users and administrators a powerful tool for data movement.
Single Operational View for Management :
- The ease-of-use and anytime, anywhere access of Teradata Viewpoint is now available on the Teradata Aster Big Analytics Appliance, with support for the Teradata Aster Database as well as Hortonworks Hadoop, giving administrators a single console for managing all the Teradata analytics systems in their environment.
- Teradata Vital Infrastructure extends consistent one-stop support for Teradata to the Teradata Aster and Hortonworks Hadoop appliances. Automated monitoring and fault escalation for all three technologies is delivered from a single source.
- Unity Ecosystem Manager supplies end-to-end monitoring of processes, components, and data across Teradata systems.
- Unity Director makes managing multiple systems running the Teradata Database easy by intelligently applying database management commands to all participating Teradata systems.
Along with the above, Teradata has:
- Fully autonomous file system
- Shared-nothing architecture
- World-class optimizer
- Geospatial analytics
- Performance at scale
High Level Architecture of Teradata
Below is the graphical/functional view of a single Teradata node.
Teradata Node :
- Parsing engines and AMPs run as processes on a node. A node is usually a Linux machine equipped with multiple physical CPUs.
- Each node can run hundreds of AMPs. Each AMP has its own portion of main memory and its own portion of mass storage (called a virtual disk).
- Nodes are connected to a disk array, and each AMP is assigned a part of it as a logical disk. Nowadays, SSDs are used and management is handled by the Teradata Intelligent Memory system, but the principle is the same.
Parsing Engine :
The Parsing Engine receives a request (e.g. an SQL statement) and generates an execution plan for all AMPs that are required to complete the request. Ideally, the plan is structured so that all AMPs start and finish their tasks at the same time. This ensures optimal parallel utilization of the system.
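To see why a balanced plan matters, here is a minimal sketch in Python. It is not Teradata code: the function names, the per-AMP row counts, and the processing rate of 1000 rows per second are assumptions made up for illustration. The only idea it models is that a parallel step ends when the slowest AMP ends.

```python
def step_elapsed(rows_per_amp, rows_per_second=1000.0):
    """Elapsed time of one parallel step: the busiest AMP decides.

    rows_per_second is a made-up processing rate (assumption).
    """
    return max(rows_per_amp) / rows_per_second


def parallel_utilization(rows_per_amp):
    """Share of the available AMP capacity that does useful work."""
    busiest = max(rows_per_amp)
    if busiest == 0:
        return 1.0  # nothing to do, so nothing is wasted
    return sum(rows_per_amp) / (len(rows_per_amp) * busiest)


# Hypothetical workloads: the same 4000 rows spread over 4 AMPs.
balanced = [1000, 1000, 1000, 1000]
skewed = [3100, 300, 300, 300]

print(step_elapsed(balanced), parallel_utilization(balanced))
print(step_elapsed(skewed), parallel_utilization(skewed))
```

With the balanced workload every AMP finishes together. With the skewed workload three AMPs sit idle for most of the step while one AMP does the bulk of the work, so the step takes about three times as long.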
Main tasks of the Parsing Engine :
- Session Manager : Logging sessions on and off
- Resolver : Parsing of requests (syntax check, authorization check)
- Optimiser : Preparation and optimization of the execution plan. The Parsing Engine uses statistics to build an optimized plan.
- Query steps Dispatcher : Controlling the AMPs by instructions
- Input data conversion : EBCDIC to ASCII conversion in both directions
- Transfer of the result of a request to the client tool
- Each Teradata system can use multiple parsing engines.
- The system can increase the number of parsing engines as needed, because each parsing engine can only process a limited number of sessions.
- Currently, each parsing engine can manage up to 120 sessions. These can be sessions of different users, or all 120 can belong to the same user.
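The session limit above gives us a simple sizing rule. The following Python sketch computes how many parsing engines a given number of concurrent sessions would need, assuming only the limit of 120 sessions per parsing engine. The function name `parsing_engines_needed` is hypothetical and is not part of any Teradata tool.

```python
import math

SESSIONS_PER_PE = 120  # per-parsing-engine limit stated in the article


def parsing_engines_needed(concurrent_sessions, sessions_per_pe=SESSIONS_PER_PE):
    """Smallest number of parsing engines that can hold all sessions."""
    if concurrent_sessions < 0:
        raise ValueError("concurrent_sessions must not be negative")
    return math.ceil(concurrent_sessions / sessions_per_pe)


for sessions in (120, 121, 1000):
    print(sessions, "sessions ->", parsing_engines_needed(sessions), "parsing engine(s)")
```

Note that session number 121 already needs a second parsing engine, regardless of whether the sessions belong to one user or to many.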
Message Passing Layer/BYNET
As you can see in the figure above, the BYNET sits between the AMPs and the parsing engine. It is the communication network over which both data and instructions are exchanged.
Below are its functionalities:
- Most efficient communications system between parallel database units
- Guaranteed message delivery
- Designed for availability and redundancy
- AMP coordination : Oversees step completion and error handling
- Final answer set ordering : Dynamic merge that bypasses expensive sort/merge steps
- Recognizes and adjusts to hardware failures
- Resource conservation : Sets up dynamic groups to minimize the AMPs involved and buffers messages to the same AMP or node
- Congestion control : Regulates message flow to prevent overruns or bottlenecks
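The idea behind final answer set ordering can be sketched in Python. This is only an illustration of merging streams that are already sorted, not how the BYNET is actually implemented. The function name and the sample values are made up, and we assume that each AMP has already sorted its own rows.

```python
import heapq


def merge_sorted_amp_results(amp_results):
    """Merge per-AMP result lists that are each already sorted.

    heapq.merge only compares the current head of every stream,
    so the combined rows never need a full re-sort.
    """
    return list(heapq.merge(*amp_results))


# Hypothetical result rows, sorted locally by each AMP.
amp_results = [
    [3, 9, 14],  # AMP 0
    [1, 7, 20],  # AMP 1
    [2, 8, 11],  # AMP 2
]

final_answer_set = merge_sorted_amp_results(amp_results)
print(final_answer_set)
```

Because every input stream is already in order, the merge can hand rows to the client as soon as they are available instead of collecting everything and sorting it again.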
AMP (Access Module Processor) :
AMPs are the real workers in a Teradata system. They execute the instructions (the execution plan) that they receive from the Parsing Engine.
AMPs are independent units that have their own main memory and mass storage allocated to them.
The allocation is exclusive, i.e. no AMP has access to the resources of another AMP.
These are the main tasks of an AMP:
- Storing and retrieving of rows
- Sorting of rows (for details read How Teradata sorts the result set)
- Aggregation of rows
- Joining of tables (see also: The Essential Teradata Join Methods)
- Locking of tables and rows
- Output conversion ASCII to EBCDIC (if the client is a mainframe)
- Management of its assigned space
- Sending of rows to the Parsing Engine or other AMPs (via the BYNET)
- Recovery handling
- Filesystem management
Each AMP can perform multiple tasks simultaneously. By default, an AMP can execute up to 80 tasks in parallel.
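The exclusive ownership of resources can be illustrated with a small Python sketch. It is a toy model, not Teradata code. The class `Amp`, the function `amp_for`, and the sample rows are hypothetical, and placing rows by hashing a key (here with `zlib.crc32` as a stand-in hash) is an assumption that this article does not cover.

```python
import zlib
from collections import Counter

DEFAULT_TASKS_PER_AMP = 80  # default stated in the article


class Amp:
    """Toy worker: it owns its rows and no other worker touches them."""

    def __init__(self):
        self.rows = []

    def store(self, row):
        self.rows.append(row)

    def local_sum_by_key(self):
        # Aggregation uses only the rows this AMP owns.
        totals = Counter()
        for key, amount in self.rows:
            totals[key] += amount
        return totals


def amp_for(key, amp_count):
    # Assumption: a row's owner is chosen by hashing its key.
    return zlib.crc32(key.encode("utf-8")) % amp_count


amps = [Amp() for _ in range(4)]
rows = [("north", 10), ("south", 5), ("north", 7), ("east", 3), ("south", 1)]
for key, amount in rows:
    amps[amp_for(key, len(amps))].store((key, amount))

# Each AMP aggregates on its own; the partial results are then combined.
result = Counter()
for amp in amps:
    result.update(amp.local_sum_by_key())

print(dict(result))
print("Parallel task slots:", len(amps) * DEFAULT_TASKS_PER_AMP)
```

Since all rows with the same key land on the same worker in this sketch, every worker can finish its part of the aggregation without looking at another worker's data.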
I hope the above information is useful and helpful.