LOGICAL DATA MODELING
With the conceptual model in place, we are ready to begin designing Cassandra tables. From the conceptual model we create a logical model that contains a table for each query and captures the entities and relationships.
First, identify the primary entity type you are querying and use it to start the table name. If you are querying by attributes of other related entities, append those to the table name, separated with _by_. Example: schools_by_area
Next, identify the primary key of the table by adding partition key columns based on the required query attributes, followed by clustering columns.
The primary key is very important because it determines how much data is stored in each partition and how the data is organized on disk. Once the primary key is defined, the table can take its place in the logical model.
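The naming rule above can be sketched as a small helper. This is only an illustration: the function name table_name is hypothetical, and joining several query attributes with an underscore is an assumption, since the text only shows the single-attribute case.

```python
from typing import Sequence


def table_name(primary_entity: str, query_attributes: Sequence[str]) -> str:
    """Build a table name that starts with the primary entity type.

    Attributes of related entities are appended after "_by_".
    Joining several attributes with "_" is an assumption of this sketch.
    """
    if not query_attributes:
        return primary_entity
    return f"{primary_entity}_by_{'_'.join(query_attributes)}"


print(table_name("schools", ["area"]))  # schools_by_area
```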
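To make the split between partition key and clustering columns concrete, here is a minimal sketch that assembles a CQL CREATE TABLE statement as a string. The helper create_table_cql and the columns area, school_name, and school_id are hypothetical; they are chosen only to fit the schools_by_area example.

```python
def create_table_cql(table, columns, partition_key, clustering=()):
    """Return a CQL CREATE TABLE statement as a string.

    columns:       dict of column name -> CQL type
    partition_key: list of column names that decide which partition a row lives in
    clustering:    list of (column name, "ASC" or "DESC") pairs that order rows
                   inside a partition
    """
    column_defs = ", ".join(f"{name} {cql_type}" for name, cql_type in columns.items())
    # The partition key is wrapped in its own parentheses; clustering columns follow it.
    key_parts = ["(" + ", ".join(partition_key) + ")"] + [name for name, _ in clustering]
    cql = f"CREATE TABLE {table} ({column_defs}, PRIMARY KEY ({', '.join(key_parts)}))"
    if clustering:
        order = ", ".join(f"{name} {direction}" for name, direction in clustering)
        cql += f" WITH CLUSTERING ORDER BY ({order})"
    return cql + ";"


# Hypothetical columns for the schools_by_area example.
print(create_table_cql(
    "schools_by_area",
    {"area": "text", "school_name": "text", "school_id": "uuid"},
    partition_key=["area"],
    clustering=[("school_name", "ASC"), ("school_id", "ASC")],
))
```

In this sketch all schools of one area share a partition, so the number of schools per area gives a rough idea of the partition size, and the clustering columns fix the order in which those rows are stored.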
INTRODUCTION TO NOTATION:
Notation is important in Cassandra data modeling because it lets us represent logical models and capture them in diagrammatic form.
Each table is shown with its title and a list of columns. Primary key columns are identified via symbols such as K for partition key columns and C↑ or C↓ to represent clustering columns. Lines are shown entering tables or between tables to indicate the queries that each table is designed to support.
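As a rough text-only approximation of this notation, the sketch below prints a table title followed by its columns, each tagged with K, C↑, or C↓ where applicable. The function render_table and the example columns are hypothetical and stand in for a real diagram.

```python
def render_table(title, columns):
    """Render a table in a plain-text version of the notation.

    columns: list of (name, marker) pairs, where marker is "K" for a partition
    key column, "C↑" or "C↓" for a clustering column, or "" for a regular column.
    """
    width = max(len(name) for name, _ in columns)
    lines = [title, "-" * len(title)]
    for name, marker in columns:
        # Pad names so the markers line up; drop trailing spaces for unmarked columns.
        lines.append(f"{name.ljust(width)}  {marker}".rstrip())
    return "\n".join(lines)


# Hypothetical columns for the schools_by_area example.
print(render_table("schools_by_area", [
    ("area", "K"),
    ("school_name", "C↑"),
    ("school_id", "C↑"),
    ("address", ""),
]))
```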
Let us consider a data model that stores client details (client id, name, phone number, etc.) together with the services, service appointments, service providers, and service provider employees.
Diagramming Logical Data Models:
Once the object entities are mapped to tables, you have to normalize the data. Normalization is the process of organizing columns and tables to reduce data redundancy and inconsistency.
Here the Client entity has a one-to-many relationship to the Service Appointment entity. Hence the Service Appointment entity will have the client key column.
The Service Appointment entity has a many-to-one relationship to Client, Service Provider, Service, and Service Provider Employee.
Hence the Service Appointment table will have key columns for those four entities (client_id, service_provider_id, service_id, service_provider_employee_id).
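The rule that each many-to-one relationship contributes one key column can be sketched as follows. The MANY_TO_ONE mapping and the key_columns helper are hypothetical, and the "<entity>_id" naming simply follows the column names listed above.

```python
# Many-to-one relationships: child entity -> the parent entities it refers to.
MANY_TO_ONE = {
    "service_appointment": [
        "client",
        "service_provider",
        "service",
        "service_provider_employee",
    ],
}


def key_columns(entity):
    """Return one "<parent>_id" key column per many-to-one relationship."""
    return [f"{parent}_id" for parent in MANY_TO_ONE.get(entity, [])]


print(key_columns("service_appointment"))
# ['client_id', 'service_provider_id', 'service_id', 'service_provider_employee_id']
```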
In this way we must sort out each entity and combine everything to form a single logical model.
Query Driven Model:
Since the logical data model maps together many entity relationship diagrams, we must combine the entities with the queries of the application. These combined entities become tables in Cassandra.
The data can be retrieved as:
👉 Search for a client by name.
👉 Search for a client by phone number.
👉 Get all scheduled service appointments by client for a date range.
👉 Get all scheduled service appointments by employee for a date range.
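One possible way to map these four queries to tables is sketched below, with one table per query. The table names follow the _by_ convention, but the exact names, the TableDesign class, and the appointment_date column are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableDesign:
    name: str
    partition_key: tuple   # columns matched by equality in the query
    clustering: tuple = () # columns used for ordering and range filters


# One hypothetical table per query (Q1..Q4, in the order listed above).
QUERY_TABLES = {
    "Q1": TableDesign("clients_by_name", ("name",), ("client_id",)),
    "Q2": TableDesign("clients_by_phone_number", ("phone_number",), ("client_id",)),
    "Q3": TableDesign("service_appointments_by_client",
                      ("client_id",), ("appointment_date",)),
    "Q4": TableDesign("service_appointments_by_employee",
                      ("service_provider_employee_id",), ("appointment_date",)),
}

for query, table in QUERY_TABLES.items():
    print(query, table.name, table.partition_key, table.clustering)
```

Placing the date in the clustering columns is what would let the two appointment queries filter by a date range within a single client or employee partition.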
This is how data is retrieved in this logical data modeling series. To design these tables well, we must have a good understanding of both the conceptual and logical data models.
Author : Neha Kasanagottu