My Notes: Relational Database Management System- III B.Com(CA)

Q) WRITE A SHORT NOTE ON A) DATA B)DATABASE C)DATABASE MANAGEMENT SYSTEM (DBMS)

A) DATA:

· It is a collection of information.

· The facts that can be recorded and which have implicit meaning known as 'data'.

Example: Customer ----- 1.cname. 2. cno. 3. ccity.

B) DATABASE:

· It is a collection of interrelated data.

· These can be stored in the form of tables.

· A database can be of any size and varying complexity.

· A database may be generated and manipulated manually or it may be computerized.

Example: Customer database consists the fields as cname, cno, and ccity

Cname	Cno	Ccity

C) DATABASE-MANAGEMENT SYSTEM (DBMS):

· A database-management system (DBMS) is a collection of interrelated data and a set of programs to access those data.

· The collection of data, usually referred to as the database, contains information relevant to an enterprise.

· The primary goal of a DBMS is to provide a way to store and retrieve database information that is both convenient and efficient.

Q) What are the Database System Applications?

A) Database System Applications are as follows.

Banking: For customer information, accounts, and loans, and banking transactions.
Airlines: For reservations and schedule information. Airlines were among the first to use databases in a geographically distributed manner—terminals situated around the world accessed the central database system through phone lines and other data networks.
Universities: For student information, course registrations, and grades.
Credit card transactions: For purchases on credit cards and generation of monthly statements.
Telecommunication: For keeping records of calls made, generating monthly bills, maintaining balances on prepaid calling cards, and storing information about the communication networks.
Finance: For storing information about holdings, sales, and purchases of financial instruments such as stocks and bonds.
Sales: For customer, product, and purchase information.
Manufacturing: For management of supply chain and for tracking production of items in factories, inventories of items in warehouses/stores, and orders for items.
Human resources: For information about employees, salaries, payroll taxes and benefits, and for generation of paychecks.

Q) Explain the Traditional Approach for Data Storage and the Need of DBMS (OR) explain the difference between File processing systems and database (OR) Problems with Early Information System (OR) Purpose of Database Systems.

ANS: Traditional Data Storage Model

1. In traditional approach, information is stored in flat files which are maintained by the file system under the operating system’s control.

2. Application programs go through the file system in order to access these flat files

How data is stored in flat files

· Data is stored in flat files as records.

· Records consist of various fields which are delimited by a space, comma, pipe, any special character etc.

· End of records and end of files will be marked using any predetermined character set or special characters in order to identify them

Example: Storing employee data in flat files

Problems with traditional approach for storing data

1. Data Security: The data stored in the flat file(s) can be easily accessible and hence it is not secure.

Example: Consider an online banking application where we store the account related information of all customers in flat files. A customer will have access only to his account related details. However from a flat file, it is difficult to put such constraints. It is a big security issue.

2. Data Redundancy: In this storage model, the same information may get duplicated in two or more files. This may lead to to higher storage and access cost. it also may lead to data inconsistency.

For Example, assume the same data is repeated in two or more files. If a change is made to data stored in one file, other files also needs to be change accordingly.

Example: Assume employee details such as firstname, lastname, emailid are stored in employee_details file and employee_salary file. If a change needs to be made to emailid, both employee_details file and emplyee_salary file need to be updated otherwise it will lead to inconsistent data.

However, it is possible to design file systems with minimal redundancy. Also note that Data redundancy is sometimes preferred.

Example: Assume employee details such as firstname, lastname, emailid are stored only in employee_details file and not in employee_salary file. If we need to access an employee salary along with firstname of the employee, we have to retrieve details from two files. This would mean an increased overhead.

3. Data Isolation: Data Isolation means that all the related data is not available in one file. Usually the data is scattered in various files having different formats. Hence writing new application programs to retrieve the appropriate data is difficult.

4. Program/Data Dependence: In traditional file approach, application programs are closely dependent on the files in which data is stored. If we make any changes in the physical format of the file(s), like addition of a data field , etc, all application programs needs to be changed accordingly. Consequently, for each of the application programs that a programmer writes or maintains, the programmer must be concerned with data management. There is no centralized execution of the data management functions. Data management is scattered among all the application programs.

Example: Consider the banking system. An employee_salary file exists which has details about the salary of employees. An employee_salary record is described by

employee_id, firstname, lastname, salary_amount

An application program is available to display all the details about the salary of all employees. Assume a new data field, the date_of_joining is added to the employee_salary file. Since the application program depends on the file, it also needs to be altered.

If the physical format of the employee_salary file for example the field delimiter, record delimiter, etc. are changed, it necessitates that the application program which depends on it, also be altered.

5. Lack of Flexibility: The traditional systems are able to retrieve information for predetermined requests for data. If we need unanticipated data, huge programming effort is needed to make the information available, provided the information is there in the files. By the time the information is made available, it may no longer be required or useful.

Example : Consider a software application which is able to generate employee salary report. Assume that all the data is stored in flat files. Suppose we now have a requirement to retrieve all the employee details whose salary is greater than Rs.10000. It is not easy to generate such on-demand reports and lot of time is needed for application developers to modify the application to meet such requirements.

6. Concurrent Access Anomalies: Many traditional systems allow multiple users to access and update the same piece of data simultaneously. However this concurrent updates may result in inconsistent data. To guard against this possibility, the system must maintain some form of supervision. But supervision is difficult because data may be accessed by many different application programs and these application programs may not have been coordinated previously.

Example : Consider a personal information system which has the data of all employees. Now there may be an employee updating his address details in the system and at the same time, an administrator may be taking a report containing the data of all employees. This is called concurrent access. Since the employee's address is being updated at the same time, there is a possibility of the administrator reading an incorrect address.

These difficulties lead to the development of database systems.

What is Database Management System (DBMS)?

A Database Management System (DBMS) is a collection of interrelated files and set of programs which allows users to access and modify files. It provides a convenient and efficient way to store, retrieve and modify information. Application programs request DBMS to retrieve, modify/insert/delete data for them and thus it acts as a layer of abstraction between the application programs and the file system.

· DBMS acts as a layer of abstraction on top of the File system.

· For interacting with the DBMS we use a Query language called Structured Query Language (SQL)

General Block Diagram

Q: Why Data Base Management is being preferred in companies to Traditional File Processing system? (Or) Explain the advantages and disadvantages of DBMS .

ANS: Advantages of DBMS:

a. Data Sharing: It is very easy to share between users. Data from the entire company can be used by users who need them. Managers can analyze the data in a more extensive (=wide) manner than in file management system.

b. Reduced data redundancy: A database minimizes duplication of data from file to file. For example in a university database student’s name and address may appear in only one record rather than in the files of many departments.

c.Data independence: A database system keeps data separate from the application that uses the data. Thus, changes can be made to data definitions without making changes to every application program that uses the data. Thus program maintenance costs (i.e. cost of upgrading application programs) can be substantially reduced.

d.Increased user productivity: In database environment user with minimum technical knowledge can also generate queries using query languages and can generate reports using report generators.

e. Improved Data Integrity: As data redundancy is minimized, data inconsistency and threat of non integrated data are substantially decreased.

f. Improved data administration & control: In database environment the responsibility of protecting the database is placed in the hands of one person or department. It becomes easy to control access to data, privacy of data, updates, deletions of data etc.

Disadvantages of DBMS:

(i) Organizational Conflicts: Pooling data in a common database may not be politically feasible in some organizations. Moreover, the risk involved in data sharing – for example, that one group may damage another group’s data – and the potential system problems that may limit a group’s access to its own data may be viewed as more troublesome than beneficial. Such people problems could prevent the effectual implementation of a database system.

(iii) System Failure: When the system goes down, all users directly involved in accessing must wait until the system is functional again. This may require a long wait. Moreover, if the system or application software fails, there may be permanent damage to the database. It is very important, therefore to carefully evaluate all software that will have a direct effect on the database to be certain that it is as free as errors as possible.

(iv) Overhead Costs: The database approach may require an investment in both hardware and software. The hardware to run large DBMS must be efficient and will generally require more main memory and disk storage than simpler file-based system. Tape drivers for rapidly backing up the database are also required. In addition, the DBMS itself may be quite expensive.

(v) Need for Sophisticated Personnel: The database administration function requires skilled personnel who are capable of coordinating the needs of different user groups, designing views, integrating those views into a single schema, establishing data recovery procedures and fine tuning the physical structure of the database to meet acceptable performance criteria.

Q) Explain about the three levels of data abstraction (OR) Explain about view of data (OR) Write about the architecture of a database?

Architecture of a Database: A database follows a three level architecture

1. View level,

2. Logical level,

3. Physical level

1. View level:

· It is viewed by the chairman or the operation manager or the data entry operator,

· It is at the highest level of database abstraction,

· It includes only those portion of database or application programs which is of concern to the users,

· It is defined by the users or written by the programmers.

E.g.: An external view in its Logical Record 1 may indicate employee name, employee address and in its Logical Record 2 may indicate employee name, employee address, employee code and employee salary.

2) Logical level:

· It is viewed by Data Base Administrator,

· All database entities and relationships among them are included,

· Single view represents the entire database,

· It describes all records, relationships and constraints or boundaries,

· It includes data description to render it independent of the physical representation.

E.g.: A conceptual view may define employee code as a string of characters, employee address as a string, employee code as a key and employee salary as an integer

3)Physical or internal view:

· It is at the lowest level of database abstraction,

· It is closest to the physical storage method,

· It indicates how data will be stored,

· It describes data structure,

· It describes access methods,

· It is expressed by internal schema.

E.g.: The internal view may define that employee name consists of 30 characters, employee address also consist of 30 characters, employee code consists of 5 characters and employee salary consists of 12 numbers.

Figure shows the relationship among the three levels of abstraction

An example of the three levels:

Q) What are the database languages of DBMS? (OR) Discuss any 2-types of DBMS languages.

A database system provides a DATA DEFINITION LANGUAGE to specify the database schema and a DATA MANIPULATION LANGUAGE to express database queries and updates.

1) DATA DEFINITION LANGUAGE: Data definition language is

· This language provides a set of commands which can be used to define

what is the data in database.
what is the relationship between various data elements
what are the integrity constraints put on various data items needed to be satisfied o etc.

· It will be used to define the records or structures of database.

· The DDL statements are compiled to form the Data Dictionary or Data Directory which contains the meta data i.e. data about data.

· The data dictionary is consulted by DBMS before any operation on data.

2) DATA-MANIPULATION LANGUAGE: Data manipulation is

· The retrieval of information stored in the database

· The insertion of new information into the database

· The deletion of information from the database

· The modification of information stored in the database

A data-manipulation language (DML) is of two types:

1) Procedural DMLs require a user to specify what data are needed and how to get those data.

2) Nonprocedural DMLs require a user to specify what data are needed without specifying how to get those data.

Q) EXPLAIN DIFFERENT TYPES OF DATA MODELS

DATA MODELS: The entire structure of a database can be described using a data model. It is a collection of conceptual tools for describing data, data relationships, data semantics, and consistency constraints. Data models can be classified into following types.

1. Object Based Logical Models.

2. Record Based Logical Models.

3. Physical Models.

1.Object Based Logical Models: These models can be used in describing the data at the logical and view levels. These models are classified into following types.

a. The entity-relationship model.

b. The object-oriented model.

c. The semantic data model.

d. The functional data model.

a. THE ENTITY-RELATIONSHIP MODEL:

· The entity-relationship (E-R) data model is a collection of basic objects, called entities, and of relationships among these objects.

· An entity is a “thing” or “object” in the real world that is distinguishable from other objects. For example, each person is an entity, and bank accounts can be considered as entities.

· A relationship is an association among several entities. For example, a depositor relationship associates a customer with each account that she has.

The overall logical structure (schema) of a database can be expressed graphically by an E-R diagram, which is built up from the following components:

• Rectangles, which represent entity sets

• Ellipses, which represent attributes

• Diamonds, which represent relationships among entity sets

• Lines, which link attributes to entity sets and entity sets to relationships

2.Record Based Logical Models: These models can also be used in describing the data at the logical and view levels. These models can be classified into,

a. Relational model.

b. Network model.

c. Hierarchal model.

a. Relational Model:

· The relational model uses a collection of tables to represent both data and the relationships among those data.

· Each table has multiple columns, and each column has a unique name. Figure presents a sample relational database comprising three tables:

· It shows how tables are linked, what type of links are between tables, what keys are used, what information is referenced between tables. It's an essential part of developing a normalised database structure to prevent repeat and redundant data storage

One shows details of bank customers, the second shows accounts, and the third shows which accounts belong to which customers.

b. Network Model: It is a modified version of hierarchical database. In network database structure each node may have several parents.

» It is very difficult to develop this type of database structures.

» It is useful for one-to-one and one-to-many record relationships.

» The relationships should be pre-determined. The records in the database are organized as a collection of arbitrary graph.

c. Hierarchical Model: - In this database structure, records are logically organised into a hierarchy of relationships and involve an inverted tree like structure. The tree consists of hierarchy of nodes and the uppermost tree is called parent. Every element can have any number of lower level elements, called children, but every node will have only one parent..

3. Physical Models: These models can be used in describing the data at the lowest level, i.e. physical level. These models can be classified into

a. Unifying model

b. Frame memory model.

Q) WRITE ABOUT DATABASE USERS AND THE ROLE OF DATABASE ADMINISTRATOR

People who work with a database can be categorized as database users or database administrators.

There are four different types of database-system users

1) Naive users: users, who interact with the system by invoking one of the application programs that have been written previously. For example, a bank teller who needs to transfer $50 from account A to account B invokes a program called transfer. This program asks the teller for the amount of money to be transferred, the account from which the money is to be transferred, and the account to which the money is to be transferred.

2) Application programmers: These users are computer professionals who write application programs. The application programmer uses Rapid Application Development (RAD) toolkit or special type of programming languages. These languages, sometimes called fourth-generation languages

3) Sophisticated users: These users interact with the system without writing programs. These users interact with the database using database query language. They submit their query to the query processor. Then Data Manipulation Language (DML) functions are performed on the database to retrieve the data. Tools used by these users are OLAP(Online Analytical Processing) and data mining tools

4) Specialized users: These users write specialized database applications to retrieve data. These applications can be used to retrieve data with complex data types e.g. graphics data and audio data.

THE ROLE OF DATABASE ADMINISTRATOR:

A person having who has central control over data and programs that access the data is called DBA. Following are the functions of the DBA.

1) Schema definition: The DBA creates the original database schema by executing a set of data definition statements in the DDL.

2) Storage structure and access-method definition.

3) Schema and physical organization modification: If any changes are to be made in the original schema, to fit the need of organization, then these changes are carried out by the DBA.

4) Granting of authorization for data access: DBA can decide which parts of data can be accessed by which users. Before any user access the data, dbms checks which rights are granted to the user by the DBA.

5) Routine maintenance: DBA has to take periodic backups of the database, ensure that enough disk space is available to store new data, ensure that performance of dbms is not degraded by any operation carried out by the users.

Q) Explain the database Architecture (OR)Database system structure (or) Data storage and querying (OR)Explain the overall design of DBMS.

A database system is partitioned into modules that deal with each of the responsibilities of the overall system.

The functional components of a database system can be broadly divided into

a) The storage manager and

b) The query processor.

The storage manager is important because databases typically require a large amount of storage space. Corporate databases range in size from hundreds of gigabytes to, for the largest databases, terabytes of data.

The query processor is important because it helps the database system simplify and facilitate access to data.

A)THE STORAGE MANAGER:

· A storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.

· The storage manager is responsible for the interaction with the file manager. The raw data are stored on the disk using the file system, which is usually provided by a conventional operating system.

· The storage manager translates the various DML statements into low-level file-system commands. Thus, the storage manager is responsible for storing, retrieving, and updating data in the database.

The storage manager components include:

1) Authorization and integrity manager, which tests for the satisfaction of integrity constraints and checks the authority of users to access data.

2) Transaction manager, which ensures that the database remains in a consistent(correct) state despite system failures, and that concurrent transaction executions proceed without conflicting.

3) File manager, which manages the allocation of space on disk storage and the data structures used to represent information stored on disk.

4) Buffer manager, which is responsible for fetching data from disk storage into main memory, and deciding what data to cache in main memory. The buffer manager is a critical part of the database system, since it enables the database to handle data sizes that are much larger than the size of main memory.

The storage manager implements several data structures as part of the physical system implementation:

1) Data files, which store the database itself.

2) Data dictionary, which stores metadata about the structure of the database, in particular the schema of the database.

3) Indices, which provide fast access to data items that hold particular values.

B) THE QUERY PROCESSOR:

The query processor components include

1) DDL interpreter, which interprets DDL statements and records the definitions in the data dictionary.

2) DML compiler, which translates DML statements in a query language into an evaluation plan consisting of low-level instructions that the query evaluation engine understands. A query can usually be translated into any of a number of alternative evaluation plans that all give the same result. The DML compiler also performs query optimization, that is, it picks the lowest cost evaluation plan from among the alternatives.

3) Query evaluation engine, which executes low-level instructions generated by the DML compiler.

Q) Explain the History of DBMS

History of DBMS

Database management systems were first introduced during 1960’s and have continued to evolve during subsequent decades as shown in the above figure.

1960’s

File processing systems were dominant during this period. Databases are primarily for large and complex tasks.

1970’s

During this decade the use of DBMS became a commercial reality. The hierarchical and network database management systems were developed during this decade. However, there were some disadvantages in these DBMS’s.

1980’s

To overcome the limitations of the hierarchical and network DBMS’s E.F. Codd and other developed the relational model (second generation DBMS) during 1970’s. It received widespread acceptance during the 1980’s, with the relational model all data are represented in the form of tables (rows x columns). A relatively simple fourth generation languages (4GL) called SQL is used for data retrieval.

1990’s

The decade of the 1990’s is very important in a new era of computing. The technologies such as client/server (c/s), datawarehousing and Internet applications were introduced during this decade to cope-up with these increasingly complex situations object oriented databases were introduced during this decade

2000 and Beyond

We can expect following several trends in the database technologies during next decade

1. The ability to manage increasingly complex data types (i.e., image, audio and video segments)

2. Continued development of universal servers

3. Distributed databases

4. Content addressable storage

5. Artificial intelligence

6. Powerful data mining algorithms

7. Wireless transmission etc

Q) Explain the Basic Structure of the Relational Model

· Relational Model was introduced in 1970 by E.F. Codd (at IBM).

· Relational Model is basis for most DBMSs, e.g., Oracle, Microsoft SQL Server, IBM DB2, Sybase, PostgreSQL, MySQL, . . .

A relation is a set of attributes with values for each attribute such that:

Each attribute value must be a single value only (atomic).
All values for a given attribute must be of the same type (or domain).
Each attribute name must be unique.
The order of attributes is insignificant
No two rows (tuples) in a relation can be identical.
The order of the rows (tuples) is insignificant.

Relational Algebra is a collection of operations on Relations.
Relations are operands and the result of an operation is another relation.

Relation Schema, Database Schema, and Instances

· Let A1,A2,. . . , An be attribute names with associated domains D1, D2,. . . , Dn, then R(A1 : D1,A2 : D2,. . . ,An : Dn) is a relation schema.

For example, Student(StudId : integer, StudName: string, Major : string)

· A relation schema specifies the name and the structure of the relation.

· A collection of relation schemas is called a relational database schema.

· A relation instance r(R) of a relation schema can be thought of as a table with n columns and a number of rows.

Instead of relation instance we often just say relation. An instance of a database schema thus is a collection of relations.

· An element t € r(R) is called a tuple (or row).

Q: EXPLAIN FUNDAMENTAL OPERATIONS IN THE RELATIONAL ALGEBRA

The relational algebra is a procedural query language. It consists of a set of operations that take one or two relations as input and produce a new relation as their result. The fundamental operations in the relational algebra are

1. select,

2. project,

3. union,

4. set difference,

5. Cartesian product, and

6. rename.

The select, project, and rename operations are called unary operations, because they operate on one relation. The other three operations operate on pairs of relations and are, therefore, called binary operations.

The Select Operation:

Selection is an unary operator.
The selection operator is sigma:
The selection operation acts like a filter on a relation by returning only a certain number of tuples.
The resulting relation will have the same degree (no. of attributes) as the original relation.
The resulting relation may have fewer tuples than the original relation.
The tuples to be returned are dependent on a condition that is part of the selection operator.

· Notation : C (R) Returns only those tuples in R that satisfy condition C

A condition C can be made up of any combination of comparison or logical operators that operate on the attributes of R.

Comparison operators:
Logical operators:

Consider the BOOK table:

Acc-No	Title	Author
100	“DBMS”	“Silbershatz”
200	“DBMS”	“Ramanuj”
300	“COMPILER”	“Silbershatz”
400	“COMPILER”	“Ullman”
500	“OS”	“Sudarshan”
600	“DBMS”	“Silbershatz”

σAcc-no>300(BOOK) =

Acc-No	Title	Author
400	“COMPILER”	“Ullman”
500	“OS”	“Sudarshan”
600	“DBMS”	“Silbershatz”

πAcc-no (σTitle=”DBMS” (BOOK))=

Acc-No

100

200

600

Project Operation (∏):

Projection is also a Unary operator.
The Projection operator is pi:
Projection limits the attributes that will be returned from the original relation.
The general syntax is: _attributes R or ∏_A1,A2,An (r)
Where attributes is the list of attributes to be displayed and R is the relation.
The resulting relation will have the same number of tuples as the original relation (unless there are duplicate tuples produced).
The degree of the resulting relation may be equal to or less than that of the original relation

πTitle(BOOK) =

Title

“DBMS”

“COMPILER”

“OS”

Union Operation (∪): Union operation performs binary union between two given relations. Duplicate tuples are automatically eliminated and is defined as: r ∪ s = {t | t ∈ r or t ∈ s}.

Consider:

Borrower

Cust-name	Loan-no
Ram	L-13
Shyam	L-30
Suleman	L-42

Depositor

Cust-name	Acc-no
Suleman	A-100
Radheshyam	A-300
Ram	A-401

List of customers who are either borrower or depositor at bank

πCust-name (Borrower) U πCust-name (Depositor)=

Cust-name

Ram

Shyam

Suleman

Radeshyam

Set Difference ( − ): The result of set difference operation is tuples which present in one relation but are not in the second relation.

Notation: r – s

Customers who are borrowers but not depositors = πCust-name (Borrower)  πCust-name (Depositor)=

Cust-name

Shyam

Cartesian Product (Χ) :Combines information of two different relations into one. Notation: r Χ s

Where r and s are relations and there output will be defined as:

r Χ s = { q t | q ∈ r and t ∈ s}

Consider the borrower and loan tables as follows:

Borrower:

Cust-name	Loan-no
Ram	L-13
Shyam	L-30
Suleman	L-42

Loan:

Loan-no	Amount
L-13	1000
L-30	20000
L-42	40000

Cross product of Borrower and Loan, Borrower × Loan =

Borrower.Cust-name	Borrower.Loan-no	Loan.Loan-no	Loan.Amount
Ram	L-13	L-13	1000
Ram	L-13	L-30	20000
Ram	L-13	L-42	40000
Shyam	L-30	L-13	1000
Shyam	L-30	L-30	20000
Shyam	L-30	L-42	40000
Suleman	L-42	L-13	1000
Suleman	L-42	L-30	20000
Suleman	L-42	L-42	40000

Rename operation ( ρ ): Results of relational algebra are also relations but without any name. The rename operation allows us to rename the output relation. rename operation is denoted with small greek letter rho ρ.

Notation:

ρ _x (E)

Where the result of expression E is saved with name of x.

Q) Discuss about various join operations in relational algebra

Join operations bring together two relations and combine their attributes and tuples in a specific fashion.
The generic join operator (called the Theta Join) is:
It takes as arguments the attributes from the two relations that are to be joined.
The join condition can be
When the join condition operator is = then we call this an Equijoin
Note that the attributes in common are repeated.

Assume we have the EMP relation and the DEPART relation:

EMP

Name	Office	Dept	Salary
Smith	400	CS	45000
Jones	220	Econ	35000
Green	160	Econ	50000
Brown	420	CS	65000
Smith	500	Fin	60000

DEPART

Dept	MainOffice	Phone
CS	404	555-1212
Econ	200	555-1234
Fin	501	555-4321
Hist	100	555-9876

Find all information on every employee including their department info:
EMP _{emp.Dept = depart.Dept}DEPART

Results:

Name	Office	EMP.Dept	Salary	DEPART.Dept	MainOffice	Phone
Smith	400	CS	45000	CS	404	555-1212
Jones	220	Econ	35000	Econ	200	555-1234
Green	160	Econ	50000	Econ	200	555-1234
Brown	420	CS	65000	CS	404	555-1212
Smith	500	Fin	60000	Fin	501	555-4321

Natural Join

Notice in the generic (Theta) join operation, any attributes in common (such as dept above) are repeated.
The Natural Join operation removes these duplicate attributes.
The natural join operator is: *
We can also assume using * that the join condition will be = on the two attributes in common.
Example: EMP * DEPART

Name	Office	Dept	Salary	MainOffice	Phone
Smith	400	CS	45000	404	555-1212
Jones	220	Econ	35000	200	555-1234
Green	160	Econ	50000	200	555-1234
Brown	420	CS	65000	404	555-1212
Smith	500	Fin	60000	501	555-4321

Outer Join

In the Join operations so far, only those tuples from both relations that satisfy the join condition are included in the output relation.
The Outer join includes other tuples as well according to a few rules.
Three types of outer joins:

Left Outer Join includes all tuples in the left hand relation and includes only those matching tuples from the right hand relation.
Right Outer Join includes all tuples in the right hand relation and includes only those matching tuples from the left hand relation.
Full Outer Join includes all tuples in the left hand relation and from the right hand relation.

Assume we have two relations: PEOPLE and MENU:

PEOPLE:

Name	Age	Food
Alice	21	Hamburger
Bill	24	Pizza
Carl	23	Beer
Dina	19	Shrimp

MENU:

Food	Day
Pizza	Monday
Hamburger	Tuesday
Chicken	Wednesday
Pasta	Thursday
Tacos	Friday

PEOPLE _{people.food = menu.food}MENU

Name	Age	people.Food	menu.Food	Day
Alice	21	Hamburger	Hamburger	Tuesday
Bill	24	Pizza	Pizza	Monday
Carl	23	Beer	NULL	NULL
Dina	19	Shrimp	NULL	NULL

PEOPLE _{people.food = menu.food}MENU

Name	Age	people.Food	menu.Food	Day
Bill	24	Pizza	Pizza	Monday
Alice	21	Hamburger	Hamburger	Tuesday
NULL	NULL	NULL	Chicken	Wednesday
NULL	NULL	NULL	Pasta	Thursday
NULL	NULL	NULL	Tacos	Friday

PEOPLE _{people.food = menu.food}MENU

Name	Age	people.Food	menu.Food	Day
Alice	21	Hamburger	Hamburger	Tuesday
Bill	24	Pizza	Pizza	Monday
Carl	23	Beer	NULL	NULL
Dina	19	Shrimp	NULL	NULL
NULL	NULL	NULL	Chicken	Wednesday
NULL	NULL	NULL	Pasta	Thursday
NULL	NULL	NULL	Tacos	Friday

Outer Union

The Outer Union operation is applied to partially union compatible relations.
Operator is: *
Example: PEOPLE * MENU

Name	Age	Food	Day
Alice	21	Hamburger	NULL
Bill	24	Pizza	NULL
Carl	23	Beer	NULL
Dina	19	Shrimp	NULL
NULL	NULL	Hamburger	Monday
NULL	NULL	Pizza	Tuesday
NULL	NULL	Chicken	Wednesday
NULL	NULL	Pasta	Thursday
NULL	NULL	Tacos	Friday

Division:

· Notation: r ÷ s

· Precondition: attributes in S must be a subset of attributes in R, i.e., S R. Let r, s be relations on schemas R and S, respectively, where

R(A1, . . , Am, B1,. . . ,Bn)

S(B1, . . . , Bn)

The result of r ÷ s is a relation on schema

R - S = (A1, . . . ,Am)

Suited for queries that include the phrase “for all".

The result of the division operator consists of the set of tuples from r defined over the attributes R - S that match the combination of every tuple in s.

Example: given the relations r, s:

Assignment

· Operation (<--) that provides a convenient way to express complex queries.

Idea: write query as sequential program consisting of a series of assignments followed by an expression whose value is “displayed" as the result of the query.

· Assignment must always be made to a temporary relation variable.

The result to the right of <-- is assigned to the relation variable on the left of the <--. This variable may be used in subsequent expressions.

Ex:

temp1 ← ΠR−S (r)

temp2 ← ΠR−S ((temp1 × s) − ΠR−S,S(r))

result = temp1 − temp2

Q) Explain Modifications of the Database in Relational Model.

· The content of the database may be modified using the operations insert, delete or update.

· Operations can be expressed using the assignment operator.

r_new← operations on(r_old)

Insert

· Either specify tuple(s) to be inserted, or write a query whose result is a set of tuples to be inserted.

· r ← r U E, where r is a relation and E is a relational algebra expression.

· STUDENTS ← STUDENTS U {(1024; 'Clark'; 'CSE'; 26)}

Delete

· Analogous to insert, but - operator instead of [ operator.

· Can only delete whole tuples, cannot delete values of particular attributes.

· STUDENTS ← STUDENTS - (σ_major='CS'(STUDENTS))

Update

· Can be expressed as sequence of delete and insert operations.

· Delete operation deletes tuples with their old value(s) and

· insert operation inserts tuples with their new value(s)

My Notes

Monday, 27 February 2017

Relational Database Management System- III B.Com(CA) - Unit-I

Q) Explain the Traditional Approach for Data Storage and the Need of DBMS (OR) explain the difference between File processing systems and database (OR) Problems with Early Information System (OR) Purpose of Database Systems.

How data is stored in flat files

What is Database Management System (DBMS)?

General Block Diagram

An example of the three levels:

Project Operation (∏):

πTitle(BOOK) =

Union Operation (∪): Union operation performs binary union between two given relations. Duplicate tuples are automatically eliminated and is defined as: r ∪ s = {t | t ∈ r or t ∈ s}.

Consider:

Borrower

Depositor

List of customers who are either borrower or depositor at bank

πCust-name (Borrower) U πCust-name (Depositor)=

Set Difference ( − ): The result of set difference operation is tuples which present in one relation but are not in the second relation.

Customers who are borrowers but not depositors = πCust-name (Borrower)  πCust-name (Depositor)=

Cartesian Product (Χ) :Combines information of two different relations into one. Notation: r Χ s

Cross product of Borrower and Loan, Borrower × Loan =

Rename operation ( ρ ): Results of relational algebra are also relations but without any name. The rename operation allows us to rename the output relation. rename operation is denoted with small greek letter rho ρ.

Notation:

ρ _x (E)

Where the result of expression E is saved with name of x.

No comments:

Post a Comment

Monday, 27 February 2017

Relational Database Management System- III B.Com(CA) - Unit-I

Q) Explain the Traditional Approach for Data Storage and the Need of DBMS (OR) explain the difference between File processing systems and database (OR) Problems with Early Information System (OR) Purpose of Database Systems.

How data is stored in flat files

What is Database Management System (DBMS)?

General Block Diagram

An example of the three levels:

Project Operation (∏):

πTitle(BOOK) =

Union Operation (∪): Union operation performs binary union between two given relations. Duplicate tuples are automatically eliminated and is defined as: r ∪ s = {t | t ∈ r or t ∈ s}.

Consider:

Borrower

Depositor

List of customers who are either borrower or depositor at bank

πCust-name (Borrower) U πCust-name (Depositor)=

Set Difference ( − ): The result of set difference operation is tuples which present in one relation but are not in the second relation.

Customers who are borrowers but not depositors = πCust-name (Borrower)  πCust-name (Depositor)=

Cartesian Product (Χ) :Combines information of two different relations into one. Notation: r Χ s

Cross product of Borrower and Loan, Borrower × Loan =

Rename operation ( ρ ): Results of relational algebra are also relations but without any name. The rename operation allows us to rename the output relation. rename operation is denoted with small greek letter rho ρ.

Notation:

ρ x (E)

Where the result of expression E is saved with name of x.

No comments:

Post a Comment

ρ _x (E)