In a database such as SQL Server, we create tables which are related to each other through a common column. For example, if we have to store customers’ related information, then the customer’s personal information gets stored in one table and the same customer’s transaction information, gets stored in another table. Both the tables have CustID as a common column.
We can implement RDBMS (Relational Database Management System) concepts through any database, such as SQL Server. So the same entity’s information flows down in different tables. This is part of the Normalization process within RDBMS framework.
However as part of the requirement, we may need to show a report containing customer information from multiple tables, on a matching column condition. Here arises the need of using the technique of Joins in SQL scripting.
Example - Let us create an Employee specific set of 2 tables in a new database.
Execute the following script in your instance of SQL Server.
You can observe that Deptno is the common field in both the tables, Dept & Emp. Department details are in Dept table and Employee table contains a DEPTNO column which represents the Department Number an Employee belongs to. Let us see two records of Emp table to get a clarity. Consider the record of SMITH in Emp table. Smith’s Deptno is 20, which means Smith’s Department Name (DNAME) is RESEARCH. Similarly consider ALLEN’s record. His Deptno is 30, which means Allen’s DNAME is Sales. So for the first two records, we have mapped the common column Deptno!
Now we want to display the ENAME and DNAME for the matching Deptno. This is possible by a technique within SQL scripting that is called as Joins.
There are multiple types of joins, but in this article we are going to focus on Inner Join or also known as Equi-Join.
Editorial Note: Check this artilce to understand Types of JOIN in SQL Server - Inner, Self, Outer and Cross JOIN
So what is Inner Join? Inner join is displaying data from multiple tables on matching values of common field (s), within the associated tables.
1) Non-ANSI
2) ANSI
Let us see the Non-ANSI syntax first:
One thing to note is that in the from clause of SQL statement, we have two tables separated by a comma. The mapping of common column is done in the where clause.
How does it work internally? Whenever a JOIN command is given, then each record of the first table is evaluated with each record of the second table. That means a Cartesian product happens and then as per the matching values of where clause, those records are shown. As per the Emp and Dept tables, 14 records of Emp are cross evaluated with 4 records of Dept table. So overall 14 * 4 = 56 evaluations will happen and matching value records are shown.
Now, let us remove the where clause and see what happens.
Oops!! The Cartesian product result is displayed now! Due to the absence of where clause, each evaluation has been displayed as well. So this output is logically going wrong! Actually we have accidentally landed in another type of join that is Cross Join.
Now, let us have the common field for display purpose. We will include the Deptno column along with Ename and Dname columns in the column list.
The common field’s name is same in both the tables. So a confusion arose in understanding that deptno from which table should be considered. The error would not have come if the common column had different name. In such a case where name is same, it is mandatory to mention the table name and . (dot) separator as prefix .
Let us rectify the above query.
Ideally table name prefix should be mentioned for each column irrespective of whether it is common or uncommon. Advantage of this approach is that it will give clarity to any other user who is not habitual with the tables & its columns.
So the same query will be written as follows:
JOIN – Best practice
So now I am coming to the best practice of writing a JOIN statement.
1. The comma within from clause gets replaced by Join type specification
2. The JOIN condition has to be given using ON clause instead of where clause.
The select statement will look like this:
It means that On clause is mandatory in ANSI type of join. The advantage of ANSI join is that it is a dedicated inner join and it prevents an accidental Cartesian product.
What if there is a many-to-many relationship within the tables? Right now the deptno column is not having a primary constraint in the Dept table. So we can add one more record of the same deptno 10.
Let us see that –
Now we will execute the same ANSI join statement again.
CLARK, KING and MILLER are displayed in both Accounting & Testing departments.
An Inner join will give correct result only when there is one-to-many relationship. Therefore normally this problem gets eliminated when we have Primary Key and Foreign Key relationship within the common field.
Let us delete this additional record of deptno 10 so that everything goes back to normal.
2. The data type and size of common fields should be ideally the same.
3. If the data type is not same, then using type casting functions the data types should be made same in the On clause.
4. There should be one-to-many relationship within the common field values.
I hope through this article, your Inner Join principle now is as sound as a dollar!
We can implement RDBMS (Relational Database Management System) concepts through any database, such as SQL Server. So the same entity’s information flows down in different tables. This is part of the Normalization process within RDBMS framework.
However as part of the requirement, we may need to show a report containing customer information from multiple tables, on a matching column condition. Here arises the need of using the technique of Joins in SQL scripting.
Example - Let us create an Employee specific set of 2 tables in a new database.
Execute the following script in your instance of SQL Server.
Use Master go Create Database Inner_Join go Use Inner_Join go CREATE TABLE EMP (EMPNO Integer, ENAME Varchar(15), JOB Varchar(9), MGR Integer, HIREDATE datetime, SAL Integer, COMM Integer, DEPTNO Integer) go
INSERT INTO EMP VALUES (7369, 'SMITH', 'CLERK', 7902, '12/17/80', 800, NULL, 20) INSERT INTO EMP VALUES (7499, 'ALLEN', 'SALESMAN', 7698, '5/20/81', 1600, 300, 30) INSERT INTO EMP VALUES (7521, 'WARD', 'SALESMAN', 7698, '5/22/81', 1250, 500, 30) INSERT INTO EMP VALUES (7566, 'JONES', 'MANAGER', 7839, '4/2/81', 2975, NULL, 20) INSERT INTO EMP VALUES (7654, 'MARTIN', 'SALESMAN', 7698, '9/28/81', 1250, 1400, 30) INSERT INTO EMP VALUES (7698, 'BLAKE', 'MANAGER', 7839, '5/1/81', 2850, NULL, 30) INSERT INTO EMP VALUES (7782, 'CLARK', 'MANAGER', 7839, '6/9/81', 2450, NULL, 10) INSERT INTO EMP VALUES (7788, 'SCOTT', 'ANALYST', 7566, '12/9/82', 3000, NULL, 20) INSERT INTO EMP VALUES (7839, 'KING', 'PRESIDENT', NULL, '11/17/81', 5000, NULL, 10) INSERT INTO EMP VALUES (7844, 'TURNER', 'SALESMAN', 7698, '9/8/1981',1500, 0, 30) INSERT INTO EMP VALUES (7876, 'ADAMS', 'CLERK', 7788, '1/12/83', 1100, NULL, 20) INSERT INTO EMP VALUES (7900, 'JAMES', 'CLERK', 7698, '12/3/81', 950, NULL, 30) INSERT INTO EMP VALUES (7902, 'FORD', 'ANALYST', 7566, '12/3/81', 3000, NULL, 20) INSERT INTO EMP VALUES (7934, 'MILLER', 'CLERK', 7782, '1/23/82', 1300, NULL, 10)
go CREATE TABLE DEPT (DEPTNO INTEGER, DNAME VARCHAR(14), LOC VARCHAR(13) ) go INSERT INTO DEPT VALUES (10,'ACCOUNTING','NEW YORK') INSERT INTO DEPT VALUES (20,'RESEARCH','DALLAS') INSERT INTO DEPT VALUES (30,'SALES','CHICAGO') INSERT INTO DEPT VALUES (40,'OPERATIONS','BOSTON') goLet us see the tables to understand the relationship.
use Inner_Join go select * from dept go select * from emp go
You can observe that Deptno is the common field in both the tables, Dept & Emp. Department details are in Dept table and Employee table contains a DEPTNO column which represents the Department Number an Employee belongs to. Let us see two records of Emp table to get a clarity. Consider the record of SMITH in Emp table. Smith’s Deptno is 20, which means Smith’s Department Name (DNAME) is RESEARCH. Similarly consider ALLEN’s record. His Deptno is 30, which means Allen’s DNAME is Sales. So for the first two records, we have mapped the common column Deptno!
Now we want to display the ENAME and DNAME for the matching Deptno. This is possible by a technique within SQL scripting that is called as Joins.
There are multiple types of joins, but in this article we are going to focus on Inner Join or also known as Equi-Join.
Editorial Note: Check this artilce to understand Types of JOIN in SQL Server - Inner, Self, Outer and Cross JOIN
So what is Inner Join? Inner join is displaying data from multiple tables on matching values of common field (s), within the associated tables.
Inner Join Syntax
There are two types of syntaxes for Inner join:1) Non-ANSI
2) ANSI
Let us see the Non-ANSI syntax first:
select ename, dname from emp, dept where emp.deptno = dept.deptno
One thing to note is that in the from clause of SQL statement, we have two tables separated by a comma. The mapping of common column is done in the where clause.
How does it work internally? Whenever a JOIN command is given, then each record of the first table is evaluated with each record of the second table. That means a Cartesian product happens and then as per the matching values of where clause, those records are shown. As per the Emp and Dept tables, 14 records of Emp are cross evaluated with 4 records of Dept table. So overall 14 * 4 = 56 evaluations will happen and matching value records are shown.
Now, let us remove the where clause and see what happens.
select ename, dname from emp, dept
Oops!! The Cartesian product result is displayed now! Due to the absence of where clause, each evaluation has been displayed as well. So this output is logically going wrong! Actually we have accidentally landed in another type of join that is Cross Join.
Now, let us have the common field for display purpose. We will include the Deptno column along with Ename and Dname columns in the column list.
select ename, dname, deptno from emp, dept where emp.deptno = dept.deptnoIt has thrown the following error –
The common field’s name is same in both the tables. So a confusion arose in understanding that deptno from which table should be considered. The error would not have come if the common column had different name. In such a case where name is same, it is mandatory to mention the table name and . (dot) separator as prefix .
Let us rectify the above query.
select ename, dname, emp.deptno from emp, dept where emp.deptno = dept.deptno
Ideally table name prefix should be mentioned for each column irrespective of whether it is common or uncommon. Advantage of this approach is that it will give clarity to any other user who is not habitual with the tables & its columns.
So the same query will be written as follows:
select emp.ename, dept.dname, emp.deptno from emp, dept where emp.deptno = dept.deptnoBut in projects, table names many a times are lengthy. So mentioning a lengthy table name repeatedly the will be cumbersome. To avoid that, introduce a table alias in the from clause. Use that alias in the column list and where clause.
JOIN – Best practice
So now I am coming to the best practice of writing a JOIN statement.
select e.ename, d.dname, e.deptno from emp e, dept d where e.deptno = d.deptnoInternally, from clause is getting executed first, then the where clause and finally the select clause as per the above statement.
INNER JOIN – ANSI Syntax
So far we have seen the NON-ANSI syntax of joining tables. Now let us see the ANSI syntax. As per ANSI Joins, there are two changes:1. The comma within from clause gets replaced by Join type specification
2. The JOIN condition has to be given using ON clause instead of where clause.
The select statement will look like this:
select e.ename, d.dname, e.deptno from emp e Inner Join dept d On e.deptno = d.deptnoAmongst the multiple types of joins, the default type of join is an Inner Join. So Inner is an optional keyword. The query can have only Join as the keyword which will be as follows:
select e.ename, d.dname, e.deptno from emp e Join dept d On e.deptno = d.deptnoIn Non-ANSI style of join, we have observed that if the where clause is skipped, then it shows a Cartesian product output. On the same lines, let us try to omit the On clause in ANSI and let’s see what happens.
select e.ename, d.dname, e.deptno from emp e Join dept dIt is throwing an error.
It means that On clause is mandatory in ANSI type of join. The advantage of ANSI join is that it is a dedicated inner join and it prevents an accidental Cartesian product.
About one-to-many relationship for Inner join
The inner or equi join will be logically correct if we have one-to-many relationship between the values of common field. In the Emp and Dept tables, there is such relationship, so the output is correct, i.e a matching deptno is once in the dept table and it is multiple times in the emp table.What if there is a many-to-many relationship within the tables? Right now the deptno column is not having a primary constraint in the Dept table. So we can add one more record of the same deptno 10.
Insert into Dept Values(10, 'Testing', 'CALIFORNIA') GoSo right now there are 2 records of the same deptno 10 in dept table.
Let us see that –
select * from dept
Now we will execute the same ANSI join statement again.
select e.ename, d.dname, e.deptno from emp e Join dept d On e.deptno = d.deptnoThe output has logically gone wrong!
CLARK, KING and MILLER are displayed in both Accounting & Testing departments.
An Inner join will give correct result only when there is one-to-many relationship. Therefore normally this problem gets eliminated when we have Primary Key and Foreign Key relationship within the common field.
Let us delete this additional record of deptno 10 so that everything goes back to normal.
Delete from Dept where DName = 'Testing' go
Important features of Inner Join
1. Common field name may be same or different.2. The data type and size of common fields should be ideally the same.
3. If the data type is not same, then using type casting functions the data types should be made same in the On clause.
4. There should be one-to-many relationship within the common field values.
I hope through this article, your Inner Join principle now is as sound as a dollar!
No comments:
Post a Comment