Published on

Joining of Tables - A Comprehensive Guide MY-SQL

Authors

Introduction

Joining tables is akin to piecing together a puzzle. It involves combining data from two or more tables based on specific criteria. This process allows us to extract meaningful information from the interconnected tables, leading to more insightful data analysis.

Primary Key

Primary key uniquely identifies tuples (rows) in a relation(table). There can only one primary key in a table.

Foreign Key

Foreign Key is a field in a table which references its values from the primary key field of another table. There can multiple foreign keys in a single table. This is also known as referential integrity.

Degree

Degree is the number of columns in a table

Cardinality

Cardinality is the number of rows in a table.

student.sql

+-------+----------------+
| admno |    name        |
+-------+----------------+
| 1     | John Doe       |
| 2     | Jane Smith     |
| 3     | Alice Johnson  |
| 4     | Bob Wilson     |
| 5     | Eva Davis      |
+-------+----------------+

student_class.sql

+-------+------------+---------+
| admno |   class    | section |
+-------+------------+---------+
| 1     | Math       | A       |
| 2     | English    | B       |
| 3     | Science    | A       |
| 4     | History    | C       |
| 5     | Geography  | B       |
+-------+------------+---------+

student Table:

  • Degree: 2 (Two columns: admno and name)

  • Cardinality: Assuming that the admno column is unique for each student (which is typical for student ID numbers), the cardinality of the student table is 5. This means there are 5 students in the table.

student_class Table:

Degree: 3 (Three columns: admno, class, and section)

Cardinality: The cardinality of the student_class table is also 5 because it contains records for 5 different students' class and section assignments. Each row represents a different student's enrollment in a specific class and section, and there are 5 such records in the table.

Non-Equi / Cartesian Product / Cross join

In non-equi join every record of the first table gets joined with every other record of second table.

Consider the table students and student_class

student.sql

+-------+----------------+
| admno |    name        |
+-------+----------------+
| 1     | John Doe       |
| 2     | Jane Smith     |
| 3     | Alice Johnson  |
| 4     | Bob Wilson     |
| 5     | Eva Davis      |
+-------+----------------+

student_class.sql

+-------+------------+---------+
| admno |   class    | section |
+-------+------------+---------+
| 1     | Math       | A       |
| 2     | English    | B       |
| 3     | Science    | A       |
| 4     | History    | C       |
| 5     | Geography  | B       |
+-------+------------+---------+

To specify the non-equi join, we write the command as:-

query.sql

SELECT * FROM students, student_class;

-- Output

+-------+----------------+-------+------------+---------+
| admno |    name        | admno |   class    | section |
+-------+----------------+-------+------------+---------+
| 1     | John Doe       | 1     | Math       | A       |
| 1     | John Doe       | 2     | English    | B       |
| 1     | John Doe       | 3     | Science    | A       |
| 1     | John Doe       | 4     | History    | C       |
| 1     | John Doe       | 5     | Geography  | B       |
| 2     | Jane Smith     | 1     | Math       | A       |
| 2     | Jane Smith     | 2     | English    | B       |
| 2     | Jane Smith     | 3     | Science    | A       |
| 2     | Jane Smith     | 4     | History    | C       |
| 2     | Jane Smith     | 5     | Geography  | B       |
| 3     | Alice Johnson  | 1     | Math       | A       |
| 3     | Alice Johnson  | 2     | English    | B       |
| 3     | Alice Johnson  | 3     | Science    | A       |
| 3     | Alice Johnson  | 4     | History    | C       |
| 3     | Alice Johnson  | 5     | Geography  | B       |
| 4     | Bob Wilson     | 1     | Math       | A       |
| 4     | Bob Wilson     | 2     | English    | B       |
| 4     | Bob Wilson     | 3     | Science    | A       |
| 4     | Bob Wilson     | 4     | History    | C       |
| 4     | Bob Wilson     | 5     | Geography  | B       |
| 5     | Eva Davis      | 1     | Math       | A       |
| 5     | Eva Davis      | 2     | English    | B       |
| 5     | Eva Davis      | 3     | Science    | A       |
| 5     | Eva Davis      | 4     | History    | C       |
| 5     | Eva Davis      | 5     | Geography  | B       |
+-------+----------------+-------+------------+---------+

Equi-Join

In equi-join, we join two tables on the basis of common field in two tables.

Consider the table students and student_class

student.sql

+-------+----------------+
| admno |    name        |
+-------+----------------+
| 1     | John Doe       |
| 2     | Jane Smith     |
| 3     | Alice Johnson  |
| 4     | Bob Wilson     |
| 5     | Eva Davis      |
+-------+----------------+

student_class.sql

+-------+------------+---------+
| admno |   class    | section |
+-------+------------+---------+
| 1     | Math       | A       |
| 2     | English    | B       |
| 3     | Science    | A       |
| 4     | History    | C       |
| 5     | Geography  | B       |
+-------+------------+---------+

To display name, class and section, the command will be:-

query.sql

SELECT name, class, section FROM students, student_class
WHERE students.admno = student_class.admno;

-- Output

+----------------+------------+---------+
|     name       |   class    | section |
+----------------+------------+---------+
| John Doe       | Math       | A       |
| Jane Smith     | English    | B       |
| Alice Johnson  | Science    | A       |
| Bob Wilson     | History    | C       |
| Eva Davis      | Geography  | B       |
+----------------+------------+---------+

If we want to display the filed having same name in both the tables then we have to prefix it with tablename and then put a dot (.) and then write fieldname eg. In equi join we specify the admno field of students table with students.admno and to specify the admno field of student_class table then we write student_class.admno

We can also use an alias name to the table something like this.

query.sql

SELECT name, class, section FROM students a, student_class b
WHERE a.admno = b.admno;

-- Output

+----------------+------------+---------+
|     name       |   class    | section |
+----------------+------------+---------+
| John Doe       | Math       | A       |
| Jane Smith     | English    | B       |
| Alice Johnson  | Science    | A       |
| Bob Wilson     | History    | C       |
| Eva Davis      | Geography  | B       |
+----------------+------------+---------+

To display also the admno in the above query we will be giving the command:-

query.sql

SELECT a.admno, name, class, section FROM students a, student_class b
WHERE a.admno = b.admno;

-- Output

+-------+----------------+------------+---------+
| admno |     name       |   class    | section |
+-------+----------------+------------+---------+
| 1     | John Doe       | Math       | A       |
| 2     | Jane Smith     | English    | B       |
| 3     | Alice Johnson  | Science    | A       |
| 4     | Bob Wilson     | History    | C       |
| 5     | Eva Davis      | Geography  | B       |
+-------+----------------+------------+---------+

To add more of the condition to the table we can join it by using and in where clause:-

If we want to display the admno, name, class and section of the students of class XII then the command will be:-

query.sql

SELECT name, class, section FROM students a, student_class b
WHERE a.admno = b.admno and class="English";

-- Output

+-------------+---------+---------+
|     name    |  class  | section |
+-------------+---------+---------+
| Jane Smith  | English | B       |
+-------------+---------+---------+

If we want to display the admno, name, class and section of the students of class XI - B then the command will be:-

query.sql

SELECT name, class, section FROM students a, student_class b
WHERE a.admno = b.admno and class="English" and section="B";

-- Output

+-------------+---------+---------+
|     name    |  class  | section |
+-------------+---------+---------+
| Jane Smith  | English | B       |
+-------------+---------+---------+

Union

  • Union is an operation of combining the output of two SELECT statements.
  • Union of two SELECT statements can be performed only if their outputs contain same number of columns, data types of corresponding columns are also the same.
  • Union does not display any duplicate rows unless ALL is specified with it.
query.sql

SELECT name FROM students 
UNION 
SELECT class from student_class

-- Output

+----------------+
|     name       |
+----------------+
| John Doe       |
| Jane Smith     |
| Alice Johnson  |
| Bob Wilson     |
| Eva Davis      |
| Math           |
| English        |
| Science        |
| History        |
| Geography      |
+----------------+

query.sql

SELECT name FROM students 
UNION ALL
SELECT class from student_class

-- Output

+----------------+
|     name       |
+----------------+
| John Doe       |
| Jane Smith     |
| Alice Johnson  |
| Bob Wilson     |
| Eva Davis      |
| Math           |
| English        |
| Science        |
| History        |
| Geography      |
| Math           |
| English        |
| Science        | 
| History        |
| Geography      |
+----------------+

In MySQL, there are several types of joins that allow you to combine rows from two or more tables based on a related column between them. Here are the most common types of joins in MySQL:

INNER JOIN

An inner join returns only the rows that have matching values in both tables. Syntax: SELECT * FROM table1 INNER JOIN table2 ON table1.column = table2.column;


SELECT student.name, student_class.class, student_class.section
FROM student
INNER JOIN student_class ON student.admno = student_class.admno;

-- Output

+----------------+------------+---------+
|     name       |   class    | section |
+----------------+------------+---------+
| John Doe       | Math       | A       |
| Jane Smith     | English    | B       |
| Alice Johnson  | Science    | A       |
| Bob Wilson     | History    | C       |
| Eva Davis      | Geography  | B       |
+----------------+------------+---------+

LEFT JOIN (or LEFT OUTER JOIN)

A left join returns all the rows from the left table (table1) and the matched rows from the right table (table2). If there is no match, NULL values are returned for the columns from the right table. Syntax: SELECT * FROM table1 LEFT JOIN table2 ON table1.column = table2.column;


SELECT student.name, student_class.class, student_class.section
FROM student
LEFT JOIN student_class ON student.admno = student_class.admno;

-- Output

+----------------+------------+---------+
|     name       |   class    | section |
+----------------+------------+---------+
| John Doe       | Math       | A       |
| Jane Smith     | English    | B       |
| Alice Johnson  | Science    | A       |
| Bob Wilson     | History    | C       |
| Eva Davis      | Geography  | B       |
+----------------+------------+---------+

RIGHT JOIN (or RIGHT OUTER JOIN)

A right join returns all the rows from the right table (table2) and the matched rows from the left table (table1). If there is no match, NULL values are returned for the columns from the left table.

Syntax: SELECT * FROM table1 RIGHT JOIN table2 ON table1.column = table2.column;


SELECT student.name, student_class.class, student_class.section
FROM student
RIGHT JOIN student_class ON student.admno = student_class.admno;

-- Output

+----------------+------------+---------+
|     name       |   class    | section |
+----------------+------------+---------+
| John Doe       | Math       | A       |
| Jane Smith     | English    | B       |
| Alice Johnson  | Science    | A       |
| Bob Wilson     | History    | C       |
| Eva Davis      | Geography  | B       |
+----------------+------------+---------+

FULL OUTER JOIN (or FULL JOIN)

A full outer join returns all the rows when there is a match in either the left table (table1) or the right table (table2). If there is no match, NULL values are returned for the columns from the non-matching table. Note: FULL OUTER JOIN is not directly supported in MySQL, but it can be emulated using a combination of LEFT JOIN and RIGHT JOIN.

Syntax: Not directly supported, but can be emulated.


SELECT student.name, student_class.class, student_class.section
FROM student
LEFT JOIN student_class ON student.admno = student_class.admno
UNION ALL
SELECT NULL, student_class.class, student_class.section
FROM student_class
LEFT JOIN student ON student.admno = student_class.admno
WHERE student.admno IS NULL;

-- Output

+----------------+------------+---------+
|     name       |   class    | section |
+----------------+------------+---------+
| John Doe       | Math       | A       |
| Jane Smith     | English    | B       |
| Alice Johnson  | Science    | A       |
| Bob Wilson     | History    | C       |
| Eva Davis      | Geography  | B       |
| NULL           | Math       | A       |
| NULL           | English    | B       |
| NULL           | Science    | A       |
| NULL           | History    | C       |
| NULL           | Geography  | B       |
+----------------+------------+---------+

SELF JOIN

A self join is used to combine rows from the same table. It's often used to find relationships within the same table. Syntax: SELECT * FROM table1 AS t1 INNER JOIN table1 AS t2 ON t1.column = t2.column;


SELECT s1.name AS student_name, s2.name AS friend_name
FROM students s1
INNER JOIN students s2 ON s1.admno <> s2.admno;

-- Output

+---------------+--------------+
| student_name  | friend_name  |
+---------------+--------------+
| John Doe      | Jane Smith   |
| John Doe      | Alice Johnson|
| John Doe      | Bob Wilson   |
| John Doe      | Eva Davis    |
| Jane Smith    | John Doe     |
| Jane Smith    | Alice Johnson|
| Jane Smith    | Bob Wilson   |
| Jane Smith    | Eva Davis    |
| Alice Johnson | John Doe     |
| Alice Johnson | Jane Smith   |
| Alice Johnson | Bob Wilson   |
| Alice Johnson | Eva Davis    |
| Bob Wilson    | John Doe     |
| Bob Wilson    | Jane Smith   |
| Bob Wilson    | Alice Johnson|
| Bob Wilson    | Eva Davis    |
| Eva Davis     | John Doe     |
| Eva Davis     | Jane Smith   |
| Eva Davis     | Alice Johnson|
| Eva Davis     | Bob Wilson   |
+---------------+--------------+

CROSS JOIN (or Cartesian Join)

A cross join combines all rows from the first table with all rows from the second table, creating a Cartesian product. Syntax: SELECT * FROM table1 CROSS JOIN table2;


SELECT students.name, student_class.class, student_class.section
FROM students
CROSS JOIN student_class;

-- Output

+----------------+------------+---------+
|     name       |   class    | section |
+----------------+------------+---------+
| John Doe       | Math       | A       |
| John Doe       | English    | B       |
| John Doe       | Science    | A       |
| John Doe       | History    | C       |
| John Doe       | Geography  | B       |
| Jane Smith     | Math       | A       |
| Jane Smith     | English    | B       |
| Jane Smith     | Science    | A       |
| Jane Smith     | History    | C       |
| Jane Smith     | Geography  | B       |
| Alice Johnson  | Math       | A       |
| Alice Johnson  | English    | B       |
| Alice Johnson  | Science    | A       |
| Alice Johnson  | History    | C       |
| Alice Johnson  | Geography  | B       |
| Bob Wilson     | Math       | A       |
| Bob Wilson     | English    | B       |
| Bob Wilson     | Science    | A       |
| Bob Wilson     | History    | C       |
| Bob Wilson     | Geography  | B       |
| Eva Davis      | Math       | A       |
| Eva Davis      | English    | B       |
| Eva Davis      | Science    | A       |
| Eva Davis      | History    | C       |
| Eva Davis      | Geography  | B       |
+----------------+------------+---------+

NATURAL JOIN

A natural join is based on columns with the same name in both tables. It returns rows with matching values in those columns. Syntax: SELECT * FROM table1 NATURAL JOIN table2;


SELECT students.name, student_class.class, student_class.section
FROM students
NATURAL JOIN student_class;

-- Output

+----------------+------------+---------+
|     name       |   class    | section |
+----------------+------------+---------+
| John Doe       | Math       | A       |
| Jane Smith     | English    | B       |
| Alice Johnson  | Science    | A       |
| Bob Wilson     | History    | C       |
| Eva Davis      | Geography  | B       |
+----------------+------------+---------+


SELF OUTER JOIN

A self outer join combines rows from the same table, similar to a self join, but it includes non-matching rows from the same table. Syntax: Not directly supported in MySQL but can be emulated using UNION.


SELECT s1.name AS student1_name, s2.name AS student2_name, s1.class AS student1_class, s2.class AS student2_class
FROM student_class s1
LEFT JOIN student_class s2 ON s1.class = s2.class
WHERE s1.admno <> s2.admno;

-- Output

+---------------+--------------+----------------+-----------------+
| student1_name | student2_name| student1_class | student2_class  |
+---------------+--------------+----------------+-----------------+
| John Doe      | Jane Smith   | Math           | Math            |
| Jane Smith    | John Doe     | English        | English         |
| Jane Smith    | Eva Davis    | English        | English         |
| Alice Johnson | Alice Johnson| Science        | Science         |
| Bob Wilson    | Eva Davis    | History        | History         |
| Eva Davis     | Jane Smith   | Geography      | Geography       |
| Jane Smith    | Eva Davis    | Geography      | Geography       |
+---------------+--------------+----------------+-----------------+

These are the primary types of joins in MySQL. The choice of which join to use depends on your specific data requirements and how you want to combine data from multiple tables.

Conclusion

In this comprehensive guide to joining tables in MySQL, we have explored the fundamental concepts and techniques necessary for working with relational databases. Joining tables is a critical skill for anyone involved in database management, data analysis, or software development. Here are the key takeaways from this guide:

  • Understanding Relational Databases: Relational databases are structured collections of data organized into tables, where relationships between tables are established using keys, primarily primary and foreign keys.

  • Types of Joins: MySQL supports various types of joins, including INNER JOIN, LEFT JOIN (or LEFT OUTER JOIN), RIGHT JOIN (or RIGHT OUTER JOIN), and FULL JOIN (or FULL OUTER JOIN). Each join type serves a specific purpose in combining data from multiple tables.

  • Syntax: The basic syntax for joining tables involves specifying the tables to join, the columns on which to join them, and any filtering conditions in the WHERE clause.

  • Primary and Foreign Keys: Primary keys uniquely identify records in a table, while foreign keys establish relationships between tables. Properly defining and using keys is crucial for efficient and accurate data retrieval.

  • Alias Usage: Table aliases (AS) can be used to simplify queries with multiple joins by providing short, convenient names for tables.

  • Filtering Data: Filtering criteria can be applied to joined tables in the WHERE clause to extract specific data based on conditions.

  • Performance Considerations: Proper indexing, minimizing the use of wildcard (*) in SELECT statements, and optimizing queries using EXPLAIN are essential for improving join query performance.

  • Complex Joins: Complex joins may involve multiple tables, subqueries, or self-joins. Understanding how to approach these situations is vital for handling real-world data scenarios.

  • Troubleshooting: Debugging and troubleshooting join queries may require examining error messages, analyzing query execution plans, and understanding the database schema thoroughly.

  • Best Practices: Following best practices, such as naming conventions, using aliases, and documenting your queries, can enhance the maintainability and readability of your SQL code.

FAQs

  1. What is the difference between INNER JOIN and LEFT JOIN?

INNER JOIN: Returns only the rows that have matching values in both tables being joined. LEFT JOIN: Returns all rows from the left table and the matching rows from the right table. If there is no match, NULL values are included for columns from the right table.

  1. Can I join more than two tables in a single query?

Yes, you can join more than two tables in a single query by extending the join clauses and specifying the appropriate relationships between tables.

  1. How can I improve the performance of join queries?

To improve join query performance, you can:

  • Ensure that the joined columns are indexed.
  • Limit the use of wildcard (*) in SELECT statements.
  • Use EXPLAIN to analyze query execution plans and identify bottlenecks.
  1. What should I do if I encounter a "Ambiguous column name" error in a join query?

This error occurs when a column name exists in multiple tables being joined, and the database engine cannot determine which one to use. To resolve this, prefix the column name with the table alias or table name to make it unambiguous.

  1. Are there any limitations to joining tables in MySQL?

While MySQL provides powerful tools for joining tables, it's important to consider the database's size, indexing, and hardware resources. Very large joins can impact performance and may require careful optimization.

This guide should provide you with a solid foundation for joining tables in MySQL. Remember that practice is key to mastering this skill, so experiment with different scenarios and continue to build your expertise in working with relational databases.