By using joins, you can retrieve data from two or more tables based on logical relationships between the tables. Joins indicate how Microsoft® SQL Server™ 2000 should use data from one table to select the rows in another table.
A join condition defines the way two tables are related in a query by:
- Specifying the column from each table to be used for the join. A typical join condition specifies a foreign key from one table and its associated key in the other table.
- Specifying a comparison operator (=, <>, and so on) to be used in comparing values from the columns.
Joins can be specified in either the FROM or the WHERE clause. The join conditions combine with the WHERE and HAVING search conditions to control the rows that are selected from the base tables referenced in the FROM clause.
Specifying the join conditions in the FROM clause helps separate them from any other search conditions that may be specified in a WHERE clause, and is the recommended method for specifying joins. A simplified SQL-92 FROM clause join syntax is:
FROM first_table join_type second_table [ON (join_condition)]
join_type specifies what kind of join is performed: an inner, outer, or cross join. join_condition defines the predicate to be evaluated for each pair of joined rows. This is an example of a FROM clause join specification:
FROM Suppliers JOIN Products ON (Suppliers.SupplierID = Products.SupplierID)
This is a simple SELECT statement using this join:
SELECT ProductID, Suppliers.SupplierID, CompanyName
FROM Suppliers JOIN Products
     ON (Suppliers.SupplierID = Products.SupplierID)
WHERE UnitPrice > $10
  AND CompanyName LIKE N'F%'
GO
The SELECT statement returns the product and supplier information for every product that costs more than $10 and is supplied by a company whose name starts with the letter F.
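The join above can be tried out with a small, self-contained sketch. It assumes Python's built-in sqlite3 module as a stand-in for SQL Server, and the sample rows are invented for illustration. Because SQLite has no money literal or N'' string prefix, the sketch writes `10` and `'F%'` instead of `$10` and `N'F%'`.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Suppliers (SupplierID INTEGER PRIMARY KEY, CompanyName TEXT);
    CREATE TABLE Products  (ProductID INTEGER PRIMARY KEY, SupplierID INTEGER, UnitPrice REAL);
    -- Invented sample rows, not the real Northwind data.
    INSERT INTO Suppliers VALUES (1, 'Formaggi Fortini'), (2, 'Exotic Liquids');
    INSERT INTO Products  VALUES (10, 1, 34.80), (11, 1, 2.50), (12, 2, 18.00);
""")

# Join condition in the FROM clause, search conditions in the WHERE clause.
rows = conn.execute("""
    SELECT ProductID, Suppliers.SupplierID, CompanyName
    FROM Suppliers JOIN Products
         ON (Suppliers.SupplierID = Products.SupplierID)
    WHERE UnitPrice > 10
      AND CompanyName LIKE 'F%'
    ORDER BY ProductID
""").fetchall()

print(rows)
```

With these sample rows, only product 10 qualifies: product 11 is too cheap, and product 12 belongs to a supplier whose name does not start with F.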
When multiple tables are referenced in a single query, all column references must be unambiguous. In the previous example, both the Products and Suppliers tables have a column named SupplierID. Any column name that is duplicated between two or more tables referenced in the query must be qualified with the table name. All references to the SupplierID columns in the example are qualified.
When a column name is not duplicated in two or more tables used in the query, references to it do not have to be qualified with the table name. This is shown in the previous example. Such a SELECT statement is sometimes difficult to understand because there is nothing to indicate the table that provided each column. The readability of the query is improved if all columns are qualified with their table names. The readability is further improved if table aliases are used, especially when the table names themselves must be qualified with the database and owner names. This is the same example, except that table aliases have been assigned and the columns qualified with table aliases to improve readability:
SELECT P.ProductID, S.SupplierID, S.CompanyName
FROM Suppliers AS S JOIN Products AS P
     ON (S.SupplierID = P.SupplierID)
WHERE P.UnitPrice > $10
  AND S.CompanyName LIKE N'F%'
The previous examples specified the join conditions in the FROM clause, which is the preferred method. This query contains the same join condition specified in the WHERE clause:
SELECT P.ProductID, S.SupplierID, S.CompanyName
FROM Suppliers AS S, Products AS P
WHERE S.SupplierID = P.SupplierID
  AND P.UnitPrice > $10
  AND S.CompanyName LIKE N'F%'
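As a rough check that the two forms are interchangeable for an inner join, the following sketch runs both against the same data and compares the results. It assumes SQLite (through Python's sqlite3 module) in place of SQL Server, and the sample rows are invented.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Suppliers (SupplierID INTEGER PRIMARY KEY, CompanyName TEXT);
    CREATE TABLE Products  (ProductID INTEGER PRIMARY KEY, SupplierID INTEGER, UnitPrice REAL);
    INSERT INTO Suppliers VALUES (1, 'Formaggi Fortini'), (2, 'Exotic Liquids');
    INSERT INTO Products  VALUES (10, 1, 34.80), (11, 1, 2.50), (12, 2, 18.00);
""")

# Join condition written in the FROM clause (the recommended form).
from_clause_join = """
    SELECT P.ProductID, S.SupplierID, S.CompanyName
    FROM Suppliers AS S JOIN Products AS P
         ON (S.SupplierID = P.SupplierID)
    WHERE P.UnitPrice > 10 AND S.CompanyName LIKE 'F%'
    ORDER BY P.ProductID
"""

# The same join condition written in the WHERE clause.
where_clause_join = """
    SELECT P.ProductID, S.SupplierID, S.CompanyName
    FROM Suppliers AS S, Products AS P
    WHERE S.SupplierID = P.SupplierID
      AND P.UnitPrice > 10 AND S.CompanyName LIKE 'F%'
    ORDER BY P.ProductID
"""

rows_from = conn.execute(from_clause_join).fetchall()
rows_where = conn.execute(where_clause_join).fetchall()
print(rows_from == rows_where)
```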
The select list for a join can reference all the columns in the joined tables, or any subset of the columns. The select list is not required to contain columns from every table in the join. For example, in a three-table join, one table can serve only as a bridge between the other two tables, and none of the columns from that middle table have to be referenced in the select list.
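A three-table join of this kind might look like the sketch below, where Products serves only as the bridge between Suppliers and Categories. The Categories table, its columns, and the sample rows are assumptions made for illustration, and SQLite (through Python's sqlite3 module) stands in for SQL Server.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Suppliers  (SupplierID INTEGER PRIMARY KEY, CompanyName TEXT);
    CREATE TABLE Categories (CategoryID INTEGER PRIMARY KEY, CategoryName TEXT);
    CREATE TABLE Products   (ProductID INTEGER PRIMARY KEY, SupplierID INTEGER, CategoryID INTEGER);
    INSERT INTO Suppliers  VALUES (1, 'Formaggi Fortini'), (2, 'Exotic Liquids');
    INSERT INTO Categories VALUES (1, 'Dairy'), (2, 'Beverages');
    INSERT INTO Products   VALUES (10, 1, 1), (12, 2, 2);
""")

# Products links suppliers to categories, but no Products column is selected.
rows = conn.execute("""
    SELECT DISTINCT S.CompanyName, C.CategoryName
    FROM Suppliers AS S
         JOIN Products AS P ON (S.SupplierID = P.SupplierID)
         JOIN Categories AS C ON (P.CategoryID = C.CategoryID)
    ORDER BY S.CompanyName
""").fetchall()

print(rows)
```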
Although join conditions usually have equality comparisons (=), other comparison or relational operators can be specified, as can other predicates. For more information, see Using Operators in Expressions and WHERE.
When SQL Server processes joins, the query engine chooses the most efficient method (out of several possibilities) of processing the join. Although the physical execution of various joins uses many different optimizations, the logical sequence is:
- The join conditions in the FROM clause are applied.
- The join conditions and search conditions from the WHERE clause are applied.
- The search conditions from the HAVING clause are applied.
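This sequence can sometimes influence the results of the query if conditions are moved between the FROM and WHERE clauses. This matters mainly for outer joins: a condition in the ON clause determines which rows match, whereas the same condition in the WHERE clause filters the rows after the outer join has already preserved the unmatched rows.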
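The effect is easiest to see with an outer join. The sketch below, which assumes SQLite (through Python's sqlite3 module) as a stand-in for SQL Server and uses invented sample rows, applies the same price condition first in the ON clause and then in the WHERE clause of a left outer join.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Suppliers (SupplierID INTEGER PRIMARY KEY, CompanyName TEXT);
    CREATE TABLE Products  (ProductID INTEGER PRIMARY KEY, SupplierID INTEGER, UnitPrice REAL);
    INSERT INTO Suppliers VALUES (1, 'Formaggi Fortini'), (2, 'Exotic Liquids');
    -- Supplier 2 has only a product that costs less than 10.
    INSERT INTO Products  VALUES (10, 1, 34.80), (12, 2, 5.00);
""")

# Price condition in the ON clause: it only decides which products match,
# so every supplier is still returned.
on_rows = conn.execute("""
    SELECT S.SupplierID, P.ProductID
    FROM Suppliers AS S LEFT OUTER JOIN Products AS P
         ON S.SupplierID = P.SupplierID AND P.UnitPrice > 10
    ORDER BY S.SupplierID
""").fetchall()

# Price condition in the WHERE clause: it is applied after the join,
# so suppliers without a qualifying product are filtered out.
where_rows = conn.execute("""
    SELECT S.SupplierID, P.ProductID
    FROM Suppliers AS S LEFT OUTER JOIN Products AS P
         ON S.SupplierID = P.SupplierID
    WHERE P.UnitPrice > 10
    ORDER BY S.SupplierID
""").fetchall()

print(on_rows)
print(where_rows)
```

In the first query, supplier 2 appears with a null ProductID. In the second query, supplier 2 is missing entirely.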
Columns used in a join condition are not required to have the same name or the same data type. However, if the data types are not identical, they must be compatible, or be types that SQL Server can implicitly convert. If the data types cannot be implicitly converted, the join condition must explicitly convert the data type using the CAST or CONVERT function. For more information about implicit and explicit conversions, see Data Type Conversion.
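An explicit conversion in a join condition could be sketched as follows. The LegacyOrders table and its SupplierCode column are hypothetical names invented for this example, and SQLite (through Python's sqlite3 module) stands in for SQL Server. SQLite's conversion rules differ from SQL Server's, so the sketch shows only the shape of the CAST, not which conversions SQL Server performs implicitly.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Suppliers (SupplierID INTEGER PRIMARY KEY, CompanyName TEXT);
    -- Hypothetical table that stores the supplier key as character data.
    CREATE TABLE LegacyOrders (OrderRef TEXT, SupplierCode TEXT);
    INSERT INTO Suppliers    VALUES (1, 'Formaggi Fortini'), (2, 'Exotic Liquids');
    INSERT INTO LegacyOrders VALUES ('A-100', '1'), ('A-101', '2'), ('A-102', '9');
""")

# The join columns have different names and types; CAST makes the conversion explicit.
rows = conn.execute("""
    SELECT L.OrderRef, S.CompanyName
    FROM LegacyOrders AS L JOIN Suppliers AS S
         ON CAST(L.SupplierCode AS INTEGER) = S.SupplierID
    ORDER BY L.OrderRef
""").fetchall()

print(rows)
```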
Most queries using a join can be rewritten using a subquery (a query nested within another query), and most subqueries can be rewritten as joins. For more information about subqueries, see Subquery Fundamentals.
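As one illustration of such a rewrite, the sketch below expresses the same question (which suppliers offer at least one product priced above 10) first as a join and then as a subquery. It assumes SQLite (through Python's sqlite3 module) in place of SQL Server, and the sample rows are invented.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Suppliers (SupplierID INTEGER PRIMARY KEY, CompanyName TEXT);
    CREATE TABLE Products  (ProductID INTEGER PRIMARY KEY, SupplierID INTEGER, UnitPrice REAL);
    INSERT INTO Suppliers VALUES (1, 'Formaggi Fortini'), (2, 'Exotic Liquids');
    INSERT INTO Products  VALUES (10, 1, 34.80), (11, 1, 12.00), (12, 2, 5.00);
""")

# Join form: DISTINCT removes the duplicate supplier rows produced
# when a supplier has several qualifying products.
join_rows = conn.execute("""
    SELECT DISTINCT S.CompanyName
    FROM Suppliers AS S JOIN Products AS P
         ON (S.SupplierID = P.SupplierID)
    WHERE P.UnitPrice > 10
    ORDER BY S.CompanyName
""").fetchall()

# Subquery form: the nested query supplies the list of qualifying supplier IDs.
subquery_rows = conn.execute("""
    SELECT CompanyName
    FROM Suppliers
    WHERE SupplierID IN (SELECT SupplierID FROM Products WHERE UnitPrice > 10)
    ORDER BY CompanyName
""").fetchall()

print(join_rows == subquery_rows)
```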
Note  Tables cannot be joined directly on ntext, text, or image columns. However, tables can be joined indirectly on ntext, text, or image columns by using SUBSTRING. For example, the following query performs a two-table inner join on the first 20 characters of each text column in tables t1 and t2:
SELECT *
FROM t1 JOIN t2
     ON SUBSTRING(t1.textcolumn, 1, 20) = SUBSTRING(t2.textcolumn, 1, 20)
Another possibility for comparing ntext or text columns from two tables is to compare the lengths of the columns with a WHERE clause, for example (where a self-join is performed on the pub_info table):
WHERE DATALENGTH(p1.pr_info) = DATALENGTH(p2.pr_info)