Narasimha's Blog on Microsoft Business Intelligence and TABLEAU: 2009

Wednesday 30 December 2009

SQL BASICS

Here are the basic fundamentals of SQL.
SQL is a standard computer language for accessing and manipulating databases.
________________________________________
What is SQL?
• SQL stands for Structured Query Language
• SQL allows you to access a database
• SQL is an ANSI standard computer language
• SQL can execute queries against a database
• SQL can retrieve data from a database
• SQL can insert new records in a database
• SQL can delete records from a database
• SQL can update records in a database
• SQL is easy to learn
________________________________________
SQL is a Standard - BUT....
SQL is an ANSI (American National Standards Institute) standard computer language for accessing and manipulating database systems. SQL statements are used to retrieve and update data in a database. SQL works with database programs like MS Access, DB2, Informix, MS SQL Server, Oracle, Sybase, etc.
Unfortunately, there are many different versions of the SQL language, but to be in compliance with the ANSI standard, they must support the same major keywords in a similar manner (such as SELECT, UPDATE, DELETE, INSERT, WHERE, and others).
Note: Most of the SQL database programs also have their own proprietary extensions in addition to the SQL standard!
________________________________________
SQL Database Tables
A database most often contains one or more tables. Each table is identified by a name (e.g. "Customers" or "Orders"). Tables contain records (rows) with data.
Below is an example of a table called "Persons":

LastName FirstName Address City
Hansen Ola Timoteivn 10 Sandnes
Svendson Tove Borgvn 23 Sandnes
Pettersen Kari Storgt 20 Stavanger

The table above contains three records (one for each person) and four columns (LastName, FirstName, Address, and City).
________________________________________
SQL Queries
With SQL, we can query a database and have a result set returned.
A query like this:
SELECT LastName FROM Persons
Gives a result set like this:

LastName
Hansen
Svendson
Pettersen

Note: Some database systems require a semicolon at the end of the SQL statement. We don't use the semicolon in our tutorials.
________________________________________
SQL Data Manipulation Language (DML)
SQL (Structured Query Language) is a syntax for executing queries. But the SQL language also includes a syntax to update, insert, and delete records.
These query and update commands together form the Data Manipulation Language (DML) part of SQL:
• SELECT - extracts data from a database table
• UPDATE - updates data in a database table
• DELETE - deletes data from a database table
• INSERT INTO - inserts new data into a database table
________________________________________
SQL Data Definition Language (DDL)
The Data Definition Language (DDL) part of SQL permits database tables to be created or deleted. We can also define indexes (keys), specify links between tables, and impose constraints between database tables.
The most important DDL statements in SQL are:
• CREATE TABLE - creates a new database table
• ALTER TABLE - alters (changes) a database table
• DROP TABLE - deletes a database table
• CREATE INDEX - creates an index (search key)
• DROP INDEX - deletes an index
________________________________________

The SQL SELECT Statement
The SELECT statement is used to select data from a table. The tabular result is stored in a result table (called the result-set).
Syntax
SELECT column_name(s)
FROM table_name
Note: SQL statements are not case sensitive. SELECT is the same as select.
________________________________________
SQL SELECT Example
To select the content of columns named "LastName" and "FirstName", from the database table called "Persons", use a SELECT statement like this:
SELECT LastName,FirstName FROM Persons
The database table "Persons":

LastName FirstName Address City
Hansen Ola Timoteivn 10 Sandnes
Svendson Tove Borgvn 23 Sandnes
Pettersen Kari Storgt 20 Stavanger

The result:

LastName FirstName
Hansen Ola
Svendson Tove
Pettersen Kari

________________________________________
Select All Columns

To select all columns from the "Persons" table, use a * symbol instead of column names, like this:
SELECT * FROM Persons
Result

LastName FirstName Address City
Hansen Ola Timoteivn 10 Sandnes
Svendson Tove Borgvn 23 Sandnes
Pettersen Kari Storgt 20 Stavanger
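
The two SELECT forms above can be tried without installing a database server. The sketch below is only an illustration: it assumes SQLite through Python's built-in sqlite3 module (not one of the databases named in this post) and rebuilds the "Persons" table in memory.

```python
import sqlite3

# In-memory SQLite database: an assumption for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?)",
    [
        ("Hansen", "Ola", "Timoteivn 10", "Sandnes"),
        ("Svendson", "Tove", "Borgvn 23", "Sandnes"),
        ("Pettersen", "Kari", "Storgt 20", "Stavanger"),
    ],
)

# Select two named columns.
names = conn.execute("SELECT LastName,FirstName FROM Persons").fetchall()
# Select every column with *.
everything = conn.execute("SELECT * FROM Persons").fetchall()

for row in names:
    print(row)
print(len(everything[0]), "columns per row when using *")
```

Each row comes back as a Python tuple, with one element per selected column.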

________________________________________
Semicolon after SQL Statements?

Semicolon is the standard way to separate each SQL statement in database systems that allow more than one SQL statement to be executed in the same call to the server.
Some SQL tutorials end each SQL statement with a semicolon. Is this necessary? We are using MS Access and SQL Server 2000 and we do not have to put a semicolon after each SQL statement, but some database programs force you to use it.
________________________________________
The SELECT DISTINCT Statement

The DISTINCT keyword is used to return only distinct (different) values.
The SELECT statement returns information from table columns. But what if we only want to select distinct elements?
With SQL, all we need to do is to add a DISTINCT keyword to the SELECT statement:
Syntax
SELECT DISTINCT column_name(s)
FROM table_name

________________________________________
Using the DISTINCT keyword

To select ALL values from the column named "Company" we use a SELECT statement like this:
SELECT Company FROM Orders
"Orders" table
Company OrderNumber
Sega 3412
W3Schools 2312
Trio 4678
W3Schools 6798
Result
Company
Sega
W3Schools
Trio
W3Schools

To select only DIFFERENT values from the column named "Company" we use a SELECT DISTINCT statement like this:
SELECT DISTINCT Company FROM Orders
Result:

Company
Sega
W3Schools
Trio
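
The difference between the two statements can be checked with a small script. This is a sketch under the assumption that SQLite (via Python's sqlite3 module) stands in for the database; the "Orders" data is the table shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed stand-in database
conn.execute("CREATE TABLE Orders (Company TEXT, OrderNumber INTEGER)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [("Sega", 3412), ("W3Schools", 2312), ("Trio", 4678), ("W3Schools", 6798)],
)

# Without DISTINCT every row contributes a value, duplicates included.
all_companies = [r[0] for r in conn.execute("SELECT Company FROM Orders")]
# With DISTINCT each value appears once.
distinct_companies = [r[0] for r in conn.execute("SELECT DISTINCT Company FROM Orders")]

print(len(all_companies), "values without DISTINCT")
print(len(distinct_companies), "values with DISTINCT")
```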

The WHERE clause is used to specify a selection criterion.
________________________________________
The WHERE Clause

To conditionally select data from a table, a WHERE clause can be added to the SELECT statement.
Syntax
SELECT column FROM table
WHERE column operator value
With the WHERE clause, the following operators can be used:

Operator Description
= Equal
<> Not equal
> Greater than
< Less than
>= Greater than or equal
<= Less than or equal
BETWEEN Between an inclusive range
LIKE Search for a pattern
Note: In some versions of SQL the <> operator may be written as !=
________________________________________
Using the WHERE Clause

To select only the persons living in the city "Sandnes", we add a WHERE clause to the SELECT statement:
SELECT * FROM Persons
WHERE City='Sandnes'
"Persons" table

LastName FirstName Address City Year
Hansen Ola Timoteivn 10 Sandnes 1951
Svendson Tove Borgvn 23 Sandnes 1978
Svendson Stale Kaivn 18 Sandnes 1980
Pettersen Kari Storgt 20 Stavanger 1960
Result

LastName FirstName Address City Year
Hansen Ola Timoteivn 10 Sandnes 1951
Svendson Tove Borgvn 23 Sandnes 1978
Svendson Stale Kaivn 18 Sandnes 1980
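
A few of the operators from the table can be exercised against the same data. The sketch below assumes SQLite through Python's sqlite3 module, which this post does not otherwise use; the rows are the "Persons" table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed stand-in database
conn.execute(
    "CREATE TABLE Persons "
    "(LastName TEXT, FirstName TEXT, Address TEXT, City TEXT, Year INTEGER)"
)
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?, ?)",
    [
        ("Hansen", "Ola", "Timoteivn 10", "Sandnes", 1951),
        ("Svendson", "Tove", "Borgvn 23", "Sandnes", 1978),
        ("Svendson", "Stale", "Kaivn 18", "Sandnes", 1980),
        ("Pettersen", "Kari", "Storgt 20", "Stavanger", 1960),
    ],
)

# "=" on a text column: the value is in single quotes.
in_sandnes = conn.execute("SELECT * FROM Persons WHERE City='Sandnes'").fetchall()
# ">" on a numeric column: no quotes around the number.
born_after_1965 = conn.execute("SELECT FirstName FROM Persons WHERE Year>1965").fetchall()
# "<>" means not equal.
not_sandnes = conn.execute("SELECT FirstName FROM Persons WHERE City<>'Sandnes'").fetchall()

print(len(in_sandnes), born_after_1965, not_sandnes)
```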

________________________________________
Using Quotes

Note that we have used single quotes around the conditional values in the examples.
SQL uses single quotes around text values (some database systems will also accept double quotes, although standard SQL reserves double quotes for identifiers such as column names). Numeric values should not be enclosed in quotes.
For text values:
This is correct:
SELECT * FROM Persons WHERE FirstName='Tove'
This is wrong:
SELECT * FROM Persons WHERE FirstName=Tove
For numeric values:
This is correct:
SELECT * FROM Persons WHERE Year>1965
This is wrong:
SELECT * FROM Persons WHERE Year>'1965'

________________________________________
The LIKE Condition
The LIKE condition is used to specify a search for a pattern in a column.
Syntax
SELECT column FROM table
WHERE column LIKE pattern
A "%" sign can be used to define wildcards (missing letters in the pattern) both before and after the pattern.
________________________________________
Using LIKE
The following SQL statement will return persons with first names that start with an 'O':
SELECT * FROM Persons
WHERE FirstName LIKE 'O%'
The following SQL statement will return persons with first names that end with an 'a':
SELECT * FROM Persons
WHERE FirstName LIKE '%a'
The following SQL statement will return persons with first names that contain the pattern 'la':
SELECT * FROM Persons
WHERE FirstName LIKE '%la%'
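
The three patterns can be compared side by side. This is a minimal sketch assuming SQLite via Python's sqlite3 module; note that SQLite's LIKE ignores case for ASCII letters by default, which other databases may not do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed stand-in database
conn.execute("CREATE TABLE Persons (LastName TEXT, FirstName TEXT)")
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?)",
    [("Hansen", "Ola"), ("Svendson", "Tove"), ("Pettersen", "Kari")],
)

def first_names(pattern):
    """Return the first names matching a LIKE pattern."""
    sql = "SELECT FirstName FROM Persons WHERE FirstName LIKE ?"
    return [r[0] for r in conn.execute(sql, (pattern,))]

starts_with_o = first_names("O%")    # % after the letter: anything may follow
ends_with_a = first_names("%a")      # % before the letter: anything may precede
contains_la = first_names("%la%")    # % on both sides: "la" anywhere

print(starts_with_o, ends_with_a, contains_la)
```

With this small table all three patterns happen to match only "Ola".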

SQL INSERT INTO Statement
________________________________________
The INSERT INTO Statement
The INSERT INTO statement is used to insert new rows into a table.
Syntax
INSERT INTO table_name
VALUES (value1, value2,....)
You can also specify the columns for which you want to insert data:
INSERT INTO table_name (column1, column2,...)
VALUES (value1, value2,....)

________________________________________
Insert a New Row
This "Persons" table:
LastName FirstName Address City
Pettersen Kari Storgt 20 Stavanger
And this SQL statement:
INSERT INTO Persons
VALUES ('Hetland', 'Camilla', 'Hagabakka 24', 'Sandnes')
Will give this result:
LastName FirstName Address City
Pettersen Kari Storgt 20 Stavanger
Hetland Camilla Hagabakka 24 Sandnes

________________________________________
Insert Data in Specified Columns
This "Persons" table:
LastName FirstName Address City
Pettersen Kari Storgt 20 Stavanger
Hetland Camilla Hagabakka 24 Sandnes
And This SQL statement:
INSERT INTO Persons (LastName, Address)
VALUES ('Rasmussen', 'Storgt 67')
Will give this result:
LastName FirstName Address City
Pettersen Kari Storgt 20 Stavanger
Hetland Camilla Hagabakka 24 Sandnes
Rasmussen Storgt 67
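
The empty cells in the last row are NULL values: columns that were not named in the INSERT simply receive no value. A short sketch, assuming SQLite through Python's sqlite3 module as a stand-in database, makes this visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed stand-in database
conn.execute(
    "CREATE TABLE Persons (LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)"
)

# Insert full rows: one value per column, in table order.
conn.execute("INSERT INTO Persons VALUES ('Pettersen', 'Kari', 'Storgt 20', 'Stavanger')")
conn.execute("INSERT INTO Persons VALUES ('Hetland', 'Camilla', 'Hagabakka 24', 'Sandnes')")

# Insert into two named columns only; the others are left as NULL.
conn.execute("INSERT INTO Persons (LastName, Address) VALUES ('Rasmussen', 'Storgt 67')")

rasmussen = conn.execute(
    "SELECT * FROM Persons WHERE LastName='Rasmussen'"
).fetchone()
print(rasmussen)  # NULL columns are returned to Python as None
```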

SQL UPDATE Statement
________________________________________
The Update Statement
The UPDATE statement is used to modify the data in a table.
Syntax
UPDATE table_name
SET column_name = new_value
WHERE column_name = some_value

________________________________________
Person:
LastName FirstName Address City
Nilsen Fred Kirkegt 56 Stavanger
Rasmussen Storgt 67

________________________________________
Update one Column in a Row
We want to add a first name to the person with a last name of "Rasmussen":
UPDATE Person SET FirstName = 'Nina'
WHERE LastName = 'Rasmussen'
Result:

LastName FirstName Address City
Nilsen Fred Kirkegt 56 Stavanger
Rasmussen Nina Storgt 67

________________________________________
Update several Columns in a Row
We want to change the address and add the name of the city:
UPDATE Person
SET Address = 'Stien 12', City = 'Stavanger'
WHERE LastName = 'Rasmussen'
Result:

LastName FirstName Address City
Nilsen Fred Kirkegt 56 Stavanger
Rasmussen Nina Stien 12 Stavanger
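
Both updates can be replayed in a script, which also shows that the WHERE clause limits the change to the matching row. The sketch assumes SQLite via Python's sqlite3 module; the variable names are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed stand-in database
conn.execute(
    "CREATE TABLE Person (LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO Person VALUES (?, ?, ?, ?)",
    [
        ("Nilsen", "Fred", "Kirkegt 56", "Stavanger"),
        ("Rasmussen", None, "Storgt 67", None),
    ],
)

# Update one column in the matching row.
cur = conn.execute("UPDATE Person SET FirstName = 'Nina' WHERE LastName = 'Rasmussen'")
rows_changed = cur.rowcount  # number of rows the statement modified

# Update several columns at once, separated by commas.
conn.execute(
    "UPDATE Person SET Address = 'Stien 12', City = 'Stavanger' "
    "WHERE LastName = 'Rasmussen'"
)

rasmussen = conn.execute("SELECT * FROM Person WHERE LastName='Rasmussen'").fetchone()
nilsen = conn.execute("SELECT * FROM Person WHERE LastName='Nilsen'").fetchone()
print(rows_changed, rasmussen, nilsen)
```

Leaving out the WHERE clause would apply the SET to every row in the table.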

SQL DELETE Statement
________________________________________
The DELETE Statement
The DELETE statement is used to delete rows in a table.
Syntax
DELETE FROM table_name
WHERE column_name = some_value

________________________________________
Person:
LastName FirstName Address City
Nilsen Fred Kirkegt 56 Stavanger
Rasmussen Nina Stien 12 Stavanger

________________________________________
Delete a Row
"Nina Rasmussen" is going to be deleted:
DELETE FROM Person WHERE LastName = 'Rasmussen'
Result

LastName FirstName Address City
Nilsen Fred Kirkegt 56 Stavanger

________________________________________
Delete All Rows
It is possible to delete all rows in a table without deleting the table. This means that the table structure, attributes, and indexes will be intact:
DELETE FROM table_name
or, in MS Access only (most other database systems reject the asterisk here):
DELETE * FROM table_name
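
The sketch below replays both kinds of delete and then checks that the empty table still exists. It assumes SQLite through Python's sqlite3 module; the lookup in sqlite_master is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed stand-in database
conn.execute(
    "CREATE TABLE Person (LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO Person VALUES (?, ?, ?, ?)",
    [
        ("Nilsen", "Fred", "Kirkegt 56", "Stavanger"),
        ("Rasmussen", "Nina", "Stien 12", "Stavanger"),
    ],
)

# Delete only the rows that match the WHERE clause.
conn.execute("DELETE FROM Person WHERE LastName = 'Rasmussen'")
after_one = conn.execute("SELECT LastName FROM Person").fetchall()

# Delete all rows: no WHERE clause.
conn.execute("DELETE FROM Person")
after_all = conn.execute("SELECT COUNT(*) FROM Person").fetchone()[0]

# The table definition is still there (SQLite keeps it in sqlite_master).
still_exists = conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' AND name='Person'"
).fetchone()
print(after_one, after_all, still_exists)
```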

SQL Try It
________________________________________
Test your SQL Skills
On this page you can test your SQL skills.
We will use the Customers table in the Northwind database:

CompanyName ContactName Address City
Alfreds Futterkiste Maria Anders Obere Str. 57 Berlin
Berglunds snabbköp Christina Berglund Berguvsvägen 8 Luleå
Centro comercial Moctezuma Francisco Chang Sierras de Granada 9993 México D.F.
Ernst Handel Roland Mendel Kirchgasse 6 Graz
FISSA Fabrica Inter. Salchichas S.A. Diego Roel C/ Moralzarzal, 86 Madrid
Galería del gastrónomo Eduardo Saavedra Rambla de Cataluña, 23 Barcelona
Island Trading Helen Bennett Garden House Crowther Way Cowes
Königlich Essen Philip Cramer Maubelstr. 90 Brandenburg
Laughing Bacchus Wine Cellars Yoshi Tannamuri 1900 Oak St. Vancouver
Magazzini Alimentari Riuniti Giovanni Rovelli Via Ludovico il Moro 22 Bergamo
North/South Simon Crowther South House 300 Queensbridge London
Paris spécialités Marie Bertrand 265, boulevard Charonne Paris
Rattlesnake Canyon Grocery Paula Wilson 2817 Milton Dr. Albuquerque
Simons bistro Jytte Petersen Vinbæltet 34 København
The Big Cheese Liz Nixon 89 Jefferson Way Suite 2 Portland
Vaffeljernet Palle Ibsen Smagsløget 45 Århus
Wolski Zajazd Zbyszek Piestrzeniewicz ul. Filtrowa 68 Warszawa

To preserve space, the table above is a subset of the Customers table used in the example below.
________________________________________
Try it Yourself
To see how SQL works, you can run the SQL statements below against the Customers table, or you can make your own SQL statements.
SELECT * FROM customers

SELECT CompanyName, ContactName
FROM customers

SELECT * FROM customers
WHERE companyname LIKE 'a%'

SELECT CompanyName, ContactName
FROM customers
WHERE CompanyName > 'g'
AND ContactName > 'g'

SQL ORDER BY
________________________________________
The ORDER BY keyword is used to sort the result.
________________________________________
Sort the Rows
The ORDER BY clause is used to sort the rows.
Orders:

Company OrderNumber
Sega 3412
ABC Shop 5678
W3Schools 2312
W3Schools 6798

Example
To display the companies in alphabetical order:
SELECT Company, OrderNumber FROM Orders
ORDER BY Company
Result:

Company OrderNumber
ABC Shop 5678
Sega 3412
W3Schools 6798
W3Schools 2312

Example
To display the companies in alphabetical order AND the ordernumbers in numerical order:
SELECT Company, OrderNumber FROM Orders
ORDER BY Company, OrderNumber
Result:
Company OrderNumber
ABC Shop 5678
Sega 3412
W3Schools 2312
W3Schools 6798
Example
To display the companies in reverse alphabetical order:
SELECT Company, OrderNumber FROM Orders
ORDER BY Company DESC
Result:

Company OrderNumber
W3Schools 6798
W3Schools 2312
Sega 3412
ABC Shop 5678

Example

To display the companies in reverse alphabetical order AND the ordernumbers in numerical order:
SELECT Company, OrderNumber FROM Orders
ORDER BY Company DESC, OrderNumber ASC
Result:

Company OrderNumber
W3Schools 2312
W3Schools 6798
Sega 3412
ABC Shop 5678
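
The last example, with two sort keys in different directions, can be reproduced as follows. This is a sketch assuming SQLite via Python's sqlite3 module and its default text ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed stand-in database
conn.execute("CREATE TABLE Orders (Company TEXT, OrderNumber INTEGER)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [("Sega", 3412), ("ABC Shop", 5678), ("W3Schools", 2312), ("W3Schools", 6798)],
)

# Companies descending; within the same company, order numbers ascending.
rows = conn.execute(
    "SELECT Company, OrderNumber FROM Orders "
    "ORDER BY Company DESC, OrderNumber ASC"
).fetchall()

for row in rows:
    print(row)
```

The second key only matters for rows that tie on the first key, here the two W3Schools rows.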

SQL AND & OR
________________________________________
AND & OR
AND and OR join two or more conditions in a WHERE clause.
The AND operator displays a row if ALL conditions listed are true. The OR operator displays a row if ANY of the conditions listed are true.
________________________________________
Original Table (used in the examples)

LastName FirstName Address City
Hansen Ola Timoteivn 10 Sandnes
Svendson Tove Borgvn 23 Sandnes
Svendson Stephen Kaivn 18 Sandnes

________________________________________
Example
Use AND to display each person with the first name equal to "Tove", and the last name equal to "Svendson":
SELECT * FROM Persons
WHERE FirstName='Tove'
AND LastName='Svendson'
Result:

LastName FirstName Address City
Svendson Tove Borgvn 23 Sandnes

Example
Use OR to display each person with the first name equal to "Tove", or the last name equal to "Svendson":
SELECT * FROM Persons
WHERE firstname='Tove'
OR lastname='Svendson'
Result:

LastName FirstName Address City
Svendson Tove Borgvn 23 Sandnes
Svendson Stephen Kaivn 18 Sandnes

Example
You can also combine AND and OR (use parentheses to form complex expressions):
SELECT * FROM Persons WHERE
(FirstName='Tove' OR FirstName='Stephen')
AND LastName='Svendson'
Result:

LastName FirstName Address City
Svendson Tove Borgvn 23 Sandnes
Svendson Stephen Kaivn 18 Sandnes
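
The three WHERE clauses above can be compared by counting the rows each one lets through. The sketch assumes SQLite through Python's sqlite3 module; the helper name is made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed stand-in database
conn.execute(
    "CREATE TABLE Persons (LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?)",
    [
        ("Hansen", "Ola", "Timoteivn 10", "Sandnes"),
        ("Svendson", "Tove", "Borgvn 23", "Sandnes"),
        ("Svendson", "Stephen", "Kaivn 18", "Sandnes"),
    ],
)

def first_names(where):
    """Return the sorted first names of the rows that satisfy a WHERE clause."""
    sql = "SELECT FirstName FROM Persons WHERE " + where
    return sorted(r[0] for r in conn.execute(sql))

both = first_names("FirstName='Tove' AND LastName='Svendson'")
either = first_names("FirstName='Tove' OR LastName='Svendson'")
combined = first_names(
    "(FirstName='Tove' OR FirstName='Stephen') AND LastName='Svendson'"
)
print(both, either, combined)
```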

SQL IN
________________________________________
IN
The IN operator may be used if you know the exact value you want to return for at least one of the columns.
SELECT column_name FROM table_name
WHERE column_name IN (value1,value2,..)

________________________________________
Original Table (used in the examples)

LastName FirstName Address City
Hansen Ola Timoteivn 10 Sandnes
Nordmann Anna Neset 18 Sandnes
Pettersen Kari Storgt 20 Stavanger
Svendson Tove Borgvn 23 Sandnes

________________________________________
Example 1
To display the persons with LastName equal to "Hansen" or "Pettersen", use the following SQL:
SELECT * FROM Persons
WHERE LastName IN ('Hansen','Pettersen')
Result:

LastName FirstName Address City
Hansen Ola Timoteivn 10 Sandnes
Pettersen Kari Storgt 20 Stavanger

SQL BETWEEN
________________________________________
BETWEEN ... AND
The BETWEEN ... AND operator selects a range of data between two values. These values can be numbers, text, or dates.
SELECT column_name FROM table_name
WHERE column_name
BETWEEN value1 AND value2

________________________________________
Original Table (used in the examples)

LastName FirstName Address City
Hansen Ola Timoteivn 10 Sandnes
Nordmann Anna Neset 18 Sandnes
Pettersen Kari Storgt 20 Stavanger
Svendson Tove Borgvn 23 Sandnes

________________________________________
Example 1

To display the persons alphabetically between "Hansen" and "Pettersen" (both included), use the following SQL:
SELECT * FROM Persons WHERE LastName
BETWEEN 'Hansen' AND 'Pettersen'
Result:

LastName FirstName Address City
Hansen Ola Timoteivn 10 Sandnes
Nordmann Anna Neset 18 Sandnes
Pettersen Kari Storgt 20 Stavanger

IMPORTANT! In standard SQL the BETWEEN ... AND operator is inclusive: "x BETWEEN a AND b" means the same as "x >= a AND x <= b", so persons with the last name "Hansen" or "Pettersen" are listed. When a boundary row is unexpectedly missing, the cause is usually the data rather than the operator: a trailing space, a different sort order (collation), or a time-of-day part on a date value can put a row just outside the range. Therefore: Check how your database compares the values you use with BETWEEN ... AND!
________________________________________
Example 2

To display the persons outside the range used in the previous example, use the NOT operator:
SELECT * FROM Persons WHERE LastName
NOT BETWEEN 'Hansen' AND 'Pettersen'
Result:

LastName FirstName Address City
Svendson Tove Borgvn 23 Sandnes
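
The surest way to learn how a particular engine treats the two boundary values is to run the query and look. The sketch below does that on SQLite through Python's sqlite3 module, which is an assumption and not one of the databases named in this post; on SQLite both boundary values are included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed stand-in database
conn.execute(
    "CREATE TABLE Persons (LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?)",
    [
        ("Hansen", "Ola", "Timoteivn 10", "Sandnes"),
        ("Nordmann", "Anna", "Neset 18", "Sandnes"),
        ("Pettersen", "Kari", "Storgt 20", "Stavanger"),
        ("Svendson", "Tove", "Borgvn 23", "Sandnes"),
    ],
)

inside = sorted(r[0] for r in conn.execute(
    "SELECT LastName FROM Persons WHERE LastName BETWEEN 'Hansen' AND 'Pettersen'"
))
outside = sorted(r[0] for r in conn.execute(
    "SELECT LastName FROM Persons WHERE LastName NOT BETWEEN 'Hansen' AND 'Pettersen'"
))

print("inside :", inside)
print("outside:", outside)
```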

SQL Alias
________________________________________
With SQL, aliases can be used for column names and table names.
________________________________________
Column Name Alias
The syntax is:
SELECT column AS column_alias FROM table

________________________________________
Table Name Alias
The syntax is:
SELECT column FROM table AS table_alias

________________________________________
Example: Using a Column Alias
This table (Persons):

LastName FirstName Address City
Hansen Ola Timoteivn 10 Sandnes
Svendson Tove Borgvn 23 Sandnes
Pettersen Kari Storgt 20 Stavanger

And this SQL:
SELECT LastName AS Family, FirstName AS Name
FROM Persons
Returns this result:
Family Name
Hansen Ola
Svendson Tove
Pettersen Kari

________________________________________
Example: Using a Table Alias

This table (Persons):

LastName FirstName Address City
Hansen Ola Timoteivn 10 Sandnes
Svendson Tove Borgvn 23 Sandnes
Pettersen Kari Storgt 20 Stavanger

And this SQL:
SELECT LastName, FirstName
FROM Persons AS Employees
Returns this result:

Table Employees:

LastName FirstName
Hansen Ola
Svendson Tove
Pettersen Kari

SQL JOIN
________________________________________
Joins and Keys
Sometimes we have to select data from two or more tables to make our result complete. We have to perform a join.
Tables in a database can be related to each other with keys. A primary key is a column with a unique value for each row. Each primary key value must be unique within the table. The purpose is to bind data together, across tables, without repeating all of the data in every table.
In the "Employees" table below, the "Employee_ID" column is the primary key, meaning that no two rows can have the same Employee_ID. The Employee_ID distinguishes two persons even if they have the same name.
When you look at the example tables below, notice that:
• The "Employee_ID" column is the primary key of the "Employees" table
• The "Prod_ID" column is the primary key of the "Orders" table
• The "Employee_ID" column in the "Orders" table is used to refer to the persons in the "Employees" table without using their names
________________________________________
Employees:

Employee_ID Name
01 Hansen, Ola
02 Svendson, Tove
03 Svendson, Stephen
04 Pettersen, Kari

Orders:

Prod_ID Product Employee_ID
234 Printer 01
657 Table 03
865 Chair 03

________________________________________
Referring to Two Tables

We can select data from two tables by referring to two tables, like this:
Example
Who has ordered a product, and what did they order?
SELECT Employees.Name, Orders.Product
FROM Employees, Orders
WHERE Employees.Employee_ID=Orders.Employee_ID
Result

Name Product
Hansen, Ola Printer
Svendson, Stephen Table
Svendson, Stephen Chair

Example

Who ordered a printer?
SELECT Employees.Name
FROM Employees, Orders
WHERE Employees.Employee_ID=Orders.Employee_ID
AND Orders.Product='Printer'
Result
Name
Hansen, Ola

________________________________________
Using Joins

Or we can select data from two tables with the JOIN keyword, like this:
Example INNER JOIN

SELECT field1, field2, field3
FROM first_table
INNER JOIN second_table
ON first_table.keyfield = second_table.foreign_keyfield
Who has ordered a product, and what did they order?
SELECT Employees.Name, Orders.Product
FROM Employees
INNER JOIN Orders
ON Employees.Employee_ID=Orders.Employee_ID
The INNER JOIN returns all rows from both tables where there is a match. If there are rows in Employees that do not have matches in Orders, those rows will not be listed.
Result

Name Product
Hansen, Ola Printer
Svendson, Stephen Table
Svendson, Stephen Chair
Example LEFT JOIN
Syntax

SELECT field1, field2, field3
FROM first_table
LEFT JOIN second_table
ON first_table.keyfield = second_table.foreign_keyfield
List all employees, and their orders - if any.
SELECT Employees.Name, Orders.Product
FROM Employees
LEFT JOIN Orders
ON Employees.Employee_ID=Orders.Employee_ID

The LEFT JOIN returns all the rows from the first table (Employees), even if there are no matches in the second table (Orders). If there are rows in Employees that do not have matches in Orders, those rows also will be listed.
Result

Name Product
Hansen, Ola Printer
Svendson, Tove
Svendson, Stephen Table
Svendson, Stephen Chair
Pettersen, Kari

Example RIGHT JOIN
Syntax

SELECT field1, field2, field3
FROM first_table
RIGHT JOIN second_table
ON first_table.keyfield = second_table.foreign_keyfield
List all orders, and who has ordered - if any.
SELECT Employees.Name, Orders.Product
FROM Employees
RIGHT JOIN Orders
ON Employees.Employee_ID=Orders.Employee_ID
The RIGHT JOIN returns all the rows from the second table (Orders), even if there are no matches in the first table (Employees). If there had been any rows in Orders that did not have matches in Employees, those rows also would have been listed.
Result

Name Product
Hansen, Ola Printer
Svendson, Stephen Table
Svendson, Stephen Chair
Example
Who ordered a printer?
SELECT Employees.Name
FROM Employees
INNER JOIN Orders
ON Employees.Employee_ID=Orders.Employee_ID
WHERE Orders.Product = 'Printer'
Result
Name
Hansen, Ola
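
The difference between INNER JOIN and LEFT JOIN shows up in the number of rows returned. The sketch below assumes SQLite via Python's sqlite3 module. RIGHT JOIN is left out because older SQLite versions do not support it, and the ORDER BY clauses were added only to make the row order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed stand-in database
conn.execute("CREATE TABLE Employees (Employee_ID TEXT PRIMARY KEY, Name TEXT)")
conn.execute(
    "CREATE TABLE Orders (Prod_ID INTEGER PRIMARY KEY, Product TEXT, Employee_ID TEXT)"
)
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?)",
    [("01", "Hansen, Ola"), ("02", "Svendson, Tove"),
     ("03", "Svendson, Stephen"), ("04", "Pettersen, Kari")],
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(234, "Printer", "01"), (657, "Table", "03"), (865, "Chair", "03")],
)

# INNER JOIN: only employees that have at least one order.
inner_rows = conn.execute(
    "SELECT Employees.Name, Orders.Product FROM Employees "
    "INNER JOIN Orders ON Employees.Employee_ID=Orders.Employee_ID "
    "ORDER BY Orders.Prod_ID"
).fetchall()

# LEFT JOIN: every employee; Product is NULL (None) when there is no order.
left_rows = conn.execute(
    "SELECT Employees.Name, Orders.Product FROM Employees "
    "LEFT JOIN Orders ON Employees.Employee_ID=Orders.Employee_ID "
    "ORDER BY Employees.Employee_ID, Orders.Prod_ID"
).fetchall()

print(len(inner_rows), "rows from INNER JOIN")
print(len(left_rows), "rows from LEFT JOIN")
```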

SQL UNION and UNION ALL
________________________________________
UNION
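The UNION command combines the result sets of two SELECT statements into one. Unlike the JOIN command, which places columns from two tables side by side, UNION stacks the rows of the second result under the rows of the first. Both SELECT statements must return the same number of columns, and corresponding columns need to be of the same (or a compatible) data type.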
Note: With UNION, only distinct values are selected.
SQL Statement 1
UNION
SQL Statement 2

________________________________________
Employees_Norway:

E_ID E_Name
01 Hansen, Ola
02 Svendson, Tove
03 Svendson, Stephen
04 Pettersen, Kari

Employees_USA:

E_ID E_Name
01 Turner, Sally
02 Kent, Clark
03 Svendson, Stephen
04 Scott, Stephen

________________________________________
Using the UNION Command
Example
List all different employee names in Norway and USA:
SELECT E_Name FROM Employees_Norway
UNION
SELECT E_Name FROM Employees_USA
Result

E_Name
Hansen, Ola
Svendson, Tove
Svendson, Stephen
Pettersen, Kari
Turner, Sally
Kent, Clark
Scott, Stephen

Note: This command cannot be used to list all employees in Norway and USA. In the example above we have two employees with equal names, and only one of them is listed. The UNION command only selects distinct values.
________________________________________
UNION ALL
The UNION ALL command is equal to the UNION command, except that UNION ALL selects all values.
SQL Statement 1
UNION ALL
SQL Statement 2

________________________________________
Using the UNION ALL Command
Example
List all employees in Norway and USA:
SELECT E_Name FROM Employees_Norway
UNION ALL
SELECT E_Name FROM Employees_USA
Result

E_Name
Hansen, Ola
Svendson, Tove
Svendson, Stephen
Pettersen, Kari
Turner, Sally
Kent, Clark
Svendson, Stephen
Scott, Stephen
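
Counting the rows is a quick way to see what UNION removes and UNION ALL keeps. This is a sketch assuming SQLite through Python's sqlite3 module, with the two employee tables from above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed stand-in database
conn.execute("CREATE TABLE Employees_Norway (E_ID TEXT, E_Name TEXT)")
conn.execute("CREATE TABLE Employees_USA (E_ID TEXT, E_Name TEXT)")
conn.executemany(
    "INSERT INTO Employees_Norway VALUES (?, ?)",
    [("01", "Hansen, Ola"), ("02", "Svendson, Tove"),
     ("03", "Svendson, Stephen"), ("04", "Pettersen, Kari")],
)
conn.executemany(
    "INSERT INTO Employees_USA VALUES (?, ?)",
    [("01", "Turner, Sally"), ("02", "Kent, Clark"),
     ("03", "Svendson, Stephen"), ("04", "Scott, Stephen")],
)

# UNION keeps each distinct name once.
union_names = [r[0] for r in conn.execute(
    "SELECT E_Name FROM Employees_Norway UNION SELECT E_Name FROM Employees_USA"
)]
# UNION ALL keeps every row from both tables.
union_all_names = [r[0] for r in conn.execute(
    "SELECT E_Name FROM Employees_Norway UNION ALL SELECT E_Name FROM Employees_USA"
)]

print(len(union_names), union_names.count("Svendson, Stephen"))
print(len(union_all_names), union_all_names.count("Svendson, Stephen"))
```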

SQL Create Database, Table, and Index
________________________________________
Create a Database
To create a database:
CREATE DATABASE database_name

________________________________________
Create a Table
To create a table in a database:
CREATE TABLE table_name
(
column_name1 data_type,
column_name2 data_type,
.......
)
Example
This example demonstrates how you can create a table named "Person", with four columns. The column names will be "LastName", "FirstName", "Address", and "Age":
CREATE TABLE Person
(
LastName varchar,
FirstName varchar,
Address varchar,
Age int
)
This example demonstrates how you can specify a maximum length for some columns:
CREATE TABLE Person
(
LastName varchar(30),
FirstName varchar,
Address varchar,
Age int(3)
)
The data type specifies what type of data the column can hold. The table below contains the most common data types in SQL:
Data Type Description
integer(size), int(size), smallint(size), tinyint(size) Hold integers only. The maximum number of digits is specified in parentheses.
decimal(size,d), numeric(size,d) Hold numbers with fractions. The maximum number of digits is specified in "size". The maximum number of digits to the right of the decimal point is specified in "d".
char(size) Holds a fixed-length string (can contain letters, numbers, and special characters). The fixed size is specified in parentheses.
varchar(size) Holds a variable-length string (can contain letters, numbers, and special characters). The maximum size is specified in parentheses.
date(yyyymmdd) Holds a date

________________________________________
Create Index
Indexes are created on an existing table to locate rows more quickly and efficiently. It is possible to create an index on one or more columns of a table, and each index is given a name. Users cannot see the indexes; they are just used to speed up queries.
Note: Updating a table containing indexes takes more time than updating a table without, because the indexes also need to be updated. So it is a good idea to create indexes only on columns that are often used for a search.
A Unique Index
Creates a unique index on a table. A unique index means that two rows cannot have the same index value.
CREATE UNIQUE INDEX index_name
ON table_name (column_name)
The "column_name" specifies the column you want indexed.
A Simple Index
Creates a simple index on a table. When the UNIQUE keyword is omitted, duplicate values are allowed.
CREATE INDEX index_name
ON table_name (column_name)
The "column_name" specifies the column you want indexed.
Example
This example creates a simple index, named "PersonIndex", on the LastName field of the Person table:
CREATE INDEX PersonIndex
ON Person (LastName)
If you want to index the values in a column in descending order, you can add the reserved word DESC after the column name:
CREATE INDEX PersonIndex
ON Person (LastName DESC)
If you want to index more than one column you can list the column names within the parentheses, separated by commas:
CREATE INDEX PersonIndex
ON Person (LastName, FirstName)

SQL Drop Index, Table and Database
________________________________________
Drop Index
You can delete an existing index in a table with the DROP INDEX statement.
Syntax for Microsoft Jet SQL (and Microsoft Access):
DROP INDEX index_name ON table_name
Syntax for MS SQL Server:
DROP INDEX table_name.index_name
Syntax for IBM DB2 and Oracle:
DROP INDEX index_name
Syntax for MySQL:
ALTER TABLE table_name DROP INDEX index_name

________________________________________
Delete a Table or Database
To delete a table (the table structure, attributes, and indexes will also be deleted):
DROP TABLE table_name
To delete a database:
DROP DATABASE database_name

________________________________________
Truncate a Table
What if we only want to get rid of the data inside a table, and not the table itself? Use the TRUNCATE TABLE command (deletes only the data inside the table):
TRUNCATE TABLE table_name

SQL ALTER TABLE
________________________________________
ALTER TABLE
The ALTER TABLE statement is used to add or drop columns in an existing table.
ALTER TABLE table_name
ADD column_name datatype
ALTER TABLE table_name
DROP COLUMN column_name
Note: Some database systems don't allow the dropping of a column in a database table (DROP COLUMN column_name).
________________________________________
Person:

LastName FirstName Address
Pettersen Kari Storgt 20

________________________________________
Example
To add a column named "City" in the "Person" table:
ALTER TABLE Person ADD City varchar(30)
Result:
LastName FirstName Address City
Pettersen Kari Storgt 20
Example
To drop the "Address" column in the "Person" table:
ALTER TABLE Person DROP COLUMN Address
Result:
LastName FirstName City
Pettersen Kari

SQL Functions
________________________________________
SQL has a lot of built-in functions for counting and calculations.
________________________________________
Function Syntax
The syntax for built-in SQL functions is:
SELECT function(column) FROM table

________________________________________
Types of Functions
There are several basic types and categories of functions in SQL. The basic types of functions are:
• Aggregate Functions
• Scalar functions
________________________________________
Aggregate functions
Aggregate functions operate against a collection of values, but return a single value.
Note: If the item list of a SELECT statement mixes an aggregate function with other, non-aggregated columns, the SELECT must have a GROUP BY clause that lists those columns!
"Persons" table (used in most examples)

Name Age
Hansen, Ola 34
Svendson, Tove 45
Pettersen, Kari 19

Aggregate functions in MS Access
Function Description
AVG(column) Returns the average value of a column
COUNT(column) Returns the number of rows (without a NULL value) of a column
COUNT(*) Returns the number of selected rows
FIRST(column) Returns the value of the first record in a specified field
LAST(column) Returns the value of the last record in a specified field
MAX(column) Returns the highest value of a column
MIN(column) Returns the lowest value of a column
STDEV(column)
STDEVP(column)
SUM(column) Returns the total sum of a column
VAR(column)
VARP(column)
Aggregate functions in SQL Server
Function Description
AVG(column) Returns the average value of a column
BINARY_CHECKSUM
CHECKSUM
CHECKSUM_AGG
COUNT(column) Returns the number of rows (without a NULL value) of a column
COUNT(*) Returns the number of selected rows
COUNT(DISTINCT column) Returns the number of distinct results
FIRST(column) Returns the value of the first record in a specified field (not supported in SQL Server 2000)
LAST(column) Returns the value of the last record in a specified field (not supported in SQL Server 2000)
MAX(column) Returns the highest value of a column
MIN(column) Returns the lowest value of a column
STDEV(column)
STDEVP(column)
SUM(column) Returns the total sum of a column
VAR(column)
VARP(column)
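
To try the common aggregate functions against the "Persons" table above, here is a small sketch. It is only an illustration: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for Access or SQL Server (my assumption, not part of the original tutorial), and it sticks to AVG, COUNT, MAX, MIN and SUM, which SQLite also supports.

```python
import sqlite3

# In-memory SQLite database standing in for Access / SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Persons (Name TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO Persons (Name, Age) VALUES (?, ?)",
    [("Hansen, Ola", 34), ("Svendson, Tove", 45), ("Pettersen, Kari", 19)],
)

# Each aggregate collapses the three Age values into one value,
# so the whole query returns a single row.
row = conn.execute(
    "SELECT COUNT(*), AVG(Age), MAX(Age), MIN(Age), SUM(Age) FROM Persons"
).fetchone()
print(row)
```

The query returns one row no matter how many persons are in the table, which is the "collection in, single value out" behaviour described above.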

________________________________________
Scalar functions
Scalar functions operate against a single value, and return a single value based on the input value.
Useful Scalar Functions in MS Access
Function Description
UCASE(c) Converts a field to upper case
LCASE(c) Converts a field to lower case
MID(c,start[,length]) Extracts characters from a text field, starting at position start
LEN(c) Returns the length of a text field
INSTR(c,char) Returns the numeric position of a named character within a text field
LEFT(c,number_of_char) Returns the leftmost number_of_char characters of a text field
RIGHT(c,number_of_char) Returns the rightmost number_of_char characters of a text field
ROUND(c,decimals) Rounds a numeric field to the number of decimals specified
MOD(x,y) Returns the remainder of a division operation
NOW() Returns the current system date and time
FORMAT(c,format) Changes the way a field is displayed
DATEDIFF(d,date1,date2) Used to perform date calculations
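
Scalar functions are easiest to understand by running them on one value. The sketch below is an assumption-laden illustration: it uses Python's sqlite3 module, and SQLite spells several of these functions differently from MS Access (UPPER, LOWER, SUBSTR and LENGTH instead of UCASE, LCASE, MID and LEN, and the % operator for the remainder). The idea is the same: one input value, one output value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite names are used here; the comments give the Access name
# from the table above.
row = conn.execute(
    """
    SELECT UPPER('Hansen'),         -- UCASE
           LOWER('Hansen'),         -- LCASE
           SUBSTR('Hansen', 2, 3),  -- MID: start position, then length
           LENGTH('Hansen'),        -- LEN
           INSTR('Hansen', 's'),    -- position of 's' in the text
           ROUND(32.6667, 1),       -- ROUND to one decimal
           17 % 5                   -- remainder of 17 divided by 5
    """
).fetchone()
print(row)
```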

SQL GROUP BY and HAVING
________________________________________
Aggregate functions (like SUM) often need an added GROUP BY functionality.
________________________________________
GROUP BY...
GROUP BY was added to SQL because aggregate functions (like SUM) return a single aggregate over all the rows they are given; without a GROUP BY clause it was impossible to find the sum for each individual group of column values.
The syntax for the GROUP BY clause is:
SELECT column,SUM(column) FROM table GROUP BY column

________________________________________
GROUP BY Example
This "Sales" Table:
Company Amount
W3Schools 5500
IBM 4500
W3Schools 7100
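And this SQL:
SELECT Company, SUM(Amount) FROM Sales
is meant to return a total per company. Without grouping, however, SUM(Amount) covers the whole table (17100), so the most it could give is:

Company SUM(Amount)
W3Schools 17100
IBM 17100
W3Schools 17100

In fact the statement is invalid in standard SQL, because Company is neither inside an aggregate function nor listed in a GROUP BY clause, and SQL Server rejects it with an error. A GROUP BY clause solves the problem: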
SELECT Company,SUM(Amount) FROM Sales
GROUP BY Company
Returns this result:

Company SUM(Amount)
W3Schools 12600
IBM 4500
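
The GROUP BY example can be reproduced with a few lines. This is a sketch that assumes SQLite (through Python's sqlite3 module) in place of the original database; the ORDER BY is my addition so that the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (Company TEXT, Amount INTEGER)")
conn.executemany(
    "INSERT INTO Sales (Company, Amount) VALUES (?, ?)",
    [("W3Schools", 5500), ("IBM", 4500), ("W3Schools", 7100)],
)

# One output row per distinct Company, each with its own SUM.
totals = conn.execute(
    "SELECT Company, SUM(Amount) FROM Sales GROUP BY Company ORDER BY Company"
).fetchall()
print(totals)
```

The two W3Schools rows are folded into one group (5500 + 7100), while IBM keeps its single amount.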

________________________________________
HAVING...
HAVING was added to SQL because the WHERE keyword cannot be used with aggregate functions (like SUM): WHERE filters individual rows before they are grouped, so without HAVING it would be impossible to filter on the aggregated result.
The syntax for the HAVING clause is:
SELECT column,SUM(column) FROM table
GROUP BY column
HAVING SUM(column) condition value
This "Sales" Table:

Company Amount
W3Schools 5500
IBM 4500
W3Schools 7100

This SQL:
SELECT Company,SUM(Amount) FROM Sales
GROUP BY Company
HAVING SUM(Amount)>10000
Returns this result:

Company SUM(Amount)
W3Schools 12600
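
The difference between WHERE and HAVING shows up clearly when both are run on the same "Sales" table. This is a sketch under the assumption that SQLite (via Python's sqlite3) behaves like the tutorial's database for these two clauses; the second query with WHERE Amount > 6000 is my own illustration and is not in the original.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (Company TEXT, Amount INTEGER)")
conn.executemany(
    "INSERT INTO Sales (Company, Amount) VALUES (?, ?)",
    [("W3Schools", 5500), ("IBM", 4500), ("W3Schools", 7100)],
)

# HAVING tests the group total after grouping.
big = conn.execute(
    "SELECT Company, SUM(Amount) FROM Sales "
    "GROUP BY Company HAVING SUM(Amount) > 10000"
).fetchall()

# WHERE tests each row before grouping, so only the 7100 row is summed.
rows_first = conn.execute(
    "SELECT Company, SUM(Amount) FROM Sales "
    "WHERE Amount > 6000 GROUP BY Company"
).fetchall()
print(big, rows_first)
```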

SQL SELECT INTO Statement
________________________________________
The SELECT INTO Statement
The SELECT INTO statement is most often used to create backup copies of tables or for archiving records.
Syntax
SELECT column_name(s) INTO newtable [IN externaldatabase]
FROM source

________________________________________
Make a Backup Copy
The following example makes a backup copy of the "Persons" table:
SELECT * INTO Persons_backup
FROM Persons
The IN clause can be used to copy tables into another database:
SELECT Persons.* INTO Persons IN 'Backup.mdb'
FROM Persons
If you only want to copy a few fields, list them after the SELECT keyword:
SELECT LastName,FirstName INTO Persons_backup
FROM Persons
You can also add a WHERE clause. The following example creates a "Persons_backup" table with two columns (LastName and FirstName) by extracting the persons who live in "Sandnes" from the "Persons" table:
SELECT LastName,FirstName INTO Persons_backup
FROM Persons
WHERE City='Sandnes'
Selecting data from more than one table is also possible. The following example creates a new table "Empl_Ord_backup" that contains data from the two tables Employees and Orders:
SELECT Employees.Name,Orders.Product
INTO Empl_Ord_backup
FROM Employees
INNER JOIN Orders
ON Employees.Employee_ID=Orders.Employee_ID
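
Not every database accepts SELECT ... INTO. SQLite, for example, does not; its equivalent is CREATE TABLE ... AS SELECT. The sketch below uses that SQLite form (through Python's sqlite3) to do the same job as the "Sandnes" backup example above, so treat it as an equivalent under that assumption rather than the Access or SQL Server syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?)",
    [
        ("Hansen", "Ola", "Timoteivn 10", "Sandnes"),
        ("Svendson", "Tove", "Borgvn 23", "Sandnes"),
        ("Pettersen", "Kari", "Storgt 20", "Stavanger"),
    ],
)

# SQLite spelling of:
#   SELECT LastName,FirstName INTO Persons_backup FROM Persons WHERE City='Sandnes'
conn.execute(
    "CREATE TABLE Persons_backup AS "
    "SELECT LastName, FirstName FROM Persons WHERE City = 'Sandnes'"
)

backup = conn.execute(
    "SELECT LastName, FirstName FROM Persons_backup ORDER BY LastName"
).fetchall()
print(backup)
```

The new table holds a copy of the rows as they were at copy time; later changes to "Persons" do not reach "Persons_backup".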

SQL CREATE VIEW Statement
________________________________________
A view is a virtual table based on the result-set of a SELECT statement.
________________________________________
What is a View?
In SQL, a VIEW is a virtual table based on the result-set of a SELECT statement.
A view contains rows and columns, just like a real table. The fields in a view are fields from one or more real tables in the database. You can add SQL functions, WHERE, and JOIN statements to a view and present the data as if the data were coming from a single table.
Note: The database design and structure will NOT be affected by the functions, WHERE clauses, or JOIN statements in a view.
Syntax
CREATE VIEW view_name AS
SELECT column_name(s)
FROM table_name
WHERE condition
Note: The database does not store the view data! The database engine recreates the data, using the view's SELECT statement, every time a user queries a view.
________________________________________
Using Views
A view can be used from inside a query, a stored procedure, or another view. By adding functions, joins, etc., to a view, you can present exactly the data you want to the user.
The sample database Northwind has some views installed by default. The view "Current Product List" lists all active products (products that are not discontinued) from the Products table. The view is created with the following SQL:
CREATE VIEW [Current Product List] AS
SELECT ProductID,ProductName
FROM Products
WHERE Discontinued=No
We can query the view above as follows:
SELECT * FROM [Current Product List]
Another view from the Northwind sample database selects every product in the Products table that has a unit price that is higher than the average unit price:
CREATE VIEW [Products Above Average Price] AS
SELECT ProductName,UnitPrice
FROM Products
WHERE UnitPrice>(SELECT AVG(UnitPrice) FROM Products)
We can query the view above as follows:
SELECT * FROM [Products Above Average Price]
Another example view from the Northwind database calculates the total sale for each category in 1997. Note that this view selects its data from another view called "Product Sales for 1997":
CREATE VIEW [Category Sales For 1997] AS
SELECT DISTINCT CategoryName,Sum(ProductSales) AS CategorySales
FROM [Product Sales for 1997]
GROUP BY CategoryName
We can query the view above as follows:
SELECT * FROM [Category Sales For 1997]
We can also add a condition to the query. Now we want to see the total sale only for the category "Beverages":
SELECT * FROM [Category Sales For 1997]
WHERE CategoryName='Beverages'
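
To see that a view stores no data of its own, here is a small sketch. The assumptions are mine: it uses SQLite through Python's sqlite3 module, and the three product rows are made-up sample data, not the real Northwind contents. The view definition itself follows the "Products Above Average Price" example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, UnitPrice REAL)")
# Hypothetical sample rows (not the real Northwind data).
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("Tea", 10.0), ("Coffee", 20.0), ("Syrup", 60.0)],
)

conn.execute(
    "CREATE VIEW [Products Above Average Price] AS "
    "SELECT ProductName, UnitPrice FROM Products "
    "WHERE UnitPrice > (SELECT AVG(UnitPrice) FROM Products)"
)

query = (
    "SELECT ProductName FROM [Products Above Average Price] "
    "ORDER BY ProductName"
)
before = conn.execute(query).fetchall()

# A new row changes the average; the view picks that up on the next query.
conn.execute("INSERT INTO Products VALUES ('Wine', 100.0)")
after = conn.execute(query).fetchall()
print(before, after)
```

The average is 30 before the insert and 47.5 after it, so the second query returns a different list even though the view was never touched.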

SQL Servers - RDBMS
________________________________________
Modern SQL database servers are relational database management systems (RDBMS).
________________________________________
DBMS - Database Management System
A Database Management System (DBMS) is a computer program that can access data in a database.
The DBMS program enables you to extract, modify, or store information in a database.
Different DBMS programs provide different functions for querying data, reporting data, and modifying data.
________________________________________
RDBMS - Relational Database Management System
A Relational Database Management System (RDBMS) is a Database Management System (DBMS) where the database is organized and accessed according to the relationships between data.
The relational model behind RDBMS was proposed by E. F. Codd at IBM in 1970.
RDBMS is the basis for SQL, and for all modern database systems like Oracle, SQL Server, IBM DB2, Sybase, MySQL, and Microsoft Access.

SQL Quick Reference
________________________________________
SQL Syntax
Statement Syntax
AND / OR SELECT column_name(s)
FROM table_name
WHERE condition
AND|OR condition
ALTER TABLE (add column) ALTER TABLE table_name
ADD column_name datatype
ALTER TABLE (drop column) ALTER TABLE table_name
DROP COLUMN column_name
AS (alias for column) SELECT column_name AS column_alias
FROM table_name
AS (alias for table) SELECT column_name
FROM table_name AS table_alias
BETWEEN SELECT column_name(s)
FROM table_name
WHERE column_name
BETWEEN value1 AND value2
CREATE DATABASE CREATE DATABASE database_name
CREATE INDEX CREATE INDEX index_name
ON table_name (column_name)
CREATE TABLE CREATE TABLE table_name
(
column_name1 data_type,
column_name2 data_type,
.......
)
CREATE UNIQUE INDEX CREATE UNIQUE INDEX index_name
ON table_name (column_name)
CREATE VIEW CREATE VIEW view_name AS
SELECT column_name(s)
FROM table_name
WHERE condition
DELETE FROM DELETE FROM table_name
(Note: Deletes every row in the table, but not the table itself!)
or
DELETE FROM table_name
WHERE condition
DROP DATABASE DROP DATABASE database_name
DROP INDEX DROP INDEX table_name.index_name
DROP TABLE DROP TABLE table_name
GROUP BY SELECT column_name1,SUM(column_name2)
FROM table_name
GROUP BY column_name1
HAVING SELECT column_name1,SUM(column_name2)
FROM table_name
GROUP BY column_name1
HAVING SUM(column_name2) condition value
IN SELECT column_name(s)
FROM table_name
WHERE column_name
IN (value1,value2,..)
INSERT INTO INSERT INTO table_name
VALUES (value1, value2,....)
or
INSERT INTO table_name
(column_name1, column_name2,...)
VALUES (value1, value2,....)
LIKE SELECT column_name(s)
FROM table_name
WHERE column_name
LIKE pattern
ORDER BY SELECT column_name(s)
FROM table_name
ORDER BY column_name [ASC|DESC]
SELECT SELECT column_name(s)
FROM table_name
SELECT * SELECT *
FROM table_name
SELECT DISTINCT SELECT DISTINCT column_name(s)
FROM table_name
SELECT INTO
(used to create backup copies of tables) SELECT *
INTO new_table_name
FROM original_table_name
or
SELECT column_name(s)
INTO new_table_name
FROM original_table_name
TRUNCATE TABLE
(deletes only the data inside the table) TRUNCATE TABLE table_name
UPDATE UPDATE table_name
SET column_name=new_value
[, column_name=new_value]
WHERE column_name=some_value
WHERE SELECT column_name(s)
FROM table_name
WHERE condition

THANKS,
NARASIMHA

LAB EXERCISES FOR SSRS 2008

HI,
Here I am giving the LAB 03 and LAB 04 exercises.
LAB 03 – TEST YOUR SKILLS ON GROUPING DATA

Create a report to display all Product details
a) Group records by
i) Product Class
ii) Product Name
b) Interactive Sort records by
i) Color
ii) Size

Data table Information :
RDBMS : SQL Server 2008
Table Name : Product
Database : SSRS

Output Requirements :
Data should be grouped by each Product Class and Product Name. Each group's records should be displayed on a new page.

LAB 04 – TEST YOUR SKILLS ON DRILL DOWN GROUPING

Create a report to display customer details, such as region and city, based on the selected country
a) Group records by
i.) Region
ii.) City
b) Sort records by
i.) Customer name inside each group
c) Drill down by
i.) Region
ii.) City
Data table Information :
RDBMS : SQL Server 2000
Table(s) : Customer
Database : SSRS
Inputs :
Country
Sample Output :

Thanks,
Narasimha

LAB EXERCISES FOR SSRS 2008

Hi,
Here I am giving lab exercises for freshers in SQL Server Reporting Services 2008.

LAB 01 – TEST YOUR SKILLS ON CREATING A BASIC REPORT
Create a report to display Employee information
a) Show First Name, Last Name, Position and Birth date of all employees
b) Show total employees in the footer
Data table Information:
RDBMS: SQL Server 2008

Table(s): Employee
Database: SSRS
Inputs:
None
Sample Output: see the figure


Thanks,
Narasimha

Useful Websites for SQL BI Guys

Hi,
I have collected some useful SQL BI websites.
http://arcanecode.com/2009/11/18/populating-a-kimball-date-dimension/
http://www.sqllion.com/category/ssis/

SSIS Calling a WebService

http://pedrocgd.blogspot.com/2009/11/bi-stepbystep-ssis-calling-webservice.html
SQL SERVER Reporting Services FAQ's
http://social.msdn.microsoft.com/Forums/en-US/sqlreportingservices/thread/48de91f9-1844-40c1-9614-5ead0b4b69a5
http://blogs.msdn.com/sqlrsteamblog/

DATA MODELLING
http://www.agiledata.org/essays/dataModeling101.html

SQL BI EVENTS AND WEB CASTS
http://www.microsoft.com/events/series/msdnsqlserver2008.aspx

SQL SERVER ANALYSIS SERVICES WEBSITES
http://www.sqlserveranalysisservices.com/default.htm
Top 10 SQL Server Integration Services Best Practices
http://sqlcat.com/top10lists/archive/2008/10/01/top-10-sql-server-integration-services-best-practices.aspx
SQL BI
http://www.learnitfirst.com/blogs/NewVideos/2008
http://www.ssistalk.com/
http://www.sqlservercentral.com/
http://www.pragmaticworks.com/
http://www.1keydata.com/sql
http://www.sql-tutorial.net/
http://www.sqlmag.com/
http://www.sqlsaturday.com/
http://blog.sqlauthority.com/2007/11/04/sqlauthority-news-best-articles-on-sqlauthoritycom/
http://blog.sqlauthority.com/
http://www.simple-talk.com/
www.sqlmag.com/
http://www.dtsxchange.com/
http://www.sql-tutorial.com/
http://www.ssw.com.au/Ssw/Standards/Rules/RulesToBetterSQLReportingServices.aspx
http://www.sqltraining.org/
http://technet.microsoft.com/hi-in/library/cc917721(en-us).aspx
http://www.jumpstarttv.com/
SSIS
http://blogs.conchango.com/
www.learnintegrationservices.com/
http://www.ssistalk.com/
SSIS Videos
http://msdn.microsoft.com/en-us/library/dd299421.aspx
http://www.pragmaticworks.com/
http://www.jumpstarttv.com/channels/SQL.aspx
http://www.sqlservercentral.com/
www.learnintegrationservices.com/
http://msdn.microsoft.com/en-us/library/dd299413.aspx

PPS
http://technet.microsoft.com/en-us/office/performancepoint/default.aspx

SQLBI BOOKS
http://books.google.co.in/books?id=N7tzbajkNccC&pg=PA201&dq=SSIS+2005#PPA1,M1

Microsoft Virtual Labs:

http://www.microsoft.com/events/vlabs/default.mspx
http://msdn.microsoft.com/en-us/virtuallabs/cc138238.aspx
http://www.beansoftware.com/ASP.NET-Tutorials/Control-Template.aspx
http://www.simple-talk.com/sql/learn-sql-server/robyn-pages-sql-server-data-validation-workbench/
http://www.sommarskog.se/arrays-in-sql-2005.html
http://www.msevents.microsoft.com/
http://www.microsoft.com/events/series/bi.aspx?tab=virtuallabs
http://www.microsoft.com/sqlserver/2008/en/us/white-papers.aspx

Hide/show property in SSRS:

http://theruntime.com/blogs/thomasswilliams/archive/2008/09/29/hiding-and-showing-columns-based-on-a-parameter-in-reporting.aspx

MOSS
Downloading Excel files from SharePoint using SSIS
http://sqlblogcasts.com/blogs/drjohn/archive/2007/11/04/downloading-excel-files-from-sharepoint-using-ssis.aspx

WHITE PAPER: Scorecards and Dashboards for IT http://download.microsoft.com/download/3/5/8/35802290-a7f6-4976-8855-74c8b3b7f035/ITOperationsScorecardsandDashboards_whitepaper.doc
Thanks,
Narasimha

Sample MDX Queries

MDX QUERIES

select {[Measures].[Tax Amount], [Measures].[Discount Amount], [Measures].[Sales Amount]} on columns,
{crossjoin([Dim Customer].[First Name].[First Name], crossjoin([Dim Customer].[Middle Name].[Middle Name], [Dim Customer].[Last Name].[Last Name]))} on rows
from [Adventure Works DW]

----------------------------------------
select {[Measures].[Tax Amount], [Measures].[Discount Amount], [Measures].[Sales Amount]} on columns,
non empty {crossjoin([Dim Customer].[First Name].[First Name].members, [Dim Customer].[Middle Name].[Middle Name], [Dim Customer].[Last Name].[Last Name])} on rows
from [Adventure Works DW]
where [Dim Customer].[State Province Name].&[Illinois]

----------------------------------------
select {filter({[Measures].[Tax Amount], [Measures].[Discount Amount], [Measures].[Sales Amount]}, [Measures].[Sales Amount] >= 500)} on columns,
{crossjoin([Dim Customer].[First Name].[First Name], [Dim Customer].[Middle Name].[Middle Name], [Dim Customer].[Last Name].[Last Name])} on rows
from [Adventure Works DW]

----------------------------------------
with member [Measures].[per] as ([Measures].[Tax Amount] / [Measures].[Sales Amount]) * 100
select {[Measures].[Tax Amount], [Measures].[Discount Amount], [Measures].[Sales Amount], [Measures].[per]} on columns,
{crossjoin([Dim Customer].[First Name].[First Name], [Dim Customer].[Middle Name].[Middle Name], [Dim Customer].[Last Name].[Last Name])} on rows
from [Adventure Works DW]

----------------------------------------

Tuesday 29 December 2009

Best Practices in SSIS

I have collected SSIS best practices from various websites. Here are 10 that are worth following during any SSIS package development:
1. The most desired feature in SSIS package development is reusability: standard packages that can be reused when developing different ETL components. In SSIS this is easily achieved with template packages, which can be used in any SSIS project any number of times. To learn how to configure them, please see http://support.microsoft.com/kb/908018

2. Avoid dots (.) in package names. A dotted name is easily confused with the SQL Server object naming convention; use an underscore (_) instead. Also make sure that package names do not exceed 100 characters. When a package is deployed to SQL Server, any characters beyond 100 are automatically removed from the package name. This can make your SSIS package fail at runtime, especially when you use Execute Package Tasks.

3. Moving data from upstream to downstream in a package is memory intensive. At each step and component, check carefully that no unnecessary columns are passed downstream. This avoids extra execution time and improves the overall performance of the package.

4. When configuring an OLE DB source, avoid the 'Table or view' data access mode. It is equivalent to 'SELECT * FROM table_name', and as most of us know, SELECT * is our enemy: it pulls in every column, including those that are not required. Use the 'SQL command' data access mode instead and include only the required column names in your SELECT statement. This keeps unnecessary columns from being passed downstream.

5. In your Data Flow Tasks, use the Flat File connection manager carefully. A Flat File connection manager created with the default settings uses the string data type [DT_STR] for every column. This is not always right: your source may contain numeric, integer or Boolean columns, and passing them downstream as strings wastes memory and may cause errors at later stages of package execution.

6. Sorting data is time consuming. In SSIS you can sort data coming from upstream with the Sort transformation, but it is memory intensive and can degrade overall package performance. Where the data comes from SQL Server tables, it is better to sort at the database level, inside the source query. SQL Server's sorting is well refined and runs on the database server, which can improve overall package execution performance.

7. During SSIS package development you often have to share a package with other team members or deploy it to other dev, UAT or production systems. Make sure you use the correct package protection level. With the default level, 'EncryptSensitiveWithUserKey', the package might not execute as expected in other environments because it was encrypted with the developer's personal user key. To make execution smooth across environments, first understand how the package protection level property behaves; please see http://msdn2.microsoft.com/en-us/library/microsoft.sqlserver.dts.runtime.dtsprotectionlevel.aspx . In general, to avoid most deployment errors when moving a package from one system to another, set the package protection level to 'DontSaveSensitive'.

8. It is a best practice to use Sequence containers in SSIS packages to group components at the Control Flow level. A Sequence container:
o Provides a scope for variables that a group of related tasks and containers can use
o Lets you manage properties of multiple tasks by setting the property once at the Sequence container level
o Lets you set the transaction isolation level at the Sequence container level
For more information on Sequence containers, please see http://msdn2.microsoft.com/en-us/library/ms139855.aspx.

9. If you are designing an ETL solution for a business of any size, it is good to be able to restart failed packages from the point of failure. SSIS has an out-of-the-box feature called 'Checkpoint' that supports this. However, you have to configure checkpoints at the package level. For more information, please see http://msdn2.microsoft.com/en-us/library/ms140226.aspx.

10. The Execute SQL Task is our best friend in SSIS; we can use it to run one or several SQL statements at a time. The beauty of this task is that it can return results in different ways, e.g. single row, full result set and XML. It works with different connection types such as OLE DB, ODBC, ADO, ADO.NET and SQL Mobile. I mostly use it together with a Foreach Loop container, iterating over the result returned by the Execute SQL Task. For more information, please see http://msdn2.microsoft.com/en-us/library/ms141003.aspx & http://www.sqlis.com/58.aspx.
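
Tips 4 and 6 both come down to the SQL text you give the source in 'SQL command' mode. SSIS itself cannot be run from a snippet, so the sketch below only illustrates the query side with Python's sqlite3 module. The "Orders" table, its columns and its rows are hypothetical names invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical source table with more columns than the package needs.
conn.execute(
    "CREATE TABLE Orders (OrderID INTEGER, CustomerName TEXT, Amount REAL, "
    "Notes TEXT, InternalCode TEXT)"
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?, ?)",
    [
        (2, "Tove", 75.0, "n/a", "X2"),
        (1, "Ola", 120.0, "n/a", "X1"),
        (3, "Kari", 30.0, "n/a", "X3"),
    ],
)

# 'Table or view' mode behaves like this: every column flows downstream.
all_columns = conn.execute("SELECT * FROM Orders")

# 'SQL command' mode: only the needed columns, already sorted by the database.
needed = conn.execute("SELECT OrderID, Amount FROM Orders ORDER BY OrderID")

wide = len(all_columns.description)   # number of columns returned by SELECT *
narrow = len(needed.description)      # number of columns in the explicit list
rows = needed.fetchall()
print(wide, narrow, rows)
```

The second query sends fewer columns into the pipeline and arrives in OrderID order, so no Sort transformation is needed for that ordering.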
SSIS: Suggested Best Practices and naming conventions
I thought it would be worth publishing a list of guidelines that I see as SSIS development best practices. These are my own opinions and are based upon my experience of using SSIS over the past 18 months. I am not saying you should take them as gospel but these are generally tried and tested methods and if nothing else should serve as a basis for you developing your own SSIS best practices.
One thing I really would like to see getting adopted is a common naming convention for tasks and components and to that end I have published some suggestions at the bottom of this post.
This list will get added to over time so if you find this useful keep checking back here to see updates!
If you know that data in a source is sorted, set IsSorted=TRUE on the source adapter output. This may save unnecessary SORTs later in the pipeline which can be expensive. Setting this value does not perform a sort operation, it only indicates that the data is sorted.
Rename all Name and Description properties from the default. This will help when debugging particularly if the person doing the debugging is not the person that built the package.
Only select columns that you need in the pipeline to reduce buffer size and reduce OnWarning events at execution time
Following on from the previous bullet point, always use a SQL statement in an OLE DB Source component or LOOKUP component rather than just selecting a table. Selecting a table is akin to "SELECT *..." which is universally recognized as bad practice. (http://www.sqljunkies.com/WebLog/simons/archive/2006/01/20/17865.aspx). In certain scenarios the approach of using a SQL statement can result in much improved performance as well (http://blogs.conchango.com/jamiethomson/archive/2006/02/21/2930.aspx).
I used to recommend using SQL Server Destination as opposed to OLE DB Destination wherever possible for quicker insertions, but I've changed my mind. Experience from around the community suggests that the difference in performance between SQL Server Destination and OLE DB Destination is negligible and hence, given the flexibility of packages that use OLE DB Destinations, it may be better to go for the latter. It's an "it depends" consideration, so you should decide what you prefer based on your own testing.
Use Sequence containers to organise package structure into logical units of work. This makes it easier to identify what the package does and also helps to control transactions if they are being implemented.
Where possible, use expressions on the SqlStatementSource property of the Execute SQL Task instead of parameterised SQL statements. This removes ambiguity when different OLE DB providers are being used. It is also easier. (UPDATE: There is a caveat here. Results of expressions are limited to 4000 characters so be wary of this if using expressions).
If you are implementing custom functionality try to implement custom tasks/components rather than use the script task or script component. Custom tasks/components are more reusable than scripted tasks/components. Custom components are also less bound to the metadata of the pipeline than script components are.
Use caching in your LOOKUP components where possible. It makes them quicker. Watch that you are not grabbing too many resources when you do this though.
LOOKUP components will generally work quicker than MERGE JOIN components where the 2 can be used for the same task (http://blogs.conchango.com/jamiethomson/archive/2005/10/21/2289.aspx).
Always use DTExec to perf test your packages. This is not the same as executing without debugging from SSIS Designer (http://www.sqlis.com/default.aspx?84).
Use naming conventions for your tasks and components. I suggest using acronyms at the start of the name and there are some suggestions for these acronyms at the end of this article. This approach does not help a great deal at design-time where the tasks and components are easily identifiable but can be invaluable at debug-time and run-time. e.g. My suggested acronym for a Data Flow Task is DFT so the name of a data flow task that populates a table called MyTable could be "DFT Load MyTable".
If you want to conditionally execute a task at runtime use expressions on your precedence constraints. Do not use an expression on the "Disable" property of the task.
Don't pull all configurations into a single XML configuration file. Instead, put each configuration into a separate XML configuration file. This is a more modular approach and means that configuration files can be reused by different packages more easily.
If you need a dynamic SQL statement in an OLE DB Source component, set AccessMode="SQL Command from variable" and build the SQL statement in a variable that has EvaluateAsExpression=TRUE. (http://blogs.conchango.com/jamiethomson/archive/2005/12/09/2480.aspx)
When using checkpoints, use an expression to populate the CheckpointFilename property which will allow you to include the value returned from System::PackageName in the checkpoint filename. This will allow you to easily identify which package a checkpoint file is to be used by.
When using raw files and your Raw File Source Component and Raw File Destination Component are in the same package, configure your Raw File Source and Raw File Destination to get the name of the raw file from a variable. This will avoid hardcoding the name of the raw file into the two separate components and running the risk that one may change and not the other.
Variables that contain the name of a raw file should be set using an expression. This will allow you to include the value returned from System::PackageName in the raw file name. This will allow you to easily identify which package a raw file is to be used by. N.B. This approach will only work if the Raw File Source Component and Raw File Destination Component are in the same package.
Use a common folder structure (http://blogs.conchango.com/jamiethomson/archive/2006/01/05/2559.aspx)
Use variables to store your expressions (http://blogs.conchango.com/jamiethomson/archive/2005/12/05/2462.aspx). This allows them to be shared by different objects and also means you can view the values in them at debug-time using the Watch window.
Keep your packages in the dark (http://www.windowsitpro.com/SQLServer/Article/ArticleID/47688/SQLServer_47688.html). In summary, this means that you should make your packages location unaware. This makes it easier to move them across environments.
If you can, filter your data in the Source Adapter rather than filter the data using a Conditional Split transform component. This will make your data flow perform quicker.
When storing information about an OLE DB Connection Manager in a configuration, don't store the individual properties such as Initial Catalog, Username, Password etc... just store the ConnectionString property.
Your variables should only be scoped to the containers in which they are used. Do not scope all your variables to the package container if they don't need to be.
Employ namespaces for your packages
Make log file names dynamic so that you get a new logfile for each execution.
Use ProtectionLevel=DontSaveSensitive. Other developers will not be restricted from opening your packages and you will be forced to use configurations (which is another recommended best practice)
Use annotations wherever possible. At the very least each data-flow should contain an annotation.
Always log to a text file, even if you are logging elsewhere as well. Logging to a text file has less reliance on external factors and is therefore most likely to contain all information required for debugging.
Create a new solution folder in Visual Studio Solution Explorer in order to store your configuration files. Or, store them in the 'miscellaneous files' section of a project.
Always use template packages to standardize on logging, event handling and configuration.
If your template package contains variables put them in a dedicated namespace called "template" in order to differentiate them from variables that are added later.
Break out all tasks requiring the Jet engine (Excel or Access data sources) into their own packages that do nothing but that data flow task. Load the data into Staging tables if necessary. This will ensure that solutions can be migrated to 64bit with no rework. (Thanks to Sam Loud for this one.)
Don't include connection-specific info (such as server names, database names or file locations) in the names of your connection managers. For example, "OrderHistory" is a better name than "Svr123ABC\OrderHist.dbo".
Here are the advanced SSIS best practices.
Get your metadata right first, not later: The SSIS data flow is incredibly dependent on the metadata it is given about the data sources it uses, including column names, data types and sizes. If you change the data source after the data flow is built, it can be difficult (kind of like carrying a car up a hill can be difficult) to correct all of the dependent metadata in the data flow. Also, the SSIS data flow designer will often helpfully offer to clean up the problems introduced by changing the metadata. Sadly, this "cleanup" process can involve removing the mappings between data flow components for the columns that are changed. This can cause the package to fail silently - you have no errors and no warnings but after the package has run you also have no data in those fields.
Use template packages whenever possible, if not more often: Odds are, if you have a big SSIS project, all of your packages have the same basic "plumbing" - tasks that perform auditing or notification or cleanup or something. If you define these things in a template package (or a small set of template packages if you have irreconcilable differences between package types) and then create new packages from those templates you can reuse this common logic easily in all new packages you create.
Use OLE DB connections unless there is a compelling reason to do otherwise: OLE DB connection managers can be used just about anywhere, and there are some components (such as the Lookup transform and the OLE DB Command transform) that can only use OLE DB connection managers. So unless you want to maintain multiple connection managers for the same database, OLE DB makes a lot of sense. There are also other reasons (such as more flexible deployment options than the SQL Server destination component) but this is enough for me.
Only configure package variables: If all of your package configurations target package variables, then you will have a consistent configuration approach that is self-documenting and resistant to change. You can then use expressions based on these variables to use them anywhere within the package.
If it’s external, configure it: Of all of the aspects of SSIS about which I hear people complain, deployment tops the list. There are plenty of deployment-related tools that ship with SSIS, but there is not a lot that you can do to ease the pain related to deployment unless your packages are truly location independent. The design of SSIS goes a long way toward making this possible, since access to external resources (file system, database, etc.) is performed (almost) consistently through connection managers, but that does not mean that the package developer can be lazy. If there is any external resource used by your package, you need to drive the values for the connection information (database connection string, file or folder path, URL, whatever) from a package configuration so they can be updated easily in any environment without requiring modification to the packages.
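The "configure everything external" idea is not specific to SSIS. Here is a minimal Python sketch of the same principle, not SSIS code: every external value has a development default and can be overridden per environment without touching the code. The setting names (ORDER_HISTORY_CONN, ARCHIVE_FOLDER) and their default values are hypothetical and exist only for illustration.

```python
import os

# Hypothetical setting names and defaults, used only for illustration.
DEFAULTS = {
    "ORDER_HISTORY_CONN": "Server=localhost;Database=OrderHistory;Trusted_Connection=yes;",
    "ARCHIVE_FOLDER": "C:/etl/archive",
}


def load_settings(environ=None):
    """Return every external setting, letting the environment override the defaults."""
    env = os.environ if environ is None else environ
    return {name: env.get(name, default) for name, default in DEFAULTS.items()}
```

In a package, the equivalent would be a package configuration that sets a variable, with the connection manager's connection string driven by an expression on that variable.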
One target table per package: This is a tip I picked up from the great book The Microsoft Data Warehouse Toolkit by Joy Mundy of The Kimball Group, and it has served me very well over the years. By following this best practice you can keep your packages simpler and more modular, and much more maintainable.
Annotate like you mean it: You've heard of "test first development," right? This is good, but I believe in "comment first development." I've learned over the years that if I can't describe something in English, I'm going to struggle doing it in C# or whatever programming language I'm using, so I tend to go very heavy on the comments in my procedural code. I've carried this practice over into SSIS, and like to have one annotation per task, one annotation per data flow component and any additional annotations that make sense for a given design surface. This may seem like overkill, but think of what you would want someone to do if you were going to open up their packages and try to figure out what they were trying to do. So annotate liberally and you won't be "that guy" - the one everyone swears about when he's not around.
Avoid row-based operations (think “sets!”): The SSIS data flow is a great tool for performing set-based operations on huge volumes of data - that's why SSIS performs so well. But there are some data flow transformations that perform row-by-row operations, and although they have their uses, they can easily cause data flow performance to grind to a halt. These transformations include the OLE DB Command transform, the Fuzzy Lookup transform, the Slowly Changing Dimension transform and the tried-and-true Lookup transform when used in non-cached mode. Although there are valid uses for these transforms, they tend to be very few and far between, so if you find yourself thinking about using them, make sure that you've exhausted the alternatives and that you do performance testing early with real data volumes.
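To make the row-based versus set-based difference concrete, here is a small sketch using Python's built-in sqlite3 module as a stand-in for the target database. The orders table and the 10% discount rule are made up for the example. The first function issues one UPDATE per row, which is roughly what an OLE DB Command transform does; the second does the same work in a single statement.

```python
import sqlite3


def make_db(n_rows=1000):
    """Build an in-memory table of made-up orders for the comparison."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, region TEXT)")
    conn.executemany(
        "INSERT INTO orders (id, amount, region) VALUES (?, ?, ?)",
        [(i, float(i), "EU" if i % 2 == 0 else "US") for i in range(1, n_rows + 1)],
    )
    return conn


def discount_row_by_row(conn):
    """Row-based: one UPDATE statement per qualifying row."""
    ids = [row[0] for row in conn.execute("SELECT id FROM orders WHERE region = 'EU'")]
    for order_id in ids:
        conn.execute("UPDATE orders SET amount = amount * 0.9 WHERE id = ?", (order_id,))
    return len(ids)


def discount_set_based(conn):
    """Set-based: a single UPDATE that covers every qualifying row."""
    cur = conn.execute("UPDATE orders SET amount = amount * 0.9 WHERE region = 'EU'")
    return cur.rowcount
```

Both functions leave the table in the same state. Against a real server, the row-by-row version pays a round trip per row, which is where the time goes as volumes grow.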
Avoid asynchronous transforms: In short, any fully-blocking asynchronous data flow transformation (such as Sort and Aggregate) is going to hold the entire set of input rows in memory before it produces any output rows to be consumed by downstream components. This just does not scale for larger (or even "large-ish") data volumes. As with row-based operations, you need to aggressively pursue alternative approaches, and make sure that you're testing early with representative volumes of data. The danger here is that these transforms will work (and possibly even work well) with a small number of records, but completely choke and die when you need them to do the heavy lifting.
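The blocking behaviour is easy to see with plain Python generators. This is only an analogy for the data flow, not SSIS code, and the source function and row layout are hypothetical. The streaming step hands each row downstream as soon as it arrives, while the sort has to read and buffer every input row before it can emit the first one.

```python
def source(n, log):
    """Hypothetical source that records each row it hands out."""
    for i in range(n, 0, -1):
        log.append(i)
        yield {"id": i, "name": f"customer{i:04d}"}


def uppercase_names(rows):
    """Streaming (non-blocking): each row is emitted as soon as it is read."""
    for row in rows:
        yield {**row, "name": row["name"].upper()}


def sort_by_name(rows):
    """Fully blocking: every input row is buffered before the first output row."""
    buffered = list(rows)  # memory use grows with the size of the input
    buffered.sort(key=lambda r: r["name"])
    yield from buffered
```

Asking for just the first output row of uppercase_names reads one source row; asking for the first output row of sort_by_name reads all of them.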
Really know your data – really! If there is one lesson I've learned (and learned again and again - see my previous blog post about real world experience and the value of pain in learning ;-) it is that source systems never behave the way you expect them to and behave as documented even less frequently. Question everything, and then test to validate the answers you retrieve. You need to understand not only the static nature of the data - what is stored where - but also the dynamic nature of the data - how it changes when it changes, and what processes initiate those changes, and when, and how, and why. Odds are you will never understand a complex source system well enough, so make sure you are very friendly (may I recommend including a line item for chocolate and/or alcohol in your project budget?) with the business domain experts for the systems from which you will be extracting data. Really.
Do it in the data source: Relational databases have been around forever (although they did not write the very first song - I think that was Barry Manilow) and have incredibly sophisticated capabilities to work efficiently with huge volumes of data. So why would you consider sorting, aggregating, merging or performing other expensive operations in your data flow when you could do it in the data source as part of your select statement? It is almost always significantly faster to perform these operations in the data source, if your data source is a relational database. And if you are pulling data from sources like flat files, which do not provide any such capabilities, there are still occasions when it is faster to load the data into SQL Server and sort, aggregate and join your data there before pulling it back into SSIS. Please do not think that the SSIS data flow doesn't perform well - it has amazing performance when used properly - but also don't think that it is the right tool for every job. Remember - Microsoft, Oracle and the rest of the database vendors have invested millions of man years and billions of dollars in tuning their databases. Why not use that investment when you can?
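As a rough illustration of pushing the work into the source, the sketch below again uses sqlite3 in place of a real server, with a made-up sales table. The first function pulls every detail row and aggregates in the client, much like an Aggregate plus Sort in the data flow; the second lets the database do both in the select statement and returns only the summary rows.

```python
import sqlite3
from collections import defaultdict


def make_db():
    """Build an in-memory table of made-up sales rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales (region, amount) VALUES (?, ?)",
        [("West", 10), ("East", 5), ("West", 7), ("North", 3), ("East", 20)],
    )
    return conn


def totals_in_client(conn):
    """Pull every row, then aggregate and sort in the client (what to avoid)."""
    totals = defaultdict(int)
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] += amount
    return sorted(totals.items())


def totals_in_source(conn):
    """Let the database aggregate and sort; only summary rows are transferred."""
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    return conn.execute(query).fetchall()
```

The results are identical, but the second version moves three rows instead of five here, and the gap widens quickly as the detail table grows.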
Don’t use Data Sources: No, I don't mean data source components. I mean the .ds files that you can add to your SSIS projects in Visual Studio in the "Data Sources" node that is there in every SSIS project you create. Remember that Data Sources are not a feature of SSIS - they are a feature of Visual Studio, and this is a significant difference. Instead, use package configurations to store the connection string for the connection managers in your packages. This will be the best road forward for a smooth deployment story, whereas using Data Sources is a dead-end road. To nowhere.
Treat your packages like code: Just as relational databases are mature and well-understood, so is the value of using a repeatable process and tools like source code control and issue tracking software to manage the software development lifecycle for software development projects. All of these tools, processes and lessons apply to SSIS development as well! This may sound like an obvious point, but with DTS it was very difficult to "do things right" and many SSIS developers are bringing with them the bad habits that DTS taught and reinforced (yes, often through pain) over the years. But now we're using Visual Studio for SSIS development and have many of the same capabilities to do things right as we do when working with C# or C++ or Visual Basic. Some of the details may be different, but all of the principles apply.

References:
http://blogs.msdn.com/sqllive/archive/2007/05/21/sql-server-integration-services-ssis-10-quick-best-practices.aspx
http://bi-polar23.blogspot.com/2007/11/ssis-best-practices-part-1.html
http://bi-polar23.blogspot.com/2007/11/ssis-best-practices-part-2.html
http://blogs.conchango.com/jamiethomson/archive/2006/01/05/SSIS_3A00_-Suggested-Best-Practices-and-naming-conventions.aspx

Thanks,
Narasimha