Wednesday, April 28, 2010

Emailing work log notification to individual users using SSIS

Emailing work log notification to individual users by using SSIS
Another project that I had worked on was to send emails with attached excel work log files to individual users; I used the same package with some modification to send excel Management MIS reports to CEOs & managers, the Excel file contained pie charts and etc...

Criteria
1. A sample Excel (97 – 2003) file with one sheet named “Report”
2. In the “Report” sheet the field names that need to be reported to the user must be on the first row, e.g.


3. The list of the fields will be used in the SSIS package; in this post I will be only using one field (UserID).
4. The Excel file name must be renamed to “SampleExcelFile.xls”
5. Make a main root folder e.g. “C:\SSIS\Email to list with attached Excel File”
6. Make two other sub folders 1- “Sample” 2- “ToBeEmailed”
7. Copy the Excel Sample File in the “Sample” folder


Note(s)
1. The attached file format will be in *.xls format (Excel (97 – 2003))

Over all View



Defining some variables in the package
One of the most important part of a SSIS package are the variables, because some variables will be defined in the SSIS Configuration File(s) and/or Table(s), so that the package can be used dynamically.
Example: The server name on your computer is different that the server name on production server so basically you can have the “server name variable” defined in the SSIS configuration XML file or table so that it can be changes when deploying the package from one server to another.



1. “uVar_BackupActiveFolder” Each time the package runs it makes a backup folder known as the current backup folder which will be used to save the Excel backup files and/or maybe the Error text file.
2. “uVar_BackupMainFolderPath” Some customers prefer to have the backup folder in another server so basically we allow them to redirect the folders location by using this variable. If this variable is empty it will make a default backup folder name (See backup scripts).
3. “uVar_BackupUserFolder” once the variable “uVar_BackupActiveFolder” is defined, the package will make subfolders with the name of the users, so that it will be easier to trace what was reported on that day and for whom.
4. “uVar_DestinationFileNameRenamed” To move the excel file to the backup folder we need to rename it.
5. “uVar_ListOfUsers” A SSIS variable to save the list of users that needs to be notified.
6. “uVar_MainFolderPath” This is the path where you will be having your main ETL folder root; it can contain the Package, configuration files, backup, sample folder, ToBeProcessed folder and etc... (C:\SSIS\ETLName)
7. “uVar_Source_SQLQueryString” A SQL statement that provides the work log of each user, The TOP statement is been used for design time only, the script string will change in the package loop (check object=> “SCR--- Make the WHERE condition of the Source table”).
8. “uVar_SourceSampleExcelFile” the location of the Excel sample file.
9. “uVar_ToBeEmailedFile” The file that will be emailed individually
10. “uVar_ToBeEmailedFolder” The folder that has the file that needs to be emailed.
11. “uVar_UserEmail” The email of the current user in the Loop.
12. “uVar_UserID” The current user ID in the Loop.
13. “uVar_UserName” The current user name in the Loop.

Defining Connection Manager
I am going to define only one connection in this example I am sure you can set the rest of them yourself.

1. Excel connection manager



Set the “DelayValidation” to TRUE and enter the right the connection string e.g.
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\SSIS\Email to list with attached Excel File\Sample\SampleExcelFile.xls;Extended Properties="EXCEL 8.0;HDR=YES";

2. SMTP Connection Manager

3. SQLServerSourceConnection

You can add the Error handling features in the package.

Adding Object (controls) in the Control Flow
Now we can add the objects as seen in the “Over all View
1. SCR---- Making the backup folder string from BackupFolderName Variable (Dynamic Backup folder yyyy-mm-dd-hh-mm-ss)


Redirect the backup folder in the “uVar_BackupMainFolderPath” variable; if the variable is empty the system will pick up the default folder name “BackUpFolder” in the main folder path of “uVar_MainFolderPath”. E.g.
C:\SSIS\Email ..................Excel File\BackUpFolder\2010-04\2010-04-27-103108.



2. FSYS----Creating the backup folder from BackupFolderName
Making the backup active folder from the “uVar_BackupActiveFolder” variable. C:\SSIS\Email ..................Excel File\BackUpFolder\2010-04\2010-04-27-103108.



3. FSYS----Creating the ToBeEmailed Folder
Creating the folder. C:\SSIS\Email ..................Excel File\ ToBeEmailed.



4. SCR--- Make ToBeEmailedFile String "Folder and File Name"
A “variable pointer” that points to the sample excel file.





5. DFT --- Find the list of users to Email to
We now need to get the list of users that need to be notified about their daily work list report, mainly we need the fields of the UserID, UserName and email, the number of fields and the order of the fields are very important (See: Foreach Loop Container).
The DFT looks like...



Retrieving the list of users from the OLE DB Source object.



Mapping fields.



Saving the list of users in a SSIS variable object, the loop in the package will be looping through this variable user by user



6. Foreach Loop Container
What the Loop section dose, is to loop though the object variable “uVar_ListOfUsers” (that has the list of users that needs to be notified) user by user, make an Excel file for each user, make a backup folders and move the excel file to the right backup folder for each user and finally email to the user with the right excel file attached.
Loop settings...



Now we can see why the order and the number of fields in the SELECT statement are critical because of the fields mapping in the loop.



7. FSYS -- Copy a Template Empty Excel file to the ToBeEmailed Folder
In each loop the package (for each user in the notification list) copies a sample of the Excel sample file (SampleExcelFile.xls) to the “uVar_ToBeEmailedFolder”, the file must have a sheet named “Report”.



8. SCR--- Make the WHERE condition of the Source table



Now we have to make a query string that has the current user work log only, in other words the excel file that will be emailed to each user must have his/her work list report only, so we must filter the report table user by user by using the UserID.

Note: In this select statement you must have all the field names that will be in the excel file as header fields, in my sample I will be only using one field (UserID)
The string will be saved in the “uVar_Source_SQLQueryString” variable that will be used by the “OLE DB Source” that is in the “DFT --- Main Task”.



9. DFT --- Main Task
Saving the work log list for the current user into the excel file.



The OLE Source is pointing to the WorkList table that is been filtered by the current UserID.
In the variable “uVar_Source_SQLQueryString” I had defined a query like “SELECT TOP 1 UserID .....” and I am sure that raised few questions like, why using TOP? Why hard coding a UserID? The answers are in the next step, while developing a package you need to define the fields and you need to be able to map them in the OLE DB source editor, and that’s why.



Saving the work list to the excel file in the “Report” sheet.



10. SCR--- Make Folderstring Like "Backup Folder and UserID"



C:\SSIS\Email .........File \BackUpFolder\2010-04\2010-04-27-103108\FName1 LName1



11. FSYS---- Make Backup folder in the name of the User
C:\SSIS\Email .........File \BackUpFolder\2010-04\2010-04-27-103108\FName1 LName1



12. SCR--- Rename file names with the format of "yyyy-mm-dd-hhmmssff"



The string will look like
C:\SSIS\Email ...... File\BackUpFolder\2010-04\2010-04-27-103108\FName1 LName1\FName1 LName1 2010-04-27-10311137.xls



13. FSYS---- Moving processed files to the "Backup Active Folder"



Moving the file from ....
C:\SSIS\Email ..........File\ToBeEmailed\ SampleExcelFile.xls
To .....
C:\SSIS\Email ...... File\BackUpFolder\2010-04\2010-04-27-103108\FName1 LName1\FName1 LName1 2010-04-27-10311137.xls

14. Send Mail Task


Testing Scenarios
Before testing make sure that you have set the variables according to the snap shot of the variables that I have and check the “Criteria”.
Once you run the package you should be able to see.





Thursday, April 15, 2010

Delete the oldest file in a folder with SSIS

Delete the oldest file in a folder with SSIS
Some cases the developer wants to delete the oldest or the latest file, or maybe just save the name in a variable. In this case I am deleting the oldest file

Criteria
1. Make a folder at “C:\SSIS\DeleteTheOldestFile\FolderToDeleteFrom”
2. Copy some text files in it with different “Created Dates”

Over all View


Defining some variables in the package
One of the most important part of a SSIS package are the variables, because some variables will be defined in the Configuration File(s), so that you can use the package dynamically.


1. “uVar_CurrentFileDateTime” a Defined date for the max date.
2. “uVar_CurrentFileName” The current file name
3. “uVar_FolderNameOfFilesToBeProcessed” the folder that will be looking for the oldest file to be deleted
4. “uVar_OldestFileName” The name of the oldest file.
5. “uVar_SourceFileExtension” The files extension.

Adding Object (controls)
Now we can add the objects as seen in the “Over all View”
1. FLC -- Loop through a folder and seek selected files one by one

Setting the ForLoop to find the text files by using the “uVar_SourceFileExtention” and looking into the folder that is in “uVar_FolderNameOfFilesToBeProcessed”.
By setting the Traverse Subfolders it will look into the sub folders as well.

While loop through files, save the file name and path into the above variable

2. SCR -- Find the oldest file by date and time


Now we can use the variable and a script to find each file’s “Created date” and compare it with the others, by saving the oldest file name and folder in the “uVar_OldestFileName” variable



3. fx
In this section we have to check that is they are any files to be deleted, otherwise the package will fail if no file found.


4. FSYS -- Delete the oldest file
Delete the oldest file


Testing Scenarios
Before testing make sure that you have set the variables according to the snap shot of the variables and have the “Criteria” checked.

Scenario 1:
1- Check the folder “C:\SSIS\DeleteTheOldestFile\FolderToDeleteFrom” as make sure that you have few txt files.
2- Check the Create date of each file
3- Run the package and check the folder again and see what files have been deleted.


Scenario 2:
This it with Excel files
1- Make a folder “C:\MySSIS\MainFolder\DelOldest”
2- Save the folder in “uVar_FolderNameOfFilesToBeProcessed”
3- Set the Variable “uVar_SourceFileExtension” = *.xls
4- Copy some excel files in that folder
5- Run the package
6- See what files have been deleted

Note: Try running the package multiple times.

Looping through list of users in SSIS by using SSIS “Object Variables” and “Recordset Destination”

Looping through list of users in SSIS by using SSIS “Object Variables” and “Recordset Destination”
One of the question that came up for me was how to get a list of users & emails and email them the work list in Excel, the second part of the question was they wanted to keep a archive of the work list by user and date, a kind of a back-up list, so in this case I will be showing how to make the folders by user by date, and I will be using objects such as “Recordset Destination” and ADO.NET in a Loop object.
Criteria
1. ---
Over all View


Defining some variables in the package
One of the most important part of a SSIS package are the variables, because some variables will be defined in the Configuration File(s), so that you can use the package dynamically.


1.“uVar_ArchiveFolder” This is where you want to have the root of the Archive folder.
2.“uVar_ArchiveFolderCurrUser” a variable that will save the current user name and folder, e.g. C:\SSIS\Archive\Smith\2010-04\2010-04-15.
3.“uVar_CurrentUser” An object that will contain the list of Users (Record of users).
4.“uVar_CurrentUserEmail” The name of the current Email.
5.“uVar_CurrentUserName” The name of the current UserName.

Defining Connection Manager
We need 1 connection.
1. Excel connection manager


Adding Object (controls)
Now we can add the objects as seen in the “Over all View”
1. DFT----CreateUserlistToLoopThrough


In this section we are retrieving the list of users and saving them into the SSIS Object variable, which is basically playing a role of a recordset.
The order of the select statement is very important because it will be used in the ForLoop. You can filter the SELECT statement dynamically.

2. FLC----Loop through Users list in SSIS Object using ADO


Setting the LOOP section, this is the part that is related to the order of the fields in the select statement.
3. SCR----Make Folder name by User as a String


A script to make the FOLDER string.

4. FSYS----Create archive folderSetting a “File System task” to make the folders


Testing Scenarios
Before testing make sure that you have set the variables according to the snap shot of the variables and have the “Criteria” checked.

Scenario 1:
1- Run the package and check the folder “C:\SSIS\ArchiveFolder” and its sub folders


Scenario 2:
1- Set the Variable “uVar_ArchiveFolder” = “C:\MyFolder\BackUpFolder”
2- Run the package and check the folder “C:\MyFolder\BackUpFolder” and its sub folders

Thursday, April 8, 2010

Looping Through Excel Files And Sheets in SSIS

Looping Through Excel Files and Sheets
I decided to write a small part of my SSIS projects about how to retrieve data from an Excel (97 – 2003) files. Basically this program will loop through a folder and its sub folders, find all the Excel files and within each Excel file it will loop though all its sheets, retrieve data from each sheet and inserting them to a destination table and finally moving the file to a backup folder after each process.

Criteria
1. The file format must be in a *.xls format (Excel (97 – 2003))
2. All sheets in all Excel files must have the same data format (column/Field names, data type, etc..., all meta data must be the same)
3. A copy of one of the excel files that has been renamed to SampleExcelFile.xls
4. In the SampleExcelFile.xls file you must have one sheet with data (Headers included) as a sample and the sheet name must be named “Tab”. This will help us while we are developing and we always have a sample to show the customer what was implemented.
5. A Folder that will be the main folder of the ETL (in this case it will be named => C:\SSIS\LoopingThroughExcelFilesAndSheets)
6. Two other sub folders 1- “Sample” 2- “ToBeProcessed”
7. Copy the Excel Sample File in the “Sample Folder”
8. All headers must be on the first row of each excel sheet
9. No empty sheets are accepted, remove them all

Over all View



Defining some variables in the package
One of the most important part of a SSIS package are the variables. Because some variables will be defined in the Configuration File(s), so that you can use the package dynamically,
Example 1: The server name on your computer is different that the server name on production server so basically you can have the “server name variable” defined in the XML or table configuration so that it can be changes when deploying the package from one server to another.




1. “uVar_BackupActiveFolder” Each time the package runs it makes a folder in the backup folder this folder will be known as the current backup folder which will be used to save the Excel backup files and/or maybe the Error text file.
2. “uVar_BackupMainFolderPath” Some customers prefer to have the backup folder in another server so basically we allow them to have a choice, by adding this variable (See the script section for more details). If this variable is empty it will make a default backup folder name (See backup scripts).
3. “uVar_DestinationDBName” The name of the Destination Database.
4. “uVar_DestinationServerName” The name of the Destination SQL Server.
5. “uVar_DestinationTableName” The name of the Destination Table in SQL Server.
6. “uVar_ErrorOutPutFile” This is the path and file name of the Error file, each time this path and file name changes so that we can keep a backup of each error log file.
7. “uVar_ExcelActiveTabName” The name of the active Excel Sheet name (the default value is “Tab$”) this is more needed in the design mode
8. “uVar_ExcelTabObjectName” A variable to keep the list of the sheet names of each Excel file
9. “uVar_FolderNameOfFilesToBeProcessed
10. “uVar_MainFolderPath” This is the path where you will be having your main ETL; it can contain the Package, configuration files, backup, sample folder, ToBeProcessed folder and etc... (C:\SSIS\LoopingThroughExcelFilesAndSheets)
11. “uVar_NumberOfExtractedRows” A row counter that won’t be used in this example.
12. “uVar_SMTPEmailServerName” The SMTP name or IP address of it
(e.g. 1: Neptune e.g. 2: http://www.mars.com/ e.g. 3: 10.2.10.56)
13. “uVar_SMTPSendEmailsTo_FromError” A List of email address to email the error file. (Snikkhah@Live.ca ; YourEmail@YourBox.Com )
14. “uVar_SourceConStr_ConnectionString” The Excel connection string
E.g. Provider=Microsoft.Jet.OLEDB.4.0;Data Source = C:\SSIS\LoopingThroughExcelFilesAndSheets\Sample\SampleExcelFile.xls;Extended Properties="EXCEL 8.0;HDR=Yes; IMEX=1;"
15. “uVar_SourceConStr_DataSource” A variable that will have the file name and folder (See the first loop property).
16. “uVar_SourceConStr_ExtendedProperties” Variables that have a part of the connection string hard coded (e.g. EXCEL 8.0;HDR=Yes; IMEX=1;)
17. “uVar_SourceConStr_PersistSecurityInfo” Variables that have a part of the connection string hard coded
18. “uVar_SourceConStr_Provider” Variables that have a part of the connection string hard coded
19. “uVar_SourceFileExtention” This variable will define the file extension (e.g. *.xls).
20. “uVar_SourceFileNameRenamed” Before moving a file to the backup folder we will have to rename and set the right backup folder in this variable.
21. “uVar_SourceMultiTabExcel” Some Excel files have few sheets but only one of them contains the data so I am using this variable to set my package to pickup only on defined sheet name (FALSE = Single sheet , TRUE = Multiple sheet with the same data format). In this example it must be set to “TRUE”


Defining Connection Manager
We need 4 connections, in this example I will not be setting the error and email section you can check my blog for that.

1. Excel connection manager


2. FF_DST----Error Output file for debugging


3. SQLServerDestinationConnection


4. SMTP Connection Manager For Error messages





Adding Object (controls) in the Control Flow
Now we can add the objects as seen in the “Over all View”
1. SCR---- Making the "ToBeProcessed Folder"




In this section the customer can redirect the source folder by entering the folder full path name in the “uVar_FolderNameOfFilesToBeProcessed” variable, if not the system will pick up the default name “ToBeProcessed”.

2. SCR---- Making the backup folder string from BackupFolderName Variable (Dynamic Backup folder yyyy-mm-dd-hh-mm-ss)




Same settings again the customer can define the backup folder in the variable “uVar_BackupMainFolderPath”, if not the Package will pick the default folder name “BackUpFolder”. The other thing that happens in this script is that the current backup folder name string is been made,

e.g. C:\SSIS\LoopingThroughExcelFilesAndSheets\BackUp\2010-04\2010-04-10-141000.

3. FSYS----Creating the backup folder from BackupFolderName


This will make the backup folder from the “uVar_BackupActiveFolder” which will be C:\SSIS\LoopingThroughExcelFilesAndSheets\BackUp\2010-04\2010-04-10-141000.

4. SCR--- Make the correct path and file name for the Error Text File



What I am doing is making a file name string that has, the date and time, ETL Name, etc... This will be needed it an error accurse

5. SQL----Truncate the destination Table



6. FLC - Looping through each Excel file - Finding the Data Source


By using the variable “uVar_FolderNameOfFilesToBeProcessed” the first loop will be looping through the folder and sub folder to find *.xls files “name and folder” and each file folder and name will be saved in the “uVar_SourceConStr_DataSource” e.g. C:\SSIS\LoopingThroughExcelFilesAndSheets\ToBeProcessed\AnyExcelFileName.xls

7. SCR - Make the connection string for each file


The settings are...
ReadOnlyVariable= uVar_SourceConStr_Provider, uVar_SourceConStr_DataSource, uVar_SourceConStr_PersistSecurityInfo, uVar_SourceConStr_ExtendedProperties, uVar_ExcelTabObjectName
ReadWriteVariable= uVar_SourceConStr_ConnectionString


As you know that each “Excel Connection manager” object can only point to one Excel file, so each time (On each First Loop) we have to set the “Excel Connection manager” to point to the current Excel file (“uVar_SourceConStr_DataSource”).

8. SCR - Find Tabs in Source file (tabs Only Not "Named Ranged")


Now we have the connection string in the “uVar_SourceConStr_ConnectionString” variable, by using that and this script we will be looping though each Sheet and finding that names of them, and finally saving them in the “uVar_ExcelTabObjectName” variable which is a SSIS Object variable.


Remember the variable “uVar_SourceMultiTabExcel” we use it if the Excel file sheet has multiple sheets but only one can be used, if the this is the scenario that you have your customer must not change the sheet name and the sheet name must be hard coded in the variable “uVar_ExcelActiveTabName” (e.g. Ticket Details$)

9. FLC - Looping through each Excel file TAB


The second loop will loop through the “uVar_ExcelTabObjectName” variable which contains the list of sheet names of the current Excel file.

10. DFT---- MAIN ETL (E and T Only)

11. Ex_SRC---- Connecting to Excel file to extract Data
How is this section working? The sheet name is in the “uVar_ExcelActiveTabName” which is set to the this object and its Connection is using the “Excel Connection Manager” which is set to the current Excel file by the variable “uVar_SourceConStr_ConnectionString” (That it’s been set to DelayValidation = TRUE)



12. OLE_DST----SQL SERVER Destination table


Same story as the previous step except it’s been set for the destination table in SQL server

13. SCR--- Rename file names with the format of "yyyy-mm-dd-hhmmssff"


Again making a string that will rename the current file that has been processed and save it in the current backup folder




14. FSYS---- Moving processed files to the "Backup Active Folder"


Moving the processed file to that final backup folder

15. End of the first Loop, Loop for next file

Testing Scenarios

Before testing make sure that you have set the variables according to the snap shot of the variables that I have and check the “Criteria”.

Scenario 1:
Excel files with multiple sheets (all sheet s have the same data format)
1- Set the Variable “uVar_SourceMultiTabExcel” = TRUE
2- Copy Excel files to the ToBeProcessed folder
3- Run the package
4- Check the destination table
5- Check the backup folder and find the current date time folder and see if the file have been moved from the ToBeProcessed folder to the new current backup folder

Scenario 2:
Excel files with single data sheets (the data sheet must be hard coded)
1- In the sample file do not rename the data sheet (do to rename it to Tab$, keep it as it is because this will be the fixed sheet name that will be in the Excel file)
2- Set the Variable “uVar_SourceMultiTabExcel” = False
3- Name the “uVar_ExcelActiveTabName” as the sheet name agreed with the customer e.g. “Sales$”
4- Copy Excel files to the ToBeProcessed folder
5- Run the package
6- Check the destination table
7- Check the backup folder and find the current date time folder and see if the file have been moved from the ToBeProcessed folder to the new current backup folder