Friday, June 18, 2010

Chronological Order (List files by oldest first)

Chronological Order (List files by oldest first)
Extracting (listing) the files in a chronological order by, (this example “Modified Date”).

Criteria
--
Over all View


Defining some variables in the package


1. “uVar_ArrayOfFilesProperties” An Object that will contain each files information.
2. “uVar_CurrentFileName” Name of the current file
3. “uVar_DateCreated” File creation date
4. “uVar_DateMofified” File creation modified
5. “uVar_FileExtention” i.e. “xls” for excel, “txt” for text.
6. “uVar_FileName” The file name only without the file extension.
7. “uVar_FirstTime” A flag for the flow of the package
8. “uVar_FolderNameOfTheFilesToBeProcessed” The main folder root of the files that need to be in a chronological order.
9. “uVar_Path” The path of each file
10. “uVar_PathFileName” The path, filename and extension all together
11. “uVar_SourceFileExtention” This variable will define the file extension (e.g. *.csv).

Adding Object (controls) in the Control Flow
Adding the objects as seen in the “Over all View”

1. FELC--Looping through files and folder
The first loop will find the file and process them one by one


2. SCR--Getting each files property information
The script will retrieve each files property; it’s like right clicking on a file and selecting properties.



The files properties information will be saved in the variables (See the script or check the variables with a red dot in the variable table).

3. DFT--List of sorted files by ModifiedDate
Over view of the DFT


In the DFT you will have
a. SCR--Appending the records in SSIS object variable with the current file info for the output records

You will have to use the ADD COLUMN option and add each Fields; you will have to select the right data type for each field



(Example for data type: DateModified = [DT_DBTIMESTAMP] , Path = [DT_WSTR] 3000, FileName = = [DT_WSTR] 200 , etc...)


All variable that contain the files information must be passed on to the Script component and the “User::uVar_ArrayOfFileProperties” variable.
The files properties information will be saved in the variables (See the script or check the variables with a red dot in the variable table).
• Scrip
The script is divided into 2 section, the first section is before the “IF” and the second section is within the “IF”, the first section is adding the current files information into the Output Buffer of the script component, the second part is looping through all the records in the SSIS object “uVarArrayOfFileProperties” and adding them to the Output Buffer of the script component, so the output result will always be the current files and the previous files information.
---------------
Public Overrides Sub CreateNewOutputRows()
'
'
With MainOutputrecordsBuffer
.AddRow()
.PathFileName = Me.Variables.uVarPathFileName
.Path = Me.Variables.uVarPath
.FileName = Me.Variables.uVarFileName
.FileExtension = Me.Variables.uVarFileExtension
.DateCreated = Me.Variables.uVarDateCreated
.DateModified = Me.Variables.uVarDateModified
End With

If Me.Variables.uVarFirstTime = False Then

Dim oleDA As New Data.OleDb.OleDbDataAdapter
Dim dt As New Data.DataTable

Dim row As System.Data.DataRow

Dim strPathFileName As Integer
Dim strPath As Integer
Dim strFileName As Integer
Dim strFileExtension As Integer
Dim strDateCreated As Integer
Dim strDateModified As Integer

oleDA.Fill(dt, Me.Variables.uVarArrayOfFileProperties)

strPathFileName = dt.Columns("PathFileName").Ordinal
strPath = dt.Columns("Path").Ordinal
strFileName = dt.Columns("FileName").Ordinal
strFileExtension = dt.Columns("FileExtension").Ordinal
strDateCreated = dt.Columns("DateCreated").Ordinal
strDateModified = dt.Columns("DateModified").Ordinal

For Each row In dt.Rows
With MainOutputrecordsBuffer
.AddRow()
.PathFileName = row(strPathFileName).ToString()
.Path = row(strPath).ToString()
.FileName = row(strFileName).ToString()
.FileExtension = row(strFileExtension).ToString()
.DateCreated = CType(row(strDateCreated).ToString(), DateTime)
.DateModified = CType(row(strDateModified).ToString(), DateTime)
End With
Next
MainOutputrecordsBuffer.EndOfRowset()

End If
End Sub

----------------

b. SRT--Sorting the list by "DateModified"
This is almost where you have to choose your chronological order , I have selected the modified date, why?


Do you think that you can find a files property that says the Create date is bigger that the modified date (Create = 2010-01-01 and Modified = 2007-01-01)? Yes you can, a file can be made on a computer on 2005, and be modified on the same computer on 2007, so far so good, and the same file can be pushed to a another computer on 2010-01-01 so on the second computer the files properties show that Create Date = 2010-01-01 and Modified Date= 2007-01-01.
c. RS_DST--Saving records in the SSIS objects
Now we have to save the sorted result to a SSIS Object variable.


Please remember to select all fields.


4. SCR--Setting a flag
Passing on the flag variable to set the flag to FALSE, the reason we do this is that, if you look at the script in the DFT you will see that the flag is been used because the SSIS Object does not have a structure format for the first time so we won’t be able to use the object (See in script “oleDA.Fill(dt, Me.Variables.uVarArrayOfFileProperties)”).




5. FELC--Looping the sorted file list
Setting the second file


This will loop through each record in the object one by one and save the value in the mentioned variables.


6. SCR-- Show the result in a MsgBox (For test only)
In the second loop I am sure that you will be having a DFT to do the main process in my case I just want to display the sorted order.


Note: Step 6 is only for testing only, please delete the object after testing.


Testing Scenarios
Before testing make sure that you have set the variables according to the snap shot of the variables that I have and check the “Criteria”.

Scenario 1:
1- Select a folder that contains the files that you need to be sorted
2- Cut and past the folder into the SSIS variable “uVar_FolderNameOfFilesToBeProcessed”, this will point to that folder
3- Maybe you want to select a certain group of files like *.csv then you need to put “*.csv” in the variable “uVar_SourceFileExtention”
4- Run the package and you will see a MsgBox showing the files properties, check the files Modified date with the results of the MsgBox

Note: Step 6 is only for testing please delete the second SCR after testing.

2 comments:

  1. Thanks a lot for posting this! It really helped me out a lot with a similar project.

    ReplyDelete
  2. I guess this is a very interesting and informative blog post about the usages of SSIS and other very interesting components included with it.

    SSIS Postgresql Read

    ReplyDelete