Wednesday 10 June 2015

Reading Multi Lingual data from a Flat File using SSIS Package


 Reading Multi lingual File using SSIS Package


 When we are dealing with the business which has operations across the geographic regions, we will get customer information in respective regional languages. We have to read the data in the same language and maintain in the database tables.


In this post, I am going to explain how to read a flat file which has data in different languages using SSIS package. There are multiple ways to achieve this. I am going to explain one of them.


Following is the screen shot of the flat file I am going to use in SSIS package, which has records in different languages. I have created a column to tell you it is in which language.




Here are the languages you can find in the file -
--------------------------------------------------------------------------------
1st Record is in Telugu (South Indian Language)
2nd Record is in Kannada (South Indian Language)
3rd Record is in Tamil (South Indian Language)
4th Record is in English
5th Record is in Japanese Language
--------------------------------------------------------------------------------


Following are the steps to create the Package.


1. Create an SSIS Project using SQL Server Data Tools Wizard.




 2. Create a package.




 3. Create a flat file Connection




 4. Choose FlatFile Connection Manager




 5. Browse through the files and choose appropriate file.









6. Select the code page as 65001 (UTF-8). This is the place where we are going to give instructions to engine to read data in Unicode format.






7. Choose the appropriate text qualifier.






 8. Go to column properties to set the row delimiter and the column delimiter.




 9. Navigate to advanced properties to change some properties at each column level like
  1. Column Delimiter
  2. Column Width
  3. Text qualified to true or false at the column level.   
  4. Data type to Unicode String [DT_WSTR]





 10. Preview the data after setting the properties.




 11. Let’s create a Data Flow Task to read data from source flat file and load into a database table.




12. Select the connection manager and preview to cross check the data in file.





 13. Here we can change the output column names if needed.



 14. Drag and Drop OLEDB destination





 15. Select the Destination connection manager and the destination table.




 16. Click on Columns and map source columns with target. As I told you in my previous post, SSIS maps the columns based on the names by default. If it is finding the same column names in both source and Target, it will map them. So, make sure the column mappings are correct. Otherwise, you can manually change the mappings.






 17. Save the package and execute the task.




 18. Here is the output after successfully executing Data Flow Task.






Hope this post is helpful to those who are dealing with multi lingual data.


All the best.

No comments:

Post a Comment