Skip to main content

SSIS: Transform All String Columns in a Data Flow Stream

By April 2, 2012May 7th, 2021Press

How many times has a Quality Assurance person asked, “Can we trim the extra spaces off all of the text columns?” Or even “Can we convert all of the blanks to nulls?” In the past, I would just roll my eyes (mostly because I forgot to check this before sending to QA in the first place) and go through the monotonous process of updating each of my string columns with a trim and conversion to null. However, I recently was asked to do just this for a file/table containing over 200 columns. I figured there must be an easier way. After a few minutes of searching through the capabilities of an SSIS Script Task, I created the following solution using a few lines of C#:

  1. Place a Script Component in the Data Flow where the data needs to be trimmed and converted from blank to null. Choose “Transformation” as the type and be sure to connect the source to the Script Task.
  2. Double click the new Script Task to open its editor
  3. Select Input Columns from the left pane, select the string columns and set the “Usage Type” of each to “ReadWrite” (I just set all of my columns to ReadWrite to make sure)
  4. Select Script from the left pane
  5. Choose Microsoft Visual C# 2008 for the ScriptLanguage property
  6. Click “Edit Script…” button
  7. Remove everything inside of the ScriptMain class (Remove all 3 methods: PreExecute(), PostExecute() and Input0_ProcessInputRow(Input0Buffer Row)
  8. Insert the following code inside of the curly braces for the ScriptMain class:
public override void ProcessInput(int InputID, Microsoft.SqlServer.Dts.Pipeline.PipelineBuffer Buffer)

{

while (Buffer.NextRow())

{

var cols = Buffer.ColumnCount;

//loop through each column

for (var i = 0; i < cols; i++)

{

//get datatype of the column

var type = Buffer.GetColumnInfo(i).DataType;

//only operate on columns of type DT_STR or DT_WSTR

if (type == DataType.DT_STR || type == DataType.DT_WSTR)

{

//grab the data from the column and trim it

var colData = Buffer[i].ToString().Trim();

if (string.IsNullOrEmpty(colData))

{

//if the trimmed data is blank or null, set the output to null

Buffer.SetNull(i);

}

else

{

//if the column has data, set the output to the trimmed version

Buffer.SetString(i, colData);

}

}

}

}

}

 

The function loops through all columns in the stream, searching for columns of data type DT_STR or DT_WSTR, trims the data, and converts empty strings to nulls.  However, this function could easily be tweaked to perform any operation that would apply to all columns of a single data type.  Simply update the line that contains “type=DataType.DT_STR” to contain the data types to be operated on and edit the block of code inside to perform the transformations that are needed for those data types.  To set the output data, use any of the Buffer.Set* methods, passing in the column index (i in the example) and new data (colData in the example) after applying the necessary transformation logic.  To set a column null, use the Buffer.SetNull method, passing only the index to the column that needs to be null.

 

Book A Discovery Call

Fill out the form below to schedule your 20-minute discovery call.

  • This field is for validation purposes and should be left unchanged.
Close