Adding Or Removing Dashes From A Social Security Number In SQL

Social security numbers tend to be stored in one of two ways in a database.  Typically they are stored as a character string with a length of either 9 or 11, depending on whether or not they include dashes.  I recently built a report that pulled employee data from a 3rd party database that we get on a daily basis.  It’s basically a data dump of the 3rd party’s database that we use solely for reporting.  For some reason, there does not appear to be any consistency with the data being provided in terms of the social security numbers. Some are stored with dashes, while others are not.  It’s likely this is a free-form field within the application used to enter the data and no validation is being performed.

There are several ways this could be resolved.  One option would be to modify the data in the database to either remove all dashes or insert the dashes where they do not exist.  However, as stated above, this data is overwritten every day so if I went with this option, I would have to create an automated job to handle this.  Since this database is only used for reporting purposes, the better option, in this case, is to just handle this within the report itself, though I will discuss the first option below as well.

Inserting the dashes is actually quite simple.  The code shown below demonstrates inserting the dashes into a character string to conform with the standard format of a social security number.

Since this is simply an example I have created a variable named @ssn and SET it to represent a social security number without any dashes so that I can then run a SELECT against it to demonstrate the necessary syntax to insert the dashes.

If I were incorporating this into an actual report I would simply replace the @ssn variable within the SELECT with the column name of the actual SSN.  I would also need to add some additional logic to ignore any social security numbers that already include the dash.  An example of this can be seen further down when I demonstrate updating the data in the table.

Running the above code will display the following:


Additionally, you can also remove the dashes from a social security number.  The following code will replace all dashes with a blank value, thus removing them from the data.

The screenshot below shows this code being executed along with the results.


As I mentioned above, handling this within the report was the best option in this case.  However, there could be cases where you might want to just update the data within the database to allow for the data to be consistent within the table.  However, before going this route you will need to make sure the column containing the social security number data can store values with a length of 11 or more if inserting dashes.

The examples below demonstrates updating a table named testssn and modifying the data based on the initial length. If the length is equal to 9 it assumes there are no dashes and inserts them in the appropriate location.  If the length is equal to 11 it removes the dashes.

Though this script is by no means perfect, as it does not account for bad data being entered by the users, it should accomplish the required task.  Since in my case the data is being entered free-form by users, which is why the data was inconsistent in the first place, an additional step might be necessary once the data is updated.  Once the update has been made, it would be a good idea to check for any data in the table column where the length is not equal to either 9 or 11, based on the option you selected.  This should return only values where the social security number was not updated.  An example of this could be a social security number that was entered with only 8 digits.