Wednesday, September 9, 2009

Excel Macros : Remove Duplicates Excel

Excel 2007 has built in command to remove duplicates, however earlier versions of excel does not have built in command. You need to use VBA routine to do the job.

Use following code to Identify duplicate values in a column and Remove Rows having Duplicate Values

After identifying the duplicate values system will give you warning and ask for confirmation to delete the entire row(s) having duplicate values. You can choose No in case you do not want to delete them.

You will need to copy this code to regular VBA module of your workbook

Sub Find_RemoveDuplicates()

'****************************************************************************************************
'* Macro recorded by Yogesh Gupta, smiley123z@gmail.com, Ygblogs.blogspot.com on September 9, 2009 *
'* Macro Modified on September 13, 2009 for not deleting empty cells within duplicate values column *
'****************************************************************************************************


Dim cRow As Long
Dim lRow As Long
Dim sCell As Range
Dim D As Long

lRow = GetLastRowWithData

On Error Resume Next
Set sCell = Application.InputBox(Prompt:= _
"Select Starting Row of Column with Duplicate values.", _
Title:="Select Column", Type:=8)

On Error GoTo 0
If sCell Is Nothing Then
Exit Sub
End If

sCell.Select

If Selection.Cells.Count > 1 Then
MsgBox ("Pls select Single Cell only")
Exit Sub

Else

dCol = ActiveCell.Column
sRow = ActiveCell.Row

For cRow = lRow To sRow Step -1
If IsEmpty(Cells(cRow, dCol)) = False Then
If WorksheetFunction.CountIf(Range(sCell, Cells(cRow, dCol)), Cells(cRow, dCol).Text) > 1 Then
Cells(cRow, dCol).Interior.Color = 255 'This Marks the duplicate values as Red
D = D + 1
End If
End If
Next cRow

If D = 0 Then

MsgBox ("No Duplicate Values Found")

Else

caution = MsgBox(D & " Duplicate entries selected and marked RED" & vbCrLf & _
"Do you want to delete them? " & vbCrLf & _
"Entire Row for the marked entries will be deleted. " & vbCrLf & _
"Do you want to Continue?", vbYesNo, "Confirmation")

If caution = vbYes Then

For cRow = lRow To sRow Step -1
If IsEmpty(Cells(cRow, dCol)) = False Then
If WorksheetFunction.CountIf(Range(sCell, Cells(cRow, dCol)), Cells(cRow, dCol).Text) > 1 Then
Cells(cRow, dCol).EntireRow.Delete ' This deletes the entire row
End If
End If
Next cRow

End If
End If
End If

End Sub

'********************************************************************
'* UDF to Get Last Row with Data on worksheet *
'********************************************************************

Public Function GetLastRowWithData() As Long

Dim ExcelLastCell As Object, lRow As Long, lLastDataRow As Long, l As Long

Set ExcelLastCell = ActiveSheet.Cells.SpecialCells(xlLastCell)
lLastDataRow = ExcelLastCell.Row
lRow = ExcelLastCell.Row

Do While Application.CountA(ActiveSheet.Rows(lRow)) = 0 And lRow <> 1
lRow = lRow - 1
Loop

lLastDataRow = lRow
GetLastRowWithData = lLastDataRow

End Function

Download file having Macro to Find and Remove Duplicates in Excel.



Above works well in case you are trying to clean a data table on the sheet, however if you have lots of other data also on the sheet, you should try the Find and Remove duplicates within Range

RemoveDuplicates

You will need to copy this code to regular VBA module of your workbook

Sub Find_RemoveDuplicatesWithinRange()

'***************************************************************************************************
'* Macro recorded by Yogesh Gupta, smiley123z@gmail.com, Ygblogs.blogspot.com on September 13, 2009 *
'***************************************************************************************************

Dim cRow As Long
Dim sRow As Long
Dim lRow As Long
Dim sCol As Long
Dim lCol As Long
Dim dColumn As Long
Dim D As Long
Dim sCell As Range
Dim dRange As Range


On Error Resume Next

Set dRange = Application.InputBox(Prompt:= _
"Select Data Range with Duplicate values.", _
Default:=Selection.Address, Title:="Select Range", Type:=8)

If dRange Is Nothing Then
Exit Sub
End If

With dRange
sRow = Cells(.Row, .Column).Row
sCol = Cells(.Row, .Column).Column
lCol = Cells(.Row, .Column + .Columns.Count - 1).Column
lRow = Cells(.Row + .Rows.Count - 1, .Column + .Columns.Count - 1).Row
End With

dColumn = 0

Do While dColumn = 0

Set sCell = Application.InputBox(Prompt:= _
"Select Column with Duplicate values.", _
Title:="Select Column", Type:=8)

On Error GoTo 0

If sCell Is Nothing Then
Exit Sub
End If


If sCell.Columns.Count > 1 Then

MsgBox ("Pls select Single Column only")

Else

dCol = sCell.Column
If dCol < sCol Then
MsgBox ("Select Column within earlier Range")
ElseIf dCol > lCol Then
MsgBox ("Select Column within earlier Range")
Else
dColumn = 1
End If

End If

Loop

For cRow = lRow To sRow Step -1
If IsEmpty(Cells(cRow, dCol)) = False Then
If WorksheetFunction.CountIf(Range(Cells(sRow, dCol), Cells(cRow, dCol)), Cells(cRow, dCol).Text) > 1 Then
Range(Cells(cRow, sCol), Cells(cRow, lCol)).Select
Selection.Interior.Color = 255
D = D + 1
End If
End If
Next cRow

Cells(sRow, sCol).Select

If D = 0 Then

MsgBox ("No Duplicate Values Found")

Else

caution = MsgBox(D & " Duplicate entries selected and marked RED" & vbCrLf & _
"Do you want to delete them? " & vbCrLf & _
"Marked entries will be deleted and data will move up " & vbCrLf & _
"Do you want to Continue?", vbYesNo, "Confirmation")

If caution = vbYes Then

For cRow = lRow To sRow Step -1
If IsEmpty(Cells(cRow, dCol)) = False Then
If WorksheetFunction.CountIf(Range(Cells(sRow, dCol), Cells(cRow, dCol)), Cells(cRow, dCol).Text) > 1 Then
Range(Cells(cRow, sCol), Cells(cRow, lCol)).Select
Selection.Delete Shift:=xlUp
End If
End If
Next cRow
Cells(sRow, sCol).Select
End If
End If

End Sub

Download file having macro to find and remove duplicates within a range



Delete Duplicates , Delete Duplicate , Remove Duplicates , Remove Duplicate , Find Duplicates , Find Duplicate , Excel Delete Duplicates , Remove Duplicates Excel , Remove Duplicate Excel

7 comments:

  1. How do you remove both the original and the duplicate?

    ReplyDelete
  2. Yogesh,

    I tried running your macro to delete rows with duplicate data in the column which contains "Description" i.e. long sentences. It gives me an errors "Run-time error 1004, Unable to get the CountIf property of the WorksheetFunction class". Do you know how to resolve this. The macro runs fine if I try to remove duplicates with minimum string length, but to remove duplicate sentences, it gives the above error.

    Thanks,
    Robin

    ReplyDelete
  3. Looks like there's a minor change required for the first macro. The alert should read "Select Starting Cell For Column with Duplicate values." because it's looking for a cell reference, not the row #

    ReplyDelete
  4. This is an amazing Macro, just not exactly what i need. I am trying to change (Sorry I am a total n00b with macros)it and make it delete all the duplicated, but the last one. I am doing a list of IDs and years but i need to delete all dublicated reports, but the last one. Can anyone help me, I have already tried changing the For order so go from first to last i guess it is ok this way:

    For cRow=sRow to lRow step 1
    Next cRow

    But actually it didnt change anything at all help me please

    ReplyDelete
  5. In the example it has

    1 AA
    2 AA

    Well, I need you help to make it delete the 1 AA instead of the 2 AA

    11 FF
    12 FF
    13 FF
    14 FF
    15 FF I want to all the previous be deleted expet the last one, please help i just coudlnt figure it out

    ReplyDelete
  6. removing duplicate rows ..your macro helped me a lot,
    thanks you

    AMIT TARALI

    ReplyDelete
  7. I want to convert this chart:

    Name People
    XYZ KK
    XYZ KY
    ABC LL
    ABC LY
    BBC XY
    BBC XY
    CBA XX
    DBC XX
    DDD XX
    DDD XY
    DDD XX
    to this chart:

    XYZ KK KY
    ABC LL LY
    BBC XY XY
    CBA XX
    DBC XX
    DDD XX XY XX

    Is there any macro for that?

    ReplyDelete