Filter to display only duplicate entries in Excel 2010

Question

Filter to display only duplicate entries in Excel 2010

I have two files: Old and New. I need to delete any entry that Old contains from New. So I exported both files to a tab delimited text file, and then merged the two text files, so now I have a lot of duplicate entries. My hopes were that I could come up with some kind of formula showing only duplicate values, so I could just remove them from there. Now, how can I go from the point where I am now?

An example of the old:

[41] [john] [smith] [732] [bill] [murry] [234] [adam] [sandler] [9029] [tim] [allen]

New example:

 [10422] [bill] [gates] [11456] [steve] [ballmer] [732] [bill] [murry] [24510] [jimmy] [jones] [41] [john] [smith] [234] [adam] [sandler] [78241] [funny] [bunny] [9029] [tim] [allen]

Example combined:

 [10422] [bill] [gates] [11456] [steve] [ballmer] [732] [bill] [murry] [732] [bill] [murry] [24510] [jimmy] [jones] [41] [john] [smith] [41] [john] [smith] [234] [adam] [sandler] [234] [adam] [sandler] [78241] [funny] [bunny] [9029] [tim] [allen] [9029] [tim] [allen]

An example of what I want to show:

 [732] [bill] [murry] [732] [bill] [murry] [41] [john] [smith] [41] [john] [smith] [234] [adam] [sandler] [234] [adam] [sandler] [9029] [tim] [allen] [9029] [tim] [allen]

The name of the column that I will check / compare is "voter_id", which is the field of the number that you see in the examples. I hope that I just show only these entries, I can select them all and delete them. Then remove the filter / formula and leave something like:

 [10422] [bill] [gates] [11456] [steve] [ballmer] [24510] [jimmy] [jones] [78241] [funny] [bunny]

I apologize for asking a separate question for this, but I found a way to ask the same thing differently, so that it makes more sense to me. These are essentially the same results, but in two completely different methods.

+7

filter duplicates excel formula

Shea Jan 31 '12 at 3:58

source share

2 answers

Try conditional formatting On the home tab - Conditional formatting - highlighting cell rules - duplicate values then filter by color and delete duplicates.

+23

Raystafarian Jan 31 '12 at 17:31

source share

S200 · Accepted Answer · 2012-01-31T06:41:31+0000

There are many ways to do this. If you are not familiar with the formulas in Excel, the easiest solution that I consider for you now is to use a pivot table:

Select a table (including headers)
Paste → Pivot Table → OK
If you do not see the field list, right-click on the pivot table and select "Show Field List".
Now in the pivot table, drag all the columns to Row Labels.
Drag the voter_id column to Values.
Your pivot table should now have a column that says "Total voter_id". Right-click on it and select "Summarize Data" with → Count.
Now you can see all your entries and the number of entries. You can select all cells in the pivot table, copy and paste the values back into the original table. Then you can use a simple filter to show only records with more than one occurrence.

If you like formulas, you can simply add a new column to this column COUNTIF , use COUNTIF to find out the number of times voter_id exists in your table. After that, use a simple filter to filter out those that have count = 1.

Another easy way to do this is, instead of sharing the two files, simply use VLOOKUP to determine if each entry exists in one file in the other.

Filter to display only duplicate entries in Excel 2010

More articles: