Filter to display only duplicate entries in Excel 2010

I have two files: Old and New. I need to delete any entry that Old contains from New. So I exported both files to a tab delimited text file, and then merged the two text files, so now I have a lot of duplicate entries. My hopes were that I could come up with some kind of formula showing only duplicate values, so I could just remove them from there. Now, how can I go from the point where I am now?

An example of the old:

[41] [john] [smith] [732] [bill] [murry] [234] [adam] [sandler] [9029] [tim] [allen] 

New example:

 [10422] [bill] [gates] [11456] [steve] [ballmer] [732] [bill] [murry] [24510] [jimmy] [jones] [41] [john] [smith] [234] [adam] [sandler] [78241] [funny] [bunny] [9029] [tim] [allen] 

Example combined:

 [10422] [bill] [gates] [11456] [steve] [ballmer] [732] [bill] [murry] [732] [bill] [murry] [24510] [jimmy] [jones] [41] [john] [smith] [41] [john] [smith] [234] [adam] [sandler] [234] [adam] [sandler] [78241] [funny] [bunny] [9029] [tim] [allen] [9029] [tim] [allen] 

An example of what I want to show:

 [732] [bill] [murry] [732] [bill] [murry] [41] [john] [smith] [41] [john] [smith] [234] [adam] [sandler] [234] [adam] [sandler] [9029] [tim] [allen] [9029] [tim] [allen] 

The name of the column that I will check / compare is "voter_id", which is the field of the number that you see in the examples. I hope that I just show only these entries, I can select them all and delete them. Then remove the filter / formula and leave something like:

 [10422] [bill] [gates] [11456] [steve] [ballmer] [24510] [jimmy] [jones] [78241] [funny] [bunny] 

I apologize for asking a separate question for this, but I found a way to ask the same thing differently, so that it makes more sense to me. These are essentially the same results, but in two completely different methods.

+7
source share
2 answers

There are many ways to do this. If you are not familiar with the formulas in Excel, the easiest solution that I consider for you now is to use a pivot table:

  • Select a table (including headers)
  • Paste β†’ Pivot Table β†’ OK
  • If you do not see the field list, right-click on the pivot table and select "Show Field List".
  • Now in the pivot table, drag all the columns to Row Labels.
  • Drag the voter_id column to Values.
  • Your pivot table should now have a column that says "Total voter_id". Right-click on it and select "Summarize Data" with β†’ Count.
  • Now you can see all your entries and the number of entries. You can select all cells in the pivot table, copy and paste the values ​​back into the original table. Then you can use a simple filter to show only records with more than one occurrence.

If you like formulas, you can simply add a new column to this column COUNTIF , use COUNTIF to find out the number of times voter_id exists in your table. After that, use a simple filter to filter out those that have count = 1.

Another easy way to do this is, instead of sharing the two files, simply use VLOOKUP to determine if each entry exists in one file in the other.

+6
source

Try conditional formatting On the home tab - Conditional formatting - highlighting cell rules - duplicate values ​​then filter by color and delete duplicates.

+23
source

All Articles