Duplicate Table
Has this ever happened to you? You are doing left joins to tables
and somehow your most recent table has more records than it should. What
happened? You were careful to ensure your ID field is unique, but for
some unexplained reason you have too many records. The best way to solve
the problem is to identify if your table has duplicates and what those
duplicates are. I like to create a sub-table that lists all of the fields
of every record that is part of a duplicate. The code below will quickly
create a table called "dups" that contains every record that is part
of a duplicate and sort it by the field in question.
Create a
table of duplicate records
%Let
TableName = TableName;
%Let
FieldName = FieldName;
proc sql;
create table dups as
select *
from &TableName
where ((&FieldName in(select &FieldName from &TableName as tmp
group by &FieldName
having Count(*)>1)))
order by &FieldName
;
Quit;
This query uses the in
statement to query against itself and pulls all fields from all
records where your field name occurs more than once. The limitation of
this query is that it assumes you are dealing with one field that is supposed
to be unique.
Macro
Names
Use the %Let function to identify your table name and your field name so
you don't have to type the table and field names multiple times
TableName is the
table you think has duplicate records
FieldName is the
field that is supposed to be unique
Usage
I keep this code in a macro abbreviation (see previous blog) under the shortcut text “dups”. This way it is always very easy to do a quick
duplicate check. Use this code to
identify the records where duplicate occur.
Usually once you see the value of all of the fields that are part of a
duplicate is becomes obvious why the duplicates occur. I like to use this code to confirm there are
no duplicates in my tables. Run the code
and check the log screen to ensure the number of records for the dups is 0. It takes less than a minute and it is a good
check.
I would love to hear how you check for duplicates
in your tables.
No comments:
Post a Comment