| 
     
How to remove duplicate records from a large table containing
    about 5 million  
records in a single run and with a lesser time. 
i tried it with following query but it takes 10 hours of time. 
  
delete from test1 where rowid not in (select min(rowid) from
    test1 group by rc_no); 
  
even after incraesing the rollback segs tablespace to 7gb 
we are not getting desired results and while using not in
    clause and cursor we generally  
come across this kind of problem 
  
thanks   
 
     
and we said... 
I'd
    generate the set of rowids to delete using analytics and then delete
    them..  like  
this: 
  
  
ops$tkyte@ORA9IR2> create
    table t as select * from cust; 
  
Table created. 
  
Elapsed: 00:00:03.64 
ops$tkyte@ORA9IR2> 
ops$tkyte@ORA9IR2> select
    count(*), count(distinct cust_seg_nbr) from t; 
  
  COUNT(*) COUNT(DISTINCTCUST_SEG_NBR) 
----------
    --------------------------- 
   1871652                      756667 
  
Elapsed: 00:00:05.30 
ops$tkyte@ORA9IR2> 
ops$tkyte@ORA9IR2> delete
    from t 
  2   
    where rowid in ( select rid 
  3                       from ( select rowid
    rid, 
  4                                    
    row_number() over 
  5                                      
    (partition by <column_name> order by rowid) rn 
  6                                from t 
  7                            ) 
  8                     where rn <> 1 ) 
  9 
    / 
  
1114985 rows deleted. 
  
Elapsed: 00:01:46.06 
ops$tkyte@ORA9IR2> 
ops$tkyte@ORA9IR2>  select count(*), count(distinct
    cust_seg_nbr) from t; 
  
  COUNT(*) COUNT(DISTINCTCUST_SEG_NBR) 
----------
    --------------------------- 
    756667                      756667 
  
Elapsed: 00:00:02.48 
  
  
  
As for the RBS -- it'll get as big as it needs to be in
    order to process the delete --  
every index will make it "larger" and take
    longer as well (index maintainence is  
expensive) 
  
if you are deleting "alot of the rows" you
    might be better off disabling indexes, doing  
the delete and rebuilding them. 
  
  
OR, creating a new table that just keeps the "right
    records" and dropping the old table: 
  
  
ops$tkyte@ORA9IR2> create table t as select * from
    cust; 
  
Table created. 
  
Elapsed: 00:00:02.41 
ops$tkyte@ORA9IR2> 
ops$tkyte@ORA9IR2> select count(*), count(distinct
    cust_seg_nbr) from t; 
  
  COUNT(*)
    COUNT(DISTINCTCUST_SEG_NBR) 
---------- --------------------------- 
   1871652                      756667 
  
Elapsed: 00:00:04.60 
ops$tkyte@ORA9IR2> 
ops$tkyte@ORA9IR2> create table t2 
  2  as 
  3  select cust_seg_nbr 
  4    from ( select t.*, row_number() over
    (partition by cust_seg_nbr order by rowid) rn 
  5             from t 
  6             ) 
  7   where rn = 1 
  8  / 
  
Table created. 
  
Elapsed: 00:00:10.93 
ops$tkyte@ORA9IR2> drop table t; 
  
Table dropped. 
  
Elapsed: 00:00:00.56 
ops$tkyte@ORA9IR2> rename t2 to t; 
  
Table renamed. 
  
Elapsed: 00:00:00.01 
ops$tkyte@ORA9IR2> 
ops$tkyte@ORA9IR2> select count(*), count(distinct
    cust_seg_nbr) from t; 
  
  COUNT(*)
    COUNT(DISTINCTCUST_SEG_NBR) 
---------- --------------------------- 
    756667                      756667 
  
Elapsed: 00:00:01.18 
ops$tkyte@ORA9IR2> 
============================================================================================== 
 | 
   
No comments:
Post a Comment