How to remove duplicate records from a large table containing
about 5 million
records in a single run and with a lesser time.
i tried it with following query but it takes 10 hours of time.
delete from test1 where rowid not in (select min(rowid) from
test1 group by rc_no);
even after incraesing the rollback segs tablespace to 7gb
we are not getting desired results and while using not in
clause and cursor we generally
come across this kind of problem
thanks
and we said...
I'd
generate the set of rowids to delete using analytics and then delete
them.. like
this:
ops$tkyte@ORA9IR2> create
table t as select * from cust;
Table created.
Elapsed: 00:00:03.64
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select
count(*), count(distinct cust_seg_nbr) from t;
COUNT(*) COUNT(DISTINCTCUST_SEG_NBR)
----------
---------------------------
1871652 756667
Elapsed: 00:00:05.30
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> delete
from t
2
where rowid in ( select rid
3 from ( select rowid
rid,
4
row_number() over
5
(partition by <column_name> order by rowid) rn
6 from t
7 )
8 where rn <> 1 )
9
/
1114985 rows deleted.
Elapsed: 00:01:46.06
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select count(*), count(distinct
cust_seg_nbr) from t;
COUNT(*) COUNT(DISTINCTCUST_SEG_NBR)
----------
---------------------------
756667 756667
Elapsed: 00:00:02.48
As for the RBS -- it'll get as big as it needs to be in
order to process the delete --
every index will make it "larger" and take
longer as well (index maintainence is
expensive)
if you are deleting "alot of the rows" you
might be better off disabling indexes, doing
the delete and rebuilding them.
OR, creating a new table that just keeps the "right
records" and dropping the old table:
ops$tkyte@ORA9IR2> create table t as select * from
cust;
Table created.
Elapsed: 00:00:02.41
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select count(*), count(distinct
cust_seg_nbr) from t;
COUNT(*)
COUNT(DISTINCTCUST_SEG_NBR)
---------- ---------------------------
1871652 756667
Elapsed: 00:00:04.60
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create table t2
2 as
3 select cust_seg_nbr
4 from ( select t.*, row_number() over
(partition by cust_seg_nbr order by rowid) rn
5 from t
6 )
7 where rn = 1
8 /
Table created.
Elapsed: 00:00:10.93
ops$tkyte@ORA9IR2> drop table t;
Table dropped.
Elapsed: 00:00:00.56
ops$tkyte@ORA9IR2> rename t2 to t;
Table renamed.
Elapsed: 00:00:00.01
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select count(*), count(distinct
cust_seg_nbr) from t;
COUNT(*)
COUNT(DISTINCTCUST_SEG_NBR)
---------- ---------------------------
756667 756667
Elapsed: 00:00:01.18
ops$tkyte@ORA9IR2>
==============================================================================================
|
No comments:
Post a Comment