Updating millions of rows

We wrote a custom procedure which opens a simple cursor, reads all 58 million rows from the SOURCE table and, in a loop, processes the rows and inserts the records into a TARGET table. We declared PL/SQL tables indexed by BINARY_INTEGER to store the data in memory. The logic works fine, but it took 20 hours to complete the load and then failed with:

    ... "BULKLOAD", line 66
    ORA-06512: at line 1

We got the same error even with 1 million rows. The insert at the end of the procedure looks like this:

    FORALL j IN 1 .. src_cpd_dt_array.count
      INSERT INTO ima_dly_acct (
        CPD_DT, ACQR_CTRY_CD, ACQR_TIER_CD, ACQR_PCR_CTRY_CD, ACQR_PCR_TIER_CD,
        ISSR_BIN, OWNR_BUS_ID, USER_BUS_ID, MRCH_LOCN_REF_ID, NTWRK_ID,
        STIP_ADVC_CD, AUTHN_RESP_CD, AUTHN_ACTVY_CD, RESP_TM_ID, PROD_REF_ID,
        MRCH_REF_ID, ISSR_PCR, ISSR_CTRY_CD, ACCT_NUM, TRAN_CNT, USD_TRAN_AMT)
      VALUES (
        src_cpd_dt_array(j), src_acqr_ctry_cd_array(j), null,
        src_acqr_pcr_ctry_cd_array(j), null, src_issr_bin_array(j), null, null,
        src_mrch_locn_ref_id_array(j), src_ntwrk_id_array(j),
        src_stip_advc_cd_array(j), src_authn_resp_cd_array(j),
        src_authn_actvy_cd_array(j), src_resp_tm_id_array(j), null,
        src_mrch_ref_id_array(j), src_issr_pcr_array(j), src_issr_ctry_cd_array(j),
        src_acct_num_array(j), src_tran_cnt_array(j), src_usd_tran_amt_array(j));
    COMMIT;
    END bulkload;
    /
    SHOW ERRORS

We have the following configuration:

    SGA: 8.2 GB
    PGA aggregate target: 3 GB (currently allocated 439,444 KB = 439 MB; maximum allocated 2,695,753 KB = 2.6 GB)
    Temp tablespace: 60.9 GB total, approximately 20 GB available

I think we do have more than enough memory to process the 1 million rows!!

--------------------------------------------

Good gosh -- you aren't serious, are you??? Your process got bigger than your OS would allow (you hit an OS limit, might be ulimit related or whatever). You don't want to GET ALL, process all, WRITE ALL and flood one thing or the other -- do a bit, process a bit, write a bit, start over. You want to "stream" data -- get some, process some, write some; get some, process some, write some.

Use the LIMIT clause: bulk collect say 100 to 1000 rows, process them, bulk insert them, then get the next 100/1000 rows. The LIMIT clause will simply do one fetch less if the last fetch has fewer than [limit] rows. Also, I utterly fail to see how you could use rownum to limit rows with bulk binds; it is not possible.

On the update side more generally: I'd make sure to have suitable indexes for the where clauses and joins (or a clustered index on the restriction in the WHERE clause), and drop any indexes not used by the query, so that the update statement comes up with a good, efficient query plan. I'd expect setting the database in single user mode might help. If you're updating the majority of rows in the table, there are no real 'acceleration' techniques available -- there's a fixed amount of work to do; processing that many rows, no matter how you slice it, is time-consuming.

Breaking the work into batches, with a COMMIT TRANSACTION at the end of each pass through the loop, may actually take longer to run, but you'll at least make forward progress that doesn't get undone after the update runs for 20 hours and dies for one reason or another. And the table remains more-or-less usable during the update.
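Here is a minimal sketch of that "fetch a bit, insert a bit" pattern with BULK COLLECT ... LIMIT, cut down to two of the 21 columns; source_table and its column names are placeholders for the real SOURCE table:

    DECLARE
      -- hypothetical source table standing in for the real SOURCE table
      CURSOR c IS SELECT cpd_dt, acct_num FROM source_table;

      TYPE dt_tab   IS TABLE OF source_table.cpd_dt%TYPE   INDEX BY BINARY_INTEGER;
      TYPE acct_tab IS TABLE OF source_table.acct_num%TYPE INDEX BY BINARY_INTEGER;
      l_dt    dt_tab;
      l_acct  acct_tab;
    BEGIN
      OPEN c;
      LOOP
        -- fetch a small, bounded batch instead of all 58 million rows at once
        FETCH c BULK COLLECT INTO l_dt, l_acct LIMIT 1000;

        FORALL j IN 1 .. l_dt.COUNT
          INSERT INTO ima_dly_acct (cpd_dt, acct_num)   -- only 2 of the 21 columns shown
          VALUES (l_dt(j), l_acct(j));

        EXIT WHEN c%NOTFOUND;   -- the last fetch may return fewer than LIMIT rows
      END LOOP;
      CLOSE c;
      COMMIT;                   -- or commit per batch, depending on your restart logic
    END;
    /

The arrays never hold more than 1000 rows at a time, so PGA stays flat no matter how big the source table is.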

FYI, on my site guys use the rownum clause to limit rows when using bulk binds.

One useful technique is to break it up into a bunch of smaller batches rather than one large batch: set rowcount -- pick a doable batch size (1000 rows, say). It does impose constraints on the UPDATE statement, though. I myself have in the past set a rowcount of about 1000 and written a SQL statement in a loop to spin through and update the rows, as sketched below. If you're talking about a record with fewer than 10 columns, it really shouldn't be that big a deal.
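A rough sketch of that rowcount loop, assuming a hypothetical big_table whose status_cd column is being changed (SET ROWCOUNT with DML is deprecated on newer SQL Server versions, where UPDATE TOP (n) plays the same role):

    -- cap each UPDATE at 1000 rows, then loop until no rows qualify
    SET ROWCOUNT 1000;

    DECLARE @done int;
    SET @done = 0;

    WHILE @done = 0
    BEGIN
        BEGIN TRANSACTION;

        UPDATE big_table
        SET    status_cd = 'X'
        WHERE  status_cd <> 'X';      -- predicate must exclude rows already updated

        IF @@ROWCOUNT = 0
            SET @done = 1;            -- nothing left to touch

        COMMIT TRANSACTION;
    END

    SET ROWCOUNT 0;                   -- remove the limit again

Because each batch commits on its own, a failure at hour 19 only rolls back the current 1000 rows, not the whole run.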

The UPDATE statement has to be phrased in such a way that each iteration picks up from where it left off. I would worry about filling the log if it is all done in one transaction. Maybe run the same statement in parallel, each copy over a different range, so as not to cause blocking (see the sketch below). The thing I would like to know is what in the heck takes up 250 million records? But I have one server that runs an update each night and updates about 10 million records, 150 columns per record...
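One way to phrase it so each pass resumes where the last one stopped -- and which also lets several workers cover non-overlapping ranges in parallel -- is to walk a key range; big_table, its id column, and the bounds here are all illustrative:

    DECLARE @from int, @hi int, @batch int;
    SET @from  = 1;                   -- this worker's slice of the key space
    SET @hi    = 10000000;
    SET @batch = 10000;

    WHILE @from <= @hi
    BEGIN
        -- each pass covers one key slice; under autocommit it is its own transaction
        UPDATE big_table
        SET    status_cd = 'X'
        WHERE  id >= @from AND id < @from + @batch;

        SET @from = @from + @batch;   -- next pass starts where this one ended
    END

Give each parallel worker its own @from/@hi window and the statements never touch the same rows.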

-- Tibor Karaszi, SQL Server MVP

Assuming that you are updating a large portion of the rows in bigtable, the most efficient join is a merge join.
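For example, with a hypothetical bigtable/staging pair joined on id, the plan can be spelled out explicitly (normally you would let the optimizer choose; a merge join pays off when both inputs are already sorted on id, e.g. by their clustered indexes):

    -- update bigtable from staging, forcing a merge join between the two
    UPDATE  b
    SET     b.amount = s.amount
    FROM    bigtable b
            INNER MERGE JOIN staging s
                ON b.id = s.id;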
