Tuesday, March 27, 2012

Cursor or not?

Here is the scenario , I need to pull data from several tables and there wil
l
also be a sub query in one of the joins that does summing of an amount
column, then I need to perform 3 separate Inserts into diff tables with the
data from this query.
I was going to use a cursor and then perform the Inserts as one Transaction
for each row in the set. Is there a better approach or is this example a goo
d
candidate for cursor usage?Sounds like you'll just need three INSERT statements. Based on what
you've said I don't think you should use a cursor. To give you a better
solution it would help if you could post DDL and sample data.
David Portas
SQL Server MVP
--|||Info I forgot to add. I need the inserts to perform as a single transaction
with no client application involved, this proc will be run as a scheduled SQ
L
Job.
After the the three inserts are complete then an update is done to flag
column in a source table indicating the record has been processed.|||It sounds like you can accomplish this inserting the results of a group by
query: For example:
insert into MyTable select ... group by ...
"Chris Lane" <ChrisLane@.discussions.microsoft.com> wrote in message
news:996D0192-162F-40D1-814C-FEC2410EF5AD@.microsoft.com...
> Here is the scenario , I need to pull data from several tables and there
> will
> also be a sub query in one of the joins that does summing of an amount
> column, then I need to perform 3 separate Inserts into diff tables with
> the
> data from this query.
> I was going to use a cursor and then perform the Inserts as one
> Transaction
> for each row in the set. Is there a better approach or is this example a
> good
> candidate for cursor usage?
>|||also the update is part of the same transaction.|||So it sounds like you'll need three INSERT statements and an UPDATE...
David Portas
SQL Server MVP
--|||Yeah I think, I was wrong to use a cursor, I would be better of using a loca
l
temp table with the aggregated values I need then perform the 3 Inserts and
the Update as a single logical unit in a transaction.
The reason for the temp table is I need to ensure that the 3 inserts insert
an equal number of related inserts, if you get what I mean.
Thanks|||>> After the three inserts are complete then an update is done to flag column in
a source table indicating the record [sic] has been processed. <<
Rows are not records, and we don't use flags in an RDBMS. Flags are
for older file systems where we did record-at-a-time processing. You
will probably use temp tables the way we used scratch files in the
1970's. Getting rid of the cursor does not get rid of the sequential
processing mindset or design.|||What would you suggest as an alternative, give me an example of your own.
"--CELKO--" wrote:

> Rows are not records, and we don't use flags in an RDBMS. Flags are
> for older file systems where we did record-at-a-time processing. You
> will probably use temp tables the way we used scratch files in the
> 1970's. Getting rid of the cursor does not get rid of the sequential
> processing mindset or design.
>|||Cursors are almost never necessary, and even when you must use one, you
should never perform DML operations against normal tables within the fetch
loop. You should instead cache the values in a temporary table or table
variable and commit the changes using set-based operations. Flushing the
changes using a set-based operation causes triggers to fire only once, less
information is written to the transaction log, and SQL Server can optimize
the update of any indexes. I think that this cache-and-flush pattern can
also be used to solve your problem, although I don't think a cursor is
indicated in this case.
What you can do is cache the results of the subquery in a table variable and
then issue three separate set-based insert statements to flush the results.
There's something you should be aware of, however. In a concurrent
environment, it is possible for the data that was used to calculate the
subquery results to change before the final insert is executed, which,
depending on the data model, may introduce incorrect information into the
database. Note that this problem exists regardless of whether you cache the
information or not. At least by cacheing the information, you can be sure
that the data inserted into the three tables will reflect the same subquery
results.
There are two ways to deal with this problem. You can either prevent
changes until the transaction completes, or you can detect changes and if
necessary, rollback and restart. The first solution is to use pessimistic
concurrency--that is, the calculation of the subquery results occurs within
the transaction with a serializable isolation level (HOLDLOCK)--which
prevents changes to any of the data used to calculate the results until the
transaction is committed. The other solution uses optimistic concurrency.
The calculation of the subquery results occurs before the transaction
starts, and then within the transaction the source data is checked for
changes and locked for the duration of the transaction which is rolled back
if any changes are detected. Optimistic concurrency works best if you have
a rowversion (timestamp) column on each table, because you can save the
maximum rowversions from each source table for each row along with the
subquery results while you're calculating those results, and then after
initiating the transaction, lock the source rows and verify that the maximum
rowversions from each source table haven't changed before issuing the DML
statements.
You should ignore Joe Celko's comments. It appears that he finds so much
delight in bashing posters for using terms like "record" or "flag" or, God
forbid, using IDENTITY, that he doesn't take the time to understand what is
being asked, and thus issues poor, irrelevant and often incorrect advice.
As indicated above, there are very good reasons to cache results during a
long-running transaction (as used to be done in a scratch file).
"Chris Lane" <ChrisLane@.discussions.microsoft.com> wrote in message
news:996D0192-162F-40D1-814C-FEC2410EF5AD@.microsoft.com...
> Here is the scenario , I need to pull data from several tables and there
will
> also be a sub query in one of the joins that does summing of an amount
> column, then I need to perform 3 separate Inserts into diff tables with
the
> data from this query.
> I was going to use a cursor and then perform the Inserts as one
Transaction
> for each row in the set. Is there a better approach or is this example a
good
> candidate for cursor usage?
>

No comments:

Post a Comment