ABC .Insert() vs Add() Speed using ODBC Driver with MySQL

Daniel1 · September 19, 2024, 7:51pm

Hello
just found something interesting, it is about insert-ing and add-ing records into MySQL table (using ODBC Clarion driver and ABC templates).
This code:

  loop L = 1 to 50000
    yield()
    clear(m:record)
    m:name = 'some name '&L
    m:userId = 1
    clear(sql:record)
    sql_file{prop:sql} = 'select max(id)+1 from materials where userId = 1'
    next(sql_file)
    m:id = sql:f1
    add(materials)
  end

works almost 10 times faster as this one

  loop L = 1 to 50000
    yield()
    clear(m:record)
    m:name = 'some name '&L
    m:userId = 1
    access:materials.insert()
  end

why is that?

mriffey · September 19, 2024, 8:30pm

What’s materials look like in the DCT?

mriffey · September 19, 2024, 8:33pm

the DDL for that table also wouldn’t hurt.

seanh · September 19, 2024, 11:39pm

At a guess, because you’re missing info, I’d say that the table is doing an auto inc, and the .insert adds a blank record, checks it’s ok then updates. Which is all missing from your first example.

Essentially the Insert is more robust, the select and add could have someone sneak in between the 2 statements.

JonW · September 19, 2024, 11:51pm

If you look at the InsertServer method in the FielManager class I think the answer is fairly obvious. When you use Insert, then after every record, assuming you are not using stream/flush at all – you don’t have them in your code – Clarion will wait to see if the server returns an error, and will act on it if it does. Waiting for the server to respond “Yes I added that”, or “I didn’t add that, and here is the error”, takes time.

In your Add loop you are not checking for any errors, and if you encounter them you are just going to keep on trying to add more records.

If you are worried about time, firstly stream/flush would speed things up. Secondly, if you actually had a real-world need to do something like that, then this code (this is the Oracle version, but MySQL you could do something similar) will probably work a hundred times or more faster than your ADD loop (you could write a prop:SQL):

with maxid as
   (select max(id) id 
   from materials
   where UserID = 1)
, idrange as
  (select 'some name' || to_char(maxid + level) addname
  from dual
  connect by level <= 50000
  )
insert into materials(userID,name)
select 1,addname
from idrange

This is also getting rid of the need to get max(ID) after every insert. Downside would be that if someone else tried to enter a name in the range that you have mapped out your whole insert would fail.

Daniel1 · September 20, 2024, 6:21am

@mriffey here is table declared in the DCT (not sure what is DDL?)

materials           FILE,DRIVER('ODBC'),OWNER(glo:owner),NAME('materials'),PRE(m),BINDABLE,THREAD !                     
idKey                   KEY(m:id),NOCASE,OPT,PRIMARY    	 !                     
nameKey                 KEY(m:name),DUP,NOCASE,OPT     		 !                     
userIdKey               KEY(m:userId),DUP,NOCASE,OPT    	 !                     
ordersKey               KEY(m:p_id,m:userid,m:orders),DUP,NOCASE,OPT !                     
pUpKey             	    KEY(m:p_id,m:up_id),DUP,NOCASE,OPT !                     
Record                   RECORD,PRE()
id                          LONG                           !                     
name                        CSTRING(255)                   !                     
jm                          CSTRING(50)                    !                     
price                       DECIMAL(14,4)                  !                     
userId                      LONG                           !                     
orders                      LONG                           !                     
mycode                      CSTRING(100)                   !                     
p_id                   	    LONG                           !                     
up_id                       LONG                           !                     
                         END
                     END

Daniel1 · September 20, 2024, 6:27am

@seanh I assumed too that the insert was a better and safer option, but it is 10 times slower. It’s not about a single table - the whole application is designed to work as an insert (I have a lot of hand code).

Daniel1 · September 20, 2024, 6:39am

@JonW I just checked Stream/Flush and Logout/Commit - no difference. The help says this: "On some filesystems, STREAM and FLUSH are simply ignored when a file is opened for shared access,'" and my files are opened for shared access. Additionally, logout/commit (and access:materials.insert()`, too), create a “deadlock” when running code in the same thread.

RchdR · September 20, 2024, 10:08am

ADD is just a function, its got no errorcode handling, no client side auto numbering and client side shadow buffer checking.

Insert is a class method which is carrying out errorcode checking, client side auto numbering (where applicable - (dct) driver switch can off load this back to the SQL Server/ODBC source) and client side shadow buffer management.

The extra code in the class simply takes more time to process.

Watch the ABC class in action by stepping through the debugger. Set your breakpoint on the Access:Materials.Insert() line and step through the code, you’ll see what I mean then about its got more code to process.

Max · September 20, 2024, 10:32am

Hi,

Change your table to use server side autoinc, is there any reason to use application driven autoinc ?
Use prop:sql with insert statement to add a new record
Try to use builk mysql insert statement, that might improve perfomance

Daniel1 · September 20, 2024, 10:49am

@RchdR Thanks for the details, I got your point but I don’t need all bells and whistle’s, just simple add to database.
Adding the basic error check

    if errorcode()
      message(fileerrorcode()&' : '&fileerror())
    .

does not slow down ADD-ing at all (adding 6000 records is done by 1 or 2 second using MySQL table and ODBC driver). Using Insert(), it take 10-12 seconds.

RchdR · September 20, 2024, 10:52am

If you need the fastest possible method to import records, in MS-SQL so if the equivalent exists in MySQL, doing a batch import is the fastest way to get millions/billions of records into the db.

The indexing et al will take place as a background process, so calling the batch import via prop:sql and importing all the records from a txt file is also an option besides using clarion code.

Daniel1 · September 20, 2024, 11:05am

@Max thanks for idea but I already have +30000 lines of code and I have no idea where I use just added ID (on the client side) so it can be dangerous to to this now.
Same for your other suggestions (adding by SQL).
Thanks anyway.

Daniel1 · September 20, 2024, 11:09am

@RchdR there will be adding max 5000-10000 records at time.
Now I found another problem - when adding 5000 records from one command and then from another, it give me “Attempts to automatically number this record have failed” error.
Tables are declared using THREAD attribute, so I have no idea why it display that error?

seanh · September 20, 2024, 11:44am

That might be a transaction isolation problem. Adding many records to SQL by many options fast has problems making sure that each doesn’t stomp on the other. You might be a victim of that. It’s also one of the reasons that the Insert method takes care in doing its thing.
Perhaps giving up a little time to ensure everything is ok might be worth it.

JonW · September 20, 2024, 12:58pm

I was surprised to find that you were right about logout and stream/flush making no difference. That’s with ODBC access against Oracle. My results for adding 5000 over a WAN were an abysmal 500+ seconds whether or not I included an if error() and stream/flush or a logout/commit wrapped around the full 5000. That’s ADDs, not filemanager inserts.

The PROP:SQL version, which I got a little wrong, plus an adjustment to match the other tests, ran in 0.32 seconds.

   insertspeed{PROP:SQL} = 'insert into insertspeed(ID,descrip) ' & |

‘with idrange as (select level id,’‘test’’ || to_char(level) descrip from dual connect by level <= 5000 ) ’ & |
‘select id,descrip from idrange’

Basic rule is to reduce the number of roundtrips to the server. Wrapping logout/commit around the loop reduces the burden on the server, but clearly that was insignificant compared to the time for sending data over and checking that it was added OK.

JonW · September 20, 2024, 1:04pm

DDL is “Data Definition Language”, i.e. the way things are “defined” on the server, i.e. your MySQL table declaration.

Matching acronym is “DML” – data manipulation language – which is your selects/inserts/updates/deletes, etc.

CarlBarnes · September 20, 2024, 1:52pm

That is likely to be problem code. The FileErrorCode() is only set for a few errors, IIRC 90 (driver error) and 47 (invalid file def). So any other error codes would just show a Colon, e.g. Duplicate Key or File Not Open.

FileErrorCode() returns a STRING not a Number so you would check for it = Blank.

Here’s the way I would code:

    if errorcode() then
      message('Add "record id" failed '& Err4Msg())
    end

Using this function to easily get all the RTL Error info:

Err4Msg  PROCEDURE(Byte NoCRLF=0)!,STRING 
  !Example: IF ERRORCODE() THEN STOP('Failed ADD(xxx)' & Err4Msg()).
  !Note: Return starts '<13,10><13,10>Error Code:' so no need to put in the Message()
  CODE
  IF ~ERRORCODE() THEN RETURN ''.   
  IF ~NoCRLF THEN 
     RETURN '<13,10><13,10>Error Code: ' & ERRORCODE()&' '&ERROR() & |
             CHOOSE(~FILEERRORCODE(),'','<13,10>Driver Error: ' & FILEERRORCODE()&' '&FILEERROR() ) & | 
             CHOOSE(~ERRORFILE(),'','<13,10>File Name: ' & ERRORFILE() )
  END 
  !NoCRLF<>0 is 1 line format for use by logging
  RETURN ERRORCODE()&' '&ERROR() & |     ! {148}
         CHOOSE(~FILEERRORCODE(),'',' [Driver ' & FILEERRORCODE()&' '&FILEERROR() &']' ) & | 
         CHOOSE(~ERRORFILE(),'',' {{' & ERRORFILE() & '}' )

Daniel1 · September 20, 2024, 2:04pm

@JonW I understand, SQL is best option for adding lot of records but I try to avoid that because of SQL injection stuff…
I’m happy with the ADD-ing of records (I do not work over WAN, everything is on server and 1-2 sec for 6000 records is pretty good result for me).
I still have the problem by add records (or ABC insert) in the same time. Attached is sample app and dct with 4 buttons. Just compile and press “Run MyProc” and second later, click on Run MyProc 2".
You will see error by inserting a record. I try with CriticalSection (it is commented inside a code) and it works OK (you can click as many times you want on each button) but the problem is when you run 2 instances of exe file - in that case error raises again
Any idea how to add records using other ways or is the ServerAutoInc only option for ODBC drivers?
test.zip (563.4 KB)

Daniel1 · September 20, 2024, 2:08pm

Thank you @CarlBarnes for details regarding error code. Will implement this as soon I find a way of how to add records into same table (in the same time).
Currently, I know what happens without detailed error description