How to use BUFFER()?

I have a browse procedure that can contain 130,000 records if only the minimum filter is set (customerID and date).

Right now I am playing with the BUFFER() statement as
BUFFER(500)

The loading takes a while but the scrolling is fast. I wonder if there are better settings for the BUFFER statement in my situation. Suggestions?

Which database driver are you using?

Good point, forgot to mention: MS SQL.

With the MSSQL driver, BUFFER values above something like 100 only cause problems. At least in my experience.

In a browse the ABC classes apply a buffer to the Clarion VIEW to make it quicker to scroll. I'm pretty sure there's a method or property to alter it.
The BUFFER statement takes about 5 arguments for how much to store forward and behind.
It can increase performance a lot.
But the backend still has to process the query and create the result set. That may be where your time is being taken up.
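
To make that concrete, a rough sketch of the call (names are mine, and the exact prototype - the VIEW label comes first, and there's a trailing timeout argument too - is worth checking against the Language Reference):

  ! MyView is assumed to be a VIEW you have opened over your table.
  OPEN(MyView)
  BUFFER(MyView, 30, 2, 2)  ! ~30 rows per page, 2 pages behind, 2 ahead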

There are two things in play here.
A) the MSSQL driver uses a cursor on the backend. So even though your user will only see a few pages, the entire result set is created and kept (on the server)

B) The BUFFER statement serves to determine how much of the result set is fetched each time. Generally people don't scroll down 130,000 records; they might scroll a bit, but they locate first. So a buffer of a "few pages" is typically plenty.

The first page takes time to display as the server constructs the first result set. The server also holds that result set for the life of the browse, consuming resources.

One of the goals of the new MSSQL driver is a strict "no server-side cursors" approach. This means no delay opening the browse - the first page is instant. There are also no resources consumed on the server side past the actual fetch.

I simplified the BUFFER call somewhat - you can now just set the number of rows to get. Or you can use properties like PROP:FetchSize to make it clearer.

For browses a small number, perhaps 2 or 3 pages' worth, is ideal. The default is 30 records. For reports, a bigger number can improve performance - perhaps a few hundred at a time. Obviously the buffer uses RAM, so that's the balancing factor.
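
For instance (a sketch only - the file name is invented, and I'm going from the property named above rather than a documented example):

  ! Browse: a couple of pages' worth at a time.
  Orders{PROP:FetchSize} = 60
  ! Report over the same table: bigger chunks, at the cost of RAM.
  Orders{PROP:FetchSize} = 500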


I know. But I am struggling with what the setting would be with those arguments. At first my understanding was that a page is the number of records visible in the listbox of the browse. After reading more closely I gather that it is the number of records fetched at one time. But now I don't see what the use of the behind and ahead arguments is.

For example, what is the difference between:
BUFFER(150) versus BUFFER(30,2,2)?
I would expect both to fetch the same number of records. So what am I missing?

Of course that is true. But I noticed that there is a difference in duration when executing the same SQL statement in MS Management Studio versus the browse in my application. That is why I wonder if the BUFFER statement can do something to speed up the browse a bit.

OK, I understand.
What I see is that resizing the window causes the browse to reload, which consumes time. If the entire result set is kept on the server, it might be worth not reloading (executing the SQL again?) but just fetching a series of records. Is there a way to do that?

These parameters are useful for "files" (ISAM); at least in those cases you can work with them safely.

As for SQL, at least in MSSQL, it doesn’t work, and when it seems to work, it stops working in production. The best advice is to ignore buffering, at least in SQL queries.

I’ll post the result of a little test I did a while ago (which is in the databasedrivers newsgroup).

My test of buffer size, which involved reading a file from my server
(which is on a WAN, so slow back and forth: 230 ms compared to 20 in the
office), about 4400 rows, read into a queue, yields a table like this:

Buffer  Time (1/100 sec)
1 19948
10 8596
20 7103
30 6707
40 6391
50 6332
60 6144
70 6169
80 6040
90 6053
100 5970
200 5861
300 5770
400 5744
500 5751
600 5806
700 5768
800 5741
900 5700
1000 5747
1200 5765
1400 6225
1600 6159
1800 6022
2000 5985

So, rapid improvements (from about 200 seconds down to about 60 seconds) for
buffers up to 100 rows. Only minor improvements after that; the lowest point
was at 900 rows, and it seems to increase slightly after that.

I would say that you have two main choices:
a) you load the whole file into your queue (using an appropriate buffer size, like 200), and then moving around in that data set is a matter of using Clarion queue commands, or
b) you get a smallish data set from the server (using LIMIT) and if you need another set, you send another query to get another smallish data set.

With a) your effort is up-front and everything else is quick. You need memory on the client to store the data in the queue, and you are a bit on your own, because the traditional Clarion way of working is closer to b).

With b) you are potentially sending a bunch of different queries to the server, but each query should be relatively quick as long as you have the indexes to support it. The trick is that your queries should result in small data sets, either because your real-world filters produce a small data set or because you use limits. You will still benefit from using a reasonable buffer size (not 1, which is the default).
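
As a rough illustration of b) with the traditional driver (all names are invented; TOP is the MSSQL spelling of a limit, and with the old driver the SELECT column order has to match the file declaration):

  ! Fetch one smallish chunk; ask again when the user wants more.
  Orders{PROP:SQL} = 'SELECT TOP 50 CustomerID, OrderDate, OrderAmt ' & |
                     'FROM Orders WHERE CustomerID = ' & CustID & |
                     ' ORDER BY OrderDate'
  LOOP
    NEXT(Orders)              ! each NEXT fetches a row of the result set
    IF ERRORCODE() THEN BREAK.
    DispQ :=: Ord:Record      ! deep-assign matching fields into the queue
    ADD(DispQ)
  END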

What is a sort of worst of both worlds is the out-of-the-box Clarion method, which sends queries that result in large result sets being put together on the server (send me everything in this order, starting from this row), and sends new queries all the time (though with bind variables) each time you need a new page of data.

Bruce's drivers are basically set up to optimize the b) method.

It really depends what you are doing. I do a lot of non-Clarion SQL work, and the way most of that is done (as in MSSQL Management Studio) is that your initial query results in a data set limited to the first 50 or 100 rows that match, basically so that you can check you are getting what you want; then you can ask for the full data set, which you might want to export somewhere (a flat file, an Excel spreadsheet, or whatever).

Creating a cursor takes time. I don't think SSMS creates a cursor, thus it can start displaying records immediately.

This approach is also taken by the MSSQL2 driver. Because there's no cursor there's no delay, so the first page appears instantly. (This is clearly visible between the SQLITE and SQLITE2 examples in the driver kit.)

BUFFER cannot help speed up the cursor creation because creating a cursor is a server function. It can help in moving the data to the client, but that’s not what is causing the delay you are seeing.

In other words, the solution is not BUFFER. IMO the solution is using the new driver.

Actually just setting the browse to "File Loaded" instead of "Page Loaded" is all you need to do. But that initial load time can be substantial, and of course for large tables (result sets) it is not practical.

Just FYI - the default in the new driver is 30, not 1, but that's under your control.

Hi Bruce,

100% agreed that for a straight "show every record" browse, File Loaded does it for you. Where I'm far less clear is if you use something like QBE, where you want only the items that belong to a certain country, for example.

My expectation would be that if you have a country dropdown set up as a reset field, and you select a country, the BrowseManager will create a new query for you with your country limit added to the WHERE clause, and then run that new query to get all the items in Country X. It's still file-loaded in that you have now loaded all the Country X records, but you did that by discarding your original full-file data (your queue is emptied) and running a new query. So if you want Country Y you have to run a new query again.

But with a file-loaded approach, what you might prefer to do instead of going back to the database is to apply the filter criteria to a local "master" queue and run the browse off your filtered queue. You start by creating a master queue that is your local copy of all the data you might be interested in, and you have a second queue, which is what your listbox actually displays, and which can include additional filter criteria. And if I'm right about how things are currently handled for your regular Clarion file-loaded queue, then sorting and selecting on the master queue, instead of going back to the database for the restricted data set, is something that you as the programmer would have to implement.
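
Something along these lines, perhaps (everything here - names, fields - is invented for illustration):

  ! MasterQ holds the full local copy; DisplayQ is what the listbox shows.
  FREE(DisplayQ)
  LOOP i = 1 TO RECORDS(MasterQ)
    GET(MasterQ, i)
    IF MasterQ.Country = SelectedCountry   ! the QBE-style criterion
      DisplayQ :=: MasterQ                 ! deep-assign matching fields
      ADD(DisplayQ)
    END
  END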

Jon


You are correct in that the total buffer will be the same.

However, if you think of the 3-parameter version as being 3 buffers, it will help.
Think of parameter 1 (p1) as being the visible bit of your browse.
Then p2 is the bit above what you see now; when you press Page Up, the p2 buffer moves to p1 and, if needed, some more rows are fetched into p2.
Similarly, pressing Page Down will move rows from p3 to p1 and fetch more if necessary.
The result is that it all moves smoothly and quickly.

Just having the one buffer will mean a fetch of that number of rows when paging up or down, and a possible pause while it does so.
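
So, back to the earlier BUFFER(150) versus BUFFER(30,2,2) question - if I have this right, the totals match but the shape differs:

  ! BUFFER(150)    -> one buffer of 150 rows; paging refills it in one
  !                   big fetch, with a possible pause.
  ! BUFFER(30,2,2) -> 30 visible + 2 pages behind + 2 pages ahead
  !                   = 30 + 60 + 60 = 150 rows, refilled a page at a
  !                   time, so scrolling stays smooth.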

I've had great success with reports using forward buffers and no back buffers. Speeds things up a lot.

None of this helps build the query in the first place, as Bruce, Jon, et cetera explained.

Yes, you would have to code this manually.
I've personally not gone down this road because I think (generally speaking) it would lead to a lot of other problems, although performance would be fast after the initial load. I'm thinking specifically of memory consumption, long load times for large tables, and of course deciding when the underlying queue needs to be refreshed.

But for situations where database access is particularly slow, and the result set a reasonable size, I think this could be quite an efficient approach.

I don't think there's any way to tell the old driver to behave like that. Although, thinking about it, it may not be the driver that controls that so much as the template. I haven't dug into it far enough to find out. So perhaps it could be optimized by changing the template (but I'd be careful myself in that direction - the flow through the browse template and class is not trivial, and small changes can have large effects).

On resize, the new driver does indeed fetch from the local result set (not from the backend), so that is a lot faster with the new driver. As mentioned elsewhere it limits the result to 30 rows (which are then cached on the client side), so I'm assuming the resized window contains <= 30 rows. If there are more, then naturally it just fetches the next page.

The paging approach shows up in other places too. For example, when incremental-locating, the old driver builds a complete new result set with each keypress. The new driver does the same, but because it's cursor-free and limited to a single page, the overhead when locating (especially incremental locating) is much lower.

Bruce, have you considered adding support for WITH instructions to your controller?

If you are asking whether the new drivers allow you to use CTEs (Common Table Expressions), then the answer to that would be yes. The current driver would require PROP:SQLRowSet to bind the file buffer to hold the results even though the query doesn’t look like a SELECT.

The new driver would just have:

SQL(yourfilebuffer, 'SQL string with param placeholders', param1, param2, …)

Your SQL string can start with CTEs. Also (from the help):

"In object based drivers no such care [the fields in the SELECT must be ordered based on the field order in the file definition] must be taken. The fields returned can be in any order, whether working on a FILE or a VIEW. The results are matched up to the appropriate fields by name. Indeed any result set can be retrieved, and processed, even if it does not access the file at all."

So your SQL string could be:

with basic as
  (select company_id,order_amt
    from orders
    where trunc(dt,'mm') = ?1
  )
select company_id,sum(order_amt) total
from basic
group by company_id

and so long as your buffer has field names company_id and total, it would all be good.

Jon said it better than I could. CTEs are already possible via SQL().

In addition I'd add:
They are used internally on occasion, when the command set calls for it.

Although unrelated to your question directly, you can also use commands like UNION, INTERSECT and EXCEPT to build complex queries.

Also, the data set returned doesn't need to match any actual file fields - it can be returned into simple variables if desired.