SQLite.org just dropped this - HCTree

hctree: doc/hctree/index.html (sqlite.org)

SQLite is sometimes used as the core of a client/server database system. While it works reliably well in such cases, the database backend module that it uses to store b-tree structures in its database file was not designed with this case in mind and can be improved upon in several ways. The HC-tree (hctree) project is an attempt to develop a new database backend that improves upon regular SQLite as follows:

  • Improved concurrency: Stock SQLite is limited to a single concurrent writer.Using the begin-concurrent extension changes this so that multiple writers may run concurrently using optimistic page-level locking. This improves concurrency somewhat, but page-level locking can detect conflicts between logically independant transactions, and COMMIT operations must still be serialized.Hctree uses optimistic row-level locking and is designed to support dozens of concurrent writers running at full-speed. Test results obtained from the prototype show that this is possible.
  • Support for replication: Stock SQLite supports the sessions extension, which allows the contents of a committed transaction to be serialized for tranmission and application to a second database.Hctree builds this into the database backend, and adds support for application of such transactions to follower databases in leader-follower configurations. In this case, transactions received from a leader database can be applied more quickly and with greater concurrency than with which they were originally applied to the leader database, because no transaction validation is required.
  • Removal of database size limitations: Stock SQLite uses 32-bit page numbers. Using the default 4KiB page-size, this leads to a maximum database size of 2^44 bytes, or 16TiB.Hctree uses 48-bit page numbers, allowing 2^60 byte databases, or 1EiB. Roughly one million TiB.

An implicit goal is that hctree must be as fast or faster than stock SQLite for all single-threaded cases. There is no point in running dozens of concurrent writers if each of them is an order of magnitude slower than a single writer writing to a legacy database.

Hctree clients (those that use a version of SQLite compiled from this repository) may read hctree databases and stock SQLite databases.

Their forum post announcement can be seen here, along with their caveats.
SQLite Forum: Hctree is an experimental high-concurrency database backend for SQLite.

2 Likes

Thanks Richard,

Very interesting. I left a question there, regarding simoultaneous access to a database file stored in a shared filesystem and was replied it was not thought for that. A pity, if it could be used in that way it would be a good candidate as a replacement for ISAM files shared on a network without going to a full Client Server SQL database.

Though there is still a problem remaining with SQLite on Clarion classes and RTL regarding >= queries and not consuming the entire resultset to finalize or a way to reset the statement, releasing the locks. Discussed on SV Clarion11 newsgroups thread started 2023-01-05 by Jon and PTSS 43363 reported by him.

My understanding of SQL lite and this HCTree is its a db for a device.

I’ve seen reports that Apple use it to power their iCloud core data, so when thinking of redundancy models, sql lite would be a db to enable individual workstations to continue working in isolation, ie they have no network communication. I dont know enough about the replication that comes with it, but I have seen some devices work where they communicate to a networked database server but also run a smaller local db to carry on when the network goes down. This was in the 90’s when networking was not very good.

What I’m going to investigate if I have the time is to see how fast it replicates across an office lan. I’d be curious to see if this gives the traditional server based db on a lan a run for its money, and how it compares with multi branches with multi users connected over the internet, as an alternative to using cloud data centres.

Like I said, I dont know how good the replication system is in HCtree over the lan or the internet, but this might be a lower cost distributed db model instead of the traditional central db model.

Core Data | Apple Developer Documentation
Mirroring a Core Data Store with CloudKit | Apple Developer Documentation
Synchronizing a local store to the cloud | Apple Developer Documentation

Hello Federico…
I read your question (and the answer they wrote you) to SQLite forum and I instantly thought about FirebirdSQL. It comes from Interbase, it has an embedded mode which means that you can use it for standalone use (instead of using .tps and/or .dat files) and when more users need access to database, ODBC client has to be installed in those clients and a small change to connection string…
I’d love to hear thoughts, arguments for, arguments against, etc…

Theodore

Hi Theodore

I agree with you that it provides both use cases and by definition it is easy on do the transition from standalone to server.

Personally I 'm not using it but have tested it on the past. I also followed comments about it on SV Newsgroups.

I don’t have many arguments against, probably if comparing the embbeded version to SQLite, the last has a native file driver in Clarion, but I feel comfortable with ODBC, sometimes it is even preferable, for multi-engine selectable on run-time for example. And SQLite Android deployment is unbeatable. I haven’t done a characteristics comparison.

I think what motivated my post on that SQLite forum is just being alert if there is an embbeded version of a db that works fine on network environment, but it seems by definition that it is unlikely. I think that i could include in that category one embbeded db that uses a small server component instead of a full blown db engine, but that is vaguely described anyway. I think this was an example (I have not tested): https://docs.oracle.com/database/bdb181/html/programmer_reference/bdb_server.html
until they retired support for client-server and SQLite api:
https://download.oracle.com/otndocs/products/berkeleydb/html/changelog_18_1_40.html
quote
“…Changes between version 18.1.32 and version 18.1.40

2. The Berkeley DB client-server architecture is no longer supported. If you require the Berkeley DB client-server architecture you must use Berkeley DB 18.1.32 or earlier.
3. The SQL API is no longer supported. If you require SQL support you must use Berkeley DB 18.1.32 or earlier.”

Hello Federico
Thank you for your thoughts…
FirebirdSQL also says that embedded version can’t be used in multi-user environment. For that, ODBC connection has to be done from clients.
MSSQL also has (had?) Compact Edition made primarily for mobile devices but could also be installed on pc’s via sql server express installation. But could also be used in a single user environment.
It must be a kind of “a rule” or something everyone is following?

Theodore

Theodore,

The MSSQL Compact Edition was deprecated in 2013 and support ended 2021 SQL Server Compact - Wikipedia
Some links
Differences Between SQL Server Compact and SQL Server | Microsoft Learn
Everything SQL Server Compact: Comparison of SQL Server Compact, SQLite, SQL Server Express and LocalDB
This Word document “Choosing Between SQL Server 2005 Compact Edition and SQL Server 2005 Express Edition” describes both approaches and also describes previous offerings:
http://download.microsoft.com/download/A/4/7/A47B7B0E-976D-4F49-B15D-F02ADE638EBE/Compact_Express_Comparison.doc

CE doens’t allowed storing the file on network shares. Others allow storing but exclude concurrent access:

Quote:
“Berkeley DB works great with a SAN (and with any other filesystem type as far as we know), but if you attempt to access any filesystem from multiple machines, you are treating the filesystem as a shared, remote filesystem and this can cause problems for Berkeley DB. …”
EndQuote (more info follows about mutexes and cache)

Another explanation is on the response from Donal Fellows from Richards link.

The problem is that network filesystems have some subtly broken semantics, especially in relation to locking, and that these problems are essentially inherent to how networking functions. This is because of the sheer number of potential weird failure modes, including some idiot unplugging a router in the middle of a commit in such a way that neither side of the connection can spot the problem for a while, or another device deciding to flood the network with packets such that comms become essentially impossible until the offending system is forcefully disconnected.

So the clarion logout timeout can help with some network failures, but you cant rely on db’s all having the same abilities. MS SQL has this COMMIT TRANSACTION (Transact-SQL) - SQL Server | Microsoft Learn
SQL Lite also has a commit.
Transaction (sqlite.org)

Commits should be handle.

On the point of a device flooding the network, an unmanaged switch should handle this automatically, but there will be noticeable disruption, it depends on what the factory settings are from a manufacturer. A managed switch if set up properly can prioritise the network traffic for db’s over the other devices. Its no different to traffic management seen in firewalls and home routers where gaming network traffic can have highest priority to avoid lag/latency in multiplayer online games. Same goes with teleconference and voip calls having high priority.

Sure in windows there is oplocks which breaks ISAM file’s but oplocks in windows was built in to allow office documents to be shared, once thats is switched off in the registry, its not really a problem for ISAM files but is for office documents.

Typical OS handling of networked filesystems provides no way to properly report these failures, or indeed any way to even do non-blocking access to files, so you instead get some very bad behaviours. (I remember losing a week of work back in the 1990s to a severely misbehaving NFS server that spent about 95% of the time in a crashed state with every process on my workstation — including the Xserver — blocked waiting to page in from the downed service.)

It looks like SQLlite3.dll is probably going to be like ISAM files from what I have seen, so if using a windows share, make sure oplocks is switched off.

If using on a linux share/NAS using something like Samba, then caching is off by default last time I looked.

There is also a few other options, sockets and ODBC.
SQLite CVSTrac

There is also this guide.
Appropriate Uses For SQLite

However there still is the logout and commit facilities in this driver so I dont know how you will fair trying this out. I also dont know what compilation options are in the version shipped by SV compared to the compile options described here.
How To Compile SQLite3.dll

There is also an ODBC driver for SQLite which would mean making sure the odbc ports are visible on the network and then it becomes more like a SQL server, or any other odbc connection, just make sure the network ports are not blocked by the firewall on the machine running the SQLite db. Same goes for the ports if using the sockets route, which is a bit more like a webserver then.

Network Database Access - ODBC API Reference | Microsoft Learn

I know a few people are using the ODBC driver to connect to postgres db’s, dont see why the same couldnt be done with SQLite, but just bear in mind, you might be pushing the technology in heavy use conditions.
SQLite Forum: ODBC 64bit drivers for windows download
SQLite ODBC Driver (ch-werner.de)

This link I think is showing how to connect the firefox web browser SQLite instance. I think MS Edge also uses SQLite as its browser db, ie bookmarks and other stuff.
Using ODBC to connect to SQLite (synametrics.com)

The other thing to consider is the replication route, ie each workstation having its own instance of the db, and then replicating with other sqlite db’s. This is something I havent tried and I dont know how robust it is or how fast it is, but Apple use this replication method to synch with their main db online.

Horses for courses as some would say. :grinning:

Edit.

One other thing thats worth checking is whether the SMB on the network is the latest for improved robustness and security.

I remember one of the ransomware attacks exploited a bug in SMBv1 EternalBlue - Wikipedia
ODBC uses SMB and CIFS so I would imagine it uses the SMBv2 and SMBv3. Sockets are direct tcp/ip just like a webserver and I dont know about the sqlite3.dll. It might only work on SMBv1 which is a no no because of the ransomware risk, but hopefully it will work with SMBv3, if not its ODBC or Sockets, which means nettalk territory (or even Replication if thats fast enough?).

I think it would be worth checking if TPS files also work on SMBv3 as well, just out of curiosity.
How to detect, enable and disable SMBv1, SMBv2, and SMBv3 in Windows | Microsoft Learn

Richard,
Coding specifically to overcome the problem was not part of the goal, it wouldn’t worth the effort, it would be easier to install a full db server. And it may not be feasible on code generated automatically.

Christian Werner SQLite ODBC uses SQLite3.dll locally. It is stated on your link also:
“Note: Using ODBC does not imply client/server or networking. The ODBC driver for SQLite accesses the SQLite database file directly just like SQLite does.”

The other links on your link (sockets and tcp/ip) seems broken or relating to closed project like SQLiteDBMS: “This project has been closed”.

There is Clarion IP Driver. But again the first sentence, it would need specific adaptations on some places.

Everything is like that though, and I dont know what your specific use case requirements are either.
SQLite seems quite good for being a device or single user app db , and how far its other functionality can be taken depends on other factors I’m not privy too in your case. At some point I’m going to be looking at its replication abilities to see if it can give the traditional client/server db like sql servers a run for its money in offline situations, like agents working in the field where there is no internet access, but other than that, thats its really.

On most projects I just use db servers. I also use SQLite. This is just to keep an eye open on products or trends if they could worth using on future projects or for other programmers here. This 2 threads on databasedrivers SV newsgroup “SQLite performance” from Arnor on 2021-11-04 and “Berkeley db replacement for SQLite” from Jon on 2022-01-02 and this from here Best portable database for clarion may explain the motivations. SQLite is a great database. At least in the first two threads mentioned, the problems realated to concurrent usage from a network avoided using it on those projects.

1 Like

Its replication facilities caught my eye.

Hello again…
I was trying to remember a product based on sqlite and finally did it:

From their site:

cubeSQL is a fully featured and high performance relational database management system built on top of the sqlite database engine. We developed the first commercial grade DBMS based on sqlite back in 2005 and over the years we continued to improve our server to better suit all our customer’s needs. cubeSQL is the final result of all our efforts.

It is the ideal database server for both developers who want to convert a single user database solution to a multi-user project and for companies looking for an affordable, easy to use and easy to maintain database management system.

cubeSQL is incredibly fast, has a small footprint, is highly reliable and runs on Windows, Mac and Linux (32 bit and 64 bit for all platforms). cubeSQL can be accessed with PHP, JSON, Xojo and with the native C SDK. cubeSQL is written in low level ANSI C so it can be easily ported to any operating systems.

They used to (still do it) give a two seats key for developers, just to have the opportunity to try their product.

  • Up to 2 concurrent connections: FREE

I found out about FirebirdSQL and never actually did anything with that product.

Maybe worth a try. If someone does it, it’d be nice to report back here his/her opinion and results.

HTH
Theodore

1 Like

Their pricing is cheaper than some other sql db’s. $599 for unlimited connections.

Yes… They also have “Redistributable License” for developers with very large base:

Pay one price to deploy an unlimited number of servers and server updates during a 12-month period. The Redistributable license is an excellent choice for the developer who already has a large installed base that is migrating from another database server or for the developers who want to lower the price per server for their custom solutions.

Shame they dont do a real world stress testing licence, but I dont know of any company that does a real world stress testing licence. I dont hand over money as easily now. :grinning: