Call the same endpoint 60000 times

Hi
I have a challenge. Once a month I have to call an endpoint 60000 times to retrieve customer data.
If I call the endpoint serially it takes forever. So my plan is to have many threads running at the same time.
I am using a NetTalk Web Client for the call.
Does anyone have a good idea how I can approach this task?

My own idea is to start 100 or more web clients, each of which picks a record from a queue and makes a single call. When the call completes, the record is deleted and the client moves on to the next record in the queue.
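
The queue idea above can be sketched in a language-neutral way. This is Python for illustration only, not Clarion or NetTalk code; `fetch_customer` is a hypothetical stand-in for the real web client call, and `run_pool` and `worker_count` are made-up names.

```python
import queue
import threading


def fetch_customer(customer_id):
    # Hypothetical stand-in for the real web client call.
    return {"id": customer_id}


def run_pool(customer_ids, worker_count=100, fetch=fetch_customer):
    # Load every record into a thread-safe queue up front.
    work = queue.Queue()
    for cid in customer_ids:
        work.put(cid)

    results = {}
    failed = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                # Taking a record removes it from the queue.
                cid = work.get_nowait()
            except queue.Empty:
                return  # nothing left, this worker is done
            try:
                data = fetch(cid)
                with lock:
                    results[cid] = data
            except Exception:
                # Remember failures so they can be retried later.
                with lock:
                    failed.append(cid)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, failed
```

Keeping a list of failed records means a timed-out call can be put back on the queue instead of being lost.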

Regards Niels

So, your “endpoint” is an API call to a webserver?

Do you have any control over it?

Yes a call to a webserver. No control.

Calling an API 60,000 times, in succession or in parallel, doesn’t seem like a good plan to me. Sorry, I’ve got no suggestions for how to handle this apart from contacting the provider and asking them for a better solution.

Is each call different? If not, can you cache the data locally instead of calling out?
I would agree that 60K calls in a short period could be trouble.

I only have the option to retrieve data from the api.
It’s actually more the Clarion method/design for making that many calls that I’m looking for.

If the data is different every time it’s hard.
However, I would create a multi-threaded design where you start n threads and pass each thread x customers to work with. Each thread would report back when it has finished those and ask for the next lot of customers to process.
This way you can scale up the number of threads and how many customers each processes to find an optimum.
Do not assume that a massive number of threads is more efficient than a small number. Threads carry overhead, and that can hurt you.
I’d start with 1 to get it correct, then try 4.
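
A rough sketch of that tuning loop, in Python for illustration only (not Clarion code). `fetch_customer` is a hypothetical stand-in that just sleeps briefly, and `thread_count` and `batch_size` play the roles of n and x.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_customer(customer_id):
    # Hypothetical stand-in for one API call.
    time.sleep(0.001)
    return customer_id


def chunks(items, size):
    # Split the customer list into lots of at most `size` records.
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_batch(batch):
    # One thread works through its lot, then takes the next one.
    return [fetch_customer(cid) for cid in batch]


def run(customer_ids, thread_count, batch_size):
    ids = list(customer_ids)
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        batches = list(pool.map(process_batch, chunks(ids, batch_size)))
    elapsed = time.perf_counter() - started
    done = [cid for batch in batches for cid in batch]
    return done, elapsed


if __name__ == "__main__":
    # Try a few settings and compare the timings.
    for n in (1, 4, 16):
        done, elapsed = run(range(400), thread_count=n, batch_size=20)
        print(n, "threads:", len(done), "records in", round(elapsed, 2), "s")
```

Timing a small sample at each setting shows where extra threads stop helping before committing to the full run.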

That’s actually what I’ve tried. 50 threads works fine and 100 works even better, but at 200 some calls start to time out. I’ve also tried changing the timeout, but it doesn’t really do anything for the speed.
Of course, it also requires that the server can deliver at the same pace as I want.
I’ll probably end up somewhere between 50-100 threads.
So time to test.

In the NetTalk examples\webclient folder is an example called Web Strain. That might give you some ideas.
Note that you can run the stress on multiple threads in the client, and can simultaneously start multiple clients running the same test(s) each on multiple threads.

There is a TCurlMultiClass in libcurl that enables multiple simultaneous transfers in the same thread.

Actually, you don’t really need multiple threads - well, no more than the number of cores in your CPU - and likely not more than 1. What you want are multiple NetTalk objects.

Because NetTalk is non-blocking (aka asynchronous) it’s possible for multiple objects to exist in the same procedure. Given the overhead (in CPU and more importantly RAM) with multiple threads, your best performance is likely to be having some small number of threads (like 16, if your CPU has 16 cores) and then a bunch of objects per thread.

Creating a bunch of objects is straightforward. You can either just declare them:

net1 NetWebClient
net2 NetWebClient

with their respective PageReceived and ErrorTrap methods, or with a bit of cunning you can make an array of them. NetMaps.Inc and NetMaps.Clw contain an example of doing it with an array. It’s a little bit more complicated (so takes a bit of figuring out) but ultimately scales up a lot easier.
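
The general idea of many non-blocking requests sharing one thread can be illustrated outside Clarion. The sketch below is Python with `asyncio`, not NetTalk code; `fetch_customer` is a hypothetical stand-in for a non-blocking request, and `max_in_flight` is a made-up name for the number of client objects active at once.

```python
import asyncio


async def fetch_customer(customer_id):
    # Hypothetical stand-in for a non-blocking web request.
    await asyncio.sleep(0.001)
    return customer_id


async def fetch_all(customer_ids, max_in_flight=100):
    # The semaphore caps how many requests are outstanding at once,
    # much like having a fixed number of client objects.
    gate = asyncio.Semaphore(max_in_flight)

    async def one(cid):
        async with gate:
            return await fetch_customer(cid)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(one(cid) for cid in customer_ids))


def run(customer_ids, max_in_flight=100):
    # Everything above runs on a single thread.
    return asyncio.run(fetch_all(list(customer_ids), max_in_flight))
```

Here the concurrency comes from waiting on many requests at once rather than from extra threads, so the per-thread overhead is avoided.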

Thanks Bruce

That makes a lot of sense. I’ll definitely try playing with more objects.

/Niels