Hi
I have a challenge. Once a month I have to call an endpoint 60000 times to retrieve customer data.
If I call the endpoint serially it takes forever. So my plan is to have many threads running at the same time.
I am using a NetTalk Web Client for the call.
Does anyone have a good idea how I can approach this task?
My own idea is to start 100 or more web clients, each of which picks a record from a queue and processes that single call. When the call completes, the record is deleted and the client moves on to the next record in the queue.
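The queue idea above can be sketched roughly as follows. This is Python rather than Clarion, purely to show the shape of the pattern. `fetch_customer` is a hypothetical stand-in for the real web client call, and the names `worker` and `run_pool` are made up for this illustration.

```python
import queue
import threading

def fetch_customer(record_id):
    # Hypothetical stand-in for the real endpoint call (assumption).
    return {"id": record_id, "status": "ok"}

def worker(work, results, lock):
    # Each worker keeps pulling records until the queue is empty.
    while True:
        try:
            record_id = work.get_nowait()  # taking the record removes it from the queue
        except queue.Empty:
            return
        data = fetch_customer(record_id)
        with lock:
            results[record_id] = data

def run_pool(record_ids, n_workers):
    work = queue.Queue()
    for record_id in record_ids:
        work.put(record_id)
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(work, results, lock))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because every worker takes its next record from the same queue, a slow call only delays that one worker and the rest keep going.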
Calling an API 60,000 times in succession, or in parallel, doesn’t seem like a good plan to me. Sorry, I’ve got no suggestions for how to handle this apart from contacting the provider and asking them for a better solution.
If the data is different every time it’s hard.
However, I would create a multi-threaded design where you start n threads and pass each thread x customers to work with. Each thread would report back when it has finished those and ask for the next lot of customers to process.
This way you can scale up threads and how many they process to find an optimum.
Do not assume a massive number of threads is more efficient than a small number. Threads carry overhead, and that can hurt you.
I’d start with 1 to get it correct then try 4.
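One way to compare different values of n (threads) and x (customers per batch) is to time the same workload under each setting. The sketch below is Python, not Clarion. `fetch_customer` is a hypothetical stand-in that just sleeps for a millisecond, and the helper names are assumptions made for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_customer(customer_id):
    # Hypothetical stand-in for one endpoint call (assumption).
    time.sleep(0.001)
    return customer_id

def make_batches(ids, batch_size):
    # Split the customer list into lots of at most batch_size.
    return [ids[i:i + batch_size] for i in range(0, len(ids), batch_size)]

def process_batch(batch):
    return [fetch_customer(c) for c in batch]

def timed_run(ids, n_threads, batch_size):
    # Run all batches on n_threads workers; return results and elapsed seconds.
    batches = make_batches(ids, batch_size)
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        results = [r for done in pool.map(process_batch, batches) for r in done]
    return results, time.perf_counter() - start

def compare(ids, thread_counts, batch_size):
    # Map each thread count to its elapsed time for the same workload.
    return {n: timed_run(ids, n, batch_size)[1] for n in thread_counts}
```

Calling `compare(list(range(400)), (1, 4, 16), 10)` gives one timing per thread count. The numbers only mean something when the stand-in is replaced by the real call against the real server.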
That’s actually what I’ve tried. 50 threads works fine. 100 works even better, but at 200 some calls start to time out. I’ve also tried changing the timeout, but it doesn’t really do anything for the speed.
Of course, it also requires that the server can deliver at the same pace as I want.
I’ll probably end up somewhere between 50 and 100 threads.
So time to test.
In the NetTalk examples\webclient folder is an example called Web Strain. That might give you some ideas.
Note that you can run the stress on multiple threads in the client, and can simultaneously start multiple clients running the same test(s) each on multiple threads.
Actually, you don’t really need multiple threads - well, no more than the cores in your CPU - and likely not more than 1. What you want are multiple NetTalk objects.
Because NetTalk is non-blocking (aka asynchronous) it’s possible for multiple objects to exist in the same procedure. Given the overhead (in CPU and more importantly RAM) with multiple threads, your best performance is likely to be having some small number of threads (like 16, if your CPU has 16 cores) and then a bunch of objects per thread.
Creating a bunch of objects is straightforward. Either you can just declare them:
net1 NetWebClient
net2 NetWebClient
with their respective PageReceived and ErrorTrap methods, or with a bit of cunning you can make an array of them. NetMaps.Inc and NetMaps.Clw contain an example of doing it with an array. It’s a little bit more complicated (so takes a bit of figuring out) but ultimately scales up a lot more easily.
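The idea of many non-blocking requests sharing a single thread can be illustrated outside Clarion too. The sketch below uses Python's asyncio, not NetTalk. `fetch_customer` is a hypothetical stand-in for an asynchronous call, and `max_in_flight` is an assumed knob that plays the role of "number of client objects".

```python
import asyncio

async def fetch_customer(customer_id):
    # Hypothetical stand-in for a non-blocking request (assumption).
    await asyncio.sleep(0.01)
    return customer_id

async def fetch_all(ids, max_in_flight):
    # The semaphore caps how many requests are outstanding at once.
    limit = asyncio.Semaphore(max_in_flight)

    async def limited(customer_id):
        async with limit:
            return await fetch_customer(customer_id)

    # gather returns results in the same order as the input ids.
    return await asyncio.gather(*(limited(c) for c in ids))

def run(ids, max_in_flight=50):
    return asyncio.run(fetch_all(ids, max_in_flight))
```

Everything here runs on one thread. Concurrency comes from requests waiting on the network at the same time, so raising `max_in_flight` costs far less memory than adding threads.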