Standard caGet vs improved caGet comparison

Here is my summary of the caget performance studies done yesterday and today:

1. Right now, the "normal" (sequential mode) caget from the CaTools package
takes 0.25 sec to fetch 400 channels. According to callgrind, it could be
made even faster by optimizing the various printf calls (a 40% speedup
looks possible, see the callgrind tree dump):

http://www.star.bnl.gov/~dmitry/tmp/caget_sequential.png

[Valgrind memcheck reports 910 KB of RAM used, no memory leaks]

2. At the same time, the "bulk" (parallel mode) caget from the EzcaScan
package takes 13 seconds to fetch the same 400 channels. Here is the
callgrind tree again:

http://www.star.bnl.gov/~dmitry/tmp/caget_parallel.png

[Valgrind memcheck reports 970 KB of RAM used, no memory leaks]

For the "parallel" caget, most of the time is spent in Ezca_getTypeCount
and Ezca_pvlist_search. I tried all the command-line options available
for this caget, with the same result. This makes me believe that the
caget from the EzcaScan package is even less optimized in terms of
performance. It could be better optimized in terms of network usage,
though (otherwise those guys wouldn't even mention "improvement over
regular caget" in their docs).
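
For scale, the two timings above can be turned into a per-channel cost and a slowdown factor. This is just arithmetic on the numbers already quoted (0.25 sec and 13 sec for 400 channels); the function name is made up for the illustration.

```python
# Timings quoted above: 400 channels in 0.25 s (CaTools) vs 13 s (EzcaScan).
CHANNELS = 400
SEQUENTIAL_S = 0.25   # CaTools caget, sequential mode
PARALLEL_S = 13.0     # EzcaScan caget, "bulk" mode


def per_channel_ms(total_seconds: float, channels: int) -> float:
    """Average wall-clock cost per channel, in milliseconds."""
    return total_seconds / channels * 1000.0


seq_ms = per_channel_ms(SEQUENTIAL_S, CHANNELS)
par_ms = per_channel_ms(PARALLEL_S, CHANNELS)
slowdown = PARALLEL_S / SEQUENTIAL_S

print(f"sequential: {seq_ms:.3f} ms/channel")
print(f"parallel:   {par_ms:.3f} ms/channel")
print(f"slowdown:   {slowdown:.0f}x")
```

That is roughly 0.6 ms per channel for the sequential caget against about 32 ms per channel for the "bulk" one, i.e. a factor of about 50.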

Another thing is that the current sequential caget is *possibly* using the
same "bulk" mode internally (the "ca_array_get" function shows up for both
cagets).

Oh, if this matters: for this test I used EPICS base 3.14.8 plus the latest
version of the EzcaScan package, recompiled with gcc at both no and maximum
optimization.
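
In case someone wants to repeat the wall-clock part of the comparison, here is a minimal sketch of a timing harness. It is not the procedure used for the numbers above; the helper names and the channel names in the usage note are hypothetical, and it assumes a caget binary on PATH that takes PV names as arguments.

```python
import subprocess
import sys
import time


def caget_argv(channels, binary="caget"):
    """Build the command line: the binary followed by the PV names."""
    return [binary, *channels]


def time_command(argv, repeats=3):
    """Run argv `repeats` times; return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # Output is discarded so terminal rendering does not skew the timing.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        best = min(best, time.perf_counter() - start)
    return best
```

Usage would look like `time_command(caget_argv(my_400_channels))`, once for each caget binary, where `my_400_channels` is a placeholder for the real channel list. Taking the best of a few runs reduces noise from the first connection to each channel.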