Hi Niels
Nice to see a solution using ST (StringTheory).
I have taken the liberty of optimizing your code.
A couple of points:
- I generally don’t like implicit variables, so I removed your T#.
- At one stage I considered passing the email address by reference rather than by value, i.e. (*STRING pEmail). We can safely do that as we never alter pEmail. Passing by reference costs only 8 bytes (a long for the address and a long for the length) instead of a copy of the whole string, which is particularly advantageous when you are dealing with large strings. But email addresses are never very large, so the gain would be minimal, and it does restrict how you can call the procedure (you have to pass a variable, not a literal or an expression), so I ended up leaving it passed by value.
- I added “auto” to A, E and S, since the first thing the code does with each of them is assign a value, so the default initialization is wasted work. A minor difference, but every little bit adds up.
- I added “static,thread” to the ST object so that it doesn’t have to be initialized and later cleaned up on each call. This is a bit like a threaded global, but its scope is limited to this procedure, so it is safer. Note that when you enter the procedure, st still holds the value from the previous call. That is not a problem here, as the first thing we do is st.setValue(). If you instead started with st.append(), you should call st.free() at the top of the procedure to clear the buffer. Also note you would NOT add “static,thread” if the procedure were recursive, as in that case each instance needs its own separate object. An alternative is to pass in a temporary “scratch” or “worker” ST object, an approach Bruce is using a bit these days in some of his Capesoft stuff.
- I used st.containsChar rather than st.containsA.
- In the loops I read the characters directly through st.valuePtr rather than st.sub. Sub is definitely safer in that it checks for “out of bounds” conditions, but where you are constraining the loop bounds yourself (to st.length() going forwards and to 1 going backwards) you are pretty safe using the direct approach.
- I changed the EQUATEs to STRINGs. While using equates is generally considered good practice, I am much more comfortable using a STRING, which I know will definitely be passed by reference rather than by value wherever that option is available.
- The final return statement could use either st.slice or, again, valuePtr directly. As discussed above, slice is safer but the direct route is faster. In this case the gain is not just avoiding the (tiny) overhead of the call to slice, but also that the compiler can efficiently return the string value by reference.
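To make the “static,thread” caveat concrete, here is a minimal sketch of a procedure that appends to the buffer instead of starting with SetValue. JoinParts and its parameters are made up purely for illustration, and I am assuming Free, Append and GetValue behave as they do in the StringTheory versions I have used, so check against yours.

```clarion
! Hypothetical example; MAP prototype assumed to be:
!   JoinParts PROCEDURE(STRING pFirst, STRING pSecond),STRING
JoinParts     PROCEDURE(STRING pFirst, STRING pSecond)
st              StringTheory,static,thread   ! kept between calls, one instance per thread
  CODE
  st.Free()                  ! discard whatever the previous call left in the buffer
  st.Append(pFirst)          ! Append adds to the current value, hence the Free() above
  st.Append(pSecond)
  return st.GetValue()
```

Without the st.Free() line, each call would keep adding to the text built up by earlier calls on the same thread.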
Anyway, an interesting exercise.
Cheers for now
Geoff R
CleanEmail    PROCEDURE (STRING pEmail)
st              StringTheory,static,thread
A               LONG,auto   ! @ char pos
E               LONG,auto   ! end char pos
S               LONG,auto   ! start char pos
EQ:Name         STRING('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789!#$%&''*+-/=?^_`{{|}~.')
EQ:At           STRING('@')
EQ:Domain       STRING('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-.')
  CODE
  st.SetValue(pEmail)
  !Find most right @
  A = st.Instring(EQ:At,-1)
  if ~A then return ''.
  !Find domain
  loop E = A+1 TO st.Length()
    if ~st.ContainsChar(st.valuePtr[E],EQ:Domain) then break.
  end
  !Find name
  loop S = A-1 to 1 by -1
    if ~st.ContainsChar(st.valuePtr[S],EQ:Name) then break.
  end
  return st.valuePtr[S+1 : E-1]   ! or st.slice(S+1,E-1)
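
For completeness, a rough sketch of how this might be called, which also shows why I left pEmail passed by value. The Raw variable and the sample address are invented for the example, the MAP prototype with a STRING return type is my assumption about how the procedure is declared, and the fragment assumes StringTheory is already part of the project.

```clarion
  PROGRAM
  INCLUDE('StringTheory.inc'),ONCE
  MAP
CleanEmail      PROCEDURE(STRING pEmail),STRING   ! assumed prototype
  END
Raw             STRING(100)                       ! hypothetical input buffer
  CODE
  Raw = '  Contact: <<niels@example.com>  '       ! << is a literal < in a Clarion string
  MESSAGE(CleanEmail('niels@example.com'))        ! a literal is fine when passing by value
  MESSAGE(CleanEmail(CLIP(LEFT(Raw))))            ! so is an expression
  ! With a (*STRING pEmail) prototype only a variable such as Raw could be passed
  ! (the CleanEmail procedure itself follows here, as listed above)
```

With the by-reference version, the first two calls would have to be rewritten to go through a variable, which is the calling restriction mentioned in the points above.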