C10 Read and Write version of ASCIIFileClass

DonnEdwards · August 8, 2019, 8:10pm

All I want to do is read an ASCII text file line by line, do a search and replace, and then write the changed version back to the original file. In Visual Basic this is pretty simple. What am I missing in Clarion?

Can anyone point me to some Open Source code or Clarion Examples that read and write text files?

I’m a newbie so hopefully I’m not asking something really obvious.

seanh · August 8, 2019, 10:22pm

Same as any other file in Clarion. The file just needs to be defined with the ASCII file driver.
This is actually something Clarion does quite well: all files are accessed the same way, and changing the type is as easy as changing the file definition.

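MyTextFile   FILE,DRIVER('ASCII'),NAME('myfile.txt'),PRE(TXT)  ! file name is a placeholder
Record         RECORD
Line             STRING(100)
               END
             END

  CODE
  OPEN(MyTextFile)                ! plain file statements; Access:MyTextFile only exists if the file is in the dct
  SET(MyTextFile)                 ! start at the top of the file
  LOOP
    NEXT(MyTextFile)
    IF ERRORCODE() THEN BREAK.    ! end of file (or a read error)
    ! do stuff with TXT:Line
  END
  CLOSE(MyTextFile)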

Or you can define the file in the dct, use a Process procedure, and put your code in the TakeRecord embed point.
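
For the read, replace and write-back round trip asked about in the original question, a minimal hand-coded sketch with the ASCII driver might look like the following. The file names, the 'old text' / 'new text' pair, the record lengths and the ReplaceAll helper are all made up for illustration and are not part of any shipping class. It assumes no line is longer than 1024 characters, writes the changed lines to a temporary file, and then swaps that file in for the original.

```clarion
  PROGRAM
  MAP
ReplaceAll  PROCEDURE(STRING pText, STRING pFind, STRING pNew),STRING
  END

! Hypothetical file names and record sizes
InFile   FILE,DRIVER('ASCII'),NAME('input.txt'),PRE(INP)
Record     RECORD
Line         STRING(1024)
           END
         END
OutFile  FILE,DRIVER('ASCII'),NAME('input.tmp'),PRE(OUT),CREATE
Record     RECORD
Line         STRING(2048)
           END
         END

  CODE
  OPEN(InFile, 40h)                      ! 40h = read only, deny none
  IF ERRORCODE() THEN RETURN.
  CREATE(OutFile)                        ! make (or empty) the temporary file
  OPEN(OutFile)
  IF ERRORCODE()
    CLOSE(InFile)
    RETURN
  END
  SET(InFile)                            ! start at the top of the file
  LOOP
    NEXT(InFile)
    IF ERRORCODE() THEN BREAK.           ! end of file (or a read error)
    OUT:Line = ReplaceAll(CLIP(INP:Line), 'old text', 'new text')
    ADD(OutFile)                         ! the driver appends the line ending
  END
  CLOSE(InFile)
  CLOSE(OutFile)
  REMOVE(InFile)                         ! swap the temporary file in for the original
  IF ERRORCODE() = 0
    RENAME(OutFile, 'input.txt')
  END

! Illustrative helper: replaces every occurrence of pFind in pText.
! The result is silently truncated at 2048 characters.
ReplaceAll  PROCEDURE(STRING pText, STRING pFind, STRING pNew)
Out           CSTRING(2049)
Start         LONG(1)
Pos           LONG
  CODE
  IF LEN(pFind) = 0 THEN RETURN pText.
  LOOP
    Pos = INSTRING(pFind, pText, 1, Start)
    IF Pos = 0 THEN BREAK.
    Out = Out & SUB(pText, Start, Pos - Start) & pNew
    Start = Pos + LEN(pFind)
  END
  IF Start <= LEN(pText)                 ! copy whatever follows the last match
    Out = Out & SUB(pText, Start, LEN(pText) - Start + 1)
  END
  RETURN Out
```

Writing to a second file and renaming it afterwards keeps the original intact until the whole pass has finished, which is usually safer than trying to update lines in place.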

Bruce · August 9, 2019, 4:48am

Hi Donn,

What’s interesting about your question is that while it includes some context, it’s not immediately apparent what your goals are. And depending on your goals, the answer could be remarkably different.

First, I should point out that Sean’s answer is completely correct, as far as it goes. You can indeed use the ASCII driver to read, and write, an ASCII file. You declare the table in the dictionary (or better yet, inside the procedure in hand-code) and using that you can iterate through the table very easily.

Unfortunately this driver is quite slow, which is understandable given that it has to pretty much re-write the entire file on the disk with each write. (And disk access is the slowest part of this function.) That doesn’t detract from its usefulness, but does possibly open the door to other alternatives.
So, very simple code, but not very fast. One big plus is that it can cope with any file up to 2 Gigs in size while consuming basically no memory. (but the speed on a file of that size would be prohibitively slow.)

An alternative is to use the DOS driver. The DOS driver allows you to read the table in much bigger chunks. This reduces the read time considerably. You can define a large buffer, and load the file in a loop, into say a giant string. (you can use BYTES on the driver and NEW on the string to get a string of the correct size.)

You would then need to parse the string yourself (INSTRING) breaking it up into “lines” (most commonly done by using a QUEUE). Then you can inspect and edit each line of the QUEUE, culminating by writing the queue back to disk.

This approach is faster (much faster) than the ASCII driver, but (assuming you parse-as-you-load) requires at least as much RAM as the file itself. Meaning it’ll work on files up to about 1 Gig.
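
As a rough sketch of the DOS driver approach described above: the file is read in large chunks into one heap string sized with BYTES and NEW, then split into a QUEUE with INSTRING. The file name, the chunk size, the 1024-character queue field and all variable names are assumptions chosen for illustration, and the split assumes CR/LF line endings.

```clarion
  PROGRAM
  MAP
  END

InName    STRING(260)
InFile    FILE,DRIVER('DOS'),NAME(InName),PRE(INF)
Record      RECORD
Chunk         STRING(32768)              ! read size per NEXT; an arbitrary choice
            END
          END
Lines     QUEUE
Text        STRING(1024)                 ! longer lines are truncated in this sketch
          END
Whole     &STRING
FileSize  LONG
Got       LONG
Filled    LONG(0)
Start     LONG(1)
EOL       LONG

  CODE
  InName = 'input.txt'                   ! hypothetical file name
  OPEN(InFile, 40h)                      ! read only, deny none
  IF ERRORCODE() THEN RETURN.
  FileSize = BYTES(InFile)               ! straight after OPEN this is the file size
  IF FileSize = 0
    CLOSE(InFile)
    RETURN
  END
  Whole &= NEW STRING(FileSize)          ! one string big enough for the whole file
  SET(InFile)
  LOOP
    NEXT(InFile)
    IF ERRORCODE() THEN BREAK.
    Got = BYTES(InFile)                  ! bytes actually read (last chunk may be short)
    IF Got = 0 OR Filled + Got > FileSize THEN BREAK.
    Whole[Filled + 1 : Filled + Got] = INF:Chunk[1 : Got]
    Filled += Got
  END
  CLOSE(InFile)

  LOOP WHILE Start <= Filled             ! split on CR/LF into the queue
    EOL = INSTRING('<13,10>', Whole, 1, Start)
    IF EOL = 0 THEN EOL = Filled + 1.    ! last line has no terminator
    Lines.Text = SUB(Whole, Start, EOL - Start)
    ADD(Lines)
    Start = EOL + 2
  END
  DISPOSE(Whole)
  ! Lines now holds one entry per line, ready to inspect, edit and write back
```

Writing back is the same idea in reverse: loop through the queue, append each line plus '<13,10>' to an output buffer, and ADD it to a DOS driver file.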

Lastly you can use Windows API calls to slurp the file off the disk, bypassing the DOS driver. Similarly you can use an API call to write it back to the disk. This speeds things up even more, although it still leaves you with quite a lot of work doing the parsing, and of course the replacing.
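
A hedged sketch of the API route, for reading only: the prototypes below are hand-written for the ANSI Win32 calls CreateFileA, GetFileSize, ReadFile and CloseHandle. The api_ prefixes, the equate labels, the variable names and the file name are assumptions for illustration, and it assumes the linker resolves these calls through the standard Windows import library that ships with Clarion. Files of 2 GB or more are skipped because the size is held in a LONG.

```clarion
  PROGRAM
  MAP
    MODULE('kernel32')
      api_CreateFile(*CSTRING lpFileName, ULONG dwAccess, ULONG dwShare, LONG lpSecurity, ULONG dwCreation, ULONG dwFlags, LONG hTemplate),LONG,PASCAL,RAW,NAME('CreateFileA')
      api_GetFileSize(LONG hFile, LONG lpSizeHigh),LONG,PASCAL,NAME('GetFileSize')
      api_ReadFile(LONG hFile, LONG lpBuffer, ULONG nToRead, *ULONG nRead, LONG lpOverlapped),LONG,PASCAL,RAW,NAME('ReadFile')
      api_CloseHandle(LONG hObject),LONG,PASCAL,PROC,NAME('CloseHandle')
    END
  END

! Values from the Windows SDK headers; the labels are local to this sketch
GENERIC_READ_          EQUATE(80000000h)
FILE_SHARE_READ_       EQUATE(1)
OPEN_EXISTING_         EQUATE(3)
FILE_ATTRIBUTE_NORMAL_ EQUATE(80h)
INVALID_HANDLE_        EQUATE(-1)

FileName   CSTRING(261)
hFile      LONG
FileSize   LONG
BytesRead  ULONG
Buffer     &STRING

  CODE
  FileName = 'input.txt'                 ! hypothetical file name
  hFile = api_CreateFile(FileName, GENERIC_READ_, FILE_SHARE_READ_, 0, OPEN_EXISTING_, FILE_ATTRIBUTE_NORMAL_, 0)
  IF hFile = INVALID_HANDLE_ THEN RETURN.
  FileSize = api_GetFileSize(hFile, 0)   ! negative on error or for files of 2 GB and up
  IF FileSize > 0
    Buffer &= NEW STRING(FileSize)
    IF api_ReadFile(hFile, ADDRESS(Buffer), FileSize, BytesRead, 0) = 0
      BytesRead = 0                      ! the read failed
    END
    ! Buffer[1 : BytesRead] now holds the raw file contents, ready for parsing
    DISPOSE(Buffer)
  END
  api_CloseHandle(hFile)
```

The parsing and replacing still have to be done by hand on the buffer; the API calls only change how the bytes get into memory.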

If you are primarily treating this as a learning exercise I’d recommend you work through all three of these approaches. By implementing all 3 you will learn a lot about the ASCII and DOS file drivers, parsing, string manipulations, and of course how to use Windows API calls.

You may consider all the above to be a lot of work, and you’d be right. If you are not interested in learning all the finer details, or if you have learnt it all already, then you’ll probably want to make use of an existing class which does all this for you. You could write your own, or build on the shipping SystemString class (alas having to work around, or fix, its bugs), or use a commercially available library like StringTheory.

Clearly I’m biased since I currently maintain and sell the StringTheory library (with a lot of help from my friends.) StringTheory has all the functions you would need to do the above in probably 10 lines of code or less. See here for an example (https://www.capesoft.com/docs/StringTheory3/StringTheory.htm#ParsingCSVFile)

Of course using StringTheory delivers the fastest approach to your code, and incidentally the easiest code as well (your search and replace code disappears thanks to a REPLACE command in StringTheory) - in fact chances are you don’t even need to parse the lines, just use REPLACE. But while this may create the best result in your program, in the fewest lines of code, you won’t actually learn a whole lot. Writing 3 lines of code generally doesn’t lead to a whole lot of learning.
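
To give an idea of what the "3 lines" look like, here is a minimal sketch using the StringTheory LoadFile, Replace and SaveFile methods. The file name and the search and replacement text are placeholders, and it assumes StringTheory is installed and its class files are on the project's include path.

```clarion
  PROGRAM
  INCLUDE('StringTheory.inc'),ONCE
  MAP
  END

st  StringTheory

  CODE
  IF st.LoadFile('input.txt')            ! hypothetical file name; true if the file loaded
    st.Replace('old text', 'new text')   ! placeholder search and replacement text
    st.SaveFile('input.txt')             ! write the changed contents back
  END
```

The whole file is treated as one string here, so there is no line-by-line parsing step at all.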

This comes back to goals - if you are primarily in the learning phase then it’s worth taking some time to learn different approaches, and the techniques mentioned earlier are useful to learn. However once you are in the “I get paid to ship solutions, not write code” stage, then it becomes economically more useful to make use of well written, optimised-for-speed, and well maintained, libraries. Libraries like StringTheory ship as source code (although not under an open source license) so you are free to inspect the code, learn from it, improve it, submit suggested changes, and so on. (I get a lot of submissions which are usually folded back in.)

And of course you end up writing fewer lines of code, so you end up with fewer bugs.

Because it’s shipped as source code you aren’t dependent on the supplier (currently me) to stick around. If I disappear you still have the source code so all your work keeps working.

As I said at the beginning of my (now rather lengthy) reply, the answer to your question largely depends on your goals. Hopefully I’ve covered most of the possibilities in this reply.

cheers
Bruce

DonnEdwards · August 9, 2019, 8:08am

Thanks for both replies. This is really helpful. Yes I am in a learning phase and there is nothing so helpful to aid learning as a project or task you want to achieve. Hence my many newbie questions and I am most grateful for the informative replies.

I am happy to use StringTheory to parse the lines and replace the text.
I am currently an Access developer and over the years I have written and collected a lot of library code, and agree that it is more productive and reliable to re-use libraries where needed.

I am writing a utility called Fixer because I am allergic to Clarion’s favorite ‘MS Sans Serif’ font like others are allergic to ‘Comic Sans’.

I figured there would be other “annoyances” to the default generated Clarion code I could fix at the same time. Just put them in a config file and use Fixer to make my project more bearable before I actually start working on it and writing business rules and stuff.

I don’t want my software to look like it was written before the release of Windows 95

Thanks for the pointers and I look forward to learning a whole lot more.

Mike_Duglas · August 9, 2019, 8:39am

You can change the default WINDOW declarations just by editing libsrc\win\defaults.clw.

Bruce · August 9, 2019, 12:25pm

Ha, a topic after my own heart. I hate old fonts so much I wrote a class to change them at runtime. See CapeSoft AnyFont. Alas I’m also in the habit of responding to newbie questions with “I use xxx tool” because I’ve been expanding my own Clarion toolbox for 30-odd years now, and over that sort of time I’ve accumulated, or written, quite a lot.

Wolfgang_Orth · August 14, 2019, 9:16am

That’s true, Mike, but only until the next Clarion update.
Okay, they are not thaaaaat often.

RchdR · October 16, 2024, 8:55pm

I’ll chip in on this thread because I got hit with the ASCII txt file encoding issue.

So before opening a txt file with the ASCII driver, open it with the DOS driver and read the first few bytes to establish how it’s encoded. That would have saved me wasting hours trying to figure out why NEXT(txtfile) was throwing an error with the ASCII driver.

Encoding              Bytes 1, 2, 3, 4

UTF-8                 0xEF 0xBB 0xBF
UTF-16 (big-endian)   0xFE 0xFF (little-endian files start 0xFF 0xFE)
UTF-32 (big-endian)   0x00 0x00 0xFE 0xFF
UTF-7                 0x2B 0x2F 0x76
ASCII                 no BOM (no magic bytes)

In today’s encoded world it would be nice if the ASCII driver automatically detected the encoding, same for StringTheory, and just worked accordingly.
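
A minimal sketch of that check, assuming a hypothetical file name and made-up variable names: it opens the file with the DOS driver, reads up to four bytes, and compares them with the byte order marks listed above. The two little-endian cases (UTF-16 and UTF-32) are additions to the list, and UTF-32 is tested before UTF-16 because the little-endian marks share their first two bytes.

```clarion
  PROGRAM
  MAP
  END

ProbeName  STRING(260)
Probe      FILE,DRIVER('DOS'),NAME(ProbeName),PRE(PRB)
Record       RECORD
Head           STRING(4)                 ! the first four bytes are enough
             END
           END
Got        LONG(0)
B1         BYTE
B2         BYTE
B3         BYTE
B4         BYTE
Encoding   STRING(20)

  CODE
  ProbeName = 'input.txt'                ! hypothetical file name
  OPEN(Probe, 40h)                       ! read only, deny none
  IF ERRORCODE() THEN RETURN.
  SET(Probe)
  NEXT(Probe)
  IF ERRORCODE() = 0 THEN Got = BYTES(Probe).   ! may be fewer than 4 for a tiny file
  CLOSE(Probe)

  B1 = VAL(PRB:Head[1]); B2 = VAL(PRB:Head[2])
  B3 = VAL(PRB:Head[3]); B4 = VAL(PRB:Head[4])

  IF Got >= 3 AND B1 = 0EFh AND B2 = 0BBh AND B3 = 0BFh
    Encoding = 'UTF-8'
  ELSIF Got >= 4 AND B1 = 0 AND B2 = 0 AND B3 = 0FEh AND B4 = 0FFh
    Encoding = 'UTF-32 BE'
  ELSIF Got >= 4 AND B1 = 0FFh AND B2 = 0FEh AND B3 = 0 AND B4 = 0
    Encoding = 'UTF-32 LE'
  ELSIF Got >= 2 AND B1 = 0FEh AND B2 = 0FFh
    Encoding = 'UTF-16 BE'
  ELSIF Got >= 2 AND B1 = 0FFh AND B2 = 0FEh
    Encoding = 'UTF-16 LE'
  ELSIF Got >= 3 AND B1 = 02Bh AND B2 = 02Fh AND B3 = 076h
    Encoding = 'UTF-7'
  ELSE
    Encoding = 'No BOM'                  ! plain ASCII/ANSI, or UTF-8 saved without a BOM
  END
  MESSAGE('Detected: ' & CLIP(Encoding))
```

A missing BOM does not prove the file is ASCII, since UTF-8 files are often saved without one, so this only catches the cases where a mark is present.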

jslarve · October 16, 2024, 9:13pm

FWIW, StringTheory does make provisions for BOM. See the StringTheory Complete Documentation.

As far as the ASCII driver, what’s it going to do with a BOM anyway?

RchdR · October 16, 2024, 10:19pm

The last version of StringTheory I had was the C6 1.93 version, and I don’t know if it had the auto detection. I can’t download the saf file because I don’t have any of my old serial numbers to unlock it, and I can’t remember what email domain account I used back then.
I’m not knocking CapeSoft’s efforts, but we get into that grey area where we can no longer use the standard functions of the drivers to process a file and have to use StringTheory methods instead. Those are more capable, but it would be nice if they worked in a similar record-by-record way when the file is simply encoded in something other than ASCII.

Re the ASCII driver, it could throw an error saying it’s not an ASCII file at the very least. No?

vitesse · October 16, 2024, 11:56pm

Wow, that is an ancient version, Richard, from 2013. The product has greatly improved since then. It looks like st.SetEncodingFromBOM was added almost ten years ago. At the time of writing this, the current version is 3.70.

I am sure Capesoft could help you with your serial numbers.

If you wanted to go to the current version it looks like an upgrade costs $39. (But if you are still using C6 then you might be stuck using an old ST version for those C6 apps.)

https://www.capesoft.com/accessories/stringtheorysp.htm#Cost

RchdR · October 17, 2024, 12:25am

I could use my old v1.com domain to log in because I think that’s the one I used with them, but it could have been the company name .com or .co.uk, and then I’d see all the saf serial numbers on their website. But I’m having too much fun building my own API code. I’ve just created a new StrPosExtractMatch(*cstring pSearchString, *cstring pRegex, long pmode=0, *cstring pExtractedMatch, long pRemoveSearchStringUpToMatchEnd=0),long !return errorcode to make my regex validations easier.

I use C11 gold for the template builder because I hit the 16-bit limit of C6, but for apps I’m so much quicker using C6 that I still use that.

C11 needs a better Saab night mode to make it fast to use like c6.

Bruce · October 17, 2024, 3:34am

If you email me with your current email address I can sort you out. Include the name of the TV show we watched at your place in Cambridge and I’ll know it’s you. Bonus points if you can remember the episode.

vitesse · October 17, 2024, 4:43am

If you get stuck Richard, Carl had an example of this at

specifically:

RchdR · October 17, 2024, 10:21am

Thanks for the offer, but I never lived in Cambridge. You and a few others crashed at mine in Stortford after Cambridge to get somewhere early the next day, and who was the girl with me?

But with questioning like that, I now wonder if you are Jono because he stayed at my other place in Stortford and I think we did watch some TV.

So that’s why I say I have no clue what you guys watched.

But at Cambridge Uni, during one of the morning sessions, I remember sitting opposite you and/or Rob, and you were using Wireshark (or Ethereal, back then) to pull the passwords off the network of everyone there logging into their email, because you were reading out their passwords. Or was that place Kings…?

You know with Cambridge Uni supposedly being a world class IT centre of learning, you’d have thought their IT security would have been better!

Too early in the morning, but if you asked me to rattle off the bank acc no. of yours I transferred money into, then that’s a different thing…