How to import CSV into Clarion 11 TopSpeed file without writing code

I wrote a CSV parser that I keep meaning to clean up and put on GitHub.
One of these days. It’s a very different approach.

You still load the whole CSV into a string, but the only additional memory allocated is for a bunch of 8-byte &STRING references. No copies of the data itself are needed. The only NEW’d data other than the CSV is a single allocation to house all of the &STRINGs. So for a huge CSV, you need a lot less memory to parse it than you would via Split().
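For anyone who hasn't used &STRING slices this way, here is a minimal sketch of the idea in plain Clarion. It is not the parser class described above: the names (Buf, Blank, FieldQ) and the sample data are made up for illustration, it keeps the references in a QUEUE rather than in one allocation, and it ignores quoted fields entirely.

```clarion
  PROGRAM
  MAP
  END

Buf     &STRING               ! the one big allocation holding the CSV text
Blank   STRING(1)             ! target for empty fields, so no reference is left invalid
FieldQ  QUEUE                 ! one reference per field; the data itself is never copied
Ref       &STRING
        END
B       LONG                  ! start position of the current field
X       LONG                  ! current character position

  CODE
  Buf &= NEW STRING(13)
  Buf = 'red,green,,42'       ! stands in for a CSV loaded from disk
  B = 1
  LOOP X = 1 TO LEN(Buf) + 1  ! the +1 pass closes the last field
    IF X <= LEN(Buf)
      IF Buf[X] <> ',' THEN CYCLE.
    END
    IF X > B
      FieldQ.Ref &= Buf[B : X - 1]   ! points into Buf, no copy made
    ELSE
      FieldQ.Ref &= Blank            ! empty field
    END
    ADD(FieldQ)
    B = X + 1
  END

  LOOP X = 1 TO RECORDS(FieldQ)
    GET(FieldQ, X)
    MESSAGE('Field ' & X & ': ' & CLIP(FieldQ.Ref))
  END

  FREE(FieldQ)                ! drop the references first
  DISPOSE(Buf)                ! then release the text they pointed into
```

The references are only valid while Buf is alive, so the queue has to be freed (or at least no longer read) before Buf is disposed.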

Interesting, Jeff - I had a conversation with someone about something similar maybe a year or so ago - most probably Bruce, in relation to possible future StringTheory optimizations. Of course doing that with ST would break much of the existing code base, as often you want to change the lines queue without it affecting the main string value (until you later do a join to combine the lines into the string value). So it is a case of swings and roundabouts… but it would be good for large files where you effectively cannot fit two copies into memory.

Do you use an array for the &string references or a queue? An array would be quicker and you could use a dynamic string underneath it. Ha ha, you could even use an ST object for that (grins).

Yes, that’s true, but sooner or later you need to learn to write code - you can’t do everything like “paint by numbers”.

I haven’t used the DOS, ASCII or BASIC drivers in years, preferring ST, but having said that, your way with the BASIC driver is pretty nifty (as Jeff might say).

Yes, that sounds right (at least as it is at the moment - version 3.35). You could do a replace as you suggest, something like:

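str   StringTheory
lne   StringTheory
fld   StringTheory
x     Long
y     Long
  code
  str.LoadFile('Somefile.CSV')
  str.LineEndings()                  ! normalise line endings to CR/LF
  str.Split('<13,10>','"')           ! one line per record; CR/LF inside quotes is left alone
  loop x = 1 to str.Records()
    lne.SetValue(str.GetLine(x))
    lne.Split(',','"','"',true)      ! split into fields and strip the surrounding quotes
    loop y = 1 to lne.Records()
      fld.SetValue(lne.GetLine(y))
      if fld.Replace('""','"')       ! un-double any embedded quotes
        lne.SetLine(y, fld.GetValue())
      end
    end
    field1 = lne.GetLine(1)          ! field1, field2: your own variables or file fields
    field2 = lne.GetLine(2)
  end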

I used that for my Do2Class utility to load an entire TXA into a String and then have a Queue of lines. I also keep an UPPER version. The TxtTXA &STRING is NEW() to the size of the file and loaded using the API, but it could be done with the DOS driver. In the Repo see code in CBTxa2Q.CLW / .INC

People want to use INSTRING but I prefer a simple loop, byte by byte. Since I exclude the 13,10, an empty line would leave me without a valid [slice], so those are all set with &= Glo:String1, a STRING(1). That way I don’t have an invalid &STRING to crash me.

CBTxa2QClass.SplitLines     PROCEDURE() !,VIRTUAL
!-----------------------------------
LQ   &CbTxaLineQueueType
Chr BYTE,AUTO
Tx  &STRING
LenTxt &LONG
LineCnt LONG
B LONG,AUTO   !Begin [] of current line
X LONG,AUTO   !Current [X] char
E LONG,AUTO   !End [] of line w/o 13,10
    CODE 
    FREE(SELF.LinesQ)
    IF SELF.TxtSize < 2 THEN RETURN.
    LQ &= SELF.LinesQ 
    Tx &= SELF.TxtTXA 
    LenTxt &= SELF.TxtSize

    B=1 ; LineCnt=0 
    LOOP X=1 TO LenTxt+1
        IF X<LenTxt AND VAL(Tx[X])<>13 THEN CYCLE.
        LineCnt += 1  
        E=X+1                !E=End with 10 after 13
        IF X=LenTxt+1 THEN   !The last byte, might not end with 13,10
           E=X-1
           IF E<B THEN BREAK.  !Hit the end and
        END 
        LQ.LineNo = LineCnt   
        LQ:LenWO2 = E-B+1 -2  !w/o 13,10    ! LenTxt      LONG      !LQ:LenTxt
        LQ:PosBeg = B
        LQ:PosEnd = E
        IF LQ:LenWO2 > 0 THEN  !some text & 13,10
           LQ:TxtTxa &= Tx[B : E-2]  
           LQ:TxtUPR &= Self.TxtUPR[B : E-2]
        ELSE  !Just 13,10
           LQ:TxtTxa &= Glo:String1  !<--- Prevent GPF
           LQ:TxtUPR &= Glo:String1        
        END
        ADD(LQ)
        X=E ; B=X+1 
    END           
    RETURN

It uses kind of a hybrid, but could do the whole thing without a QUEUE if needed. A queue is handy for storing a &STRING representing a whole record and also a &STRING of &STRINGs representing the columns. But both of the &STRINGs in the queue represent a small portion of the existing contiguous pre-allocated blocks of memory.

I don’t want to burst anyone’s bubble, but the whole point of my question/topic was: is there a way that Clarion can handle a simple import and export of Basic data into a Clarion TPS format? Pretty much the answer is no.

I was an early customer of Clarion, going back to the DOS version all the way up to 5.5, before I started AS400/IBM work. I didn’t lose interest, I just did not have the time.

You guys have had great suggestions and some of them worked. I do not mind hard coding if necessary, but when I purchased this software again, 30-odd years later, I assumed this minor issue had been addressed so developers can get to creating the app instead of having to code to get the data. You would think that the Neanderthal Clarion would have evolved to tackle this simple task 30 years later and that it would just be a simple procedure without plugins or additional coding. As far as I know, that did not happen.

There are a lot of hoops to jump through to accomplish this simple task, and I am disappointed in Clarion at the moment. I did SQL work for 7 years but do not want to do it in my home environment and development process unless absolutely necessary. SQL is wonderful, but I do not want to use it as an alternative for gathering data for a TPS file when a simple import procedure should be in place for the Clarion product. This should have been implemented years ago. On the drawing board, it should be: let’s take DataA to DataB, and this is how we do it; it doesn’t matter which format it is in.

So, why do I not have a problem exporting data and importing the same data file into Google Sheets, Excel or OpenOffice and manipulating it with no issues? I can take the same file with Clarion and it is sucking its thumb trying to figure out what to do. It probably depends on format, but why have other companies figured it out while Clarion is still sitting on its hands? This should be a simple import/export function/procedure, so design and functionality of the project can move along without building the foundation of the app being a cumbersome task in the first place. Sure, the Dictionary can be manipulated to show the data from a .csv, but the Clarion development environment is lacking in utilizing the work that has already been performed in the Dictionary you created. Wonderful, you can create a Browse on the CSV. Not so wonderful on the options to manipulate it or convert it to a TPS; this should be a procedure within the Clarion development environment.

I am sure this has been asked and addressed many times, but I am not a firm believer in re-creating the wheel when SoftVelocity/Clarion can do better. No matter what file format needs converted, there should be a better way to manipulate it within the Clarion program. Right now it is Dictionary design and hard coding to get a simple project off the ground, due to data manipulation with the Clarion tools. This should be an all-inclusive process.

Regards,

Hi Waldo,

I fear my reply will be long and I suspect it may easily be interpreted as condescending, or patronising, neither of which are intended. Also I write this not to “convince” you of any specific thing (use Clarion, don’t use Clarion, clearly it doesn’t affect me) but rather simply to help you to better understand what is going on.

The essence of what you posted comes up a fair bit in this, and other, forums. Programming as a career takes us down various paths with roads that both bring us in and take us away. Sometimes those roads circle back, and there’s a somewhat-steady stream of people who return after a decade or two of absence.

Of course in the time away the world has changed, and so returning to Clarion now may not be the same as the world you left Clarion in 30 years ago. Clarion itself has changed, both in ownership over the years, and also in the goals it has.

To fully appreciate this change it is helpful to recall the DOS CPD 2.1 days. At the time the product was helmed by Bruce Barrington, who had a specific vision and goal for the product. Being for DOS (which was a very limited OS) programs were also limited in their scope. The biggest problems were RAM availability and Printer codes. A networked computer was one that was on a LAN and had some sort of shared-file access. Program data was “silo’d” - only your program could access your data.

To avoid a long boring history lesson, if we contrast it with today, the landscape is very different. Today we are running on the Windows desktop, or more commonly inside a web browser. Printers and RAM are (almost) non-issues, data is expected to be shared (via SQL or ODBC), programs easily communicate with each other via Web Services, and “networked” includes all kinds of things from email and ftp to web pages, maps, and endless other things. And that’s before we talk about the changes brought on by hardware changes - Threads, Global Memory and so on.

So ultimately programs have changed, and perhaps more importantly, as a result of that, the scope of programs created in Clarion has changed.

When I started in CPD the “scope” of a Clarion program was pretty fixed. “Embed points” were few, hand-code was rare, and one Clarion program was more or less exactly the same as the next. Basically the programs did “one thing” (browses / forms / reports) - but it made that one thing easy enough for non-programmers to deal with.

Today - especially with mature programs - while the Browse / Form / Report structure is still often there, that is the very starting point - perhaps 5% of the program. The last 20 years have been focused on “coding” - writing the thing that makes your program special, or different from the rest.

When Bruce Barrington retired the funding model for Clarion changed, and when Topspeed dissolved and SoftVelocity was born it changed again. Each change has brought about a decrease in overall funding to the main product. It turns out writing developer tools is not terribly profitable because developers are both cheap, and think they could do better themselves. :) That Clarion still exists as a product is the amazing part, given that more-or-less none of its contemporaries still exist.

I suppose my point is this; In the DOS days Clarion was a tool - something you used. Today Clarion is a programming environment - in which you program. The focus is much more on writing, or including, code, than it is on specific “tasks”. It’s also massively extensible, so it’s probably not a surprise that over the last 2 decades lots of people have made (and in some cases) sold extensions. (Of course I am one of those people.)

For commercial developers, many people see the economic benefit in extensions (it’s cheaper than paying an employee, or writing something yourself); others prefer the challenge of writing it themselves. For hobbyist programmers, with limited budgets, especially ones returning to Clarion, there is both a LOT to write, and a LOT to learn. Once upon a time the “Clarion in a box” was all you needed to do everything. But the scope of everything has changed, and that is no longer a realistic approach.

To return to your note above - you asked a question, and got several different approaches. I think you were hoping for a “convert CSV to TPS” button in the IDE - alas that simply does not exist. Neither does an XML to TPS, or JSON to TPS or EDI to TPS or any of a dozen other file formats that might be considered common.

You do of course have (as mentioned) lots of ways to do this task - from a BASIC driver file, and PROCESS template, to using SQL as an external tool to do the task, to writing code with StringTheory or even just simple Clarion code to read and write records. None of these are the “right” way, none of them are the “have to” way - they are just a few of the many approaches you could take.
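To make the “simple Clarion code to read and write records” option concrete, here is a rough sketch of a BASIC-to-TPS copy loop. Everything specific in it is an assumption for illustration: the file names, the two-column layout, the field sizes, and the idea that the CSV has a header row (which is why the /FIRSTROWHEADER driver string is there; check the BASIC driver documentation for your Clarion version). It also assumes the TopSpeed driver is linked into the project.

```clarion
  PROGRAM
  MAP
  END

! Hypothetical layout: change names and sizes to match the real CSV columns
CsvIn   FILE,DRIVER('BASIC','/FIRSTROWHEADER=ON'),NAME('customers.csv'),PRE(CSV)
Record    RECORD
Name        STRING(40)
City        STRING(40)
          END
        END

Cust    FILE,DRIVER('TOPSPEED'),NAME('customers.tps'),PRE(CUS),CREATE
Record    RECORD
Name        STRING(40)
City        STRING(40)
          END
        END

  CODE
  OPEN(CsvIn, 40h)                 ! read-only, deny none
  IF ERRORCODE()
    MESSAGE('Cannot open CSV: ' & ERROR())
    RETURN
  END
  CREATE(Cust)                     ! note: replaces any existing customers.tps
  OPEN(Cust)
  IF ERRORCODE()
    MESSAGE('Cannot open TPS: ' & ERROR())
    CLOSE(CsvIn)
    RETURN
  END

  SET(CsvIn)                       ! start at the top of the CSV
  LOOP
    NEXT(CsvIn)
    IF ERRORCODE() THEN BREAK.     ! end of file (or a read error)
    CUS:Name = CSV:Name
    CUS:City = CSV:City
    ADD(Cust)
    IF ERRORCODE() THEN BREAK.
  END

  CLOSE(CsvIn)
  CLOSE(Cust)
```

In a real app both files would normally live in the Dictionary, and the loop body is about all a PROCESS template procedure would need in its embed.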

I think this line of yours sums it up best;

No matter what file format needs converted, there should be a better way to manipulate it within the Clarion program,

I think you are seeing Clarion as an ETL type program here - whereas it’s really not that - it’s a programming environment. So most things you do with it will be programming.

I say all this not to convince you that you are wrong, but perhaps to suggest you are looking in the wrong place for the functionality you are looking for. An ETL program may be a better fit for what you have in mind…


Here’s the current rendition of my CSVParserClass. One of these days, I’ll find some time to update the readme and add some more features. I would appreciate feedback and any files that it currently fails with. Usually, the problem is an incorrect delimiter or line endings.

If the combo list of separators doesn’t have what you need, you can enter your own in single quotes. For a comma, you’d enter: ‘,’ OR you can put CHR(02ch). OR, you can modify the program.
