I want to create MD5 hashes for multiple files in order to find duplicates. I am currently using StringTheory for this, but because of size limits I can’t get the hash for large files.
Thank you for your reply. I tried that and it works, but what I would like is something within Clarion code, because I have thousands of files and I don’t want my application to call an external command each time.
From memory, I think NetTalk and Cryptonite may have MD5 as well as StringTheory, but I do not know whether they require the file to be read into memory first.
There are two implementations there that you could look at and see if they work for you.
If it turns out that all these versions require a string in memory, you may have to see if you can modify them to just read through the file and process as it goes without having it all in memory at the same time. If you do that it would be good to share your final code back here.
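To illustrate the "process as it goes" idea, here is a minimal sketch in Python rather than Clarion. It is only meant to show the shape of the loop: initialise the hash state once, feed it one fixed-size block at a time, and finalise at the end. The RFC 1321 reference C code exposes the same init/update/final split, so a Clarion wrapper could plausibly follow the same pattern. The names `md5_of_file` and `chunk_size` are hypothetical, chosen for this example.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 of a file, reading it in fixed-size chunks.

    Only one chunk (1 MiB by default, an arbitrary choice) is held in
    memory at any time, so the file size does not matter.
    """
    digest = hashlib.md5()              # "init" step
    with open(path, "rb") as f:         # binary mode: hash the raw bytes
        while True:
            chunk = f.read(chunk_size)
            if not chunk:               # empty read means end of file
                break
            digest.update(chunk)        # "update" step, once per chunk
    return digest.hexdigest()           # "final" step
```

Calling `md5_of_file(r"C:\temp\big.iso")` (a made-up path) should give the same hex string as an external MD5 tool run on that file, since both hash the same bytes.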
I found this “MD5 (RSA compatible) calculation” and ran a quick test on a 13 GB file. It gave the same result as the “certutil” command, which is a very good start.
It is good that it has an option to specify a filename rather than just a string with
MDString(*CSTRING myString, *CSTRING MD5Result)
Perhaps a downside is that you are using a DLL without having the source code (but I guess you could say the same about the Clarion runtime and driver DLLs). Looking online I see mention made of:
For MD5 hashing in my classic ASP apps I used a freeware Win32 DLL (aamd532.DLL from “Almeida & Andrade Ltd”)
which looks to be the DLL in use here. It appears to be written by Francisco Carlos Piragibe de Almeida.
Hi Marcelo - there is an “upload” button that allows uploading a file but note that this is the same C code that StringTheory uses to give the MD5 of a string in memory (including the mods made by Marshall Reeve in 2002).
The problem for sk with code like this remains being able to get the MD5 for large files that will not fit in memory - like the 13GB file he successfully tested earlier using the aamd532 dll.
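For the original goal in this thread (finding duplicates among thousands of files, some of them very large), one common way to cut the work is to compare file sizes first and hash only files that share a size. The sketch below is in Python, not Clarion, and is an assumption about how one might organise it, not code from any of the libraries mentioned here. `find_duplicates` and `md5_of_file` are hypothetical names.

```python
import hashlib
import os
from collections import defaultdict


def md5_of_file(path, chunk_size=1024 * 1024):
    """Hex MD5 of a file, read chunk by chunk so it never sits in memory whole."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths):
    """Return lists of paths whose size and MD5 both match."""
    # Pass 1: group by size, which is cheap and needs no file reads.
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only files that share a size with at least one other file.
    by_hash = defaultdict(list)
    for size, same_size in by_size.items():
        if len(same_size) < 2:
            continue  # a unique size cannot have a duplicate
        for path in same_size:
            by_hash[(size, md5_of_file(path))].append(path)

    return [group for group in by_hash.values() if len(group) > 1]
```

Files with a unique size are never opened, so a 13 GB file is hashed only if another file of exactly the same size exists.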
I just added a method to EasyDotNet that computes an MD5 hash for files. It works well on huge files (tested on a 6 GB file, the largest I could find on my notebook). The advantage of this method is that it does not load the file into memory.
Thanks for the info. I do not remember where I got that MD5 code, but I have used it in several programs and it works fine. I do not have StringTheory.
You could use this code to process files on the hard disk. I tested it on the biggest file on my machine (5.5 GB) and it took 18 s. Not very fast, but it works and does not load the file into memory.
@vitesse, you said that the code is the same as the code included in StringTheory.
It would be interesting to see how to include a procedure written in another language in a class.
Not sure if you have ST, but all the code is there. Also, I seem to recall there was a mixed-language example in the Clarion examples that used to ship with Clarion (maybe it still does?). You need to provide the correct prototype for Clarion to be able to call a C function.
Edit: sorry, I see you already mentioned that you do not have StringTheory. (I am tempted to tell you it is worth getting.)
Excellent that you have got it working with a large 5.5GB file. It might help the original poster sk2107 if you could share your code to do that - or did I miss that in the earlier file you posted?