Friday, November 15, 2019
LDC #160: Securing Log Files with MD5
Last month, in blog post LDC 154, we discussed logging user submissions to the SEC. Thinking more about it, I realized that leaving the log file unsecured was probably not a good idea. If someone makes a mistake and files a document they shouldn’t, they could simply alter the log to show a different time or user for the filing, and there would be no record of the change. A malicious user could just as easily change what the log says! This is unacceptable, and it is well within our power to fix. To do so, we’re going to use a common tool: the MD5 hash function.
We’ve used the MD5 hash function in previous scripts. Put simply, it reads a file and generates a short, fixed-length text string (a digest) that acts as a fingerprint for that file’s contents. If the contents change, even by a single character, the digest will, for all practical purposes, be different. This is very helpful here, since we want to detect whether a user has changed our file: if they change something, the file will produce a different MD5 hash, and we can tell it has been altered.
I’ve made some changes to the logging script that are outside the scope of this post, so it now reads GoFiler’s own log to get the activity that occurred while filing to the SEC, but the overall program flow hasn’t changed. The general order of events is still:
1 ) User presses filing button
2 ) Script activates, takes time stamp, allows execution to continue
3 ) Filing completes, script takes time stamp
4 ) If no log file exists, create it
5 ) Record activity in log file
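The fingerprinting behavior of MD5 described above can be sketched in a few lines. This sketch is in Python rather than Legato, and the sample log contents and the md5_digest helper are made up for illustration; it only shows that identical content gives an identical digest while edited content does not.

```python
import hashlib


def md5_digest(data: bytes) -> str:
    """Return the MD5 digest of data as 32 lowercase hex characters."""
    return hashlib.md5(data).hexdigest()


# Hypothetical log contents, before and after someone edits the user name.
original = b"Submissions:1;\nSubmission 1 User: alice\n"
tampered = b"Submissions:1;\nSubmission 1 User: bob\n"

# Same bytes in, same digest out.
print(md5_digest(original) == md5_digest(original))
# A small edit produces a different digest.
print(md5_digest(original) == md5_digest(tampered))
```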
Now that we’re adding some hashing functions in there, we’re going to need to slightly modify this flow. We need to check if the file was altered before we change the log, and we need to write out a hash value at the end of the process to effectively “sign” our file. Our new flow will look like:
1 ) User presses filing button
2 ) Script activates, takes time stamp, allows execution to continue
3 ) Filing completes, script takes time stamp
4 ) If no log file exists, create it
5 ) If log file exists, read hash value, compare contents of file to hash, check if it has been altered.
6 ) If altered, mark file as altered. Else, continue
7 ) Record activity in the log
8 ) Remove the hash code at the top of the file, get hash of file, put new hash on top of file.
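The eight steps above can be sketched end to end as follows. This is a minimal Python illustration, not the Legato script: record_activity, body_digest, and the sample entries are hypothetical names, and it assumes the signature lives on the first line and is computed over the file with that line blanked.

```python
import hashlib
import os
import tempfile

ALTERED_STAMP = "*** FILE ALTERED ***"


def body_digest(lines):
    """MD5 of the log with its first (signature) line blanked out."""
    body = "\n".join([""] + lines[1:]) + "\n"
    return hashlib.md5(body.encode("utf-8")).hexdigest()


def record_activity(path, entries):
    """Append entries to a signed log, stamping it first if it was altered."""
    if os.path.isfile(path):
        with open(path, encoding="utf-8") as f:
            lines = f.read().splitlines() or [""]
        if lines[0] != body_digest(lines):      # steps 5 and 6: check signature
            lines.append(ALTERED_STAMP)
    else:
        lines = [""]                            # step 4: new log, empty signature line
    lines.extend(entries)                       # step 7: record activity
    lines[0] = body_digest(lines)               # step 8: re-sign the file
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "filing.log")
    record_activity(log_path, ["Submission 1 User: alice"])
    record_activity(log_path, ["Submission 2 User: alice"])
    with open(log_path, encoding="utf-8") as f:
        clean_log = f.read()
    # Simulate someone editing the log by hand between filings.
    with open(log_path, "w", encoding="utf-8") as f:
        f.write(clean_log.replace("alice", "bob"))
    record_activity(log_path, ["Submission 3 User: alice"])
    with open(log_path, encoding="utf-8") as f:
        stamped_log = f.read()

print(ALTERED_STAMP in clean_log, ALTERED_STAMP in stamped_log)  # False True
```

Note that the stamp is added before the new entries are written, so the altered marker sits between the tampered history and the next legitimate filing.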
Let’s take a look at the new code we’ll need to implement this flow, starting with the script’s hooked function (named log in the full listing below). We’ll skip to the relevant portion, right after the script tests whether the submission log file exists.
        if(IsFile(filename)){  /* if the file exists */
            mtext = OpenMappedTextFile(filename);  /* open file as mapped text */
            s1 = ReadLine(mtext,0);  /* read first line (stored hash) */
            ReplaceLine(mtext,0,"");  /* blank out the first line */
            MappedTextSave(mtext);  /* save the mtext file */
            CloseHandle(mtext);  /* close handle to mtext */
            mtext = OpenFile(filename);  /* get handle to log file */
            s2 = MD5CreateDigest(mtext);  /* create MD5 hash of the file */
            CloseHandle(mtext);  /* close handle to file */
            mtext = OpenMappedTextFile(filename);  /* open the mapped text object */
            ReplaceLine(mtext,0,s1);  /* put hash back */
            if(s1!=s2){  /* if we have a hash mismatch */
                InsertLine(mtext,-1,"*** FILE ALTERED ***");  /* stamp the log as altered */
            }
Our first modification comes right after we test whether the log file already exists. If it does, we open it as mapped text and read the first line, which holds the hash value our script created by running the MD5 algorithm on the file. We store that value in a variable, then blank out the line with the ReplaceLine function and save the file. With the hash removed, we can open the file with the OpenFile function to get a file handle and run the MD5CreateDigest function to create a hash string of its contents. We then close that handle, re-open the file as a Mapped Text Object so we can read and edit it easily, put the stored hash back, and compare the two hash strings. If they do not match, something altered this file outside of our script, so we stamp “*** FILE ALTERED ***” at the bottom of the log to record that.
        for(ix=size-1; ix>=0; ix--){  /* for each sublog entry */
            InsertLine(mtext,-1,FormatString("%s%s",pre,sublog[ix]));  /* insert into mtext */
        }
        ReplaceLine(mtext,0,"");  /* strip out previous signature */
        MappedTextSave(mtext);  /* save mapped text object */
        CloseHandle(mtext);  /* close handle to mtext */
        mtext = OpenFile(filename);  /* open the file */
        digest = MD5CreateDigest(mtext);  /* create MD5 hash */
        CloseHandle(mtext);  /* close file */
        mtext = OpenMappedTextFile(filename);  /* open the file as mapped text */
        ReplaceLine(mtext,0,digest);  /* write new hash on first line */
        MappedTextSave(mtext);  /* save the mapped text file */
        return ERROR_NONE;  /* return */
    }
The second modification comes after we finish writing our logging information to the file. It’s very similar to the modification above: we first blank out the first line, then save and close the file. Once it’s saved and closed, we can use OpenFile to get a handle, use MD5CreateDigest to create a hash string, then close the file. We can then re-open it as mapped text, write the hash string onto the first line, save it again, and return. It may seem odd that we have to save and close the file so many times, but it’s important to remove the old hash string from the top of the file before we create a new one. The hash has to be computed over the same content both when signing and when checking, and a file cannot contain a hash of itself.
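To see why the signature line has to be blanked before hashing, here is a small Python sketch (not Legato; the sample body and the md5_hex helper are invented for illustration). It signs a body whose first line is empty, then shows that hashing the signed file as-is gives a value that cannot be compared with the stored signature, while blanking line one first reproduces it.

```python
import hashlib


def md5_hex(text: str) -> str:
    """MD5 of a text string, as hex."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Hypothetical log body; the leading "\n" is the blanked signature line.
body = "\nSubmissions:1;\nSubmission 1 User: alice\n"
signature = md5_hex(body)
signed = signature + body            # signature written onto line one

# Hashing the signed file directly also hashes the signature itself.
naive = md5_hex(signed)

# Blanking line one first restores exactly the bytes that were signed.
first_line, _, rest = signed.partition("\n")
recomputed = md5_hex("\n" + rest)

print(naive == signature, recomputed == first_line)  # False True
```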
This is all well and good, but what happens if a user just goes back into the file and deletes the line “*** FILE ALTERED ***”? Or what if they make a mistake, edit the file, and then simply don’t file again, so the stamp never appears in the log? Test filing again just to see whether the “*** FILE ALTERED ***” stamp appears is obviously not an acceptable answer, so we really need a new menu function in GoFiler that verifies the submission log file has not been altered. To do that, we have a second script file to discuss, named “VerifyLogIntegrity.ms”. The purpose of this script is simply to open a log file and test whether it contains the string “*** FILE ALTERED ***”; if it does not, the script checks that the hash code at the top of the file matches the content of the rest of the file.
The VerifyLogIntegrity.ms file has three functions: setup, run, and main. The setup and main functions follow the same pattern as in our other scripts, so we’re just going to focus on the run function.
    /****************************************/
    int run(int f_id, string mode){  /* set the run */
    /****************************************/
        ... variable declarations omitted ...
        if(mode!="preprocess"){  /* check for preprocess mode */
            return ERROR_NONE;  /* return */
        }
        logfile = BrowseOpenFile("Select log file","*.log|*.log");  /* browse for a log file */
        if(logfile!="" && CanAccessFile(logfile)){  /* if the logfile is real */
            mtext = OpenMappedTextFile(logfile);  /* open the log file */
            rc = GetLastError();  /* get last error code */
            if(IsError(rc)){  /* if we cannot open the file */
                MessageBox('x',"Cannot open file %s",logfile);  /* display error */
                return rc;  /* exit with error */
            }
            size = GetLineCount(mtext);  /* get the number of lines */
            for(ix=0; ix<size; ix++){  /* for each line */
                s1 = ReadLine(mtext,ix);  /* read line */
                if (s1 == "*** FILE ALTERED ***"){  /* if file is already marked */
                    MessageBox('x',"Log file altered");  /* display error */
                    return ERROR_NONE;  /* return */
                }
            }
The first thing our verify script has to do is ask the user to select a log file with the BrowseOpenFile function. If the function returns a file name, and the system can access that file, then we can try to open it as a Mapped Text Object. If that doesn’t work, we display an error message and return the error code, because we cannot verify the integrity of a file we cannot open. After that, we get the line count with GetLineCount and iterate over each line. If we find one that equals our file altered stamp, we know the file has already been marked as altered by our script, so we can display a message box and return without any further processing. If we don’t find that stamp, we still have to verify the hash code itself, because it’s entirely possible a user just deleted the altered stamp.
            s1 = ReadLine(mtext,0);  /* read first line (stored hash) */
            ReplaceLine(mtext,0,"");  /* blank out the first line */
            MappedTextSave(mtext);  /* save the mtext file */
            CloseHandle(mtext);  /* close handle to mtext */
            mtext = OpenFile(logfile);  /* get handle to log file */
            s2 = MD5CreateDigest(mtext);  /* create MD5 hash of the file */
            CloseHandle(mtext);  /* close handle to file */
            mtext = OpenMappedTextFile(logfile);  /* open the mapped text object */
            ReplaceLine(mtext,0,s1);  /* put hash back */
            MappedTextSave(mtext);  /* save the mapped text object */
            CloseHandle(mtext);  /* close handle to mtext */
            if(s1!=s2){  /* if we have a hash mismatch */
                MessageBox('x',"Log file altered or invalid.");  /* display error */
            }
            else{  /* if they match */
                MessageBox('i',"Valid log file.");  /* display success */
            }
        }
        return ERROR_NONE;  /* return no error */
    }
To verify the integrity of the file, we follow a process very similar to the one the logging script uses to check whether the file has been altered. We first read the first line with ReadLine to store the hash code, then replace that line with a blank one using the ReplaceLine function. We can then save the file, close it, and re-open it with the OpenFile function to get a generic file handle. With that handle, we use MD5CreateDigest to generate a hash value, close the generic file handle, re-open the file as mapped text, and put the original hash back before saving the file and closing it. Finally, we compare the hash string we just generated against the one we took from the file. If there is a mismatch, we report that the log file was altered or invalid; if they match, we report that the log file is valid.
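The verification logic just described can be summarized in a short Python sketch. It is not the Legato script: verify_log_text and its return values are hypothetical, and it assumes the same layout as above (signature on line one, computed over the file with that line blanked).

```python
import hashlib

ALTERED_STAMP = "*** FILE ALTERED ***"


def verify_log_text(text: str) -> str:
    """Classify signed-log text as 'stamped', 'mismatch', or 'valid'."""
    lines = text.splitlines()
    if ALTERED_STAMP in lines:          # already marked by the logging script
        return "stamped"
    if not lines:                       # nothing to check against
        return "mismatch"
    body = "\n".join([""] + lines[1:]) + "\n"
    digest = hashlib.md5(body.encode("utf-8")).hexdigest()
    return "valid" if lines[0] == digest else "mismatch"


# Build a hypothetical signed log in memory.
body = "\nSubmissions:1;\nSubmission 1 User: alice\n"
signed = hashlib.md5(body.encode("utf-8")).hexdigest() + body

print(verify_log_text(signed))                          # valid
print(verify_log_text(signed.replace("alice", "bob")))  # mismatch
print(verify_log_text(signed + ALTERED_STAMP + "\n"))   # stamped
```

One design difference worth noting: this sketch works on text in memory, so the check never rewrites the file. The Legato version saves and restores the first line because it hashes the file through a file handle.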
This method lets us tell fairly reliably whether the log file was altered or has remained unchanged, but it’s important to realize that the altered stamp may not always be in the file. If a user deletes the altered stamp and then does a test filing, GoFiler will add a new altered stamp to the bottom of the file when the log runs (since the hash no longer matches), but the file will only say it’s been altered once instead of twice. A knowledgeable user could also figure out how we’re doing this and write their own Legato script to generate a hash code, so they could edit the file and then update the hash value to match. This could be prevented by putting a key phrase into the file at a certain point before hashing it, then deleting the key phrase, so that the hash code cannot be generated without knowing the key. I think doing that is potentially overkill, though, since a typical user is unlikely to know enough programming to defeat this logging authentication. Like everything else, we need to balance functionality against the time it takes to actually finish our script.
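The key phrase idea can be sketched as follows. Note that this swaps in HMAC, the standard construction for mixing a secret key into a hash, rather than literally inserting and deleting a phrase in the file; it is written in Python, not Legato, and SECRET_KEY and keyed_signature are hypothetical names. Whether Legato offers an equivalent call is not assumed here.

```python
import hashlib
import hmac

# Hypothetical key; anyone who can read it can forge signatures.
SECRET_KEY = b"example-key-change-me"


def keyed_signature(body: bytes, key: bytes = SECRET_KEY) -> str:
    """HMAC-MD5 of the log body, as hex."""
    return hmac.new(key, body, hashlib.md5).hexdigest()


body = b"\nSubmissions:1;\nSubmission 1 User: alice\n"
plain = hashlib.md5(body).hexdigest()       # what the current scripts store
keyed = keyed_signature(body)               # what a keyed scheme would store

# Someone edits the log and recomputes a plain MD5, without the key.
forged_body = body.replace(b"alice", b"bob")
forged_sig = hashlib.md5(forged_body).hexdigest()

# The forged signature does not match what the keyed check expects.
print(hmac.compare_digest(forged_sig, keyed_signature(forged_body)))  # False
```

The protection is only as good as the secrecy of the key, so a key stored in a readable script file raises the bar without making forgery impossible.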
Here's a copy of our improved file logging script, with the new added security for testing for alterations.
    #define BEGIN_SUB "Progress Open: Submitting Filing..."
    int log(int f_id, string mode);
    string begin;  /* beginning file time */
    string end;  /* ending file time */
    /****************************************/
    void setup() {  /* Called from Application Startup */
    /****************************************/
        string fn;  /* file name */
        string item[10];  /* menu item array */
        fn = GetScriptFilename();  /* get the filename of the script */
        MenuSetHook("EDGAR_SUBMIT_TEST", fn, "log");  /* hook the log function */
        MenuSetHook("EDGAR_SUBMIT_LIVE", fn, "log");  /* hook the log function */
        MenuSetHook("EDGAR_SUBMIT_TEST_AGENT", fn, "log");  /* hook the log function */
    }
    /****************************************/
    int log(int f_id, string mode){  /* Log the function response */
    /****************************************/
        handle mtext;  /* handle to mapped text object */
        boolean found_accession;  /* if we have found the accession num */
        string s1,s2;  /* temp strings */
        string digest;  /* digest */
        string submissions;  /* submissions header string */
        int num_subs;  /* number of submissions */
        string gofiler_log;  /* gofiler log path */
        string app_name;  /* application name */
        string output;  /* output string */
        string response[];  /* sorted response */
        string pre;  /* prefix for submission lines */
        string filename;  /* filename of log */
        int size,ix,lx;  /* counter variables */
        string line;  /* line from mtext */
        string sublog[];  /* submission log */
        string row;  /* a row of the response */
        if (mode == "preprocess"){  /* if we're in preprocess */
            begin = GetUTCTime(DS_ISO_8601);  /* get UTC time */
            return ERROR_NONE;  /* return no error */
        }
        if (mode=="postprocess"){  /* only get postprocess */
            end = GetUTCTime(DS_ISO_8601);  /* get end time */
        }
        app_name = GetApplicationName();  /* get the name of the application */
        app_name = app_name+".log";  /* get the log file name */
        gofiler_log = GetApplicationDataFolder();  /* get the appdata folder path */
        gofiler_log = AddPaths(gofiler_log,app_name);  /* get the log file path */
        s1 = GetMenuFunctionResponse();  /* get the response */
        if(s1 == ""){  /* if no response */
            return ERROR_NONE;  /* return */
        }
        response = ParametersToArray(s1);  /* parse response */
        lx = 0;  /* reset counter */
        mtext = OpenMappedTextFile(gofiler_log);  /* open log file as mapped text */
        size = GetLineCount(mtext);  /* get lines in mtext */
        found_accession = false;  /* found accession is false */
        for(ix=size-1;ix>=0;ix--){  /* for each line in mtext */
            line = ReadLine(mtext,ix);  /* read line */
            if(FindInString(line,response["AccessionNumber"])>0){  /* if the line has an accession */
                found_accession = true;  /* mark that accession is found */
            }
            if(found_accession){  /* if we've got an accession */
                sublog[lx] = line;  /* store line */
                lx++;  /* increment lx */
                if(FindInString(line,BEGIN_SUB)>=0){  /* if this was the first submission line */
                    break;  /* break the for loop */
                }
            }
        }
        CloseHandle(mtext);  /* close handle to log */
        response["StartProcess"] = begin+"Z";  /* set process begin time */
        response["EndProcess"] = end+"Z";  /* set end time */
        response["User"]=GetUserName();  /* get the username */
        size = ArrayGetAxisDepth(response);  /* get size of response */
        filename = response["File"];  /* get the filename of the project */
        filename = ClipFileExtension(filename);  /* clip the extension off */
        filename = filename + ".log";  /* add log extension */
        if(IsFile(filename)){  /* if the file exists */
            mtext = OpenMappedTextFile(filename);  /* open file as mapped text */
            s1 = ReadLine(mtext,0);  /* read first line (stored hash) */
            ReplaceLine(mtext,0,"");  /* blank out the first line */
            MappedTextSave(mtext);  /* save the mtext file */
            CloseHandle(mtext);  /* close handle to mtext */
            mtext = OpenFile(filename);  /* get handle to log file */
            s2 = MD5CreateDigest(mtext);  /* create MD5 hash of the file */
            CloseHandle(mtext);  /* close handle to file */
            mtext = OpenMappedTextFile(filename);  /* open the mapped text object */
            ReplaceLine(mtext,0,s1);  /* put hash back */
            if(s1!=s2){  /* if we have a hash mismatch */
                InsertLine(mtext,-1,"*** FILE ALTERED ***");  /* stamp the log as altered */
            }
            line = ReadLine(mtext,0);  /* read the first line */
            if(IsInString(line,"Submissions")){  /* if this is an old unsigned file */
                InsertLine(mtext,0,"");  /* insert a blank line on the top */
            }
            InsertLine(mtext,-1,"");  /* insert blank line */
            submissions = ReadLine(mtext,1);  /* get the submissions header line */
            submissions = GetParameter(submissions,"Submissions");  /* get number of submissions */
            num_subs = TextToInteger(submissions);  /* convert it to an integer */
            num_subs++;  /* increment number of submissions */
            ReplaceLine(mtext,1,FormatString("Submissions:%d;",num_subs));  /* change number of submissions */
        }
        else{  /* if the log file doesn't exist */
            mtext = CreateMappedTextFile(filename);  /* create new mapped text file */
            submissions = "Submissions:1;";  /* set submissions string */
            InsertLine(mtext,-1,submissions);  /* insert submissions string */
            InsertLine(mtext,-1,"");  /* insert blank line into mtext */
            num_subs = 1;  /* set number of submissions */
        }
        pre = FormatString("Submission %d ",num_subs);  /* set prefix for submission line */
        if(response["LiveFiling"]=="1"){  /* if live */
            response["LiveFiling"] = "Live";  /* set value */
        }
        else{  /* if not live */
            response["LiveFiling"] = "Test";  /* set value */
        }
        response["StartProcess"] = UTCToLocal(response["StartProcess"]);  /* convert time to local */
        response["BeginTime"] = UTCToLocal(response["BeginTime"]);  /* convert time to local */
        response["CompleteTime"] = UTCToLocal(response["CompleteTime"]);  /* convert time to local */
        response["EndProcess"] = UTCToLocal(response["EndProcess"]);  /* convert time to local */
        InsertLine(mtext,-1,FormatString("%s%s: %s",pre,"User",response["User"]));  /* insert into mtext */
        InsertLine(mtext,-1,FormatString("%s%s: %s",pre,"Type",response["LiveFiling"]));  /* insert into mtext */
        InsertLine(mtext,-1,FormatString("%s%s: %s",pre,"FormType",response["FormType"]));  /* insert into mtext */
        InsertLine(mtext,-1,FormatString("%s%s: %s",pre,"PrimaryCIK",response["PrimaryCIK"]));  /* insert into mtext */
        InsertLine(mtext,-1,FormatString("%s%s: %s",pre,"AgentCIK",response["AgentCIK"]));  /* insert into mtext */
        InsertLine(mtext,-1,FormatString("%s%s: %s",pre,"File",response["File"]));  /* insert into mtext */
        InsertLine(mtext,-1,FormatString("%s%s: %s",pre,"ValidatedOK",response["ValidatedOK"]));  /* insert into mtext */
        InsertLine(mtext,-1,FormatString("%s%s: %s",pre,"AccessionNumber",response["AccessionNumber"]));  /* insert into mtext */
        InsertLine(mtext,-1,FormatString("%s%s: %s",pre,"User Pressed File",response["StartProcess"]));  /* insert into mtext */
        InsertLine(mtext,-1,FormatString("%s%s: %s",pre,"Begin Transmit",response["BeginTime"]));  /* insert into mtext */
        InsertLine(mtext,-1,FormatString("%s%s: %s",pre,"End Transmit",response["CompleteTime"]));  /* insert into mtext */
        InsertLine(mtext,-1,FormatString("%s%s: %s",pre,"User Pressed End",response["EndProcess"]));  /* insert into mtext */
        size = ArrayGetAxisDepth(sublog);  /* get submission log length */
        for(ix=size-1; ix>=0; ix--){  /* for each sublog entry */
            InsertLine(mtext,-1,FormatString("%s%s",pre,sublog[ix]));  /* insert into mtext */
        }
        ReplaceLine(mtext,0,"");  /* strip out previous signature */
        MappedTextSave(mtext);  /* save mapped text object */
        CloseHandle(mtext);  /* close handle to mtext */
        mtext = OpenFile(filename);  /* open the file */
        digest = MD5CreateDigest(mtext);  /* create MD5 hash */
        CloseHandle(mtext);  /* close file */
        mtext = OpenMappedTextFile(filename);  /* open the file as mapped text */
        ReplaceLine(mtext,0,digest);  /* write new hash on first line */
        MappedTextSave(mtext);  /* save the mapped text file */
        return ERROR_NONE;  /* return */
    }
    /****************************************/
    int main(){  /* main */
    /****************************************/
        setup();  /* run the setup function */
        return ERROR_NONE;  /* return */
    }
Here is a full copy of the VerifyLogIntegrity.ms file:
    int run(int f_id, string mode);
    /****************************************/
    void setup() {  /* Called from Application Startup */
    /****************************************/
        string fn;  /* file name */
        string item[10];  /* menu item array */
        fn = GetScriptFilename();  /* get the filename of the script */
        item["code"] = "VERIFY_LOG";  /* set menu code */
        item["MenuText"] = "Verify a log file";  /* set menu text */
        item["Description"] = "Verifies the check code of a log file.";  /* set description */
        MenuAddFunction(item);  /* add to menu */
        MenuSetHook(item["code"],fn,"run");  /* set the hook to this new item */
    }
    /****************************************/
    int run(int f_id, string mode){  /* set the run */
    /****************************************/
        handle mtext;  /* mapped text handle */
        string s1,s2;  /* temp strings */
        int ix;  /* counter */
        int rc;  /* return code */
        int size;  /* size of the mapped text object */
        string logfile;  /* log file path */
        if(mode!="preprocess"){  /* check for preprocess mode */
            return ERROR_NONE;  /* return */
        }
        logfile = BrowseOpenFile("Select log file","*.log|*.log");  /* browse for a log file */
        if(logfile!="" && CanAccessFile(logfile)){  /* if the logfile is real */
            mtext = OpenMappedTextFile(logfile);  /* open the log file */
            rc = GetLastError();  /* get last error code */
            if(IsError(rc)){  /* if we cannot open the file */
                MessageBox('x',"Cannot open file %s",logfile);  /* display error */
                return rc;  /* exit with error */
            }
            size = GetLineCount(mtext);  /* get the number of lines */
            for(ix=0; ix<size; ix++){  /* for each line */
                s1 = ReadLine(mtext,ix);  /* read line */
                if (s1 == "*** FILE ALTERED ***"){  /* if file is already marked */
                    MessageBox('x',"Log file altered");  /* display error */
                    return ERROR_NONE;  /* return */
                }
            }
            s1 = ReadLine(mtext,0);  /* read first line (stored hash) */
            ReplaceLine(mtext,0,"");  /* blank out the first line */
            MappedTextSave(mtext);  /* save the mtext file */
            CloseHandle(mtext);  /* close handle to mtext */
            mtext = OpenFile(logfile);  /* get handle to log file */
            s2 = MD5CreateDigest(mtext);  /* create MD5 hash of the file */
            CloseHandle(mtext);  /* close handle to file */
            mtext = OpenMappedTextFile(logfile);  /* open the mapped text object */
            ReplaceLine(mtext,0,s1);  /* put hash back */
            MappedTextSave(mtext);  /* save the mapped text object */
            CloseHandle(mtext);  /* close handle to mtext */
            if(s1!=s2){  /* if we have a hash mismatch */
                MessageBox('x',"Log file altered or invalid.");  /* display error */
            }
            else{  /* if they match */
                MessageBox('i',"Valid log file.");  /* display success */
            }
        }
        return ERROR_NONE;  /* return no error */
    }
    /****************************************/
    int main(){  /* main */
    /****************************************/
        setup();  /* run setup */
        if(GetScriptParent()=="LegatoIDE"){  /* if run from IDE */
            run(0,"preprocess");  /* run for testing */
        }
        return ERROR_NONE;  /* return no error */
    }
Steven Horowitz has been working for Novaworks for over five years as a technical expert with a focus on EDGAR HTML and XBRL. Since the creation of the Legato language in 2015, Steven has been developing scripts to improve the GoFiler user experience. He is currently working toward a Bachelor of Sciences in Software Engineering at RIT and MCC.