LDC #46: Persistence... Do you have what it takes?
Friday, August 04. 2017
For the most part, scripts are used to make things easier for the user. Why search and replace in HTML code to fix a common problem when you can write a script that does it for you? This reduces the chances of human error as the script runs in a predictable fashion. These quick operations usually just take input and produce some output, but what do you do when you want a script to guide a user through a process? Do you let the user stop in the middle of the process? If so, does your script remember the user’s position so they can pick up where they left off? This week we are going to discuss state persistence.
There are a number of ways your script can save its state. The easiest to implement is global variables. These are persistent throughout the life of the script. When coding a menu hook, the life of the script is the entire time the menu function is hooked. This means you can use global variables to save information between calls to your hook or even calls to multiple hooks if they are in the same file. It is important to note that, while developing these scripts, every time the script file is edited the script is reloaded and all global data is reset. This is unlikely to affect end users, though, since they aren’t normally editing the script.
The script below is an example of a global variable within a hook. In this sample, the global variable count is updated and displayed every time the open function is run. If the application is closed, the count is reset.
    int count;

    int on_open(int f_id, string mode) {
        if (mode == "postprocess") {
            MessageBox('i', "You have used the open command %d time(s)", ++count);
        }
        return ERROR_NONE;
    }

    int main() {
        MenuSetHook(MenuFindFunctionID("FILE_OPEN"), GetScriptFilename(), "on_open");
        return ERROR_NONE;
    }
Global variables are great for storing data within the session. But how can you store data between sessions? There are many options to store data between sessions, but the basic concept is the same for most of these methods: save the data to a location the script can read next time it is run.
The simplest functions to do this are PutSetting and GetSetting. Behind the scenes, these functions use Windows INI files to save settings.
Let’s edit the above script to use INI functions. Here we will make the script read which function is hooked from an INI file as well as how many times that function has been run. Then we can save the run count every time the message box is displayed.
    int count;
    string function;

    int on_open(int f_id, string mode) {
        if (mode == "postprocess") {
            MessageBox('i', "You have used the %s command %d time(s)", function, ++count);
            PutSetting(function, "Count", count);
        }
        return ERROR_NONE;
    }

    int main() {
        function = GetSetting("Hook", "Function");
        if (function == "") {
            function = "FILE_OPEN";
        }
        count = GetSetting(function, "Count");
        MenuSetHook(MenuFindFunctionID(function), GetScriptFilename(), "on_open");
        return ERROR_NONE;
    }
As you can see, there is now a new global variable that holds the name of the function we are hooked into. The main function now reads that name using GetSetting. If that fails, it defaults to “FILE_OPEN”. The on_open function now also saves the count using PutSetting. Both of these functions use the default INI name, which is based on the script filename. Using the default also means the settings are stored on a per-user basis.
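To make the storage concrete, here is roughly what the settings file might contain after the hook has fired three times. The section and key names come straight from the GetSetting and PutSetting calls in the script. The count value is made up for the example, and the [Hook] section would only be present if someone added it, since the script reads it but never writes it. The file's exact name and location are not shown in this post.

```ini
; Hypothetical contents of the script's default INI file
[Hook]
Function=FILE_OPEN

[FILE_OPEN]
Count=3
```

Because the count is stored under a section named after the hooked function, changing the Function entry starts a separate counter instead of overwriting the old one.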
Since the data is now stored in an INI file, it persists between sessions and can also be read by other scripts. The downside to these functions is that they save and read only key-value pairs. This makes them best suited to your script’s own settings, such as options on how the script will run or the location of data files. For more complicated data, we can convert the data to strings using functions like ArrayToParameters or even HexBufferToString for binary data.
If we want to save even more complicated data, the versatility of Legato truly shines. The options to do this are nearly endless. We could save the variables directly to a file and then read them. We could write an XML, HTML or JSON file and parse the data out as needed. We can even use ODBC to connect to a database and save and load the data using SQL. Some of these options involve more considerations, such as file locations, permissions, passwords, etc., but almost anything is possible.
Now that we’ve seen an impractical use of settings and global variables, how about a real-life example? The script below adds a menu function to the Document ribbon that searches an HTML file for numeric data that is not XDX tagged. It also hooks the Find Next function so the user can press F3 or use Find Next to easily find more numeric items that need to be tagged. This script is missing some advanced features, such as looping at the end of the file, but these features are not required to show the use of persistent data.
    // Predefines
    int setup ();
    int start_find (int f_id, string mode);
    int on_find (int f_id, string mode);
    int on_find_next (int f_id, string mode);
    int find_numeric ();

    int main() {
        setup();
        return ERROR_NONE;
    }

    // Set up
    int setup() {
        string fnScript;
        string item[10];
        int rc;

        // Get Script
        fnScript = GetScriptFilename();
        if (fnScript == "") {
            MessageBox('x', "Script file must not be untitled.");
            return ERROR_NONE;
        }

        // Set up Our Hook
        item["Class"] = "DocumentExtension";
        item["Code"] = "XDX_FIND_NUMBERS";
        item["MenuText"] = "&Find Next Numeric";
        item["Description"] = "<B>Find Next Numeric</B>\r\rFinds the next non-tabular numeric item that is not XDX-tagged.";

        // Test if it exists
        rc = MenuFindFunctionID(item["Code"]);
        if (IsNotError(rc)) {
            return ERROR_NONE;
        }

        // Add it
        rc = MenuAddFunction(item);
        if (IsError(rc)) {
            return ERROR_NONE;
        }
        MenuSetHook(item["Code"], fnScript, "start_find");

        // Hook Find Next
        MenuSetHook(MenuFindFunctionID("EDIT_FIND"), fnScript, "on_find");
        MenuSetHook(MenuFindFunctionID("EDIT_FIND_NEXT"), fnScript, "on_find_next");
        return ERROR_NONE;
    }

    // Global Data
    int cx, cy;
    handle current_window;

    // Starts the find Numeric Process (or resumes)
    int start_find(int f_id, string mode) {
        handle edit_window;
        int rc;

        // Stop if not preprocess
        if (mode != "preprocess") {
            return ERROR_NONE;
        }

        // Check Window
        edit_window = GetActiveEditWindow();
        if (IsError(edit_window)) {
            MessageBox('x', "Cannot get edit window.");
            return ERROR_EXIT;
        }
        if (edit_window == current_window) {
            return on_find_next(f_id, mode);
        }

        // New Window
        current_window = edit_window;
        cx = 0;
        cy = 0;
        find_numeric();
        return ERROR_NONE;
    }

    // User wants to use normal find
    int on_find(int f_id, string mode) {
        // Reset this
        current_window = NULL_HANDLE;
        return ERROR_NONE;
    }

    // Finds the next item
    int on_find_next(int f_id, string mode) {
        handle edit_window;
        int rc;

        // Stop if not preprocess
        if (mode != "preprocess") {
            return ERROR_NONE;
        }

        // Check Window
        edit_window = GetActiveEditWindow();
        if (IsError(edit_window)) {
            return ERROR_NONE;
        }
        if (edit_window != current_window) {
            return ERROR_NONE;
        }

        // Run our processor
        find_numeric();
        return ERROR_EXIT;
    }

    int find_numeric() {
        handle edit_object;
        handle sgml;
        string item, id, last_item, next_item;
        int type;
        int ex, ey, sx, sy;
        int table_count;
        int no_dates;
        int no_notes;

        // Check Window Type
        type = GetEditWindowType(current_window) & EDX_TYPE_ID_MASK;
        if ((type != EDX_TYPE_PSG_PAGE_VIEW) && (type != EDX_TYPE_PSG_TEXT_VIEW)) {
            MessageBox('x', "This function only works on HTML windows.");
            return ERROR_EXIT;
        }

        // Set up Variables
        table_count = 0;
        last_item = "";
        SetCursor();
        no_dates = IsTrue(GetSetting("Options", "Suppress Dates"));
        no_notes = IsTrue(GetSetting("Options", "Suppress Notes"));

        // Set up Parse
        edit_object = GetEditObject(current_window);
        sgml = SGMLCreate(edit_object);
        SGMLSetPosition(sgml, cx, cy);

        // Parse
        item = SGMLNextNonSpaceItem(sgml);
        while (item != "") {
            if (IsError(item)) {
                return ERROR_EXIT;
            }
            type = SGMLGetItemType(sgml);

            // Element
            if (type == SPI_TYPE_TAG) {
                // Test for Table
                if (FindInString(item, "<table", 0, false) > (-1)) {
                    table_count++;
                }
                if (FindInString(item, "</table", 0, false) > (-1)) {
                    table_count--;
                    if (table_count < 0) {
                        table_count = 0;
                    }
                }

                // Skip already inline tagged items
                id = SGMLGetParameter(sgml, HA_ID);
                if (IsRegexMatch(id, "xdx_90[0-9A-F]_.+")) {
                    SGMLFindClosingElement(sgml, SP_FCE_NONE);
                    item = SGMLNextNonSpaceItem(sgml);
                    continue;
                }
            }

            // Test for Numeric if not in table
            if ((table_count == 0) && (type == SPI_TYPE_TEXT)) {
                if (IsAccounting(item) && HasNumeric(item)) {
                    sx = SGMLGetItemPosSX(sgml);
                    sy = SGMLGetItemPosSY(sgml);
                    ex = SGMLGetItemPosEX(sgml);
                    ey = SGMLGetItemPosEY(sgml);

                    // Test if next is even more
                    next_item = SGMLNextNonSpaceItem(sgml);
                    type = SGMLGetItemType(sgml);
                    if ((type == SPI_TYPE_TEXT) && IsAccounting(next_item) && HasNumeric(next_item)) {
                        ex = SGMLGetItemPosEX(sgml);
                        ey = SGMLGetItemPosEY(sgml);
                        item += " " + next_item;
                    }

                    // Suppress Dates
                    if (no_dates == true) {
                        if (IsRegexMatch(MakeLowerCase(last_item + " " + item), "(january|february|march|april|may|june|july|august|september|october|november|december) [0-9]{1,2}, [0-9]{4}[,\\.]{0,1}")) {
                            item = SGMLNextNonSpaceItem(sgml);
                            continue;
                        }
                        if (IsRegexMatch(MakeLowerCase(last_item + " " + item), "(january|february|march|april|may|june|july|august|september|october|november|december|and) [0-9]{4}[,\\.]{0,1}")) {
                            item = SGMLNextNonSpaceItem(sgml);
                            continue;
                        }
                    }

                    // Suppress Notes
                    if (no_notes == true) {
                        if (IsRegexMatch(MakeLowerCase(last_item + " " + item), "note [0-9]+[-–—]{0,1}")) {
                            item = SGMLNextNonSpaceItem(sgml);
                            continue;
                        }
                    }

                    SetCaretPosition(edit_object, sx, sy);
                    SetSelectArea(edit_object, sx, sy, ex, ey);
                    cx = ex;
                    cy = ey;
                    break;
                }
                last_item = item;
            }
            item = SGMLNextNonSpaceItem(sgml);
        }

        UpdateEditWindow(edit_object);
        CloseHandle(edit_object);
        CloseHandle(sgml);
        return ERROR_NONE;
    }
The majority of the functions used by this script have been covered in past posts, so we will highlight only how the script uses persistent data to change its behavior. The core functionality of this script is how the menu hooks interact with the global data. The current_window variable and the cx and cy variables are used to track the last window and the last numeric item that was found in that window. The next time the hook is run, it picks up where it left off using these global variables. If the window is no longer the same window, the function resets the variables and starts again.
The script also hooks Find and Find Next. The Find hook resets the current_window variable, which changes the behavior of Find Next. The Find Next hook uses the current_window variable to determine whether the user wants to find the next numeric item or just use the normal Find Next. If the current window doesn’t match current_window, the application’s Find Next is called. This seamlessly integrates the script into the application’s interface.
The meat of the script is the find_numeric function that uses the SGML parser to find numbers that are not inline tagged. It is important to note that this routine uses GetSetting to check if the user wants to suppress dates and note identifiers.
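As an illustration, the two options that find_numeric reads could be stored in the script's settings file like this. The section and key names match the GetSetting calls in the script. The values shown are an assumption: this post does not say which strings IsTrue treats as true, so check that before relying on a particular spelling.

```ini
; Hypothetical option entries; the accepted "true" values are an assumption
[Options]
Suppress Dates=true
Suppress Notes=false
```

If neither key is present, GetSetting presumably returns an empty string, as it does for the missing function name in the earlier example. Whether IsTrue then reports false, leaving both filters off, is also an assumption.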
The last thing to note about persistent data is the security aspect. We have covered this in previous blog posts, but it is worth repeating: once a script relies on persistent data, it becomes vulnerable to bad data, so it is important to compensate for that. In return, persistent data lets scripts simplify the user experience and streamline workflow for better productivity.
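One way to compensate for bad data is to validate each value as it is loaded. The sketch below reworks the main function from the run-count example using only calls that already appear in this post. The fallback choices, reverting to FILE_OPEN and resetting a negative count to zero, are illustrative assumptions and not a prescribed policy.

```legato
int count;
string function;

int on_open(int f_id, string mode) {
    if (mode == "postprocess") {
        MessageBox('i', "You have used the %s command %d time(s)", function, ++count);
        PutSetting(function, "Count", count);
    }
    return ERROR_NONE;
}

int main() {
    int id;

    // Read the hooked function name; fall back to the default if it is missing
    function = GetSetting("Hook", "Function");
    if (function == "") {
        function = "FILE_OPEN";
    }

    // An unknown function code yields an error, so fall back to the default
    id = MenuFindFunctionID(function);
    if (IsError(id)) {
        function = "FILE_OPEN";
        id = MenuFindFunctionID(function);
    }

    // A negative run count can only come from a damaged or hand-edited file
    count = GetSetting(function, "Count");
    if (count < 0) {
        count = 0;
    }

    MenuSetHook(id, GetScriptFilename(), "on_open");
    return ERROR_NONE;
}
```

The same idea applies to any value a script reads back: treat it as untrusted input and decide in advance what the script should do when the value is missing or out of range.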
David Theis has been developing software for Windows operating systems for over fifteen years. He has a Bachelor of Science in Computer Science from the Rochester Institute of Technology and co-founded Novaworks in 2006. He is the Vice President of Development and is one of the primary developers of GoFiler, a financial reporting software package designed to create and file EDGAR XML, HTML, and XBRL documents to the U.S. Securities and Exchange Commission.