Legato

GoFiler Legato Script Reference

 

Legato v 1.5d

Application v 5.25a

  

 

Chapter Eight

Data Functions (continued)

8.3 Text Data Tools

8.3.1 Overview

Text data tools aid in converting structured text into structured data. Consider the following excerpt from an EDGAR archive index (column widths altered to fit):

Description:           Master Index of EDGAR Dissemination Feed by Company Name
Last Data Received:    August 20, 2014
Comments:              webmaster@sec.gov
Anonymous FTP:         ftp://ftp.sec.gov/edgar/

 
 
 
Company Name                       Form Type   CIK         Date Filed  File Name
------------------------------------------------------------------------------------------------------------------
'mktg, inc.'                       10-Q        886475      2014-08-07  edgar/data/886475/0001019056-14-001019.txt          
'mktg, inc.'                       4           886475      2014-07-15  edgar/data/886475/0001144204-14-043105.txt          
'mktg, inc.'                       4           886475      2014-07-15  edgar/data/886475/0001144204-14-043107.txt          
'mktg, inc.'                       4           886475      2014-07-15  edgar/data/886475/0001405086-14-000226.txt          
'mktg, inc.'                       4           886475      2014-07-15  edgar/data/886475/0001405086-14-000227.txt          
'mktg, inc.'                       4           886475      2014-07-15  edgar/data/886475/0001405086-14-000228.txt          
'mktg, inc.'                       DEFM14A     886475      2014-07-16  edgar/data/886475/0001019056-14-000922.txt          
1 800 FLOWERS COM INC              SC 13G/A    1084869     2014-07-10  edgar/data/1084869/0000921895-14-001519.txt         
1 800 FLOWERS COM INC              SC 13G/A    1084869     2014-08-07  edgar/data/1084869/0001086364-14-002508.txt         
10-15 ASSOCIATES, INC.             13F-HR      1511144     2014-08-12  edgar/data/1511144/0001145443-14-001041.txt         
11 East 1st St. LLC                D           1538381     2014-08-04  edgar/data/1538381/0001515971-14-000332.txt         
11.2 Capital I, L.P.               D/A         1591336     2014-07-28  edgar/data/1591336/0001591336-14-000001.txt         
1111 Superior Investors LLC        D           1613090     2014-07-10  edgar/data/1613090/0001116354-14-000002.txt         
1125 North Fairfax LLC             D           1614102     2014-07-17  edgar/data/1614102/0001614102-14-000001.txt         
12 West Capital Management LP      13F-HR      1540531     2014-08-14  edgar/data/1540531/0000905718-14-000528.txt         
12 West Capital Management LP      SC 13D      1540531     2014-08-08  edgar/data/1540531/0000905718-14-000499.txt         
12 West Capital Management LP      SC 13G      1540531     2014-07-28  edgar/data/1540531/0000905718-14-000480.txt         
1200 Ely Street Holdings Co. LLC   S-4         1555840     2014-08-11  edgar/data/1555840/0001571049-14-003885.txt         
123 Camino Carmelita LLC           D           1612736     2014-07-07  edgar/data/1612736/0001612736-14-000001.txt         

To convert this information into CSV, one could use Excel or another tool. Another option is a brute-force approach: read in each line and use the GetStringSegment function to capture each field.
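As an illustration of the brute-force approach, the following minimal sketch slices one line at fixed column offsets. It is written in Python rather than Legato, so it does not show GetStringSegment itself. The COLUMNS table and the slice_record helper are hypothetical names, and the offsets are assumptions measured from the width-reduced sample above, not from the real index file.

```python
# Hypothetical column boundaries (start, end) for the width-reduced sample
# shown above; the real EDGAR index uses wider columns.
COLUMNS = [
    ("company", 0, 35),
    ("form_type", 35, 47),
    ("cik", 47, 59),
    ("date_filed", 59, 71),
    ("file_name", 71, None),  # None means "to the end of the line"
]


def slice_record(line):
    """Cut one fixed-width line into a dict of trimmed field values."""
    return {name: line[start:end].strip() for name, start, end in COLUMNS}


# Build a sample line by padding each value to its assumed column width.
sample = (
    "1 800 FLOWERS COM INC".ljust(35)
    + "SC 13G/A".ljust(12)
    + "1084869".ljust(12)
    + "2014-07-10".ljust(12)
    + "edgar/data/1084869/0000921895-14-001519.txt"
)

record = slice_record(sample)
print(record["company"], "|", record["form_type"])
```

Slicing by position keeps values such as SC 13G/A and company names with embedded spaces intact, which splitting on whitespace would not.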

Text data tools such as the FormattedTextToArray function can perform this conversion, as the following script shows.

//
//      Legato - Convert EDGAR Company IDX to CSV
//      -----------------------------------------
//
//      This script demonstrates CSV functions and the FormattedTextToArray function.
//      EDGAR IDX files are located in the EDGAR Archive at ftp://edgar.sec.gov.
//
//      Rev     08/21/2014      
//
//      (c) Novaworks, LLC
//


        string          fnSRC, fnDST;
        string          fields[20];
        string          s1;
        handle          hFile, hOut;
        int             pos[20];
        int             rc, lx, size;
        
                                                                // -- Set Up Files
        fnSRC = BrowseOpenFile("Select EDGAR IDX File to Make CSV");
        if (fnSRC == "") { exit; }
        
        hFile = OpenMappedTextFile(fnSRC);
        if (IsError(hFile)) {
          rc = GetLastError(); ReportFileError("Source File", rc);
          exit;
          }
        fnDST = ClipFileExtension(fnSRC);
        fnDST += ".csv";
        hOut = CreateFile(fnDST);
        if (IsError(hOut)) {
          rc = GetLastError(); ReportFileError("CSV Output File", rc);
          exit;
          }
                                                                // -- Perform Conversion
                                                                // Positions refer to the original IDX layout,
                                                                // not the width-reduced sample above
        pos[0] = 62;                    // Company Name
        pos[1] = 74;                    // Form Type
        pos[2] = 86;                    // CIK
        pos[3] = 98;                    // Date Filed
        pos[4] = 0;                     // File Name

        ProgressOpen("Convert to CSV");
        lx = 10;                                                // Data Start
        size = GetLineCount(hFile);
        while (lx < size) {
          ProgressUpdate(lx, size);
          s1 = ReadLine(hFile, lx);
          fields = FormattedTextToArray(s1, pos);        
          WriteLine(hOut, CSVArrayToString(fields));
          lx++;
          }
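The conversion loop in the script can be sketched outside Legato as follows. This is a Python illustration, not the Legato implementation. It assumes that each entry in the position array gives the column at which a field ends and that 0 means "the rest of the line". That is one reading of the comments in the script and is not taken from the FormattedTextToArray documentation. The helper names split_at and to_csv are hypothetical.

```python
import csv
import io


def split_at(line, ends):
    """Split line at the given end columns; an end of 0 takes the remainder."""
    fields = []
    start = 0
    for end in ends:
        if end == 0:
            fields.append(line[start:].strip())
            break
        fields.append(line[start:end].strip())
        start = end
    return fields


def to_csv(lines, ends):
    """Return CSV text with one row per input line."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in lines:
        writer.writerow(split_at(line, ends))
    return out.getvalue()


ENDS = [62, 74, 86, 98, 0]  # same values as pos[] in the script

# Build one line padded to the assumed column widths of the original file.
row = (
    "'mktg, inc.'".ljust(62)
    + "10-Q".ljust(12)
    + "886475".ljust(12)
    + "2014-08-07".ljust(12)
    + "edgar/data/886475/0001019056-14-001019.txt"
)

csv_text = to_csv([row], ENDS)
print(csv_text, end="")
```

Because the company name contains a comma, the CSV writer wraps that field in double quotes. Joining the fields with plain commas would lose the field boundary.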

8.3.2 Functions