Legato

GoFiler Legato Script Reference

 

Legato v 1.5d

Application v 5.25a

Chapter Six: File Functions (continued)

6.10 File Types

6.10.1 Overview

It is frequently desirable to know what type of data a file contains. The file’s extension can give a clue but can also be ambiguous. To dig deeper, a program must examine the content of the file and look for either an explicit signature or other clues as to the content.

File type testing covers only a fraction of the file types used on Windows, Unix and other systems. Further, some testers, such as those for Word or Excel, operate on file extensions only.

6.10.2 File Type Codes

The underlying application uses file type codes to represent a file’s type. File type codes, in binary form, are a set of bit fields that help to classify the file type and identify its content. When the content is examined, more type information is available and certain content data is also extracted. The codes can also be represented as a string token.

The file type codes consist of a series of bitwise values. Note that the top bit (0x80000000) is never set, so as to avoid a file type code being confused with a formatted error code. The first portion of interest is the class, or FT_CLASS_MASK (0x0000F000). This classifies files into groups such as spreadsheets and document files. The specific type of file is contained in the type ordinal, FT_TYPE_ORDINAL_MASK (0x000007FF). A flag, FT_A (0x00000800), is set on various file types to indicate that the file is ANSI or ASCII based (it could also be UTF, but it can be opened in a text editor).
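As a rough illustration of this bit layout (written in Python rather than Legato, purely to show the arithmetic), the sketch below splits a code into its class, type ordinal, and text flag. The mask values are copied from the table in this section. The describe function is a hypothetical helper, not part of Legato, and the sample values assume that a table entry such as "+ 0x101 + FT_A" is an offset added to its group indicator.

```python
# Mask and flag values as listed in the table in section 6.10.2.
FT_CLASS_MASK = 0x0000F000
FT_TYPE_ORDINAL_MASK = 0x000007FF
FT_A = 0x00000800

# Group indicators ("Format Indicator Group" rows) from the same table.
CLASS_NAMES = {
    0x1000: "FT_TYPE_TEXT",
    0x2000: "FT_TYPE_DATA",
    0x3000: "FT_TYPE_IMAGE",
    0x4000: "FT_TYPE_MEDIA",
    0x5000: "FT_TYPE_EXCHANGE",
    0x6000: "FT_TYPE_PROGRAM",
    0x7000: "FT_TYPE_PROJECT",
    0x8000: "FT_TYPE_PLACE",
}


def describe(code):
    """Hypothetical helper: return (class name, ordinal, text flag)."""
    class_name = CLASS_NAMES.get(code & FT_CLASS_MASK, "unknown")
    ordinal = code & FT_TYPE_ORDINAL_MASK
    is_text = bool(code & FT_A)
    return class_name, ordinal, is_text


# 0x1901 is FT_TYPE_TEXT + 0x101 + FT_A (FT_HTML under the assumption above).
print(describe(0x00001901))  # ('FT_TYPE_TEXT', 257, True)
# 0x3007 is FT_TYPE_IMAGE + 0x007 (FT_PNG), which carries no FT_A flag.
print(describe(0x00003007))  # ('FT_TYPE_IMAGE', 7, False)
```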

Certain types have a variation, as set by the FT_VARIATION_MASK (0x0E000000), to indicate subtypes such as an XBRL schema file.

Finally, a file type may contain version information. For this reason, direct comparison of codes may not be practical.
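One way to compare two codes without being tripped up by version bits is to mask both sides first. The Python sketch below illustrates the idea using FT_TYPE_MASK from the table in this section. The same_type helper and the sample version value are hypothetical, and the FT_PDF and FT_WORD values assume that table offsets are added to the FT_TYPE_TEXT group indicator.

```python
# Masks as listed in the table in section 6.10.2.
FT_TYPE_MASK = 0x0000FFFF     # class + flag + ordinal, "less version"
FT_VERSION_MASK = 0x00FF0000  # version bits

# Assumed composition: FT_TYPE_TEXT (0x1000) + offset from the table.
FT_PDF = 0x00001000 + 0x211
FT_WORD = 0x00001000 + 0x201


def same_type(code_a, code_b):
    """Hypothetical helper: compare two codes, ignoring the upper bits."""
    return (code_a & FT_TYPE_MASK) == (code_b & FT_TYPE_MASK)


# A made-up version value placed in the version field, for illustration only.
pdf_with_version = FT_PDF | 0x00140000

print(pdf_with_version == FT_PDF)           # False: direct comparison fails
print(same_type(pdf_with_version, FT_PDF))  # True: masked comparison matches
print(same_type(FT_PDF, FT_WORD))           # False: different ordinals
```

Note that masking with FT_TYPE_MASK also discards the variation bits, so under this comparison an XBRL schema and an XBRL instance would count as the same type. Whether that is desirable depends on the script.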

  Definition/Token   Bitwise Codes   Description/Comments  
  Masks          
  Main Types          
    FT_TYPE_FLAGS   0xF0000000   File Format Flags (not clipboard)  
    FT_TYPE_MASK   0x0000FFFF   File Type Mask (less version)  
    FT_CLASS_MASK   0x0000F000   File Class Indicator Mask  
    FT_TYPE_ORDINAL_MASK   0x000007FF   File Type Ordinal Mask  
    FT_SUB_TYPE_MASK   0x00000700   File Type Group Mask  
    FT_VARIATION_MASK   0x0E000000   File Type Major Ordinal Mask  
  Versioning          
    FT_VERSION_MASK   0x00FF0000   File Type Version Mask  
  Flags          
    FT_A   0x00000800   ASCII Base (can be opened as text)  
  Unknown          
    FT_UNKNOWN   0x00000000   Unknown File Type  
    FT_MIXED   0xFFFFFFFE   Mixed File Type  
  Text Formats          
    FT_TYPE_TEXT   0x00001000   Format Indicator Group  
    FT_ANSI       + 0x001 + FT_A   ANSI Format  
    FT_OEM       + 0x002 + FT_A   OEM Format  
    FT_UNICODE       + 0x003 + FT_A   Unicode Text  
    FT_ASCII       + 0x004 + FT_A   ASCII Text 7-bit  
    FT_UTF_8       + 0x005 + FT_A   Unicode UTF-8 Text  
    FT_MAC       + 0x011 + FT_A   Mac Text  
    FT_TEXT       + 0x012 + FT_A   Text Format (Coding Unknown)  
    FT_HTML       + 0x101 + FT_A   HTML Native (CB/File Type)  
    FT_HTML_CODE       + 0x102 + FT_A   HTML Native Code Only (in Code View)†  
    FT_HTML_CLEANED       + 0x103 + FT_A   HTML Converted (Cleaned)†  
    FT_RTF       + 0x104 + FT_A   Rich Text Format  
    FT_CSS       + 0x105 + FT_A   Cascading Style Sheet  
    FT_LOG       + 0x106 + FT_A   Log File (Text)  
    FT_SASS       + 0x107 + FT_A   Syntactically Awesome Style Sheets  
    FT_SCSS       + 0x108 + FT_A   SASS Cascading Style Sheet  
    FT_WORD       + 0x201   Microsoft Word  
    FT_POWERPOINT       + 0x202   Microsoft PowerPoint  
    FT_PDF       + 0x211   Portable Document Format  
    FT_POSTSCRIPT       + 0x212 + FT_A   Postscript Format  
    FT_WORDPERFECT       + 0x213   WordPerfect  
    FT_PAGEMAKER       + 0x221   Adobe PageMaker*  
    FT_INDB       + 0x222   Adobe InDesign Book (INDB)  
    FT_INDD       + 0x223   Adobe InDesign Document (INDD)  
    FT_IDML       + 0x224   Adobe InDesign XML (IDML)  
    FT_QUARKXPRESS       + 0x223   Quark XPress  
    FT_SEC_MESSAGE       + 0x301 + FT_A   SEC Acceptance/Suspense Message  
  Data (Spreadsheet, etc)          
    FT_TYPE_DATA   0x00002000   Format Indicator Group  
    FT_CSV       + 0x001 + FT_A   CSV  
    FT_DIF       + 0x002   DIF  
    FT_SYLK       + 0x003   SYLK  
    FT_MAP       + 0x004 + FT_A   Visual Studio Map  
    FT_DAT       + 0x005 + FT_A   General Data File (text)  
    FT_XML       + 0x101 + FT_A   XML (non-specific)  
    FT_XSD       + 0x102 + FT_A   XML Style Data (non-specific)  
    FT_NETSCAPE_BOOKMARK       + 0x103 + FT_A   Netscape Bookmark File  
    FT_RSD       + 0x104 + FT_A   Really Simple Discovery XML Data  
    FT_RSS       + 0x105 + FT_A   Really Simple Syndication XML Data  
    FT_DTD       + 0x106 + FT_A   Document Type Definition (SGML)  
    FT_EXCEL       + 0x201   Microsoft Excel  
    FT_XBRL       + 0x301 + FT_A   XBRL File Group Member (mv has file)  
      FT_XBRL_INS               + 0x01000000     – Instance (main)  
      FT_XBRL_SCH               + 0x02000000     – Schema  
      FT_XBRL_CAL               + 0x03000000     – Calculation  
      FT_XBRL_DEF               + 0x04000000     – Definition  
      FT_XBRL_LAB               + 0x05000000     – Label  
      FT_XBRL_PRE               + 0x06000000     – Presentation  
      FT_XBRL_REF               + 0x07000000     – Reference  
    FT_XFR       + 0x302 + FT_A   XBRL Financial Report (XDS/XFR)  
    FT_XFDL       + 0x303 + FT_A   XFDL (EDGAR and Sec16 Filing)  
    FT_XML_SECTION_16       + 0x304 + FT_A   Section 16 XML (EDGAR)  
    FT_XML_FORM_13F       + 0x305 + FT_A   Form 13F XML (EDGAR)  
    FT_XML_FORM_13F_TAB       + 0x306 + FT_A   Form 13F Table XML (EDGAR)  
    FT_XML_FORM_13H       + 0x307 + FT_A   Form 13H XML (EDGAR)  
    FT_XML_FORM_17A       + 0x319 + FT_A   Form X-17A-5 XML (EDGAR)  
    FT_XML_FORM_17H       + 0x31D + FT_A   Form 17H XML (EDGAR)  
    FT_XML_FORM_C       + 0x31A + FT_A   Form C XML (EDGAR)  
    FT_XML_FORM_CFP       + 0x31B + FT_A   Form CFPORTAL XML (EDGAR)  
    FT_XML_FORM_D       + 0x308 + FT_A   Form D XML (EDGAR)  
    FT_XML_FORM_MA       + 0x309 + FT_A   Form MA XML (EDGAR)  
    FT_XML_FORM_N_CEN       + 0x320 + FT_A   Form N-CEN XML (EDGAR)  
    FT_XML_FORM_N_MFP       + 0x310 + FT_A   Form N-MFP XML (EDGAR)  
    FT_XML_FORM_N_MFP1       + 0x31C + FT_A   Form N-MFP1 XML (EDGAR)  
    FT_XML_FORM_N_PORT       + 0x323 + FT_A   Form N-PORT XML (EDGAR)  
    FT_XML_FORM_N_SAR       + 0x311 + FT_A   Form N-SAR XML (EDGAR)*  
    FT_XML_FORM_SDR       + 0x316 + FT_A   Form SDR XML (EDGAR)  
    FT_XML_FORM_SDR_EXHIBIT       + 0x317 + FT_A   Form SDR XML (EDGAR Exhibit)  
      FT_XML_FORM_SDR_EX_A               + 0x01000000     – Exhibit A - Controlling Persons  
      FT_XML_FORM_SDR_EX_B               + 0x01000000     – Exhibit B - Chief Compliance Off  
      FT_XML_FORM_SDR_EX_C               + 0x01000000     – Exhibit C - Director Governors  
      FT_XML_FORM_SDR_EX_G               + 0x01000000     – Exhibit G - Affiliates  
      FT_XML_FORM_SDR_EX_I               + 0x01000000     – Exhibit I - Service Provider Con  
      FT_XML_FORM_SDR_EX_T               + 0x01000000     – Exhibit T - Subscriber Information  
    FT_XML_FORM_TA       Form TA XML (EDGAR, all)  
    FT_XML_EDGAR       + 0x312 + FT_A   EDGARLink Online (EDGAR XML)  
      FT_XML_EDGAR_S16               + 0x0E000000     – EDGARLink Online (Section 16 Only)  
    FT_XML_EDGAR_COMPRESSED       + 0x313   EDGARLink Online (EDGAR Compressed)  
      FT_XML_EDGAR_COM_ELO               + 0x01000000     – Normal EDGAR Link Online  
      FT_XML_EDGAR_COM_13F               + 0x02000000     – Form 13F  
      FT_XML_EDGAR_COM_13H               + 0x04000000     – Form 13H  
      FT_XML_EDGAR_COM_MA               + 0x06000000     – Form MA  
      FT_XML_EDGAR_COM_SDR               + 0x06000000     – Form SDR  
      FT_XML_EDGAR_COM_RGA               + 0x06000000     – Regulation A  
      FT_XML_EDGAR_COM_17A               + 0x07000000     – Form X-17A-5  
      FT_XML_EDGAR_COM_C               + 0x08000000     – Form C  
      FT_XML_EDGAR_COM_CFP               + 0x09000000     – Form CFPORTAL  
      FT_XML_EDGAR_COM_17H               + 0x0A000000     – Form 17H  
      FT_XML_EDGAR_COM_TA               + 0x0B000000     – Form TA  
      FT_XML_EDGAR_COM_CEN               + 0x0C000000     – Form N-CEN  
      FT_XML_EDGAR_COM_NPT               + 0x0D000000     – Form N-PORT  
      FT_XML_EDGAR_COM_S16               + 0x0E000000     – EDGARLink Online (Section 16 Only)  
    FT_XFDL_COMPRESSED       + 0x314 + FT_A   XFDL (EDGAR Compressed)  
    FT_XML_FORM_ABS       + 0x315 + FT_A   Form ABS XML (EDGAR)  
      FT_XML_ABS_AUTOLEASE               + 0x01000000     – Auto Lease  
      FT_XML_ABS_AUTOLOAN               + 0x02000000     – Auto Loan  
      FT_XML_ABS_CMBS               + 0x03000000     – Commercial Mortgage  
      FT_XML_ABS_DS               + 0x04000000     – Debt Securities  
      FT_XML_ABS_RMBS               + 0x05000000     – Residential Mortgage  
      FT_XML_ABS_NOTES               + 0x06000000     – Disclosure Notes (Ex-103)‡  
    FT_XML_REG_A       + 0x318 + FT_A   Regulation A XML (EDGAR)  
    FT_XDS       + 0x401 + FT_A   XML Data Sheet (Data View)*‡  
      FT_XDS_II               + 0x01000000   XML Data Sheet (Mark II)‡  
    FT_XDT       + 0x402 + FT_A   XML Data Template (Data View)‡  
    FT_XFT       + 0x403 + FT_A   XML Forms Template (Forms View)‡  
    FT_NSAR       + 0x501 + FT_A   NSAR Data (answer.fil)*  
    FT_SIF       + 0x502 + FT_A   Submission Information File  
    FT_PDML       + 0x503 + FT_A   EDGARizer Project File (PDML)  
  Images          
    FT_TYPE_IMAGE   0x00003000   Format Indicator Group  
    FT_BITMAP       + 0x001   Bitmap  
    FT_DIB       + 0x002   Device Independent Bitmap  
    FT_META       + 0x003   Windows Meta  
    FT_ENHMETA       + 0x004   Windows Enhanced Meta  
    FT_GIF       + 0x005   Graphics Interchange Format  
    FT_JPEG       + 0x006   JPEG Image Format  
    FT_PNG       + 0x007   Portable Network Graphic  
    FT_TIFF       + 0x008   Tag Image Format  
    FT_PCX       + 0x009   Quick Draw Mac  
    FT_EXIF       + 0x00A   Exchangeable Image File Format (EXIF)  
    FT_EMZ       + 0x00B   Compressed Windows Enhanced Meta  
    FT_EPS       + 0x00C   Encapsulated Postscript  
    FT_ICON       + 0x010   Icon  
  Multi Media          
    FT_TYPE_MEDIA   0x00004000   Format Indicator Group  
    FT_AVI       + 0x001   Audio Video  
    FT_FLASH       + 0x002   Flash (Shockwave)  
    FT_MIDI       + 0x003   MIDI File  
    FT_MOVIE       + 0x004   Movie  
    FT_MP3       + 0x005   MPEG-1 Audio Layer 3  
    FT_WAVE       + 0x006   Wave  
    FT_WMA       + 0x007   Windows Media Player  
    FT_FLAC       + 0x008   Free Lossless Audio Codec  
    FT_OBJECT       + 0x009   Generic Object  
  Exchange/Server Types          
    FT_TYPE_EXCHANGE   0x00005000   Format Indicator Group  
    FT_FILES       + 0x001   Files/Directories  
    FT_DROP       + 0x002   Dropped Files/Object  
    FT_ZIP       + 0x003   Zipped/Compressed  
    FT_BAK       + 0x004   Backup File  
    FT_MHT       + 0x005 + FT_A   Mime Encoded HTML File  
    FT_MHT_EXTRACTED       + 0x007   Mime Encoded HTML File (extracted)  
    FT_FOLDER       + 0x006   Folder Only (pseudo type)  
    FT_FOLDER_UP       + 0x007   Folder Only Up (pseudo type)  
    FT_GZIP       + 0x008   GZip Compressed  
    FT_HTTP       + 0x103   Web HTTP  
    FT_HTTPS       + 0x104   Web HTTPS  
    FT_FTP       + 0x105   Web FTP  
    FT_MAIL       + 0x106   Web Mail  
    FT_GFBINARY       + 0x201   GoFiler Binary File (generic)  
    FT_PDFZONE       + 0x208   PDF Zoning File  
    FT_XML_LOG_DATA       + 0x210 + FT_A   XML Log Data (Info View)  
  Program/Script Types          
    FT_TYPE_PROGRAM   0x00006000   Format Indicator Group  
    FT_TYPE_PROGRAM_BINARY   0x00000100   Binary File Group (flag)  
    FT_BATCH       + 0x001 + FT_A   Batch File (MSDOS, Command)  
    FT_C       + 0x002 + FT_A   C  
    FT_C_PLUSPLUS       + 0x003 + FT_A   C++  
    FT_C_HEADER       + 0x004 + FT_A   C Header  
    FT_C_SHARP       + 0x005 + FT_A   C#  
    FT_JAVA       + 0x006 + FT_A   Java Application  
    FT_JAVASCRIPT       + 0x007 + FT_A   JavaScript/JScript  
    FT_PERL       + 0x008 + FT_A   Perl Script  
    FT_PHP       + 0x009 + FT_A   PHP: Hypertext Preprocessor  
    FT_VBSCRIPT       + 0x00A + FT_A   Visual Basic Script  
    FT_SQL       + 0x00D + FT_A   Structured Query Language  
    FT_RESOURCE_SCRIPT       + 0x00E + FT_A   Resource Script (windows)  
    FT_ERB       + 0x00F + FT_A   Ruby on Rails  
    FT_LEGATO       + 0x010 + FT_A   Legato Script (MS or LS)  
    FT_LEGATO_C       + 0x011   Legato Script Crunched/Encrypted  
    FT_EXE       + 0x10B   Executable  
    FT_DLL       + 0x10C   Executable Extension  
  Project/File List Types          
    FT_TYPE_PROJECT   0x00007000   Format Indicator Group  
    FT_PRIME_PROJECT       + 0x001 + FT_A   Prime Project File  
    FT_EDGAR_FLASH_PROJECT       + 0x101 + FT_A   EDGAR Flash Project File  
    FT_HTML_EASE_PROJECT       + 0x102 + FT_A   HTML Ease Project File  
    FT_EDGAR_EASE_PROJECT       + 0x103 + FT_A   EDGAR Ease Project File  
    FT_GOFILER_PROJECT       + 0x104 + FT_A   GoFiler Project File (v 1.x & 2.x)  
    FT_GOFILER_PROJECT_3X       + 0x105 + FT_A   GoFiler Project File (v 3.x)  
      FT_GFP_3X_ELO               + 0x01000000     – Normal EDGAR Link Online  
      FT_GFP_3X_13H               + 0x02000000     – Form 13H  
      FT_GFP_3X_13F               + 0x03000000     – Form 13F  
      FT_GFP_3X_MA               + 0x04000000     – Form MA  
      FT_GFP_3X_SDR               + 0x06000000     – Form SDR  
      FT_GFP_3X_RGA               + 0x06000000     – Regulation A  
      FT_GFP_3X_17A               + 0x07000000     – Form X-17A-5  
      FT_GFP_3X_C               + 0x08000000     – Form C  
      FT_GFP_3X_CFP               + 0x09000000     – Form CFPORTAL  
      FT_GFP_3X_17H               + 0x0A000000     – Form 17H  
      FT_GFP_3X_TA               + 0x0B000000     – Form TA  
      FT_GFP_3X_CEN               + 0x0C000000     – Form N-CEN  
      FT_GFP_3X_NPT               + 0x0D000000     – Form N-PORT  
      FT_GFP_3X_S16               + 0x0E000000     – EDGARLink Online (Section 16 Only)  
    FT_MSVS_PROJECT       + 0x201 + FT_A   MS Visual Studio Project  
    FT_MP3_PLAYLIST       + 0x301 + FT_A   MP3 Playlist  
    FT_CAB_FILELIST       + 0x401   File List for Install  
    FT_MSI_FILELIST       + 0x402   File List for Install  
    FT_SEC_COMPOSITE_EDGAR       + 0x501 + FT_A   SEC Composite/PDS Archive File  
    FT_SEC_RETURN_COPY       + 0x502 + FT_A   SEC Return Copy  
    FT_WINDOWS_LINK       + 0x601   Windows Shortcut (Link)  
    FT_PSG_EDIT_OBJECT       + 0x602   Edit Object†  
  Places (Drives, etc)†          
    FT_TYPE_PLACE   0x00008000   Format Indicator Group  
    FT_CLOUD       + 0x001   Cloud (VFC or other)  
    FT_COMPUTER       + 0x002   Computer (local drives)  
    FT_DESKTOP       + 0x003   User Desktop  
    FT_LIBRARIES       + 0x004   User Libraries  
    FT_LOCAL_CD_DVD       + 0x005   Local CD/DVD Disk  
    FT_LOCAL_CLOUD       + 0x006   Local Attached to Cloud  
    FT_LOCAL_DISK       + 0x007   Local Fixed Disk  
    FT_LOCAL_FLOPPY       + 0x008   Local Floppy Disk  
    FT_LOCAL_NETWORK       + 0x009   Local Network Mapped Drive  
    FT_LOCAL_RAMDISK       + 0x00A   Local RAM Drive  
    FT_LOCAL_REMOVABLE       + 0x00B   Local Flash Drive  
    FT_MY_DOCUMENTS       + 0x00C   User "My Documents"  
    FT_NETWORK       + 0x00D   Unmapped Network Places  
    FT_PROJECT       + 0x00E   Application Recent Projects  
    FT_RECENT       + 0x00F   Application Recent non-Projects  

 

* Discontinued or deprecated.

† Pseudo type.

‡ There is no standard XML schema. This type only matches data generated by Novaworks applications.
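The "+ offset + FT_A" notation in the table can be read as a sum. The Python sketch below works through a few entries on that reading. It assumes each offset is added to the group indicator at the top of its block and that an indented variation row is added to its parent entry. The make_code helper is hypothetical and is not a Legato function.

```python
# Values as listed in the table above.
FT_A = 0x00000800
FT_TYPE_TEXT = 0x00001000
FT_TYPE_DATA = 0x00002000
FT_TYPE_IMAGE = 0x00003000


def make_code(group, ordinal, text=False, variation=0):
    """Hypothetical helper: sum the parts of one table entry."""
    code = group + ordinal + variation
    if text:
        code += FT_A  # entry is marked "+ FT_A" in the table
    return code


FT_HTML = make_code(FT_TYPE_TEXT, 0x101, text=True)
FT_PNG = make_code(FT_TYPE_IMAGE, 0x007)
FT_XBRL = make_code(FT_TYPE_DATA, 0x301, text=True)
# Variation row "FT_XBRL_SCH + 0x02000000" applied on top of FT_XBRL.
FT_XBRL_SCH = make_code(FT_TYPE_DATA, 0x301, text=True, variation=0x02000000)

for name, code in [("FT_HTML", FT_HTML), ("FT_PNG", FT_PNG),
                   ("FT_XBRL", FT_XBRL), ("FT_XBRL_SCH", FT_XBRL_SCH)]:
    print(f"{name:12} 0x{code:08X}")
```

Under these assumptions the script prints 0x00001901, 0x00003007, 0x00002B01 and 0x02002B01 respectively.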

 

6.10.3 Content Types

Related to file types are content types, also known as MIME content types. Legato supports a limited number of translations between file types and content types.
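Conceptually, such a translation is a small lookup table in each direction. The Python sketch below models that idea only. The MIME strings are the standard registered names, but which pairs Legato actually supports is not stated here, so the table contents, the to_content_type and to_file_type helpers, and the composed code values are all assumptions made for illustration.

```python
FT_TYPE_MASK = 0x0000FFFF
FT_A = 0x00000800
FT_UNKNOWN = 0x00000000

# Codes composed from the table in section 6.10.2 (group + offset [+ FT_A]).
FT_HTML = 0x1000 + 0x101 + FT_A
FT_CSS = 0x1000 + 0x105 + FT_A
FT_PDF = 0x1000 + 0x211
FT_JPEG = 0x3000 + 0x006
FT_PNG = 0x3000 + 0x007
FT_ZIP = 0x5000 + 0x003

# Illustrative subset; not a list of what Legato supports.
CONTENT_TYPES = {
    FT_HTML: "text/html",
    FT_CSS: "text/css",
    FT_PDF: "application/pdf",
    FT_JPEG: "image/jpeg",
    FT_PNG: "image/png",
    FT_ZIP: "application/zip",
}


def to_content_type(code):
    """Return the MIME type for a code, or None if there is no translation."""
    return CONTENT_TYPES.get(code & FT_TYPE_MASK)


def to_file_type(content_type):
    """Return the code for a MIME type, or FT_UNKNOWN if none matches."""
    # Drop parameters such as "; charset=utf-8" before matching.
    base = content_type.split(";")[0].strip().lower()
    for code, mime in CONTENT_TYPES.items():
        if mime == base:
            return code
    return FT_UNKNOWN


print(to_content_type(FT_PDF))                       # application/pdf
print(hex(to_file_type("text/html; charset=utf-8")))  # 0x1901
```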

6.10.4 File Type Functions

File Type Testers:

GetFileTypeCode — Gets the file type by file content or extension and returns the type as a code dword.

GetFileTypeString — Gets the file type by file content or extension and returns the type as a string.

GetFileTypeData — Gets file type and data characteristics.

File Type Code Support:

FileTypeCodeToString — Translates a file type code dword to a string.

FileTypeStringToCode — Translates a file type code string to a dword.

Content Type:

ContentTypeToFileType — Translates a MIME content type to a file type.

FileTypeToContentType — Translates a file type or code to a MIME content type.