Changelog

v4.40.0 - 2026-03-31

Bugfixes

  • Updated the vivisect dependency used for LNK and Mach-O parsing.

v4.39.0 - 2026-03-26

Features and Enhancements

  • Updated the pillow library used for image parsing.

v4.38.1 - 2026-03-09

Bugfixes

  • Fixed an issue where the ZIP parser computed modified timestamps differently based on host timezone. The same archive now yields consistent epoch values across environments by treating the ZIP timestamp as UTC.
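To illustrate the timestamp fix above, here is a minimal sketch of converting a ZIP entry's modification time to epoch seconds by reading it as UTC. This is not the service's actual code; the helper name zip_mtime_epoch is hypothetical and only the Python standard library is assumed.

```python
import calendar
import zipfile


def zip_mtime_epoch(info: zipfile.ZipInfo) -> int:
    """Return the entry's modified time as epoch seconds, read as UTC."""
    # ZipInfo.date_time is (year, month, day, hour, minute, second) and
    # carries no timezone. calendar.timegm() interprets the tuple as UTC,
    # whereas time.mktime() would apply the host's local timezone and give
    # different results on different machines.
    return calendar.timegm(info.date_time)
```

With this approach the same archive entry maps to the same epoch value regardless of the TZ setting of the host that parses it.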

v4.38.0 - 2026-02-07

Features and Enhancements

  • Updated the service to build from Synapse v2.233.0.

v4.37.1 - 2025-12-10

Bugfixes

  • Updated HTML parser to strip leading/trailing whitespace when extracting inet:url nodes.

v4.37.0 - 2025-11-14

Features and Enhancements

  • Added checks to prevent reuploading a file that already exists in the configured Axon.

v4.36.0 - 2025-08-22

Features and Enhancements

  • Updated the service to build from Synapse v2.219.0.

Bugfixes

  • Fixed an issue where the RFC822 parser could fail to parse emails with attachments but no text/plain or text/html body.
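The attachment-only case described above can be sketched with the standard library email package. This is an illustration under stated assumptions, not the parser's implementation; the function name summarize and the returned dictionary layout are hypothetical.

```python
from email.message import EmailMessage


def summarize(msg: EmailMessage) -> dict:
    """Collect the body text (if any) and the attachment filenames."""
    # get_body() returns None when the message has no text/html or
    # text/plain part, so that case must be handled explicitly.
    body = msg.get_body(preferencelist=("html", "plain"))
    text = body.get_content() if body is not None else None

    # Attachments are still enumerated even when there is no body.
    names = [part.get_filename() for part in msg.iter_attachments()]
    return {"body": text, "attachments": names}
```

The key point is that a missing body is treated as a normal condition rather than an error, so the attachments are still processed.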

v4.35.0 - 2025-08-01

Features and Enhancements

  • Updated X509 parser to add the application/x-x509-cert mime type to file:bytes nodes that are detected to be an X509 certificate but don’t include any X509 extensions.

  • Added the SHA256 of the file currently being processed to the beginning of warning messages.

  • Updated the service to build from Synapse v2.218.1.

v4.34.0 - 2025-06-13

Features and Enhancements

  • Added free disk space checking to the archive parsers to gracefully handle low disk space conditions.

  • Updated the service to build from Synapse v2.213.0.

v4.33.1 - 2025-05-15

Bugfixes

  • Fixed an issue where the Yara parser could fail to parse some valid rules.

  • Fixed an issue where the Yara parser could fail to include module imports when extracting rules.

v4.33.0 - 2025-04-23

Features and Enhancements

  • Added parsers for JSON and JSON lines so that individual keys and values are scraped.

v4.32.0 - 2025-03-14

Features and Enhancements

  • Updated parsers to always deduplicate scraped values that are modeled as references from the file.

  • Added a configurable max line size, with a default of 100MiB, to the text parser. Lines over this size will be broken up into chunks for parsing.

  • Updated the HTML URL parsing to better handle mailto: and tel: URIs.
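The max line size behavior in the text parser entry above can be sketched as follows. This is a simplified illustration, not the parser's code: the function name iter_chunks is hypothetical, and the sketch counts characters, while the service may measure the limit in bytes.

```python
from typing import Iterator


def iter_chunks(line: str, maxsize: int) -> Iterator[str]:
    """Yield consecutive pieces of line, each at most maxsize long."""
    if maxsize <= 0:
        raise ValueError("maxsize must be positive")
    # Step through the line in fixed-size strides; the final chunk may be
    # shorter than maxsize. An empty line yields nothing.
    for offs in range(0, len(line), maxsize):
        yield line[offs:offs + maxsize]
```

A line shorter than the limit comes back as a single chunk, so ordinary text is unaffected.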

v4.31.0 - 2025-02-07

Features and Enhancements

  • Added support for setting the :name property on an archive entry's file if it is unset.

  • Added a fileparser.cancel command to cancel a running task, with an optional timeout parameter for the command.

Bugfixes

  • Fixed an issue where fileparser tasks would not cancel when the max queue size was reached, preventing future tasks from being started.

v4.30.0 - 2025-01-24

Features and Enhancements

  • Update the service to build from Synapse v2.195.0.

v4.29.0 - 2025-01-17

Features and Enhancements

  • Updated the email parser to also populate the inet:email:message:id property.

  • Updated deprecated $lib.list() usage to JSON style syntax.

  • Updated fileparser.parse command definition to remove deprecated usage of the forms key.

v4.28.1 - 2025-01-03

Bugfixes

  • Restrict CSV delimiter detection to the first 30 lines to prevent MIME detection from hanging.
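A minimal sketch of bounded delimiter detection, assuming the standard library csv.Sniffer is an acceptable stand-in for whatever detection the service uses. The names detect_delimiter and MAX_SNIFF_LINES and the candidate delimiter set are hypothetical.

```python
import csv
from itertools import islice
from typing import Iterable, Optional

MAX_SNIFF_LINES = 30  # matches the 30-line limit described above


def detect_delimiter(lines: Iterable[str],
                     candidates: str = ",;\t|") -> Optional[str]:
    """Guess a CSV delimiter from at most the first 30 lines."""
    # islice() stops after MAX_SNIFF_LINES items, so the cost is bounded
    # no matter how large the input is. Lines are assumed to keep their
    # trailing newline characters.
    sample = "".join(islice(lines, MAX_SNIFF_LINES))
    if not sample:
        return None
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # The sniffer could not settle on a delimiter.
        return None
```

Capping the sample keeps detection time independent of file size, which is the property the fix is after.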

v4.28.0 - 2024-12-02

Features and Enhancements

  • Added a default 1MB size limit for email body parsing.

  • Update the service to build from Synapse v2.190.0.

v4.27.0 - 2024-10-29

Features and Enhancements

  • Update the GZip parser to make file:archive:entry nodes instead of file:subfile nodes when parsing archives.

Bugfixes

  • Use updated APIs to extract timestamps from X509 certificates to remove deprecation warnings.

v4.26.1 - 2024-10-04

Bugfixes

  • The Synapse-FileParser now produces a specific warning message when invalid sha256 values are provided to it.

v4.26.0 - 2024-09-27

Features and Enhancements

  • Add support for parsing Microsoft Shortcut (also called LNK) files.

Bugfixes

  • Add JSON and JSONLINES file type detection to deal with performance issues in CSV detection.

v4.25.0 - 2024-08-20

Features and Enhancements

  • Update the service to build from Synapse v2.178.0.

  • The X509 parser no longer creates refs light edges between the source file:bytes node and the crypto:x509:cert node. These are already linked by the crypto:x509:cert:sha256 property.

v4.24.0 - 2024-08-05

Features and Enhancements

  • Replace plyara usage in the YARA parser with a new Lark parser.

v4.23.0 - 2024-07-11

Bugfixes

  • Remove the version information from the FileParser meta:source:name property.

  • When parsing files with unknown mime types, decrease the timeout used when attempting to identify if a file is a YARA rule.

v4.22.0 - 2024-06-14

Features and Enhancements

  • When parsing 7ZIP and ZIP archives, automatically try a few common passwords when the archive is password protected and no password is provided. The common passwords are currently infected, infected666, password123, and malware.
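The password fallback above amounts to trying a short, ordered candidate list. The sketch below shows only that loop and is not the service's code. The name find_password and the try_open callback (which should return True when the archive opens with the given password) are hypothetical.

```python
from typing import Callable, Iterable, Optional

# The common passwords listed in the entry above, in the same order.
COMMON_PASSWORDS = ("infected", "infected666", "password123", "malware")


def find_password(try_open: Callable[[str], bool],
                  candidates: Iterable[str] = COMMON_PASSWORDS) -> Optional[str]:
    """Return the first candidate accepted by try_open, or None."""
    for passwd in candidates:
        # try_open is supplied by the caller and wraps the archive
        # library's own open/test call.
        if try_open(passwd):
            return passwd
    return None
```

Keeping the archive-specific logic in the callback lets the same loop serve both 7ZIP and ZIP handling.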

v4.21.0 - 2024-05-24

Features and Enhancements

  • Move exported Storm API documentation to the package definition.

v4.20.0 - 2024-03-29

Features and Enhancements

  • Set the text property to the OCR output for MIME specific forms that inherit from file:mime:image.

  • Adjust Tesseract OCR options to reduce noisy output, and expose them in the image parser confdef.

v4.19.0 - 2024-03-26

Features and Enhancements

  • Update the service to build from Synapse v2.165.0.

v4.18.0 - 2024-03-01

Features and Enhancements

  • Update $lib.bytes usage with $lib.axon APIs.

  • Update the service to build from Synapse v2.164.0.

Bugfixes

  • Fix an issue where mime detection could take several minutes for files that appeared to be text.

  • Remove a container build artifact at /tmp/parsetab.py which caused warnings upon startup.

v4.17.0 - 2024-02-06

Features and Enhancements

  • Update 7zip and Zip parsers to use the new file:archive:entry model elements.

  • Add fileparser.text command to output text extracted from file samples.

Bugfixes

  • Fix a typo in the fileparser.strings command help where --filters was listed as --filter.

  • Always show 7zip warnings, even if --debug is not specified.

  • Fix an issue where image OCR would output duplicative text.

  • Fix an issue where image parsing could fail to set node properties.

v4.16.0 - 2023-12-01

Features and Enhancements

  • Update the pillow library used for image EXIF data processing.

v4.15.0 - 2023-11-15

Features and Enhancements

  • Update the service to build from Synapse v2.154.1.

Bugfixes

  • Fix an issue where the parent file:bytes node did not have a seen edge to the meta:source node.

v4.14.0 - 2023-10-13

Features and Enhancements

  • Update the service to build from Synapse v2.151.0.

Bugfixes

  • Fix an issue where SynErr exceptions did not have their messages represented properly.

v4.13.1 - 2023-09-14

Bugfixes

  • Fix an issue where the file MIME could be set to application/pdf before it was successfully opened as a PDF file.

  • Fix an issue where the X.509 parser could not handle serial numbers larger than 159 bits.

v4.13.0 - 2023-08-21

Features and Enhancements

  • Update the service to build from Synapse v2.144.0.

v4.12.0 - 2023-08-11

Features and Enhancements

  • Add support for parsing RAR files.

v4.11.0 - 2023-08-04

Features and Enhancements

  • Add string parsing support to the exe mime parser.

  • Update the fileparser.strings command to add a --filters argument. This can be used to filter the strings found via regular expressions.

  • Add mime parser configuration options via the --conf command line option to fileparser.parse.

  • Add byte-swapping detection and handling to the C2 config parser.

  • Document exported Storm APIs.

  • Update the service to build from Synapse v2.143.0.

v4.10.0 - 2023-07-07

Features and Enhancements

  • Update the service to build from Synapse v2.141.0.

v4.9.0 - 2023-06-13

Features and Enhancements

  • Populate the rsa:key:bits property when parsing X509 certificates.

v4.8.0 - 2023-05-19

Features and Enhancements

  • Add a --no-recurse option to prevent subparsing files.

  • Update the service to build from Synapse v2.134.0.

Bugfixes

  • Fix an issue which could cause a worker process to hang.

v4.7.1 - 2023-05-11

Bugfixes

  • Fix a permission issue that prevented the service from starting when run as a non-root user.

v4.7.0 - 2023-05-09

Features and Enhancements

  • Add support for parsing CobaltStrike C2 configuration profiles, with auto-detection using bundled YARA rules.

  • Add support for zip files that utilize the WinZIP-style AES encryption feature.

  • Add conf option to CSV parser to disable scrape.

  • Add the filename in the subparser init message info if available.

  • Add support for parsing MS Outlook files.

  • Update the service to build from Synapse v2.133.0.

  • Disable debug and info logging for plyara.core messages.

v4.6.0 - 2023-02-02

Features and Enhancements

  • Improve handling of Mach-O file parsing.

  • Update the service to build from Synapse v2.122.0.

Bugfixes

  • Java class files with the 0xCAFEBABE magic string were incorrectly identified as Mach-O binaries. These files are now correctly identified.

  • Fix an issue with newline handling when extracting YARA rules from text.

  • Fix the HTMLtoJSON example in the documentation.
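The 0xCAFEBABE collision mentioned in the Java class entry above arises because fat (universal) Mach-O binaries and Java class files share the same four magic bytes. One widely used heuristic, shown here as a sketch and not confirmed to be what the service does, inspects the next four bytes: a fat Mach-O header stores a small architecture count there, while a Java class file stores its minor and major version, and Java major versions start at 45. The function name classify_cafebabe and its return labels are hypothetical.

```python
import struct

CAFEBABE = 0xCAFEBABE
MIN_JAVA_MAJOR = 45  # lowest major version used by Java class files


def classify_cafebabe(header: bytes) -> str:
    """Classify a file header that may start with 0xCAFEBABE."""
    if len(header) < 8:
        return "unknown"
    # Both formats are big-endian: the magic, then a 32-bit field.
    magic, field = struct.unpack(">II", header[:8])
    if magic != CAFEBABE:
        return "unknown"
    # Fat Mach-O: field is the architecture count (a small number).
    # Java class: field is (minor << 16) | major, which is at least 45.
    return "macho-fat" if field < MIN_JAVA_MAJOR else "java-class"
```

This check is a heuristic only; a robust detector would also validate the structures that follow the header.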

v4.5.1 - 2022-12-05

Bugfixes

  • Fix a packaging issue.

v4.5.0 - 2022-12-05

Features and Enhancements

  • Add boot hooks to the container entrypoint. Move the entrypoint script to /vertex/synapse/entrypoint.sh.

  • Temporary file usage, which can occur when retrieving a file from the Axon, is now stored in /vertex/storage/tmp.

v4.4.0 - 2022-10-09

Features and Enhancements

  • Update the service to build from Synapse v2.110.0.

v4.3.0 - 2022-09-08

Features and Enhancements

  • Attempt to continue parsing after encountering an invalid X.509 extension.

  • Add iterText() function to the Storm module API.

v4.2.1 - 2022-08-31

Bugfixes

  • Address an internal CI configuration issue.

v4.2.0 - 2022-08-31

Features and Enhancements

  • Rename the Storm package to synapse-fileparser.

v4.1.0 - 2022-08-29

Features and Enhancements

  • Add support for parsing Tar archives.

  • Add support for parsing MBOX files.

  • Parse crypto:x509:cert:serial values as hex values.

Bugfixes

  • Update Pillow and lxml libraries.

v4.0.0 - 2022-06-01

Features and Enhancements

  • Update permissions to use power-ups.fileparser.user instead of requiring asroot.

  • Move fileparser.ext Storm APIs into main fileparser module.

  • Remove deprecated fileparser.wget command (use wget | -> file:bytes | fileparser.parse instead).

  • Updated node action to use wget.

Documentation

  • Update documentation for AHA provisioning.

v3.17.0 - 2022-05-17

Features and Enhancements

  • Update to the newest Synapse v2.93.0 to support AHA provisioning.

  • Parse additional metadata fields from MS office document formats.

Bugfixes

  • Load the Storm package read-only to allow containers to run as a non-root user.

  • Fix help output visibility in Optic.

v3.16.0 - 2022-03-31

Features and Enhancements

  • Add support for parsing YARA rules without automatically creating nodes.

v3.15.0 - 2022-03-28

Features and Enhancements

  • Add support for recognizing and extracting contents from 7zip archives.

  • The fileparser.parse --passwd option is now applied to ZIP archives when parsing them.

  • Add additional parsing of HTML attributes to identify more inet:url, inet:email, and tel:phone values.

v3.14.0 - 2022-03-14

Features and Enhancements

  • Removed unit tests from package distribution to avoid security alerts on test files.

v3.13.0 - 2022-01-04

Features and Enhancements

  • Support extracting GZIP archives and recursively parsing the contents.

v3.12.0 - 2021-10-28

Features and Enhancements

  • Support passing HTMLtoJSON configuration to RFC822 body parsing.

  • Embed some of the documentation into the Storm Package delivered by the Storm Service so it is available in Optic.

Bugfixes

  • Address an internal CI configuration issue.

  • Fix exception handling in X.509 parser due to change in cryptography library.

v3.11.1 - 2021-09-24

Bugfixes

  • Fix an issue with excessive logging while parsing Yara rules.

v3.11.0 - 2021-09-13

Features and Enhancements

  • Expand Windows PE file parsing to create file:mime:pe:export, file:mime:pe:resource, file:mime:pe:section, and file:mime:pe:vsvers:info nodes, and to populate the :mime:pe:exports:time, :mime:pe:exports:libname, :mime:pe:richdr, and :mime:pe:size properties on file:bytes nodes.

  • Add a groups type to the htmltojson API. This is useful for selecting groups of siblings as a list, since CSS selectors do not allow selecting siblings of an element.

Bugfixes

  • Fix an issue where the Windows PE imphash was not being set.

v3.10.1 - 2021-08-25

Bugfixes

  • Fix recursion error in parser teardown on early exit.

v3.10.0 - 2021-08-23

Features and Enhancements

  • Improve Storm warn message when parsing an unsupported olefile.

  • Add findall, strip, and split to HTMLtoJSON template.

  • Add HTMLtoJSON template documentation.

  • Support codec:errors configuration for CSV parser.

Bugfixes

  • Fix race conditions in timeout-based tests.

v3.9.0 - 2021-07-20

Features and Enhancements

  • Update the service to use tini as a container entrypoint.

v3.8.0 - 2021-07-14

Features and Enhancements

  • Update the service to ensure that the meta:source node is always made in the current View prior to creating any nodes which may be linked to it via a light edge. This removes the service onload event as a result.

  • Mark the Storm command fileparser.wget as deprecated. Users should move to using the Synapse wget command.

Bugfixes

  • Correct the Storm command form hinting to correctly represent nodes which may be yielded from commands.

v3.7.0 - 2021-06-21

Features and Enhancements

  • Update the service to build from Synapse v2.43.0.

v3.6.0 - 2021-06-09

Features and Enhancements

  • Wait for a free worker instead of aborting parse operation. (#78)

  • Set default max workers to number of CPUs. (#78)

  • Parse email body as HTML if available. (#77)

  • Improve HTML MIME detection. (#77)

  • Support running HTMLtoJSON within HTML parser. (#76)

  • Populate EXIF forms. (#70)

v3.5.0 - 2021-05-17

The minimum required Cortex version for using this version of the Synapse Fileparser is v2.38.0.

Features and Enhancements

  • Parse X509 certificate chains from X509 files. (#67)

  • Set hash properties on file:bytes nodes when they previously were not set. (#73)

  • Update the service to build from Synapse v2.38.0.

Bugfixes

  • Fix an issue where the Aha client was not torn down properly in spawned processes. (#72)

Improved Documentation

  • Improve the overall documentation for the available parsers. (#71)

v3.4.0 - 2021-04-26

Features and Enhancements

  • Update the service to include information for the getCellInfo() API.

Bugfixes

  • Fix the version information being returned by the Storm Service. Previously it was returned as a string, instead of a tuple. (#68)

v3.3.0 - 2021-04-19

Features and Enhancements

  • Add fileparser.status command with detailed debugging information. (#63)

  • Update default timeout to None (“last seen” can be inspected with fileparser.status). (#63)

  • For X.509 parsing, try matching crypto:x509:cert by SHA-256 before creating new node. (#64)

  • Add --parser-timeout to wget command. (#65)

Bugfixes

  • Fix active job tracking when a job exits early. (#63)

  • Prevent service from starting if no Axon is configured. (#63)

  • Remove inet:email:message refs (they are directly pivotable from file:bytes). (#64)

  • Fix parsing YARA rules that contain an import. (#66)

v3.2.0 - 2021-03-22

Features and Enhancements

  • Add RTF parser. (#60)

  • Add Storm function to fire parsed text to a consumer. (#61)

  • Assign additional mimes and aliases to zip parser. (#62)

  • Emit subparser start message for easier debugging. (#62)

Bugfixes

  • Fix RTD documentation builds. (#62)

v3.1.0 - 2021-03-02

Features and Enhancements

  • Populate file:subfile:path when parsing subfiles. (#58)

  • Parse multipart email attachments. (#57)

  • Add default size to fileparser.strings help. (#57)

  • Improve the RFC822 mime detection for parsing email files. (#59)

Bugfixes

  • Fix handling of HTML titles that return None. (#57)

v3.0.1 - 2021-01-28

Bugfixes

  • Fix private PyPI setup in CI. (#56)

v3.0.0 - 2021-01-27

Features and Enhancements

  • Add new MS Office parser which extracts text and images from Powerpoint, Excel, and Word files.

  • Add new image parser which extracts EXIF data and parses text using OCR.

  • Add new Zip file parser which extracts subfiles and parses them individually.

  • Add new executable parser which will set mime:pe properties and extract certificates.

  • Add new Yara parser which will extract rules from a file and create it:app:yara:rule nodes.

  • Add new parsers for CSV files and XML files.

  • Significantly broaden and improve MIME detection.

  • Add a command to brute-force detect ASCII strings in a file.

  • Add ability to unlock password protected PDF files.

  • Run parsing operations in subprocesses to improve performance.

  • The full changeset for this update can be found in #51.

v2.5.0 - 2021-01-08

Features and Enhancements

  • Build new Docker tags for the latest release in a given major version. For example, this adds the v2.x.x Docker tag. (#52)

  • Remove -noproxy flag to ensure that network traffic goes through the configured proxy, if set. (#49)

Bugfixes

  • Populate file:bytes secondary properties when missing. (#48)

v2.4.0 - 2020-12-03

Features and Enhancements

  • Update to Synapse v2.12.3.

v2.3.0 - 2020-09-28

Features and Enhancements

  • Populate available parsed data on the inbound file:bytes node (including mime:x509:cn). (#41)

Bugfixes

  • Fix parse.bysha256 success check when multiple inbound nodes provided. (#40)

  • Fix parse.byurl yield output when multiple inbound nodes provided. (#43)

  • Fix malformed HTML test for BeautifulSoup4 4.9.2. (#44)

v2.2.3 - 2020-08-14

Bugfixes

  • Improve parse.byurl yield behavior. By default this command will output the file:bytes node instead of the inbound node, and if --yield is specified the nodes from the sub-parsers will be output (e.g. crypto:x509:cert nodes if the downloaded file used the X.509 parser). (#39)

v2.2.2 - 2020-08-05

Features and Enhancements

  • Update the minimum required Synapse version for building (not using) the Synapse Fileparser image to v2.5.1.

v2.2.0 - 2020-08-05

Features and Enhancements

  • Move inet:email:headers to be an array property on the inet:email:message node, instead of using refs light edges. This requires the Cortex connected to the service to be Synapse version v2.4.0 or greater. (#38)

v2.1.1 - 2020-07-20

Bugfixes

  • Fix issue where SOCKS proxy credentials were not being parsed. (#37)

v2.1.0 - 2020-07-01

Features and Enhancements

  • Update fileparser to have consistent --yield mechanics. (#35)

Improved Documentation

  • Add Initial Documentation. (#34)

v2.0.6 - 2020-06-11

Features and Enhancements

  • Parse emails with [@] defanging. (#33)
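As a rough illustration of handling [@] defanging, the sketch below restores the at sign before matching addresses. It is not the service's scraper: the function name scrape_emails is hypothetical and the regular expression is a deliberately simple approximation of an email address.

```python
import re
from typing import List

# Simplified address pattern for illustration; real scrapers are stricter.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scrape_emails(text: str) -> List[str]:
    """Find email addresses, accepting '[@]' in place of '@'."""
    # Undo the defanging first so one pattern covers both forms.
    refanged = text.replace("[@]", "@")
    return EMAIL_RE.findall(refanged)
```

Normalizing before matching means defanged and ordinary addresses in the same text are found in a single pass.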

Bugfixes

  • Use DER SHA256 for crypto:x509:cert guid. (#33)

v2.0.5 - 2020-06-08

Features and Enhancements

  • Initial release of the Synapse Fileparser.