Changelog

v4.20.0 - 2024-03-29

Features and Enhancements

  • Set the text property to the OCR output for MIME-specific forms that inherit from file:mime:image.

  • Adjust Tesseract OCR options to reduce noisy output, and expose them in the image parser confdef.

v4.19.0 - 2024-03-26

Features and Enhancements

  • Update the service to build from Synapse v2.165.0.

v4.18.0 - 2024-03-01

Features and Enhancements

  • Replace $lib.bytes usage with $lib.axon APIs.

  • Update the service to build from Synapse v2.164.0.

Bugfixes

  • Fix an issue where mime detection could take several minutes for files that appeared to be text.

  • Remove a container build artifact at /tmp/parsetab.py which caused warnings upon startup.

v4.17.0 - 2024-02-06

Features and Enhancements

  • Update 7zip and Zip parsers to use the new file:archive:entry model elements.

  • Add fileparser.text command to output text extracted from file samples.

Bugfixes

  • Fix a typo in the fileparser.strings command help where --filters was listed as --filter.

  • Always show 7zip warnings, even if --debug is not specified.

  • Fix an issue where image OCR would output duplicative text.

  • Fix an issue where image parsing could fail to set node properties.

v4.16.0 - 2023-12-01

Features and Enhancements

  • Update the pillow library used for image EXIF data processing.

v4.15.0 - 2023-11-15

Features and Enhancements

  • Update the service to build from Synapse v2.154.1.

Bugfixes

  • Fix an issue where the parent file:bytes node did not have a seen edge to the meta:source node.

v4.14.0 - 2023-10-13

Features and Enhancements

  • Update the service to build from Synapse v2.151.0.

Bugfixes

  • Fix an issue where SynErr exceptions did not have their messages represented properly.

v4.13.1 - 2023-09-14

Bugfixes

  • Fix an issue where the file MIME could be set to application/pdf before it was successfully opened as a PDF file.

  • Fix an issue where the X.509 parser could not handle serial numbers larger than 159 bits.

v4.13.0 - 2023-08-21

Features and Enhancements

  • Update the service to build from Synapse v2.144.0.

v4.12.0 - 2023-08-11

Features and Enhancements

  • Add support for parsing RAR files.

v4.11.0 - 2023-08-04

Features and Enhancements

  • Add string parsing support to the exe mime parser.

  • Update the fileparser.strings command to add a --filters argument. This can be used to filter the strings found via regular expressions.

  • Add mime parser configuration options via the --conf command line option to fileparser.parse.

  • Add byte-swapping detection and handling to the C2 config parser.

  • Document exported Storm APIs.

  • Update the service to build from Synapse v2.143.0.
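
The --filters behavior described above (narrowing extracted strings with regular expressions) can be illustrated with a minimal Python sketch. This is not the fileparser implementation: the function names, the minimum length of 4, and the "match any filter" semantics are assumptions made for illustration only.

```python
import re


def iter_ascii_strings(data: bytes, min_len: int = 4):
    """Yield runs of printable ASCII that are at least min_len bytes long."""
    # min_len is a hypothetical default, not the service's documented value.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.group().decode("ascii")


def filter_strings(strings, filters):
    """Keep only strings matched by at least one regex (assumed semantics)."""
    compiled = [re.compile(f) for f in filters]
    return [s for s in strings if any(rx.search(s) for rx in compiled)]


sample = b"\x00\x01http://example.com/a\x00junk!!\x02MZ\x90user@example.com\xff"
found = list(iter_ascii_strings(sample))
urls = filter_strings(found, [r"^https?://"])
print(found)
print(urls)
```

Here the two-byte run "MZ" is dropped by the length check, and the single filter keeps only the URL.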

v4.10.0 - 2023-07-07

Features and Enhancements

  • Update the service to build from Synapse v2.141.0.

v4.9.0 - 2023-06-13

Features and Enhancements

  • Populate the rsa:key:bits property when parsing X509 certificates.

v4.8.0 - 2023-05-19

Features and Enhancements

  • Add a --no-recurse option to prevent subparsing files.

  • Update the service to build from Synapse v2.134.0.

Bugfixes

  • Fix an issue which could cause a worker process to hang.

v4.7.1 - 2023-05-11

Bugfixes

  • Fix a permission issue that prevented the service from starting when run as a non-root user.

v4.7.0 - 2023-05-09

Features and Enhancements

  • Add support for parsing CobaltStrike C2 configuration profiles, with auto-detection using bundled YARA rules.

  • Add support for zip files that utilize the WinZIP-style AES encryption feature.

  • Add a conf option to the CSV parser to disable scraping.

  • Add the filename in the subparser init message info if available.

  • Add support for parsing MS Outlook files.

  • Update the service to build from Synapse v2.133.0.

  • Disable debug and info logging for plyara.core messages.

v4.6.0 - 2023-02-02

Features and Enhancements

  • Improve handling of Mach-O file parsing.

  • Update the service to build from Synapse v2.122.0.

Bugfixes

  • Java class files with the 0xCAFEBABE magic string were incorrectly identified as Mach-O binaries. These files are now correctly identified.

  • Fix an issue with newline handling when extracting YARA rules from text.

  • Fix the HTMLtoJSON example in the documentation.
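
The 0xCAFEBABE collision noted above exists because Java class files and Mach-O universal (fat) binaries share the same four magic bytes. One common way to tell them apart, sketched below, is to read the next big-endian 32-bit word: in a class file it holds the minor and major version (major versions start at 45), while in a fat binary it holds the architecture count, which is normally small. The threshold and function name are assumptions for illustration; the changelog does not say how the service performs this check.

```python
import struct

CAFEBABE = 0xCAFEBABE
# Hypothetical cutoff: Java major versions are >= 45, and fat binaries
# rarely bundle more than a handful of architectures.
MAX_FAT_ARCHS = 40


def classify_cafebabe(header: bytes) -> str:
    """Classify a file header that may begin with the 0xCAFEBABE magic."""
    if len(header) < 8:
        return "unknown"
    magic, word = struct.unpack(">II", header[:8])
    if magic != CAFEBABE:
        return "unknown"
    # Java class: word == (minor_version << 16) | major_version
    # Mach-O fat: word == nfat_arch
    if word < MAX_FAT_ARCHS:
        return "macho-fat"
    return "java-class"


java8_header = bytes.fromhex("cafebabe00000034")  # minor 0, major 52
fat_header = bytes.fromhex("cafebabe00000002")    # two architectures
print(classify_cafebabe(java8_header))
print(classify_cafebabe(fat_header))
```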

v4.5.1 - 2022-12-05

Bugfixes

  • Fix a packaging issue.

v4.5.0 - 2022-12-05

Features and Enhancements

  • Add boot hooks to the container entrypoint. Move the entrypoint script to /vertex/synapse/entrypoint.sh.

  • Temporary files, which may be created when retrieving a file from the Axon, are now stored in /vertex/storage/tmp.

v4.4.0 - 2022-10-09

Features and Enhancements

  • Update the service to build from Synapse v2.110.0.

v4.3.0 - 2022-09-08

Features and Enhancements

  • Attempt to continue parsing after encountering an invalid X.509 extension.

  • Add iterText() function to the Storm module API.

v4.2.1 - 2022-08-31

Bugfixes

  • Address an internal CI configuration issue.

v4.2.0 - 2022-08-31

Features and Enhancements

  • Rename the Storm package to synapse-fileparser.

v4.1.0 - 2022-08-29

Features and Enhancements

  • Add support for parsing Tar archives.

  • Add support for parsing MBOX files.

  • Parse crypto:x509:cert:serial values as hex values.

Bugfixes

  • Update Pillow and lxml libraries.

v4.0.0 - 2022-06-01

Features and Enhancements

  • Update permissions to use power-ups.fileparser.user instead of requiring asroot.

  • Move fileparser.ext Storm APIs into main fileparser module.

  • Remove deprecated fileparser.wget command (use wget | -> file:bytes | fileparser.parse instead).

  • Update the node action to use wget.

Documentation

  • Update documentation for AHA provisioning.

v3.17.0 - 2022-05-17

Features and Enhancements

  • Update to the newest Synapse v2.93.0 to support AHA provisioning.

  • Parse additional metadata fields from MS office document formats.

Bugfixes

  • Load the Storm package read-only to allow containers to run as a non-root user.

  • Fix help output visibility in Optic.

v3.16.0 - 2022-03-31

Features and Enhancements

  • Add support for parsing YARA rules without automatically creating nodes.

v3.15.0 - 2022-03-28

Features and Enhancements

  • Add support for recognizing and extracting contents from 7zip archives.

  • The fileparser.parse --passwd option is now applied to zip files when parsing them.

  • Add additional parsing of HTML attributes to identify more inet:url, inet:email, and tel:phone values.

v3.14.0 - 2022-03-14

Features and Enhancements

  • Remove unit tests from the package distribution to avoid security alerts on test files.

v3.13.0 - 2022-01-04

Features and Enhancements

  • Support extracting GZIP archives and recursively parsing the contents.

v3.12.0 - 2021-10-28

Features and Enhancements

  • Support passing HTMLtoJSON configuration to RFC822 body parsing.

  • Embed some of the documentation into the Storm Package delivered by the Storm Service so it is available in Optic.

Bugfixes

  • Address an internal CI configuration issue.

  • Fix exception handling in X.509 parser due to change in cryptography library.

v3.11.1 - 2021-09-24

Bugfixes

  • Fix an issue with excessive logging while parsing YARA rules.

v3.11.0 - 2021-09-13

Features and Enhancements

  • Expand Windows PE file parsing to create file:mime:pe:export, file:mime:pe:resource, file:mime:pe:section, and file:mime:pe:vsvers:info nodes, and to populate the :mime:pe:exports:time, :mime:pe:exports:libname, :mime:pe:richdr, and :mime:pe:size properties on file:bytes nodes.

  • Add a groups type to the htmltojson API. This is useful for selecting groups of siblings as a list, since CSS selectors do not allow selecting siblings of an element.

Bugfixes

  • Fix an issue where the Windows PE imphash was not being set.

v3.10.1 - 2021-08-25

Bugfixes

  • Fix recursion error in parser teardown on early exit.

v3.10.0 - 2021-08-23

Features and Enhancements

  • Improve Storm warn message when parsing an unsupported olefile.

  • Add findall, strip, and split to HTMLtoJSON template.

  • Add HTMLtoJSON template documentation.

  • Support codec:errors configuration for CSV parser.

Bugfixes

  • Fix race conditions in timeout-based tests.

v3.9.0 - 2021-07-20

Features and Enhancements

  • Update the service to use tini as a container entrypoint.

v3.8.0 - 2021-07-14

Features and Enhancements

  • Update the service to ensure that the meta:source node is always made in the current View prior to creating any nodes which may be linked to it via a light edge. This removes the service onload event as a result.

  • Mark the Storm command fileparser.wget as deprecated. Users should move to using the Synapse wget command.

Bugfixes

  • Correct the Storm command form hinting to correctly represent nodes which may be yielded from commands.

v3.7.0 - 2021-06-21

Features and Enhancements

  • Update the service to build from Synapse v2.43.0.

v3.6.0 - 2021-06-09

Features and Enhancements

  • Wait for a free worker instead of aborting parse operation. (#78)

  • Set default max workers to number of CPUs. (#78)

  • Parse email body as HTML if available. (#77)

  • Improve HTML MIME detection. (#77)

  • Support running HTMLtoJSON within HTML parser. (#76)

  • Populate EXIF forms. (#70)

v3.5.0 - 2021-05-17

The minimum required Cortex version for using this version of the Synapse Fileparser is v2.38.0.

Features and Enhancements

  • Parse X509 certificate chains from X509 files. (#67)

  • Set hash properties on file:bytes nodes when they previously were not set. (#73)

  • Update the service to build from Synapse v2.38.0.

Bugfixes

  • Fix an issue where the Aha client was not torn down properly in spawned processes. (#72)

Improved Documentation

  • Improve the overall documentation for the available parsers. (#71)

v3.4.0 - 2021-04-26

Features and Enhancements

  • Update the service to include information for the getCellInfo() API.

Bugfixes

  • Fix the version information being returned by the Storm Service. Previously it was returned as a string, instead of a tuple. (#68)

v3.3.0 - 2021-04-19

Features and Enhancements

  • Add fileparser.status command with detailed debugging information. (#63)

  • Update default timeout to None (“last seen” can be inspected with fileparser.status). (#63)

  • For X.509 parsing, try matching crypto:x509:cert by SHA-256 before creating new node. (#64)

  • Add --parser-timeout to wget command. (#65)

Bugfixes

  • Fix active job tracking when a job exits early. (#63)

  • Prevent service from starting if no Axon is configured. (#63)

  • Remove inet:email:message refs (they are directly pivotable from file:bytes). (#64)

  • Fix parsing YARA rules that contain an import. (#66)

v3.2.0 - 2021-03-22

Features and Enhancements

  • Add RTF parser. (#60)

  • Add Storm function to fire parsed text to a consumer. (#61)

  • Assign additional mimes and aliases to zip parser. (#62)

  • Emit subparser start message for easier debugging. (#62)

Bugfixes

  • Fix RTD documentation builds. (#62)

v3.1.0 - 2021-03-02

Features and Enhancements

  • Populate file:subfile:path when parsing subfiles. (#58)

  • Parse multipart email attachments. (#57)

  • Add default size to fileparser.strings help. (#57)

  • Improve the RFC822 mime detection for parsing email files. (#59)

Bugfixes

  • Fix handling of HTML titles that return None. (#57)

v3.0.1 - 2021-01-28

Bugfixes

  • Fix private PyPI setup in CI. (#56)

v3.0.0 - 2021-01-27

Features and Enhancements

  • Add new MS Office parser which extracts text and images from PowerPoint, Excel, and Word files.

  • Add new image parser which extracts EXIF data and parses text using OCR.

  • Add new Zip file parser which extracts subfiles and parses them individually.

  • Add new executable parser which will set mime:pe properties and extract certificates.

  • Add new YARA parser which will extract rules from a file and create it:app:yara:rule nodes.

  • Add new parsers for CSV files and XML files.

  • Significantly broaden and improve MIME detection.

  • Add a command to brute-force detect ASCII strings in a file.

  • Add ability to unlock password protected PDF files.

  • Run parsing operations in subprocesses to improve performance.

  • The full changeset for this update can be found in #51.

v2.5.0 - 2021-01-08

Features and Enhancements

  • Build new Docker tags for the latest release in a given major version. For example, this adds the v2.x.x Docker tag. (#52)

  • Remove -noproxy flag to ensure that network traffic goes through the configured proxy, if set. (#49)

Bugfixes

  • Populate file:bytes secondary properties when missing. (#48)

v2.4.0 - 2020-12-03

Features and Enhancements

  • Update to Synapse v2.12.3.

v2.3.0 - 2020-09-28

Features and Enhancements

  • Populate available parsed data on the inbound file:bytes node (including mime:x509:cn). (#41)

Bugfixes

  • Fix parse.bysha256 success check when multiple inbound nodes provided. (#40)

  • Fix parse.byurl yield output when multiple inbound nodes provided. (#43)

  • Fix malformed HTML test for BeautifulSoup4 4.9.2. (#44)

v2.2.3 - 2020-08-14

Bugfixes

  • Improve parse.byurl yield behavior. By default this command outputs the file:bytes node instead of the inbound node; if --yield is specified, the nodes from the sub-parsers are output instead (e.g. crypto:x509:cert nodes if the downloaded file used the X.509 parser). (#39)

v2.2.2 - 2020-08-05

Features and Enhancements

  • Update the minimum required Synapse version for building (not using) the Synapse Fileparser image to v2.5.1.

v2.2.0 - 2020-08-05

Features and Enhancements

  • Move inet:email:headers to be an array property on the inet:email:message node, instead of using refs light edges. This requires the Cortex connected to the service to be Synapse version v2.4.0 or greater. (#38)

v2.1.1 - 2020-07-20

Bugfixes

  • Fix issue where SOCKS proxy credentials were not being parsed. (#37)

v2.1.0 - 2020-07-01

Features and Enhancements

  • Update fileparser to have consistent --yield mechanics. (#35)

Improved Documentation

  • Add Initial Documentation. (#34)

v2.0.6 - 2020-06-11

Features and Enhancements

  • Parse emails with [@] defanging. (#33)
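
As a rough illustration of the [@] defanging mentioned above, the Python sketch below restores the @ sign before applying an email pattern. The regular expression and helper names are simplified assumptions, not the parser's actual rules.

```python
import re

# Deliberately simple email pattern, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def refang(text: str) -> str:
    """Undo the [@] defanging convention."""
    return text.replace("[@]", "@")


def find_emails(text: str):
    """Return email addresses found in text, including [@]-defanged ones."""
    return EMAIL_RE.findall(refang(text))


emails = find_emails("Contact alice[@]example.com or bob@example.org.")
print(emails)
```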

Bugfixes

  • Use DER SHA256 for crypto:x509:cert guid. (#33)
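
Hashing the DER encoding, as the entry above describes, gives the same digest whether a certificate arrives as PEM or as raw DER. The sketch below shows that idea using only the Python standard library. The helper name is hypothetical, the sample bytes are a placeholder rather than a real certificate, and how the service turns the digest into a guid is not stated in this changelog.

```python
import hashlib
import ssl

PEM_HEADER = b"-----BEGIN CERTIFICATE-----"


def der_sha256(cert: bytes) -> str:
    """Return the SHA-256 hex digest of a certificate's DER encoding."""
    stripped = cert.strip()
    if stripped.startswith(PEM_HEADER):
        # Convert PEM text back to the underlying DER bytes before hashing.
        cert = ssl.PEM_cert_to_DER_cert(stripped.decode("ascii"))
    return hashlib.sha256(cert).hexdigest()


# Placeholder DER-like bytes, NOT a valid certificate; enough to show that
# both encodings hash to the same value.
fake_der = b"\x30\x03\x02\x01\x01"
fake_pem = ssl.DER_cert_to_PEM_cert(fake_der).encode("ascii")
print(der_sha256(fake_der) == der_sha256(fake_pem))
```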

v2.0.5 - 2020-06-08

Features and Enhancements

  • Initial release of the Synapse Fileparser.