The Open Cybersecurity Schema Framework (OCSF) is a game-changer for security operations. By standardizing security telemetry, it promises a world where you can query data from any source using a single, unified schema. But as many security architects and engineers have discovered, the journey from raw, third-party logs to clean, compliant OCSF events can be fraught with complexity.
Mappings are often brittle, excessively verbose, and difficult to maintain. The rich, descriptive nature of OCSF, while powerful, can cause event sizes to balloon, straining storage and slowing down queries.
At Tenzir, we believe that embracing an open standard shouldn't be a chore. Your focus should be on detecting threats, not wrestling with data transformation logic. That's why we're excited to introduce three new operators in Tenzir Node v5.10.0 designed to radically simplify the process of mapping data to OCSF: ocsf::derive, ocsf::apply, and ocsf::trim.
These operators work in concert to make your OCSF mappings more concise, reliable, and efficient. Let's explore how.
The Challenge: The Hidden Labor of OCSF Mapping
If you've ever written a mapping file, you've likely encountered these challenges:
Verbose & Repetitive Logic: OCSF schemas often have sibling fields that need to be populated together. For example, if you have a file.path, you also need to extract file.name and file.directory. This leads to repetitive and bloated mapping code that is a pain to write and maintain.
Schema Conformance & Type Safety: How do you guarantee that the output of your mapping is a valid OCSF event? Without a strict enforcement mechanism, you can end up with incorrect data types or other subtle errors that break downstream consumers like your SIEM or data lake.
Data Bloat: A single-line log can expand into a 50-field object with nested records and lists. While this richness is great for context, it's not always necessary. Storing and querying this "heavy" data can be prohibitively expensive and slow, but manually removing fields is a complex, error-prone process.
Introducing a Trio of Powerful Operators
Our new operators are designed to solve these exact problems, allowing you to build robust, efficient OCSF pipelines with minimal effort.
1. ocsf::derive: Keep Your Mappings Terse
The ocsf::derive operator is your new best friend for writing concise mappings. It intelligently populates OCSF object fields based on the values of their siblings. For example, by providing a class_uid and an activity_id, ocsf::derive can automatically fill in the corresponding class_name, category_name, and activity_name.
Example
Input: Simply provide the numeric identifiers.
Output: The operator enriches the event with the corresponding string names.
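To make the idea concrete outside of Tenzir, here is a minimal Python sketch of sibling derivation. It is not how ocsf::derive is implemented: the CLASSES table is a hypothetical, hand-written excerpt of OCSF enum data, and the derive function name is chosen purely for illustration. Values the caller already set are left untouched.

```python
# Hypothetical, hand-written excerpt of OCSF enum data. A real implementation
# would read this from the OCSF schema instead of hard-coding it.
CLASSES = {
    1001: {
        "class_name": "File System Activity",
        "category_uid": 1,
        "category_name": "System Activity",
        "activities": {1: "Create", 2: "Read"},
    },
}


def derive(event: dict) -> dict:
    """Return a copy of `event` with name siblings filled in from numeric IDs."""
    out = dict(event)
    cls = CLASSES.get(event.get("class_uid"))
    if cls is None:
        # Unknown or missing class: nothing can be derived.
        return out
    # setdefault keeps any value that the mapping already provided.
    out.setdefault("class_name", cls["class_name"])
    out.setdefault("category_uid", cls["category_uid"])
    out.setdefault("category_name", cls["category_name"])
    activity = cls["activities"].get(event.get("activity_id"))
    if activity is not None:
        out.setdefault("activity_name", activity)
    return out


enriched = derive({"class_uid": 1001, "activity_id": 1})
```

In this sketch, the mapping author writes only the two numeric identifiers; the string names and the category follow from the lookup table.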
2. ocsf::apply: Guarantee Schema-Compliant Output
Think of ocsf::apply as a quality gate for your OCSF events. This operator takes your mapped record and an OCSF class (e.g., File System Activity), and rigorously conforms the data to the official schema. It validates fields, casts them to their correct OCSF data types, and reorders them into the canonical structure.
Example
Input: An event that you've carefully mapped.
Output: A schema-compliant OCSF event, with all types adhering to the schema. (Types are not visible in the output below.) This typically adds a lot of additional null fields, but these are typed nulls.
NB: We encode the unmapped field as NDJSON by default, because most engines have early or no support for VARIANT types. Pass preserve_variants=true if you'd like to keep the data structured, but be aware that this might cause schema explosion downstream.
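The conforming step can be pictured with a short Python sketch. This is an illustration under stated assumptions, not Tenzir's implementation: SCHEMA is a hypothetical, heavily reduced stand-in for one OCSF class, apply_schema is an invented name, and silently dropping fields that the schema does not list is a simplification of this sketch rather than documented operator behavior.

```python
import json

# Hypothetical, heavily reduced stand-in for one OCSF class definition.
# Dict order plays the role of the canonical field order.
SCHEMA = {
    "activity_id": int,
    "class_uid": int,
    "message": str,
    "severity_id": int,
    "time": int,
    "unmapped": object,
}


def apply_schema(event: dict, schema: dict = SCHEMA, preserve_variants: bool = False) -> dict:
    """Cast known fields, emit them in schema order, and null-fill the rest."""
    out = {}
    for name, typ in schema.items():
        value = event.get(name)
        if value is None:
            # Missing fields become nulls so every event has the same shape.
            out[name] = None
        elif name == "unmapped":
            # Encode free-form data as a JSON string unless asked to keep it.
            out[name] = value if preserve_variants else json.dumps(value, sort_keys=True)
        else:
            out[name] = typ(value)
    # Sketch-only simplification: fields absent from the schema are dropped.
    return out


conformed = apply_schema({"unmapped": {"x": 1}, "class_uid": "1001", "extra": True})
```

Here the string "1001" is cast to an integer, the fields come out in schema order, and the free-form unmapped record becomes a single string column, which is the trade-off the note above describes.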
3. ocsf::trim: Manage Event Size with Precision
OCSF events can be large, but not all fields are created equal. The ocsf::trim operator gives you control over this data bloat by allowing you to select a "profile" for your events. Based on our official OCSF Trimming Package, you can easily strip optional or recommended fields.
Example
Input: A rich Authentication event containing fields marked as optional in OCSF, such as user.display_name.
Output: The operator removes the optional user.display_name and class_name fields, reducing the event size while keeping the essentials.
The additional boolean options drop_optional and drop_recommend correspond to two of the three attribute requirement flags in the schema: required, recommended, optional.
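The effect of the two flags can be sketched in a few lines of Python. Again, this is a conceptual illustration and not the operator's implementation: REQUIREMENTS is a hypothetical, hand-written excerpt of requirement flags keyed by dotted field path, and the trim function simply mirrors the option names from the text. The sketch makes both flags mandatory because the operator's defaults are not stated here.

```python
# Hypothetical, hand-written excerpt of requirement flags for an
# Authentication event, keyed by dotted field path.
REQUIREMENTS = {
    "class_uid": "required",
    "class_name": "optional",
    "user": "required",
    "user.name": "recommended",
    "user.display_name": "optional",
}


def trim(event: dict, *, drop_optional: bool, drop_recommend: bool, prefix: str = "") -> dict:
    """Return a copy of `event` without fields whose requirement flag is dropped."""
    dropped = set()
    if drop_optional:
        dropped.add("optional")
    if drop_recommend:
        dropped.add("recommended")
    out = {}
    for key, value in event.items():
        path = prefix + key
        if REQUIREMENTS.get(path) in dropped:
            continue
        if isinstance(value, dict):
            # Recurse into nested records, extending the dotted path.
            value = trim(
                value,
                drop_optional=drop_optional,
                drop_recommend=drop_recommend,
                prefix=path + ".",
            )
        out[key] = value
    return out


event = {
    "class_uid": 3002,
    "class_name": "Authentication",
    "user": {"name": "alice", "display_name": "Alice A."},
}
slim = trim(event, drop_optional=True, drop_recommend=False)
```

Required fields always survive, so the trimmed event stays valid while carrying less weight. Fields with no entry in the table are kept, which is a conservative choice made for this sketch.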
Conclusion: Focus on What Matters
By combining ocsf::derive, ocsf::apply, and ocsf::trim, you can create OCSF mapping pipelines that are not only simpler and more reliable but also produce data that is optimized for your specific operational needs. This shift from tedious data plumbing to automated validation results in higher-quality, reliable OCSF data, empowering your security teams to build more effective analytics and detections.
Ready to simplify your OCSF workflow? Check out our documentation to get started.
And if you're attending Black Hat USA, come say hello! Tenzir is proud to sponsor the OCSF Reception. We'll have a table where you can see these operators in action and get a live demo. Register for the event and we'll see you there! 🎩