# Understanding the ComplianceAsCode build system

## Introduction

This section aims to provide an introduction to the ComplianceAsCode build system for developers interested in extending or debugging it. Readers are expected to have some familiarity with the relevant standards in this space. Among others, these are:

- [XCCDF](https://csrc.nist.gov/projects/security-content-automation-protocol/specifications/xccdf), the eXtensible Configuration Checklist Description Format; this is a textual representation format of various steps in hardening a particular system.
- [OVAL](https://oval.cisecurity.org/), the Open Vulnerability and Assessment Language; this is a standardized mechanism for auditing various compliance checks.
- [OCIL](https://csrc.nist.gov/projects/security-content-automation-protocol/specifications/ocil), the Open Checklist Interactive Language, an expressive language for handling manual compliance checks.
- [CPE](https://nvd.nist.gov/products/cpe), the Common Platform Enumeration; a scheme for identifying software and systems.
- SCAP source data stream format, a mechanism for combining the above into a single redistributable file.

Additionally, some familiarity with the content layout (as discussed in previous chapters) is assumed. However, while this document serves as a guide, the build system keeps changing, so inspecting the code is ultimately the only way to answer many questions.

## High-Level Overview

ComplianceAsCode's content project is ultimately the combination of three things:

- A collection of content in a format-agnostic manner,
- A build system for collecting this content and combining it to form artifacts understood by other systems,
- A test system for validating both the compliance of these artifacts to various standards and the correctness of the content in the repo.

As previous sections describe in detail the expectations around content in the repo, this section aims to describe the build system. To understand the test system, see the README under `tests/` in the repo.

The build system is generated by CMake and combines local Python utilities with XML tools (such as `xmllint` and `xsltproc`) and OpenSCAP's `oscap` CLI executable. These Python utilities transform the various input files into a more standardized format and apply Jinja macros to them. Many of the artifacts we generate are XML-based, so extensive XSLT processing occurs after building the initial structure in Python. Finally, OpenSCAP combines and references several files for us to build the finished artifacts.

### CMake Structure

CMake requires projects to have an entry point called `/CMakeLists.txt`. This file uses the CMake language and drives building and installing the project. It contains several things:

- The many build-time options for customizing the types of content generated,
- The hand-off for generating each product's specific content,
- Common installation, testing, and distribution targets.

However, the specifics of building a particular product are contained in the shared module located at `cmake/SSGCommon.cmake`. This file contains all of the CMake logic to build a particular product and exposes the top-level macro `ssg_build_product(...)`. This macro generates per-product build, installation, and testing targets. While the specifics should be understood from this file directly, in general the build follows this outline of steps, in rough order of occurrence:

- Generate SCE content and metadata.
- Generate the product dictionary.
- Resolve rules, profiles, groups, static checks and static remediations to the product-specific resolved form (also known as compiled form).
- Generate templated checks and remediations from the templates.
- Collect all available remediations.
- Combine all available OVAL checks into a single unlinked OVAL document.
- Load resolved rules, profiles, groups, collected remediations and the unlinked OVAL document and generate XCCDF, OVAL and OCIL documents from this data.
- Generate CPE OVAL and CPE dictionary.
- Combine the OVAL, OCIL, CPE and XCCDF documents into a single SCAP source data stream.
- Generate CEL content YAML for Kubernetes/OpenShift compliance checks (if enabled for the product).
- Generate content for derived products (such as CentOS and Scientific Linux).
- Generate HTML tables, Bash scripts, Ansible Playbooks and other secondary artifacts.

### Python Build Scripts

Various Python utilities under `/build-scripts` contribute to this process; refer to their help text for more information and usage:

- `build_all_guides.py` -- generates separate HTML guides for every profile in an XCCDF document.
- `build_cel_content.py` -- generates CEL (Common Expression Language) content YAML for Kubernetes/OpenShift compliance checks. See [CEL Content](13_cel_content.md) for detailed information about CEL rules and profiles.
- `build_rule_playbooks.py` -- generates per-rule per-profile playbooks in Ansible content.
- `build_sce.py` -- outputs SCE content and combined metadata.
- `build_templated_content.py` -- generates templated audit and remediation content.
- `build_xccdf.py` -- generates XCCDF, OVAL and OCIL documents from resolved content.
- `collect_remediations.py` -- finds the separate (per-rule and templated) remediations and places them into a single directory.
- `combine_ovals.py` -- combines separate (per-rule, shared, and templated) OVAL XML trees into a single larger OVAL XML document.
- `compile_all.py` -- resolves rules, groups, profiles, static checks and remediations to the product-specific resolved form (also known as compiled form).
- `compile_product.py` -- resolves the product.yml and distributed product attributes.
- `compose_ds.py` -- composes an SCAP source data stream from individual SCAP components.
- `cpe_generate.py` -- generates the product-specific CPE dictionary and checks.
- `enable_derivatives.py` -- generates derivative product content from a base product.
- `expand_jinja.py` -- helper script used by the BATS (Bash unit test framework) to expand Jinja in test scripts.
- `generate_guides.py` -- generates HTML guides and an HTML index for every profile in the built SCAP source data stream.
- `generate_man_page.py` -- generates the ComplianceAsCode man page.
- `generate_profile_remediations.py` -- generates profile-oriented remediations (Bash scripts, Ansible Playbooks, or Bash scripts for Hummingbird) from the built SCAP source data stream. The output is similar to the output of the `oscap xccdf generate fix` command, but `generate_profile_remediations.py` generates the output for all profiles in the given SCAP source data stream at once.
- `profile_tool.py` -- utility script to generate statistics about profiles in a specific XCCDF/data stream file.
- `verify_references.py` -- used by the test system to verify cross-linkage of identifiers between XCCDF and OVAL/OCIL documents.

Many of these utilities are simply front-ends over code in the SSG Python module located under `ssg/`.

## How OVAL is Built

The build of the OVAL document takes place in two steps.

### 1. Combination of OVALs

In the first step, all available and applicable OVAL checks are built into a single unlinked OVAL document stored in the file `build/${PRODUCT}/oval-unlinked.xml`. The `oval-unlinked.xml` document is generated using the `combine_ovals.py` script.
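
As a rough picture of what "combining" means in this step, the sketch below merges checks from two sources so that the first occurrence of an identifier wins. It assumes the ordering and skip-if-already-loaded behaviour described in the rest of this section. The function name `combine_checks`, the check identifiers, and the placeholder strings are hypothetical and are not part of `combine_ovals.py` or the `ssg` module.

```python
from typing import Dict, Iterable, Tuple

Check = Tuple[str, str]  # (check id, shorthand content); illustrative shape only


def combine_checks(benchmark_checks: Iterable[Check],
                   shared_checks: Iterable[Check]) -> Dict[str, str]:
    """Merge checks so that benchmark checks take precedence over shared ones."""
    combined: Dict[str, str] = {}
    # Benchmark checks are visited first, shared checks second.
    for check_id, shorthand in list(benchmark_checks) + list(shared_checks):
        if check_id in combined:
            continue  # already loaded from an earlier source, so skip it
        combined[check_id] = shorthand
    return combined


# Hypothetical inputs: "check_a" exists in both places, "check_b" only in shared.
benchmark = [("check_a", "benchmark version of check_a")]
shared = [
    ("check_a", "shared version of check_a"),
    ("check_b", "shared version of check_b"),
]
combined = combine_checks(benchmark, shared)
print(sorted(combined))
```

In this toy run the benchmark copy of `check_a` is kept and the shared copy is ignored, while `check_b` is taken from the shared source because nothing loaded it earlier.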

The OVAL shorthands are loaded into the OVAL Document object in a fixed order: the benchmark checks are loaded first, followed by the shared directory checks. If a shorthand is already loaded into the OVAL Document object, it is skipped.

Steps of loading the OVAL shorthand:

1. The OVAL Shorthand file is loaded as a string. If the shorthand is not templated, it is expanded using Jinja macros before loading.
2. The OVAL Shorthand string is processed by the OVAL Document object.
    1. The OVAL Shorthand string is loaded into the OVAL Shorthand object.
    2. The OVAL Shorthand object is validated. The following properties are checked:
        - Whether the OVAL definitions are applicable to the product.
        - Whether there is an OVAL definition in the shorthand with the same id as the given rule_id.
    3. If the OVAL Shorthand object is valid, it is added to the OVAL Document object.

After all OVAL Shorthands are loaded, the affected platforms of the loaded OVAL definitions are completed. The OVAL document is then saved as the XML file `build/${PRODUCT}/oval-unlinked.xml`.

### 2. Linking OVAL Document

The second step is performed when building an XCCDF document using the `build_xccdf.py` script. In this step, the `oval-unlinked.xml` document from the previous step is linked (IDs between rules and checks are aligned) to the XCCDF document being built.

Steps to link an OVAL document to an XCCDF document:

1. The unlinked OVAL document `oval-unlinked.xml` is loaded into the OVAL Document object.
2. The integrity of the references to the components of the OVAL Document object is verified.
3. For each XCCDF rule that has a CCE identification and has an OVAL check implemented, a new `<reference>` element with the CCE ID is added to the OVAL definition.
4. The OVAL definition referenced by the XCCDF is checked to be defined in the OVAL document.
5. Verify that the `<xccdf:Value>` `type` to corresponding OVAL variable `datatype` export matching [constraint](http://csrc.nist.gov/publications/nistpubs/800-126-rev2/SP800-126r2.pdf#page=30&zoom=auto,69,313) is met. Also correct the `type` attribute of those `<xccdf:Value>` elements where necessary so that the produced content meets this constraint.
6. Verify that the referenced CCE identifiers are correct.
7. Translate the identifiers in the OVAL Document object using `IDTranslator`.
8. The OVAL Document object is stored as the XML file `build/ssg-${PRODUCT}-oval.xml`.
9. For each XCCDF rule, a minimal OVAL document is generated as an artifact.
10. For each reference of an OVAL check in XCCDF, a link to the `check-content` and a `check-export` element is added.

## How CEL Content is Built

CEL (Common Expression Language) content provides an alternative scanning mechanism to OVAL, specifically designed for Kubernetes and OpenShift API resource evaluation. Unlike OVAL, which requires shell access and evaluates system state, CEL rules evaluate Kubernetes resources directly through the API server.

CEL content generation is optional and must be explicitly enabled for each product.

### Enabling CEL Content

CEL content generation is enabled by setting `PRODUCT_CEL_ENABLED` in the product's `CMakeLists.txt`:

```cmake
set(PRODUCT "ocp4")
set(PRODUCT_REMEDIATION_LANGUAGES "ignition;kubernetes")
set(PRODUCT_CEL_ENABLED TRUE)

ssg_build_product(${PRODUCT})
```

### Build Process

When CEL content is enabled for a product, the build system performs the following steps:

1. **Rule and Profile Resolution** - All rules and profiles are compiled to their product-specific resolved form (same as for SCAP content).
2. **CEL Rule Loading** - The `build_cel_content.py` script loads all rules with CEL checks (identified by having both `expression` and `inputs` fields from `cel/shared.yml`) from the `build/${PRODUCT}/rules/` directory.
3. **CEL Profile Loading** - The script loads all profiles with `scanner_type: CEL` from the `build/${PRODUCT}/profiles/` directory.
4. **Validation** - The build system validates CEL content:
    - Rules must have an `expression` field (non-empty CEL expression)
    - Rules must have an `inputs` field (non-empty list of Kubernetes resources)
    - Profiles must have a `selected` field with at least one rule
    - No duplicate rule names (after conversion to hyphenated format)
    - All profile rule references must exist in the CEL rules
5. **Content Generation** - The script generates a single CEL content YAML file at `build/${PRODUCT}-cel-content.yaml`.

### CEL Content Structure

The generated CEL content YAML has two main sections:

```yaml
profiles:
  - id: cis_vm_extension          # Profile ID (with underscores)
    name: cis-vm-extension        # Profile name (with hyphens)
    title: Profile Title
    description: Profile description
    productType: Platform
    rules:                        # List of rule names (hyphenated)
      - rule-name-one
      - rule-name-two

rules:
  - id: rule_name_one             # Rule ID (with underscores)
    name: rule-name-one           # Rule name (with hyphens)
    title: Rule Title
    description: Rule description
    rationale: Rule rationale
    severity: medium
    checkType: Platform
    expression: |                 # CEL expression
      resource.spec.enabled == true
    inputs:                       # Kubernetes resource inputs
      - name: resource
        kubernetesInputSpec:
          apiVersion: v1
          resource: pods
    instructions: Manual check steps  # From ocil field
    controls:                     # From references field
      cis@ocp4:
        - 1.2.3
      nist:
        - CM-6
```

### Differences from SCAP

CEL content is processed differently from traditional SCAP content:

| Aspect | SCAP | CEL Content |
|--------|------|-------------|
| **Rules Included** | All rules except CEL | Only CEL rules |
| **Profiles Included** | All profiles except CEL | Only CEL profiles |
| **Output Format** | XML (DataStream) | YAML |
| **Output Location** | `build/ssg-${PRODUCT}-ds.xml` | `build/${PRODUCT}-cel-content.yaml` |
| **Scanner** | OpenSCAP | compliance-operator |
| **Evaluation** | Shell commands, file checks | Kubernetes API queries |

- Rules with no check implemented appear **only** in the SCAP content.
- Rules with only a CEL check implemented are **excluded** from SCAP content generation and **only** appear in the CEL content YAML.
- Rules with both OVAL and CEL checks implemented are **included** in both SCAP and CEL content.
- Profiles with `scanner_type: CEL` are **excluded** from SCAP content and **only** appear in the CEL content YAML.

For detailed information about creating CEL rules and profiles, see [CEL Content](13_cel_content.md).
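
The inclusion rules from this section can be summarized as a small decision function. The following is a minimal sketch, not the logic of `build_cel_content.py` or `build_xccdf.py`: the `Rule` class, the function names, and the two boolean flags are hypothetical simplifications. Treating OVAL-only rules as SCAP-only is inferred from the comparison table, and `cel_name` assumes that the conversion to the hyphenated format is a plain underscore-to-hyphen replacement, as the example IDs in the generated YAML suggest.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class Rule:
    """Hypothetical, simplified view of a resolved rule."""
    rule_id: str
    has_oval: bool = False
    has_cel: bool = False


def outputs_for_rule(rule: Rule) -> Set[str]:
    """Return which build outputs ('scap', 'cel') would contain the rule."""
    if rule.has_cel and rule.has_oval:
        return {"scap", "cel"}  # both checks implemented: included in both
    if rule.has_cel:
        return {"cel"}          # CEL check only: excluded from SCAP content
    return {"scap"}             # OVAL only, or no check at all: SCAP only


def outputs_for_profile(scanner_type: str = "") -> Set[str]:
    """Profiles with scanner_type CEL go only into the CEL content YAML."""
    return {"cel"} if scanner_type == "CEL" else {"scap"}


def cel_name(identifier: str) -> str:
    """Assumed conversion from an underscore ID to a hyphenated name."""
    return identifier.replace("_", "-")


examples = [
    Rule("rule_without_check"),
    Rule("rule_cel_only", has_cel=True),
    Rule("rule_oval_and_cel", has_oval=True, has_cel=True),
]
for rule in examples:
    print(cel_name(rule.rule_id), sorted(outputs_for_rule(rule)))
```

Keeping the decision in one function makes the four cases easy to compare against the statements above; a real rule may carry other check types that this two-flag model does not capture.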