Course 25 | Module 6 of 12

Product Data, Interoperability, and Engineering Information Architecture

Design trustworthy information flows across CAD, CAE, PLM, requirements, test, and measurement systems.

MAP

Module map

Learning outcomes

  • Map engineering tools to authoritative data and lifecycle responsibilities.
  • Specify metadata, provenance, version, configuration, and schema rules.
  • Explain what STEP / ISO 10303 supports and what data exchange still cannot guarantee.
  • Use JSON, CSV, and SQL as teaching tools for a queryable, AI-ready evidence architecture.

Evidence standard

Complete all four lessons, reproduce the worked checks, run the lab, and correct the weekly quiz. Treat AI output as candidate evidence until independently verified.

6.1

CAD, CAE, PLM, requirements, and test-data landscapes

Why this lesson matters

Engineering information fragments because each tool optimizes a local job, organization, and data model. Integration begins with responsibility, not software procurement.

Learning objectives

  • Define PDM/PLM and repository, and distinguish between them.
  • Apply the lesson method to the worked CAD, CAE, PLM, requirements, and test-data landscapes case.
  • Evaluate evidence, uncertainty, and AI-assisted output before making a claim.

Readiness check

Before continuing, explain what decision this topic supports and name one upstream source that must be controlled.

Check your response

A sound answer names a specific engineering decision, its configuration, and a controlled requirement, model, dataset, interface, or standard that constrains the work.

Core idea

Map tools by the engineering objects they author, consume, transform, approve, and archive. Then assign authority and interfaces for requirements, geometry, materials, analyses, manufacturing, test, measurement, and decisions.

Key concepts

PDM/PLM: Systems that control product definitions, configurations, changes, and lifecycle records.
Repository: A managed location for artifacts, not automatically an authoritative or semantically integrated system.
System of record: The governed source for a defined information class and process.
Data lineage: The recorded path from source through transformations to derived data and decisions.

Step-by-step explanation

  1. Inventory engineering objects and decisions before listing tools.
  2. For each object, name author, approver, consumers, update frequency, and retention need.
  3. Assign systems of record by scope and lifecycle state.
  4. Map transformations and manual handoffs, including unit and identifier changes.
  5. Prioritize integration risks where high-consequence decisions depend on repeated re-entry or ambiguous ownership.
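
The steps above can be sketched as a small responsibility table that is checked for missing authority and manual re-entry. This is a minimal illustration, not a prescribed tool: the object names, roles, system labels, and handoff counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectRecord:
    """One engineering object and its responsibilities (all values hypothetical)."""
    name: str
    author: str
    approver: str          # empty string means nobody approves it
    system_of_record: str  # empty string means no authority is assigned
    manual_handoffs: int   # manual re-entry steps on the way to a decision


inventory = [
    ObjectRecord("requirement", "systems engineer", "chief engineer",
                 "requirements spreadsheet", 2),
    ObjectRecord("CAD model", "designer", "design lead", "PDM", 0),
    ObjectRecord("analysis run", "analyst", "", "", 3),
]


def integration_risks(records):
    """Return (object name, reasons) for objects with weak ownership or re-entry."""
    findings = []
    for rec in records:
        reasons = []
        if not rec.system_of_record:
            reasons.append("no system of record")
        if not rec.approver:
            reasons.append("no approver")
        if rec.manual_handoffs > 0:
            reasons.append(f"{rec.manual_handoffs} manual handoff(s)")
        if reasons:
            findings.append((rec.name, reasons))
    return findings


for name, reasons in integration_risks(inventory):
    print(f"{name}: {', '.join(reasons)}")
```

The list starts from objects rather than tools, so a tool that stores an object without being its authority does not hide the gap.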

Worked example

A company stores requirements in spreadsheets, CAD in PDM, analyses on personal drives, test data on a lab server, and decisions in meeting minutes.

  1. Identify controlled objects in each location and distinguish storage from authority.
  2. Trace one release decision across requirement, CAD revision, analysis run, test dataset, and review minute.
  3. Mark manual ID translation, copied inputs, missing owner, inaccessible raw data, and ambiguous status.
  4. Prioritize the release chain before attempting a company-wide data lake.

Result. The first architecture target is a narrow, high-value evidence chain with explicit ownership and stable links, not migration of every file.

Independent check. Every critical object has one defined authority for its scope and every transformation has an owner and validation rule.
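
Tracing one release decision can be tried on paper or with a few lines of code. The sketch below assumes each object records the identifier of the next object in the chain; every identifier is invented for illustration, and `None` marks a link that nobody recorded.

```python
# Hypothetical identifiers for one release decision.
chain = {
    "REQ-12": "CAD-C",
    "CAD-C": "RUN-7",
    "RUN-7": None,           # analysis on a personal drive, no link to test data
    "TEST-3": "MIN-2024-05",
}


def trace(start, links):
    """Follow recorded links from start; return the path and where it stopped."""
    path = [start]
    seen = {start}
    current = start
    while links.get(current) is not None:
        current = links[current]
        if current in seen:  # guard against circular links
            break
        path.append(current)
        seen.add(current)
    return path, current


path, stopped_at = trace("REQ-12", chain)
print(" -> ".join(path))
print("chain stops at:", stopped_at)
```

Here the trace stops at the analysis run, so the test dataset and review minute cannot be reached from the requirement even though both exist.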

Common misconceptions

Misconception: PLM is automatically the single source of truth.
Correction: Authority depends on governed scope and actual process. PLM may link to specialist sources rather than contain every detail.

Misconception: A tool output closes the question.
Correction: A result remains a candidate until its inputs, method, configuration, uncertainty, and relevance have been checked.

Diagnostic questions

What should be integrated first?

Information flows with high decision consequence, frequent change, weak ownership, or costly manual reconciliation.

What would make this work reproducible?

Controlled inputs, method or code, versions, assumptions, outputs, and a stated interpretation tied to the decision.

Practice ladder

Basic

Build a tool-to-object responsibility matrix for a five-tool project.

Intermediate

Trace one requirement through CAD, analysis, test, and decision systems.

Advanced

Prioritize three integration investments using consequence, re-entry rate, detectability, and change frequency.

AI-assisted engineering task

Ask AI to classify a tool inventory by authored objects and candidate authorities. Do not allow it to infer current governance from product names.

How to prove the AI output yourself

  1. Interview actual data owners.
  2. Walk one real change end to end.
  3. Compare documented workflow with files and approvals actually used.

Retrieval and spaced review

Answer closed-notes today, then again after 1, 3, 7, and 30 days.

Define PDM/PLM.

Systems that control product definitions, configurations, changes, and lifecycle records.

What role does a repository play here?

A managed location for artifacts, not automatically an authoritative or semantically integrated system.

What must a reviewer be able to reconstruct?

That every critical object has one defined authority for its scope, and that every transformation has an owner and a validation rule.

End-of-lesson summary

Map tools by the engineering objects they author, consume, transform, approve, and archive. Then assign authority and interfaces for requirements, geometry, materials, analyses, manufacturing, test, measurement, and decisions.

Student notes

For each tool, write what it owns, what it copies, what it transforms, and which decision would fail if it were wrong.

Recommended readings

Instructor notes

Use the current project environment, including shared drives and email, instead of an ideal enterprise architecture.

6.2

Metadata, versioning, provenance, and configuration management

Why this lesson matters

A valid file can still be applied to the wrong product, revision, operating condition, or decision.

Learning objectives

  • Define metadata and version, and distinguish between them.
  • Apply the lesson method to the worked metadata, versioning, provenance, and configuration management case.
  • Evaluate evidence, uncertainty, and AI-assisted output before making a claim.

Readiness check

Before continuing, explain what decision this topic supports and name one upstream source that must be controlled.

Check your response

A sound answer names a specific engineering decision, its configuration, and a controlled requirement, model, dataset, interface, or standard that constrains the work.

Core idea

Metadata makes identity and context queryable. Versioning records artifact evolution; configuration management identifies compatible sets; provenance records origin and transformation. These controls overlap but are not interchangeable.

Key concepts

Metadata: Structured data describing identity, meaning, ownership, status, context, and relationships.
Version: A distinguishable state in an artifact's history.
Baseline: An approved configuration used as a reference for controlled change.
Configuration item: An element placed under configuration control because its identity and changes matter.

Step-by-step explanation

  1. Define stable identity separately from version.
  2. Record semantic fields such as units, reference frame, coordinate system, sample rate, and quantity definition.
  3. Create baselines that name compatible requirement, design, model, test, and software versions.
  4. Record derivation and transformation lineage.
  5. Use change control to review impact, approve disposition, and preserve history.
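
A minimal sketch of the first three steps, assuming a baseline is modeled as a named set of (stable identity, version) pairs. The artifact identifiers `BRACKET-CAD`, `MATERIAL-CARD`, and `LOAD-CASE` are hypothetical, as is the membership of each baseline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactVersion:
    artifact_id: str  # stable identity, never changes
    version: str      # one state in that artifact's history


@dataclass(frozen=True)
class Baseline:
    name: str
    members: frozenset  # ArtifactVersion items approved as a compatible set


def make_baseline(name, pairs):
    return Baseline(name, frozenset(ArtifactVersion(a, v) for a, v in pairs))


baseline_c = make_baseline("C", [("BRACKET-CAD", "C"),
                                 ("MATERIAL-CARD", "MAT-6"),
                                 ("LOAD-CASE", "LC-07B")])
baseline_d = make_baseline("D", [("BRACKET-CAD", "D"),
                                 ("MATERIAL-CARD", "MAT-6"),
                                 ("LOAD-CASE", "LC-07B")])


def changed_items(old, new):
    """Artifact IDs whose version differs, or which exist in only one baseline."""
    old_map = {m.artifact_id: m.version for m in old.members}
    new_map = {m.artifact_id: m.version for m in new.members}
    all_ids = old_map.keys() | new_map.keys()
    return sorted(a for a in all_ids if old_map.get(a) != new_map.get(a))


print(changed_items(baseline_c, baseline_d))
```

Keeping identity and version in separate fields is what lets the comparison report that the same artifact changed, rather than treating the new revision as an unrelated object.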

Worked example

Simulation run R44 used CAD-C, material card MAT-6, solver 2025.2, mesh script v3, and load case LC-07B. CAD-D exists, but R44 remains displayed as the latest stress result.

  1. Do not call R44 current merely because its timestamp is newest.
  2. Compare the release baseline with R44's configuration tuple.
  3. Mark R44 valid for baseline C and suspect for baseline D until geometry impact is assessed.
  4. Preserve the old result for decision history and create a new run rather than relabeling R44.

Result. Version and configuration semantics prevent a correct historical result from becoming incorrect current evidence.

Independent check. A query for baseline D cannot silently return R44 as applicable without an explicit reviewed equivalence.
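
The independent check can be expressed as an applicability query. The sketch below uses the R44 configuration tuple from the worked example; reducing each baseline to its CAD revision alone, and the `reviewed_equivalences` set, are simplifying assumptions made for illustration.

```python
runs = {
    "R44": {"cad": "CAD-C", "material": "MAT-6", "solver": "2025.2",
            "mesh_script": "v3", "load_case": "LC-07B"},
}

# Assumption: each baseline is reduced to the fields it requires.
baselines = {"C": {"cad": "CAD-C"}, "D": {"cad": "CAD-D"}}

# (run_id, baseline) pairs approved through change control; empty in this example.
reviewed_equivalences = set()


def applicability(run_id, baseline_name):
    """Classify a run against a baseline without using timestamps."""
    run = runs[run_id]
    required = baselines[baseline_name]
    if all(run.get(key) == value for key, value in required.items()):
        return "valid"
    if (run_id, baseline_name) in reviewed_equivalences:
        return "valid by reviewed equivalence"
    return "suspect"


print("R44 vs baseline C:", applicability("R44", "C"))
print("R44 vs baseline D:", applicability("R44", "D"))
```

Recency never enters the decision. R44 becomes usable for baseline D only if someone records a reviewed equivalence.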

Common misconceptions

Misconception: Latest timestamp means applicable.
Correction: Applicability depends on configuration and validity conditions, not recency alone.

Misconception: A tool output closes the question.
Correction: A result remains a candidate until its inputs, method, configuration, uncertainty, and relevance have been checked.

Diagnostic questions

Why preserve an obsolete result?

It explains earlier decisions, supports audit, and may remain valid for the historical baseline.

What would make this work reproducible?

Controlled inputs, method or code, versions, assumptions, outputs, and a stated interpretation tied to the decision.

Practice ladder

Basic

Distinguish file version, product revision, baseline, and run identifier.

Intermediate

Create a configuration tuple for an FEA run and a physical test.

Advanced

Design equivalence rules for reusing a test after a minor drawing change.

AI-assisted engineering task

Ask AI to extract candidate configuration metadata from run logs, returning unknown for missing values.

How to prove the AI output yourself

  1. Compare extracted fields with native tool metadata.
  2. Check controlled baseline membership.
  3. Reproduce the run or validate a checksum for critical inputs.
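
Validating a checksum for a critical input can be as simple as the sketch below, which uses SHA-256 from the Python standard library. The file name and material values are invented for the demonstration, and the temporary directory stands in for a controlled store.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large inputs do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    card = Path(tmp) / "material_card.txt"   # hypothetical input file
    card.write_bytes(b"E=70e9\nnu=0.33\n")
    recorded = sha256_of(card)               # stored with the run record
    unchanged = sha256_of(card) == recorded
    card.write_bytes(b"E=71e9\nnu=0.33\n")   # simulate a silent edit
    changed = sha256_of(card) != recorded

print("matches recorded checksum:", unchanged)
print("edit detected:", changed)
```

A matching checksum shows only that the bytes are the same as when the run was recorded. It says nothing about whether that input was the right one for the baseline.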

Retrieval and spaced review

Answer closed-notes today, then again after 1, 3, 7, and 30 days.

Define Metadata.

Structured data describing identity, meaning, ownership, status, context, and relationships.

What role does a version play here?

A distinguishable state in an artifact's history.

What must a reviewer be able to reconstruct?

The configuration tuple behind each result, so that a query for baseline D cannot silently return R44 as applicable without an explicit reviewed equivalence.

End-of-lesson summary

Metadata makes identity and context queryable. Versioning records artifact evolution; configuration management identifies compatible sets; provenance records origin and transformation. These controls overlap but are not interchangeable.

Student notes

Write the complete configuration tuple at the top of every analysis and test note.

Recommended readings

Instructor notes

Use several meanings of version in one example. Students should feel why a single version column is insufficient.

6.3

STEP / ISO 10303 and product-data interoperability

Why this lesson matters

Geometry that opens successfully in another tool may still lose product and manufacturing information, assembly structure, units, validation properties, or design intent.

Learning objectives

  • Define STEP and application protocol, and distinguish between them.
  • Apply the lesson method to the worked STEP / ISO 10303 and product-data interoperability case.
  • Evaluate evidence, uncertainty, and AI-assisted output before making a claim.

Readiness check

Before continuing, explain what decision this topic supports and name one upstream source that must be controlled.

Check your response

A sound answer names a specific engineering decision, its configuration, and a controlled requirement, model, dataset, interface, or standard that constrains the work.

Core idea

STEP, formally ISO 10303, provides standardized product-model data representations and exchange mechanisms used across CAD, CAE, PDM, manufacturing, and inspection. Conformance to a format improves interoperability but does not guarantee complete or correct transfer for every intended use.

Key concepts

STEP: The ISO 10303 family for computer-interpretable representation and exchange of product data.
Application protocol: A scoped information model for a class of product-data use, such as AP242.
PMI: Product and manufacturing information such as dimensions, tolerances, datums, and annotations.
Validation property: A reference value, such as area, volume, or centroid, used to check translation fidelity.

Step-by-step explanation

  1. Define the downstream use before selecting exchange content.
  2. Specify the required STEP application protocol, edition, geometry, assembly, PMI, and metadata.
  3. Export with units and validation properties.
  4. Import into the target and perform semantic and geometric conformance checks.
  5. Record translator versions, warnings, deviations, and acceptance disposition.
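
Step 5 implies a record that outlives the exchange itself. One possible shape is sketched below; the field names, the disposition rule, and every value in the example record are assumptions for illustration, not part of ISO 10303 or any translator's output.

```python
from dataclasses import dataclass, field


@dataclass
class ExchangeRecord:
    """Evidence for one STEP exchange (all values illustrative)."""
    downstream_use: str
    application_protocol: str
    source_translator: str
    target_translator: str
    warnings: list = field(default_factory=list)
    checks: dict = field(default_factory=dict)  # check name -> passed?

    def disposition(self):
        """Accept only when every declared check passed and no warning is open."""
        if not self.checks:
            return "undetermined: no checks recorded"
        if not all(self.checks.values()):
            return "reject"
        if self.warnings:
            return "conditional: warnings need review"
        return "accept"


record = ExchangeRecord(
    downstream_use="FEA preprocessing",
    application_protocol="AP242",
    source_translator="exporter (version to be recorded)",
    target_translator="importer (version to be recorded)",
    warnings=["face healed during import"],
    checks={"units": True, "volume": False},
)
print(record.disposition())
```

An exchange with no recorded checks is reported as undetermined rather than accepted, which keeps a successful import from counting as evidence by default.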

Worked example

A bracket STEP file imports with the expected shape. The receiving system reports volume 178.6 cm³, while the source CAD reports 181.0 cm³.

  1. Compute relative volume difference: (178.6 - 181.0)/181.0 = -1.33%.
  2. Check unit conversion, suppressed features, geometry healing, and translator warnings.
  3. Compare mass properties, face count, PMI association, assembly identity, and critical dimensions against the intended downstream use.
  4. Reject or conditionally accept based on predeclared tolerances and consequence, not visual similarity.

Result. A viewable solid is not sufficient evidence of a faithful product-data exchange. The 1.33% volume gap triggers investigation under the translation acceptance plan.
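
The volume check in the worked example reduces to a few lines. The two volumes come from the example; the 0.1% tolerance is a hypothetical predeclared limit chosen only to make the comparison concrete.

```python
source_volume_cm3 = 181.0   # reported by the source CAD
target_volume_cm3 = 178.6   # reported by the receiving system
tolerance = 0.001           # hypothetical predeclared limit (0.1 %)

relative_difference = (target_volume_cm3 - source_volume_cm3) / source_volume_cm3
within_tolerance = abs(relative_difference) <= tolerance

print(f"relative volume difference: {relative_difference:.2%}")
print("accept" if within_tolerance else "investigate")
```

The sign is kept because it carries information: a negative difference means the target lost volume, which points toward suppressed or healed features rather than a unit scaling error.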

Independent check. Source and target agree on required validation properties and semantic PMI within controlled criteria.

Common misconceptions

Misconception: Successful import proves interoperability.
Correction: The file may open while losing semantics, identity, precision, PMI association, or validation properties.

Misconception: A tool output closes the question.
Correction: A result remains a candidate until its inputs, method, configuration, uncertainty, and relevance have been checked.

Diagnostic questions

What should determine exchange checks?

The downstream engineering use and consequence of missing or changed information.

What would make this work reproducible?

Controlled inputs, method or code, versions, assumptions, outputs, and a stated interpretation tied to the decision.

Practice ladder

Basic

Explain why STL and STEP are not interchangeable for controlled product definition.

Intermediate

Design a translation acceptance checklist for FEA preprocessing.

Advanced

Decide which PMI and validation properties must survive a supplier-to-inspection workflow.

AI-assisted engineering task

Ask AI to summarize translator logs and group warnings by geometry, PMI, units, assembly, and metadata. It must not waive warnings.

How to prove the AI output yourself

  1. Compare validation properties numerically.
  2. Inspect critical features and PMI association.
  3. Run downstream reference tests and record translator versions.

Retrieval and spaced review

Answer closed-notes today, then again after 1, 3, 7, and 30 days.

Define STEP.

The ISO 10303 family for computer-interpretable representation and exchange of product data.

What role does an application protocol play here?

A scoped information model for a class of product-data use, such as AP242.

What must a reviewer be able to reconstruct?

That source and target agree on the required validation properties and semantic PMI within controlled criteria.

End-of-lesson summary

STEP, formally ISO 10303, provides standardized product-model data representations and exchange mechanisms used across CAD, CAE, PDM, manufacturing, and inspection. Conformance to a format improves interoperability but does not guarantee complete or correct transfer for every intended use.

Student notes

Never write 'imported successfully' without listing the content and properties actually checked.

Recommended readings

Instructor notes

If STEP tools are unavailable, use before-and-after metadata and validation-property tables. The lesson is about evidence of exchange fidelity, not about a particular tool.

6.4

Schemas, JSON, CSV, SQL, and AI-ready engineering data

Why this lesson matters

AI systems amplify whatever structure, omissions, and semantic ambiguity they receive. Data architecture quality bounds AI-assisted engineering quality.

Learning objectives

  • Define schema and primary key, and distinguish between them.
  • Apply the lesson method to the worked schemas, JSON, CSV, SQL, and AI-ready engineering data case.
  • Evaluate evidence, uncertainty, and AI-assisted output before making a claim.

Readiness check

Before continuing, explain what decision this topic supports and name one upstream source that must be controlled.

Check your response

A sound answer names a specific engineering decision, its configuration, and a controlled requirement, model, dataset, interface, or standard that constrains the work.

Core idea

A schema defines allowed entities, fields, types, units, relationships, and constraints. JSON is useful for nested interchange, CSV for flat tables, and SQL for constrained persistent relationships. Choose representations by data behavior, then validate at boundaries.
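
To make the contrast between nested and flat representations concrete, the sketch below writes one hypothetical test record both ways using only the standard library. The identifiers and values are invented.

```python
import csv
import io
import json

# One test with two observations (hypothetical values).
test_record = {
    "test_id": "T-1",
    "configuration": "C",
    "observations": [
        {"observation_id": "O-1", "quantity": "force",
         "value_numeric": 2.0, "unit": "kN"},
        {"observation_id": "O-2", "quantity": "temperature",
         "value_numeric": 21.5, "unit": "degC"},
    ],
}

# JSON keeps the one-to-many relationship as nesting.
as_json = json.dumps(test_record, indent=2)

# CSV flattens it: the parent fields repeat on every observation row.
fieldnames = ["test_id", "configuration", "observation_id",
              "quantity", "value_numeric", "unit"]
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
writer.writeheader()
for obs in test_record["observations"]:
    writer.writerow({"test_id": test_record["test_id"],
                     "configuration": test_record["configuration"], **obs})
as_csv = buffer.getvalue()

print(as_json)
print(as_csv)
```

The repeated `test_id` and `configuration` columns in the CSV are exactly the duplicates that a relational design would move into a parent table referenced by key.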

Key concepts

Schema: A machine-readable or documented contract for data structure and semantics.
Primary key: A stable value that uniquely identifies a database row or entity.
Foreign key: A constrained reference to another entity that enforces relationship integrity.
Data contract: Agreed structure, meaning, quality rules, ownership, and change policy for exchanged data.

Step-by-step explanation

  1. Model entities and relationships from engineering questions.
  2. Assign stable keys, types, units, enumerations, required fields, and validity ranges.
  3. Normalize persistent data enough to avoid inconsistent duplicates while preserving usable queries.
  4. Version schemas and provide migrations for controlled changes.
  5. Expose AI only to authorized, relevant fields and require outputs that preserve source identifiers.
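
Steps 1 to 3 can be tried with SQLite from the Python standard library. The table names, columns, and vocabularies below are a teaching sketch under assumed requirements, not a recommended production schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE test (
    test_id       TEXT PRIMARY KEY,
    configuration TEXT NOT NULL
);
CREATE TABLE observation (
    observation_id TEXT PRIMARY KEY,
    test_id        TEXT NOT NULL REFERENCES test(test_id),
    quantity       TEXT NOT NULL CHECK (quantity IN ('force', 'temperature')),
    value_numeric  REAL NOT NULL,
    unit           TEXT NOT NULL CHECK (unit IN ('N', 'kN', 'K', 'degC'))
);
""")
conn.execute("INSERT INTO test VALUES ('T-1', 'baseline C')")
conn.execute("INSERT INTO observation VALUES ('O-1', 'T-1', 'force', 2.0, 'kN')")


def try_insert(row):
    """Attempt an insert and report whether the schema accepted it."""
    try:
        conn.execute("INSERT INTO observation VALUES (?, ?, ?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


print(try_insert(("O-2", "T-404", "force", 1.0, "kN")))   # unknown test_id
print(try_insert(("O-3", "T-1", "pressure", 1.0, "kN")))  # quantity not in vocabulary
print(try_insert(("O-1", "T-1", "force", 3.0, "kN")))     # duplicate primary key
```

These column-level constraints do not stop a temperature from being stored in kN. Checking quantity and unit together needs either a table-level constraint or validation at the boundary.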

Worked example

A CSV uses columns `value`, `units`, and `test`, but `value` sometimes means force, temperature, or pass/fail code. An AI summary compares rows numerically.

  1. Stop the comparison because the schema does not identify quantity kind or data type.
  2. Create fields for observation_id, quantity, value_numeric, value_text, unit, location, time, uncertainty, test_id, and configuration.
  3. Use controlled quantity and unit vocabularies with type validation.
  4. Preserve raw source and transformation lineage before generating summaries.

Result. AI readiness comes from explicit semantics and traceability, not from converting files to a format accepted by a model API.

Independent check. Invalid quantity-unit combinations and missing configuration fail validation before analysis or AI use.
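
A boundary validator for this check might look like the sketch below. It uses a subset of the fields from the worked example; the allowed quantity-unit vocabulary and the three sample rows are hypothetical.

```python
import csv
import io

# Hypothetical controlled vocabulary: which units each quantity may use.
ALLOWED_UNITS = {"force": {"N", "kN"}, "temperature": {"K", "degC"}}

raw = io.StringIO(
    "observation_id,quantity,value_numeric,unit,configuration\n"
    "O-1,force,2.0,kN,C\n"
    "O-2,temperature,350,kN,C\n"
    "O-3,force,1.5,N,\n"
)


def validate(row):
    """Return a list of problems for one row; an empty list means it passed."""
    errors = []
    quantity, unit = row["quantity"], row["unit"]
    if unit not in ALLOWED_UNITS.get(quantity, set()):
        errors.append(f"unit {unit!r} not allowed for quantity {quantity!r}")
    if not row["configuration"]:
        errors.append("missing configuration")
    try:
        float(row["value_numeric"])
    except ValueError:
        errors.append("value_numeric is not a number")
    return errors


results = {row["observation_id"]: validate(row) for row in csv.DictReader(raw)}
for observation_id, errors in results.items():
    print(observation_id, "OK" if not errors else errors)
```

Rows that fail are reported with their identifier rather than dropped, so the rejected data stays traceable to its source.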

Common misconceptions

Misconception: Structured data is automatically meaningful.
Correction: Fields can be perfectly structured and semantically ambiguous, inconsistent, or disconnected from authority.

Misconception: A tool output closes the question.
Correction: A result remains a candidate until its inputs, method, configuration, uncertainty, and relevance have been checked.

Diagnostic questions

What makes data AI-ready?

Relevant controlled semantics, quality, provenance, permissions, configuration, and source-linked outputs.

What would make this work reproducible?

Controlled inputs, method or code, versions, assumptions, outputs, and a stated interpretation tied to the decision.

Practice ladder

Basic

Choose JSON, CSV, or SQL for five engineering data cases and justify each choice.

Intermediate

Design a schema for calibrated temperature observations.

Advanced

Plan a backward-compatible schema migration that adds uncertainty and location to existing test rows.

AI-assisted engineering task

Ask AI to propose a schema from sample records, then challenge it with nulls, mixed units, revisions, and one-to-many relationships.

How to prove the AI output yourself

  1. Validate against real query needs.
  2. Use database constraints and test cases.
  3. Review semantics with domain owners.
  4. Measure information loss during migration.

Retrieval and spaced review

Answer closed-notes today, then again after 1, 3, 7, and 30 days.

Define Schema.

A machine-readable or documented contract for data structure and semantics.

What role does a primary key play here?

A stable value that uniquely identifies a database row or entity.

What must a reviewer be able to reconstruct?

That invalid quantity-unit combinations and missing configuration fail validation before analysis or AI use.

End-of-lesson summary

A schema defines allowed entities, fields, types, units, relationships, and constraints. JSON is useful for nested interchange, CSV for flat tables, and SQL for constrained persistent relationships. Choose representations by data behavior, then validate at boundaries.

Student notes

Write three questions the data must answer before choosing tables or files.

Recommended readings

Instructor notes

Avoid presenting SQL normalization as the course goal. The goal is reliable engineering queries and controlled semantics.

LAB 6

Lab 6: Build a simple evidence dashboard report

Lab objective

Query structured evidence records and generate a static HTML report that shows coverage, status, configuration, and uncertainty flags.

Engineering context

A design review needs a reproducible status view without treating the dashboard as the authority.

Input data

  • Requirement and evidence dictionaries
  • Typed link records
  • A current baseline

Step-by-step task

  1. Compute coverage
  2. Flag stale or draft evidence
  3. Generate an HTML table
  4. Include source identifiers and a generation timestamp

Python code

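from datetime import datetime, timezone
from html import escape

# Sample records; in practice these are exported from the controlled sources.
requirements = {"BRK-001": "mass <= 0.50 kg", "BRK-002": "limit load 2.0 kN"}
evidence = {
    "E-MASS": {"status": "approved", "baseline": "C", "result": "0.46 kg"},
    "E-STRESS": {"status": "reviewed", "baseline": "B", "result": "112 MPa"},
}
links = {"BRK-001": ["E-MASS"], "BRK-002": ["E-STRESS"]}
baseline = "C"  # current baseline; evidence from any other baseline is stale

# Step 2: flag stale, draft, or missing evidence for each requirement.
rows = []
for rid, text in requirements.items():
    linked = links.get(rid, [])
    flags = []
    for eid in linked:
        record = evidence.get(eid)
        if record is None:
            flags.append(f"{eid}: missing evidence record")
            continue
        if record["baseline"] != baseline:
            flags.append(f"{eid}: stale baseline")
        if record["status"] == "draft":
            flags.append(f"{eid}: draft")
    rows.append((rid, text, ", ".join(linked) or "none", "; ".join(flags) or "none"))

# Step 1: coverage counts requirements with at least one linked evidence record.
covered = sum(1 for rid in requirements if links.get(rid))
coverage = f"{covered} of {len(requirements)} requirements linked to evidence"

# Step 3: escape every cell so identifiers and text cannot inject markup.
body = "".join("<tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in row) + "</tr>"
               for row in rows)
# Step 4: a UTC timestamp records when this view was generated.
stamp = datetime.now(timezone.utc).isoformat()
report = (f"<h1>Evidence review</h1><p>Generated {stamp}</p>"
          f"<p>{escape(coverage)}</p><table>{body}</table>")
print(report)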

Explanation of code

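  1. Coverage: each requirement is looked up in links; a requirement with no linked evidence shows none in the evidence column.
  2. Flags: each linked evidence record is compared with the current baseline, and a mismatch is flagged as stale.
  3. HTML table: every cell is escaped and written as one table row per requirement.
  4. Identifiers and timestamp: requirement and evidence identifiers appear in each row, and the report records its UTC generation time.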

Expected output

An HTML fragment with two requirement rows and a stale-baseline flag on E-STRESS.

Interpretation

The report is a generated view. Reviewers follow identifiers to authoritative evidence before accepting a claim.

Common errors

  • Hiding the baseline rule
  • Displaying pass without uncertainty or status
  • Letting the report become an editable system of record

Extension tasks

  • Write the report to a file
  • Add orphan evidence
  • Add risk-prioritized sorting and provenance links
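
As a possible starting point for the orphan-evidence extension, the sketch below compares evidence identifiers with link targets using set arithmetic. `E-FATIGUE` is a hypothetical extra record added only so that the check has something to find.

```python
evidence_ids = {"E-MASS", "E-STRESS", "E-FATIGUE"}  # E-FATIGUE is hypothetical
links = {"BRK-001": ["E-MASS"], "BRK-002": ["E-STRESS"]}

linked_ids = {eid for eids in links.values() for eid in eids}

orphans = sorted(evidence_ids - linked_ids)   # evidence no requirement points to
dangling = sorted(linked_ids - evidence_ids)  # links that point to nothing

print("orphan evidence:", orphans)
print("dangling links:", dangling)
```

Both directions matter: orphan evidence may be wasted or misfiled work, while a dangling link means a requirement claims support that cannot be found.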

Reflection questions

  • What is authoritative here?
  • Why include generation time?
  • What would make the dashboard misleading?

PROJECT

Mini-project 2: Product information architecture

Deliverable

A tool and authority map, schema, configuration tuple, STEP exchange acceptance plan, and generated evidence report for a small mechanical product.

Required checks

At least five systems or repositories, three controlled transformations, schema constraints, one translation validation property, and one deliberate failure test.

WEEK 6

Weekly quiz and concept check

Closed notes. Answer each item, then use the key to correct in a different color.

  1. What should precede a tool inventory?
  2. Distinguish version and configuration.
  3. What does STEP address?
  4. Why is visual geometry comparison insufficient?
  5. What is a schema?
  6. Why does poor data architecture weaken AI?

Answer key

  1. The engineering objects, decisions, owners, and lifecycle responsibilities.
  2. Version is an artifact state; configuration is a compatible set representing a product or analysis state.
  3. Standardized computer-interpretable representation and exchange of product-model data.
  4. It can miss semantic, PMI, assembly, unit, and validation-property differences.
  5. A contract for data entities, fields, types, semantics, relationships, and constraints.
  6. It supplies ambiguous, stale, untraceable, or unauthorized inputs and prevents evidence-linked verification.

SOURCES

Module source map

Source: How it is used
NIST STEP / ISO 10303 resources: Product-model data exchange, CAD/CAE/PDM interoperability, PMI, and long-term product information.
NIST Digital Thread for Smart Manufacturing: Design, manufacturing, inspection, and product-support interoperability and feedback.
DoDI 5000.97, Digital Engineering: Operational definitions of digital engineering, digital models, digital artifacts, authoritative data, test, and sustainment.
Digital Engineering Body of Knowledge (DEBoK): Terminology, knowledge areas, implementation practices, tools, people, and organizational considerations.

Access labels and full-course source notes are on the course home page. Paywalled standards are not paraphrased as if their full text were accessed.