Add structural schemas #517

@jdegoes

Structural Type Schema Support

Overview

Extend Schema[A] to support structural types, enabling schema derivation for types defined by their structure rather than their nominal identity. This allows for duck-typed schema validation and conversion between nominal and structural representations.

Core Concepts

Direct Structural Schema Derivation

Schemas can be derived directly for structural types:

// Scala 2 and Scala 3 (identical syntax)
type Person = { def name: String; def age: Int }
val schema = Schema.derived[Person]

Note: Structural types are written with def members in both Scala 2 and Scala 3 for uniformity, even though val members are also allowed in structural types.

Implementation: Schemas have bindings, which allow construction / deconstruction of values. Values for structural types are backed by:

  • Scala 3: Selectable
  • Scala 2: Dynamic
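
As a rough illustration of the Scala 3 side, the sketch below backs a structural value with a Selectable whose fields live in a map. FieldRecord and PersonRecord are hypothetical names used only here, not part of the proposed API. The refinement is written on FieldRecord (a Selectable subtype) so that the compiler routes member access through selectDynamic.

```scala
// Scala 3 sketch; FieldRecord and PersonRecord are hypothetical, not library types.
final class FieldRecord(fields: Map[String, Any]) extends Selectable {
  // Called by the compiler for every member selected through a refinement.
  def selectDynamic(name: String): Any = fields(name)
}

// Refinement of a Selectable subtype: member access compiles to selectDynamic.
type PersonRecord = FieldRecord { def name: String; def age: Int }

// The cast is unavoidable: the map contents are not visible to the type checker.
val alice: PersonRecord =
  new FieldRecord(Map("name" -> "Alice", "age" -> 30)).asInstanceOf[PersonRecord]
```

Here alice.name is checked as a String at compile time, but the value is looked up by name at run time, so a missing or mistyped map entry only fails when the field is accessed.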

Nominal to Structural Conversion

Convert nominal type schemas to their structural equivalents:

case class Person(name: String, age: Int)

// Get the structural schema corresponding to Person's shape
val structuralSchema: Schema[{ def name: String; def age: Int }] = 
  Schema.derived[Person].structural

Schema API Extension

case class Schema[A](/* existing fields */) {
  /**
   * Convert this schema to a structural type schema.
   * 
   * The structural type represents the "shape" of A without its nominal identity.
   * This enables duck typing and structural validation.
   * 
   * @param toStructural Macro-generated conversion to structural representation
   * @return Schema for the structural type corresponding to A
   */
  def structural(implicit toStructural: ToStructural[A]): Schema[toStructural.StructuralType] = 
    toStructural.apply(this)
}

/**
 * Type class for converting nominal schemas to structural schemas.
 * Generated by macro for all supported types. Macro fails if a structural
 * type cannot be generated.
 *
 * NOTE: This approach has to be tested to yield inferrable types, and revised
 * if necessary. Inferrable types (from calling Schema#structural) are a must-have.
 */
trait ToStructural[A] {
  type StructuralType
  def apply(schema: Schema[A]): Schema[StructuralType]
}

object ToStructural {
  type Aux[A, S] = ToStructural[A] { type StructuralType = S }
  
  // Scala 3
  transparent inline given [A]: ToStructural[A] = ${toStructuralMacro[A]}
  
  // Scala 2
  implicit def materialize[A]: ToStructural[A] = macro toStructuralImpl[A]
}
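
The inference concern in the NOTE above can be tried out without any macro. The following sketch uses a hand-written instance in place of the macro-generated one and a stripped-down MiniSchema in place of the real Schema; both, along with ToStructuralSketch, Point, PointShape, and structuralOf, are assumptions made for illustration only.

```scala
// Stand-in for Schema[A]; it only carries a type name in this sketch.
final case class MiniSchema[A](typeName: String)

trait ToStructuralSketch[A] {
  type StructuralType
  def apply(schema: MiniSchema[A]): MiniSchema[StructuralType]
}

object ToStructuralSketch {
  type Aux[A, S] = ToStructuralSketch[A] { type StructuralType = S }
}

final case class Point(x: Int, y: Int)
type PointShape = { def x: Int; def y: Int }

// Hand-written instance; in the proposal a macro would generate this.
implicit val pointToStructural: ToStructuralSketch.Aux[Point, PointShape] =
  new ToStructuralSketch[Point] {
    type StructuralType = PointShape
    def apply(schema: MiniSchema[Point]): MiniSchema[PointShape] =
      MiniSchema[PointShape]("{x:Int,y:Int}")
  }

// Dependent method type: the result type follows the resolved instance.
def structuralOf[A](schema: MiniSchema[A])(
  implicit ts: ToStructuralSketch[A]
): MiniSchema[ts.StructuralType] = ts(schema)

// Compiles only if StructuralType is inferred as PointShape.
val pointShapeSchema: MiniSchema[PointShape] =
  structuralOf(MiniSchema[Point]("Point"))
```

The type ascription on pointShapeSchema is the actual check: it only compiles because the instance is declared with the Aux type, which keeps the StructuralType member visible to the caller. Whether a macro-materialized instance preserves the refinement in the same way is what the NOTE asks to be tested.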

Examples

1. Simple Product Types

Case Class to Structural

// Both Scala 2 and Scala 3
case class Person(name: String, age: Int)

// Original nominal schema
val nominalSchema: Schema[Person] = Schema.derived[Person]

// Convert to structural
val structuralSchema: Schema[{ def name: String; def age: Int }] = 
  nominalSchema.structural

// Direct structural derivation (equivalent)
val directStructural: Schema[{ def name: String; def age: Int }] = 
  Schema.derived[{ def name: String; def age: Int }]

2. Nested Structures

case class Address(street: String, city: String, zip: Int)
case class Person(name: String, age: Int, address: Address)

val structuralSchema = Schema.derived[Person].structural
// Type: Schema[{ 
//   def name: String
//   def age: Int
//   def address: { def street: String; def city: String; def zip: Int }
// }]

3. Collections and Options

case class Team(name: String, members: List[String], leader: Option[String])

val structuralSchema = Schema.derived[Team].structural
// Type: Schema[{
//   def name: String
//   def members: List[String]
//   def leader: Option[String]
// }]

4. Tuples to Structural

// Tuples can be converted to structural types
val tupleSchema: Schema[(String, Int, Boolean)] = Schema.derived[(String, Int, Boolean)]

val structuralSchema = tupleSchema.structural
// Type: Schema[{ def _1: String; def _2: Int; def _3: Boolean }]

5. Sum Types (Sealed Traits) - Scala 3 Only

Sealed traits become union types of structural representations, with tag information stored at the type level:

// Scala 3 only
sealed trait Result
case class Success(value: Int) extends Result
case class Failure(error: String) extends Result

val structuralSchema = Schema.derived[Result].structural
// Type: Schema[
//   { type Tag = "Success"; def value: Int } | { type Tag = "Failure"; def error: String }
// ]

Note: Sum type to structural conversion is not supported in Scala 2 because it requires union types. Attempting to call .structural on a sealed trait schema in Scala 2 will result in a compile-time error.

6. Enums (Scala 3 Only)

enum Status:
  case Active, Inactive, Suspended

val structuralSchema = Schema.derived[Status].structural
// Type: Schema[{type Tag = "Active"} | {type Tag = "Inactive"} | {type Tag = "Suspended"}]

7. Opaque Types (Scala 3)

opaque type UserId = String
object UserId:
  def apply(value: String): Either[String, UserId] = 
    if value.nonEmpty then Right(value) else Left("Empty user ID")

case class User(id: UserId, name: String)

val structuralSchema = Schema.derived[User].structural
// Type: Schema[{ def id: String; def name: String }]
// Opaque type is unwrapped to its underlying type

8. Bidirectional Conversion

Structural schemas work with Into/As, provided the Into/As ticket is implemented before this one:

case class Person(name: String, age: Int)

val structuralSchema = Schema.derived[Person].structural

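// Create structural value (Scala 3)
// The cast is needed because the anonymous Selectable does not declare name/age itself
val structuralPerson = (new Selectable {
  def selectDynamic(field: String): Any = field match {
    case "name" => "Alice"
    case "age" => 30
  }
}).asInstanceOf[{ def name: String; def age: Int }]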

// Convert structural to nominal using Into
val person: Either[SchemaError, Person] = 
  Into[{ def name: String; def age: Int }, Person].into(structuralPerson)
// => Right(Person("Alice", 30))

// Convert nominal to structural
val backToStructural: Either[SchemaError, { def name: String; def age: Int }] = 
  Into[Person, { def name: String; def age: Int }].into(Person("Bob", 25))

9. Empty and Single-Field Products

// Empty case class
case class Empty()
val emptyStructural = Schema.derived[Empty].structural
// Type: Schema[{}]

// Single field
case class Id(value: String)
val idStructural = Schema.derived[Id].structural
// Type: Schema[{ def value: String }]

10. Large Products

case class LargeRecord(
  f1: String, f2: Int, f3: Boolean, f4: Double, f5: Long,
  f6: String, f7: Int, f8: Boolean, f9: Double, f10: Long,
  f11: String, f12: Int, f13: Boolean, f14: Double, f15: Long,
  f16: String, f17: Int, f18: Boolean, f19: Double, f20: Long,
  f21: String
)

val structuralSchema = Schema.derived[LargeRecord].structural
// Type: Schema[{
//   def f1: String; def f2: Int; def f3: Boolean; ...
//   def f21: String
// }]

Type Name Handling

Current Limitation

Schemas currently use TypeName[A] to identify types. Structural types don't have meaningful nominal type names, which creates a mismatch.

Temporary Solution

Until TypeName[A] is replaced with TypeId[A] (see issue #471), structural schemas will use a normalized string representation of the structural type as a fake type name:

case class Person(name: String, age: Int)

val schema = Schema.derived[Person]
schema.typeName // => TypeName for "Person"

val structural = schema.structural
structural.typeName // => TypeName for "{age:Int,name:String}"
// Normalized: fields sorted alphabetically, simple names for primitives and standard library types

Normalization Rules

  1. Field ordering: Alphabetical by field name
  2. Type qualification: Use simple names for primitives and standard library types
  3. Whitespace: No whitespace in generated names
  4. Collections: Standard notation (e.g., List[Int])
  5. Options: Explicit Option[T] notation
  6. Nested structures: Recursive application of rules
  7. Deterministic: Same structure always produces same normalized name
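
The rules above can be sketched as a small recursive function over a description of the structural type. The TypeShape hierarchy and normalizeShape below are hypothetical helpers for illustration; the real implementation would work on compiler types inside the macro. Sorting union cases by their normalized text is an assumption taken from the union example in this issue.

```scala
// Hypothetical description of a structural type, used only by this sketch.
sealed trait TypeShape
final case class Named(name: String) extends TypeShape                       // Int, String, ...
final case class Applied(ctor: String, args: List[TypeShape]) extends TypeShape // List[T], Option[T]
final case class Fields(members: List[(String, TypeShape)]) extends TypeShape   // { def a: A; ... }
final case class OneOf(cases: List[TypeShape]) extends TypeShape                // A | B

def normalizeShape(shape: TypeShape): String = shape match {
  case Named(name) =>
    name
  case Applied(ctor, args) =>
    args.map(normalizeShape).mkString(s"$ctor[", ",", "]")
  case Fields(members) =>
    // Rule 1: alphabetical by field name; rule 3: no whitespace; rule 6: recurse.
    members
      .sortBy(_._1)
      .map { case (name, tpe) => s"$name:${normalizeShape(tpe)}" }
      .mkString("{", ",", "}")
  case OneOf(cases) =>
    // Assumption: union cases are ordered by their normalized text.
    cases.map(normalizeShape).sorted.mkString("|")
}
```

Because every branch sorts before joining, two shapes that differ only in declaration order produce the same string, which is what rule 7 requires.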

Examples

// Simple product
case class Point(x: Int, y: Int)
Schema.derived[Point].structural.typeName 
// => "{x:Int,y:Int}"

// Nested product
case class Address(street: String, zip: Int)
case class Person(name: String, address: Address)
Schema.derived[Person].structural.typeName
// => "{address:{street:String,zip:Int},name:String}"

// With collections
case class Team(name: String, members: List[String])
Schema.derived[Team].structural.typeName
// => "{members:List[String],name:String}"

// Union type (Scala 3)
sealed trait Result
case class Success(value: Int) extends Result
case class Failure(error: String) extends Result
Schema.derived[Result].structural.typeName
// => "{error:String}|{value:Int}"

Future: TypeId[A]

The upcoming TypeId[A] replacement will properly handle structural types by representing them by their structure rather than a string-based hack. See issue #471 for details.


Limitations and Edge Cases

1. Generic Types

Behavior depends on existing Schema derivation support for generic types.

If Schema.derived[Container[Int]] already works, then structural conversion should work:

case class Container[T](value: T)

// If this works:
val schema = Schema.derived[Container[Int]]

// Then this should work:
val structural = schema.structural
// Type: Schema[{ def value: Int }]

If generic type derivation is not currently supported, this ticket does not require implementing it. The macro should produce a clear compile-time error for unsupported generic types.

2. Recursive Types

Recursive types fail at compile time because their structural form would be an infinite type, which Scala cannot express:

case class Tree(value: Int, children: List[Tree])

// This will FAIL at compile-time:
val structural = Schema.derived[Tree].structural
// Compile error: Cannot generate infinite structural type

// The structural type would need to be:
// { def value: Int; def children: List[{ def value: Int; def children: List[...] }] }
// which is infinite and unsupported

The macro must detect recursive types and produce a helpful error message:

Compile error: Cannot generate structural type for recursive type Tree.
Structural types cannot represent recursive structures.

3. Mutually Recursive Types

Similarly, mutually recursive types are unsupported:

case class Node(id: Int, edges: List[Edge])
case class Edge(from: Int, to: Node)

// This will FAIL at compile-time:
val nodeStructural = Schema.derived[Node].structural
// Compile error: Cannot generate structural type for mutually recursive types
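
Both the directly recursive and the mutually recursive case reduce to finding a cycle in the graph of field types. A minimal sketch of that check follows, assuming the macro can enumerate the case-class types referenced by each type's fields; the map of type names and findTypeCycle are stand-ins for illustration, since a real macro would walk compiler symbols.

```scala
// fieldTypes maps a type name to the case-class types its fields mention
// (including those nested inside List, Option, and so on).
def findTypeCycle(
  root: String,
  fieldTypes: Map[String, List[String]]
): Option[List[String]] = {
  // path holds the types currently being expanded, innermost first.
  def visit(current: String, path: List[String]): Option[List[String]] =
    if (path.contains(current))
      // Report the cycle in declaration order, starting and ending at current.
      Some((current :: path).reverse.dropWhile(_ != current))
    else
      fieldTypes
        .getOrElse(current, Nil)
        .iterator
        .map(next => visit(next, current :: path))
        .collectFirst { case Some(cycle) => cycle }

  visit(root, Nil)
}
```

The returned path (for example Node, Edge, Node) could be included in the error message so that users can see which reference closes the loop.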

4. Sum Types in Scala 2

Sealed traits and sum types cannot be converted to structural types in Scala 2 because they require union types:

// Scala 2
sealed trait Result
case class Success(value: Int) extends Result
case class Failure(error: String) extends Result

// This will FAIL at compile-time in Scala 2:
val structural = Schema.derived[Result].structural
// Compile error: Cannot generate structural type for sum types in Scala 2.
// Union types are required, which are only available in Scala 3.

The macro must detect sum types in Scala 2 and produce a clear error.

5. Case Objects

Case objects become empty structural types:

case object Singleton

val structural = Schema.derived[Singleton.type].structural
// Type: Schema[{}]

For sum types with case objects (Scala 3):

sealed trait Status
case object Active extends Status
case object Inactive extends Status

val structural = Schema.derived[Status].structural
// Type: Schema[{ type Tag = "Active" } | { type Tag = "Inactive" }]
// No fields; only the type-level tag distinguishes the cases

6. Structural Types as Source

Deriving schemas directly for structural types is supported:

type PersonStructure = { def name: String; def age: Int }

val schema = Schema.derived[PersonStructure]
// Should work if structural type derivation is implemented

The schema's bindings will use Selectable (Scala 3) or Dynamic (Scala 2) to construct and deconstruct values.
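
For the Scala 2 side, a Dynamic-backed value might look roughly like the sketch below. DynamicRecord is a hypothetical name. The sketch only shows name-based field lookup; it does not show how such a value would be typed as a structural type, which this ticket leaves to the implementation.

```scala
import scala.language.dynamics

// Hypothetical field container; compiles on Scala 2.13 and Scala 3.
final class DynamicRecord(fields: Map[String, Any]) extends Dynamic {
  // record.foo is rewritten by the compiler to record.selectDynamic("foo").
  def selectDynamic(name: String): Any =
    fields.getOrElse(name, throw new NoSuchElementException(s"missing field: $name"))
}

val bob = new DynamicRecord(Map("name" -> "Bob", "age" -> 25))
```

Unlike the Selectable refinement in Scala 3, bob.name here is statically typed as Any, and a misspelled field name is only detected at run time.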


Integration with Into/As

Structural schemas compose naturally with Into/As conversions.

Nominal → Structural

case class Person(name: String, age: Int)

// Auto-derived conversion
val nominalToStructural: Into[Person, { def name: String; def age: Int }] = 
  Into.derived

val person = Person("Alice", 30)
val structural = nominalToStructural.into(person)
// => Right(<Selectable/Dynamic instance>)

Structural → Nominal

type PersonStructure = { def name: String; def age: Int }
case class Person(name: String, age: Int)

// Auto-derived conversion
val structuralToNominal: Into[PersonStructure, Person] = 
  Into.derived

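// The cast is needed because the anonymous Selectable does not declare name/age itself
val structural: PersonStructure = (new Selectable {
  def selectDynamic(field: String): Any = field match {
    case "name" => "Bob"
    case "age" => 25
  }
}).asInstanceOf[PersonStructure]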

val person = structuralToNominal.into(structural)
// => Right(Person("Bob", 25))

Bidirectional (As)

case class Person(name: String, age: Int)

// Bidirectional conversion
val personAs: As[Person, { def name: String; def age: Int }] = 
  As.derived

// Nominal → Structural
val structural = personAs.into(Person("Alice", 30))

// Structural → Nominal
val nominal = structural.flatMap(personAs.from)
// Round-trip successful

Schema-Guided Conversion

case class PersonV1(firstName: String, lastName: String, age: Int)
case class PersonV2(name: String, age: Int)

// Use structural type as intermediary
type PersonStructure = { def name: String; def age: Int }

// Step 1: Transform V1 to structural (custom logic)
val v1ToStructural: Into[PersonV1, PersonStructure] = 
  new Into[PersonV1, PersonStructure] {
    def into(v1: PersonV1): Either[SchemaError, PersonStructure] = {
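      // The cast is needed because the anonymous Selectable does not declare name/age itself
      Right((new Selectable {
        def selectDynamic(field: String): Any = field match {
          case "name" => s"${v1.firstName} ${v1.lastName}"
          case "age" => v1.age
        }
      }).asInstanceOf[PersonStructure])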
    }
  }

// Step 2: Auto-convert structural to V2
val structuralToV2: Into[PersonStructure, PersonV2] = Into.derived

// Composed migration
def migrate(v1: PersonV1): Either[SchemaError, PersonV2] = {
  v1ToStructural.into(v1).flatMap(structuralToV2.into)
}
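
The composition pattern can be tried in isolation. The sketch below is Scala 3 only and self-contained: IntoSketch is a stand-in for Into with String errors instead of SchemaError, FieldBag is a hypothetical Selectable container, and the second step is written by hand where the proposal would use Into.derived. The age check is invented purely to show an error propagating through the composition.

```scala
// Hypothetical Selectable-backed container for the intermediary shape.
final class FieldBag(fields: Map[String, Any]) extends Selectable {
  def selectDynamic(name: String): Any = fields(name)
}
type PersonShape = FieldBag { def name: String; def age: Int }

final case class PersonV1(firstName: String, lastName: String, age: Int)
final case class PersonV2(name: String, age: Int)

// Stand-in for Into; errors are plain strings in this sketch.
trait IntoSketch[A, B] { self =>
  def into(a: A): Either[String, B]

  // Run this conversion, then feed a successful result to the next one.
  def andThen[C](next: IntoSketch[B, C]): IntoSketch[A, C] =
    new IntoSketch[A, C] {
      def into(a: A): Either[String, C] = self.into(a).flatMap(next.into)
    }
}

val v1ToShape: IntoSketch[PersonV1, PersonShape] =
  new IntoSketch[PersonV1, PersonShape] {
    def into(v1: PersonV1): Either[String, PersonShape] =
      Right(
        new FieldBag(Map("name" -> s"${v1.firstName} ${v1.lastName}", "age" -> v1.age))
          .asInstanceOf[PersonShape]
      )
  }

// Written by hand here; the proposal would derive this step.
val shapeToV2: IntoSketch[PersonShape, PersonV2] =
  new IntoSketch[PersonShape, PersonV2] {
    def into(p: PersonShape): Either[String, PersonV2] =
      if (p.age >= 0) Right(PersonV2(p.name, p.age))
      else Left(s"negative age: ${p.age}")
  }

val migrateV1ToV2: IntoSketch[PersonV1, PersonV2] = v1ToShape.andThen(shapeToV2)
```

Keeping the custom logic in the first step and the mechanical mapping in the second means only the first step needs to change when the source version changes.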

Testing Requirements

Test Matrix

  1. Direct Structural Derivation

    • Simple products (case classes)
    • Nested products
    • Collections (List, Vector, Set, Map, Option, Either)
    • Tuples (2-22 elements)
    • Empty case classes
    • Single-field case classes
    • Large products (20+ fields)
    • Case objects
  2. Nominal to Structural Conversion

    • Case class → structural
    • Tuple → structural
    • Nested case classes → nested structural
    • Case class with collections → structural with collections
    • Empty case class → empty structural
  3. Sum Types (Scala 3 Only)

    • Sealed trait → union type structural
    • Sealed trait with case objects
    • Enum → union type structural
    • Nested sum types
  4. Type Name Generation

    • Simple product normalized name
    • Nested product normalized name
    • Name determinism (same structure = same name)
    • Alphabetical field ordering in names
    • Union type names (Scala 3)
  5. Selectable/Dynamic Implementation

    • Scala 3 Selectable field access
    • Scala 2 Dynamic field access
    • Field access correctness
    • Missing field behavior
    • Extra field behavior
  6. Integration with Into/As

    • Nominal → Structural via Into
    • Structural → Nominal via Into
    • Round-trip via As
    • Composed conversions with structural intermediary
  7. Error Cases (Compile-Time)

    • Recursive types produce error
    • Mutually recursive types produce error
    • Sum types in Scala 2 produce error
    • Unsupported types produce helpful errors
  8. Generic Types (if supported by existing Schema derivation)

    • Fully applied generic → structural
    • Generic with nested structural fields

Scala 2 vs Scala 3 Test Separation

src/test/scala/
  structural/
    common/
      SimpleProductSpec.scala
      NestedProductSpec.scala
      CollectionsSpec.scala
      TuplesSpec.scala
      EmptyProductSpec.scala
      SingleFieldSpec.scala
      LargeProductSpec.scala
      TypeNameNormalizationSpec.scala
      IntoIntegrationSpec.scala
      AsIntegrationSpec.scala
      
    scala3/
      UnionTypesSpec.scala
      SealedTraitToUnionSpec.scala
      EnumToUnionSpec.scala
      SelectableImplementationSpec.scala
      
    scala2/
      DynamicImplementationSpec.scala
      SumTypeErrorSpec.scala (verifies compile error)
      
    errors/
      RecursiveTypeErrorSpec.scala
      MutualRecursionErrorSpec.scala
      UnsupportedTypeErrorSpec.scala

Test Examples

// Test: Simple product to structural
test("case class converts to structural schema") {
  case class Person(name: String, age: Int)
  
  val structural = Schema.derived[Person].structural
  
  // Type check (this is a compile-time test)
  val _: Schema[{ def name: String; def age: Int }] = structural
  
  assert(structural.typeName.toString.contains("name"))
  assert(structural.typeName.toString.contains("age"))
}

// Test: Nested products
test("nested case classes convert to nested structural") {
  case class Address(street: String, zip: Int)
  case class Person(name: String, address: Address)
  
  val structural = Schema.derived[Person].structural
  
  val _: Schema[{ 
    def name: String
    def address: { def street: String; def zip: Int }
  }] = structural
}

// Test: Tuple to structural
test("tuple converts to structural with _N fields") {
  val structural = Schema.derived[(String, Int, Boolean)].structural
  
  val _: Schema[{ def _1: String; def _2: Int; def _3: Boolean }] = structural
}

// Test: Union type (Scala 3 only)
test("sealed trait converts to union type structural") {
  sealed trait Result
  case class Success(value: Int) extends Result
  case class Failure(error: String) extends Result
  
  val structural = Schema.derived[Result].structural
  
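  // Tags included, matching the sum type representation described above
  val _: Schema[
    { type Tag = "Success"; def value: Int } | { type Tag = "Failure"; def error: String }
  ] = structural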
}

// Test: Type name normalization
test("structural type names are normalized and deterministic") {
  case class Person(name: String, age: Int)
  case class User(age: Int, name: String) // Different field order
  
  val personStructural = Schema.derived[Person].structural
  val userStructural = Schema.derived[User].structural
  
  // Same structure, same normalized name
  assert(personStructural.typeName == userStructural.typeName)
  
  // Alphabetical ordering
  assert(personStructural.typeName.toString.contains("age"))
  assert(personStructural.typeName.toString.indexOf("age") < 
         personStructural.typeName.toString.indexOf("name"))
}

// Test: Integration with Into
test("structural to nominal conversion via Into") {
  case class Person(name: String, age: Int)
  type PersonStructure = { def name: String; def age: Int }
  
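  // The cast is needed because the anonymous Selectable does not declare name/age itself
  val structural: PersonStructure = (new Selectable {
    def selectDynamic(field: String): Any = field match {
      case "name" => "Alice"
      case "age" => 30
    }
  }).asInstanceOf[PersonStructure]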
  
  val person = Into[PersonStructure, Person].into(structural)
  assert(person == Right(Person("Alice", 30)))
}

// Test: Round-trip via As
test("nominal to structural and back preserves data") {
  case class Person(name: String, age: Int)
  type PersonStructure = { def name: String; def age: Int }
  
  val original = Person("Alice", 30)
  
  val toStructural = As[Person, PersonStructure].into(original)
  val backToNominal = toStructural.flatMap(As[Person, PersonStructure].from)
  
  assert(backToNominal == Right(original))
}

// Test: Recursive type compile error
test("recursive types produce compile error") {
  case class Tree(value: Int, children: List[Tree])
  
  assertDoesNotCompile("Schema.derived[Tree].structural")
}

// Test: Sum type in Scala 2 compile error
test("sum types in Scala 2 produce compile error") {
  sealed trait Result
  case class Success(value: Int) extends Result
  
  // Scala 2 only
  assertDoesNotCompile("Schema.derived[Result].structural")
}

Implementation Notes

Macro Behavior

The macro must:

  1. Detect product types (case classes, tuples) and generate structural types with def members
  2. Detect sum types (sealed traits, enums) and:
    • In Scala 3: Generate union types of structural representations
    • In Scala 2: Fail with clear error message
  3. Detect recursive types and fail with clear error message
  4. Normalize structural type representations for type name generation
  5. Generate ToStructural instance with:
    • StructuralType type member set to the generated structural type
    • apply method that transforms the schema appropriately
  6. Preserve field metadata from original schema where applicable
  7. Generate appropriate bindings using Selectable (Scala 3) or Dynamic (Scala 2)

Schema Transformation

When converting Schema[A] to Schema[StructuralType]:

  1. Preserve field information: Field names, types, optional/required status
  2. Update type name: Use normalized structural representation
  3. Transform bindings: Replace nominal constructors/deconstructors with structural equivalents
  4. Preserve validation: Maintain any validation logic that applies to field values
  5. Handle nested schemas: Recursively transform nested product types

Error Messages

Provide clear compile-time errors:

// Recursive type
case class Tree(value: Int, children: List[Tree])
Schema.derived[Tree].structural

// Error:
"""
Cannot generate structural type for recursive type Tree.
Structural types cannot represent recursive structures.
Scala's type system does not support infinite types.
"""

// Sum type in Scala 2
sealed trait Result
case class Success(value: Int) extends Result
Schema.derived[Result].structural

// Error (Scala 2 only):
"""
Cannot generate structural type for sum type Result.
Structural representation of sum types requires union types,
which are only available in Scala 3.
Consider upgrading to Scala 3 or using a different approach.
"""

Deliverables

  1. ToStructural[A] trait and macro for Scala 2.13
  2. ToStructural[A] trait and macro for Scala 3.5
  3. structural method on Schema[A]
  4. Support for product types (case classes, tuples)
  5. Support for sum types (sealed traits, enums) in Scala 3 only
  6. Normalized type name generation
  7. Selectable bindings (Scala 3) and Dynamic bindings (Scala 2)
  8. Integration with Into/As for structural ↔ nominal conversions
  9. Comprehensive test suite (300+ test cases)
  10. Clear error messages for unsupported cases
  11. Documentation with examples
