person.sexType: change options.includeGeneric default value to true  #3694

@xDivisionByZerox

Description

In #3259 the sexType functionality within the PersonModule was reworked. The main change is that sexType can now return a new value, "generic", if the new optional parameter options.includeGeneric is set to true. Currently, this parameter defaults to false so that existing implementations do not break.


For the next sections, it is important to note that most of the person definitions (first names, last names, etc.) are defined as an object that can be in one of three states:

  • the base case for all locale entries - it is undefined or null (undefined = still missing in this locale or provided by the parent locale; null = not applicable for this locale)
  • it is an object that has a data list for generic entries and no data for male AND female - for single entries or entire locales that do not have the concept of gendered, person-related data
  • it is an object that has data lists for male AND female; it might also have a data list for generic
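
To make the three states concrete, here is a minimal TypeScript sketch. The names `PersonEntry`, `PersonEntryDefinition`, `EntryState` and `classifyEntry` are hypothetical and only illustrate the description above; Faker's actual definition types may look different.

```typescript
// Hypothetical shape of a single person definition entry (not Faker's real type).
interface PersonEntry<T> {
  generic?: T[];
  male?: T[];
  female?: T[];
}

// undefined: still missing in this locale or provided by the parent locale
// null: not applicable for this locale
type PersonEntryDefinition<T> = PersonEntry<T> | null | undefined;

type EntryState = 'missing' | 'not-applicable' | 'generic-only' | 'gendered';

function classifyEntry<T>(entry: PersonEntryDefinition<T>): EntryState {
  if (entry === undefined) return 'missing';
  if (entry === null) return 'not-applicable';
  // Gendered entries have lists for male AND female; generic is optional.
  if (entry.male != null && entry.female != null) return 'gendered';
  // Otherwise only a generic list is expected.
  if (entry.generic != null) return 'generic-only';
  throw new Error('Entry matches none of the described states');
}
```

The first two results together form the "base case" from the list; they are split here only because undefined and null carry different meanings.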

Previous Behavior (before #3259)

The definition of what generic person definition entries are was fuzzy. Most locales implement this data set as a merged list of the male and female lists. This provides no value for nearly all use cases.

When selecting definitions based on the provided sex type, users had the following options:

  • leave the option as undefined
    • get an entry from the generic list
    • if no generic list exists, take a random entry from a merged set of male AND female definitions
  • provide the option as 'male'
    • get an entry from the male list
    • if no male list exists, take a random entry from the generic list
  • provide the option as 'female'
    • get an entry from the female list
    • if no female list exists, take a random entry from the generic list
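
The lookup rules above can be sketched as a function that returns the list a value would be drawn from. This is only an illustration of the described fallbacks, not the original implementation; `LegacyEntry` and `legacyPool` are hypothetical names.

```typescript
type LegacySex = 'male' | 'female';

// Hypothetical, simplified entry shape.
interface LegacyEntry {
  generic?: string[];
  male?: string[];
  female?: string[];
}

// Returns the list a random value would be picked from before #3259.
function legacyPool(entry: LegacyEntry, sex?: LegacySex): string[] {
  if (sex === undefined) {
    // Prefer the generic list, otherwise merge male AND female.
    return entry.generic ?? [...(entry.male ?? []), ...(entry.female ?? [])];
  }
  // Prefer the requested list, otherwise fall back to generic.
  return entry[sex] ?? entry.generic ?? [];
}
```

Note that there is no argument that reliably selects only gender-neutral values: leaving the option undefined may silently fall back to the merged male and female lists.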

This is not optimal, since developers cannot intentionally request gender‑neutral person data.

Current Behavior (v10 after #3259)

The definition of what generic person definition entries are has been refined. Docs for all definition entry keys were added:

  • female: Values that are primarily attributable to only females.
  • male: Values that are primarily attributable to only males.
  • generic: Values that cannot clearly be attributed to a specific sex or are used for both sexes.

Note

As of writing this issue, the data sets for "generic" still need to be adjusted to reflect these descriptions!

When selecting definitions based on the provided sex type, users have the following options:

  • leave the option as undefined
    • the option will default to either 'male' or 'female'
    • the behavior is then the same as if the user had provided 'male' or 'female'
  • provide the option as 'male'
    • get a random entry from either the male or the generic list - the distribution is weighted in favor of male, with the lengths of the lists taken into consideration
    • if no generic list exists, get a random entry from the male list
    • if no male list exists (generic must exist in this case), get a random entry from the generic list
  • provide the option as 'female'
    • get a random entry from either the female or the generic list - the distribution is weighted in favor of female, with the lengths of the lists taken into consideration
    • if no generic list exists, get a random entry from the female list
    • if no female list exists (generic must exist in this case), get a random entry from the generic list
  • provide the option as 'generic'
    • get a random entry from the generic list
    • if no generic list exists, get a random entry from a merged set of the female and male list
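
As a rough sketch of the v10 rules above, the function below returns the lists a value may be drawn from for an explicitly provided sex type. The weighting between the gendered and the generic list is deliberately left out, since its exact formula is not stated here. `V10Entry` and `candidateLists` are hypothetical names, not part of Faker's API.

```typescript
type SexType = 'male' | 'female' | 'generic';

// Hypothetical, simplified entry shape.
interface V10Entry {
  generic?: string[];
  male?: string[];
  female?: string[];
}

// Returns the candidate lists for a sex type; weighting between them is not modeled.
function candidateLists(entry: V10Entry, sex: SexType): string[][] {
  const { generic, male, female } = entry;
  if (sex === 'generic') {
    // Use the generic list, otherwise a merged set of female and male.
    return generic ? [generic] : [[...(female ?? []), ...(male ?? [])]];
  }
  const own = entry[sex];
  if (own && generic) return [own, generic]; // weighted choice between both
  if (own) return [own]; // no generic list exists
  return generic ? [generic] : []; // no gendered list exists
}
```

For example, `candidateLists({ male: ['Bob'], generic: ['Alex'] }, 'male')` yields both lists, while the same entry requested with 'generic' yields only the generic list.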

This enables developers to intentionally request gender‑neutral person data by explicitly providing 'generic' as the sex type argument. These values could be especially useful for producing more inclusive test data and avoiding binary assumptions in generated data.

Target Behavior (v11)

The definition of what generic person definition entries are stays the same, and all locales align with the documented definition of the entries.

When selecting definitions based on the provided sex type, users have the following options:

  • leave the option as undefined
    • the option will default to either 'male', 'female' or 'generic'
    • the behavior is then the same as if the user had provided 'male', 'female' or 'generic'
  • provide the option as 'male', 'female' or 'generic'
    • same as in v10
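
The difference between v10 and v11 for an undefined option comes down to the default of includeGeneric. The sketch below illustrates this with a hypothetical `resolveSex` helper. The uniform choice between the values and the injectable `rand` parameter are assumptions made for illustration; Faker's own randomizer and distribution may differ.

```typescript
type ResolvedSex = 'male' | 'female' | 'generic';

// Hypothetical helper: resolves an undefined sex option to a concrete value.
function resolveSex(
  sex: ResolvedSex | undefined,
  includeGeneric: boolean, // false by default in v10, true by default in v11
  rand: () => number = Math.random // assumed uniform in [0, 1)
): ResolvedSex {
  if (sex !== undefined) return sex; // explicit values are used as-is
  const choices: ResolvedSex[] = includeGeneric
    ? ['female', 'male', 'generic']
    : ['female', 'male'];
  return choices[Math.floor(rand() * choices.length)];
}
```

With includeGeneric set to false the function can never return 'generic' for an undefined option, which mirrors the v10 default. Flipping the default to true is what makes the change breaking.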

This aligns Faker with the goal of generating real-world data. By removing binary assumptions from generated values, we can more realistically reflect the complexity of people in the modern world.


Originally found in #3259 (comment)

Metadata

Assignees

No one assigned

    Labels

    breaking change: Cannot be merged when next version is not a major release
    c: refactor: PR that affects the runtime behavior, but doesn't add new features or fixes bugs
