Description
In #3259 the sexType functionality within the PersonModule was reworked. The main change is that sexType can now return a new value, "generic", if the new optional parameter options.includeGeneric is set to true. Currently, this parameter defaults to false so that existing implementations do not break.
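The opt-in described above can be illustrated with a small, self-contained sketch. It does not reproduce Faker's implementation: `sexTypeSketch`, the `rng` parameter, and the uniform draw are assumptions made for illustration; only the `includeGeneric` option and its `false` default come from the text.

```typescript
type SexTypeSketch = 'female' | 'male' | 'generic';

// Hypothetical stand-in for sexType; the uniform draw is an assumption.
function sexTypeSketch(
  options: { includeGeneric?: boolean } = {},
  rng: () => number = Math.random,
): SexTypeSketch {
  // Defaults to false so that existing callers never see 'generic'.
  const { includeGeneric = false } = options;
  const pool: SexTypeSketch[] = includeGeneric
    ? ['female', 'male', 'generic']
    : ['female', 'male'];
  return pool[Math.floor(rng() * pool.length)];
}
```

Passing `{ includeGeneric: true }` is the only way for `'generic'` to appear in this sketch, which mirrors the backward-compatible default.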
For the next sections it is important to note that most of the person definitions (first names, last names, etc.) are defined as an object that can have three states:
- the base case for all locale entries: it is `undefined` or is `null` (`undefined` = still missing in this locale or provided by the parent locale; `null` = not applicable for this locale)
- it is an object that has a data list for `generic` entries and no data for `male` AND `female`; this applies to single entries or entire locales that do not have the concept of gendered, person-related data
- it is an object that has data lists for `male` AND `female`; it might also have a data list for `generic`
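The three states can be written down as a type, which makes the cases easier to tell apart. This is only a sketch: `PersonEntrySketch` and `describeEntry` are hypothetical names and are not taken from Faker's locale typings.

```typescript
// Hypothetical type; not Faker's actual definition typings.
type PersonEntrySketch<T> =
  | undefined // still missing in this locale, or provided by the parent locale
  | null // not applicable for this locale
  | { generic: T[]; male?: undefined; female?: undefined } // no gendered data
  | { male: T[]; female: T[]; generic?: T[] }; // gendered data, generic optional

// Names the state an entry is in, following the list above.
function describeEntry<T>(entry: PersonEntrySketch<T>): string {
  if (entry === undefined) return 'missing or inherited';
  if (entry === null) return 'not applicable';
  if (entry.male !== undefined && entry.female !== undefined) {
    return entry.generic !== undefined ? 'gendered with generic' : 'gendered';
  }
  return 'generic only';
}
```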
Previous Behavior (before #3259)
The definition of what counts as a generic person definition entry was fuzzy. Most locales implemented this data set as a merged list of the male and female lists, which provides no value for nearly all use cases.
When selecting definitions based on the provided sex type, users had the following options:
- leave the option as `undefined`
  - get an entry from the `generic` list
  - if no `generic` list exists, take a random entry from a merged set of male AND female definitions
- provide the option as `'male'`
  - get an entry from the `male` list
  - if no `male` list exists, take a random entry from the `generic` list
- provide the option as `'female'`
  - get an entry from the `female` list
  - if no `female` list exists, take a random entry from the `generic` list
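The previous lookup rules can be summarized as a function that returns the list an entry would be drawn from. This is a sketch of the rules as listed, not the original source code; `LegacyLists` and `legacyPool` are hypothetical names.

```typescript
type LegacySex = 'male' | 'female';

// Hypothetical shape of a person definition entry.
interface LegacyLists {
  generic?: string[];
  male?: string[];
  female?: string[];
}

// Returns the candidate list the pre-#3259 rules would pick a random entry from.
function legacyPool(entry: LegacyLists, sex?: LegacySex): string[] {
  if (sex === undefined) {
    // Prefer the generic list, else merge the male and female lists.
    return entry.generic ?? [...(entry.male ?? []), ...(entry.female ?? [])];
  }
  // Prefer the requested list, else fall back to the generic list.
  return entry[sex] ?? entry.generic ?? [];
}
```

Note that there is no argument a caller could pass to request the `generic` list when gendered lists exist and a sex is given, which is the gap the rework addresses.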
This is not optimal, since developers cannot intentionally request gender-neutral person data.
Current Behavior (v10 after #3259)
The definition of what counts as a generic person definition entry has been refined. Docs for all definition entry keys were added:
- female: Values that are primarily attributable to only females.
- male: Values that are primarily attributable to only males.
- generic: Values that cannot clearly be attributed to a specific sex or are used for both sexes.
Note
As of writing this issue, the data sets for "generic" still need to be adjusted to reflect these descriptions!
When selecting definitions based on the provided sex type, users have the following options:
- leave the option as `undefined`
  - the option will default to either 'male' or 'female'
  - the same as if the user had provided 'male' or 'female'
- provide the option as `'male'`
  - get a random entry from either the `male` or `generic` list
    - the distribution is weighted in favor of male, with the lengths of the lists taken into consideration
  - if no `generic` list exists, get a random entry from the `male` list
  - if no `male` list exists (`generic` must exist in this case), get a random entry from the `generic` list
- provide the option as `'female'`
  - get a random entry from either the `female` or `generic` list
    - the distribution is weighted in favor of female, with the lengths of the lists taken into consideration
  - if no `generic` list exists, get a random entry from the `female` list
  - if no `female` list exists (`generic` must exist in this case), get a random entry from the `generic` list
- provide the option as `'generic'`
  - get a random entry from the `generic` list
  - if no `generic` list exists, get a random entry from a merged set of the `female` and `male` lists
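The v10 rules above can be sketched as a single selection function. The issue does not state the exact weighting, so the `bias` factor below is a made-up assumption, as are the names `V10Lists` and `pickV10` and the injectable `rng`; the fallback order follows the list.

```typescript
type V10Sex = 'male' | 'female' | 'generic';

// Hypothetical shape of a person definition entry.
interface V10Lists {
  generic?: string[];
  male?: string[];
  female?: string[];
}

function pickV10(
  entry: V10Lists,
  sex: V10Sex,
  rng: () => number = Math.random,
  bias = 3, // assumed factor: how strongly the requested sex is favored
): string | undefined {
  const pickFrom = (list: string[]): string | undefined =>
    list[Math.floor(rng() * list.length)];

  if (sex === 'generic') {
    // Use the generic list, else a merged set of the female and male lists.
    return pickFrom(entry.generic ?? [...(entry.female ?? []), ...(entry.male ?? [])]);
  }

  const gendered = entry[sex];
  const generic = entry.generic;
  if (generic === undefined) return pickFrom(gendered ?? []);
  if (gendered === undefined) return pickFrom(generic);

  // Choose a list with probability proportional to its biased length.
  const genderedWeight = gendered.length * bias;
  const useGendered = rng() * (genderedWeight + generic.length) < genderedWeight;
  return pickFrom(useGendered ? gendered : generic);
}
```

Injecting `rng` keeps the sketch deterministic under test; any real weighting would need to be taken from the implementation in #3259.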
This enables developers to intentionally request gender-neutral person data by explicitly providing 'generic' as the sex type argument. These values could be especially useful for producing more inclusive test data and avoiding binary assumptions in generated data.
Target Behavior (v11)
The definition of generic person definition entries stays the same, and all locales align with the documented definitions of the entries.
When selecting definitions based on the provided sex type, users have the following options:
- leave the option as `undefined`
  - the option will default to either 'male', 'female' or 'generic'
  - the same as if the user had provided 'male', 'female' or 'generic'
- provide the option as `'male'`, `'female'` or `'generic'`
  - same as in v10
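The only difference between v10 and the v11 target is the set of values an omitted option can resolve to. A minimal sketch of that difference follows; `resolveSex` and `defaultCandidates` are hypothetical names, and the uniform draw among the candidates is an assumption, since the issue does not specify a distribution.

```typescript
type ResolvedSex = 'male' | 'female' | 'generic';

// Values an omitted option may resolve to in each major version.
function defaultCandidates(version: 10 | 11): ResolvedSex[] {
  return version === 10 ? ['male', 'female'] : ['male', 'female', 'generic'];
}

// An explicit value is returned unchanged; an omitted one is drawn
// uniformly (an assumption) from the version's default candidates.
function resolveSex(
  version: 10 | 11,
  sex?: ResolvedSex,
  rng: () => number = Math.random,
): ResolvedSex {
  const candidates = defaultCandidates(version);
  return sex ?? candidates[Math.floor(rng() * candidates.length)];
}
```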
This aligns Faker with the goal of generating real-world data. By removing binary assumptions from generated values, we can more realistically reflect the complexity of people in the modern world.
Originally found in #3259 (comment)