Diego Pires Plentz commented on HBX-276:
----------------------------------------
Christian, it looks like a HBX-215 dup ;-)
Generating data for small and large scale testing
-------------------------------------------------
Key: HBX-276
URL: http://opensource.atlassian.com/projects/hibernate/browse/HBX-276
Project: Hibernate Tools
Issue Type: Improvement
Components: datagen
Reporter: Christian Bauer
This is a placeholder item for the data generator kick off. It contains all information
available at start.
Intro
The overall goal is to provide a library that programmatically (!) supports:
- mass generation of sample objects filled with sample data
- objects can be
o JavaBeans style POJOs
o Table rows
- Generated objects are linked together as needed (associations, foreign keys)
- Generation of sample data can be configured, controlled or customized (custom
implementations)
Modules
The library is split into two separate modules:
- Data generation (values)
- Object/object graph creation (graphs)
Data generation is completely independent of the object/object graph creation.
Data Generation
- Definition of "random" to be used during generation
o Guaranteed random
o Pseudo-random with random seed
o Pseudo-random with provided seed (programmatically reproducible)
o Advanced: statistical randomness: gauss, peaks ...
- General settings applied to all value generators
o Duplicates allowed (max number of duplicates)
o NULL values allowed (min/max absolute number, percentage)
- Generation of string based values
o Random filled strings
- Min length
- Max length
- Upper/lower case configuration (1st upper/lower, rest via the defined character
set...)
- Character set to be used (lots of predefined...)
- Empty Strings allowed (in addition to min length property)
o Human readable sentences (e.g. "greeked" text, like lorem ipsum...)
- Min length/Min num of words
- Max length/Max num of words
- Word repository to use (built-in or custom)
o Names
- Min length/Min num of words
- Max length/Max num of words
- Name type (first name, last name, computer usernames)
o File names
- Max length
- Name
- Extension (fixed, set of predefined, custom list)
o String concatenations (combination of string generators and fixed text)
- Generation of numeric values
o Precision of the results
o High value (including/excluding this value)
o Low value (including/excluding this value)
o Multiple hi/lo ranges
o Zeros allowed
o Sequences (g(n+1) OP g(n) + x, where OP is an element of {<, >} and x is any value)
- Generation of date and time values
o Min date value (absolute, for day, month and/or year)
o Max date value (absolute, for day, month and/or year)
o Min time value (absolute, for hour, minute and/or second)
o Max time value (absolute, for hour, minute and/or second)
o Allowed day of week
o Excluded days (predefined set of typical exclusions, such as fixed holidays like
Christmas or movable ones like Easter)
- Binary
o Min number of bytes
o Max number of bytes
o Binary type (set of predefined: gif, jpeg, etc.)
o Custom set of physical binary files to be used
- Special Types
o Boolean
o Currency values
o UUIDs (probably realizable with string generator)
o Lists/sets/bags of generated values
- Combination of generators
- Collection type to be used
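To make the settings above more concrete, here is a minimal Java sketch of a random-filled
string generator with a provided seed, min/max length, a character set and a NULL ratio.
The names ValueGenerator and RandomStringGenerator and all constructor parameters are
hypothetical; this issue does not define an API, so treat it as one possible shape only.

```java
import java.util.Random;

/** Hypothetical generator contract; not part of any existing Hibernate Tools API. */
interface ValueGenerator<T> {
    T next();
}

/** Random-filled strings, reproducible for a given seed (assumed design). */
final class RandomStringGenerator implements ValueGenerator<String> {
    private final Random random;
    private final int minLength;
    private final int maxLength;
    private final String charset;
    private final double nullRatio;

    RandomStringGenerator(long seed, int minLength, int maxLength,
                          String charset, double nullRatio) {
        if (minLength < 0 || maxLength < minLength) {
            throw new IllegalArgumentException("invalid length range");
        }
        if (charset.isEmpty()) {
            throw new IllegalArgumentException("character set must not be empty");
        }
        if (nullRatio < 0.0 || nullRatio > 1.0) {
            throw new IllegalArgumentException("nullRatio must be within [0, 1]");
        }
        // Pseudo-random with provided seed: the same seed yields the same sequence.
        this.random = new Random(seed);
        this.minLength = minLength;
        this.maxLength = maxLength;
        this.charset = charset;
        this.nullRatio = nullRatio;
    }

    @Override
    public String next() {
        // nextDouble() is in [0, 1), so a ratio of 0.0 never produces null.
        if (random.nextDouble() < nullRatio) {
            return null;
        }
        // Length is drawn uniformly from minLength..maxLength (both inclusive).
        int length = minLength + random.nextInt(maxLength - minLength + 1);
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(charset.charAt(random.nextInt(charset.length())));
        }
        return sb.toString();
    }
}
```

Two generators created with the same seed and settings return the same values in the same
order, which is what makes a test data set programmatically reproducible.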
Output formatter
For each generated type there can be various output formatters that convert the type into
the needed format.
- Various numeric formats (BigDecimal, int, long, float, double, etc.)
- Various date formats
- Streams
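As a rough illustration of the formatter idea, the following Java sketch converts one
generated value into several target formats. It assumes, purely for the example, that
numeric generators produce BigDecimal and date generators produce LocalDate; the class
OutputFormatters and its method names are hypothetical.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.function.Function;

/** Hypothetical collection of output formatters, modelled as plain functions. */
final class OutputFormatters {
    private OutputFormatters() {
    }

    /** Numeric format: long (the fractional part is discarded). */
    static Function<BigDecimal, Long> toLong() {
        return BigDecimal::longValue;
    }

    /** Numeric format: double. */
    static Function<BigDecimal, Double> toDouble() {
        return BigDecimal::doubleValue;
    }

    /** Date format: text rendered with a java.time pattern such as "dd.MM.yyyy". */
    static Function<LocalDate, String> datePattern(String pattern) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern);
        return formatter::format;
    }
}
```

For example, OutputFormatters.datePattern("dd.MM.yyyy") applied to LocalDate.of(2006, 3, 15)
yields "15.03.2006", while the generator itself stays unaware of the target format.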
Output postModifier
For each generator there can be a custom handler that executes custom operations on the
generated output before the output formatter is applied.
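The ordering described here (generate, then post-modify, then format) could be sketched in
Java as follows. GenerationPipeline is a hypothetical name, and modelling the three stages
as Supplier, UnaryOperator and Function is an assumption made only for this example.

```java
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

/** Hypothetical pipeline: generator -> postModifier -> output formatter. */
final class GenerationPipeline<V, O> {
    private final Supplier<V> generator;
    private final UnaryOperator<V> postModifier;
    private final Function<V, O> formatter;

    GenerationPipeline(Supplier<V> generator,
                       UnaryOperator<V> postModifier,
                       Function<V, O> formatter) {
        this.generator = generator;
        this.postModifier = postModifier;
        this.formatter = formatter;
    }

    /** Produces one value; the post modifier always runs before the formatter. */
    O next() {
        V raw = generator.get();
        V modified = postModifier.apply(raw);
        return formatter.apply(modified);
    }
}
```

Because the post modifier sees the raw generated type, it can adjust a value (for example
clamp or normalize it) without having to parse any formatted output.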
Dependent Generation
Some generators might need the value of other properties of an object to generate a
meaningful output. So there is a way to provide a corresponding context for these
generators. Context could be:
- previously generated value(s) of this generator (value repeater, no duplicates in the
last n runs, city/zip/country code generation)
- other generated values (or, more generally, other properties) of the object being
filled (note that generators do not know the meaning of an object, since their context is
"value generation")
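One of the contexts listed above, "no duplicates in the last n runs", could be sketched in
Java as a wrapper that remembers recently returned values. NoRecentDuplicates, window and
maxAttempts are hypothetical names; the retry limit is an assumption added so that a
delegate with too few distinct values fails instead of looping forever.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Objects;
import java.util.function.Supplier;

/** Hypothetical wrapper: rejects values returned within the last `window` runs. */
final class NoRecentDuplicates<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    private final int window;
    private final int maxAttempts;
    private final Deque<T> recent = new ArrayDeque<>();

    NoRecentDuplicates(Supplier<T> delegate, int window, int maxAttempts) {
        if (window < 0 || maxAttempts < 1) {
            throw new IllegalArgumentException("window >= 0 and maxAttempts >= 1 required");
        }
        this.delegate = delegate;
        this.window = window;
        this.maxAttempts = maxAttempts;
    }

    @Override
    public T get() {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // This sketch does not handle NULL values (ArrayDeque rejects them).
            T candidate = Objects.requireNonNull(delegate.get(), "null not supported");
            if (!recent.contains(candidate)) {
                recent.addLast(candidate);
                if (recent.size() > window) {
                    recent.removeFirst(); // forget the oldest remembered value
                }
                return candidate;
            }
        }
        throw new IllegalStateException("no non-duplicate value after " + maxAttempts + " attempts");
    }
}
```

The wrapped generator stays unaware of the history; only the wrapper holds the context,
which fits the note that generators themselves know nothing beyond value generation.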
--
This message is automatically generated by JIRA.