[hibernate-issues] [Hibernate-JIRA] Commented: (HBX-276) Generating data for small and large scale testing

Diego Pires Plentz (JIRA) noreply at atlassian.com
Sun Nov 18 18:58:58 EST 2007


    [ http://opensource.atlassian.com/projects/hibernate/browse/HBX-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_28882 ] 

Diego Pires Plentz commented on HBX-276:
----------------------------------------

Christian, this looks like a duplicate of HBX-215 ;-)

> Generating data for small and large scale testing
> -------------------------------------------------
>
>                 Key: HBX-276
>                 URL: http://opensource.atlassian.com/projects/hibernate/browse/HBX-276
>             Project: Hibernate Tools
>          Issue Type: Improvement
>          Components: datagen
>            Reporter: Christian Bauer
>
> This is a placeholder item for the data generator kickoff. It contains all the information available at the start.
> Intro
> The overall goal is to provide a library that makes the following possible programmatically (!):
> -	Mass generation of sample objects filled with sample data
> -	Objects can be
> o	JavaBeans style POJOs
> o	Table rows
> -	Generated objects are linked together as needed (associations, foreign keys)
> -	Generation of sample data can be configured, controlled or customized (custom implementations)
> Modules
> The library is split into two separate modules:
> -	Data generation (values)
> -	Object/object graph creation (graphs)
> Data generation is completely independent of the object/object graph creation.
> Data Generation
> -	Definition of "random" to be used during generation
> o	Guaranteed random
> o	Pseudo-random with random seed
> o	Pseudo-random with provided seed (programmatically reproducible)
> o	Advanced: statistical randomness: gauss, peaks ...
> -	General settings applied to all value generators
> o	Duplicates allowed (max number of duplicates)
> o	NULL values allowed (min/max absolute number, percentage)
> -	Generation of string based values
> o	Random filled strings
> -	Min length
> -	Max length
> -	Upper/lower case configuration (1st upper/lower, rest via the defined character set...)
> -	Character set to be used (lots of predefined...)
> -	Empty Strings allowed (in addition to min length property)
> o	Human readable sentences (e.g. "greeked" text, like lorem ipsum...)
> -	Min length/Min num of words
> -	Max length/Max num of words
> -	Word repository to use (built-in or custom)
> o	Names
> -	Min length/Min num of words
> -	Max length/Max num of words
> -	Name type (first name, last name, computer usernames)
> o	File names
> -	Max length
> -	Name
> -	Extension (fixed, set of predefined, custom list)
> o	String concatenations (combination of string generators and fixed text)
> -	Generation of numeric values
> o	Precision of the results
> o	High value (including/excluding this value)
> o	Low value (including/excluding this value)
> o	Multiple hi/lo ranges
> o	Zeros allowed
> o	Sequences (g(n+1) OP g(n) + x, where OP is one of {<, >} and x is any value)
> -	Generation of date and time values
> o	Min date value (absolute, for day, month and/or year)
> o	Max date value (absolute, for day, month and/or year)
> o	Min time value (absolute, for hour, minute and/or second)
> o	Max time value (absolute, for hour, minute and/or second)
> o	Allowed day of week
> o	Excluded days (predefined set of typical exclusions, both fixed, like Christmas, and movable, like Easter)
> -	Binary
> o	Min number of bytes
> o	Max number of bytes
> o	Binary type (set of predefined: gif, jpeg, etc.)
> o	Custom set of physical binary files to be used
> -	Special Types
> o	Boolean
> o	Currency values
> o	UUIDs (probably realizable with string generator)
> o	Lists/sets/bags of generated values
> -	Combination of generators
> -	Collection type to be used
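
The "pseudo-random with provided seed" option combined with the random-filled string settings (min length, max length, character set) could be sketched as follows. This is only an illustration under stated assumptions: the class name RandomStringGenerator and its constructor parameters are hypothetical and do not come from the issue or from Hibernate Tools.

```java
import java.util.Random;

/**
 * Hypothetical sketch of a seeded string value generator.
 * All names here are assumptions made for illustration.
 */
final class RandomStringGenerator {
    private final Random random;
    private final int minLength;
    private final int maxLength;
    private final char[] charset;

    RandomStringGenerator(long seed, int minLength, int maxLength, String charset) {
        if (minLength < 0 || maxLength < minLength || charset.isEmpty()) {
            throw new IllegalArgumentException("invalid length bounds or empty character set");
        }
        // java.util.Random yields the same sequence for the same seed,
        // which is what makes a run programmatically reproducible.
        this.random = new Random(seed);
        this.minLength = minLength;
        this.maxLength = maxLength;
        this.charset = charset.toCharArray();
    }

    /** Returns a string whose length lies in [minLength, maxLength], drawn from the character set. */
    String next() {
        int length = minLength + random.nextInt(maxLength - minLength + 1);
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(charset[random.nextInt(charset.length)]);
        }
        return sb.toString();
    }
}
```

Two generators built with the same seed and settings return the same strings in the same order; seeding from the clock instead would give the "pseudo-random with random seed" variant.
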
> Output formatter
> For each generated type there can be various output formatters that convert the type into the needed format.
> -	Various numeric formats (BigDecimal, int, long, float, double, etc.)
> -	Various date formats
> -	Streams
> Output postModifier
> For each generator there can be a custom handler that performs custom operations on the generated output before it is passed to the output formatter.
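
The order described above (generator, then postModifier, then output formatter) might look like the sketch below. ValuePipeline and its field names are hypothetical; the issue does not define an API, so this is one possible shape built on the standard java.util.function interfaces.

```java
import java.util.Objects;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

/**
 * Hypothetical sketch: chains a value generator, a post-modifier and an
 * output formatter. Names are assumptions made for illustration.
 */
final class ValuePipeline<V, O> {
    private final Supplier<V> generator;       // produces the raw value
    private final UnaryOperator<V> postModifier; // custom operation on the raw value
    private final Function<V, O> formatter;    // converts the value into the needed format

    ValuePipeline(Supplier<V> generator, UnaryOperator<V> postModifier, Function<V, O> formatter) {
        this.generator = Objects.requireNonNull(generator);
        this.postModifier = Objects.requireNonNull(postModifier);
        this.formatter = Objects.requireNonNull(formatter);
    }

    /** Generates one value, applies the post-modifier, then formats the result. */
    O next() {
        V raw = generator.get();
        V modified = postModifier.apply(raw);
        return formatter.apply(modified);
    }
}
```

For example, a pipeline could generate a double, make it non-negative in the postModifier, and format it as a BigDecimal with two decimal places. Passing UnaryOperator.identity() gives a pipeline without any post-modification.
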
> Dependent Generation
> Some generators might need the value of other properties of an object to generate a meaningful output. So there is a way to provide a corresponding context for these generators. Context could be:
> -	Previously generated value(s) of this generator (value repeater, no duplicates in the last n runs, city/zip/country code generation)
> -	Other generated values (or, more generally, other properties) of the object being filled (note that generators do not know the meaning of an object, since their context is "value generation")
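
One of the contexts listed above, "no duplicates in the last n runs", can be sketched as a wrapper that remembers the most recent values of another generator. NoRecentDuplicates, its window size and its retry limit are hypothetical names and choices made for this illustration; null values are not supported in this sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

/**
 * Hypothetical sketch: wraps a generator and rejects any value that was
 * already returned within the last `window` runs.
 */
final class NoRecentDuplicates<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    private final int window;
    private final int maxAttempts;
    private final Deque<T> recent = new ArrayDeque<>();

    NoRecentDuplicates(Supplier<T> delegate, int window, int maxAttempts) {
        if (window < 0 || maxAttempts < 1) {
            throw new IllegalArgumentException("window must be >= 0 and maxAttempts >= 1");
        }
        this.delegate = delegate;
        this.window = window;
        this.maxAttempts = maxAttempts;
    }

    @Override
    public T get() {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            T candidate = delegate.get();
            if (!recent.contains(candidate)) {
                // Remember the value and forget the oldest one once the window is full.
                recent.addLast(candidate);
                if (recent.size() > window) {
                    recent.removeFirst();
                }
                return candidate;
            }
        }
        // The delegate cannot supply enough distinct values for this window.
        throw new IllegalStateException("no fresh value after " + maxAttempts + " attempts");
    }
}
```

The retry limit is a design choice of this sketch: it turns an impossible configuration (for example a window larger than the number of distinct values) into an exception instead of an endless loop.
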
