[hibernate-issues] [Hibernate-JIRA] Commented: (HBX-276) Generating data for small and large scale testing

Wallace Wadge (JIRA) noreply at atlassian.com
Fri Jan 4 03:28:56 EST 2008


    [ http://opensource.atlassian.com/projects/hibernate/browse/HBX-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_29230 ] 

Wallace Wadge commented on HBX-276:
-----------------------------------

Hi, 

You might want to look at http://sourceforge.net/projects/hibernatepojoge/ for some code to get you started, specifically in skeleton/src/randomlib/data. There isn't much, but what's there is yours if you want to grab it. It includes a set of functions that generate random data of a given type and length, e.g. generateRandomNumericString(), generateRandomLong(), generateRandomFloat(), etc.

The linked project uses them to populate objects as proposed in this issue, but that code is probably too closely tied to its code generation to be of direct use.

The random-data library I'm pointing to, however, should be a drop-in fit, since it is not tied to anything else.
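
To give an idea of the kind of helper I mean, here is a rough sketch. It is not the actual code from skeleton/src/randomlib/data: the class name RandomDataSketch, the seed constructor and the method signatures are assumptions for illustration only; only the three method names come from the library.

```java
import java.util.Random;

// Hypothetical sketch; the real helpers in randomlib may look different.
final class RandomDataSketch {
    private final Random random;

    // A fixed seed makes the generated values reproducible between runs.
    RandomDataSketch(long seed) {
        this.random = new Random(seed);
    }

    /** Returns a string made of exactly {@code length} decimal digits. */
    String generateRandomNumericString(int length) {
        if (length < 0) {
            throw new IllegalArgumentException("length must not be negative");
        }
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('0' + random.nextInt(10)));
        }
        return sb.toString();
    }

    /** Returns a pseudo-random long drawn from java.util.Random. */
    long generateRandomLong() {
        return random.nextLong();
    }

    /** Returns a pseudo-random float in the range [0.0, 1.0). */
    float generateRandomFloat() {
        return random.nextFloat();
    }
}
```

Two instances created with the same seed return the same values in the same order, which is handy for repeatable tests.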


Regards






> Generating data for small and large scale testing
> -------------------------------------------------
>
>                 Key: HBX-276
>                 URL: http://opensource.atlassian.com/projects/hibernate/browse/HBX-276
>             Project: Hibernate Tools
>          Issue Type: Improvement
>          Components: datagen
>            Reporter: Christian Bauer
>
> This is a placeholder item for the data generator kick off. It contains all information available at start.
> Intro
> The overall goal is to provide a library that programmatically (!) supports:
> -	mass generation of sample objects filled with sample data
> -	objects can be
> o	JavaBeans style POJOs
> o	Table rows
> -	Generated objects are linked together as needed (associations, foreign keys)
> -	Generation of sample data can be configured, controlled or customized (custom implementations)
> Modules
> The library is split into two separate modules:
> -	Data generation (values)
> -	Object/object graph creation (graphs)
> Data generation is completely independent of the object/object graph creation.
> Data Generation
> -	Definition of "random" to be used during generation
> o	Guaranteed random
> o	Pseudo-random with random seed
> o	Pseudo-random with provided seed (programmatically reproducible)
> o	Advanced: statistical randomness: gauss, peaks ...
> -	General settings applied to all value generators
> o	Duplicates allowed (max number of duplicates)
> o	NULL values allowed (min/max absolute number, percentage)
> -	Generation of string based values
> o	Random filled strings
> -	Min length
> -	Max length
> -	Upper/lower case configuration (1st upper/lower, rest via the defined character set...)
> -	Character set to be used (lots of predefined...)
> -	Empty Strings allowed (in addition to min length property)
> o	Human readable sentences (e.g. "greeked" text, like lorem ipsum...)
> -	Min length/Min num of words
> -	Max length/Max num of words
> -	Word repository to use (built-in or custom)
> o	Names
> -	Min length/Min num of words
> -	Max length/Max num of words
> -	Name type (first name, last name, computer usernames)
> o	File names
> -	Max length
> -	Name
> -	Extension (fixed, set of predefined, custom list)
> o	String concatenations (combination of string generators and fixed text)
> -	Generation of numeric values
> o	Precision of the results
> o	High value (including/excluding this value)
> o	Low value (including/excluding this value)
> o	Multiple hi/lo ranges
> o	Zeros allowed
> o	Sequences (g(n+1) OP g(n)+x; OP element of{<,>} and x any value)
> -	Generation of date and time values
> o	Min date value (absolute, for day, month and/or year)
> o	Max date value (absolute, for day, month and/or year)
> o	Min time value (absolute, for hour, minute and/or second)
> o	Max time value (absolute, for hour, minute and/or second)
> o	Allowed day of week
> o	Excluded days (predefined set of typical exclusions, e.g. fixed dates like Christmas or movable ones like Easter)
> -	Binary
> o	Min number of bytes
> o	Max number of bytes
> o	Binary type (set of predefined: gif, jpeg, etc.)
> o	Custom set of physical binary files to be used
> -	Special Types
> o	Boolean
> o	Currency values
> o	UUIDs (probably realizable with string generator)
> o	Lists/sets/bags of generated values
> -	Combination of generators
> -	Collection type to be used
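
As a minimal sketch of two of the settings listed above, "pseudo-random with provided seed" and the min/max length and character set options for random strings, something like the following might do. The class name SeededStringGenerator and its constructor parameters are assumptions made up for this example, not part of any existing library.

```java
import java.util.Random;

// Hypothetical generator: seeded, with configurable length bounds and character set.
final class SeededStringGenerator {
    private final Random random;
    private final int minLength;
    private final int maxLength;
    private final String charset;

    SeededStringGenerator(long seed, int minLength, int maxLength, String charset) {
        if (minLength < 0 || maxLength < minLength || charset.isEmpty()) {
            throw new IllegalArgumentException("invalid length bounds or empty character set");
        }
        this.random = new Random(seed); // provided seed: programmatically reproducible
        this.minLength = minLength;
        this.maxLength = maxLength;
        this.charset = charset;
    }

    /** Returns a string whose length lies in [minLength, maxLength], both inclusive. */
    String next() {
        int length = minLength + random.nextInt(maxLength - minLength + 1);
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(charset.charAt(random.nextInt(charset.length())));
        }
        return sb.toString();
    }
}
```

Allowing empty strings then simply means passing a minLength of 0; the duplicate and NULL settings would need extra state and are left out here.
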
> Output formatter
> For each generated type there can be various output formatters that convert the type into the needed format.
> -	Various numeric formats (BigDecimal, int, long, float, double, etc.)
> -	Various date formats
> -	Streams
> Output postModifier
> For each generator there can be a custom handler that can execute some custom operations on the generated output, before processing the output formatter.
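
One possible reading of the generator, postModifier and output formatter chain described above is sketched below. All names here (GeneratorPipeline, PipelineExample, prices) are hypothetical, and the order shown, modifier first and formatter last, is taken from the description.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Random;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

// Hypothetical pipeline: generate a raw value, post-modify it, then format it.
final class GeneratorPipeline<V, O> {
    private final Supplier<V> generator;
    private final UnaryOperator<V> postModifier;
    private final Function<V, O> formatter;

    GeneratorPipeline(Supplier<V> generator,
                      UnaryOperator<V> postModifier,
                      Function<V, O> formatter) {
        this.generator = generator;
        this.postModifier = postModifier;
        this.formatter = formatter;
    }

    /** Runs the post modifier on the raw value before the formatter sees it. */
    O next() {
        V raw = generator.get();
        V modified = postModifier.apply(raw);
        return formatter.apply(modified);
    }
}

final class PipelineExample {
    /** Prices with two decimal places, never below 1.00. */
    static GeneratorPipeline<Double, BigDecimal> prices(long seed) {
        Random random = new Random(seed);
        return new GeneratorPipeline<Double, BigDecimal>(
                () -> random.nextDouble() * 100.0,                             // raw value in [0, 100)
                v -> Math.max(v, 1.0),                                         // post modifier: enforce a floor
                v -> BigDecimal.valueOf(v).setScale(2, RoundingMode.HALF_UP)); // output formatter
    }
}
```

Keeping the formatter as the last step means the same raw generator can feed an int, long or BigDecimal output without being changed.
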
> Dependent Generation
> Some generators might need the value of other properties of an object to generate a meaningful output. So there is a way to provide a corresponding context for these generators. Context could be:
> -	previously generated value(s) of this generator (value repeater, no duplicates in the last n runs, city/zip/country code generation)
> -	other generated values (or in general: other properties) for the object they are filled in (note that generators do not know the meaning of an object, since their context is "value generation")
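
For the first kind of context, "no duplicates in the last n runs", a wrapper that remembers its own recent output could look roughly like this. The class name NoRecentDuplicates and the retry limit are assumptions for the sake of the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical wrapper: rejects values already returned within the last `window` runs.
// Null values are not supported, because ArrayDeque does not accept them.
final class NoRecentDuplicates<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    private final int window;
    private final int maxAttempts;
    private final Deque<T> recent = new ArrayDeque<>();

    NoRecentDuplicates(Supplier<T> delegate, int window, int maxAttempts) {
        if (window < 0 || maxAttempts < 1) {
            throw new IllegalArgumentException("window must be >= 0 and maxAttempts >= 1");
        }
        this.delegate = delegate;
        this.window = window;
        this.maxAttempts = maxAttempts;
    }

    @Override
    public T get() {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            T candidate = delegate.get();
            if (!recent.contains(candidate)) {
                remember(candidate);
                return candidate;
            }
        }
        throw new IllegalStateException("no fresh value after " + maxAttempts + " attempts");
    }

    // Keeps only the most recent `window` values as context for later calls.
    private void remember(T value) {
        recent.addLast(value);
        if (recent.size() > window) {
            recent.removeFirst();
        }
    }
}
```

The wrapped generator stays unaware of the context; only the wrapper holds state, which fits the note that generators do not know the meaning of the object they fill.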

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://opensource.atlassian.com/projects/hibernate/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        


