[jboss-dev-forums] [Design of Messaging on JBoss (Messaging/JBoss)] - Re: Journal Cleanup and Journal Compactor

timfox do-not-reply at jboss.com
Wed Jun 10 05:41:42 EDT 2009


As far as I am concerned, compacting solves all these issues.

We *definitely* need to do compacting, and *maybe* we need to do linked list (although I think that is doubtful).

So we should do compacting first. Then, if there are problems with that approach, we can look at linked list, but linked list is not a priority right now.

Let me describe how I think the compacting should work.

Let's say we have a set of files in the journal:

F0, F1, F2, F3, F4, F5, F6, F7, F8, F9, F10

min files is set to 10

The current file is F6, and F7, F8, F9 and F10 are unused files waiting to be used.

F0, F1, F2, F3, F4 and F5 are files full of data and no longer open; however, data about their contents is held in memory.

We need a compacting thread that intermittently (say every 30 seconds) scans the set of journal files in memory.
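As a rough sketch of such a thread, something like the following could work. The class name CompactorScheduler and the thread name are made up for illustration, and the scan itself is passed in as a plain Runnable; none of this is existing journal code.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs a scan task on one background thread at a fixed delay.
final class CompactorScheduler {
    private final ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "journal-compactor");
            t.setDaemon(true); // do not keep the JVM alive just for compacting
            return t;
        });

    // Schedules scanTask to run repeatedly, waiting 'period' between the end of
    // one run and the start of the next, so scans never overlap.
    void start(Runnable scanTask, long period, TimeUnit unit) {
        executor.scheduleWithFixedDelay(() -> {
            try {
                scanTask.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs;
                // a real implementation would log it here.
            }
        }, period, period, unit);
    }

    void stop() {
        executor.shutdownNow();
    }
}
```

With the interval suggested above, this would be started as start(scanTask, 30, TimeUnit.SECONDS).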

It scans F0..F5, and when it has completed that scan, it can decide which records are no longer needed in each file.

A record is no longer needed in a file if it has already been deleted in any of the files F0..F5.

Once the scanner thread has computed which records are needed and which are not, it can compute what percentage of the total data is dead space.

E.g. the scanner might compute that 72% of the data in files F0..F5 is dead space.
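To make that rule concrete, here is a minimal sketch. It assumes each closed file is represented in memory as a list of add/delete records keyed by a record id; JournalRecord and LiveRecordScan are hypothetical names, not the real journal classes.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory view of one journal record: an add or a delete of an id.
final class JournalRecord {
    final long id;
    final boolean isDelete;

    private JournalRecord(long id, boolean isDelete) {
        this.id = id;
        this.isDelete = isDelete;
    }

    static JournalRecord add(long id) { return new JournalRecord(id, false); }
    static JournalRecord delete(long id) { return new JournalRecord(id, true); }
}

final class LiveRecordScan {
    // Returns the ids that were added and never deleted in any of the scanned files.
    static Set<Long> liveIds(List<List<JournalRecord>> closedFiles) {
        // First pass: collect every id that has a delete record in any file.
        Set<Long> deleted = new HashSet<>();
        for (List<JournalRecord> file : closedFiles) {
            for (JournalRecord r : file) {
                if (r.isDelete) deleted.add(r.id);
            }
        }
        // Second pass: keep the adds whose id was never deleted.
        Set<Long> live = new LinkedHashSet<>();
        for (List<JournalRecord> file : closedFiles) {
            for (JournalRecord r : file) {
                if (!r.isDelete && !deleted.contains(r.id)) live.add(r.id);
            }
        }
        return live;
    }
}
```

Two passes are used because a delete can sit in a later file than the add it cancels.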

We then have a parameter that the user can configure, compactorDeadSpaceThresholdPercentage, e.g. this might have a default of 75%.

If the amount of dead space computed by the scanner >= compactorDeadSpaceThresholdPercentage, then the scanner will compact those files. (With the 72% from the example above and a 75% threshold, it would not compact yet.)
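The threshold check could look roughly like this. It assumes the in-memory data for each closed file gives a total size and a live size in bytes; FileStats and DeadSpaceCheck are invented names for the sketch, and only the threshold parameter comes from the description above.

```java
import java.util.List;

// Hypothetical per-file summary kept in memory for a closed journal file.
final class FileStats {
    final String name;
    final long totalBytes;
    final long liveBytes;

    FileStats(String name, long totalBytes, long liveBytes) {
        this.name = name;
        this.totalBytes = totalBytes;
        this.liveBytes = liveBytes;
    }
}

final class DeadSpaceCheck {
    // Percentage (0..100) of the scanned data that is dead space.
    static double deadSpacePercentage(List<FileStats> files) {
        long total = 0;
        long live = 0;
        for (FileStats f : files) {
            total += f.totalBytes;
            live += f.liveBytes;
        }
        if (total == 0) return 0.0; // nothing scanned, nothing to compact
        return 100.0 * (total - live) / total;
    }

    // Compact only when dead space >= compactorDeadSpaceThresholdPercentage.
    static boolean shouldCompact(List<FileStats> files, double thresholdPercentage) {
        return deadSpacePercentage(files) >= thresholdPercentage;
    }
}
```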

The actual compacting approach goes as follows:

The scanner knows how many new files it will need for the compacted records. Let's say it needs two new files: it can get these from the unused files (e.g. F7, F8, F9 or F10) if they are available.

It then opens the old files F0..F5, loads the wanted records into memory in blocks, and copies them to the new files.
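One way the scanner could work out that number, assuming all journal files have the same fixed size and ignoring any per-file header overhead (both assumptions of this sketch, as is the name CompactPlan):

```java
// Hypothetical helper for sizing a compaction.
final class CompactPlan {
    // Number of fixed-size files needed to hold liveBytes, rounded up.
    static int filesNeeded(long liveBytes, long fileSize) {
        if (fileSize <= 0) throw new IllegalArgumentException("fileSize must be positive");
        if (liveBytes < 0) throw new IllegalArgumentException("liveBytes must not be negative");
        return (int) ((liveBytes + fileSize - 1) / fileSize);
    }

    // True if the pool of unused files can supply everything that is needed.
    static boolean canUseUnusedFiles(int needed, int unusedAvailable) {
        return needed <= unusedAvailable;
    }
}
```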

When this process is finished it will end up with, say, two new files containing the compacted records.

We can then save an empty marker file in the journal directory, which says we are starting the rename.

The old files F0..F5 then need to be renamed so they are no longer used by the journal (but can be recovered in case of a crash).

Then the two new files need to be renamed so they will be picked up by the journal.

When that process completes the marker file can be deleted.

If the server crashes after renaming the old files but before renaming the new files, then the marker file will still exist in the journal. The journal can detect this on startup, and finish the process.

This ensures the journal will still start up after such a crash.

The JournalFile objects also need to be updated in memory when the compact is complete.
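Here is a rough sketch of the rename sequence and the startup check. The marker name compact.marker and the extensions .jrn, .cmp and .old are made up for illustration, as is the class CompactRename. It only handles the crash window described above (old files already renamed, new files not yet renamed).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

final class CompactRename {
    static final String MARKER = "compact.marker"; // hypothetical marker file name

    // Normal path: marker, rename old files away, promote new files, remove marker.
    static void swap(Path dir, List<Path> oldFiles, List<Path> newFiles) throws IOException {
        Path marker = dir.resolve(MARKER);
        Files.createFile(marker); // empty file: "rename in progress"
        for (Path old : oldFiles) {
            // Keep the old data around under another name in case of a crash.
            Path renamed = old.resolveSibling(old.getFileName().toString() + ".old");
            Files.move(old, renamed, StandardCopyOption.ATOMIC_MOVE);
        }
        for (Path compacted : newFiles) {
            promote(compacted);
        }
        Files.delete(marker);
    }

    // Startup path: if the marker is still there, finish promoting the new files.
    // Returns true if an interrupted rename was completed.
    static boolean recover(Path dir) throws IOException {
        Path marker = dir.resolve(MARKER);
        if (!Files.exists(marker)) return false;
        List<Path> pending = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.cmp")) {
            for (Path p : stream) pending.add(p);
        }
        for (Path p : pending) promote(p);
        Files.delete(marker);
        return true;
    }

    // Renames X.cmp to X.jrn so the journal picks it up.
    private static void promote(Path compacted) throws IOException {
        String name = compacted.getFileName().toString();
        String target = name.substring(0, name.length() - ".cmp".length()) + ".jrn";
        Files.move(compacted, compacted.resolveSibling(target), StandardCopyOption.ATOMIC_MOVE);
    }
}
```

The journal would call recover on startup before loading any files, so a half-finished rename is completed first.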

View the original post : http://www.jboss.org/index.html?module=bb&op=viewtopic&p=4236630#4236630
