[jboss-jira] [JBoss JIRA] (JGRP-1448) FILE_PING: Fail to read node file
Peter Nerg (JIRA)
jira-events at lists.jboss.org
Sun May 20 13:13:17 EDT 2012
[ https://issues.jboss.org/browse/JGRP-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12694174#comment-12694174 ]
Peter Nerg edited comment on JGRP-1448 at 5/20/12 1:12 PM:
-----------------------------------------------------------
Patch for the JGRP-1448 tracker.
Contains modifications to the FILE_PING file.
I did the patch against the master branch.
This is my first official patch using Git (I'm more familiar with SVN, so it's a learning curve), so hopefully I managed to create something useful.
was (Author: peter.nerg):
Patch for the JGRP-1448 tracker.
Contains modifications to the FILE_PING file.
> FILE_PING: Fail to read node file
> ---------------------------------
>
> Key: JGRP-1448
> URL: https://issues.jboss.org/browse/JGRP-1448
> Project: JGroups
> Issue Type: Patch
> Affects Versions: 2.12.3
> Environment: Any O/S with a NFS or other type of shared file system
> Reporter: Peter Nerg
> Assignee: Bela Ban
> Labels: FILE_PING, jgroups
> Fix For: 3.0.11, 3.1
>
> Attachments: FILE_PING.java, FILE_PING.java, JGRP-1448.patch
>
>
> When using the FILE_PING protocol it will periodically print the following in the log:
> 2012-03-19 16:20:41,057 [ Timer-5,<ADDR>] WARN [org.jgroups.protocols.FILE_PING] failed reading 83dc9dfe-8dd4-eff2-4474-d57dbaa96143.node: removing it
> This is most likely because all members write to the same directory at arbitrary times, and reads are done without any synchronization with the writes.
> Hence, if the cluster runs long enough, a reader will at some point pick up a file that is only partially written and treat it as corrupt.
> This occurs more often the slower the shared file system is (e.g. a slow NFS mount).
> I will upload a patch with two modifications to the FILE_PING class.
> 1) Writing to a file is done in two steps.
> First we write to a temporary file so that "readAll" does not pick up a half-written file.
> Then we do a semi-atomic move of the tmp file to the proper node file.
> 2) Reading the node files makes a few further attempts should reading a file fail.
> This provides a simple retry mechanism in case the file is half-written and therefore not readable.
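The two modifications described above can be sketched roughly as follows. This is not the attached patch: the class name NodeFileSketch, the ".tmp" suffix, the Parser interface and the retry parameters are all made up for illustration, and the sketch uses java.nio.file (Java 7 or later), which FILE_PING itself may not use. Readers that only look at "*.node" files never see the temporary file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of JGroups.
final class NodeFileSketch {

    /** Hypothetical parser: turns raw file bytes into a value, or throws if they are incomplete. */
    public interface Parser<T> {
        T parse(byte[] raw) throws IOException;
    }

    /** Step 1: write to "<name>.tmp" first, then move it onto the real node file. */
    public static void writeNodeFile(Path target, byte[] data) {
        // The ".tmp" suffix is an assumption; any name the readers ignore would do.
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        try {
            Files.write(tmp, data);
            try {
                // Atomic where the file system supports it; whether an existing
                // target is replaced is implementation specific.
                Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
            } catch (AtomicMoveNotSupportedException e) {
                // Fallback: a plain replace, which is not guaranteed to be atomic.
                Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Step 2: try to read and parse the file a few times before giving up. */
    public static <T> T readWithRetry(Path file, Parser<T> parser, int attempts, long pauseMillis) {
        if (attempts < 1) {
            throw new IllegalArgumentException("attempts must be >= 1");
        }
        IOException last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return parser.parse(Files.readAllBytes(file));
            } catch (IOException e) {
                last = e; // possibly a half-written file; try again shortly
            }
            if (i + 1 < attempts) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        throw new UncheckedIOException(last);
    }
}
```

With the write done this way the retry should rarely be needed on a local disk; on NFS a rename is not guaranteed to be visible atomically to other clients, so keeping the retry as a second line of defence seems reasonable.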