[jboss-jira] [JBoss JIRA] (JGRP-1448) FILE_PING: Fail to read node file

Mon Apr 23 07:23:18 EDT 2012

    [ https://issues.jboss.org/browse/JGRP-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12686565#comment-12686565 ] 

Peter Nerg commented on JGRP-1448:
----------------------------------

Sure, I can do that, but I was unsure of the correct way to go about it.
I got the code from a Git repo, but the SCM tag in the pom file points to CVS, so I was not sure how to create the patch in the "correct" way.
Is there any documentation that specifies how JGroups wants the patch file to be created?

Sorry for using the comment field for this discussion; I didn't know how else to respond.

> FILE_PING: Fail to read node file
> ---------------------------------
>
>                 Key: JGRP-1448
>                 URL: https://issues.jboss.org/browse/JGRP-1448
>             Project: JGroups
>          Issue Type: Patch
>    Affects Versions: 2.12.3
>         Environment: Any O/S with a NFS or other type of shared file system
>            Reporter: Peter Nerg
>            Assignee: Bela Ban
>              Labels: FILE_PING, jgroups
>             Fix For: 3.0.10, 3.1
>
>         Attachments: FILE_PING.java, FILE_PING.java
>
>
> When using the FILE_PING protocol, the following is periodically printed in the log:
> 2012-03-19 16:20:41,057 [ Timer-5,<ADDR>] WARN  [org.jgroups.protocols.FILE_PING] failed reading 83dc9dfe-8dd4-eff2-4474-d57dbaa96143.node: removing it 
> This is most likely because all members write to the same directory at arbitrary times, and reads are done without any synchronization with those writes.
> Hence, after running long enough, at some point a file being read will be half-written and appear corrupt.
> This occurs more often the slower the shared file system is (e.g. a slow NFS mount).
> I will upload a patch with two modifications to the FILE_PING class.
> 1) Writing a file is done in two steps.
> First we write to a temporary file so that the "readAll" method does not pick up a half-written file.
> Then we do a semi-atomic move of the tmp file to the proper node file.
> 2) Reading all node files will make a few re-attempts should reading a file fail.
> This provides a simple retry mechanism in case the file is half-written and therefore not readable.
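
The two modifications described above can be sketched roughly as follows. This is a minimal illustration, not the attached patch and not the actual FILE_PING code: the class NodeFileStore and the methods writeNodeFile and readNodeFile are hypothetical names, the sketch assumes java.nio.file (Java 7 or later), and it treats an empty file as "not readable" as a stand-in for whatever validity check the real protocol applies when deserializing a node file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration only; not part of JGroups.
final class NodeFileStore {

    // Step 1: write to a temp file in the same directory, then move it
    // over the real node file so readers never see a half-written file.
    static void writeNodeFile(Path nodeFile, byte[] data) {
        Path dir = nodeFile.toAbsolutePath().getParent();
        try {
            // The ".tmp" suffix lets readers that list "*.node" skip it.
            Path tmp = Files.createTempFile(dir, nodeFile.getFileName().toString(), ".tmp");
            try {
                Files.write(tmp, data);
                try {
                    // Replacing an existing target under ATOMIC_MOVE is
                    // implementation specific; POSIX rename() replaces it.
                    Files.move(tmp, nodeFile, StandardCopyOption.ATOMIC_MOVE);
                } catch (AtomicMoveNotSupportedException e) {
                    // Fallback: a plain, non-atomic replace.
                    Files.move(tmp, nodeFile, StandardCopyOption.REPLACE_EXISTING);
                }
            } finally {
                Files.deleteIfExists(tmp); // no-op once the move succeeded
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Step 2: retry a few times before giving up on a node file.
    static byte[] readNodeFile(Path nodeFile, int maxAttempts, long sleepMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                byte[] data = Files.readAllBytes(nodeFile);
                if (data.length > 0) {
                    return data;
                }
                // Assumption: an empty file stands in for "half written".
                last = new IOException("empty node file: " + nodeFile);
            } catch (IOException e) {
                last = e;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        throw new UncheckedIOException(last);
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("fileping-demo");
        Path node = dir.resolve("example.node");
        writeNodeFile(node, "member-data".getBytes(StandardCharsets.UTF_8));
        byte[] read = readNodeFile(node, 3, 50);
        System.out.println(new String(read, StandardCharsets.UTF_8));
    }
}
```

Creating the temp file in the same directory as the node file keeps the move on a single file system, which is what makes a rename-style move possible at all. How atomic that rename is on an NFS mount depends on the client and server, so the retry on the read side remains useful as a second line of defence.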

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.jboss.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira