[jboss-jira] [JBoss JIRA] (JGRP-1448) FILE_PING: Fail to read node file

Bela Ban (JIRA) jira-events at lists.jboss.org
Wed May 16 03:01:18 EDT 2012


    [ https://issues.jboss.org/browse/JGRP-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693128#comment-12693128 ] 

Bela Ban commented on JGRP-1448:
--------------------------------

Hi Peter,

I pushed this into 3.0.11, as I had to release 3.0.10 today. It would be helpful if you could create that pull request soon, so I can change this.
You could also attach a patch against either 3.0.10 or 3.1 (master) if that's easier for you.

                
> FILE_PING: Fail to read node file
> ---------------------------------
>
>                 Key: JGRP-1448
>                 URL: https://issues.jboss.org/browse/JGRP-1448
>             Project: JGroups
>          Issue Type: Patch
>    Affects Versions: 2.12.3
>         Environment: Any O/S with a NFS or other type of shared file system
>            Reporter: Peter Nerg
>            Assignee: Bela Ban
>              Labels: FILE_PING, jgroups
>             Fix For: 3.0.11, 3.1
>
>         Attachments: FILE_PING.java, FILE_PING.java
>
>
> When using the FILE_PING protocol it will periodically print the following in the log:
> 2012-03-19 16:20:41,057 [ Timer-5,<ADDR>] WARN  [org.jgroups.protocols.FILE_PING] failed reading 83dc9dfe-8dd4-eff2-4474-d57dbaa96143.node: removing it 
> This is most likely due to that all members write randomly to the same directory and reading is done without any synchronization to the writes.
> Hence running for long enough some point in time the read file will be corrupt.
> This occurs more often the slower the shared file system is (e.g. a slow NFS mount).
> I will uploaded a patch in which there are two modifications to the FILE_PING class.
> 1) Writing to files are done in two steps.
> First we write to a temporary file in order to avoid that the "readAll" methods picks up a half written file.
> Then we do a semi-atomic move of the tmp file to the proper node fil
> 2) Reading all node files will perform a few re-attempts should it fail to read a file.
> This is to provide a simple re-try mechanism should the file be half written and therefore not readable.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.jboss.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        


More information about the jboss-jira mailing list