new leaf fail status 270734296 pgno 0



#1 jamieg

Posted 25 February 2020 - 04:15 PM

Hi guys

I wonder if anyone can help me out. I have been looking through some logs and came across this:
new leaf fail status 270734296 pgno 0  

We have multiple occurrences of this. I am pretty sure it relates to the Btree structure of the DB, and in particular it might be connected to this server not having had a checkpoint restore operation carried out recently.
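To put a number on "multiple occurrences", a short script can tally the message in the server log. This is only a sketch: the pattern is built from the single line quoted above, so other p4d versions may word the message differently, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the one log line quoted in this thread (assumption:
# every occurrence follows the same "status <n> pgno <n>" layout).
NEW_LEAF_FAIL = re.compile(r"new leaf fail status (\d+) pgno (\d+)")


def count_new_leaf_failures(lines):
    """Return a Counter keyed by (status, pgno) for every matching line."""
    counts = Counter()
    for line in lines:
        match = NEW_LEAF_FAIL.search(line)
        if match:
            status, pgno = int(match.group(1)), int(match.group(2))
            counts[(status, pgno)] += 1
    return counts


def summarize_log(path):
    """Read a log file (hypothetical path) and print one line per distinct error."""
    with open(path, errors="replace") as log:
        counts = count_new_leaf_failures(log)
    for (status, pgno), n in counts.most_common():
        print(f"status={status} pgno={pgno} occurrences={n}")
```

Calling `summarize_log` with the path to the server log shows whether it is always the same status and page number, which is useful detail to include in a support ticket.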


Does anyone have any insight into what this error means? I can't find a single mention of it online, not even in the Berkeley DB notes :/

Appreciate any help

Kind regards
Jamie

#2 Sambwise

Posted 25 February 2020 - 07:06 PM

I'm not super familiar with the db internals, but "new leaf fail" sounds like a db write required a new leaf node and it failed.  Could be a transient disk failure (although I'd expect to see the filesystem error in the log if that were the case), or could be some low-level inconsistency in the db file.  Have you done a "p4d -xv"?  If that reports errors you'll probably want to do a checkpoint/journal restore to get everything rebuilt in a consistent state.

#3 jamieg

Posted 27 February 2020 - 02:33 PM

Thanks for the reply, Sambwise. I haven't run a p4d -xv yet. (What exactly does it do? Is it read-only, i.e. no changes made?) I do plan to run some DB verification on a replica. Having never seen this message before, I'm not sure what to make of it. In the past I've seen messages relating to 'Btree', which clearly indicated a potential corruption issue.

Hypothetically, if a transaction is atomic (as is the case with p4) and a DB write operation failed, surely the p4 command would fail too? We certainly have not seen any adverse behavior; I simply noticed this in the log file.

Cheers
Jamie

#4 Sambwise

Posted 27 February 2020 - 03:11 PM

"p4d -xv" is the verification command; it's read-only.

The p4 command that failed to write to the db would fail, but if the failure is somehow transient users might not necessarily report it.  Or it could be that when this error condition happens the db itself does some kind of retry so it doesn't result in an actual failure.  Hopefully "p4d -xv" will have more info about what state things are in.
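For anyone scripting the check, here is a minimal sketch of wrapping the verification run in Python. It assumes `p4d` is on the PATH and that the server root is passed with `-r`; the root path and function names are hypothetical, and nothing is assumed about the format of the output, which is simply returned for a human to read. Pointing it at a replica, as planned above, keeps the load off the production server.

```python
import subprocess


def build_verify_command(p4root, p4d_binary="p4d"):
    """Build the argument list for a database verification run."""
    # -r selects the server root directory; -xv is the verification
    # flag discussed in this thread.
    return [p4d_binary, "-r", p4root, "-xv"]


def run_verify(p4root, p4d_binary="p4d"):
    """Run the verification and return (exit_code, combined_output)."""
    result = subprocess.run(
        build_verify_command(p4root, p4d_binary),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep errors in order with normal output
        text=True,
        check=False,  # let the caller decide what a non-zero exit means
    )
    return result.returncode, result.stdout
```

A call such as `run_verify("/path/to/replica/root")` (placeholder path) returns the exit code and the raw text, which can then be saved alongside the log excerpts.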

#5 jamieg

Posted 28 February 2020 - 09:44 AM

Thanks for your help, Sambwise.  

After digging into this a little more, I found a reference to this issue. We had reported it to Perforce support some time ago (years), and it appeared to be a strange fringe case with the JFS filesystem. I will gather more information about it, but right now it appears to be something that has occurred for a long time and doesn't actually cause the transaction to fail. Like you say, it's transient.

Thanks again for your help. I'll update with more information as and when I do more investigation.


Cheers
Jamie

#6 Miles O'Neal

Posted 28 February 2020 - 05:46 PM

I did a bunch of research 7-8 years ago when I moved our servers from Solaris to Linux. We went with xfs, and haven't had a single hiccup. Counting replicas and test/dev servers, we now have over a dozen physical servers and VMs, most of which get beat on fairly hard.
Just a data point.



