Protects table usage statistics

6 replies to this topic

#1 Miles O'Neal

Posted 15 August 2019 - 09:25 PM

Is there any visibility into protects table usage? We have a fairly complex protects table, which I want to optimize for performance (without breaking authorization). It would be nice to have real data on which rules are applied most often, or which paths are tested most often. I might be able to pull a very rough approximation out of the logs by testing client names as well as depot and disk paths, but that's really kludgey and, I suspect, not very accurate.

Am I missing some built-in functionality? Is there another tool to help with this? I toyed with the p4 log analyzer but that is slow (we have large logs) and takes a lot of space if we want to track data over time. It's useful for some things, and I would consider it for this, but it doesn't excite me for this purpose.

Thanks,
Miles
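
A minimal sketch of the rough log-based approximation described in this post, assuming the server log records each command as a single-quoted 'user-<command> <args>' string at the end of the line. That line shape, the function name count_depot_prefixes, and the depth parameter are assumptions for illustration, not a documented interface. It only counts paths that appear in command arguments (including client-syntax paths), which is not the same as counting which protections lines the server actually consulted.

```python
import re
from collections import Counter

# Assumed log line shape (check against your own server log):
#   2019/08/15 21:25:00 pid 100 miles@ws1 10.0.0.1 [p4/...] 'user-sync //depot/projA/...'
COMMAND_RE = re.compile(r"'(?P<cmd>user-[\w-]+)(?P<args>[^']*)'")
PATH_RE = re.compile(r"//[^\s']+")


def count_depot_prefixes(lines, depth=2):
    """Count how often each path prefix shows up in logged command arguments.

    depth=2 groups by "//depot/project"; this grouping is a hypothetical
    choice, pick whatever matches the layout of your protections table.
    """
    counts = Counter()
    for line in lines:
        match = COMMAND_RE.search(line)
        if not match:
            continue  # not a command line (headers, tracking output, ...)
        for path in PATH_RE.findall(match.group("args")):
            parts = path[2:].split("/")
            counts["//" + "/".join(parts[:depth])] += 1
    return counts
```

Feeding it an open log file, for example count_depot_prefixes(open("log")).most_common(20), would list the most frequently named prefixes. Streaming the file line by line keeps memory use flat even for large logs.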

#2 Sambwise

Posted 16 August 2019 - 02:24 PM

I don't think there's any good tooling for this -- you might be able to do *something* with -vmap=N debugging flags but I don't know how useful it would be with respect to the task you're trying to accomplish.

Does your protection table have any lines with multiple wildcards?  If so, I can shortcut a lot of investigation for you and tell you that's the thing to optimize out.  :)  Here's a script I wrote in large part to make it easier to eliminate expressions like "//depot/*/foo/..." which tend to be the biggest performance killers:  https://swarm.worksh...main/protexp.pl

#3 Miles O'Neal

Posted 16 August 2019 - 05:13 PM

We do have some, some of which seem more inevitable than Thanos's confidence. I actually have a ticket open regarding particular cases. At some number of lines, it has to be more efficient to check (for example) //foo/bar_*/... than X number of lines with /foo/bar_<number>_final. But what is X? 3? 10? 100? (Not 3, that much I'm sure of.)

Far worse are the filters that keep out the myriad files third-party tools generate, which historically caused problems when checked in, either because of size or because of permission issues. I'm going to engage consulting on that because the list we got from the CAD team includes lines with two and three wildcards.

#4 Miles O'Neal

Posted 16 August 2019 - 05:19 PM

Where is -vmap documented?

#5 Sambwise

Posted 16 August 2019 - 05:43 PM

Miles O'Neal, on 16 August 2019 - 05:13 PM, said:

We do have some, some of which seem more inevitable than Thanos's confidence. I actually have a ticket open regarding particular cases. At some number of lines, it has to be more efficient to check (for example) //foo/bar_*/... than X number of lines with /foo/bar_<number>_final. But what is X? 3? 10? 100? (Not 3, that much I'm sure of.)

The ceiling on that number is pretty high (thousands if not millions).  The catch is that the protection table isn't interpreted in a vacuum; it gets joined with every other mapping involved in running a given command (which usually includes the client view).  So in some situations your multiple wildcards behave fine, and then one person puts a couple of multi-wildcard lines in their own client (or a branch view, or a command argument, or all of the above) and now suddenly the computed mapping is five million lines long because things went from O(n) to O(k^n) or whatever (I'm fuzzy on the math, but it's bad).  Every time someone hits a map.joinmax error (or crashes their server because they overrode map.joinmax and overflowed memory) it's because they had multiple wildcards in their protection table and it was fine until it suddenly wasn't.

Quote

Where is -vmap documented?


Here's what the different levels correspond to: https://swarm.worksh.../map/mapdebug.h

You can look at the debugging statements in the rest of the map/ folder that are gated on those flags to get an idea of what they're dumping out.  This line seems like the most likely to be useful for answering the question of "how often does this mapping line get referenced?":  https://swarm.worksh.../maphalf.cc#419

I'm not sure what you'd actually do with that data or if it would be in any way useful, but that's about the only window there is into what the mapping code is thinking.

#6 Matt Janulewicz

Posted 19 August 2019 - 10:05 PM

I'm not sure if this helps any, but a few years ago we had some performance problems that we had a hunch were related to the protections table. We went through the pain of eliminating as many wildcards as possible, and especially lines with more than one. Or, in our case most importantly (I think), wildcards that had additional directories after them:

//some/path/*/another_path/...

Sam alludes to times when a user might do a goofy query that hammers that type of line in the protections table, so to speak. My memory is a bit shady but I feel like I remember that %% tags in a workspace view plus wildcards in the protections table could wreak havoc if wielded in just the right way.

I've got it down to five lines with an '*' in them, and all of them are at the end of the path/directory:

//some/path/*
//some/path_*/...

We may have been lucky in that the 'weird' lines we had weren't really necessary any more and were just old cruft. It was easy to get buy-in to remove them. The remaining wildcards contain no more than (checks notes) 47 directories each so it doesn't add up to much in our case.

Since we got through that I don't recall ever asking myself 'Maybe it's the protections table?'

We 'only' have around 2,800 lines in there. Anecdotally, 'p4 protects' is one of the commands I run most often each day and I can't remember the last time I thought 'this is taking too long'. It returns in less than a second every time. That used to not be the case.

I don't think it does anything, but we also try to keep our protections table in alphabetical order by depot path. I think it generally has to take in the table as a whole every time anyway, but it's prettier this way and helps break it down by depot (now that we have real comments available!) and is easier for humans to read.

I also don't know if it would help any, but if you have a 'p4 protects' command you can run and it consistently takes a long time (or any other command for that matter), it helps to use 'p4 -Ztrack', which will dump out db table statistics about the transaction. I suspect complicated queries will have a higher row/scan value than expected. That's 100% conjecture, but whenever I get any random, odd performance issues, '-Ztrack' is the first line of attack (since I learned about it a few months ago.)
-Matt Janulewicz
Staff SCM Engineer, Perforce Administrator
Dolby Laboratories, Inc.
1275 Market St.
San Francisco, CA 94103, USA
majanu@dolby.com
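
To compare runs of the '-Ztrack' suggestion above, a minimal sketch that pulls the per-table row counts out of the tracking output. It assumes each table appears as a "--- db.<name>" line followed by a line containing "rows get+pos+scan put+del G+P+S P+D"; that layout is an assumption to verify against your server version. parse_track and track_command are hypothetical helper names.

```python
import re
import subprocess

TABLE_RE = re.compile(r"^--- (db\.\w+)\s*$")
ROWS_RE = re.compile(r"rows get\+pos\+scan put\+del (\d+)\+(\d+)\+(\d+) (\d+)\+(\d+)")


def parse_track(output):
    """Map each db table named in the tracking output to its get/pos/scan counts."""
    stats = {}
    table = None
    for line in output.splitlines():
        header = TABLE_RE.match(line)
        if header:
            table = header.group(1)
            continue
        rows = ROWS_RE.search(line)
        if rows and table:
            get, pos, scan = (int(rows.group(i)) for i in (1, 2, 3))
            stats[table] = {"get": get, "pos": pos, "scan": scan}
    return stats


def track_command(args=("protects",)):
    """Run `p4 -Ztrack <args>` and parse the result (needs a logged-in p4 client)."""
    result = subprocess.run(
        ["p4", "-Ztrack", *args], capture_output=True, text=True, check=True
    )
    # Parse both streams, since where the tracking lines land is not assumed here.
    return parse_track(result.stdout + result.stderr)
```

A call such as track_command(("protects",)).get("db.protect") would then give the scan count for the protections table, which can be logged over time and compared before and after a cleanup.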

#7 Miles O'Neal

Posted 19 August 2019 - 11:14 PM

"p4 protects" runs plenty fast. It's more of a case of watching the load average slowly creep up as things are added to the protects table when overall Helix usage isn't changing.
Right now all our projects are added in alphabetical order, which is about to change. We'll have a big section of older projects, since they get little activity, followed by a small section of active projects. But we still have a few dozen multi-wildcard lines near the bottom of the table that I've yet to find a way to eliminate. That's where I'm expecting to burn some prepaid consulting time.




