config: startup pull batch logic

batch min-size max-size configure pull startup


#1 Miles O'Neal (Advanced Member, 153 posts)

Posted 27 November 2019 - 05:26 PM

I can't find any documentation about how pull, --batch, --min-size, and --max-size work under the covers.

Let's use a modified version of the example from Perforce's documentation:

startup.2=pull -u -i 1 --batch=1000 --min-size=1    --max-size=2047
startup.3=pull -u -i 1 --batch=500  --min-size=2048 --max-size=4096
startup.4=pull -u -i 1 --batch=10   --min-size=4097
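To make the size bands concrete, here is a minimal Python sketch (helper names are illustrative, not p4d internals) of how a file's size maps onto the three startup threads in the config above:

```python
# Sketch: route a file to a pull thread by size, mirroring the
# startup.2/3/4 bands above. Thread names and this helper are
# illustrative only; p4d does this internally.

BANDS = [
    # (thread, min_size, max_size); max_size=None means unbounded
    ("startup.2", 1,    2047),
    ("startup.3", 2048, 4096),
    ("startup.4", 4097, None),
]

def route(size):
    """Return the startup thread whose band contains `size`, or None."""
    for thread, lo, hi in BANDS:
        if size >= lo and (hi is None or size <= hi):
            return thread
    return None

# route(100)  -> "startup.2"
# route(2048) -> "startup.3"
# route(5000) -> "startup.4"
```

Note the bands are contiguous and non-overlapping, so every file of size >= 1 lands in exactly one thread.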

If pull thread 1 gets 100 files of the right size, but file 101 is too big, does pull thread 1 just go copy the 100 it already found, or skip file 101 and keep trying to get 1000 files into its job?

#2 Matt Janulewicz (Advanced Member, 197 posts, San Francisco, CA)

Posted 29 November 2019 - 08:42 AM

It's been about a year since I messed with this, but my observation at the time (I'm fairly certain) was that each thread plucks one file at a time off the top of the queue. So startup.2 will not wait until it has 1000 files to do anything. If this were not the case, then if you have 999 files smaller than 2048, they'd never transfer.

I don't know if they've improved this since a year or two ago, but I nixed trying to throttle my queues this way because it turns (turned?) out that the pull threads did not work independently of each other. Or maybe they did, depending on how you look at it.

An example scenario: say, in the above configuration, the top 50 files in your pull queue (p4 pull -ls) are bigger than 4097 bytes. All those files would be earmarked for startup.4, but startup.2 and 3 would not go searching for smaller files further down in the queue. They'd wait until those smaller files floated to the top before processing them.

I think this means that the pull threads only look at the first/next file in the queue (in this case file #11) to figure out which thread to place it in. There appears (appeared) to be no hard-target search by idle threads for files further down the list to transfer. If this hasn't changed (I would have been trying this on p4d 2018.1), it seems to limit the usefulness of separating queues this way.
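That head-of-queue behavior can be illustrated with a toy model (my sketch, not p4d code): if only the file at the front of the queue is ever examined, a run of large files serializes everything behind it, even when a small-file thread is sitting idle with its work already queued:

```python
from collections import deque

def drain_head_only(sizes, big_threshold=4097):
    """Toy model of the observed behavior: only the head of the queue
    is ever examined. Each 'tick' the head file goes to the big-file
    or small-file thread; idle threads never search deeper for work."""
    queue = deque(sizes)
    timeline = []  # which thread class handled each tick
    while queue:
        size = queue.popleft()
        timeline.append("big" if size >= big_threshold else "small")
    return timeline

# 50 big files at the head and one small file at the back: the
# small-file thread gets nothing for 50 ticks, even though its one
# file was queued the whole time.
ticks = drain_head_only([10_000] * 50 + [100])
```

Under a smarter scheduler, the small file would be dispatched on tick one; in this model it waits for everything ahead of it.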
-Matt Janulewicz
Staff SCM Engineer, Perforce Administrator
Dolby Laboratories, Inc.
1275 Market St.
San Francisco, CA 94103, USA
majanu@dolby.com

#3 Miles O'Neal (Advanced Member, 153 posts)

Posted 02 December 2019 - 09:48 PM

Yes, I assume that if there are no more files to transfer in the queue, it goes ahead and transfers. I'm thinking more of large queues. So if I have 50,000 files with a size mix skewed toward the small, then after loading 500 tiny files into thread 1, does a large file going into thread 3 cause thread 1 to start transferring, or does thread 1 resume filling up in hopes of reaching 1000 files?

And does it wait until it's full (or as full as it will get this time around) to start compressing, or is it compressing along the way?

#4 Miles O'Neal (Advanced Member, 153 posts)

Posted 02 December 2019 - 10:02 PM

Or does it skip the big file and try to keep filling up thread 1 before moving on to thread 2?
Either way, I'd hope each thread would get filled as quickly as possible, then transfer in parallel. I know that without batching the threads work in parallel just fine.
Running "top" with the batched config, I rarely see more than one p4d running at a time, even now with over 5,000 files active, spanning 62 MB, per "p4 pull -ls". Running "tail -f" on the log file seems to confirm this behavior.

#5 Miles O'Neal (Advanced Member, 153 posts)

Posted 02 December 2019 - 11:55 PM

The behavior with large batch numbers was extremely bursty in terms of network I/O. It's much flatter without batch, and in the hours since I turned off batching, I'm seeing better overall throughput. Ugh.

#6 Miles O'Neal (Advanced Member, 153 posts)

Posted 03 December 2019 - 11:13 PM

After discussions with support and lots of tweaking and monitoring, it appears that the pull threads are not independent in terms of scheduling. They get set up with one or more files each to run, then they all run, then they sit idle until they're all free, at which point the next cycle starts. This comes from watching them both with and without batching.

I currently have millions of files queued for transfer. All the threads run, then there's a seven-second pause while a single p4d scans the entire table and allocates a file to each thread, and then they all run. Lather, rinse, repeat.

Even without the delay of rescanning the table (instead of just pulling files off the front of a queue), this seems like it would be suboptimal for a busy server. I expected each thread to run, check whether there was more work to do, and run again. It appears the master thread does all the rdb.lbr scanning, and possibly passes out files to the worker threads.
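The cost of that lock-step cycle can be sketched with a toy model (my assumptions, not p4d internals): each cycle pays the scan, hands one file to each thread, and then waits for the slowest thread before the next scan, versus independent threads that each grab the next file as soon as they're free:

```python
import heapq

def lockstep_time(durations, threads, scan_cost):
    """Toy lock-step model: each cycle pays scan_cost, gives one file
    to each thread, then waits for the slowest before rescanning."""
    total = 0.0
    for i in range(0, len(durations), threads):
        batch = durations[i:i + threads]
        total += scan_cost + max(batch)
    return total

def independent_time(durations, threads):
    """Greedy model of independent threads: each thread takes the next
    file the moment it frees up (classic list scheduling)."""
    free_at = [0.0] * threads
    heapq.heapify(free_at)
    for d in durations:
        t = heapq.heappop(free_at)
        heapq.heappush(free_at, t + d)
    return max(free_at)

# One slow file per cycle drags every other thread along with it:
durations = [1.0, 1.0, 1.0, 10.0] * 25  # 100 files, mostly fast
# lockstep_time(durations, threads=4, scan_cost=7.0) comes out far
# larger than independent_time(durations, threads=4)
```

The numbers (seven-second scan, transfer times) are made up for illustration; the point is that lock-step pays the scan every cycle and idles three threads while the slowest one finishes.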

I didn't get much insight into how batching works. Instead I was told, "Batching can do good in some environments but can cost you more in the long run. We don't recommend it for that reason." That's disappointing, and the docs should reflect that, IMO. All they say is, "The default value of 1 is usually adequate. For high-latency configurations, a larger value might improve archive transfer speed for large numbers of small files. (Use of this option requires that both master and replica be at version 2015.2 or higher.)"

And apparently 130ms ping time is not high latency. While I know it could be worse, in my world, it is high.

So, the moral of the story is to run compression and as many single-file pull threads as you can get away with.
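Following that moral, a replica config might look something like the sketch below, in the same style as the earlier example: several un-batched pull threads plus link compression. (The rpl.compress configurable is real, but its level semantics vary by p4d release; check "p4 help configurables" before copying this.)

```
startup.2=pull -u -i 1
startup.3=pull -u -i 1
startup.4=pull -u -i 1
startup.5=pull -u -i 1
rpl.compress=4
```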


#7 Matt Janulewicz (Advanced Member, 197 posts, San Francisco, CA)

Posted 05 December 2019 - 02:19 AM

I think that was pretty much my experience when testing this, but you posit an interesting question I didn't think of. When I tested I had five 'big file' queues and five 'small file' queues, but I had the batch set to 10 on all of them.

What I observed is that if you had 100 large files at the start of the queue, and 1 small file at the end, the small file would not get into the pull threads until all 100 other files were at least in the queue. It still felt/smelled like the pull threads were only looking at the top of the total queue and there was no kind of smart management/distribution going on.

Of course I have all kinds of ideas on how to improve this. :) One thorn in my side is that if rdb.lbr gets big (not sure how big, seems to vary) you'll see big stalls in the queues, and getting more files into the queue takes a long time. I don't think rdb.lbr is efficient for really big transactions (for instance, every so often I build a server pedantically with 'p4 verify -qzt', but I have to break it down into small chunks.) In our configuration, five batches of a million files transfer way, way faster than one batch of 5 million files. And it doesn't hurt to remove rdb.lbr and restart the server in-between chunks.
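The "small chunks" workaround is easy to script; here's a hedged sketch (depot paths and chunk size are made up) that splits a list of depot subtrees into fixed-size batches, one `p4 verify -qzt` invocation per batch:

```python
def chunked(items, size):
    """Split items into consecutive batches of at most `size`."""
    return [items[i:i + size] for i in range(0, len(items), size)]

# Hypothetical depot subtrees; verify each batch separately so rdb.lbr
# stays small, cleaning up and restarting between batches as described.
subtrees = ["//depot/proj%d/..." % i for i in range(10)]
batches = [["p4", "verify", "-qzt"] + b for b in chunked(subtrees, 2)]
# batches is 5 command lines of 2 subtrees each, ready for subprocess
```

The commands are only built, not executed, so you can interleave the rdb.lbr removal and server restart between batches however your environment requires.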

Anyway, it would be neat if rdb.lbr were a first-class db file. Or if it were managed through p4 keys. Or if, when you have max/min/batch set, the file were marked for its target pull thread when it goes into the queue, not when it's popped out of it.
-Matt Janulewicz
Staff SCM Engineer, Perforce Administrator
Dolby Laboratories, Inc.
1275 Market St.
San Francisco, CA 94103, USA
majanu@dolby.com

#8 Miles O'Neal (Advanced Member, 153 posts)

Posted 05 December 2019 - 11:22 PM

Agreed. And it would help if the pull threads ran independently, each just scarfing the next appropriate file from the queue.
Oh, well.




