

"p4 print" slow on edge

edge server p4 print p4 sync speed slow

3 replies to this topic

#1 rob777


Posted 17 October 2018 - 01:24 PM

Hi Guys,

I'm currently developing an application to download files from Perforce servers.
For a few reasons, creating a workspace per user of the application (a few hundred users across multiple projects) is not ideal.

Because of this, the application uses "p4 print" (multiple threads) to download revisions.
Being able to read the bytes directly from stdout further simplifies our solution.
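
For readers who want to picture the setup, here is a minimal sketch of fetching several revisions at once by reading "p4 print" output from stdout. It is written in Python purely for illustration and is not the application's actual code; `p4_print_argv`, `fetch_revisions`, and the `workers` default are hypothetical names and values. The only p4 detail assumed is the `-q` flag, which suppresses the header line that "p4 print" emits before the file content.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def p4_print_argv(depot_path):
    # -q suppresses the one-line header that p4 print emits before the content.
    return ["p4", "print", "-q", depot_path]


def fetch_revisions(depot_paths, build_argv=p4_print_argv, workers=4):
    """Fetch each path's content from a child process's stdout, several at a time.

    Returns a dict mapping each depot path to the bytes the child wrote.
    """
    def fetch(path):
        # check=True raises CalledProcessError if the child exits non-zero.
        done = subprocess.run(build_argv(path), stdout=subprocess.PIPE, check=True)
        return path, done.stdout

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(fetch, depot_paths))
```

Note that this keeps every file fully in memory, which is only reasonable for small revisions; large files would need to be streamed instead.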

We are however running into performance issues when using "p4 print" on our edges.

Edge:
- Running "p4 sync" (parallel) for the first time against the edge results in download speeds of ~25 MiB/s.
- Running "p4 sync" (parallel) for the second time against the edge results in download speeds of ~100 MiB/s.
- Running "p4 print" (multiple threads) against the edge (even after having run p4 sync before on the same files) maxes out at ~25 MiB/s.

"p4 info" run against edge:
...
Server version: P4D/LINUX26X86_64/2017.2/1579154 (2017/10/12)
...
Server services: edge-server
...



Main Server:
- Running "p4 print" (single thread) against the main server results in ~25 MiB/s.
- Running "p4 print" (multiple threads) against the main server results in ~100 MiB/s.

Our guess is that the edge does not look at cached revisions when "p4 print" is used.

Is this correct? If yes, can "p4 print" (or similar) be called in a way to return the cached revision?
Could something else cause this?

Thanks,
feyrob

#2 p4rfong


Posted 18 October 2018 - 08:27 PM

I just tried renaming the versioned file on the commit server and ran "p4 print" successfully from the edge server.  Then I renamed the versioned file on the edge server and received a Librarian checkout error.  Therefore the edge server looks at its own cached revisions.  Please check to see if your edge server has the versioned files by running

p4 verify -q //<depot>/<dir>/<file>

and make sure there is no transfer of the file in place by running

p4 pull -l

(and search for the problem file).  You can use

p4 pull -d -f  //<depot>/<dir>/<file> -r 1.<changelist>

to remove entries in the "p4 pull -l" output.

I am surprised that re-running a "p4 sync" is slower.  If your edge server has the versioned files (as shown by "p4 verify"), run "p4 -Ztrack sync" several times to verify that performance slows down on repeated runs.  The -Ztrack output may provide clues.

#3 rob777


Posted 22 October 2018 - 12:43 PM

Using your provided instructions I was able to confirm that the edge caches the file.

Running this command allowed me to confirm (using "Resource Monitor") that the file is served at ~100 MiB/s:

p4 print //depot/some_large_file.dat > c:\nul

(Note that "nul" is a special null device file in Windows that exists in every directory.)

So no problem on the server side of things.


The question then was why this is slow when redirecting stdout to a file or using the "-o" parameter.
For example:
p4 print //depot/some_large_file.dat > c:\pipe_out_file.dat
p4 print -o c:\o_out_file.dat //depot/some_large_file.dat

Both of those commands result in ~20 MiB/s on my machine.

I ran the commands on a few other machines and the results varied between ~10 and 100 MiB/s.

Enabling and disabling write caching on the disks seemed to have some impact, but the speeds were inconsistent (even between runs with unchanged settings).
It seems that Windows has some background file-system work that on some machines maxes out at ~20 MiB/s (no matter if a single or multiple processes do the writing).

Since other applications (and other uses of p4, e.g. p4 sync) can write at 100+ MiB/s without a problem, I concluded that something was different about how 'p4 print' writes its data.
I guessed that 'p4 print' just has very low latency in its output (e.g. it flushes frequently or doesn't buffer).

The solution that gives me good 'p4 print' speeds looks like this:
I create a 'p4 print' sub-process and read its stdout in 4 KiB increments.
I then buffer those up and write them to disk in 2 MiB chunks.
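
The read-small, write-large idea described above can be sketched roughly as follows. This is a Python illustration, not the C# used in this thread; `copy_buffered` and its parameter names are hypothetical, and the 4 KiB and 2 MiB defaults are simply the values mentioned in this post rather than tuned constants.

```python
import subprocess

KIB = 1024
MIB = 1024 * KIB


def copy_buffered(argv, out_path, read_size=4 * KIB, flush_size=2 * MIB):
    """Run argv, read its stdout in small pieces, write to out_path in large chunks.

    Returns the total number of bytes written.
    """
    total = 0
    pending = bytearray()
    # bufsize=0 gives an unbuffered pipe, so each read() is a single OS-level read.
    with subprocess.Popen(argv, stdout=subprocess.PIPE, bufsize=0) as proc, \
            open(out_path, "wb") as out:
        while True:
            piece = proc.stdout.read(read_size)
            if not piece:  # an empty read means the child closed its stdout
                break
            pending += piece
            if len(pending) >= flush_size:
                out.write(pending)  # one large write instead of many small ones
                total += len(pending)
                pending.clear()
        if pending:  # write whatever is left after the last full chunk
            out.write(pending)
            total += len(pending)
    # Leaving the with-block waits for the child, so returncode is set here.
    if proc.returncode != 0:
        raise RuntimeError(f"{argv[0]} exited with code {proc.returncode}")
    return total
```

A call might look like `copy_buffered(["p4", "print", "-q", "//depot/some_large_file.dat"], "some_large_file.dat")`. Whether this helps on a given machine presumably depends on the same disk and OS factors described above.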

I don't have time to test this further, but I'm guessing that this is only a problem on a subset of our Windows 10 machines.
Most server machines (some of them with iSCSI mounted drives) did not have this problem.
I did not test this on Linux.

As a little side note...
I also had slow speed when calling 'p4 print' in an application without redirecting stdout.
I assume that the slowness was caused by the sub-process 'directing' its output to the parent process where it hits some bottleneck.

This C# code gets good speed from 'p4 print':


// requires: using System.Diagnostics;

static void TestFast() {
	// Start "p4 print" with stdout redirected so the bytes can be read directly.
	using (var process = new Process {
		StartInfo = new ProcessStartInfo {
			FileName = "p4.exe",
			Arguments = "print -q //some_depot/some_large_file.dat",
			RedirectStandardOutput = true, // not redirecting would result in slow speed!
			CreateNoWindow = true,
			UseShellExecute = false
		}
	}) {
		process.Start();

		const int kib = 1024;
		const int mib = 1024 * kib;

		// = One Shot Speed Measurements =
		// (for better values a proper benchmark would need to be written and then run on multiple machines)
		// size of read -> speed
		// 1 byte ->  64 KiB/s
		// 1 KiB  ->  50 MiB/s
		// 2 KiB  ->  94 MiB/s
		// 4 KiB  -> 100 MiB/s
		// 1 MiB  ->  30 MiB/s

		int size = 4 * kib;

		byte[] buffer = new byte[size];
		var stdout = process.StandardOutput.BaseStream;

		// Read returns 0 only at end of stream, so output still buffered in the
		// pipe is not lost if the process exits before everything has been read.
		int bytesRead;
		while ((bytesRead = stdout.Read(buffer, 0, size)) > 0) {
			// ... do stuff with the first bytesRead bytes of buffer here
		}

		process.WaitForExit();
	}
}
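
The speed figures in the code comment above came from single runs. A slightly more repeatable comparison of read sizes could look like the sketch below. It is in Python rather than C#, the function names and the `repeats` default are hypothetical, and the timing includes process start-up, so it is only meaningful for large files.

```python
import subprocess
import time

MIB = 1024 * 1024


def measure_read_speed(argv, read_size):
    """Return MiB/s observed while draining argv's stdout with reads of read_size bytes."""
    total = 0
    start = time.perf_counter()
    # bufsize=0 keeps Python from adding its own buffering on top of the pipe.
    with subprocess.Popen(argv, stdout=subprocess.PIPE, bufsize=0) as proc:
        while True:
            piece = proc.stdout.read(read_size)
            if not piece:  # empty read: the child closed its stdout
                break
            total += len(piece)
    elapsed = time.perf_counter() - start
    return (total / MIB) / elapsed if elapsed > 0 else float("inf")


def compare_read_sizes(argv, sizes, repeats=3):
    """Best-of-`repeats` throughput for each read size, as {size: MiB/s}."""
    return {size: max(measure_read_speed(argv, size) for _ in range(repeats))
            for size in sizes}
```

For example, `compare_read_sizes(["p4", "print", "-q", "//some_depot/some_large_file.dat"], [1024, 2048, 4096, 1024 * 1024])` would rerun the command for each size. The results would still need to be collected on several machines before drawing conclusions.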


#4 p4rfong


Posted 24 October 2018 - 11:10 PM

Thanks for sharing your results!




