

Obliterate leaves a lot of files still on disk

obliterate disk usage

3 replies to this topic

#1 saudia

    Newbie

  • Members
  • 5 posts
  • Location: Pittsburgh

Posted 11 June 2019 - 08:47 PM

I've noticed that my disk usage has been creeping up even though I obliterate files for projects that have concluded (I just back up the latest revision to a regular file server).

Upon inspection I've found that quite a few files are being left behind on the server's disk, hundreds of gigabytes in my case (accumulated over many years, though).

I've read this...
https://forums.perfo..._hl__obliterate

...and ran the command to check for "lazy copies", and there /is/ some output, but the output indicates the lazy copies are just copies of files that were obliterated anyway.  There is literally nothing in any of the current folders that any obliterated file could be pointing to, or vice versa.

Am I doing this wrong?  Should I be "snapping" stuff back before obliterating?

But, more importantly, is it OK to delete the orphaned files from the disk?

Thanks.

#2 Sambwise

    Advanced Member

  • Members
  • 894 posts

Posted 14 June 2019 - 03:27 PM

This *may* be fixed with the latest server upgrade, since the db's tracking of depot archive storage was recently revamped (I haven't played with it yet). Historically, though, it's been pretty common for files to accumulate in the depot archive that don't correspond to any submitted revision (usually as a byproduct of a failed or abandoned submit), and obliterate therefore won't clean them up.

If you want to clean up all of these files, the typical workflow would be something like:

	# snap first so nothing outside this directory still references its archive files
	p4 snap //... //depot/old_project/...
	# obliterate is a preview by default; -y actually removes the metadata
	p4 obliterate -y //depot/old_project/...
	# with the snap done, the leftover archive tree can be removed from disk
	rm -rf p4root/depot/old_project

The only purpose of the "snap" is to ensure that the archive files in that directory are referenced only by depot files in the same directory, so that you can "rm -rf" it without fear of breaking lazy copies in other directories.  You don't necessarily want to do this as part of every obliterate, since it causes duplication of archive files.
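
If you want a belt-and-suspenders check first, snap has a preview flag, so something like the following (using the same example path) shows what would happen without changing anything, and a verify afterwards tells you whether anything else was depending on what you removed:

	# preview which lazy copies would get their own archive files (-n changes nothing)
	p4 snap -n //... //depot/old_project/...
	# after the snap/obliterate/rm, the rest of the depot should still verify clean
	p4 verify -q //...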

#3 saudia

    Newbie

  • Members
  • 5 posts
  • Location: Pittsburgh

Posted 01 July 2019 - 05:39 PM

Thanks.  I'll give it a shot.

#4 Matt Janulewicz

    Advanced Member

  • Members
  • 176 posts
  • Location: San Francisco, CA

Posted 03 July 2019 - 06:44 PM

One other note is that you might have shelved files scattered about that don't immediately seem to correspond to anything. I'm extra paranoid about deleting things that might be used someplace I'm not remembering.
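
If you want to see what's shelved before deleting anything, a rough sketch (the change number is made up):

	# list every shelved changelist on the server
	p4 changes -s shelved
	# show the files held in a particular shelf (12345 is just an example)
	p4 describe -S -s 12345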

In my case I'm lucky that we have an environment with multiple edge servers and warm standbys, and occasionally I get pedantic about it and build one entirely through 'p4 verify -qtz'. Then eventually that'll swap into production as, say, the commit server.

If you were really paranoid you could roll out a replica (this doesn't affect the upstream source server much) and build it out using 'p4 verify -qtz'. Then perhaps use that server to get a file list, compare it to the master server's, and remove whatever's missing.
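
The comparison itself can be as dumb as diffing two 'find' listings; roughly this, where the hostnames are placeholders and the archive root is wherever your depots live:

	# list archive files on each server, relative to the depot root
	ssh master  'cd /perforce/depotdata/depots && find . -type f | sort' > master.list
	ssh replica 'cd /perforce/depotdata/depots && find . -type f | sort' > replica.list
	# anything the verify-built replica never needed is a candidate for removal
	comm -23 master.list replica.list > remove_candidates.txt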

Yet another approach would be to build out a standalone clone of the master server and attempt to figure out what all the librarian (archive) file paths might be using 'p4 fstat'. I also occasionally do this when I want to build a new server faster (than verify -qtz) using rsync. I don't think I can attach a shell script, so I'll paste it at the end here. It's insane, but it gets me pretty close to a full server with no cruft. Note that it only handles the file types we happen to have; it may miss some more esoteric ones.

Anyway, bottom line, I highly recommend NOT just deleting things in production. Set up a second server and populate it in a sane way, and ensure 'p4 verify -q' passes in its entirety. _Then_ figure out what you can delete.
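
To be concrete, the check I mean is nothing fancier than this (newserver:1666 is a stand-in for wherever the second server lives):

	# verify -q only reports problems, so no output means every archive checked out
	p4 -p newserver:1666 verify -q //... > verify.out 2>&1
	[ -s verify.out ] && echo "verify reported problems" || echo "clean"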

Insane fstats to follow:

#!/bin/bash

# scans a depot path and works out the archive files/directories to rsync while seeding a new server
if [ -z "$1" ]; then
	 echo "Please pass in a depot path ..."
	 exit 1
fi

# derive a suffix for the output file names from the depot path (//depot/foo/... -> depot-foo)
filename_extension=$(echo "$1" | sed 's/^\/\/\(.*\)\/\.\.\.$/\1/' | sed 's/\//-/g')

echo "Finding text (RCS) storage types ..."
# pull out text storage types, these have ,v extension
# all text types, minus Full and Compressed
p4 -F "/perforce/depotdata/depots%lbrFile%,v" fstat -Oc -Of -F "(lbrType=text* | lbrType=symlink* | lbrType=unicode* | lbrType=utf8* | lbrType=utf16*) & ^(lbrType=*C*) & ^(lbrType=*F*)" $1 > rsync_me_$filename_extension.tmp

echo "Finding uncompressed full (+F) storage types ..."
# pull out uncompressed storage types, no extension
p4 -F "/perforce/depotdata/depots%lbrFile%,d/%lbrRev%" fstat -Oc -Of -F "(lbrType=*F*)" $1 >> rsync_me_$filename_extension.tmp

echo "Finding possible purged (+FS) storage types ..."
# pull out possibly purged types (+FS), which only store #head. Note the filename difference
p4 -F "/perforce/depotdata/depots%lbrFile%,t" fstat -Oc -Of -F "(headType=*FS*)" $1 >> rsync_me_$filename_extension.tmp

echo "Finding compressed full (-F) storage types ..."
# pull out compressed storage types, as above but with .gz extension
p4 -F "/perforce/depotdata/depots%lbrFile%,d/%lbrRev%.gz" fstat -Oc -Of -F "(lbrType=binary* | lbrType=apple* | lbrType=resource* | lbrType=*C*) & ^(lbrType=*F*)" $1 >> rsync_me_$filename_extension.tmp

echo "Final sort ..."
sort --parallel=12 -u -o rsync_me_$filename_extension rsync_me_$filename_extension.tmp
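
Once the sorted list exists, feeding it to rsync is the easy part; roughly this, where 'newserver' is a placeholder and the list name is what the script would derive from //depot/old_project/...:

	# --files-from strips the leading slash from each entry, so with / as the
	# source the original /perforce/depotdata/... paths are recreated remotely
	rsync -av --files-from=rsync_me_depot-old_project / newserver:/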

-Matt Janulewicz
Staff SCM Engineer, Perforce Administrator
Dolby Laboratories, Inc.
1275 Market St.
San Francisco, CA 94103, USA
majanu@dolby.com



