

P4 integration of deletions

delete integrate

6 replies to this topic

#1 mob

    Newbie

  • Members
  • 7 posts

Posted 20 February 2019 - 02:49 PM

Hello,

I have an issue with deletions and integrations. It looks to me as if Perforce does not propagate deletions any further. The first integration carries the 'delete' action, but further integrations do not, because neither the source nor the target has such a file anymore. Worse, Perforce seems to forget that the file ever existed.

Example:

You have a common base TRUNK. You integrate it into PROJECT_A and PROJECT_B. You delete the file on PROJECT_A and integrate back to TRUNK. Then you integrate TRUNK to PROJECT_C. This integration no longer has the file, and no delete record for it either. If you integrate an old PROJECT_A, created before the deletion, to PROJECT_C, it recreates the files.

Actually, p4 integrate -Di looks like it is exactly for that:

Quote

The -Di flag modifies the way deleted revisions are treated.  If the
source file has been deleted and re-added, revisions that precede
the deletion will be considered to be part of the same source file.
By default, re-added files are considered to be unrelated to the
files of the same name that preceded them.

But P4 still sees no relation. P4V even labels the option "Try to integrate changes when source deleted and re-added", which is a bit unclear. For every integration, P4 would have to find the common base, see that the deletion on one path is the youngest change to that file, and keep the file deleted. Otherwise it is impossible to track deletions across integrations over several lines.

Is this the right place for that issue?

#2 Sambwise

    Advanced Member

  • Members
  • 957 posts

Posted 20 February 2019 - 06:33 PM

The -Di flag won't cause integrate to propagate a delete that isn't there.

In this scenario:

Quote

You delete the file on PROJECT_A and integrate back to TRUNK. Then you integrate TRUNK to PROJECT_C. This integration no longer has the file, and no delete record for it either. If you integrate an old PROJECT_A, created before the deletion, to PROJECT_C, it recreates the files.

the file won't be recreated, because you deleted it on PROJECT_A, and the result of a delete integrating into a nonexistent file is no integration.

What I think you're talking about is having deleted the file in, say, PROJECT_B, and then going back to PROJECT_A where it's not deleted (and integrating directly to PROJECT_C where it never existed) -- THAT will result in the file being (re)created in PROJECT_C, because now you're integrating a file that exists into a path where there is no file at all (since the history of the file "ends" in TRUNK before the creation of PROJECT_C, there's nothing in PROJECT_C to compare PROJECT_A's file to).

One way to fix that is to create the PROJECT_C branch with the "populate -f" command.  That'll force all of the deleted revisions in TRUNK to be copied explicitly into PROJECT_C, and when you integrate from PROJECT_A the existing file in PROJECT_A will be seen as older than the delete in PROJECT_C.  The downside of "populate -f" is that your branches will get increasingly bloated with deleted files over time.
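As a sketch of that first workaround, branch creation with "populate -f" might look like the following. The depot paths and description are hypothetical; adjust them to your own layout.

```shell
# Create PROJECT_C from TRUNK, forcing deleted revisions in TRUNK
# to be branched as deletes into PROJECT_C as well, so they can
# later serve as reference points for integrations from siblings.
p4 populate -f -d "Create PROJECT_C branch, including deleted files" \
    //depot/TRUNK/... //depot/PROJECT_C/...
```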

The *recommended* fix is to not integrate directly from PROJECT_A to PROJECT_C.  Instead, integrate from PROJECT_A to TRUNK, and then from TRUNK to PROJECT_C.  (This is the "mainline model" where the idea is that every change ends up in the mainline at some point.)

Alternatively, if you need to integrate a specific change from PROJECT_A to PROJECT_C (but not by way of TRUNK), do a "cherry pick" (i.e. specify a start and end point for the source of the integration) rather than a wholesale merge.
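A cherry-pick along those lines could look like this sketch, where the changelist numbers and depot paths are made up for illustration:

```shell
# Merge only the revisions submitted in changes 1000 through 1005
# of PROJECT_A into PROJECT_C, instead of all unintegrated history.
p4 integrate //depot/PROJECT_A/...@1000,@1005 //depot/PROJECT_C/...
p4 resolve -am
p4 submit -d "Cherry-pick changes 1000-1005 from PROJECT_A"
```

Because the source is limited to an explicit revision range, the integration does not drag along the early revisions that predate the deletion.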

#3 mob

    Newbie

  • Members
  • 7 posts

Posted 21 February 2019 - 11:48 AM

Thank you very much. Indeed, my description was wrong; the corrected sentence is:

Quote

If you integrate old PROJECT_B, that has been created before the deletion, to PROJECT_C...

I understand now that either every branch must be created with populate -f, or I need to integrate back to a line where the deletion is still "known". Without knowing that line, the best approach is to always integrate back to the origin line the branch was created from, and from there on to the next lines.
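Integrating back through the origin line, rather than directly between siblings, could be sketched like this (paths are hypothetical):

```shell
# Mainline model: changes flow through TRUNK instead of directly
# between sibling branches, so deletions recorded in TRUNK are seen.
p4 integrate //depot/PROJECT_A/... //depot/TRUNK/...
p4 resolve -am
p4 submit -d "Merge PROJECT_A back to TRUNK"

p4 integrate //depot/TRUNK/... //depot/PROJECT_C/...
p4 resolve -am
p4 submit -d "Merge TRUNK to PROJECT_C"
```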

Questions about populate -f: I have never tried populate; is it like integrate + submit? The -f flag doesn't exist for integrate, only for populate. There is another -f for integrate that means something else:

Quote

The -f flag forces integrate to ignore integration history and treat
all source revisions as unintegrated. It is meant to be used with
revRange to force reintegration of specific, previously integrated
revisions.

The -f flag for populate means:

Quote

The -f flag forces deleted files to be branched into the target.
By default, deleted files are treated as nonexistent and simply
skipped.

What is the drawback of always using populate -f? I see that more and more deleted files will be kept around through every cleanup and re-sorting. But is that an issue? We hide those in P4V. I don't know much about the P4 internals, but why does every line have to store the deleted files? For integrations, the resolve could handle it: it must find the common base anyway, and if one path from that base has a deletion along the way and the other does not, then the deletion is the result of the resolve. That's my view on it. Wrong?

#4 Sambwise

    Advanced Member

  • Members
  • 957 posts

Posted 21 February 2019 - 07:16 PM

mob, on 21 February 2019 - 11:48 AM, said:

What is the drawback of using populate -f always? I see, there will be more and more deleted files to be kept, on every cleanup and re-sorting. But is that an issue? We hide those in P4V.

If you always hide them in P4V it's probably not much of an issue.  In theory there'll be some performance degradation from all the extra db records (which will continue duplicating on every branch), but that's a function of how often you delete/refactor files; if it's relatively rare (so that deleted files relative to active files aren't really significant) or if your database is relatively small to begin with, you likely won't notice the difference.

Quote

For integrations, the resolve could do that. It must find the common base anyway, and if one branch to that base has a deletion on the way and the other not, then the deletion is the result of the resolve, that's my view on that. Wrong?

"Common base" between what?  :)  Essentially the issue is that the target file doesn't exist, so there is no common base calculation, which generally only kicks in if you're integrating a file into an existing file.  In theory, every time you create a branch there could be an exhaustive search of all branches of the source file to see if there was ever a deletion in a sibling branch, and then some sort of heuristic could be applied by looking at other files in the branch to "guess" whether the file should be propagated, but since each file has its own history it would indeed have to be a guess (and it would also increase the performance cost of branching by orders of magnitude).

There is actually one special case where intermediate files are examined during a branch into a nonexistent path, which is when there's a renamed source.  Check out page 14 here: https://swarm.worksh...ase Picking.pdf

In this example B2 is a candidate for integration into A2, which would normally be a "branch" action since A2 does not exist (and in this rename case we'd end up "recreating" a file that was moved).  To avoid that, we look within the B* namespace (it's important that we only look within that namespace since otherwise we're in that "orders of magnitude" performance problem scenario) and look specifically for "moved" records, and if we find one then we do the common base analysis on the move ancestor to make everything line up.  :)

For the delete case you describe, though, those signposts don't exist.  Using "populate -f" works around the problem by adding one in the form of the "branched delete", which then serves as a starting point for the common base calculation to figure out whether it's actually appropriate to (re)branch the source into the target.

#5 mob

    Newbie

  • Members
  • 7 posts

Posted 22 February 2019 - 08:05 AM

Sambwise, on 21 February 2019 - 07:16 PM, said:

In theory, every time you create a branch there could be an exhaustive search of all branches of the source file to see if there was ever a deletion in a sibling branch, and then some sort of heuristic could be applied by looking at other files in the branch to "guess" whether the file should be propagated, but since each file has its own history it would indeed have to be a guess (and it would also increase the performance cost of branching by orders of magnitude).

The exhaustive search would not be required at branch creation, but at integrate/resolve time. And even then it would not need to search all branches, only back to the common base. You say there is no common base because the file does not exist anymore; I would expect the whole branch integration history to be searched to find the deletion. But as P4 seems to work only on file histories, there is no history once the file is gone. The "branch" has no history of its own, only the single files do.

Thanks for the workshop PDF, I'm going to study that.

I am thinking about making populate -f the default for future branches. As the file count of branches increases, the deletion count will increase too, at a lower rate. But large moves and re-organizations could put a lot of deletions into the history. I guess a "p4 move" is still an integrate + delete as in the early days, true?

By the way, we had issues with move and integrations, which I'm filing as a new thread topic.

#6 Sambwise

    Advanced Member

  • Members
  • 957 posts

Posted 22 February 2019 - 11:03 AM

mob, on 22 February 2019 - 08:05 AM, said:

The exhaustive search would not be required at branch creation but at integrate/resolve. And even then it would not require to search all branches, but only up to the common base. Here you say there is no common base because the file does not exist anymore. I would expect to search the whole branch integration history to find deletion.

Branch creation usually involves a "p4 integrate", in which case it's simply an integrate with a non-existing target.  As you say, you would need to search the entire branch integration history to find the deletion -- that means that every time you integrate into a file that does not exist (which is the case every time you create a branch -- for EVERY file in the new branch), the entire branch history needs to be searched to figure out if there's a deletion anywhere in there.  That's what I mean by things getting orders-of-magnitude slower; instead of just scanning the history of the source, now you effectively need to scan the history of (probably) the entire depot each time you use "integrate" to create a branch.  (This isn't even getting into the fact that for performance reasons integrate bases most of its decisions on db.integed, which doesn't necessarily track the history of every delete that happens in db.rev...)

And even then you aren't out of the woods, because hooray, you found a deletion -- but how do you know whether it's "upstream" or "downstream" of the current integration target?  You don't, because you don't have context on the other files in the branch -- so now you need to do some kind of lateral search to see if other files have history that can grant you that context.  And we know that all of the files in a branch mapping have their own history, so if you pick one at random, it might not actually give you the right answer... you can hopefully see why this is an edge case that "p4 integ" doesn't even attempt to handle.  :)

Quote

But as P4 seems to only work on file histories there is no history without the file anymore. The "branch" has no own history, only the single files.

This is 100% correct and is why reasoning about the branch as a whole will sometimes be misleading with respect to how individual files get integrated.  Everything is per-file -- this is sometimes a benefit and sometimes creates some weird edge cases.

mob, on 22 February 2019 - 08:05 AM, said:

I guess a "p4 move" is still an integrate+delete as in the early days, true?

No -- I think that misconception might be why you're having the problem you described in the other thread.  The "p4 move" command (which creates distinct integration records and enforces constraints around atomicity of move operations) was introduced in 2009.1 or thereabouts, and integrate has had special handling for moves (also geared at preserving those atomic properties, and minimizing the need to manually adjust branch specs) since 2013.2.  If you use the pre-2009.1 equivalent of just integ+delete, you don't get any of those useful semantics and have to do things like manually adjust branch specs every time you move a file.
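A modern rename with "p4 move" looks like the following sketch; the file names are hypothetical. The file must be opened for edit before the move, and the result is recorded as an atomic moved-from/moved-to pair rather than a plain branch + delete.

```shell
# Open the source for edit, then record the rename as a move pair.
p4 edit //depot/PROJECT_A/src/old_name.c
p4 move //depot/PROJECT_A/src/old_name.c //depot/PROJECT_A/src/new_name.c
p4 submit -d "Rename old_name.c to new_name.c"
```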

#7 Jeremy718

    Newbie

  • Members
  • 1 post

Posted 10 May 2019 - 04:56 AM

The easiest solution is going to be to submit the first batch of integrates before proceeding. In many cases you can stack up multiple integrates into a single revision, but once the integrate operations require opening the file for different actions (branch vs integrate vs delete vs move etc) you'll get the "already opened for (action)" error and it won't be possible to continue. If you submit the already opened file, a subsequent integrate will be able to create a new revision with the appropriate action.
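Splitting the work into two submits might look like this sketch, with hypothetical paths:

```shell
# First batch: open files for one action (e.g. branch) and submit it,
# so subsequent integrates can open new revisions with another action.
p4 integrate //depot/TRUNK/... //depot/PROJECT_C/...
p4 resolve -am
p4 submit -d "First batch of integrates from TRUNK"

# Second batch: now the files are no longer open, so this integrate
# can create new revisions with a different action if needed.
p4 integrate //depot/PROJECT_A/... //depot/PROJECT_C/...
p4 resolve -am
p4 submit -d "Second batch of integrates from PROJECT_A"
```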




