Perforce filetype limitations

UTF8 UTF-8 text filetype utf8 utf-8 filetypes unicode

4 replies to this topic

#1 Justin


Posted 05 June 2015 - 09:47 AM

Hi all!

I'd like to know more about the impact of Perforce's filetypes out of technical interest, rather than needing to solve a particular problem. (But it may save me headaches later when diagnosing issues!) Not sure how to express it other than questions about hypothetical bad practices, so here goes:


1. How should I store UTF-8 text files in Perforce?

My understanding is that Perforce's UTF-8 provision is the 'unicode' filetype. This filetype isn't enabled on my server and I'll assume it's not going to change.

If I choose 'binary', will that be fine until a client with different OS line-ending expectations syncs the file?


2. Why do (some?) UTF-8 text files 'survive' being stored under the 'text' filetype?

For example:
  • Download the attached utf8_as_text_filetype_text.txt (Content generated from http://generator.lor....info/_japanese)
  • Submit it with the 'text' filetype
  • Make a local copy
  • Force sync
  • Observe both the local and re-synced files are binary-identical
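The "binary-identical" check in the last step can be sketched as below. This is a minimal illustration, not part of any Perforce tooling: the helper names `sha256_of` and `files_identical` are made up for this example, and the files are read in binary mode so that Python itself does no newline translation.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:  # binary mode: no newline translation
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_identical(first: Path, second: Path) -> bool:
    """True when both files contain exactly the same bytes."""
    return sha256_of(first) == sha256_of(second)
```

For the steps above, one would call `files_identical(Path("local_copy.txt"), Path("resynced.txt"))`, where both file names are placeholders for the saved copy and the force-synced file.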

At what point does a UTF-8 text file get corrupted by being stored as 'text'? Could I have a concrete example?


3. At what point does the 'text' filetype break down for arbitrary binary data?

I've experienced images being corrupted because they somehow ended up with the 'text' filetype, but I am curious exactly why that happened.

I know that Perforce translates the line endings for us, but I thought that was a moot point if the submitting and syncing clients run the same OS.

If it's anything to do with deltas, then is 'text' guaranteed to work at least for just the initial 'add'?


Thanks very much!


Justin

#2 Justin


Posted 05 June 2015 - 09:52 AM

Here's the attached file


#3 P4Shimada


Posted 09 June 2015 - 09:39 PM

Hi Justin,

Thanks for your well-organized write-up. To better answer your questions, please let us know the following:

a] Which version of the Perforce server are you using? (The full 'Server version' string from either "p4 info" command output OR from P4V Help -> System Info)

b] Which operating systems are most of your clients? (MacOSX; Linux; Windows; Windows + OSX)

c] Is your Perforce server running in Unicode mode? (See output from command "p4 -ztag info")
  
... unicode enabled

If the output includes the line above, the server is Unicode-enabled.
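If you would rather script that check than read the output by eye, here is a rough sketch. It only parses text that has already been captured from "p4 -ztag info"; the function name `unicode_enabled` and the sample strings are made up for illustration, and the only line taken from this thread is "... unicode enabled".

```python
def unicode_enabled(ztag_info: str) -> bool:
    """Report whether captured 'p4 -ztag info' text has a '... unicode enabled' line."""
    for line in ztag_info.splitlines():
        parts = line.split()
        # Tagged lines are assumed to look like: "... <field> <value>"
        if parts[:2] == ["...", "unicode"]:
            return parts[2:] == ["enabled"]
    return False  # no 'unicode' field at all: treat as not enabled


# Illustrative samples only, not real server output.
sample_on = "... userName justin\n... unicode enabled\n"
sample_off = "... userName justin\n"

print(unicode_enabled(sample_on))   # True
print(unicode_enabled(sample_off))  # False
```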

#4 Justin


Posted 16 June 2015 - 03:37 PM

Hi Shimada,

Here is my information:

a] Server version: P4D/LINUX26X86_64/2012.2/756218 (2013/12/09)
b] Windows (Only a few clients run under OSX)
c] My server does not output '... unicode enabled' when running that command.

Sorry I didn't get back to you earlier; I forgot to follow this thread to receive notifications.

#5 Mailman Sync


Posted 30 July 2015 - 12:20 PM

Originally posted to the perforce-user mailing list by: Michael Mirman


Justin -

You don't need the UTF8 server mode in order to have UTF8 (or binary) data in files.
You need the UTF8 server mode only if your file *names* are not ASCII.

If your file is binary, line endings would not change during sync regardless of the OS.
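To make the contrast with 'text' concrete, here is a simplified sketch. It assumes a client with CRLF line endings where CRLF becomes LF on submit and every LF becomes CRLF on sync. That is a toy model for illustration, not Perforce's actual implementation, and the function names are hypothetical.

```python
def submit_from_crlf_client(data: bytes) -> bytes:
    """Assumed model of submitting a 'text' file: CRLF is stored as LF."""
    return data.replace(b"\r\n", b"\n")


def sync_to_crlf_client(data: bytes) -> bytes:
    """Assumed model of syncing a 'text' file: every LF is written as CRLF."""
    return data.replace(b"\n", b"\r\n")


def round_trip(data: bytes) -> bytes:
    """Submit and then sync on the same CRLF client."""
    return sync_to_crlf_client(submit_from_crlf_client(data))


# UTF-8 text with consistent CRLF endings: multi-byte characters never
# contain the bytes 0x0D or 0x0A, so only the real line endings are touched.
utf8_text = "こんにちは\r\n世界\r\n".encode("utf-8")

# The 8-byte PNG signature (it contains both a CRLF and a bare LF),
# followed by two arbitrary padding bytes.
png_like = b"\x89PNG\r\n\x1a\n\x00\x01"

print(round_trip(utf8_text) == utf8_text)  # True
print(round_trip(png_like) == png_like)    # False
```

Under this model the UTF-8 file survives because its line endings are consistent, while the image bytes change even though the submitting and syncing clients use the same OS: the bare LF comes back as CRLF.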

HTH,

--
Michael Mirman
MathWorks, Inc.
508-647-7555






