What to do about portability with EBCDIC? #75

@khwilliamson

With #74 applied, IO::Compress passes all its tests on an EBCDIC platform except for two recently added ones in 006zip.t, in which a zip file created on an ASCII machine is read on the EBCDIC one. The problem is that the filename field "hello.txt" in the header is in a foreign encoding (ASCII) that the test doesn't expect.
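To make the mismatch concrete, here is a rough sketch in Python (not the module's Perl, and not its test code). It assumes `cp037` as a stand-in for the EBCDIC platform's native code page, and it uses Python's `zipfile` only as a convenient way to produce an archive the way an ASCII machine would. The helper name `first_member_name_bytes` is made up for this illustration.

```python
import io
import struct
import zipfile

# Fixed 30-byte part of a zip local file header (little-endian):
# signature, version, flags, method, time, date, crc32,
# compressed size, uncompressed size, name length, extra length.
LOCAL_HEADER = struct.Struct("<4sHHHHHLLLHH")


def first_member_name_bytes(archive: bytes) -> bytes:
    """Return the raw filename bytes stored in the first local file header."""
    fields = LOCAL_HEADER.unpack_from(archive, 0)
    signature, name_length = fields[0], fields[9]
    if signature != b"PK\x03\x04":
        raise ValueError("archive does not start with a local file header")
    start = LOCAL_HEADER.size
    return archive[start:start + name_length]


# Build a small archive in memory, as an ASCII machine would.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("hello.txt", "hello world\n")

stored = first_member_name_bytes(buffer.getvalue())

# Assumption: cp037 stands in for the EBCDIC box's native code page.
native_ebcdic = "hello.txt".encode("cp037")

print(stored)                   # the name bytes as written on the ASCII side
print(stored == native_ebcdic)  # the comparison a native-encoded test effectively makes
```

The stored bytes are plain ASCII, so any comparison against a natively encoded "hello.txt" on the EBCDIC side cannot match unless one side is translated first.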

The question is what to do about this. Originally I thought I would just skip the failing tests on an EBCDIC box, but after reading the code and documentation a bit, I think the goal should be to make compressed files completely portable. I'm willing to write a patch, but I need some guidance. I saw that the code does attempt to look at the current locale, but I didn't see what it does with that information. There appears to be a bit that indicates whether the text is UTF-8. A problem is that the UTF-8 on EBCDIC boxes is not the same as the UTF-8 on ASCII ones.
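For reference, the bit in question is presumably bit 11 (0x0800) of the general purpose bit flag in each zip header, which the zip format defines as meaning that the filename and comment fields are UTF-8. The sketch below, again in Python purely for illustration, reads that flag from the first local file header. The helper names are hypothetical, and the behavior shown is that of Python's `zipfile` writer (which sets the bit only for names that are not pure ASCII), not necessarily that of IO::Compress.

```python
import io
import struct
import zipfile

UTF8_NAME_FLAG = 0x0800  # bit 11 of the general purpose bit flag


def local_header_flags(archive: bytes) -> int:
    """Read the general purpose bit flag of the first local file header."""
    signature, _version, flags = struct.unpack_from("<4sHH", archive, 0)
    if signature != b"PK\x03\x04":
        raise ValueError("archive does not start with a local file header")
    return flags


def make_archive(name: str) -> bytes:
    """Build a one-member archive in memory (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(name, "data")
    return buffer.getvalue()


for name in ("hello.txt", "h\u00e9llo.txt"):
    flags = local_header_flags(make_archive(name))
    # Report whether the writer marked this member's name as UTF-8.
    print(name, bool(flags & UTF8_NAME_FLAG))
```

Note that the flag only says "UTF-8 or not"; it says nothing about which non-UTF-8 encoding was used, which is why a natively encoded EBCDIC name would be ambiguous to a reader on another platform.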

If EBCDIC boxes output all their text as ASCII and ASCII-based UTF-8, then we would have complete portability.

What are all the fields that would be affected?

This would be done as the last step in writing and the first step in reading a file, so the rest of the module would keep working in native values and would not have to change. It might be that Encode could hide most of it from this module. Behavior on an ASCII box would be effectively unchanged.
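The shape of that boundary layer might look something like the following. This is a minimal sketch in Python, not a proposed patch: `NATIVE_CODEC`, `native_to_wire`, and `wire_to_native` are hypothetical names, and `cp037` is again only an assumed stand-in for the platform's native code page. In the module itself the equivalent translation would presumably go through Encode or Perl's own native/Unicode conversion facilities.

```python
from typing import Tuple

# Assumption: cp037 stands in for the platform's native code page.
# On an ASCII platform this would be "ascii" and both functions
# would pass ASCII text through unchanged.
NATIVE_CODEC = "cp037"


def native_to_wire(native: bytes) -> Tuple[bytes, bool]:
    """Last step before writing: native bytes -> (portable bytes, is_utf8)."""
    text = native.decode(NATIVE_CODEC)
    try:
        # Pure ASCII text needs no UTF-8 marker.
        return text.encode("ascii"), False
    except UnicodeEncodeError:
        # Anything else is written as ASCII-based UTF-8 and flagged as such.
        return text.encode("utf-8"), True


def wire_to_native(wire: bytes, is_utf8: bool) -> bytes:
    """First step after reading: portable bytes -> native bytes."""
    text = wire.decode("utf-8" if is_utf8 else "ascii")
    return text.encode(NATIVE_CODEC)


# Round trip for a plain ASCII name and for one that needs UTF-8.
for name in ("hello.txt", "h\u00e9llo.txt"):
    native = name.encode(NATIVE_CODEC)
    wire, is_utf8 = native_to_wire(native)
    assert wire_to_native(wire, is_utf8) == native
    print(name, wire, is_utf8)
```

Everything between these two calls sees only native bytes, which is the property that would let the rest of the module stay as it is.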

Labels: enhancement