Extended Attribute Support in jWPS

What are EAs (extended attributes)?

Extended attributes, also known as EAs, are extra information that may be attached to any file. In short, they are OS/2's built-in metadata system. By metadata we mean data about data. Examples are very common: every file carries its creation date, its size and other attributes, such as whether the file is read-only. Managing metadata is now at the forefront of computer research and applications, since much data is meaningless without sufficient metadata to give it context. As an example, a file system is really just a bad database. How much time do you spend snooping around for a file? With better metadata, you would spend far less time looking for things.

The authors of OS/2 realized this and built a metadata facility into the native file system, HPFS, along with extensions for FAT and, later, JFS. This was a courageous and bold stroke at the time, and other operating systems are now starting to incorporate the same idea. However, it is not a bed of roses.

One of the most glaring issues is that OS/2, an object-oriented operating system, is left with an extremely counter-intuitive way to record metadata. The C API is optimized for disk access, not for ease of use. That EAs are metadata is often lost on people who never get past the formidable learning curve. This is not helped by documentation that is poor and often simply wrong or misleading, nor by the fact that even many developer journals have published articles on EAs that are either incorrect or so simplified that they cannot serve as a basis for a more involved project. The reality of working natively with EAs is ghastly pointer arithmetic and, often, incomprehensible error messages or simply a crash. The API is quite unforgiving and, on top of this, poorly designed, as is evidenced by calls with required arguments whose values can't be changed (e.g. an argument that is identically 1). This shows that someone changed their mind but could not fix the resulting syntax, in all likelihood because other people had already used it and a change would have required a large rewrite of existing code. This is the point at which most people back off from using EAs and go to Plan B.

Access to EAs in OS/2 and eCS

Since EAs are properly considered part of the file system, they must be accessed with very low-level API calls. For the sake of speedy access, the data structures used are very similar to what is stored at a very low level on the hardware. On top of this, the calls themselves have a good deal more in common with assembly language than with anything else. This has earned EAs a reputation for being almost impossible to use by any but the most ardent enthusiast. Most applications simply avoid them.
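To give a flavor of the byte-level work involved, here is a minimal sketch in Java. It is neither the jWPS code nor the C API; it only packs a string into the layout that OS/2 documents for an EAT_ASCII value: a two-byte type word (0xFFFD), a two-byte length, then the text, all little-endian. The class and method names are made up for illustration, and code page handling is simplified to US-ASCII.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class EaAsciiSketch {
    // Type word that marks an ASCII-valued EA.
    static final int EAT_ASCII = 0xFFFD;

    // Pack text as: type word, length word, raw bytes (little-endian).
    static byte[] encodeAscii(String text) {
        byte[] data = text.getBytes(StandardCharsets.US_ASCII);
        if (data.length > 0xFFFF) {
            throw new IllegalArgumentException("value too long for a 16-bit length");
        }
        ByteBuffer buf = ByteBuffer.allocate(4 + data.length)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort((short) EAT_ASCII);
        buf.putShort((short) data.length);
        buf.put(data);
        return buf.array();
    }

    // Reverse of encodeAscii; rejects any other type word.
    static String decodeAscii(byte[] raw) {
        ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        int type = buf.getShort() & 0xFFFF;
        if (type != EAT_ASCII) {
            throw new IllegalArgumentException("not an EAT_ASCII value");
        }
        int length = buf.getShort() & 0xFFFF;
        byte[] data = new byte[length];
        buf.get(data);
        return new String(data, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] raw = encodeAscii("Hi");
        StringBuilder hex = new StringBuilder();
        for (byte b : raw) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        System.out.println(hex.toString().trim()); // FD FF 02 00 48 69
        System.out.println(decodeAscii(raw));      // Hi
    }
}
```

Even this simplest case needs explicit byte order, length bookkeeping and type checks. The native calls add name tables and offsets to the next entry on top of this.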

A well-designed metadata system should provide some standard way to interact with its entries. OS/2 most emphatically does not, and programmers are free to write almost anything they want with no oversight. While this has led to some elegant tools that find and then fix or delete corrupted EAs, it has also stymied the use of EAs.

The one way to access them that has been used up to this point is the well-known EAUtil, which is supplied as a standard feature. It lets the user pull all the EAs off a file and place them into another file, or graft the EAs from such a file back onto a file. The idea is that if a user needs to send a file with lots of EAs to another user over some medium where EAs are not supported (such as an FTP transfer), then the EAs can be split off, sent separately and rejoined. This works well.

Another way is to use the SysPutEA and SysGetEA functions in the REXXUtil library. This works acceptably, but is quite limited. Only two basic types of EAs are supported, text and binary. The latter is useless unless the programmer wants to do very grisly byte-level manipulations, for which REXX is hardly suited.

Access to EAs in jWPS

All this said, I wanted a more elegant system for working with EAs. I felt that it should fulfill the following conditions:

In order to carry this out, I wrote a series of libraries that run under Java. This was done for a number of reasons, not the least of which is that I am a Java programmer by profession. Java is also an extremely clean language to program in and has become probably the premier programming language for new technologies. I could have tried to write this in REXX (some people have asked me why I didn't), and the answer is scalability. REXX does not have good support for large applications: writing a database or word processor in REXX, while possible, would have to count as torture. Java can handle this sort of application and handle it very well.

XML support

XML, which stands for Extensible Markup Language, is a good complement to Java. The real idea behind Java was to make a platform-independent language for programming. In the same way, XML provides application-independent storage for data. XML is not hard, and you can be up and running with it in a day or two. In order to make a neutral interface for EAs, I decided that the best import/export mechanism would use XML as a basis. Serialized EAs (that's just the fancy word for saying they have been converted to XML and saved someplace) have the charming feature that they are partially human-readable. In jWPS, binary data is stored in base64 encoding, which is widely used (your email probably uses it on every attachment you get), so tools to decode the binary information are quite plentiful. Text information can be edited directly. As a simple side effect of this, you can sit down with your favorite text editor, write your EAs and stick them on any file you wish. No real programming needs to be done. For reference, here is a complete example for setting the subject and comments EAs on a file:

  <?xml version="1.0" encoding="UTF-8"?>
  <eaList>
    <ea name=".SUBJECT" type="EAT_ASCII">
      <value>Hi! I'm a new subject line!!</value>
    </ea>
    <ea name=".COMMENTS" type="EAT_MVMT">
      <ea type="EAT_ASCII">
        <value>Here is the first comment line...</value>
      </ea>
      <ea type="EAT_ASCII">
        <value>and another comment line.</value>
      </ea>
    </ea>
  </eaList>

We'll talk at length about this later, but this is such a nice feature that you should be aware of it.
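If you would rather generate such a file than type it, the following is a minimal sketch in plain Java (assuming Java 8 or later for java.util.Base64). It does not use jWPS at all. The class and method names are hypothetical; only the element and attribute names are copied from the example above. It also shows the base64 round trip for binary data, although how jWPS lays out a binary EA inside the XML is not shown here.

```java
import java.util.Base64;

// Hypothetical helper, for illustration only; not part of jWPS.
final class EaXmlSketch {
    // Escape the characters that would otherwise break element text.
    static String escapeXml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Build a one-entry EA list that sets the .SUBJECT EA.
    static String subjectEaXml(String subject) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
             + "<eaList>\n"
             + "  <ea name=\".SUBJECT\" type=\"EAT_ASCII\">\n"
             + "    <value>" + escapeXml(subject) + "</value>\n"
             + "  </ea>\n"
             + "</eaList>\n";
    }

    // Binary payloads travel as base64 text.
    static String encodeBinary(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    static byte[] decodeBinary(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        System.out.print(subjectEaXml("Hi! I'm a new subject line!!"));
        System.out.println(encodeBinary(new byte[] {1, 2, 3})); // AQID
    }
}
```

Redirect the output to a file and you have the same kind of document you could have written by hand in a text editor.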

Examples and such

Assuming you have the libraries set up on your system, you have two options: working directly with the EA calls, or using them through JWPFilesystem calls. I'll start with the former.

The chief class you will use is called FileEAs. Here is a typical example. It takes the path of a file and the path of an icon from the command line, then changes the file's subject line and icon.

  import net.jqhome.jwps.ea.*; // base support
  import net.jqhome.jwps.ea.standard.*; // standard EAs support
  
  public class MyEATest{
    public static void main(String[] args){
      try{
        // This assumes two command-line arguments. The first is the path of the file,
        // the second is the path to an icon.
        FileEAs fEAs = new FileEAs(args[0]); // in a real app we'd check the arguments!
        
        // make a new subject EA.
        
        SubjectEA sEA = new SubjectEA("Mairzey doats and does eat stoats and...");
        fEAs.setEA(sEA);
        
        // now for a new icon.
        IconEA iEA = new IconEA(args[1]);
        fEAs.setEA(iEA);
        
        // that's all folks.
      }catch(Exception x){
         x.printStackTrace();
      }
    
    } //end main method
  
  } //end class
  

Compile and run this, then look at the file's notebook. You'll see the new subject line and that the icon has been changed.

Conclusion

This little introduction to why jWPS supports EAs and how they are used has, I hope, eased you into the idea that EAs are now nothing to fear. If you have used EAs before, you should see this as a truly vast improvement on the native way of working with them. If you are new, I hope you will now consider using EAs when writing applications.