README.ciftbx

Information for CIFtbx 2.6.4, 14 June 2002

Before using this software, please read the NOTICE and please read the IUCr Policy on the Use of the Crystallographic Information File (CIF).

    \ | /            /##|    @@@@  @   @@@@@   |      |              @@@
     \|/ STAR       /###|   @      @   @     __|__    |             @   @
  ----*----        /####|  @       @   @@@@    |      |___  __  __     @
     /|\          /#####|   @      @   @       |      |   \   \/      @
    / | \         |#####|    @@@@  @   @       \___/  \___/ __/\__  @@@@@
                  |#####|________________________________________________
                 ||#####|                 ___________________            |
        __/|_____||#####|________________|&&&&&&&&&&&&&&&&&&&||          |
<\\\\\\\\_ |_____________________________|&&& 14 Jun 2002 &&&||          |
          \|     ||#####|________________|&&&&&&&&&&&&&&&&&&&||__________|
                  |#####|
                  |#####|                Version 2.6.4 Release
                  |#####|
                 /#######\ 
                |#########|
                    ====
                     ||
           An extended tool box of fortran routines for manipulating CIF data.
                     ||
                     ||  CIFtbx Version 2
                     ||        by
                     ||
                     ||  Sydney R. Hall (syd@crystal.uwa.edu.au)
                     ||  Crystallography Centre
                     ||  University of Western Australia
                     ||  Nedlands 6009, AUSTRALIA
                     ||
                     ||       and
                     ||
                     ||  Herbert J. Bernstein (yaya@bernstein-plus-sons.com)
                     ||  Bernstein + Sons
                     ||  5 Brewster Lane
                     ||  Bellport, NY 11713, U.S.A.
                     ||
 The latest program source and information is available from:
                     ||
 Em: syd@crystal.uwa.edu.au       ,-_|\      Sydney R. Hall
 sendcif@crystal.uwa.edu.au      /     \     Crystallography Centre
 Fx: +61 9 380 1118  ||      --> *_,-._/     University of Western Australia
 Ph: +61 9 380 2725  ||               v      Nedlands 6009, AUSTRALIA
                     ||
                     ||
_____________________||_____________________________________________________

This is a version of CIFtbx which has been extended to work with DDL 2 and mmCIF as well as with DDL 1.4 and core CIF dictionaries. CIFtbx version 1 was written by Sydney R. Hall (see Hall, S. R., "CIF Applications IV. CIFtbx: a Tool Box for Manipulating CIFs", J. Appl. Cryst. (1993). 26, 482-494). The revisions for version 2 were done by Herbert J. Bernstein and Sydney R. Hall (see Hall, S. R. and Bernstein, H. J., "CIFtbx 2: Extended Tool Box for Manipulating CIFs", J. Appl. Cryst. (1996). 29, 598-603).

This file contains the complete set of source decks and test data needed to implement and test the CIFtbx Tool Box routines, Version 2.6.4.

This is a release of version 2 of the CIF Tool Box with the necessary revisions to use either DDL 1.4 or DDL 2.1.0 Dictionary style files. It has been tried against cif_core.dic (version 2.0.1) and cif_mm.dic (version 1.0.0).

    *** ========================================================== ***
    *** ========================================================== ***
    *** ==>>> The release kit has been reorganized.  The kits<<<== ***
    *** ==>>> for CYCLOPS2 and cif2cif are now distributed   <<<== ***
    *** ==>>> separately from ciftbx.  As of version 2.5.4   <<<== ***
    *** ==>>> compressed versions of dictionaries are used   <<<== ***
    *** ========================================================== ***
    *** ========================================================== ***

We have tested this code. We believe this version is reasonably stable and ready for use. However, as with any major revision to a subroutine library, there will be some problems and bugs we have not found. Please report any difficulties with this version to either of the two authors.

See the section CHANGES, below, for a summary of the differences between this version and earlier CIFtbx 2 versions as well as between CIFtbx 2 and the CIFtbx 1 version of 6 July 1995.

1. INSTALLATION

Here is the recommended procedure for implementing and testing this version of ciftbx.

1.0. Before you try to install this version of CIFtbx

   
    *** ========================================================== ***
    *** ========================================================== ***
    *** ==>>> To test CIFtbx, you must have the dictionaries <<<== ***
    *** ==>>> cif_core.dic.Z and cif_mm.dic.Z in compressed  <<<== ***
    *** ==>>> form installed in a directory named            <<<== ***
    *** ==>>> dictionaries.                                  <<<== ***
    *** ========================================================== ***
    *** ========================================================== ***

The directory structure within which you will work is

 
                      top level directory
                      -------------------
                               |
                               |
                       ----------------
                       |              |
                  dictionaries   ciftbx.src
                  ------------   ----------
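
As a minimal sketch, the layout above can be checked from the shell once the kit has been unpacked. The helper name check_layout is hypothetical and is not part of the kit; the directory and file names are the ones given in this section.

```shell
# check_layout is a hypothetical helper, not part of the CIFtbx kit.
# $1: the top level directory (defaults to the current directory)
check_layout() {
    top=${1:-.}
    rc=0
    # the kit unpacks into ciftbx.src
    if [ ! -d "$top/ciftbx.src" ]; then
        echo "missing directory: $top/ciftbx.src" >&2
        rc=1
    fi
    # the tests expect both dictionaries in compressed (.Z) form
    for dic in cif_core.dic.Z cif_mm.dic.Z; do
        if [ ! -f "$top/dictionaries/$dic" ]; then
            echo "missing dictionary: $top/dictionaries/$dic" >&2
            rc=1
        fi
    done
    return $rc
}
```

Run it as "check_layout ." from the top level directory; it returns a non-zero status and names each missing piece.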

You may have acquired this package in one of several forms. The most likely are as a "C-shell Archive," a "Shell Archive", or as separate files. The idea is to get to separate files, all in the same directory, but let's start with the possibility that you got the package as one big file, i.e. in one of the archive file formats. Place the archive in the top level directory. If you picked it up in compressed format, be certain to uncompress it.

      *** ========================================================== ***
      *** ========================================================== ***
      *** ==>>> The files in this kit will unpack into a       <<<== ***
      *** ==>>> directory named ciftbx.src.  It is a good idea <<<== ***
      *** ==>>> to save the current contents of ciftbx.src     <<<== ***
      *** ==>>> and then to make the directory empty           <<<== ***
      *** ========================================================== ***
      *** ========================================================== ***

If you are on a machine which does not provide a unix-like shell, you will need to take apart the archive by hand using a text editor. We'll get to that in a moment.

1.1. ON A UNIX MACHINE

If you have the shell archive on a unix machine, follow the instructions at the front of the archive, i.e. save the uncompressed archive file as "file", then, if the archive is a "Shell Archive" execute "sh file". If the archive is a "C-Shell Archive" execute "csh file".

1.2. IF YOU DON'T HAVE UNIX

If sh or csh are not available, then it is best to start with the "C-Shell Archive" and do the steps that follow. If you must use the "Shell Archive" you should be aware that the lines you want to extract have been prefixed with "X", while most of the lines you want to discard have not. For a "C-Shell Archive" such prefixes are rare and the file is easier to read. Assume you have a "C-Shell Archive".

Use your editor to separate the different parts of the file into individual files in your workspace. Each part starts with a lot of unixisms, then several blank lines and then two lines which identify the file and, most importantly, contain the text "CUT_HERE_CUT_HERE_CUT_HERE". You can look at the line before and the line after to see if you are at the head or tail of a file. Use your editor to search for the "CUT_HERE" lines. Each part is carefully labeled and indicates the recommended filename for the separated file. On some machines these filenames may need to be altered to suit the OS or compiler (e.g. on MS-DOS PCs you may want to change 'hash_funcs.f' to something like 'hashfunc.for'). Even though this particular release has no lines for which an "X" prefix is used within a file, be warned that, in general, you should look for lines that start with "X" and remove the "X".
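
If some unix-style text tools are available even though sh or csh cannot run the archive itself, the two searches described above might be sketched as follows. Both helper names are hypothetical. The second helper should be applied only to a part that has already been separated out, since in a "Shell Archive" the lines to discard are mostly unprefixed.

```shell
# Both helpers are hypothetical names used only for this illustration.

# Print the line numbers of the separator lines in an archive ($1).
list_cut_lines() {
    grep -n 'CUT_HERE_CUT_HERE_CUT_HERE' "$1" | cut -d: -f1
}

# Copy a separated part ($1) to stdout, dropping one leading "X" per line.
strip_x_prefix() {
    sed 's/^X//' "$1"
}
```

The line numbers printed by list_cut_lines mark where each part should be cut; the recommended filename still has to be read from the label next to each separator.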

1.3. MANIFEST

The partitions are as follows:

       part    filename                  description

         1     mkdecompln              decompression script used by Makefile
         2     rmdecompln              cleanup script used by Makefile
         3     ciftbx.src/README.ciftbx
                                       this file
         4     ciftbx.src/MANIFEST     a list of files in the kit
         5     ciftbx.src/Makefile     a control file for make to
                                       compile the code
         6     ciftbx.src/ciftbx.f     CIFtbx fortran source
         7     ciftbx.src/ciftbx.sys   CIFtbx common for inclusion into 
                                       ciftbx.f
         8     ciftbx.src/ciftbx.cmn   CIFtbx common for inclusion into 
                                       applications
         9     ciftbx.src/ciftbx.cmf   CIFtbx function definitions 
                                       (included in .cmn)
         10    ciftbx.src/ciftbx.cmv   CIFtbx variable definitions 
                                       (included in .cmn)
         11    ciftbx.src/clearfp.f    dummy version of clearfp_sun.f
         12    ciftbx.src/clearfp_sun.f
                                       SUN routine to clear floating 
                                       point exceptions
         13    ciftbx.src/hash_funcs.f hash-table control routines
                                       used by CIFtbx
         14    ciftbx.src/mtest.prt    print file output from tbx_exm.f 
                                       run
         15    ciftbx.src/mtest.out    CIF output by the tbx_exm.f run
         16    ciftbx.src/tbx_ex.f     example application used to test
                                       ciftbx.f against cif_core.dic 
                                       (get this from iucr)
         17    ciftbx.src/tbx_exm.f    example application used to test
                                       ciftbx.f against cif_mm.dic 
                                       (get this from rutgers)
         18    ciftbx.src/test.cif     example CIF used by tbx_ex.f
         19    ciftbx.src/test.req     example request file used by 
                                       tbx_ex.f
         20    ciftbx.src/test.prt     print file output from 
                                       tbx_ex.f run
         21    ciftbx.src/test.out     CIF output by the tbx_ex.f run
         22    ciftbx.src/testrle.f    test program for RLE routines

2. MAKING LISTINGS

Once you have separated out these files, list 'ciftbx.f', 'Makefile', 'hash_funcs.f', 'tbx_exm.f' and 'tbx_ex.f' in particular (all if possible!) and carefully read the descriptions in the front of these files. Remember that 'tbx_ex.f' and 'tbx_exm.f' are only examples of CIF applications -- they show how some basic CIF operations can be performed, but they are not necessarily sensible or typical of what an actual application would look like!

WARNING -- if you are running on a SUN, or other system which treats floating point underflows as an error, you may wish to list 'clearfp_sun.f'.

3. COMPILING AND EXECUTING

You are now ready to implement the tool box and the test application. Here are the recommended steps for a UNIX system. Vary this according to the requirements of your OS and compiler. You will find it simplest to work if you place the CIFtbx2 files together in a common subdirectory named 'ciftbx.src'. Be very careful if you place them in a directory with other files, since some of the build and test instructions remove or overwrite existing files, especially with extensions such as '.o', '.lst', '.diff' or '.new'.

       *** ========================================================== ***
       *** ========================================================== ***
       *** ==>>> If you are running on a SUN or similar system  <<<== ***
       *** ==>>> which treats floating point underflow as an    <<<== ***
       *** ==>>> error, you may need to use clearfp_sun.f       <<<== ***
       *** ==>>> Please read the following paragraph carefully  <<<== ***
       *** ========================================================== ***
       *** ========================================================== ***

Before building the code, you may wish to replace the file 'clearfp.f' with code appropriate to your system. The routine is called by ciftbx to clear possible floating point underflows which may be generated when the code attempts to find the number of digits of precision supported on your system. No special action is required to clear an underflow on many systems, but on a SUN, for example, execution of the code to test machine precision generates messages about underflow and inexact arithmetic. On a SUN, these messages may be avoided by replacing 'clearfp.f' by 'clearfp_sun.f'. On other machines sensitive to underflow, you may have to use other (usually similar) code.

On a UNIX system, most of what you need to do to build and test CIFtbx2 is laid out in Makefile. *** Be sure to examine and edit this file appropriately before using it.*** But, once properly edited, all you should need to do is 'make clean' to remove old object files, 'make all' to build new versions of 'ciftbx.o', etc., and 'make tests' to test what you have built. Note that the Makefile takes some initial action to force mkdecompln and rmdecompln to be executable. See the section marked postshar.
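
The three make targets named above can be chained so that a failure stops the sequence. This is only a sketch: build_and_test is a hypothetical wrapper, and it assumes the Makefile has already been examined and edited as described.

```shell
# build_and_test is a hypothetical wrapper, not part of the kit.
# $1: directory holding the unpacked kit (defaults to ciftbx.src)
build_and_test() {
    dir=${1:-ciftbx.src}
    if [ ! -f "$dir/Makefile" ]; then
        echo "no Makefile in $dir: unpack the kit first" >&2
        return 1
    fi
    # the three targets named above, stopping at the first failure
    (cd "$dir" && make clean && make all && make tests)
}
```

From the top level directory this would be run as "build_and_test ciftbx.src".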

***Note*** to execute the supplied example applications 'tbx_ex.f' and 'tbx_exm.f' identically to the test outputs supplied, a copy of the CIF Core Dictionary version 2.1beta5 'cif_core.dic' (for 'tbx_ex') and of the macromolecular CIF dictionary version 1.0.00 'cif_mm.dic' (for 'tbx_exm') must be available in your work area. If they are not, the tests will proceed with a warning message but no validation checks will occur. A copy of the dictionary 'cif_core.dic' can be obtained from iucr. A copy of 'cif_mm.dic' can be obtained from the mmCIF link on 'http://ndbserver.rutgers.edu/'. Once you have both dictionaries, compress them, and edit the definitions of MMDICPATH and COREDICPATH in Makefile to agree with the permanent locations of the compressed dictionaries. The Makefile will create local soft links to temporary uncompressed copies of the dictionaries. If you can afford the space for permanent uncompressed copies, change the definition of EXPAND in Makefile to a non-temporary directory, such as '.'.

If you don't wish to use the Makefile or can't, then here are the essential steps to build CIFtbx:

       *** ========================================================== ***
       *** ========================================================== ***
       *** ==>>> If you are familiar with versions of CIFtbx    <<<== ***
       *** ==>>> released prior to version 2.6.4, read sections <<<== ***
       *** ==>>> (aa)-(dd) below very carefully.  CIFtbx has    <<<== ***
       *** ==>>> new internal cache and compression code.       <<<== ***
       *** ========================================================== ***
       *** ========================================================== ***

(aa) compile 'testrle.f' [note that provided the fortran "include" function is available to you, the files 'ciftbx.f', 'ciftbx.sys', 'hash_funcs.f' and 'ciftbx.cmn' will be automatically opened and processed by this single operation]

(bb) link 'testrle.o' as the executable file 'testrle'

(cc) execute 'testrle' so that 'cif_core.dic' is connected to device 5 (stdin). For a unix OS the command will look like this: 'testrle < cif_core.dic'

(dd) check that no output is produced. If any output is produced there is a serious problem with one or more of the routines 'xxrle', 'xxrld', 'xxd2chr' or 'xxc2dig'. These problems must be fixed before ciftbx will work. See the comments in 'xxd2chr' and 'xxc2dig'.
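
Step (dd) can be automated with a small check. In this sketch check_silent is a hypothetical helper, and the compiler name f77 shown in the comments is an assumption; substitute whatever invokes your fortran compiler.

```shell
# check_silent is a hypothetical helper: run program $1 with file $2 on
# stdin and succeed only if nothing at all is written.
check_silent() {
    # count every byte written to stdout or stderr
    nbytes=$("$1" < "$2" 2>&1 | wc -c | tr -d ' ')
    if [ "$nbytes" -ne 0 ]; then
        echo "$1 wrote $nbytes bytes; expected none" >&2
        return 1
    fi
}

# Steps (aa)-(dd), assuming the compiler is invoked as f77:
#   f77 -o testrle testrle.f
#   check_silent ./testrle cif_core.dic
```

A non-zero return from check_silent corresponds to the "serious problem" case described in (dd).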

(a) compile 'tbx_ex.f' [note that provided the fortran "include" function is available to you, the files 'ciftbx.f', 'ciftbx.sys', 'hash_funcs.f' and 'ciftbx.cmn' will be automatically opened and processed by this single operation]

(b) link 'tbx_ex.o' as the executable file 'tbx_ex'

(c) execute 'tbx_ex' so that the list file 'test.lst' is connected to device 6 (stdout). The input CIF 'test.cif' and the output CIF 'test.new' will be automatically opened. For a unix OS the command will look like this: 'tbx_ex > test.lst'

(d) to check that the test has been successful, compare the files that you have generated 'test.lst' with the supplied 'test.prt', and 'test.new' with 'test.out'. They should be identical.

(e) compile 'tbx_exm.f', 'ciftbx.f', and 'hash_funcs.f'

(f) link 'tbx_exm.o', 'ciftbx.o' and 'hash_funcs.o' as the executable file 'tbx_exm'

(g) execute 'tbx_exm' so that the list file 'mtest.lst' is connected to device 6 (stdout). The input CIF 'test.cif' and the output CIF 'mtest.new' will be automatically opened. For a unix OS the command will look like this: 'tbx_exm > mtest.lst'

(h) to check that the test has been successful, compare the files that you have generated 'mtest.lst' with the supplied 'mtest.prt', and 'mtest.new' with 'mtest.out'. They should be identical.

(i) if you have any problems with this process please report them to Herbert J. Bernstein [em: yaya@bernstein-plus-sons.com, ph: +1-631-286-1339, fax: +1-631-286-1999] for the changes from ciftbx 1 to ciftbx 2 in particular, or to Syd Hall [em: syd@crystal.uwa.edu.au fx: +61(9)3801118] for general ciftbx issues. If in doubt as to where your problem lies, send it to whichever one of us is more likely to be convenient to your time-zone, and we will try to sort things out for you.
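
The comparisons in steps (d) and (h) can be done with the standard cmp utility. The helper check_same below is a hypothetical name; the file names in the comments are the ones used in steps (c) through (h).

```shell
# check_same is a hypothetical helper: compare a generated file ($1)
# with the supplied reference file ($2), byte for byte.
check_same() {
    if cmp -s "$1" "$2"; then
        echo "identical: $1 $2"
    else
        echo "DIFFERENT: $1 $2 (try: diff $1 $2)"
        return 1
    fi
}

# Steps (c)-(d) and (g)-(h) would then read:
#   ./tbx_ex  > test.lst
#   check_same test.lst  test.prt && check_same test.new  test.out
#   ./tbx_exm > mtest.lst
#   check_same mtest.lst mtest.prt && check_same mtest.new mtest.out
```

Any "DIFFERENT" line is worth inspecting with diff before reporting a problem, since the differences themselves are the most useful thing to send to the authors.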

4. WHAT NEXT

You are now ready to implement CIFtbx for your software applications. Note that it is more efficient to compile 'ciftbx.f' separately and add 'ciftbx.o' at link time. Note that the line "include 'ciftbx.cmn'" MUST appear at the start of any routine invoking the CIFtbx commands.

5. CHANGES

CHANGES FROM CIFTBX to CIFTBX2

CIFtbx2 was created from CIFtbx in response to the development of the Macromolecular CIF dictionary [Paula Fitzgerald, Helen Berman, Phil Bourne, Brian McMahon, Keith Watenpaugh, John Westbrook, "cifdic.m95", COMCIFS, 1995] and version 2.1.0 of the Dictionary Description Language [John Westbrook, Sydney Hall, "Draft DDL V 2.1.0", COMCIFS, 1995]. Since mmCIF and DDL 2.1.0 were carefully designed to ease migration from the core CIF dictionaries and DDL 1.4, very little in CIFtbx had to be changed, and the user interface remains virtually identical. The major issues that had to be dealt with were the greatly increased size of the dictionary, the rigorous use of categories to structure names, and a new system of aliases to ensure compatibility with older dictionaries. The use of save-frames in dictionaries and the presence of names longer than 32 characters also had to be dealt with.

There were two issues to address in the changes in size of the dictionary and of names: allocating appropriate storage and preserving efficiency of the code execution. New parameters were introduced for the size-dependent changes, so that future changes can go more smoothly. Efficiency is achieved by extensive use of hash-table-controlled lists. There had been a little use of a hash table in prior CIFtbx versions. All major lists of names are now controlled by hash tables. The routines used can be found in 'hash_funcs.f'. Ordinarily the user should not have to deal directly with these routines. The only change that might be made for tuning would be to adjust the parameter "NUMHASH" in 'ciftbx.sys'. This is presently set to 53, which would mean that, for up to 2500 names, typical searches for name matches would look at sub-lists of fewer than 50 names. Greater timing efficiency can be achieved at a slight expense in memory by increasing "NUMHASH" to some larger number. It is recommended that a prime be used for best efficiency in distribution of names among sub-lists.
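
The sub-list estimate above is simple division. The sketch below assumes the names are spread evenly over the sub-lists, which a real hash function only approximates; avg_sublist is a hypothetical helper, and 101 is just an example of a larger prime.

```shell
# avg_sublist is a hypothetical helper: names spread evenly over the
# hash sub-lists, rounded up to a whole number of names per sub-list.
# $1: number of names, $2: number of sub-lists (NUMHASH)
avg_sublist() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

avg_sublist 2500 53     # the setting described above
avg_sublist 2500 101    # a larger prime, chosen only as an example
```

The first call prints 48, consistent with the "fewer than 50 names" figure; the second prints 25, showing how a larger NUMHASH shortens each search.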

In addition to "NUMHASH", the other size-control parameters in 'ciftbx.sys' are:

       NUMCHAR -- the maximum number of characters in a name (default 48)
       NUMDICT -- the maximum number of names in all dictionaries
                  (default 2500)  [Note: Increased to 3200 in Release 2.5.4]
       NUMBLOCK - the maximum number of names in a data block (default 500)
       NUMLOOP -- the maximum number of loops in a data block (default 50)
       NUMITEM -- the maximum number of items in a loop (default 50)
       MAXBUF  -- the maximum number of characters in a line (default 200)

The maximum number of categories is also controlled by NUMDICT, but does not compete for space with ordinary names.

***** WARNING ***** IF YOU CHANGE NUMCHAR OR MAXBUF YOU MUST CHANGE THEM IN BOTH 'ciftbx.cmn' AND IN 'ciftbx.sys' IN ORDER TO MAINTAIN ALIGNMENT OF COMMON BLOCKS [Note: Corrected in Release 2.4.5]

Starting with release 2.5.5, two additional parameters control the size of the memory cache for the direct access file:

       NUMPAGE -- the number of memory resident pages (default 10)
       NUMCPP  -- the number of characters per page (default 16384)

The number of characters per page must be at least MAXBUF and normally should be much larger.
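
As a quick check of what the defaults imply, the cache capacity is the product of the two parameters. The shell variables below simply mirror the parameter names and default values listed above; they are not read from 'ciftbx.sys'.

```shell
# Values copied from the defaults listed above.
NUMPAGE=10
NUMCPP=16384
MAXBUF=200

# total number of characters held in the memory cache
cache_chars=$(( NUMPAGE * NUMCPP ))
echo "cache holds $cache_chars characters"

# the stated constraint: characters per page must be at least MAXBUF
if [ "$NUMCPP" -lt "$MAXBUF" ]; then
    echo "NUMCPP must be at least MAXBUF" >&2
fi
```

With the defaults this reports 163840 characters, and the constraint is satisfied with a wide margin.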

Starting with the release of CIFtbx version 2.6.4, additional parameters for control of compression by run length encoding (RLE) and for control of caching are provided:

       XXFLAG  -- the flag character for RLE (default '`')
       XXRADIX -- the radix for RLE digit encoding (default 64)
       NUMCIP  -- the number of characters per index pointer (default 8)

The most extensive changes were made to the routine "dict_", to recognize categories and check dictionaries for consistency among categories, save-frame or data-block names and item names. We wanted to preserve the handling of older dictionaries. This led to some compromises with the most rigorous checking. The oldest dictionary in question, 'cifdic.c91' does not use categories at all, and often names items as superstrings of data-block names. The most recent core dictionary, 'cif_core.dic' uses categories, naming them explicitly. In that case we can expect an unlooped name in a block to start with the category. The new mmCIF dictionary provides a clear break in item names with a "." between the category name and the rest of an item name, a condition for which we can and do check. We therefore make the assumption that a dictionary for which no categories were explicitly defined is one for which no categories need be checked, but if a dictionary defines any categories explicitly, we check each name to ensure that some category has been explicitly or implicitly assigned. Since the possibility exists of many messages on mismatches, we have introduced "ciftbx warning" messages similar to the "ciftbx error" message, but which allow continued execution. In the ALPHA version no further checks of categories were done after the dictionary check. In this version the names used in a loop are checked for consistency, provided categories were defined at all.

CHANGES WITHIN CIFTBX2

The 2.6.4 release added new code for XML output and for run-length encoding and caching. The new logical variables xmlout_ and xmlong_ control the use of XML output. The default for xmlout_ is .false. to indicate normal CIF output. If xmlout_ is set .true. the output routines are changed to produce XML style output. The conversion of CIF tag names to XML is controlled by xmlong_. If xmlong_ is .true. (the default), the XML tags are the CIF tags with the leading '_' removed. Otherwise an attempt is made to strip the leading category as well. The logic for testing machine precision was changed to allow for higher single and double precision. In addition, dictionaries may contain _xml_mapping.token, .token_type and .target items to provide for mapping of CIF tags to parametrized strings. The logic of the test programs has been changed to initialize standard deviations for numb_ to zero. This corrects an error in the test outputs on some machines. In addition, in June 2002 the release was updated to correct an error introduced in the 2.6.3 release in the handling of negative exponents. Our thanks to James Hester for the correction.

The 2.6.3 release corrected an error processing some numbers on input.

The 2.6.2 release corrected some typos in the new code for category key checking introduced in release 2.6.

The 2.6.1 release suppressed the warning messages for the core dictionary caused by data blocks for groups of tags with no data type defined for the block name, but where the block name ends in an underscore. The 2.6 release added the variables esddig_ and pesddig_ to monitor and control the number of esd digits when esdlim_ is negative. Processing of category key values has been added.

The 2.5.5 release increased the speed of CIFtbx by creating a large in-core cache for the direct access file. The new parameters NUMPAGE and NUMCPP control the cache size. New variables append_, recbeg_ and recend_ were added. Logic in dict_ was changed to suppress category warnings from cif_core.dic. The pdata_ logic was changed to allow duplicate data blocks to be written. The logic for numb_ and numd_ was modified to ensure accurate handling of 90.000.

The 2.5.4 release increased NUMDICT to 3200 to accommodate the release 0.9.01 cif_mm.dic dictionary when loaded along with cif_core.dic. The definition of esdlim_ was extended to allow for cases reported by John C. Bollinger <jobollin@indiana.edu> so that a negative esdlim_ would permit esd's in the range [1,-esdlim_]. The new variables decp_, pdecp_, lzero_ and plzero_ were added to allow finer control over the presentation of numbers. All test cases were updated to use the current dictionaries. Some of the control code logic in dict_ was corrected and new codes 'catck' and 'catno' were added to turn category checking on and off.

The 2.5.3 release added logic for minimal processing of "global_". The new variable "glob_" is set true when a global is encountered. The variable "globo_" may be set true to force "pdata_" to output a global section instead of a data block. As with the save frame code, the global section code is a minimal implementation sufficient to handle dictionaries which use global sections. This code is not intended to support use of global sections within your CIFs. A bug in the processing of a text block with characters in the first line when the text block was the first value in a loop was fixed. The code to change the quotation character on a string containing that character was modified to test only for the case of a blank following the character.

The 2.5.2 release fixed a string subscript error in putstr. The new variables nblank_ and nblanko_ were added to allow control of the handling of quoted blank fields. Logic was added to avoid warning messages when the category_overview category is used in cifdic.c96. The variable tbxver_ was added (the CIFtbx version and date in the form 'CIFtbx version N.N.N, DD MMM YY ').

The 2.5.1 release added calls to the routine clearfp to allow floating point exceptions to be cleared after testing for machine precision. In addition, the common blocks were reorganized to avoid warning messages on systems sensitive to unreferenced variables. The file 'ciftbx.cmn' includes the two new files 'ciftbx.cmv' for variable definitions and 'ciftbx.cmf' for function definitions, and 'ciftbx.sys' includes 'ciftbx.cmv', but not 'ciftbx.cmf'. For use on systems with FreeBSD, the code in putnum was changed to allow for trailing blanks in writes of floating point fields.

The 2.5.0 release was a major change to CIFtbx. New variables were added to allow controlled use of the horizontal tab character in both input and output. The user now has the option of processing tabs as recognizable characters or of expanding them to blanks on tab stops every 8 character positions. Two new routines were added. The command "bkmrk_(mark)" sets or finds bookmarks in an input cif. The command "find_(name,type,string)" searches an input cif. The logic of "ploop_" was changed to allow the "loop_" to be placed in an output CIF without a data item name, so that comments may follow. The position of the "loop_" is now controlled by "pposval_", if given. The recognition of columns of mixed numeric and character data was changed. Such columns are now treated as being character data, even for the numeric data items. The processing of rows with mixed categories was changed to produce fewer and clearer warning messages. The command "pchar_(string)" accepts a string consisting of "char(0)" as a command to terminate the current output cif line.

The 2.4.6 release fixed deficiencies and bugs found in testing release 2.4.5 (esp. by SRH). New variables were added to allow precise position of output, and the original upper/lower case versions of data item names are now retained and returned. A bug in reporting dictionary validation errors was fixed.

The 2.4.5 release was a significant revision to CIFtbx2 to support cif2cif. The processing of numbers was extensively revised and support for the reading and writing of comments was provided. The changes are as follows:

New routines, numd_ and pnumd_ were provided to read and write double precision numbers with esd's. The new variable esdlim_ controls the writing of esd's. The processing of numbers being read was expanded to allow scientific notation with E, D, or Q.

New routines, cmnt_ and pcmnt_ were provided to read and write comments. A new routine, prefx_ allows each line of an output CIF to have characters prefixed. A new variable, tabl_ controls the use of tab stops in loop output.

The logic of char_ was revised to allow "." and "?" to be read as character strings rather than type null, distinguishing this case from unquoted period or question mark. Text fields which begin in the first line are now recognized.

The logic of pchar_ was revised to ensure quotation of fields which might be confused with numbers in scientific notation, and to allow output of "." and "?" per se. When necessary, the character data is converted to text.

The behavior of the routine test_ was changed to force an advance through a loop when the same field is tested again.

The internal routine putstr was revised to avoid excess whitespace when flushing lines and to support the variable tabl_ to force internal alignment of columns in loops to tab stops determined by the column number.

The common blocks were cleaned up and consolidated. The parameter MAXTAB was defined to control the arrays for non-loop tab stops. The duplications between ciftbx.sys and ciftbx.cmn were removed, and ciftbx.sys now forces an include of ciftbx.cmn.

The 2.4.4 release corrects two bugs. A mispositioning of an input CIF occurred if the data values in a loop included two consecutive text fields with no intervening blank. This has been corrected. Also, the output routines failed to limit output CIFs to 80 columns unless MAXBUF was set to 80. The meaning of line_ has now been extended to have effect as a right margin for output as well as for input. A warning message is issued for the rare cases where an output string cannot be fit into the number of columns specified by line_, even by starting a new line. Finally, the internal arrays used by the subroutine getitm to keep track of positions within loops have been moved to a common block to facilitate some changes now under consideration.

The 2.4.3 release includes two minor changes from the 2.4.2 release. First, the two new data types, "line" and "uline" introduced in the transition to cifdic.m96 version 0.8.0 are recognized. Second, a blank file name is permitted in opening a cif, in which case a fortran "open" statement will not be executed within CIFtbx for the file, so that the open may be controlled by the calling routines. This has proven useful in writing filters.

The 2.4.2 release includes minor cleanups to remove variables which are no longer used and a suppression of a report of conflicting types in loading multiple dictionaries when no type checking is being done. The Makefile has been improved to include execution of tests with 'make tests', to allow rebuilding of ciftbx.shar and ciftbx.cshar with 'make shars', and to clean up the directory with 'make clean'. The files for CYCLOPS2 and related changes to the Makefile were introduced. See 'README.cyclops' and the comments in 'cyclops.f'.

The 2.4.1 release includes a fix to work around the strict interpretation of the ANSI Fortran standard used by some compilers in handling write statements with concatenation of strings with inherited lengths. This caused a compilation failure in cifmsg.

With the 2.4 release, new arguments were added to dict_. The file name may be blank to allow calls which only set flags. The list of flags was extended to include 'reset' to turn off previously set flags for validity or type checking and 'close' to remove all dictionary information and reset the checking flags. A new routine, purge_, was added to close an open input CIF and clear all related data structures (but not the dictionary).

With the 2.3 release the handling of aliases changed a little. The control of use of aliases was split between the logical variable "alias_", which when true allows routines to recognize aliased names, and "aliaso_", which when true allows the output routines to output the preferred aliases from the dictionary chosen. The variables "tagname_", "dicname_", "diccat_" and "dictype_" were added to provide the CIF input tagname, the preferred name, the category and the dictionary type. In release 2.2 the last three were called "dname_", "dcat_" and "dtype_" (see below).

With the 2.2 release, the variable "dcat_" (now "diccat_") was defined in the common blocks in 'ciftbx.cmn' to hold the category of the last data item processed by "test_". The special category "(none)" may be reported when no category can be found.

The use of aliases in releases 2.0 and 2.1 was handled by adding lists of alias pointers for names. There are two pointers: "alias", which either holds a zero if there is no next alias or a pointer to the next alias, and "aroot", which is zero for the root definition or a pointer to the root definition if this is an alias. The new logical "aroot_" (now "aliaso_") controls output use of aliases. If "aliaso_" is true, then when a request is made to output a name, the preferred alias name provided by the dictionary, if any, is substituted. If "aroot_" is false, then the name given by the user is used. The default in the release is for "aroot_" to be true. If a change is needed, it is available in 'ciftbx.cmn'. In addition, for the full 2.2 release, a new variable, "daroot_", was added to the common blocks in 'ciftbx.cmn'. It holds the name of the data item of which the item named in the last call to "test_" is an alias. This report is independent of the setting of "aroot_" but does depend on the data item actually being present in the CIF being processed, not just in the dictionary (which must, of course, also be present).
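The two-pointer scheme can be modeled with parallel lists. The sketch below is in Python rather than Fortran, and the data names, the helper names `output_name` and `alias_chain`, and the layout of the example table are all hypothetical. It also assumes that the root definition is the preferred name and that the "alias" chain starts at the root.

```python
# Index 0 is unused so that 0 can mean "no pointer", as with 1-based
# Fortran arrays. All three data names below are made up.
names = [None, "_new.name", "_old_name_a", "_old_name_b"]
alias = [0, 2, 3, 0]  # pointer to the next alias, 0 if there is none
aroot = [0, 0, 1, 1]  # 0 for the root definition, else pointer to root


def output_name(idx, use_preferred):
    """Name to write for entry idx; use_preferred plays the role of the
    output alias flag."""
    if use_preferred and aroot[idx] != 0:
        return names[aroot[idx]]
    return names[idx]


def alias_chain(idx):
    """All names equivalent to entry idx, root definition first."""
    i = aroot[idx] or idx  # step to the root if idx is an alias
    chain = []
    while i != 0:
        chain.append(names[i])
        i = alias[i]
    return chain
```

With the flag true, a request to output entry 3 yields "_new.name". With the flag false, the name given by the user, "_old_name_b", is kept.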

As of release 2.3, the default for "aliaso_" is false, and "daroot_" has been renamed "dicname_". In most cases, "dicname_" will be properly set in release 2.3 and later, even if the name is _not_ in the input CIF.

The use of save-frames was handled by including a logical "save_" to flag a data block as being a save frame. Minor changes were made in the routine 'data_' to set "save_" true at the start of a save frame (i.e. when a non-blank name is given), and to recognize the end of a save frame (i.e. when the name is blank). A warning is issued if the start and end are not consistently used.
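The start and end bookkeeping can be sketched as a small state tracker. This is an illustrative Python model, not the logic of 'data_' itself. The function name `track_save_frames`, the exact conditions that raise a warning, and the warning texts are assumptions.

```python
def track_save_frames(tokens):
    """Follow save frame starts and ends in a list of CIF tokens.

    Returns (events, warnings). A token "save_NAME" starts a frame and
    a bare "save_" ends one.
    """
    in_frame = False  # plays the role of the logical save_
    events, warnings = [], []
    for tok in tokens:
        if not tok.lower().startswith("save_"):
            continue
        name = tok[5:]
        if name:
            if in_frame:
                warnings.append("save frame started inside a save frame")
            in_frame = True
            events.append(("start", name))
        else:
            if not in_frame:
                warnings.append("save frame ended without a start")
            in_frame = False
            events.append(("end", ""))
    if in_frame:
        # Assumption: an unterminated frame also counts as inconsistent.
        warnings.append("save frame not terminated")
    return events, warnings
```

A well-formed input produces alternating start and end events and no warnings. A second bare "save_" in a row produces one warning.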

The handling of long lines has been changed. Prior versions of CIFtbx clipped all lines at 80 characters. The hard clipping is now controlled by the parameter MAXBUF (default 200), with a warning issued for lines longer than the number of characters specified by the variable line_ (initially set to 80). Characters are processed even if they are in character positions after the warning limit set by line_. In some cases, text lines which were returned by "char_" with a length of 80 will now be returned with a different length. The new code scans for the last non-blank character on the line, searching as far as MAXBUF. In most cases the reported value will be less than in the past, reflecting the length of the line with trailing blanks stripped.
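The interaction of MAXBUF and line_ can be sketched as below. This is a Python illustration, not the CIFtbx input code. The function name `read_line` is hypothetical, and the sketch assumes the warning test is applied to the length after trailing blanks are stripped.

```python
MAXBUF = 200  # hard clipping limit (the default given above)


def read_line(raw, line_=80):
    """Return (text, length, warn) for one input line.

    The line is clipped at MAXBUF characters, trailing blanks are
    stripped, and warn is True when the result is longer than line_.
    """
    buf = raw[:MAXBUF]      # hard clip
    text = buf.rstrip(" ")  # scan back to the last non-blank character
    return text, len(text), len(text) > line_
```

A line holding "abc" padded with blanks to 80 columns is now reported with length 3 rather than 80. A 100-character line is kept whole but raises the warning, and a 250-character line is clipped to 200.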

The mmCIF dictionary specifies a much wider range of item types than had been the case in the past. To ensure upward compatibility, CIFtbx maps all of the known item types to one of the primitive types: char, numb, text or null. With the 2.2 release, access to the more precise type is provided by the variable "dtype_" (now "dictype_") in the common blocks in 'ciftbx.cmn'. "dtype_" is set when "test_" is called and the data item name is found in the CIF as well as in the dictionary.

KNOWN PROBLEMS

There is no way to read a comment on a data name in a loop data name list, or between the loop data names and the first data item.

There is no way to read data items within a data block after the completion of embedded save frames. Until this problem is corrected, save frames should be placed last within a data block and a new data block started for further information.

The command pchar_ forces quote marks around any string which might be confused with a number.

Updated 16 June 2002

For further information contact Syd Hall (syd@crystal.uwa.edu.au) or Herbert Bernstein (yaya@bernstein-plus-sons.com), or see Herbert Bernstein's latest sources.


