CIFtbx

Reading, Writing and Validating CIFs using CIFtbx2, cif2cif and Cyclops

(Revised 23 June 1998)

Based on "Reading, Writing and Validating CIFs using CIFtbx2 and Cyclops", mmCIF workshop, IUCr meeting, Seattle, Washington, August 1996, Abstract E0725

Herbert J. Bernstein
Bernstein + Sons, 5 Brewster Lane, Bellport, NY 11713-2803, USA
phone: 1-516-286-1339 email: yaya@bernstein-plus-sons.com

Sydney R. Hall
Crystallographic Centre, University of Western Australia, Nedlands 6009, Australia
phone: 61-9-380-2725 email: syd@crystal.uwa.edu

Work supported in part by IUCr (for HJB)

Before using this software, please read the NOTICE and the IUCr Policy on the Use of the Crystallographic Information File (CIF).


Introduction


The basic steps needed to adapt existing Fortran applications and write new applications which will manipulate CIFs are explained. We emphasize techniques needed to make applications compatible with both DDL1 and DDL2 CIFs. We discuss validating CIFs and more general STAR documents against dictionaries.

CIFs are becoming the standard for presentation of small molecules, and the pending adoption of the mmCIF dictionary by the IUCr is encouraging increasing use of CIFs for macromolecules. It is critically important for existing applications, such as molecular display programs, to be adapted to accept small molecule and macromolecule CIFs for input and to be able to produce CIFs as output in order to ensure a common interchange format among programs. Application programmers need to become familiar with the dictionary-based definition of CIF tokens and to design their applications so that the addition of new layered dictionaries will not require a redesign of code. We use the experience in adapting the DDL1 versions of several programs to a compatible DDL1/DDL2 environment to illustrate some of the practical issues involved. We show how three tools can be used to make the transition to CIF, and between DDL1 and DDL2, more manageable: the extended version of CIFtbx2, a new version of a Fortran subroutine library for programmers developing CIF applications; cif2cif, a CIFtbx2 application which checks, copies and reformats CIFs and extracts subsets of them; and Cyclops2, a new version of the very effective STAR data name checking program.

Issues that are addressed include managing large dictionaries efficiently with the use of hash-tables, the use of layered dictionaries, the implications of categories, and the implications of the more precise data types of mmCIF.

CIFtbx2

CIFtbx2 (Hall and Bernstein, 1996) is a new version of a Fortran subroutine library for programmers developing CIF applications. The functions for reading and writing CIF data in CIFtbx [H95] have been expanded and facilities for handling macromolecular CIF data and dictionaries have been added. The CIFtbx2 library facilitates applications involving all current versions of the dictionary definition language.

Since CIFtbx was first released, the CIF approach has been applied to new areas such as NMR data and macromolecular structural data. The dictionary which defines the macromolecular structural data [FBB96] uses an extended dictionary definition language (DDL2) with stronger relational attributes than that of the DDL [HC95] used for the core crystallographic dictionaries. The extended DDL2 [WH95] is not supported by the dictionary functions in CIFtbx, and this was the primary motivation for creating the new CIFtbx2 library. This library, which is equally applicable to the macromolecular CIF (mmCIF) dictionary in DDL2 and to other dictionaries in DDL, is the subject of this talk.

The initial impetus for this work was to support one of us (HJB) in the use of CIF data derived from Protein Data Bank [PDB77] files. CIFtbx2 was used to map hundreds of CIF data names embedded in existing software into the mmCIF name format, and to check the existence and application of these items. As a result, code for conversion of Protein Data Bank entries to mmCIF format was developed in the form of an awk script, pdb2cif [BBB98]. CIFtbx2 is also used in the recent release of the Xtal 3.4 System [HKS95].

With minor exceptions, CIFtbx2 is a fully upward compatible extension of CIFtbx.

The user interface to CIFtbx2 consists of Fortran functions, subroutines and common variables. The user-accessible functions, subroutines and variables are identified by a trailing underscore ("_") character. The functions and subroutines, often referred to as "commands", may be divided into three basic groups: set-up commands that initialise data handling; read commands that input data from a CIF; and write commands that output CIF data. The read and write commands are logically independent and may be applied simultaneously to copy and update CIFs.

GENERAL TOOLS

init_ Sets the device numbers of files. (optional)
[logical function always returned .true.]
<input CIF dev number> Set input CIF device (def=1)
<output CIF dev number> Set output CIF device (def=2)
<diracc dev number> Set direct access formatted scratch device number (def=3)
<error dev number> Set error message device (def=6)
dict_ Requests a CIF dictionary be used for various data checks.
[logical function returned as .true. if the named dictionary was opened and if the check codes are recognisable. The data item names used in the first dictionary loaded are considered to be preferred by the user to aliases found in dictionaries loaded in later calls. On exit from dict_ the variable dicname_ is either equal to the filename or, if the dictionary had a value for the tag _dictionary_name or _dictionary.title, set to that value. The variable dicver_ is blank or set to the value of _dictionary_version or of _dictionary.version. The check codes 'catck' and 'catno' turn on and off checking of dictionary category conventions. The default is 'catck'. Three check codes control the handling of tags from the current dictionary which duplicate tags from a dictionary loaded earlier. These codes ('first', 'final' and 'nodup') have effect only for the current call to dict_. The default is 'first'.]
<dictionary filename> A CIF dictionary in DDL format, or blank if just setting flags or resetting the dictionary
<check code string> The codes specifying the types of checks to be applied to the CIF.
'valid' data name validation check.
'dtype' data item data type check.
'catck' check datanames against categories.
'catno' don't check datanames against categories.
'first' accept first dictionary's definitions of duplicate tags.
'final' accept final dictionary's definitions of duplicate tags.
'nodup' do not accept duplicate tag definitions.
'reset' switch off checking flags.
'close' close existing dictionaries.
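
The effect of the 'first', 'final' and 'nodup' codes on layered dictionaries can be pictured with a small hash-table model. The following Python sketch is not CIFtbx code: the function name, the use of an exception for 'nodup', and the dictionary labels "core" and "mm" are assumptions made purely for illustration.

```python
def load_dictionary(table, names, source, policy="first"):
    """Merge the data names of one dictionary into a hash table.

    table  : dict mapping lower-cased data name -> label of defining dictionary
    names  : data names defined by the dictionary being loaded
    source : label for the dictionary being loaded (hypothetical)
    policy : 'first', 'final' or 'nodup', modelled on the dict_ check codes
    Returns the names that duplicated an earlier definition.
    """
    if policy not in ("first", "final", "nodup"):
        raise ValueError("unknown policy: " + policy)
    duplicates = []
    for name in names:
        key = name.lower()  # CIF data names are case-insensitive
        if key in table:
            duplicates.append(name)
            if policy == "nodup":
                # Assumption: a duplicate is reported as an error
                raise KeyError("duplicate definition of " + name)
            if policy == "first":
                continue  # keep the earlier dictionary's definition
        table[key] = source  # new name, or 'final' overriding an earlier one
    return duplicates


table = {}
load_dictionary(table, ["_cell_length_a", "_atom_site_label"], "core")
dups = load_dictionary(table, ["_atom_site_label", "_atom_site.id"], "mm",
                       policy="final")
```

With 'final' the later dictionary takes over the duplicated name; with the default 'first' the earlier definition would have been kept.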

CIF ACCESS TOOLS ("the get_ing commands")

ocif_ Opens the CIF containing the required data.
[logical function returned .true. if CIF opened]
<CIF filename> A blank name signals that the currently open input CIF file will be read.
data_ Identifies the data block containing the data to be requested.
[logical function returned .true. if block found]
<data block name> A blank name signals that the next encountered block is used (the block name is stored in the variable bloc_).
bkmrk_ Saves or restores the current position so that data from elsewhere in the CIF can be examined.
[logical function returned as .true. on save if there was room in internal storage to hold the current position, .true. on restore if the bookmark number used was valid. If the argument is zero, the call is to save the position and return the bookmark number in the argument. If the argument is non-zero, the call is to restore the position saved for the bookmark number given. The bookmark and the argument are then cleared. The position set on return allows reprocessing of the data item or loop row last processed when the bookmark was placed.
NOTE: All bookmarks are cleared by a call to data_]
<integer variable> Bookmark number.
find_ Find the location of the requested item in the CIF.
[The argument "name" may be a data item name, or blank for the next such item. The argument "type" may be blank for unrestricted acceptance of any non-comment string (use cmnt_ to see comments), including loop headers, "name" to accept only the name itself, "valu" to accept only the value, or "head" to position to the head of the CIF. Except when the "head" is requested, the position is left after the data item provided. If the item found is of type "name", posnam_ is set; otherwise, posval_ is set.]
<data item name> A blank name signals that the next item of the type specified is needed.
<data item type> blank, 'head', 'name' or 'valu'
<character variable> Returned string is of length long_.
test_ Identify the data attributes of the named data item.
[logical function returned as .true. if the item is present or .false. if it is not. The data attributes are stored in the common variables list_, type_, dictype_, diccat_ and dicname_. The values in dictype_, diccat_ and dicname_ are valid whether or not the data item is found in the input CIF, as long as the named data item is found in the dictionaries declared by calls to dict_. The data item name found in the input CIF is stored in tagname_. The appropriate column numbers are stored in posnam_, posval_ and (for numbers) in posdec_. The quotation mark, if any, used is stored in quote_.
list_ is an integer variable containing the sequential number of the loop block in the data block. If the item is not within a loop structure this value will be zero.
type_ is a character*4 variable with the possible values:
'numb' for number data
'char' for character data
'text' for text data
'null' if data missing or '?' or '.'
dictype_ is a character*(NUMCHAR) variable with the type code given in the dictionary entry for the named data item. If no dictionary was used, or no type code was specified, this field will simply agree with type_. If a dictionary was used, this type may be more specific than the one given by type_.
diccat_ is a character*(NUMCHAR) variable with the category of the named data item, or '(none)'
dicname_ is a character*(NUMCHAR) variable with the name of the data item which is found in the dictionary for the named data item. If alias_ is .true., this name may differ from the name given in the call to test_. If alias_ is .false. or no preferred alias is found, dicname_ agrees with the data item name.
tagname_ is a character*(NUMCHAR) variable with the name of the data item as found in the input CIF. It will be blank if the data item name requested is not found in the input CIF and may differ from the data item name provided by the user if the name used in the input CIF is an alias of the data item name and alias_ is .true.
posnam_, posval_ and posdec_ are integer variables which may be examined if information about the horizontal position of the name and data read are needed. posnam_ is the starting column of the data name found (most often 1). posval_ is the starting column of the data value. If the field is numeric, then posdec_ will contain the effective column number of the decimal point. For whole numbers, the effective position of the decimal point is one column to the right of the field.
quote_ is a character*1 variable which may be examined to determine if a quotation character was used on character data.]
<data name> Name of the data item to be tested.
name_ Get the NEXT data name in the current data block.
[logical function returned as .true. if a new data name exists in the current data block, and .false. when the end of the data block is reached.]
<data name> Returned name of next data item in block.
numb_ Extracts the number and its standard deviation (if appended).
[logical function returned as .true. if number present. If .false. arguments 2 and 3 are unaltered. If the esd is not attached to the number argument 3 is unaltered.]
<data name> Name of the number sought.
<real variable> Returned number.
<real variable> Returned standard deviation.
numd_ Extracts the number and its standard deviation (if appended) as double precision variables.
[logical function returned as .true. if number present. If .false. arguments 2 and 3 are unaltered. If the esd is not attached to the number argument 3 is unaltered.]
<data name> Name of the number sought.
<double precision variable> Returned number.
<double precision variable> Returned standard deviation.
char_ Extracts character and text strings.
[logical function returned as .true. if the string is present. Note that if the character string is text this function is called repeatedly until the logical variable text_ is .false.]
<data name> Name of the string sought.
<character variable> Returned string is of length long_.
cmnt_ Extracts the next comment from the data block.
[logical function returned as .true. if a comment is present. The initial comment character "#" is _not_ included in the returned string. A completely blank line is treated as a comment.]
<character variable> Returned string is of length long_.
purge_ Closes existing data files and clears tables and pointers.
[subroutine call]
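
To make the numb_ and numd_ descriptions concrete, the following Python sketch shows how a CIF numeric field with an appended esd, such as 1.234(5), splits into a value and a standard deviation. It is an illustration only, not the CIFtbx implementation; the function name parse_numb and the use of None for "not present" (standing in for arguments left unaltered) are assumptions.

```python
import re
from decimal import Decimal

# Number, optionally followed by an esd in parentheses, e.g. 1.234(5)
_NUMB = re.compile(
    r"^([+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)(?:\((\d+)\))?$"
)


def parse_numb(field):
    """Split a numeric field into (value, esd).

    Returns None if the field is not a number. The esd is None when no
    parenthesised esd is appended to the number.
    """
    m = _NUMB.match(field.strip())
    if m is None:
        return None
    number, esd_digits = m.groups()
    value = Decimal(number)
    if esd_digits is None:
        return float(value), None
    # The esd digits refer to the last digits of the value,
    # so they take the same power of ten as its last digit.
    esd = Decimal(esd_digits).scaleb(value.as_tuple().exponent)
    return float(value), float(esd)


print(parse_numb("1.234(5)"))
print(parse_numb("12"))
print(parse_numb("?"))
```

Decimal arithmetic is used so that the esd is scaled exactly before conversion to floating point.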

CIF CREATION TOOLS ("the put_ing commands")

pfile_ Create a file with the specified file name.
[logical function returned as .true. if the file is opened. The value will be .false. if the file already exists.]
<file name> Blank for use of currently open file
pdata_ Put a data block command into the created CIF.
[logical function returned as .true. if the block is created. The value will be .false. if the block name already exists. Produces a save frame instead of a data block if the variable saveo_ is true during the call. No block duplicate check is made for a save frame.]
<block name>
ploop_ Put a loop_ data name into the created CIF.
[logical function returned as .true. if the invocation conforms with the CIF logical structure.]
<data name>
pchar_ Put a character string into the created CIF.
[logical function returned as .true. if the name is unique, AND, if dict_ is invoked, is a name defined in the dictionary, AND, if the invocation conforms to the CIF logical structure. The action of pchar_ is modified by the variables pquote_ and nblanko_. If pquote_ is non-blank, it is used as a quotation character for the string written by pchar_. The valid values are '''', '"', and ';'. In the last case a text field is written. If the string contains a matching character to the value of pquote_, or if pquote_ is not one of the valid quotation characters, a valid, non-conflicting quotation character is used. Except when writing a text field, if nblanko_ is true, pchar_ converts a blank string to an unquoted period.]
<data name> If the name is blank, do not output name.
<character string> A character string of MAXBUF chars or less.
pcmnt_ Puts a comment into the created CIF.
[logical function returned as .true. The comment character "#" should not be included in the string. A blank comment is presented as a blank line without the leading "#".]
<character string> A character string of MAXBUF chars or less.
pnumb_ Put a single precision number and its esd into the created CIF.
[logical function returned as .true. if the name is unique, AND, if dict_ is invoked, is a name defined in the dictionary, AND, if the invocation conforms to the CIF logical structure. The number of esd digits is controlled by the variable esdlim_]
<data name> If the name is blank, do not output name.
<real variable> Number to be inserted.
<real variable> Esd number to be appended in parentheses.
pnumd_ Put a double precision number and its esd into the created CIF.
[logical function returned as .true. if the name is unique, AND, if dict_ is invoked, is a name defined in the dictionary, AND, if the invocation conforms to the CIF logical structure. The number of esd digits is controlled by the variable esdlim_]
<data name> If the name is blank, do not output name.
<double precision variable> Number to be inserted.
<double precision variable> Esd number to be appended in parentheses.
ptext_ Put a character string into the created CIF.
[logical function returned as .true. if the name is unique, AND, if dict_ is invoked, is a name defined in the dictionary, AND, if the invocation conforms to the CIF logical structure.]
ptext_ is invoked repeatedly until the text is finished. Only the first invocation will insert a data name.
<data name> If the name is blank, do not output name.
<character string> A character string of MAXBUF chars or less.
prefx_ Puts a prefix onto subsequent lines of the created CIF.
[logical function returned as .true. The second argument may be zero to suppress a previously used prefix, or greater than the non-blank length of the string to force a left margin. Any change in the length of the prefix string flushes pending partial output lines, but does _not_ force completion of pending text blocks or loops. This function allows the CIF output functions to be used within what appear to be text fields to support annotation of a CIF.]
<character string> A character string of MAXBUF chars or less.
<integer variable> The length of the prefix string to use.
close_ Close the creation CIF. MUST be used if pfile_ is used.
[subroutine call]
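
The way pchar_ selects a quotation character can be sketched as follows. This Python fragment is only a model of the behaviour described above under simplifying assumptions: any occurrence of a quotation character inside the string is treated as a conflict (CIF itself is more permissive), a text field is assumed always to be acceptable, and the function names are hypothetical.

```python
VALID_QUOTES = ("'", '"', ";")


def choose_quote(value, pquote):
    """Pick a quotation character for value, preferring pquote if valid."""
    order = [pquote] if pquote in VALID_QUOTES else []
    order += [q for q in VALID_QUOTES if q not in order]
    for q in order:
        # ';' selects a text field, assumed here to accept any string
        if q == ";" or q not in value:
            return q
    return ";"  # fallback, not reached because ';' is always in the list


def format_char(value, pquote="'", nblanko=False):
    """Return the string as it might be written to the output CIF."""
    q = choose_quote(value, pquote)
    if nblanko and q != ";" and value.strip() == "":
        return "."  # blank string becomes an unquoted period
    if q == ";":
        return "\n;" + value + "\n;"  # text field delimited by ';' lines
    return q + value + q


print(format_char("O'Neil"))
print(format_char(" ", nblanko=True))
```

The first call falls back from the requested single quote to a double quote because the string contains an apostrophe.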

Variables for Data Access Control:

alias_ Logical variable: if left .true. then all calls to CIFtbx functions may use aliases of data item names. The preferred synonym from the dictionary will be substituted internally, provided aliased data names were supplied by an input dictionary (via dict_). The default is .true., but alias_ may be set to .false. in an application.
aliaso_ Logical variable: if set .true. then cif output routines will convert aliases to the preferred synonyms from the dictionary. The default is .false., but aliaso_ may be set to .true. in an application. The setting of aliaso_ is independent of the setting of alias_.
align_ Logical variable signals alignment of loop_ lists during the creation of a CIF. The default is .true.
append_ Logical variable: if set .true. each call to ocif_ will append the information found to the current cif. The default is .false.
bloc_ Character*(NUMCHAR) variable: the current block name.
decp_ Logical variable: set when processing numeric input, .true. if there is a decimal point in the numeric value, .false. otherwise.
dictype_ Character*(NUMCHAR) variable: the precise data type code (see test_)
diccat_ Character*(NUMCHAR) variable: the category (see test_)
dicname_ Character*(NUMCHAR) variable: the root alias (see test_)
dicver_ Character*(NUMCHAR) variable: the version of the dictionary just loaded (see dict_)
esddig_ Integer variable: The number of esd digits in the last number read from a CIF. Will be zero if no esd was given.
esdlim_ Integer variable: Specifies the upper limit of esd's produced by pnumb_, and, implicitly, the lower limit. The default value is 19, which limits esd's to the range 2-19. Typical values of esdlim_ might be 9 (limiting esd's to the range 1-9), 19, or 29 (limiting esd's to the range 3-29).
file_ Character*(MAXBUF) variable: the filename of the current file.
glob_ Logical variable signals that the current data block is actually a global block (.true. for a global block).
globo_ Logical variable signals that the output data block from pdata_ is actually a global block (.true. for a global block).
line_ Integer variable: Specifies the input/output line limit for processing a CIF. The default value is 80 characters. This may be set by the program. The max value is MAXBUF which has a default value of 200.
list_ Integer variable: the loop block number (see test_).
long_ Integer variable: the length of the data string in strg_.
longf_ Integer variable: the length of the filename in file_.
loop_ Logical variable signals if another loop packet is present.
lzero_ Logical variable: set when processing numeric input, .true. if the numeric value is of the form [sign]0.nnnn rather than [sign].nnnn, .false. otherwise
nblank_ Logical variable: if set .true. then all calls to char_ or test_ which encounter a non-text quoted blank will return the type as 'null' rather than 'char'.
nblanko_ Logical variable: if set .true. then cif output routines will convert quoted blank strings to an unquoted period (i.e. to a data item of type null).
pdecp_ Logical variable: if set .true. then cif numeric output routines will insert a decimal point in all numbers written by pnumb_ or pnumd_. If set .false. then a decimal point will be written only when needed. The default is .false.
pesddig_ Integer variable: if set non-zero, and esdlim_ is negative, controls the number of digits for esd's produced by pnumb_ and pnumd_
plzero_ Logical variable: if set .true. then cif numeric output routines will insert a zero before a leading decimal point. The default is .false.
pposdec_ Integer variable giving the position of the decimal point for the next number to be written.
pposnam_ Integer variable giving the starting column of the next name or comment to be written.
pposval_ Integer variable giving the starting column of the next data value to be written.
posdec_ Integer variable giving the position of the decimal point for the last number read.
posend_ Integer variable giving the ending column of the last data value read, not including a terminal quote.
posnam_ Integer variable giving the starting column of the last name or comment read.
posval_ Integer variable giving the starting column of the last data value read.
pquote_ Character variable giving the quotation symbol to be used for the next string written.
quote_ Character variable giving the quotation symbol found delimiting the last string read.
precn_ Integer variable: Reports the record number of the last line written to the output cif. Set to zero by init_. Also set to zero by pfile_ and close_ if the output cif file name was not blank.
ptabx_ Logical variable signals tab character expansion to blanks during the creation of a CIF. The default is .true.
recbeg_ Integer variable: Gives the record number of the first record to be used. May be changed by the user to restrict access to a CIF.
recend_ Integer variable: Gives the record number of the last record to be used. May be changed by the user to restrict access to a CIF.
recn_ Integer variable: Reports the record number of the last line read from the direct access copy of the input cif.
save_ Logical variable signals that the current data block is actually a save-frame (.true. for a save-frame).
saveo_ Logical variable signals that the output data block from pdata_ is actually a save-frame (.true. for a save-frame).
strg_ Character*(MAXBUF) variable: the current data item.
tabl_ Logical variable signals tab-stop alignment of output during the creation of a CIF. The default is .true.
tabx_ Logical variable signals tab character expansion to blanks during the reading of a CIF. The default is .true.
tbxver_ Character*32 variable: the CIFtbx version and date in the form 'CIFtbx version N.N.N, DD MMM YY '
text_ Logical variable signals if another text line is present.
type_ Character*4 variable: the data type code (see test_).
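
The esd ranges quoted for esdlim_ (1-9, 2-19 and 3-29) follow from a simple rounding rule, which the Python sketch below tries to reproduce. It is one plausible reading of the rule, not the code used by pnumb_; the function name and the round-half-up choice are assumptions, and negative esdlim_ values (see pesddig_) are not modelled.

```python
from decimal import Decimal, ROUND_HALF_UP


def format_with_esd(value, esd, esdlim=19):
    """Format value with its esd in parentheses, esd digits <= esdlim."""
    v = Decimal(str(value))
    e = Decimal(str(esd))
    if e <= 0:
        return format(v, "f")  # no esd to append
    # Start with two significant esd digits: e / 10**k lies in [10, 100)
    k = e.adjusted() - 1
    n = int(e.scaleb(-k).to_integral_value(rounding=ROUND_HALF_UP))
    # Drop a digit while the integer esd exceeds the limit
    while n > esdlim:
        k += 1
        n = int(e.scaleb(-k).to_integral_value(rounding=ROUND_HALF_UP))
    # Round the value to the same decimal place as the esd
    rounded = v.quantize(Decimal(1).scaleb(k), rounding=ROUND_HALF_UP)
    if k > 0:
        # esd applies to a digit left of the units place
        return "%d(%d)" % (int(rounded), n * 10 ** k)
    return "%s(%d)" % (format(rounded, "f"), n)


print(format_with_esd(1.2346, 0.0012))       # esdlim 19
print(format_with_esd(1.2346, 0.0012, 9))    # esdlim 9
print(format_with_esd(1.2346, 0.0025, 29))   # esdlim 29
```

Under this reading, an esd that would exceed the limit with two digits is reduced to one digit, which is why a limit of 19 never yields an esd of 1.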

cif2cif

cif2cif is a CIFtbx2 application which copies CIFs while optionally checking them against dictionaries. While doing the copy, cif2cif can reformat a CIF which has strayed past column 80 to bring it back into spec and can modify esd's for numbers to conform to the rule of 9, 19, 29, etc. It can also perform the functions of QUASAR [HS93] for CIFs, extracting selected tags in the order specified by a request list.

Using the Program cif2cif

      cif2cif [-i input_cif] [-o output_cif] [-d dictionary]
              [-f command_file] [-e esdlim_] [-a aliaso_] [-p prefix]
              [-t tabl_] [-q request_list]
              [input_cif [output_cif [dictionary [request_list]]]]
      where:
              input_cif defaults to $CIF2CIF_INPUT_CIF or stdin
              output_cif defaults to $CIF2CIF_OUTPUT_CIF or stdout
              dictionary defaults to $CIF2CIF_CHECK_DICTIONARY
                (multiple dictionaries may be specified)
              request_list defaults to $CIF2CIF_REQUEST_LIST
              input_cif of "-" is stdin, output_cif of "-" is stdout
              request_list of "-" is stdin
              -e has integer values (e.g. 9, 19(default) or 29)
              -a has values of t or 1 or y vs. f or 0 or n
              -p has string values in which "_" is replaced by blank
              -t has values of t or 1 or y vs. f or 0 or n (default n)

cif2cif is used as a filter.
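
The option value conventions listed above (t, 1 or y versus f, 0 or n, and "_" standing for a blank in the prefix) can be illustrated with a few lines of Python. This is a sketch of the conventions only, not of how cif2cif parses its arguments; the function names, the case-insensitive comparison and the decision to look only at the first character are assumptions.

```python
def parse_flag(text):
    """Interpret t/1/y as True and f/0/n as False, as for -a and -t."""
    first = text.strip().lower()[:1]
    if first in ("t", "1", "y"):
        return True
    if first in ("f", "0", "n"):
        return False
    raise ValueError("expected t, 1, y, f, 0 or n, got " + repr(text))


def parse_prefix(text):
    """Turn a -p argument into the prefix string: '_' stands for a blank."""
    return text.replace("_", " ")


print(parse_flag("y"), parse_flag("0"))
print(repr(parse_prefix("#_")))
```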

Cyclops2

Cyclops2 is a new version of the program Cyclops (Hall, 1993) which is used, in conjunction with CIF dictionaries, to validate data names in an ASCII file. The validated file may contain CIF or non-CIF data, text documents or a program source. The new version is able to work with DDL1 or DDL2 dictionaries, the long data names of mmCIF dictionaries and with multiple dictionaries. Cyclops2 is written incorporating the CIFtbx2 (Hall and Bernstein, 1996) library of Fortran functions and is portable to a variety of platforms.
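
The kind of check Cyclops2 performs can be sketched in Python as a scan for tokens that look like data names, recording the line numbers of those not found in the dictionaries. This is an illustration and not the Cyclops2 algorithm: the regular expression defining a data name, the function name and the sample names are all assumptions.

```python
import re
from collections import defaultdict

# Assumption: a data name starts with "_" and is not preceded by a
# letter, digit, underscore or period (so "loop_" is not matched).
DATA_NAME = re.compile(r"(?<![\w.])_[\w.\[\]-]+")


def check_names(lines, known):
    """Return {data name: [line numbers]} for names not in 'known'.

    known is a set of lower-cased data names taken from the dictionaries.
    """
    unknown = defaultdict(list)
    for lineno, line in enumerate(lines, start=1):
        for name in DATA_NAME.findall(line):
            if name.lower() not in known:
                unknown[name].append(lineno)
    return dict(unknown)


text = [
    "_cell_length_a 5.4",
    "loop_",
    "_blat1",
    "      name = '_blat1'",
]
print(check_names(text, {"_cell_length_a"}))
```

Because the scan works on plain text lines, the same approach applies to a CIF, a document or a program source.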

Files are read and written by Cyclops2 as follows:

The text file to be validated is read from the standard input device (normally device 5). For Unix operating systems this is the file stdin; on other systems Cyclops2 uses the file STARTEXT.

The dictionary file or files are identified in the input text file STARDICT. This file may itself be a DDL-conforming dictionary, or, if it begins with the characters "#DICT", may list the filenames of dictionaries to be entered, one per line.

The validation report is output to the file STARCHEK.

Messages are output to the standard output device stdout (normally device 6).
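
The two roles of the STARDICT file can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that the "#DICT" line carries no filename of its own and that blank lines in the list are ignored; neither detail is stated above.

```python
def dictionary_files(stardict_text, stardict_name="STARDICT"):
    """Return the dictionary filenames implied by the STARDICT contents.

    If the text begins with "#DICT", the remaining lines are taken as
    filenames, one per line. Otherwise STARDICT is itself the dictionary.
    """
    lines = stardict_text.splitlines()
    if lines and lines[0].startswith("#DICT"):
        return [ln.strip() for ln in lines[1:] if ln.strip()]
    return [stardict_name]


print(dictionary_files("#DICT\ncif_core.dic\ncif_mm.dic\n"))
print(dictionary_files("data_on_this_dictionary\n"))
```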

Using the Program Cyclops2

cyclops [-i input_text] [-o validation_output]
    [-d dictionary] [-p priority] [-c catck]
    [-f command_file] [-v verbose] [-s short]
where:
    input_text defaults to $Cyclops_INPUT_TEXT or stdin
    validation_output defaults to $Cyclops_VALIDATION_OUT or stdout
    dictionary defaults to $Cyclops_CHECK_DICTIONARY
      (multiple dictionaries may be specified)
    input_text of "-" is stdin, validation_output of "-" is stdout
    -c has values of t or 1 or y vs. f or 0 or n
      (default f, i.e. no checking of dictionary categories)
    -v has values of t or 1 or y vs. f or 0 or n
      (default f, i.e. non-verbose)
    -s has values of t or 1 or y vs. f or 0 or n
      (default f, i.e. not short);
      short restricts output to items not in dictionaries
    -p has values of first, final or nodup
      (default first, i.e. the first dictionary has priority)
    a command file may contain additional arguments.

Sample Cyclops2 Output


                    Cyclops Check List
                    ------------------


                Dictionary data names  = 2244
                New data names in text =    4
                [1]  Dictionary cif_core.dic 2.0.1 data names =   624
                [2]  Dictionary cif_mm.dic 0.9.01 data names =  1620


 Data names NOT in Dictionary                         Line Numbers

 _blat1  . . . . . . . . . . . . . . . . . . . . . . . .     9    11    94    96
                                               197   199   306   312   318   324
                                               330
 _blat2  . . . . . . . . . . . . . . . . . . . . . . . .    13    15    98   100
                                               201   203   303   309   315   321
                                               327
 _dummy_test   . . . . . . . . . . . . . . . . . . . . .     5     7    90    92
                                               193   195   217
 _rubish_here  . . . . . . . . . . . . . . . . . . . . .   447



 [1]  Dictionary cif_core.dic 2.0.1
 [2]  Dictionary cif_mm.dic 0.9.01
                                                      Line Numbers

 [2] _atom_site.calc_attached_atom   . . . . . . . . . .   429
 [1] = _atom_site_calc_attached_atom                       428
 [2] _atom_site.calc_flag  . . . . . . . . . . . . . . .   426
 [1] = _atom_site_calc_flag                                425
 [2] _atom_site.fract_x  . . . . . . . . . . . . . . . .    38    44    50   406
 [1] = _atom_site_fract_x                                  405
 [2] _atom_site.fract_y  . . . . . . . . . . . . . . . .    39    45    51   410
 [1] = _atom_site_fract_y                                  409
 [2] _atom_site.fract_z  . . . . . . . . . . . . . . . .    40    46    52   414
 [1] = _atom_site_fract_z                                  413
 [2] _atom_site.id   . . . . . . . . . . . . . . . . . .    37    43    49   402
 [1] = _atom_site_label                                    401
 [2] _atom_site.thermal_displace_type  . . . . . . . . .   422
 [1] = _atom_site_thermal_displace_type                    421
 [2] _atom_site.type_symbol  . . . . . . . . . . . . . .   432   436   440   444
                                               450   454   458   466   470   474
                                               478
 [1] = _atom_site_type_symbol                              431   435   439   443
                                               449   453   457   465   469   473
                                               477
 [2] _atom_site.U_iso_or_equiv   . . . . . . . . . . . .    41    47    53   418

  ... SECTION OF OUTPUT OMITTED ...


References

Useful WWW URL's

There are many useful sites on the World Wide Web where information, tools and software related to CIF, mmCIF and the PDB can be found. The following are good starting points for exploration:

The International Union of Crystallography (IUCr) provides access to software, dictionaries, policy statements and documentation relating to CIF and mmCIF at:

with mirror sites at:

Information and Software for STAR and CIF can be found at:

The Nucleic Acid Database Project provides access to its entries, software and documentation, with an mmCIF page giving access to the dictionary and mmCIF software tools at:

with mirror sites at:

The Protein Data Bank provides access to entries, software and documentation with a browser, and an on-line PDB format description at:

with mirror sites at many locations (see http://www.pdb.bnl.gov/pdb-docs/mirror_sites.html).

Tutorials on mmCIF and the relationship to PDB format can be found at: http://www.sdsc.edu/pb/cif/tutorials.html


Here are direct links to copies of the IUCr CIF home page, the NDB's mmCIF home page, pdb2cif, cif2pdb and CIFtbx (with Cyclops and cif2cif).

United States
  NDB, Rutgers, NJ                      mmCIF  pdb2cif  cif2pdb  CIFtbx...
  SDSC, San Diego, CA         CIF       mmCIF  pdb2cif  cif2pdb  CIFtbx...
United Kingdom
  IUCr, Chester               CIF              pdb2cif  cif2pdb  CIFtbx...
  EBI, Hinxton                          mmCIF  pdb2cif  cif2pdb  CIFtbx...
France
  U. P. et M. Curie, Paris    CIF              pdb2cif  cif2pdb  CIFtbx...
Sweden
  U. of Stockholm             CIF              pdb2cif  cif2pdb  CIFtbx...
South Africa
  U. of the Witwatersrand     CIF              pdb2cif  cif2pdb  CIFtbx...
Japan
  NIBH, Ibaraki                         mmCIF  pdb2cif  cif2pdb  CIFtbx...
Australia
  UWA, Nedlands               STAR/CIF         pdb2cif  cif2pdb  CIFtbx...



Updated 23 June 1998

Herbert J. Bernstein (yaya@bernstein-plus-sons.com)