August 05, 2003

Version 2: Sample Open Data Format Bill

This is an update to the first sample bill, based on discussions on the mailing list. I'd like to thank all the people who contributed to the discussion: Brad Collins, Christopher Marshall, Joseph Shraibman, Karl Pinc, Mark Alexander, Ray Robert, Scott Zak, Steve Baker, Taran Rampersad, plus two people identified only as "jks" and "E R".

If you want to see the whole discussion, you can join the mailing list and look at the archives.

A BILL FOR AN ACT

  1. The [governing body] finds that:
    • [Government] archives, handles, and transmits information which does not belong to it, but which is entrusted to it by its citizens. [Government] must take measures to safeguard the integrity and accessibility of this public data.
    • It is necessary to the functioning of [government] that computer data owned by [government] be permanently available to [government] throughout its useful life;
    • Preservation of data for the future is an important function of [government];
    • While managers of computer systems give due attention to backing up data files and preserving such backups for future retrieval, much less attention is paid to ensuring that software that can read such backups, and the computer systems needed to run such software, remain available;
    • Data stored by [government] is often released outside of [government], for auditing purposes, viewing by private citizens and companies, and the like;
    • It is not always known in advance which data will need to be used in the future, and the decision to make data public may be made at a time when the computer system that originally wrote it is no longer available.
  2. The [governing body] further finds that:
    • [Government] is often involved in exchanging computer data with entities outside of [government], including other governments, companies, and citizens;
    • It is important that those receiving data from, or sending data to, [government], be free of restrictions on using the data, with no requirement that these third parties contract with any given third-party provider;
    • It is in the public interest to ensure exchange of computer data through the use of software and products that promote data stored in open formats;
    • Storing the same data in multiple formats leads to extra work in converting the data, and leads to the possibility that different formats may not preserve all the semantic meaning of the original format.
  3. The [governing body] further finds that:
    • To guarantee the succession and permanence of public data, it is necessary that [government]'s access to that data be independent of the goodwill of [government]'s computer system suppliers, and of the continued existence of those suppliers;
    • It is in the public interest that [government] be free, to the greatest extent possible, of restrictions imposed by parties outside [government]'s control on how, and for how long, [government] may access the data it is storing on behalf of its citizens.
  4. The [governing body] further finds that:
    • Storing data in open data formats, free of any restrictions or cost to use, guarantees that the encoding of the data is not tied to a single software provider;
    • Complete documentation of formats used to encode data ensures that data files can be read at a future time, by writing new software to interpret the data files, even if the original software that encoded them is unavailable due to lack of computer hardware or software;
    • Many data formats are extensible, and it is important that the extensions to open data formats also be documented and have the other defined characteristics of an open data format, so that the entirety of the original format plus the extensions is also open;
    • Open data format software encourages exchange of data between different software products, and
    • Properly designed encryption systems depend on the secrecy of keys and other information that is distinct from the format in which data is stored; thus, public knowledge of a data format used to store encrypted data will have no negative effect on the security of the data stored in that format.
  5. Therefore, it is in the public interest that [government] use open data format software in all of its computing functions.

Be It Enacted by the [people]:

SECTION 1: DEFINITIONS

  • 'Open data format' means a data format that encodes computer data in such a way that the encoding:
    (A) Is free for all to implement and use in perpetuity, with no royalty or fee;
    (B) Has no restrictions on the use of data stored in the format;
    (C) Has no restrictions on the creation of software that stores, transmits, receives or accesses data codified in such way;
    (D) Has a specification available for all to read, in a human-readable format, written in commonly accepted technical language;
    (E) Is completely documented, so that anyone can write software that can read and interpret the complete semantics of any data file stored in the data format;
    (F) If it allows extensions, requires that all extensions of the data format themselves be documented and have the other characteristics of an open data format defined herein;
    (G) Allows any file written in that format to be identified as adhering or not adhering to the format.
  • If a data format includes any use of encryption, the encryption algorithm must be usable on a royalty-free, non-discriminatory basis in perpetuity, and documented such that anyone in possession of the appropriate encryption key or keys shall be able to write software to decrypt the data.
  • A format that is not an open data format shall be referred to as a restricted format.
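
As an illustration only, the definition above can be read as a checklist with one yes/no answer per criterion (A) through (G). The following Python sketch is not part of the bill; the class and field names are hypothetical, and it leaves out the separate encryption clause, which would need its own check.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class FormatAssessment:
    """Hypothetical record: one answer per criterion (A)-(G) of Section 1."""
    royalty_free_in_perpetuity: bool       # (A)
    no_restrictions_on_data_use: bool      # (B)
    no_restrictions_on_software: bool      # (C)
    specification_publicly_readable: bool  # (D)
    completely_documented: bool            # (E)
    extensions_also_open: bool             # (F) True when no extensions exist
    conformance_checkable: bool            # (G)


def unmet_criteria(assessment):
    """Return the names of the criteria that the format fails."""
    return [f.name for f in fields(assessment)
            if not getattr(assessment, f.name)]


def classify(assessment):
    """A format is 'open' only if every criterion holds, else 'restricted'."""
    return "restricted" if unmet_criteria(assessment) else "open"


# Example: a format whose documentation is incomplete fails criterion (E).
example = FormatAssessment(True, True, True, True, False, True, True)
print(classify(example), unmet_criteria(example))
```

A single failed criterion is enough to make the format restricted, which matches the all-or-nothing wording of the definition.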

SECTION 2: GOALS

  • All computer data that [government] stores shall be in an open format. This includes any data that is stored on [government] computers, or exchanged between a [government] computer and a computer outside of [government].
  • The data formats covered by this bill do not include the following:
    (A) Protocols used for network communication;
    (B) Data formats of files used only by the internal workings of a particular piece of software, for example, those that store configuration information not needed to retrieve user data.
  • Any new data formats which [government] defines and to which it owns all rights shall be open data formats.

SECTION 3: NEW SOFTWARE

  • For all new software acquisitions, the person or governing body charged with administering each administrative division of [government], including every department, division, agency, board or commission, without regard to the designation given the entity, shall ensure that all data will be written in open data formats.
  • Open data formats shall be used in situations where the other requirements of the project do not make their use inappropriate or technically impossible. For a particular project involving storing or exchanging data, when satisfaction of essential project requirements precludes the use of an open data format, a restricted data format may be chosen.
  • Neither the current storage format of previously-collected data, nor current utilization of specific software products, shall be deemed in and of themselves sufficient reason, in absence of other specific overriding functional requirements, to use a restricted format.
  • If a restricted data format is used, [government] shall provide written justification to the central purchasing agency as to why it was unable to use an open data format.
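
One way to picture how the rules in this section interact is the rough Python sketch below. It is an illustration under assumptions, not part of the bill: the reason labels are hypothetical, and the sketch treats every reason other than the two named in this section as an essential project requirement, which a real review would have to establish case by case.

```python
# Reasons that this section says are not sufficient in and of themselves.
# The label strings are hypothetical, chosen only for this example.
INSUFFICIENT_ALONE = {"legacy_storage_format", "current_software_in_use"}


def may_use_restricted_format(reasons):
    """True if at least one stated reason goes beyond the insufficient ones."""
    return bool(set(reasons) - INSUFFICIENT_ALONE)


def acquisition_outcome(reasons):
    """Summarize what this section requires for a given set of reasons."""
    if may_use_restricted_format(reasons):
        return ("restricted format permitted; written justification "
                "to the central purchasing agency required")
    return "open data format required"


print(acquisition_outcome(["legacy_storage_format"]))
print(acquisition_outcome(["legacy_storage_format",
                           "no_open_format_meets_essential_requirement"]))
```

The first call reports that an open data format is required, because the legacy storage format alone is not a sufficient reason; the second permits a restricted format but triggers the written justification.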

SECTION 4: EXISTING DATA

  • In situations where [government] has existing data stored in a data format to which it owns all rights, including data produced by software developed outside of [government] as a "work for hire", the format shall be made open within one year from the day this bill becomes law.
  • Existing data stored in a restricted format to which [government] does not own the rights may continue to be stored and processed in that format. Projects that continue to use restricted formats shall be reexamined every four years to determine if the format has since become open, and if not, whether an appropriate open format exists.
  • If a project is undertaken to convert existing data in a restricted format to another format, including conversions made for the purpose of archiving data, an open data format shall be selected for the new format unless there are technical reasons preventing it.
  • If any existing data is released to the public, it shall be converted to an open format where possible.
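
The two time limits in this section (one year to open formats that [government] owns, and a reexamination every four years for restricted formats) could be tracked with a small date helper. The Python sketch below is illustrative only; the function names are hypothetical, and it assumes the four-year interval is counted from the previous reexamination, which the bill does not specify.

```python
from datetime import date


def add_years(d, years):
    """Return the same calendar day `years` later."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 28 February.
        return d.replace(year=d.year + years, day=28)


def format_opening_deadline(enactment):
    """Owned formats shall be made open within one year of enactment."""
    return add_years(enactment, 1)


def next_reexamination(last_review):
    """Projects using restricted formats are reexamined every four years."""
    return add_years(last_review, 4)


enacted = date(2003, 8, 5)  # example date only
print(format_opening_deadline(enacted))  # 2004-08-05
print(next_reexamination(enacted))       # 2007-08-05
```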

SECTION 5: DOCUMENTATION

  • Documentation on open data formats used by [government] shall be made readily accessible from a central location on the Internet. When data in an open format is made available through [government]'s website, a link shall be provided to the corresponding data format documentation.
  • If data in an open format is made available to the public via a different method, a reasonable attempt shall be made to provide information on the data format documentation.

    Posted by Adam Barr at August 5, 2003 11:38 PM