Sunday, June 19, 2011

Proposed Data Library Structure

Data library structure is a small design question, but it still has consequences for every corner of the Finili design. Do data files need to be formed into groups for ease of management, and if so, what groups, and how are they formed and accessed? The syntax working group collected thoughts on security, data, and user interface issues and is proposing the following data library structure.

Project Ownership of Data Files

A Finili data file is tagged with and belongs to the project that creates it. It is anticipated that files will typically be copied using external tools to deliver them to other projects that may use them. A project cannot directly modify a file belonging to another project (for example, to add more data). A project can, however, receive transaction requests from other projects (or elsewhere) and incorporate them into its own files.

A project can take control of another project's Finili data only by creating a new file and copying the data into it. An applet (presumably Copy) is provided for this purpose.
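The ownership rule can be illustrated with a short Python sketch. This is only a model of the idea, not Finili syntax or a Finili API; the names DataFile, OwnershipError, append_records, and copy_file are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DataFile:
    name: str
    owner: str                      # project that created the file
    records: list = field(default_factory=list)


class OwnershipError(Exception):
    """Raised when a project tries to modify a file it does not own."""


def append_records(project: str, file: DataFile, new_records: list) -> None:
    # Only the owning project may modify a file, even just to add data.
    if file.owner != project:
        raise OwnershipError(f"{project} does not own {file.name}")
    file.records.extend(new_records)


def copy_file(project: str, source: DataFile, new_name: str) -> DataFile:
    # Copying creates a new file tagged with the copying project.
    # The source file is left untouched.
    return DataFile(name=new_name, owner=project, records=list(source.records))
```

In this model, a project that wants to change another project's data first calls copy_file and then works on its own copy, which mirrors the role proposed for the Copy applet.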

Using this strict approach to file ownership, it is possible for Finili to pass along most security issues to the operating system level. This approach also makes it possible for Finili to run without modifications in a sandbox, an operating system mechanism that restricts system access for security purposes.

Data Libraries

Within the project, data libraries are predefined by the Finili language and run-time environment. The same five (or six) data library names appear (potentially) in every project. Each data library has a clearly distinct purpose.

The syntax working group selected short names of Latin origin for the data libraries. Although these are Latin words, they should already be familiar to most computer programmers.

Pre
["Before"] Original source data; input data. This includes all Finili data files acquired from other projects and all text data and media files acquired from external sources. All data files dropped into the project are placed in this data library. The Pre data library will often include aliases of files stored in other directories. Within a program, this data library is read-only. (A program can, however, delete files. In the case of an alias, the program can delete only the alias.)
Post
["After"] Output data to save, deliver, or publish. This generally includes final Finili data files that may be saved for archival purposes, for subsequent analysis, or for delivery to another Finili project. It also includes all output documents, including media files, web pages, and XML documents. A Finili program cannot modify a file in this data library except to append to it. The one exception is that a program can replace a file that it created in a previous run. Program units are prevented from replacing files in the Post data library that were created by other program units.
Sub
["Below," which may imply "hidden" or "secret"] The project's working files, mostly Finili data files. Run-time logs are also included in the Sub data library (as XML). A set of options controls how files in this data library can be overwritten. An applet is available (presumably Delete) that can delete all or selected files from this data library. Deleting everything in the Sub data library ensures that a run starts fresh from the Pre data. For complex projects, this will typically be the largest data library.
Temp
[Abbreviation for "time," which may imply "a limited time"] Temporary files, mostly Finili data files. The running program may delete these files at any convenient time. In any case, they are deleted when the program ends. Some mechanism is needed to ensure that files used later in the same program are retained. If these files are not especially large, they may be kept in memory and never written to disk.
Via
["Road"] Finili data files that are passed along, a few records at a time, for use in another simultaneous program unit. This is similar to a pipe in Linux. The files are not written to disk. There are limitations on how the files can be accessed (i.e., sequential, single-pass, with some properties not necessarily known). It is an error to use the Via data library as input to a two-pass process, or to a program unit that is not simultaneous; the compiler will suggest moving the file to the Temp data library.
Ex
["Out of"] It has not yet been determined whether a separate data library name is needed for output files that are write-only from the point of view of the project. If it is, the name Ex is proposed for this purpose. If not, the Post data library will be used for these files.
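The single-pass restriction on the Via data library can be pictured with a Python generator, which behaves much like the pipe mentioned above. This is a rough sketch under stated assumptions: ViaStream, ViaError, and producer are hypothetical names, and the check happens at run time here, whereas the proposal has the Finili compiler report the error.

```python
class ViaError(Exception):
    """Raised when a Via stream is read more than once."""


class ViaStream:
    """Single-pass, in-memory stream of records between two program units."""

    def __init__(self, records):
        self._records = iter(records)
        self._consumed = False

    def __iter__(self):
        # A second pass is an error; the data would need to live in Temp instead.
        if self._consumed:
            raise ViaError("Via data can be read only once; consider Temp")
        self._consumed = True
        return self._records


def producer():
    # Records are produced lazily, a few at a time, and never written to disk.
    for n in range(5):
        yield {"id": n, "value": n * n}


stream = ViaStream(producer())
total = sum(record["value"] for record in stream)   # the first pass succeeds
```

A consumer that needs to read the records twice would fail on the second loop, which is the situation in which the compiler is expected to suggest the Temp data library.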

It is suggested that the data library names not be used as names of Finili data sets. It has not yet been determined whether the compiler will enforce this restriction.
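The write rules stated for the data libraries can be condensed into a single check. The Python sketch below is one reading of those rules, not a specification; Library and may_write are hypothetical names. It leaves out the overwrite options for the Sub data library and the undecided Ex data library.

```python
from enum import Enum


class Library(Enum):
    PRE = "Pre"
    POST = "Post"
    SUB = "Sub"
    TEMP = "Temp"
    VIA = "Via"


def may_write(library, file_exists, created_by_same_unit=False, append=False):
    """Return True if a program unit may write to a file in the given library."""
    if library is Library.PRE:
        return False              # read-only within a program
    if library is Library.POST:
        if not file_exists:
            return True           # creating a new output file
        # Existing files can be appended to by any unit, but replaced
        # only by the program unit that created them.
        return append or created_by_same_unit
    return True                   # Sub, Temp, Via: not restricted in this sketch
```

For example, may_write(Library.POST, file_exists=True) is False for a plain replacement, but becomes True with append=True or created_by_same_unit=True.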

User Interface for Data Libraries

Data libraries are displayed as folders in the project navigation pane. The Pre and Post data libraries are always displayed; the others appear only after files are defined there. Users can access, delete, and define files using the navigation pane interface. Users can view data in the Pre, Post, and Sub data libraries. With appropriate restrictions, users may be able to view data in the Temp data library. All data libraries appear equally in run-time visualization.

Finili requires that Finili data files, including their variables, be declared at the project level before programs can access them. A separate version of the file icon, which appears hollow or empty, will identify Finili data files that have been defined but that do not contain any records at the moment. The user can view the variable definitions and other declared file properties of an empty Finili data file but should not expect to see any actual data.
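The distinction between a declared file and a populated one can be modeled as follows. This is a minimal Python sketch, not Finili code; DeclaredFile, the icon property, and the variable types shown are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DeclaredFile:
    name: str
    variables: dict                 # variable name -> declared type (hypothetical types)
    records: list = field(default_factory=list)

    @property
    def icon(self):
        # Hollow icon: declared at the project level but holding no records.
        return "hollow" if not self.records else "filled"


# The declaration exists before any program has written a record.
orders = DeclaredFile("orders", {"id": "integer", "amount": "decimal"})
```

Here orders.variables can be inspected right away, while orders.icon stays "hollow" until the first record is added.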