Monday, September 19, 2011

Another Data Library

One outcome of the discussion at the Finili Developer Summit was one more library to add to the list. This one is not so much a data library as a document library. The “Re” library is meant to hold the mission statement, documentation, and other documents that relate to the project but are neither used as data by the project nor created by its programs. Files in the Re library are not accessible as data to programs in the project. They are accessible in their entirety as documents (for example, to be added to a page layout or converted to another file format) or as templates into which output from the project’s programs may be incorporated to create new documents.

The Re library might also contain project-level metadata, such as a project title. This kind of item is presumably accessible in some manner for use in the programs.

Re
["About"] Documents. This includes descriptive documents related to the project and other documents, media, and document templates that may be used in the project as documents rather than as data.

Sunday, September 11, 2011

The Dog Ate My Data

The line to remember from the discussion at finds2011 was “The dog ate my data.” One of the stated goals of Finili is to produce more accurate results. One of the classic errors in handling data is leaving out part of the data. We heard some wild examples of ways this can happen. The one that sticks in my mind is the story where one disk was missing, perhaps stolen, from a disk array, resulting in the loss of the records that were stored on that disk. There were many other stories, but from the end user’s point of view, they were all variations of the same excuse. You might as well be saying, “The dog ate my data.”

Regardless of the specific cause, errors won’t slip through undetected so often if there is a way for the programmer to immediately see that something is off. We think we can make that happen with real-time data visualizations that display while the program is processing the data. These would take the place of the hourglass icon that some programs display while they are running. They would show, in a compressed form, what the program is doing with the data, like an oscilloscope, or perhaps, as someone suggested, as a kind of live infographic. Infographics are part of our objective anyway, so why would we withhold them until the end of the process?
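To make the idea concrete, here is a minimal sketch in Python of a text-only "oscilloscope" that updates while records are being processed. Everything in it is an assumption made for illustration: the names sparkline and process_with_display, the idea of tagging each record with the partition it came from, and the one-line display format. None of this is Finili syntax or a settled design.

```python
from collections import Counter

BLOCKS = " ▁▂▃▄▅▆▇█"  # index 0 is blank: nothing seen from that partition


def sparkline(counts):
    """Render a list of counts as a one-line bar chart, scaled to the largest."""
    top = max(counts, default=0)
    if top == 0:
        return " " * len(counts)
    steps = len(BLOCKS) - 1
    # Ceiling division, so any nonzero count gets at least the smallest bar.
    return "".join(BLOCKS[(c * steps + top - 1) // top] for c in counts)


def process_with_display(records, partitions, every=1000, show=print):
    """Process (partition, payload) records, showing a live summary line.

    `partitions` lists every partition we expect to hear from, so a
    partition that never delivers a record shows up as a visible gap.
    """
    seen = Counter()
    total = 0
    for partition, _payload in records:
        seen[partition] += 1
        total += 1
        # ... the real work on the record would happen here ...
        if total % every == 0:
            show(f"{total:>8} |{sparkline([seen[p] for p in partitions])}|")
    show(f"{total:>8} |{sparkline([seen[p] for p in partitions])}| done")
    return seen


# Hypothetical run: disk1 is missing from the array, so it sends nothing.
records = [("disk0", i) for i in range(4)] + [("disk2", i) for i in range(4)]
lines = []
counts = process_with_display(
    records, ["disk0", "disk1", "disk2"], every=4, show=lines.append
)
```

In this run the final summary line has a blank column where disk1 should be, which is the kind of thing a programmer notices at a glance and a final total alone would hide.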

There were many signs of progress at finds2011, but the boldness of this one discussion sums it all up. If a group of people can discuss Finili and the fine points of the way it might work, it shows that the design of Finili is far enough along that it is starting to look solid. The most difficult work might still be ahead, but at least at this point we are all assured that we are working on, and talking about, the same thing.

Saturday, September 3, 2011

Finili Developer Summit 2011 Plymouth Meeting


This year’s Finili Developer Summit will be held in Plymouth Meeting, Pennsylvania, the same location as last year, on the afternoon of September 10. Discussion items this year will focus especially on syntax and style, though any topic related to the project could come up. There will be a review of the developments in syntax since last year, as well as a discussion of the recently hot issue of aliases. Another topic that may turn out to be more important than it seems at first blush is a review of the technical, compatibility, and security issues surrounding constant values.

The Summit also provides a chance to review recent work and reflect on how far we have come in the year since the first summit.

Sunday, June 19, 2011

Proposed Data Library Structure

Data library structure is a small design question, but it still has consequences for every corner of the Finili design. Do data files need to be formed into groups for ease of management, and if so, what groups, and how are they formed and accessed? The syntax working group collected thoughts on security, data, and user interface issues and is proposing the following data library structure.

Project Ownership of Data Files

A Finili data file is tagged with and belongs to the project that creates it. It is anticipated that files will typically be copied using external tools to deliver them to other projects that may use them. A project cannot directly modify a file belonging to another project (for example, to add more data). A project can, however, receive transaction requests from other projects (or elsewhere) and incorporate them into its own files.

A project can take control of another project's Finili data only by creating a new file and copying the data into it. An applet (presumably Copy) is provided for this purpose.

Using this strict approach to file ownership, it is possible for Finili to pass along most security issues to the operating system level. This approach also makes it possible for Finili to run without modifications in a sandbox, an operating system device that restricts system access for security purposes.
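The ownership rules above can be modeled in a few lines. The following Python sketch is only an illustration of the rules as stated; the class and method names (DataFile, Project, copy, request, apply_requests) are hypothetical and are not part of any Finili API.

```python
from dataclasses import dataclass, field


class OwnershipError(PermissionError):
    """Raised when a project tries to modify a file it does not own."""


@dataclass
class DataFile:
    name: str
    owner: str  # name of the project that created the file
    records: list = field(default_factory=list)


@dataclass
class Project:
    name: str
    inbox: list = field(default_factory=list)  # pending transaction requests

    def create(self, name):
        return DataFile(name, owner=self.name)

    def append(self, file, record):
        # A project can modify only the files it owns.
        if file.owner != self.name:
            raise OwnershipError(f"{self.name} does not own {file.name}")
        file.records.append(record)

    def copy(self, file, new_name):
        # Taking control of another project's data means making a new file.
        return DataFile(new_name, owner=self.name, records=list(file.records))

    def request(self, record):
        # Other projects submit requests; they never touch the file itself.
        self.inbox.append(record)

    def apply_requests(self, file):
        while self.inbox:
            self.append(file, self.inbox.pop(0))


sales = Project("sales")
audit = Project("audit")
ledger = sales.create("ledger")
sales.append(ledger, {"amount": 10})

try:
    audit.append(ledger, {"amount": 99})
    blocked = False
except OwnershipError:
    blocked = True

sales.request({"amount": 5})    # audit (or anyone) asks sales to add a record
sales.apply_requests(ledger)
audit_copy = audit.copy(ledger, "ledger_copy")
audit.append(audit_copy, {"amount": 1})
```

Because every write goes through the owner, the only permission question left for the operating system is which project may touch which directory, which is the point of the strict approach.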

Data Libraries

Within the project, data libraries are predefined by the Finili language and run-time environment. The same five (or six) data library names appear (potentially) in every project. Each data library has a clearly distinct purpose.

The syntax working group selected short Latin cognate names for the data libraries. Although these are Latin words, they should already be familiar to most computer programmers.

Pre
["Before"] Original source data; input data. This includes all Finili data files acquired from other projects and all text data and media files acquired from external sources. All data files dropped into the project are placed in this data library. The Pre data library will often include aliases of files stored in other directories. Within a program, this data library is read-only. (A program can, however, delete files. In the case of an alias, the program can delete only the alias.)
Post
["After"] Output data to save, deliver, or publish. This generally includes final Finili data files that may be saved for archival purposes, subsequent analysis, or to deliver to another Finili project. It also includes all output documents, including media files, web pages, and XML documents. A Finili program cannot modify a file in this data library other than by appending to it, with one exception: a program can replace a file that it created in a previous run. Program units are prevented from replacing files in the Post data library that were created by other program units.
Sub
["Below," which may imply "hidden" or "secret"] The project's working files, mostly Finili data files. Run-time logs are also included in the Sub data library (as XML). A web of options controls how files in this data library can be overwritten. An applet is available (presumably Delete) that can delete all or selected files from this data library. Deleting the Sub library ensures that a run is starting fresh with the Pre data. For complex projects, this will typically be the largest data library.
Temp
[Abbreviation of the Latin tempus, "time," which may imply "a limited time"] Temporary files, mostly Finili data files. The running program may delete these files at any convenient time. In any case, they are deleted when the program ends. Some mechanism is needed to ensure that files used later in the same program are retained. If these files are not especially large, they may be kept in memory and never written to disk.
Via
["Road"] Finili data files that are passed along, a few records at a time, for use in another simultaneous program unit. This is similar to a pipe in Linux. The files are not written to disk. There are limitations on how the files can be accessed (i.e., sequential, single-pass, with some properties not necessarily known). It is an error to use the Via data library as input to a two-pass process, or to a program unit that is not simultaneous; the compiler will suggest moving the file to the Temp data library.
Ex
["Out of"] It has not yet been determined whether a separate data library name is needed for output files that are write-only from the point of view of the project. If it is, the name Ex is proposed for this purpose. If not, the Post data library will be used for these files.
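The write rules for the data libraries listed above can be summarized as a single permission check. This Python sketch is an assumption-laden summary, not a specification: the action names (read, create, append, replace, delete) and the function may are invented here, the proposal does not say whether programs can delete from Post, and the "web of options" for Sub is reduced to "allowed."

```python
from enum import Enum
from typing import Optional


class Library(Enum):
    PRE = "Pre"
    POST = "Post"
    SUB = "Sub"
    TEMP = "Temp"
    VIA = "Via"


def may(library: Library, action: str, unit: Optional[str] = None,
        creator: Optional[str] = None) -> bool:
    """Return True if program unit `unit` may perform `action` on a file.

    `creator` is the program unit that created the file, if it exists.
    Hypothetical actions: read, create, append, replace, delete.
    """
    if action == "read":
        return True
    if library is Library.PRE:
        # Read-only within a program, except that files can be deleted.
        return action == "delete"
    if library is Library.POST:
        if action in ("create", "append"):
            return True
        if action == "replace":
            # Only the unit that created the file may replace it.
            return creator is not None and creator == unit
        # Deleting from Post is not addressed in the proposal;
        # this sketch conservatively refuses.
        return False
    if library is Library.VIA:
        # Via files are written once, sequentially.
        return action in ("create", "append")
    # Sub and Temp: assumed unrestricted here. The real rules for Sub
    # depend on options that have not been specified yet.
    return True
```

For example, may(Library.POST, "replace", unit="report", creator="load") is False, because the report unit did not create the file, while the same call with creator="report" is True.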

It is suggested that the data library names not be used as names of Finili data sets. It has not yet been determined whether the compiler will enforce this restriction.
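The single-pass restriction on the Via data library, described in the list above, is easy to picture with a Python iterator. In the proposal the compiler catches a second pass; this sketch catches it at run time instead, purely for illustration. ViaStream and the record layout are hypothetical names.

```python
class ViaStream:
    """A stream of records that can be read sequentially, exactly once."""

    def __init__(self, name, source):
        self.name = name
        self._source = iter(source)  # records are produced lazily, never stored
        self._opened = False

    def __iter__(self):
        if self._opened:
            raise RuntimeError(
                f"Via.{self.name} is single-pass; "
                f"consider moving it to the Temp data library"
            )
        self._opened = True
        return self._source


def producer():
    # Stands in for a simultaneous program unit emitting records.
    for i in range(5):
        yield {"n": i}


stream = ViaStream("readings", producer())
total = sum(record["n"] for record in stream)  # first pass: fine

try:
    list(stream)  # second pass: an error, as in a two-pass process
    second_pass_error = None
except RuntimeError as exc:
    second_pass_error = str(exc)
```

The error message mirrors the suggestion the compiler is expected to make: a file that needs two passes belongs in Temp, not Via.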

User Interface for Data Libraries

Data libraries are displayed as folders in the project navigation pane. The Pre and Post data libraries are always displayed; the others appear only after files are defined there. Users can access, delete, and define files using the navigation pane interface. Users can view data in the Pre, Post, and Sub data libraries. With appropriate restrictions, users may be able to view data in the Temp data library. All data libraries appear equally in run-time visualization.

Finili requires that Finili data files, including their variables, be declared at the project level before programs can access them. A separate version of the file icon, which appears hollow or empty, will identify Finili data files that have been defined but that do not contain any records at the moment. The user can view the variable definitions and other declared file properties of an empty Finili data file but will not expect to view the actual data.
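A small Python sketch shows how declaration before use and the hollow icon might fit together. The names FileDeclaration and ProjectCatalog, the variable type names, and the "hollow" and "filled" icon labels are all assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FileDeclaration:
    name: str
    variables: dict  # variable name -> type name, declared up front
    records: list = field(default_factory=list)

    @property
    def icon(self):
        # Declared but holding no records: show the hollow icon.
        return "filled" if self.records else "hollow"


class ProjectCatalog:
    """Project-level registry of declared Finili data files."""

    def __init__(self):
        self._files = {}

    def declare(self, name, variables):
        decl = FileDeclaration(name, dict(variables))
        self._files[name] = decl
        return decl

    def open(self, name):
        # Programs can access only files that were declared first.
        try:
            return self._files[name]
        except KeyError:
            raise LookupError(
                f"{name!r} is not declared at the project level"
            ) from None


catalog = ProjectCatalog()
orders = catalog.declare("orders", {"id": "integer", "placed": "date"})
icon_before = orders.icon
catalog.open("orders").records.append({"id": 1, "placed": "2011-06-19"})
icon_after = orders.icon

try:
    catalog.open("returns")
    undeclared_error = None
except LookupError as exc:
    undeclared_error = str(exc)
```

The variable definitions are available from the moment the file is declared, which is why the user can inspect an empty file's properties even though there is no data to view.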

Friday, February 18, 2011

Tsedei Launches Engine Project

A consensus appears to have formed to split the project in half for management purposes. The “Tsedei” (time series event data engine initiative) working group takes primary responsibility for all data definition and engine development, including API, efficiency, and data conversion, while the language/compiler/environment effort stays put. Tsedei is the closest to delivering anything at this point, and we don’t want to slow Tsedei down by tying it to the difficult requirements process for language/compiler/environment. It is hoped that a working prototype of Finili can be delivered faster by making Finili a client of Tsedei. In addition, the split makes it easier to deliver more results from Tsedei that aren’t necessarily tied to Finili, and it should, we hope, lead to better energy efficiency in the products of both projects. Early errors may also be avoided by having thorough formal documentation of the interface between the (Finili) language and the (Tsedei) engine.

This is a change in the management process only. At this point everyone is continuing to work on the same things, which means people are working on both Finili and Tsedei. However, the formal split does make it possible for people to focus on just one project or the other if they choose. It also paves the way for a possible future operational split, though no one foresees any need for that in the next couple of years.