CTF Technical Specifications
Release pre-1 (Revision A) 25 Feb 1997
By Tom Wheeley
-----------------------------------------------------------------------------
Table of contents.
0 Rationale
0.1 Other formats
0.1.1 Arbitrary text standards
0.1.2 Screenshots
0.1.3 Link-Up format
0.1.4 .CAS format
0.1.5 .CAT format
0.2 Extensibility
0.3 Portability
0.4 Terminology
0.4.1 Calculator names
0.4.2 Tokens
1 Block structure
1.1 Currently supported blocktypes
1.1.1 @@Prog header
2 Data formats
2.1 Programs and Editor files
2.1.1 Program header (OldProg)
2.1.2 Program header (Editor)
2.1.3 Data format (OldProg and Editor)
2.1.3.1 Syntax
2.1.3.2 Token choices
2.1.3.3 Modifications for increased usability
2.1.3.4 Token separation
2.1.3.5 ctoken specification
2.1.3.6 Comments
2.1.3.7 Preprocessor
2.2 Text
2.3 Raw data
3 Other Issues
3.1 Implementation
3.2 More information
3.3 Credits
-----------------------------------------------------------------------------
0 Rationale
=====================
One of the main problems with Casio calculator programs is that they are
hard to represent on a normal computer. Although some of the special symbols
available on the calculator can be represented on some systems (for example
the squared symbol is in DOS), these are not compatible with all systems.
Hence, a standard for writing programs should stick to the basic printable
ASCII character set.
0.1 Other formats
~~~~~~~~~~~~~~~~~~~~~~~~~
Many other standards have been used to represent calculator programs, but
many of these are arbitrary or proprietary. Others are well specified, but
are not as well suited for typing in by humans.
0.1.1 Arbitrary text standards
These proliferate on the internet, where people have tried to represent
the programs they have written in a text form for easy downloading and
entering into the calculator.
They are unsuitable for several reasons:
* They cannot be accurately parsed by a computer program
* They must be fully documented either in the program file or with an
accompanying web page or document.
* The standards used vary from site to site, and thus users have to get used
to a large number of variations.
0.1.2 Screenshots
Some people have decided to present their programs as a series of
catenated screenshots. Whilst these are easy for people to read, they are
unsuitable for programs and cannot be viewed by people without graphical
terminals. The image size is generally larger than the corresponding text
file would be, leading to slower download times, and wasted disk space.
Screenshot representations are also harder to produce from the calculator
program than text files, and can only be realistically achieved using a cable.
0.1.3 Link-Up format
The Link-Up format (where all tokens longer than one character are enclosed
in $ signs), whilst easy to parse by a machine, is relatively unreadable by
people, and even more difficult to type.
0.1.4 .CAS format
The format used by Casio's CASIOLNK software for the 7700, 8700 and 9700
is a binary format based on the calculator's internal representation of the
data. This works well for programs to read and understand, but it is
unreadable by humans without special software. The newer style CAS format
used by the 9850 programs is even better as it allows for storage of multiple
types of data, but the former caveat still applies.
0.1.5 .CAT format
This is the format used by the LINK software released by Casio for the
9750, 9850 and 9950 range of calculators. It is similar to CTF, in that it
uses a set of plain text tokens, but it requires the usage of a `\' before
all tokens. This makes the text files very hard to read and difficult to
type, although as with the link-up format, it does make the text easier to
parse.
0.2 Extensibility
~~~~~~~~~~~~~~~~~~~~~~~~~
A vital feature of CTF is extensibility, so that it may be added to in order
to support new models of calculator, or new data formats which have been
deciphered.
To support this, CTF is a block structured format which compartmentalises
the individual data items into separate blocks. Any blocks which are not
understood by the software reading them can be ignored, and any readable
data after them can still be read.
CTF has already been extended to cope with the new 9850 data formats and
new tokens, and it should theoretically be possible to add support for the
Texas Instruments range of programmable graphical calculators also, should
the need arise.
0.3 Portability
~~~~~~~~~~~~~~~~~~~~~~~
CTF must be as portable as possible. In that respect, *all* data within
CTF must stick to the printable ASCII character set: chars 10,13,32-126.
Programs which read CTF should be able to cope with DOS (CRLF), Unix (LF) and
Apple Macintosh (CR) style newlines without complaining.
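The two requirements above can be sketched in a few lines of Python. This is
only an illustration, not part of the specification; the function names are
made up for the example. Note that the character list above does not include
tab (9), so this sketch reports a tab as illegal.

```python
# Characters permitted anywhere in a CTF file: LF, CR and printable ASCII.
ALLOWED = {10, 13} | set(range(32, 127))


def normalise_newlines(text: str) -> str:
    """Convert DOS (CRLF) and Macintosh (CR) newlines to Unix (LF)."""
    # CRLF must be replaced first, otherwise it would become two newlines.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def illegal_chars(text: str) -> list[tuple[int, str]]:
    """Return (offset, character) pairs for characters outside the CTF set."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) not in ALLOWED]
```

A reader would typically call normalise_newlines() once on the whole file
before any block handling, so that later stages only ever see LF.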
0.4 Terminology
~~~~~~~~~~~~~~~~~~~~~~~
0.4.1 Calculator names
Currently, there are two main families of calculator marketed by Casio.
I am only mentioning the models available in the UK, but similarly numbered
calculators are equivalent in other countries. The first family is the
fx7700, fx9700 and CFX-9800 calculators. I will refer to this family as
being the 9700-style calculators.
Secondly, there is the new range of calculators, the fx9750, fx9850 and
fx9950. These are characterised by the new structured additions to the
programming language, and are referred to as 9850-style calculators.
0.4.2 Tokens
Tokens are any single letter or function name present on the calculator.
For example, `X', `log ', ` ' and `Graph Y=' are all tokens. Every token
on the calculator is referred to by a code called a `ctoken'. This is
similar to the encoding used inside the calculator, but uses a simpler
coding scheme.
1 Block structure
===========================
The most important information regarding CTF for programmers is the correct
handling of the block structure. Strictly speaking, a program which simply
handles the blocks and nothing else is CTF-compliant.
Blocks are specified as follows:
@@Blocktype Name (Parameters)
Typical commonly used examples are:
@@Program "MYPROG" ; specifies a 9700 editor file / 9850 program
@@Program A (sd ssto) ; specifies a 9700 program, with the
; `standard deviation' and `store stats' bits set
The string `@@' must be in the left 2 columns of the screen to represent
the beginning of a new block. Blocks must *not* contain the string @@ at
the beginning of any line.
If a block is unsupported, then the program should either ignore the block
until it reaches another `@@', or read it in as plain text.
Matching of blocktype names is case insensitive, so @@OldProg, @@oldprog
and @@oLDPRoG should all work.
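As a rough sketch of the block handling described above, the following
Python function splits a CTF file into blocks. The function name and the
tuple layout are choices made for this example only. The header text after
the blocktype is returned unparsed, and any lines before the first `@@' are
simply dropped, which the specification does not rule on.

```python
def split_blocks(text: str) -> list[tuple[str, str, list[str]]]:
    """Split CTF text into (blocktype, header_rest, body_lines) tuples.

    blocktype is lower-cased because blocktype matching is case insensitive.
    """
    blocks = []
    for line in text.splitlines():  # copes with LF, CRLF and CR newlines
        if line.startswith("@@"):
            head = line[2:].split(None, 1)
            blocktype = head[0].lower() if head else ""
            rest = head[1] if len(head) > 1 else ""
            blocks.append((blocktype, rest, []))
        elif blocks:
            # Body line of the most recently opened block.
            blocks[-1][2].append(line)
    return blocks
```

A program that only understands some blocktypes can then skip the rest, or
keep their body lines as plain text, by testing the first element of each
tuple against its own list of supported names.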
1.1 Currently supported blocktypes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@OldProg ; Programs A-Z,r,theta and single programs on the 9700.
@@Editor ; 9700 Editor files / 9850 programs.
There are two other types designed to deal with unknown data:
@@Text ; plain text -- can be used to enter instructions for use
; of the programs / data
@@Raw ; designed to hold raw binary data read from .CAS files.
Future types may include:
@@SSMono ; monochrome screenshot
@@SSCol ; colour screenshot
@@Vars ; variable memories
@@Matrix ; matrix
@@List ; list
1.1.1 @@Prog header
Due to possible confusions over the OldProg and Editor types, these are
encapsulated within the umbrella @@Prog type. The actual type of an @@Prog
header can be ascertained from the Name as follows:
@@Prog ; OldProg
@@Prog A ; OldProg
@@Prog "EDITOR" ; Editor
Programs which write CTF should use @@Prog. People writing CTF should be
strongly encouraged to use @@Prog.
A synonym for @@Prog is @@Program. The two may be interchanged freely
without changing the syntax of the CTF file.
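Going only by the three examples above, the distinction appears to be
whether the Name is a quoted string. A minimal sketch under that assumption
(the function name is invented for the example):

```python
def prog_kind(header_rest: str) -> str:
    """Classify the text following @@Prog or @@Program.

    A quoted name is taken to mean an Editor file; a bare slot letter or
    no name at all is taken to mean an OldProg.
    """
    rest = header_rest.lstrip()
    return "Editor" if rest.startswith('"') else "OldProg"
```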
2 Data formats
========================
2.1 Programs and Editor files
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2.1.1 Program header (OldProg)
The 9700 series of calculators supports two types of program transmission.
The first, called `Single Program' does not specify a slot in the data type,
but the program can be positioned in the A-Z,r,theta range manually by the
user. The second is `Multiple Program' and the entire set of programs is
sent at once.
For multiple programs, a slot must be set in the header, thus:
@@Prog Z
Note that the program letter must be in upper case, so as to distinguish
between Prog R and Prog r.
The other set of information contained about a program is the mode for
which the program runs. This is organised as a bitfield, for which certain
bits can be set. These bits can be set by entering keywords into the
parameter field. The keywords are case-insensitive and are:
BASE-N Program runs in base-n mode
SD Standard Deviation mode
LR Linear Regression mode
MATRIX Matrix mode
SSTO Store statistics data
SDRAW Draw statistics graph
These can be combined using a space as a delimiter, eg:
@@Prog r (LR sSto sDraw)
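A possible way to read such a header is sketched below in Python. The
function name is hypothetical, and writing the theta slot as the word
`theta' is an assumption, since this document does not say how it is spelled
in a header. The modes are returned as a set of keyword names rather than as
a bitfield, because the bit positions are not given here.

```python
import re
from typing import Optional

MODE_KEYWORDS = {"BASE-N", "SD", "LR", "MATRIX", "SSTO", "SDRAW"}
# Slot letters are case sensitive so that R and r stay distinct.
SLOTS = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ") | {"r", "theta"}


def parse_oldprog_header(rest: str) -> tuple[Optional[str], set[str]]:
    """Parse the text after @@Prog into (slot, set of mode keywords)."""
    # Optional slot name, then an optional parenthesised parameter list.
    m = re.match(r"\s*([^\s(;]+)?\s*(?:\(([^)]*)\))?", rest)
    slot = m.group(1)
    if slot is not None and slot not in SLOTS:
        raise ValueError(f"unknown program slot: {slot!r}")
    modes = set()
    for word in (m.group(2) or "").split():
        key = word.upper()  # mode keywords are case insensitive
        if key not in MODE_KEYWORDS:
            raise ValueError(f"unknown mode keyword: {word!r}")
        modes.add(key)
    return slot, modes
```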
2.1.2 Program header (Editor)
The header for an Editor file is simpler, as there is much less information
stored. The syntax is:
@@Prog "FILENAME" (PASSWORD)
The maximum length for the filename / password is 12 characters for the
9700 family, and 8 characters for the 9850 family. Longer filenames /
passwords will need to be truncated. Note that the data and the password
are both unencrypted in a CTF text file.
It is important to note that the filename and password must be parsed as a
line of tokens, not as a line of ascii text, as they are of course made up
as a set of tokens within the calculator.
Only a very limited range of tokens is allowed in filenames and passwords.
The unconfirmed list for the 9700 (and probably the 9850 too) is:
A-Z 0-9 "'[ ]+-*/~ r and theta.
Note that +-*/ mean the arithmetical operators, _not_ the ascii symbols
used here as a representation. (There is a difference)
2.1.3 Data format (OldProg and Editor)
Perhaps the most difficult and important section of the CTF specification
is the specification of program data. In the calculator, the program is
stored as a series of byte codes, from which the ctoken numbering system is
devised (the actual encoding of tokens is beyond the scope of this document,
see the .CAS file specifications for more information). What is important
to CTF is the translation of textual tokens to and from ctoken numbers.
2.1.3.1 Syntax
CTF programs are made up of a series of tokens, separated by any amount
of whitespace, except within "strings" where space (ascii 32) is an actual
token in its own right. Note that tokens may also be separated by *no*
whitespace whatsoever:
is equivalent to
TAB
Tokens may contain any character permitted by the general outline of the
CTF format, ie printable ASCII. Also of note is that newlines can be, and
are, parts of tokens. They should be handled as such and not hardcoded as
special cases in the CTF accessing program, as any token can include a newline.
(A prime example of this is the disp triangle token)
Unlike most of the CTF format, program tokens are case sensitive. This is
to enable differentiation between upper and lower case letters and to enable
accurate matches for tokens within strings.
This strict handling of whitespace means that care must be exercised both
when defining the set of tokens used for programs, and when reading CTF
program data. To standardise the output of algorithms used to match the
data, at any point whilst reading data, the token matched should be the
longest token possible starting from that point. Thus:
`RePlot'
will match to be `ReP'`l'`o'`t' rather than `R'`e'`Plot'.
Lines of program data *should* be less than 80 characters wide.
Implementors should try to use algorithms which can cope with unlimited
line length, but should support lines of at least 255 characters.
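The longest-match rule can be sketched as follows. The token table here is
a tiny stand-in containing only what the `RePlot' example needs; a real table
is model specific and much larger. The sketch tries tokens before skipping
whitespace, so tokens that themselves contain spaces or newlines still match,
but it does not track whether it is inside a "string".

```python
# Hypothetical miniature token table, just enough for the example.
EXAMPLE_TOKENS = {"R", "e", "P", "l", "o", "t", "ReP", "Plot"}


def tokenise(data: str, tokens: set[str]) -> list[str]:
    """Greedy longest-match tokeniser; separator whitespace is skipped."""
    longest = max(map(len, tokens))
    out, i = [], 0
    while i < len(data):
        # Try the longest candidate first, then progressively shorter ones.
        for size in range(min(longest, len(data) - i), 0, -1):
            if data[i:i + size] in tokens:
                out.append(data[i:i + size])
                i += size
                break
        else:
            if data[i].isspace():
                i += 1  # whitespace that is not part of any token
            else:
                raise ValueError(f"no token matches at offset {i}")
    return out
```

With this table, tokenise("RePlot", EXAMPLE_TOKENS) yields the tokens
ReP, l, o, t, while the spaced form "R e Plot" yields R, e, Plot.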
2.1.3.2 Token choices
This section is not intended as a list of tokens, but merely to outline
the guidelines which were followed in creating the token table.
1) All text-only tokens (eg `cos') are typed exactly as on the calculator
2) Various symbols were chosen for commonly used operators; eg +-*/ for
the basic arithmetical operators, (-) for unary negate, % for
fraction delimiting and _ for the disp triangle token.
3) Symbols which exist on the calculator but are used in rule 2) are
permitted, but preceded with a backslash. This is used for symbols
such as `/' which are not commonly used on the calculator.
4) All other tokens which are not easily represented using the ASCII
character set are given a descriptive name and preceded with a
backslash. Note this includes tokens such as \asin for arcsin.
An important feature of tokens in CTF is that they include the implicit
space as well as the text which makes up the token. For example, the token
`Abs' is in fact `Abs ', whereas the token `Rnd' is still `Rnd' as that has
no trailing space on the calculator.
2.1.3.3 Modifications for increased usability
Although all CTF programs should output the standard tokens mentioned
above, keeping to exact capitalisation and spacing is too much to ask for the
user, so when reading CTF program data three exceptions can be made:
1) The token may be typed entirely in lower-case (except for variable
names A-Z, and variable names in tokens, such as `graph Y=').
2) The trailing space may be omitted.
3) For the more commonly used tokens such as \asin and \sqrt, the preceding
backslash may be omitted.
The above rules are *not* followed if the token is within a string in the
program, as there the text "logical" would be read as "log ical" as the `log'
at the beginning would be matched. However, if the user had the text
"Linear", then the `Line' would be matched, and would result in no
discernible difference to the program other than smaller size, so this
matching is encouraged.
2.1.3.4 Token separation
Although very unlikely (I have never encountered a case), the CTF program
data specification allows for tokens to be manually separated to ensure
correct parsing. For example the semantically incorrect token pair:
`-'`>'
would be interpreted as meaning `->' (assignment arrow). In a program this
can be separated by a space (and cause an error in the program), but this
option is not possible in a string (where the combination is legal) where a
space would be undesirable. Hence in this situation the token separator `|'
would be used:
"-|>"
which will match to `-'`>' rather than `->'.
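One way to honour the separator in a longest-match reader is sketched
below. The four-entry token table and the function name are invented for the
example. The separator is consumed without producing a token, which forces
the match on either side of it to start afresh.

```python
def tokenise_with_separator(data: str, tokens: set[str], sep: str = "|") -> list[str]:
    """Longest-match tokeniser that treats `sep` as a pure token boundary."""
    out, i = [], 0
    while i < len(data):
        if data[i] == sep:
            i += 1  # separator: emits nothing, just splits the input
            continue
        # Longest token in the table that starts at position i, if any.
        match = max((t for t in tokens if data.startswith(t, i)),
                    key=len, default=None)
        if match is None:
            raise ValueError(f"no token matches at offset {i}")
        out.append(match)
        i += len(match)
    return out


# Hypothetical table: minus, greater-than, assignment arrow, string quote.
ARROW_TOKENS = {"-", ">", "->", '"'}
```

Here the input "->" inside quotes gives the single arrow token between the
two quote tokens, whereas "-|>" gives separate `-' and `>' tokens.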
2.1.3.5 ctoken specification
If for some reason a specific ctoken number is wanted in the code when
sent to the calculator, this number can be specified in the CTF data using
the syntax:
@bnnn
where b is the base and nnn is a three digit number (must be padded with
zeros if smaller than 100 in the base b).
Legal bases are:
d decimal
h or x hexadecimal
o octal
and the number must be in the range 1 <= n <= 767.
For example, to obtain a `|' character on the calculator screen the ctoken
number must be expressly declared, as the `|' character is already used as
a control character in CTF. The ctoken code for `|' is 124, so the token:
@d124
evaluates to a `|' character.
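A small Python sketch of reading this syntax follows. The function name is
made up, and treating the base letter as lower case only is an assumption,
since only lower-case letters are listed above.

```python
import re

BASES = {"d": 10, "h": 16, "x": 16, "o": 8}
# '@', one base letter, then exactly three digit characters.
CTOKEN_RE = re.compile(r"@([dhxo])([0-9A-Fa-f]{3})")


def parse_ctoken(text: str) -> int:
    """Return the ctoken number written as @bnnn, or raise ValueError."""
    m = CTOKEN_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"not a ctoken specification: {text!r}")
    # int() itself raises ValueError if a digit is illegal in the base.
    value = int(m.group(2), BASES[m.group(1)])
    if not 1 <= value <= 767:
        raise ValueError(f"ctoken number out of range: {value}")
    return value
```

So @d124, @h07C and @o174 all give 124. One consequence of the three digit
rule is that octal can only reach 511 (octal 777), so the upper part of the
range is reachable only in decimal or hexadecimal.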
2.1.3.6 Comments
In addition to the calculator's own comments, CTF supports two types of
comment which should be ignored by any software reading a CTF file. The
exact definition of the comments may vary between data types, so comments
will be defined for each data type, rather than globally.
In general, the comment characters are # and ; Stylistically, # should
be used for block comments with the # at the beginning of each line. ;
should be used for single line comments on the end of lines with data on.
In Program Data, both ; and # are legal tokens, so they are only deemed to
start a comment if they are preceded by a space, tab or newline. This
definition works well for program sections as # is only used in the token
`Ran#' and `;' is generally used straight after the number it is referring
to (used to denote frequency in statistics modes). However, it breaks down
in strings as the `#' is often used for visual impact, and can easily be used
with a space before it. Thus comments cannot start on lines which have an
unclosed string. eg:
# example of comment problems
"text1" ; this is a legal comment
"text2 ; this is part of the text!
more text2" ; this is a legal comment
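The comment rule for program data might be implemented along these lines.
This is a sketch only: it assumes that every `"' character toggles the string
state (no escaping), and the function name is invented. The string state is
carried from line to line, which is what makes the third example line above
come out as text.

```python
def strip_comments(lines: list[str]) -> list[str]:
    """Remove # and ; comments from program data, respecting strings."""
    out = []
    in_string = False  # carried across lines: strings may span newlines
    for line in lines:
        kept = len(line)
        for i, ch in enumerate(line):
            if ch == '"':
                in_string = not in_string
            elif (ch in "#;" and not in_string
                  and (i == 0 or line[i - 1] in " \t")):
                # Comment character at line start or after a space or tab.
                kept = i
                break
        out.append(line[:kept])
    return out
```

The whitespace before a comment character is left in place, so a line such
as `"text1" ; note' comes back as `"text1" ' with its trailing space.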
2.1.3.7 Preprocessor
Although not defined yet, various pieces of syntax for a preprocessor are
built into the CTF format. Modelled basically on the C preprocessor, it will
include such commands as:
$define variable value
$ifdef variable
$else
$endif
$ifndef variable
$warning text
$error text
as well as perhaps:
$define variable z
$if ${variable} == y
$error z==y
$endif
Defines can be accessed in code similarly to Makefiles and shell variables:
Range 1,${_scrwidth},0,1,63,0
is an example which would allow portability between 7700 and 9700 graphical
programs. (with other code, obviously). The $ifdef will allow things like
$ifdef _9850+
0->A~Z
$else
Mcl
$endif
Various predefines (with leading underscore) should also be included in the
software used, eg:
Models
_7700 _7700+ _9700 _9700+ _9800 _9800+ _9850 _9850+ _9950 _9950+
Capabilities (derived from model info by CTF software)
_colour _color _matrix _complex _scrwidth _scrheight _array _list
This is the section of CTF I am most uncertain about as I have not
implemented it in software myself, so I haven't run into any problems in the
standard which might require changing.
The main aim behind the preprocessor segment of CTF is to encourage
portable programs which will instantly transfer to the widest range of
calculators. The problem with the preprocessor is that it may confuse
users entering programs by hand, especially if there is a lot of trickery.
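Since the preprocessor is not yet defined, the following Python sketch is
speculative. It shows one way $define, $ifdef, $ifndef, $else, $endif, $error
and ${name} expansion could behave; $if and $warning are left out. The
function name is invented, and the screen width values used in the usage
note are placeholders, not taken from this document.

```python
import re


def preprocess(lines: list[str], defines: dict[str, str]) -> list[str]:
    """Apply a guessed subset of the CTF preprocessor to program lines."""
    out = []
    stack = []  # one flag per open $ifdef/$ifndef: is its branch active?
    for line in lines:
        words = line.split()
        cmd = words[0] if words else ""
        if cmd in ("$ifdef", "$ifndef"):
            present = words[1] in defines
            stack.append(present if cmd == "$ifdef" else not present)
        elif cmd == "$else":
            stack[-1] = not stack[-1]
        elif cmd == "$endif":
            stack.pop()
        elif all(stack):  # only act when every enclosing branch is active
            if cmd == "$define":
                defines[words[1]] = " ".join(words[2:])
            elif cmd == "$error":
                raise ValueError(" ".join(words[1:]))
            else:
                # Replace each ${name} with the value of the define.
                out.append(re.sub(r"\$\{([^}]+)\}",
                                  lambda m: defines[m.group(1)], line))
    return out
```

Given the $ifdef example above followed by the Range line, a define table
containing `_9850+' and a `_scrwidth' of 127 would produce `0->A~Z' and
`Range 1,127,0,1,63,0'; without `_9850+' the first line becomes `Mcl'.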
2.2 Text
~~~~~~~~~~~~
Text is a very simple format and can be regarded as providing a way of
including large `comments' in a program. The Text blocktype is intended for
CTF programs to use as a default type for blocks they don't understand, but
it should be useful for users to be able to use the type directly.
A CTF reading program should simply read in the lines of text as they are
and either discard them or store them depending on what the action of the
program is.
As for all blocks, a Text block ends on a line beginning with @@. Care
should be taken that no line of text starts with this either.
Comments are meaningless in text blocks.
2.3 Raw data
~~~~~~~~~~~~~~~~
Raw is a datatype invented to cope with receiving unknown data from a
calculator, yet still allowing CTF to cope with all data possible. In order
to stay within the condition that all of CTF must be printable, the raw
binary data is translated into a hex dump. This allows not only easy reading
and writing, but also allows calculator hackers to edit the binary data and
send it back to the calculator.
The exact format of Raw data is undetermined at this point. A useful
raw datatype would be one which generates checksums on the fly rather than
including them in the data. Headers and data blocks from the calculator
transmission should also be in separate blocks.
Generally the data should be arranged in lines of 16 bytes:
34 f5 38 fd ... (16 hex pairs)
Any text after the 16 hex pairs representing each byte is ignored, so CTF
outputting programs would be perfectly at liberty to add a 16 character wide
text conversion of the data, as used in the DOS program `debug' for example.
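Because the Raw format is still undetermined, the following is only a sketch
of the 16 bytes per line layout described above; the function names are
invented. The reader takes at most 16 leading hex pairs from each line and
stops at the first word that is not a hex pair. On a short final line this
could misread trailing text that happens to look like hex pairs, which a
finished format would need to address.

```python
import string


def to_hex_lines(data: bytes, width: int = 16) -> list[str]:
    """Format bytes as lines of space-separated hex pairs."""
    return [data[i:i + width].hex(" ") for i in range(0, len(data), width)]


def from_hex_lines(lines: list[str], width: int = 16) -> bytes:
    """Read hex pairs back, ignoring any text after the pairs on a line."""
    out = bytearray()
    for line in lines:
        for word in line.split()[:width]:
            if len(word) != 2 or any(c not in string.hexdigits for c in word):
                break  # first non-pair ends the data on this line
            out.append(int(word, 16))
    return bytes(out)
```

Round-tripping any byte string through to_hex_lines() and from_hex_lines()
returns the original bytes, and a trailing text column after a full line of
16 pairs is ignored on reading.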
3 Other Issues
========================
3.1 Implementation
~~~~~~~~~~~~~~~~~~~~~~~~~~
* Model
The token<->ctoken translation differs for different calculator models,
even within the same series, so CTF reading programs will need to know
the model for which they are working.
3.2 More information
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
More information on Casio Calculator Hacking, and the Calculator Text Format
can be found on the World Wide Web at:
http://web.archive.org/web/20041120090124/http://www.tsys.freeserve.co.uk/casio/
3.3 Credits
~~~~~~~~~~~~~~~~~~~
Tom Lynn
For making life difficult
Roy Maclean
For ideas
Magnus Werner
For ideas and the Casio Calculator Mailing List
Casio corporation
For making life difficult