Wednesday, February 22, 2012
   
Text Size

Site Search powered by Ajax

Lesson 6 Universe Indexes part 3

SUBROUTINE TEACHER.TO.STUDENT (STUDENT.IDS, TEACHER.ID)** Returns a list of STUDENT keys for a supplied TEACHER id.* Version 3 – accessing the raw index file*COMMON /TEACHER.TO.STUDENT/OPEN.FILE,FILE.STUDENT_TEACHER.ID*RESULT = ""IF NOT(OPEN.FILE) THEN   OPEN "","STUDENT_TEACHER.ID" TO FILE.STUDENT_TEACHER.ID ELSE RETURN   OPEN.FILE = @TRUEEND*READ STUDENT.IDS FROM FILE.STUDENT_TEACHER.ID, TEACHER.ID THEN  STUDENT.IDS = LOWER(STUDENT.IDS)END ELSE   STUDENT.IDS = ""END

RETURN

 

This example is shown to mainly to give an idea of how you might deal with pointers to the raw indexes from within a file.  Clearly, we’d want to simplify this one step further:

 
I Getting keys the newest wayTRANS("STUDENT_TEACHER",TEACHER.ID,-1,"X")
 

Note that we don’t even need the "LOWER" from the basic program; TRANS takes care of that for us.  Apart from the simplicity of this method, we do away with possible SELECT list collisions.  This simple technique takes away so much of the complexity of dealing with indexes that they become a much more viable tool.  If you doubt this, imagine the case where these detail records have indexed sub-detail records, and the header records are contain indexed pointers to a master record.

 

 

Cascading I-descriptors

Using the original methods of accessing indexes, you’d need to nest selects within each other to get from school to student. With our new technique, this is accomplished by nesting raw index translates – and because I-descriptors can include other I-descriptors, we only need to write the outer I-descriptor once – for all the files!  The STUDENT.ID I-descriptor we just looked at will work for SCHOOL, TEACHER, STUDENT, and TEST – by redefining TEACHER.ID in each dictionary. 

 DICT SCHOOL  TEACHER.ID:
I Teacher from SchoolTRANS("TEACHER_SCHOOL.ID",SCHOOL.ID,-1,"X")
 

We translate into the TEACHER_SCHOOL.ID (an index of pointers from TEACHER back to the parent SCHOOL item) using exactly the same technique as we did to get from STUDENT to TEACHER.  Note that we created a copy of @ID called SCHOOL.ID.

 STUDENT.ID:
TRANS("STUDENT_TEACHER.ID ",TEACHER.ID,-1,"X")

Becomes:

TRANS("STUDENT_TEACHER.ID ", TRANS("STUDENT_TEACHER.ID ",@ID,-1,"X"),-1,"X")
 

We end up with a list of "grandchildren" for the SCHOOL record – all the students in the school.

  DICT TEACHER TEACHER.ID:
D0 TEACHER ID10LS
 

This is our original example.

 STUDENT.ID:
TRANS("STUDENT_TEACHER.ID ",TEACHER.ID,-1,"X")
 

We end up with a list of "children" for the TEACHER record – all the teacher's students.

  DICT STUDENT TEACHER.ID:
D1 TEACHER ID10LS
 

This is very close to our original example.  The only difference is that TEACHER.ID is an attribute of STUDENT this time.

 STUDENT.ID:
TRANS("STUDENT_TEACHER.ID",TEACHER.ID,-1,"X")

Becomes:

TRANS("STUDENT_TEACHER.ID",TEACHER.ID,-1,"X")
 

We end up with a list of "siblings" for the STUDENT record – the student's classmates.

  DICT EXAMTEACHER.ID:
ITRANS("STUDENT",STUDENT.ID,TEACHER.ID,"X")
 

We translate into the STUDENT file using a more traditional lookup.

 STUDENT.ID:
TRANS("STUDENT_TEACHER.ID",TEACHER.ID,-1,"X")
Becomes:
TRANS("STUDENT_TEACHER.ID",TRANS("STUDENT",STUDENT.ID,TEACHER.ID,"X"),-1,"X")
 

We end up with a list of "aunts and uncles" for the EXAM record – the students in the class of the student who took the test.

 

You could make a good argument that this reuse of the original I-descriptor providing students for a teacher is an example of object-oriented programming.

 

 

Indexing I-descriptors

Note that the dictionary item that we index can be an I-descriptor.  This allows us to use and misuse indexes in some very creative ways.  The create.index command takes a snapshot of this dictionary item, so we’ll need to re-create the index if we change the I-descriptor.  Apart from the dictionary item specifying the key, the no.nulls keyword in the create.index is the most important.  Suppose you were creating an index on email address and two-thirds of your students have email addresses.  Normally, this would mean that if you did an index lookup for an empty email address, you’d get that third of your student file without email. This isn’t a very useful result, and is really bad when you consider that IBM recommends that B-tree items be smaller than 1k for best performance.  Using the no.nulls keyword when creating the index will cause the index logic to ignore items where the index key evaluates to null.

Using NO.NULLS

Here's our index command:

 CREATE.INDEX STUDENT EMAIL EMAIL.KEY PHONE.KEY NO.NULLS 

EMAIL.KEY is an I-descriptor:

 OCONV(EMAIL,"MCL") 

We convert the email address to lower case to make sure that the sort order is meaningful.  We don’t want to rely on every application putting the data in uniformly. We use no.nulls because many students will not supply an email address (but I bet a marketer somewhere wants a list of them at some point – a job for another less dynamic type of index).

 

PHONE.KEY is another I-descriptor:

 OCONV(PHONE,"MCN"):"000"; @1[4,7]:@1[1,3] 

We assume the raw phone number is 7 or 10 digits.  The key we use for a number like (212) 555-3333 is 5553333212.  We store the index keys in sorted, right-justified ascending order.  If the customer doesn’t supply an area code, or the area code has changed since the record was created, we are still in the right neighborhood in this index. You may be thinking that you could write a routine to go look up the appropriate Area code for the customer’s ZIP code at index time.  This won't work because the index is only updated when the STUDENT file is updated, not some ZIPAREA file – and basing keys on external data may mean that it is not possible to generate the right key to update the index.

Generating File Pointers for Secondary Indexes Programmatically

indices is a Basic function that will return a list of Secondary Indexes for a supplied file, or details about a particular index including its path.  We can use this to generate VOC entries for the raw indexes to enforce a naming convention and to quickly respond to changes that might occur in indexes:

 
CREATE.INDEX.POINTERS* Program to generate VOC pointers to Secondary Index files** Ross Morrissey  Aug 2002*    FIRST.LINE = "F Secondary Index Pointer"    OPEN "","VOC" TO FILE.VOC ELSE      PRINT "Couldn't open VOC file, stopping"      STOP    END    PRINT "Path to D_INDEX.MASTER":    INPUT DICT.PATH    LOOP      READNEXT FILENAME ELSE EXIT      OPEN "",FILENAME TO FILEPOINTER THEN        PRINT FILENAME        INDEX.LIST = INDICES(FILEPOINTER)        IF INDEX.LIST THEN          LOOP            REMOVE INDEXNAME FROM INDEX.LIST SETTING MORE.INDEXNAMES            READV FTYPE FROM FILE.VOC, FILENAME:"_":INDEXNAME,1 THEN              IF (FTYPE NE FIRST.LINE) THEN CONTINUE            END ELSE NULL            INDEX.PATH = INDICES(FILEPOINTER,INDEXNAME)<1,5>            ITEM = FIRST.LINE            ITEM<2> = INDEX.PATH            ITEM<3> = DICT.PATH            WRITE ITEM ON FILE.VOC, FILENAME:"_":INDEXNAME            PRINT CHAR(9) : INDEXNAME          WHILE MORE.INDEXNAMES DO          REPEAT          PRINT        END      END ELSE        PRINT "Couldn't open " : FILENAME      END    REPEAT
 

Running it on our test data:

 
>CREATE.FILE DICT INDEX.MASTERCreating file "D_INDEX.MASTER" as Type 3, Modulo 1, Separation 2.Added "@ID", the default record for RetrieVe, to "D_INDEX.MASTER".>SELECT VOC "STUDENT" "TEACHER" "EXAM" 3 record(s) selected to SELECT list #0.>>RUN ROSS.BP CREATE.INDEX.POINTERSPath to D_INDEX.MASTER?D_INDEX.MASTERSTUDENT        EMAIL.KEY        PHONE.KEY        TEACHER.ID TEACHER        SCHOOL.ID EXAM        STUDENT.ID
  

Working with Secondary Index Data Directly

Now that we have VOC pointers to the index files, we can do more than a simple translate.  We can read and update the indexes directly.  The Secondary Indexes themselves are ordinary UniVerse records with the index entry as their @ID and the keys to the data records stored one per line.  We can use just about any Dictionary or Basic technique on these records, but any direct updates you make to the index data will corrupt the index.

Reporting Secondary Index Data Directly

Here's an example of a live production report pulled directly from a priority based index controlling a rendering images queue for large scale digital printing:

 
Render Queue at 02:37:46AM  21 Aug 2002: Sequence Priority Process..... RDs left Ref/PD # First RD.. create.. create                                                            date.... time..        1 00319199 DIGI4X             84 73385984   08843222 08-20-02 22:58       2 00319200 DIGI4X             90 73385983   08843396 08-20-02 22:59       3 00319201 DIGI4X             90 73385982   08843306 08-20-02 22:59       4 00319202 DIGI4X             90 73385981   08843580 08-20-02 23:03       5 00319203 DIGI4X             72 73385978   08843747 08-20-02 23:04 Total Clusters to Render: 32 records counted.
 
  • Sequence is an I-descriptor @NI which displays the current line number in a report.
  • Priority is the @ID of this Queue.
  • Process is the digital process type from the First RD
  • RDs left is an I-descriptor DCOUNT(@RECORD,@AM) that gives the count of indexed items at this priority.
  • Ref/PD is a pointer back to the sales order for the First RD
  • First RD is the first attribute - @RECORD<1> - of the index item and it is used to translate over to live data files to get the remainder of the information. 
 

By default we display the first five lines of the queue (out of 32).  When a renderer process is ready for work, we pull an RD item (08843222 is next) from the front of the queue.  We null out the priority attribute in RD 08843222 and it drops out of the index. These records are generally pulled out of the queue every few seconds.  Our simple report on the queue gives a lot of information in a couple of seconds.  More importantly, we can quickly provide a prioritized queue of work to our renderers.

 

Here is the code:

 
PAIF <<C2>> EQ "" THEN GO 10LIST RD_PRIORITY P.PH FIRST <<C2>> COUNT.SUP ID-SUPP HEADING "RenderQueue at 'T':'L'"GO 2010:LIST RD_PRIORITY P.PH FIRST 5 ID-SUPP COUNT.SUP HEADING "Render Queueat 'T':'L'"20:DISPLAYDISPLAY Total Clusters to Render:

COUNT RD_PRIORITY

Using and Abusing the SET.INDEX command

We've seen how index data for STUDENT is stored in a directory I_ STUDENT.  The path to this directory is stored in the file header of STUDENT.  This file header information is normally populated with the create.index command and cleared with the delete.index command.  It can also be populated or cleared with the set.index command.  This allows us to attach a pre-built index to a file instead of creating and building an index in place.  We can use this to quickly set up an empty or " no.nulls " index without scanning a file or to use a restored version of an index instead of rebuilding from scratch.

SET.INDEX and the NO.NULLS Secondary Index

The problem with this empty index is that it still needs to be built before it is active.  This can be a big problem if you have a large file.  In order to build the index, UniVerse locks the file – opens and writes both fail.  Consider the case where we are adding a new KARMA field to the enormous BIGFILE.

 

We in the Pick world are spoiled in this regard – adding a field to a file shuts the database down for hours or days in those other databases, but not ours, right?  In the case of UniVerse Secondary Indexes, even though we know that the index on this freshly created field will end up empty, the build.index command will still select and process each record in the file, keeping the file locked the entire time.

 

We can, however, create an empty file that we'll use as a "backup" of all the records with valid KARMA values – i.e. none.  If we index this file we should end up with the same result as BIGFILE, but right away.  Here's how we do this (with some output suppressed):

 >CREATE.FILE BIGFILE.B 18 1 4>SET.INDEX BIGFILE INFORMIndices for file "BIGFILE" reside in "e:/uv/I_BIGFILE".>SET.INDEX BIGFILE.B TO e:/uv/I_BIGFILE FORCE>CREATE.INDEX USING D_BIGFILE BIGFILE.B KARMA NO.NULLS>BUILD.INDEX BIGFILE.B KARMA>SET.INDEX BIGFILE.B TO NULL FORCE>DELETE.FILE BIGFILE.B >LIST.INDEX BIGFILE ALLAlternate Key Index Summary for file BIGFILEFile........... BIGFILEIndices........ 2 (0 A-type, 0 C-type, 2 D-type, 0 I-type, 0 SQL, 0 SIndex Updates.. Enabled, No updates pending Index name      Type  Build    Nulls  In DICT  S/M  Just Unique Field KARMA            D    Not Reqd  No     Yes      S    L     N    4SMALLFILE.ID     D    Not Reqd  Yes    Yes      S    L     N    1> 

Presto – instant KARMA!  Please experiment with your own BIGFILE and KARMA before trying this approach with live data.

Summary

Secondary Indexes simplify the tasks of maintaining referential integrity and managing ordered lists and queues.  The direct access examples and techniques described are a better match to traditional Pick coding practice than the selectindex Basic command.  The various short-cuts to building large indexes should allow more existing applications to take advantage of these indexes instead of continuing with grafting on home-brewed solutions.

Newest Jobs

Universe DBA
Company: Eltas Enterprises, INC
Category: Computer/IT
Type: Internship
Posted: July 14, 2010


Our Sponsors

55°
13°
°F | °C
Overcast
Humidity: 86%
Wind: W at 12 mph
Wed

54 | 70
12 | 21
Thu

57 | 72
13 | 22
Fri

32 | 66
0 | 18
Sat

32 | 54
0 | 12