Wednesday, February 22, 2012
   
Text Size

Site Search powered by Ajax

lesson 4 - Universe Secondary indexes part1

Giving UniVerse Secondary Indexes a Second Chance

 

UniVerse Secondary Indexes promise speedy selects with automatic updates but I know programmers who avoid them because they seem too awkward to access from Basic or take too long to build. By setting up pointers directly to the internal B-tree index files, we can access data in a more familiar fashion and find a couple of short cuts that might make Secondary Indexes a more viable choice.  This article is geared towards UniVerse (either Unix or Windows, UV 9.6 and above) although Pick aficionados on other platforms may be interested in the cascading I-descriptors – who knew we were object-oriented?  We'll start by looking at indexes and why we need them.

Index Basics

A Secondary Index, like the one you’d find at the back of a book, allows us to find the physical location of something by looking through a list of key words or phrases from the text and finding one or more page numbers.   The page numbers are keys in the book's primary index and allow you to find the source data.  Instead of reading the book from cover to cover looking for a certain passage, an index provides us a short cut but it’s expensive to create an index for a book; it takes extra work during the composition or it’s another step in the books production.   Some books don't have indexes: fiction, where reading sequentially from cover to cover is the intended access method (see Vladimir Nabokov's Pale Fire for a fun counter-example) and short works where scanning the entire text is feasible.

Primary Indexes

The Primary Index contains the keys to the records in our files.  In UniVerse we have three types of Primary Index corresponding to the three file types, Hashed, Directory, and B-tree.  The Hashed Index is one of the great strengths of Pick and is an efficient choice when we know the exact key to the record and we're not interested in browsing the records in any sequence, we use a mathematical formula to find a starting point for a sequential search for the key and the record.  The Directory relies on the underlying operating system (Unix or Windows) to manage the records, typically by keeping a table of keys and the location of each corresponding record.  The B-tree keeps the keys and records in sorted key order; there is additional overhead involved in storage and update but this is a very effective structure if you need to access a range of records.  If you always know the key to the record you require, a primary index is all you need.

Secondary Indexes

If you need to find your data by any attribute other than the primary key, you have two choices: browse through all the data until you find your record, or construct a Secondary Index.  The keys in the secondary index are attributes from the records in the primary index. The keys from the primary index make up the data in the Secondary Index.  This is why Secondary Indexes are sometimes called inverted indexes.  Let's look at a couple of examples that hint at the special features of UniVerse Secondary Indexes; file updates and selects are aware of them.

Managing Foreign Keys with Secondary Indexes

There is a second type of Secondary Index that you may have already implemented.  The multi-value data model lends itself to storing lists of foreign keys to detail records in the header record.  This is really handy when we keep an associated list of data about these details, we get a really quick way to put reports together, but is this the best way to handle these relationships?

  

This is actually a Secondary Index into the detail using the header key as the key, but it's stored in the header.  There are at least three reasons that this is a bad thingÔ.  Referential Integrity is enforced by your application – we need to make sure the list of details is complete and accurate at all times.  Philosophically, is the header the right place for a list of detail pointers?  The detail inherits from the header, not the other way around.  Physically, does it make sense to rewrite the header record each time we add or delete a detail?  Let's create a dedicated index:

 By divorcing the Index function from the Header record, we gain in three ways.  If we use UniVerse Secondary Indexes, referential integrity is enforced for us by UniVerse – any updates in the detail file affecting the header ID are reflected in the Index.  The Header record no longer contains Detail information providing greater clarity and fewer record updates.

Secondary Indexes as Cross-References

If you need to access records using data in the record, i.e. SELECT STUDENT WITH EMAIL EQ " This e-mail address is being protected from spambots. You need JavaScript enabled to view it ", you can do it one of two ways: browse through the entire database looking for matching EMAIL fields, or set up a UniVerse Secondary Index with MAIL as key and the primary index key as data.  

 

If you have just a few STUDENT records, you might as well browse through all the records – much the same as the short book without an index.  If you choose to set up a Secondary Index, you add overhead in terms of storage and update effort that you'll want to "amortize" over a number of accesses. One of the best features of UniVerse Secondary Indexes is that the select statement is aware of Secondary Indexes and will use them to speed selects when possible – which may speed your "amortization" too.

Foreign Keys - Linking Header and Detail records

 

In our example, we'll start with a homeroom teacher as the header record and student as the detail.

Setting Up Indexes

There are two steps to setting up Secondary Indexes – create.index and build.index.  Creating the indexes tells UniVerse what field to index within the record.  We do this by specifying a dictionary item to use for index updates.  We'll assume for our example that each record contains its parent's id in attribute 1; the TEACHER.ID is in STUDENT<1>.

 DICT STUDENT TEACHER.IDD1 

We need to create and then build the index with these two commands:

 CREATE.INDEX STUDENT TEACHER.IDBUILD.INDEX STUDENT TEACHER.ID 

Now if we issue the command:

 SELECT STUDENT WITH TEACHER.ID EQ "12345" 

UniVerse will use the index we just created instead of scanning the entire file.  Let's look at how these indexes are implemented and what happens when we create and build them.

Lesson 3 Reserved Words 

 

Newest Jobs

Universe DBA
Company: Eltas Enterprises, INC
Category: Computer/IT
Type: Internship
Posted: July 14, 2010


Our Sponsors

55°
13°
°F | °C
Overcast
Humidity: 86%
Wind: W at 12 mph
Wed

54 | 70
12 | 21
Thu

57 | 72
13 | 22
Fri

32 | 66
0 | 18
Sat

32 | 54
0 | 12