For a number of years, library administrators have been watching the development of sophisticated optical character recognition (OCR) techniques with a view to their application in the retrospective conversion of data from printed catalog cards to machine-readable form. The multifont technology incorporated in equipment such as the Kurzweil Data Entry Machine has been mature enough to support such an application for several years; the major stumbling block has been manufacturers' lack of interest in developing special handling features such as card-size masks and appropriate formatting software for what they regard as the too-limited library market. The large number of typefaces found in the average card catalog has also made the approach impractical for them. Even so, for libraries with unusually uniform card files, this might be a workable method for bibliographic record conversion.
In recent months, London-based Optiram Automation Limited has been promoting an OCR retrospective conversion service to libraries in North America. The company claims that its system not only can handle cards typed or printed in a variety of fonts, but can also process legible handwritten cards. In addition to converting the alphanumeric characters on the cards to ASCII-coded form, the system automatically processes this data into the MARC format, using special software and format recognition techniques.
The Optiram system, specifically developed for the automated conversion of paper-based human-readable information into digital form, utilizes a digital scanner or a Group III telefacsimile unit to capture the pattern of images on an input page. The captured images are processed and broken down into recognizable units, such as individual characters, and these are compared with a store of image shapes in the memory of the processor. When a "match" is achieved, the appropriate ASCII code for the character is assigned to the image unit.
The next stage of processing assembles the coded character data into words, comparing each against a dictionary to validate the assessment. The system can automatically compile specialized dictionaries when the material being scanned contains a number of words not normally used in everyday language. Complex algorithms are used throughout this "word" processing. These include the application of probability analysis based on language patterns, such as the fact that in English, a "th" is followed by an "e" in 30 cases out of 100. The system can also be programmed to standardize certain data, changing all occurrences of "volume", "vol.", "VOL", and "v." in defined sections of the input card to a single format.
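The standardization step described above can be illustrated with a minimal Python sketch. It is not Optiram's software, whose internals are not described here; the function name `standardize_volume`, the choice of "v." as the single target format, and the regular expression are all assumptions made for illustration.

```python
import re

# Hypothetical pattern covering the variants named in the text:
# "volume", "vol.", "VOL", and "v." (matched case-insensitively).
# The lookahead requires whitespace, a digit, or end of text after the
# match, so ordinary words such as "volcano" are left alone.
_VOLUME_VARIANTS = re.compile(r"\b(?:volume|vol\.?|v\.)(?=\s|\d|$)", re.IGNORECASE)


def standardize_volume(text, canonical="v."):
    """Replace every recognized volume designation with one canonical form."""
    return _VOLUME_VARIANTS.sub(canonical, text)


if __name__ == "__main__":
    for sample in ("Volume 2", "vol. 4", "VOL 12", "v. 7"):
        print(sample, "->", standardize_volume(sample))
```

A production system would presumably restrict the substitution to the "defined sections of the input card" mentioned above rather than applying it to the whole record.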
Such format-recognition techniques (relating data to their position on the initial input card and to other data on the card) and post-processing techniques are used to assign the data to the appropriate MARC fields.
Optiram claims that it can process typed or printed catalog cards into the MARC format with 99 percent accuracy before correction. Performance on legible, handwritten cards is said to be excellent. The company is currently undertaking a major retrospective conversion for the University of Edinburgh.
One of the editors has visited the Optiram facility in London. The performance of the system in catalog card conversion appears to be almost as good as claimed. Reliability is very good, but throughput was observed to be only a fraction of the 8 million characters per day claimed by the company. Optiram is interested in extending its services to North American libraries. Tentative prices quoted for a sample of North American input suggest that the approach would be a cost-effective solution for libraries with large numbers of handwritten cards. The cost benefit of using Optiram for the conversion of typed or printed cards does not appear to be as high when compared with the variety of relatively low-cost matching and extraction options already available in the U.S. and Canada.
[Contact: Optiram Automation Limited, Suite 411, London International Press Centre, 76 Shoe Lane, London EC4A 3JB England (01) 353-0186]
Unexpected alarms and fading labels
Two notes in recent newsletters are potentially relevant for LSN readers. Library Hotline notes that the Cornell University Libraries have experienced a rash of false alarms at their security checkpoints caused by the sensitized marks and labels used by some bookstores and inadequately desensitized at the point of sale.
The RTSD Newsletter contains a report of Northeastern Illinois University's tests to find a solution to the fading of OCLC-produced book labels, which can become nearly illegible after several years on the shelves. The answer appears to be the use of silver-backed (foil-backed) labels. Northeastern Illinois has found that such labels withstand fading and perform well regardless of which brands of labels and inks are used. Silver-backed labels from University Products, Inc., and Denney-Rayburn Co. were tested.
Automation penetrates large libraries
The editors recently undertook an informal survey to determine the extent to which the nation's largest libraries have adopted automated library systems. Using available published sources and material from our files, we determined which of the 46 largest public libraries and which of the 116 non-public library members of the Association of Research Libraries (ARL) have installed automated systems to handle functions such as technical services, circulation, or patron access catalogs. (No attempt was made to survey bibliographic utility usage among the libraries. It should also be noted that some institutions use more than one system, developing a local system for circulation and purchasing a turnkey system to support an online catalog, for instance.)
Of 105 academic libraries, 81 (77 percent) were found to be using, or had recently contracted for, automated systems; of the 11 nonacademic, non-public library members of ARL, at least 7 (64 percent) were using automated library systems, and of 46 major public libraries checked, 34 (74 percent) were using automated systems.
Overall, 122 institutions were using 139 systems. The majority (76 percent) of the systems were commercially available products, mostly turnkey systems and some software packages. Most of the local, noncommercial systems development was in the large academic libraries, which accounted for 27 of the 34 "local" systems identified.
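The survey figures above can be cross-checked with a few lines of arithmetic. The following Python sketch uses only the counts reported in the text; the variable and function names are illustrative, and rounding to the nearest whole percent is an assumption about how the published percentages were derived.

```python
# Counts as reported in the survey summary:
# (libraries with automated systems, libraries checked).
SURVEY = {
    "academic ARL members": (81, 105),
    "other ARL members": (7, 11),  # reported as "at least 7"
    "major public libraries": (34, 46),
}


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Share of each group using automated systems.
shares = {group: percent(a, n) for group, (a, n) in SURVEY.items()}

# Institutions using automated systems, summed across the three groups.
automated_total = sum(a for a, _ in SURVEY.values())

# Commercial share of systems, derived from the 34 "local" systems out of 139.
TOTAL_SYSTEMS = 139
LOCAL_SYSTEMS = 34
commercial_share = percent(TOTAL_SYSTEMS - LOCAL_SYSTEMS, TOTAL_SYSTEMS)

if __name__ == "__main__":
    print(shares)
    print(automated_total, commercial_share)
```

Under these assumptions the three group counts sum to the 122 institutions cited, and the 105 non-local systems work out to 76 percent of the 139 systems identified.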
In the commercial arena, the following systems were represented in more than five institutions: CLSI (28), Geac (23), Data Phase (17), NOTIS (8), WLN/BLIS (7), and Carlyle (6). Geac accounted for the largest single share of the academic libraries (21), followed by CLSI (17). CLSI (11) was just ahead of Data Phase (10) in the large public libraries. The other "name" systems identified in the survey were: AdLib, Data Research Associates, DOBIS, Innovative Interfaces, LS/2000, Sirsi, Systems Control Inc., Universal, and VTLS.
Carlyle Systems update
As seen in the preceding news item on automation in large libraries, Carlyle Systems Inc. is beginning to have an impact with its TOMUS online system. The system currently supports data base creation and maintenance, and an online catalog. Although there are plans to develop additional capabilities such as circulation, serials control, and acquisitions, the company does not stress these in its marketing literature, preferring to concentrate on those system components it can deliver now.
TOMUS is sold as a turnkey system, in a variety of configurations to suit the needs of libraries of various sizes. The programs are written in C, and the CPU is specially designed for Carlyle. The smallest viable system, capable of supporting five terminals and a data base of some 40,000 full MARC records, costs in the region of $30,000. The largest system sold to date, that configured for the Research Libraries of the New York Public Library, was bid as a 60-terminal system supporting 1.2 million bibliographic records. The price quoted for this system was approximately $300,000. Carlyle appears confident that it can handle much larger systems with hundreds of terminals and millions of records. Annual maintenance charges on Carlyle systems run from 12 to 15 percent of the hardware and software purchase price.
In the 12 months since it began actively marketing TOMUS, Carlyle has made sales to the New York Public Library, the SUNY campuses, Rice University, and the University of Miami in Florida. Several other sales are currently being finalized.
[Contact: Carlyle Systems, Inc., 2930 San Pablo Ave., Berkeley, CA 94702, (415) 843-3538]
Recent developments in automated serials control
The development or enhancement of automated serials control capabilities is currently a major focus of activity among the vendors of both single-function and integrated automated library systems. This news item focuses on the activities of only a small number of the vendors in the field. Readers can safely assume that most other vendors are pursuing similar developments or enhancements. It is expected that a number of advances in serial control automation will be exhibited at this month's Midwinter Meeting of the American Library Association in Washington, D.C.
CLSI and Blackwell Library Systems, Inc., have entered into an agreement whereby CLSI will market Blackwell's Perline 100 Serials Control System and its Bookline 100 Acquisitions System. A library seeking standalone Perline or Bookline systems will be able to obtain them from either Blackwell or CLSI. It is expected that the software in systems from either source will be identical, but that the range of hardware options supported by CLSI will be more limited than that available from Blackwell. CLSI will also offer Perline and Bookline as parts of its turnkey integrated multifunction LIBS 100 automated library system. However, details of the nature of the interface that CLSI is developing to integrate Perline and Bookline into its LIBS 100 system had not been announced at the time that this issue went to press.
OCLC is expected to announce the release of a product based on the MetaMicro Serials Control System. The MetaMicro product was briefly marketed as a turnkey system mounted on Southwest Technical Products' hardware. The version of the system being developed for OCLC will run on IBM PC hardware. It is expected that the serials control system will be offered as a standalone system with interfaces to the OCLC online data base, and that a version with special linkages to the LS/2000 local system will serve as the serials control module of that system.
Detailed descriptions of both Perline and MetaMicro are included in the Editors' report on automated serials control systems published in the March/April 1984 issue of Library Technology Reports.
Professional Software's Serial Control System, originally developed for the TRS-80, is now available on the IBM PC and PC/XT. The system requires 256 KB of internal memory and a dual disk drive or a hard disk. The company's approach to pricing has also changed. Purchasers may now select only those modules they require. The checkin/renewal/claiming module is essential for all functions. It is priced at $900. The optional holdings list module is $400, bindery management $300, and routing $700.
Georgetown University Medical Center Library is finalizing the development of the serial control module for its integrated automated library system software package. The module provides for the detailed recording of holdings data. Checkin is based on a prediction algorithm, with the result that only a single keystroke is required to register receipt of a piece. In addition to checkin, this module also supports claiming, routing, and binding. Management functions include editing and report generation and an accounting capability.
DTI is revising the serials module of its Card Datalog system. The module, which can be purchased separately or as part of a package of programs, will be expanded to accommodate standard routes, binding support, checkin of multiple copies with a single transaction, and expansion of the claims form output to accommodate extra information required by subscription agencies. The company recently opened a New York office at 260 Manor Road, Douglastown.
Sydney Development Corp., vendor of the Easy Data turnkey system, is currently developing a serials control module to be available as part of the integrated system or in standalone mode. The functions will include checkin, using prediction capabilities to allow single keystroke recording of receipt of an item; claiming; routing; subscription management; and accounting. The module is expected to be available in March.
[Contact: Professional Software, 21 Forest Ave., Glen Ridge, NJ 07028, (201) 748-7658; Dahlgren Memorial Library, Georgetown University Medical Center, 3900 Reservoir Rd., NW, Washington, DC 20007, (202) 625-7673; DTI Data Trek, 121 West E St., Encinitas, CA 92024, (619) 436-5055; Sydney Development Corp., 1385 W. Eighth Ave., Vancouver, BC V6H 3V9, Canada, (604) 734-8822]
OCLC issues
In the months since OCLC registered copyright of its data base, there have been a number of attempts to overcome the suspicions of many librarians that such a copyright may not be in their best interests. That OCLC appears to be sympathetic in its approach ("Ownership of Machine-Readable Records," June 1984 LSN) has not quieted all misgivings. Recently, the OCLC Users Council attempted to respond to those doubts by sending to all members the following "Code of Responsible Use for the OCLC Online Union Catalog," which is reprinted below in its entirety:
OCLC is a cooperative library service. Participants accept a responsibility to use efficiently the OCLC system as well as a responsibility to share resources and services. Use of the system should benefit the membership as a whole and support the mission and goals of each institution. All participants will:
1. Provide basic and continuing training and education for staff to enable them to use the system effectively and responsibly.
2. Avoid creating duplicate records.
3. Improve the quality of the Online Union Catalog by reporting errors promptly.
4. Input current cataloging promptly to promote resource sharing and collection development.
5. Input original cataloging according to current national standards and practices as promulgated in OCLC Bibliographic Input Standards.
6. Enter current cataloging into the OCLC Online Union Catalog using the appropriate MARC formats.
7. Limit use of OCLC Online Union Catalog subsystems to OCLC-authorized institutions.
8. Use the information from the Online Union Catalog only for purposes which do not violate cooperative use of the OCLC systems among participants.
Item 8 does not throw a great deal of light on the reuse issue. The utility's most detailed public statement on this issue, "Guidelines on the Use, Transfer and Sharing of OCLC Records," was reprinted in the August 1983 LSN.
LIBRARIAN option for library automation
Information Management Consultants of Jericho, NY, has developed a series of library automation programs aimed at special and corporate libraries. The LIBRARIAN system contains modules for cataloging and online catalog (LiCat), and serials control (LiSerial). Acquisitions (LiAcquire) and circulation (LiCirc) modules are under development. The modules may be implemented separately or in combination. The software, programmed in C, is compatible with a variety of hardware, including the IBM PC and DEC mini- and microcomputers. Each software module is priced at $2,750.
The LiCat module provides data base access by call number, author, title, and subject. Up to three subjects may be combined in a search. Only initial significant words are searched in the alphabetic fields; full keyword searching is not supported. The same module supports data entry and the production of cards, book labels, and spine labels.
LiSerial manages the development and maintenance of the serial data base, checkin, and routing. The routing function supports some prioritization of recipients and provides for the establishment of standard routes. In checkin, data relating to the volume and issue number of the item must be entered by keying.
[Contact: Information Management Consultants, Inc., 333 North Broadway, Jericho, NY 11753, (516) 933-8750]
Data base software for micros
Although they are not as numerous as the library automation products on display at recent meetings, new software packages for general file creation and access do appear to be showing up in greater numbers at recent information conferences.
Aaron/Smith Associates' Finder retrieval system claims to give microcomputer users the same kinds of retrieval power available with commercial data base systems such as LEXIS, BRS, and Dialog. This is achieved by the use of inverted files capable of indexing every word in a data base. For retrieval, the system offers Boolean operators, left and right truncation, and the ability to mask characters within words. Users can create their own stop word lists. Searching can be limited to specific fields in a record, as can indexing, if required. The system provides flexible display and print options.
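The retrieval approach described above (an inverted file indexing every word, user-defined stop words, truncation, and Boolean operators) can be illustrated with a short Python sketch. This is not Finder's implementation; the class `InvertedIndex`, its methods, and the use of `*` as a right-truncation symbol are assumptions made for illustration, and left truncation and character masking are omitted for brevity.

```python
import re
from collections import defaultdict


class InvertedIndex:
    """Toy inverted file: maps each indexed word to the set of record ids."""

    def __init__(self, stop_words=()):
        # User-supplied stop words are never indexed.
        self.stop_words = {word.lower() for word in stop_words}
        self.postings = defaultdict(set)

    def add(self, record_id, text):
        """Index every word of a record except the stop words."""
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            if word not in self.stop_words:
                self.postings[word].add(record_id)

    def lookup(self, term):
        """Return matching record ids; a trailing '*' requests right truncation."""
        term = term.lower()
        if term.endswith("*"):
            stem = term[:-1]
            hits = set()
            for word, ids in self.postings.items():
                if word.startswith(stem):
                    hits |= ids
            return hits
        return set(self.postings.get(term, set()))


if __name__ == "__main__":
    index = InvertedIndex(stop_words=["the", "of", "and"])
    index.add(1, "The automation of serials control")
    index.add(2, "Serial checkin and claiming")
    index.add(3, "Automation of circulation")
    # Boolean operators map directly onto set operations.
    print(index.lookup("automation") & index.lookup("serial*"))  # AND
    print(index.lookup("automation") | index.lookup("serial*"))  # OR
    print(index.lookup("automation") - index.lookup("serial*"))  # NOT
```

Because each lookup returns a set of record ids, Boolean AND, OR, and NOT reduce to set intersection, union, and difference, which is one reason inverted files suit this style of searching.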
Finder runs on the IBM PC, PC XT, PC AT, and compatible machines with PC-DOS operating systems. A hard-disk configuration is usually required to accommodate the data base. The number of records handled by the system is limited only by the amount of disk storage available. Except for the system requirement that each record must fit on a single screen, a record may have up to 50 fields, and each field can accommodate up to 255 characters.
The software is priced at $1,495. This includes the programs, documentation, and two hours of data base design consultancy. A retrieval-only module is available separately for $295, and a data entry/retrieval module, which permits the updating and searching of a data base but does not support the creation of new data bases, is $495. These options are designed to permit the searching and updating of copies of a data base in locations remote from that in which it was created.
Despite its name, the MARCON software is not designed to support library cataloging operations. Instead, it was developed by AIRS, Inc., of Baltimore to support the creation and searching of Micro Archives and Records ONline. It handles records of all types, formatted and full text.
MARCON is based on a mainframe system design developed for use in university archives and record centers. It runs on the IBM PC, the AT&T PC, and compatible micros. Most applications require hard disk storage. The programs are written in Pascal and utilize the UCSD p-System operating system.
Among the applications supported by MARCON are the creation of bibliographic data bases and hierarchical thesauri. Records may contain up to 7,560 characters and up to 175 fields. Boolean techniques may be applied in searching. MARCON is priced at $895.
Cucumber Information Systems of Rockville, MD, offers SIRE for information retrieval on the IBM PC, DEC PDP-11, and VAX and other computers utilizing the UNIX, MS-DOS, or RSX operating system. The system may be used to create documents of any length. If required, these may be formatted into fields; 256 fields per record can be accommodated. There is no limit to field length and all fields are variable in length. Field names are parameterized and may be readily set and changed by the user. Data may be automatically indexed as it is entered.
A variety of search options is supported: keyword, Boolean, adjacency, and truncation. Searches can be expanded by the system identifying the characteristics of relevant retrieved documents and automatically searching for other items that share these characteristics. Search results may be ranked in order of their potential relevance to the user's query. Output may also be formatted by specifying alphabetic sorting on the contents of any required field.
Capable of supporting data bases of up to 64,000 documents, the MS-DOS version of SIRE costs $600 and the "small" UNIX system version is priced at $2,500. For data bases of up to 16 million documents, the RSX version costs $5,000 and the large system UNIX version will be the same price.
Readers should note the current pricing for Cuadra Associates' STAR system, which many libraries and information centers have chosen for information retrieval applications. Since STAR is marketed as a turnkey system rather than a software package, the prices include both hardware and software. Configured around Alpha Micro computers, STAR accommodates both fulltext and formatted files. It is available in a range of configurations, from a single-user system with a 10 MB hard disk for $19,560 to eight-user systems that begin at $30,355. The largest systems can accommodate up to 20 users and high-speed hard disks of 400 MB. These systems, priced at around $80,000, also include a magnetic tape drive.
[Contact: Aaron/Smith Associates, Inc., Suite 518, 1422 West Peachtree St., N.W., Atlanta, GA 30309, (404) 876-0085; AIRS, Inc., P.O. Box 16322, Baltimore, MD 21210; Cucumber Information Systems, 5611 Kraft Dr., Rockville, MD 20852, (301) 984-3539 and (301) 881-2722; Cuadra Associates, Inc., 2001 Wilshire Blvd., Suite 305, Santa Monica, CA 90403, (213) 829-9972]
Second National Conference on Integrated Online Library Systems proceedings available
The full 335-page Proceedings of the Second National Conference on Integrated Online Library Systems are now available, as are copies of the Proceedings of the First National Conference (held in Columbus, OH in 1983). Each volume is priced at $39.95, plus a $3 handling charge. Orders should be directed to the conference organizers:
Genaway & Associates, Inc., P.O. Box 477, Canfield, OH 44406.
The conference, incorrectly identified in our lead story about it in the November 1984 issue of LSN, was held in Atlanta on September 13-14.
UTLAS Inc. to be acquired by International Thomson Organisation
In late December the University of Toronto, UTLAS Inc., and International Thomson Organisation announced that they had signed a letter of intent under which International Thomson Limited will acquire UTLAS Inc. from the University of Toronto and will make a significant capital contribution to UTLAS Inc. A long-term association between the University and UTLAS is contemplated under which the University will continue to be served by UTLAS and the parties will explore ways of cooperating in other areas.
International Thomson Organisation is a large Canadian-based multinational company with significant interest in the publishing of a wide variety of professional, educational, and library information services. Special emphasis has been placed, in the last few years, on the electronic distribution of information and the building of data bases.
UTLAS Inc. has been supplying online database services and related products in English and French to Canadian libraries since 1973. In recent years, it has also added a number of U.S. and Japanese libraries to its user network. Over 300 institutions, members of consortia, and government agencies maintain individual databases through the utility's facilities, and more than 2,000 individual libraries of all types and sizes receive products and services from this system.
MICROCON: a new conversion option from OCLC
OCLC is offering a new retrospective conversion option for member and non-member libraries that have at least 20,000 titles to convert. OCLC lends libraries contracting for MICROCON service one M300 microcomputer workstation per 50,000 titles on a rent-free basis. Libraries use the workstations to key search statements (OCLC control number, LCCN, ISBN, ISSN, CODEN, personal name/title, or title) and local data onto floppy disks. The disks are sent to OCLC in batches of 30, or approximately 20,000 search statements.
OCLC converts the disks to tape and runs them against the Online Union Catalog. When a single hit is found, the local data is merged with the record and the record is copied to tape. When a search results in between two and ten hits, the output is printed and the library selects the appropriate record by keying its OCLC number. When the number of multiple hits exceeds 10, the search key is printed with a statement "Exceeds 10 hits." When there are no hits, the search key is printed with a "No hit" note.
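The batch-matching rules described above amount to a four-way decision on the number of hits per search statement. The Python sketch below restates those rules; it is not OCLC's software, and the function name `classify_search` and the outcome labels are hypothetical.

```python
def classify_search(hit_count):
    """Map the number of hits for one search statement to a MICROCON outcome.

    The thresholds follow the description in the text; the labels returned
    here are illustrative names, not OCLC terminology.
    """
    if hit_count < 0:
        raise ValueError("hit_count cannot be negative")
    if hit_count == 0:
        return "print search key with 'No hit' note"
    if hit_count == 1:
        return "merge local data and copy record to tape"
    if hit_count <= 10:
        return "print candidates for library selection"
    return "print search key with 'Exceeds 10 hits' statement"


if __name__ == "__main__":
    for hits in (0, 1, 2, 10, 11):
        print(hits, "->", classify_search(hits))
```

Only the single-hit case is resolved without further action by the library; the other three outcomes return printed output for the library to review.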
OCLC members are charged $.40 per hit. There is no charge for multiple hits or no hits. The charge to non-members depends on the service fee levied by the regional network through which they contract for service, but OCLC is thought to be recommending a unit charge of $.42 per hit. Non-members are also charged a one-time tape pre-filing fee of up to $200. Holdings for all records developed through MICROCON are added to the OCLC Online Union Catalog. The reuse provisions applied to records created under MICROCON are the same as those applied to the retrospective conversion service in which OCLC staff perform all tasks. These provisions were detailed in LSN Vol. IV, No. 6.
[Contact: OCLC, Online Products and Services Department, 6565 Frantz Rd., Dublin, OH 43017-0702 (614) 764-6000]
Publisher | Library Systems Newsletter was published by the American Library Association. |
---|---|
Editor-in-Chief: | Howard S. White |
Contributing Editor: | Richard W. Boss |
ISSN: | 0277-0288 |
Publication Period | 1981-2000 |
Business model | Available on Library Technology Guides with permission of the American Library Association. |