<REFERENCE>PRIME>BORIS>DISKER


When a disk or disk controller error occurs, PRIMOS prints a terse message on the console describing what happened, and where. The message consists mostly of octal numbers. This is how to decode them. This material came from various sources, especially the "EastOps ditty book".

/* DISKER.HELP, INSTALLER>HELP*
/* Doco on Pr1me disk status numbers.
/* Copyright 1986, BORIS systems, Lansing MI 48933
DISKER                          DISK ERRORS: Doco on Pr1me disk status numbers.

   Format: DISK x ER pdev record actual status retries [sectors nch]

   where:               x        Is RD or WT for read or write.
                        pdev     Is the physical device number.
                        record   Is the requested record number (two words long).
                        actual   Is the actual record read (two words, invalid on write).
                        status   Is the disk status returned by the controller.
                        retries  Varies from 1 to octal '12.
                        sectors  New at Rev. 20.2.3, sectors per track.
                        nch      New at Rev. 20.2.3, number of channels and keys.
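
A minimal Python sketch of splitting such a message into its fields. The token layout is an assumption, not something the source confirms: it supposes the fields are whitespace-separated octal numbers, that record and actual are each printed as two separate words with the high-order word first, and that sectors and nch are either both present or both absent. The names parse_disk_error and DiskError are hypothetical, and the sample line in the usage note is made up for illustration rather than captured from a console.

```python
from typing import NamedTuple, Optional


class DiskError(NamedTuple):
    op: str                 # "RD" or "WT"
    pdev: int
    record: int
    actual: int
    status: int
    retries: int
    sectors: Optional[int]  # only printed at Rev. 20.2.3 and later
    nch: Optional[int]      # only printed at Rev. 20.2.3 and later


def join_words(words):
    """Combine 16-bit words into one integer (assumes high-order word first)."""
    value = 0
    for w in words:
        value = (value << 16) | (w & 0xFFFF)
    return value


def parse_disk_error(line, words_per_record=2):
    """Parse 'DISK x ER pdev record actual status retries [sectors nch]'."""
    tokens = line.split()
    if len(tokens) < 3 or tokens[0] != "DISK" \
            or tokens[1] not in ("RD", "WT") or tokens[2] != "ER":
        raise ValueError("not a disk error message")
    # Every remaining field is treated as octal; a leading ' is tolerated.
    nums = [int(t.lstrip("'"), 8) for t in tokens[3:]]
    n = words_per_record
    required = 3 + 2 * n    # pdev, record, actual, status, retries
    if len(nums) not in (required, required + 2):
        raise ValueError("unexpected number of fields")
    has_extra = len(nums) == required + 2
    return DiskError(
        op=tokens[1],
        pdev=nums[0],
        record=join_words(nums[1:1 + n]),
        actual=join_words(nums[1 + n:1 + 2 * n]),
        status=nums[1 + 2 * n],
        retries=nums[2 + 2 * n],
        sectors=nums[required] if has_extra else None,
        nch=nums[required + 1] if has_extra else None,
    )
```

For example, parse_disk_error("DISK RD ER 460 0 1234 0 1234 110000 12") yields op "RD", status 0o110000 and retries 10 (octal '12).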

   status word bit definitions:

                        177777   Bad record identifier.  Software noticed that the
                                         record  read from disk was not the one requested.
                        177776   Device not ready, or seek  timeout.   Check  pdev
                                         number  for  validity.  In combination with other
                                         servo  status  errors,  this  indicates  a  servo
                                         problem.   In  combination  with  other interface
                                         status  errors,  this  indicates   an   interface
                                         problem.
                        177775   Memory parity error during DMX.
                        177774   Controller not responding.  (Bad controller)
                        177773   Disk hung or controller halted.   This is usually
                                         a disk servo problem.
                        177772   Same as 177773, or Winchester seek failure.
                        100000   Valid status flag.  May  only  appear  by  itself
                                         after  a  warmstart.  Always set in a real error,
                                         since the pin is tied to a pull  up  resistor  in
                                         the controller.
                        040000   DMX overrun.  The CPU could not keep up with disk
                                         data  transfer  rate.   This  can  happen  if the
                                         controller boards are in the wrong order.
                        020000   Write protect.  Also, for CDC drives,  a  voltage
                                         fault or an up-to-speed fault.
                        010000   Data Read check error.  The  CRC  word  generated
                                         while  reading  the  data  did  not match the one
                                         stored with the data during write.
                        004000   Checksum error.   Parity  error  in  data.   Also
                                         listed as internal controller parity error.
                        002000   Header check fail.  The CRC word generated  while
                                         reading  the  header did not match the one stored
                                         with the header during write.
                        001000   Not used.  Grounded in controller.
                        000400   Not used.  Grounded in controller.
                        000200   Not used.  Grounded in controller.
                        000100   Not used.  Grounded in controller.
                        000040   Busy error.  Drive in use by other controller.
                                         Dual port devices only.
                        000020   Not used.  Grounded in controller.
                        000010   Seek error.  Disk drive seeking.   Set  when  the
                                         selected drive is not on cylinder.
                        000004   Seek error:  Either the  seek  did  not  complete
                                         within  the  time  allowed,  or  the  drive  lost
                                         on-cylinder  (heads  drifted)   or   an   illegal
                                         cylinder was requested.
                        000002   Select error.  More than one device responded  to
                                         a  unit  select command, or a unit number greater
                                         than the number of devices supported by the  prom
                                         on a '4005 responded to a unit select.
                        000001   Not available or  not  ready.   Same  as  177776.
                                         Indicates the status of the drive ready signal.
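
The table above can be turned into a small decoder. The following Python sketch is one way to do it, under two assumptions: the values 177772 through 177777 are treated as whole codes rather than bit combinations, and the bits listed as "Not used" are reported if they ever show up set. The function name decode_status and the shortened descriptions are illustrative only.

```python
# Whole-word codes, checked before any bit decoding.
SPECIAL_CODES = {
    0o177777: "Bad record identifier",
    0o177776: "Device not ready, or seek timeout",
    0o177775: "Memory parity error during DMX",
    0o177774: "Controller not responding",
    0o177773: "Disk hung or controller halted",
    0o177772: "Same as 177773, or Winchester seek failure",
}

# Individual status bits, highest mask first.
STATUS_BITS = [
    (0o100000, "Valid status flag"),
    (0o040000, "DMX overrun"),
    (0o020000, "Write protect (CDC: voltage or up-to-speed fault)"),
    (0o010000, "Data read check error"),
    (0o004000, "Checksum error"),
    (0o002000, "Header check fail"),
    (0o000040, "Busy error"),
    (0o000010, "Seek error: drive seeking, not on cylinder"),
    (0o000004, "Seek error: timeout, lost on-cylinder, or illegal cylinder"),
    (0o000002, "Select error"),
    (0o000001, "Not available or not ready"),
]

# Bits the table lists as "Not used. Grounded in controller."
UNUSED_MASK = 0o001000 | 0o000400 | 0o000200 | 0o000100 | 0o000020


def decode_status(status):
    """Return a list of descriptions for a 16-bit disk status word."""
    status &= 0o177777
    if status in SPECIAL_CODES:
        return [SPECIAL_CODES[status]]
    found = [name for mask, name in STATUS_BITS if status & mask]
    unused = status & UNUSED_MASK
    if unused:
        found.append(f"Unused bit(s) set: {unused:06o}")
    return found
```

For example, decode_status(0o110000) gives the valid status flag plus a data read check error.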

Common problems:
                        110000 or 112000
                                         If these are the only two types of  errors,  you  have  a
                                         read or write problem.
                                         If all of the errors are on one head, then it is probably
                                         the read/write board, or the hda.
                                         If all of the errors are on  the  same  track,  head  and
                                         sector then it is probably a badspot.
                                         Other possible problems:  Worn out static eliminator, bad
                                         hda, r/w electronics, bad media, bad heads, power supply.
                        102000 or 100004 or 100005 or 100014 or 100015
                                         This is some kind of servo problem.  The drive is  having
                                         trouble  getting to a cylinder and staying there.  Things
                                         to check:  Servo  adjustments,  servo  cards,  hda,  worn
                                         spindle,  power  supply,  bad  media,  actuator, velocity
                                         transducer.  Sometimes this kind of problem causes loss of
                                         clocks  or  drive  faults  which cause status errors like
                                         120047 or 120057.
                        177776 and 12xxxx or xxxx4x or xxxxx7
                                         These are interface problems.  Things to check:   Cables,
                                         controller, io board(s), device prom, terminator, control
                                         electronics boards,  power  supply.   These  errors  also
                                         accompany servo problems, so check for that first.
                        104000   Bad controller.
                        315MB and picks on heads 16 ~ 19
                                         Servo may be set too fast.  Set to 50 milliseconds.
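
As a rough aid, the status patterns above can be matched mechanically. This Python sketch reads the x's in 12xxxx, xxxx4x and xxxxx7 as "any octal digit" in that position, which is an interpretation of the notation rather than something the source spells out. The function name classify_status and the returned labels are hypothetical. It returns None for anything not listed under common problems, including the other whole-word codes 177772 to 177777.

```python
READ_WRITE = {"110000", "112000"}
SERVO = {"102000", "100004", "100005", "100014", "100015"}


def classify_status(status):
    """Map a status word to a common-problem category, or None."""
    status &= 0o177777
    digits = f"{status:06o}"        # six octal digits, zero padded
    if digits in READ_WRITE:
        return "read/write"
    if digits in SERVO:
        return "servo"
    if digits == "104000":
        return "controller"
    if digits == "177776":
        return "interface"
    # The remaining whole-word codes (177772-177777) are not bit patterns,
    # so do not let 177777 match the xxxxx7 wildcard.
    if 0o177772 <= status <= 0o177777:
        return None
    if digits.startswith("12") or digits[4] == "4" or digits[5] == "7":
        return "interface"          # check for servo problems first
    return None
```

So classify_status(0o120047) reports an interface problem, which per the notes above may itself be a side effect of a servo fault.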

NCH  = BIT 1 ON - Do record ID check
       BIT 2 ON - Ignore read error
       BIT 3 ON - Do not retry on an error
       BIT 4 ON - Format track beginning at CRA
       BIT 5 ON - ECC not desired
       BIT 6 ON - IBM floppy format
       BIT 7 ON - Move servo in  -- only on read --
       BIT 8 ON - Move servo out -- only on read --
       BIT 9 ON - Strobe early   -- only on read --
       BIT 10 ON - Strobe early  -- only on read --
       BITS 11-16 - Number of channels
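
A short Python sketch of unpacking the NCH word. It assumes the usual Prime bit numbering, where bit 1 is the most significant bit of the 16-bit word and bit 16 the least significant, so that bits 11-16 are the low six bits. The function name decode_nch is hypothetical, and the key names are copied from the list above as given.

```python
NCH_KEYS = {
    1: "Do record ID check",
    2: "Ignore read error",
    3: "Do not retry on an error",
    4: "Format track beginning at CRA",
    5: "ECC not desired",
    6: "IBM floppy format",
    7: "Move servo in (only on read)",
    8: "Move servo out (only on read)",
    9: "Strobe early (only on read)",
    10: "Strobe early (only on read)",
}


def decode_nch(nch):
    """Return ([(bit, key name), ...], number_of_channels) for an NCH word."""
    nch &= 0o177777
    # Assumption: bit 1 is the MSB, so bit n has mask 1 << (16 - n).
    keys = [(bit, name) for bit, name in NCH_KEYS.items()
            if nch & (1 << (16 - bit))]
    channels = nch & 0o77           # bits 11-16
    return keys, channels
```

For example, decode_nch(0o100004) reports bit 1 (do record ID check) with 4 channels.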

ICOP ERRORS, accumulated lore (courtesy of NCC prime):

          INTERRUPT 4 - Recovered
                                5 - Unrecovered

          LSW word1 word2
                        word1 = 100400 - successful with word2 retries
                                  word2 = number of retries
                        word1 = 101400 - unsuccessful
                                  word2 = undefined

          PSW word1 word2
                        word2 =  10000 - Read check
                        word2 =   4000 - Data parity error
                        word2 =   2000 - Header not found
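
The LSW and PSW values above can be looked up as follows. This Python sketch matches only the exact values listed; whether PSW word2 can carry several of these conditions at once is not stated in the source, so combined values are reported as unrecognised rather than guessed at. The function names are hypothetical.

```python
PSW_WORD2 = {
    0o10000: "Read check",
    0o04000: "Data parity error",
    0o02000: "Header not found",
}


def decode_lsw(word1, word2):
    """Describe an ICOP LSW pair using the two word1 values listed above."""
    if word1 == 0o100400:
        return f"successful after {word2} retries"
    if word1 == 0o101400:
        return "unsuccessful"       # word2 is undefined in this case
    return f"unrecognised LSW word1 {word1:06o}"


def decode_psw(word2):
    """Describe ICOP PSW word2 using only the exact values listed above."""
    return PSW_WORD2.get(word2, f"unrecognised PSW word2 {word2:06o}")
```

For example, decode_lsw(0o100400, 3) reports a recovered operation after 3 retries.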

          ERRORS FOR NO REASON were generated by a bad revision of IDC1.DL at some
          Primos 20.2 revisions.  The bad DL files were software revision 103, dated
          SEP  7 1988 or earlier.  Disable ICOP mode by cnaming the file to ICOP or
          re-jumpering the disk controller.

1987, EASTOPS Technical Handbook